[Tool] Similarity Analysis - Introduction to Simian

  • 2011-12-13

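Preface
How do you find out which code in a system is duplicated? Unfortunately, I have not been able to find a tool for this in Visual Studio.

Simian, the subject of this post, is a static code analysis tool that scans for duplicated code (which usually means copy/pasted code).

Duplicated code is very likely a code smell, which makes it a candidate for refactoring. With a scanning tool wired into the CI infrastructure, every check-in can tell you whether the latest version of the system has picked up new smells.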

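Introduction

  1. Simian official site: http://www.harukizaemon.com/simian/
  2. The tool requires a paid license for commercial use; non-commercial and educational users can try it out.
  3. It ships as a .jar or an .exe and runs from the command line. Registering it as an External Tool in Visual Studio makes it more convenient to use (though you still have to type the arguments yourself).
  4. It supports quite a few languages:
    [screenshot: list of supported languages]
  5. For the arguments, see the Command Line Interface section of http://www.harukizaemon.com/simian/installation.html.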

 

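Example

  1. After downloading Simian, open Visual Studio and choose "Tools => External Tools" to add a new external tool. For Command, enter the path to your Simian executable, for example "C:\Users\91\Desktop\simian-2.3.33\bin\simian-2.3.33.exe". For Arguments, pick what you need based on the documentation mentioned in item 5 of the Introduction, for example "-formatter=vs:c:\temp\joey_simian.log -language=cs $(ProjectDir)/**/*.cs". This writes the analysis result to the file c:\temp\joey_simian.log, treats the sources as C#, and scans every *.cs file under the project that the current file belongs to. (To exclude a specific pattern, such as test code, use the -excludes argument, for example "-includes=**/*.cs -excludes=**/*Test.cs".)
     [screenshot: External Tools settings for Simian]
  2. Then, in Visual Studio, choose "Tools => Simian solution" (the external tool added in the previous step) to run the analysis with the configured arguments.
  3. PetShop serves as the example here. Every *.cs file in the solution is scanned, and the tail of the output is shown below:
    [screenshot: tail of the Simian output for PetShop]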

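The largest duplicated block is reported as 70 duplicate lines: lines 13 to 145 of Web\App_Code\CustomList.cs are the same as lines 13 to 143 of Web\App_Code\CustomGrid.cs. (The count is smaller than the line span, presumably because blank lines and comments are not counted.)

Let's check the actual files: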

CustomList.cs

using System;
using System.Collections;
using System.Collections.Specialized;
using System.Text;
using System.Text.RegularExpressions;
using System.Web.UI;
using System.Web.UI.WebControls;

namespace PetShop.Web {

    public class CustomList : DataList {
        //Static constants
        protected const string HTML1 = "<table cellpadding=0 cellspacing=0><tr><td colspan=2>";
        protected const string HTML2 = "</td></tr><tr><td class=paging align=left>";
        protected const string HTML3 = "</td><td align=right class=paging>";
        protected const string HTML4 = "</td></tr></table>";
        private static readonly Regex RX = new Regex(@"^&page=\d+", RegexOptions.Compiled);
        private const string LINK_PREV = "<a href=?page={0}>&#060;&nbsp;Previous</a>";
        private const string LINK_MORE = "<a href=?page={0}>More&nbsp;&#062;</a>";
        private const string KEY_PAGE = "page";
        private const string COMMA = "?";
        private const string AMP = "&";

        protected string emptyText;
        private IList dataSource;
        private int pageSize = 10;
        private int currentPageIndex;
        private int itemCount;

        override public object DataSource {
            set {
                //This try catch block is to avoid issues with the VS.NET designer
                //The designer will try and bind a datasource which does not derive from ILIST
                try {
                    dataSource = (IList)value;
                    ItemCount = dataSource.Count;
                }
                catch {
                    dataSource = null;
                    ItemCount = 0;
                }
            }
        }

        public int PageSize {
            get { return pageSize; }
            set { pageSize = value; }
        }

        protected int PageCount {
            get { return (ItemCount - 1) / pageSize; }
        }

        virtual protected int ItemCount {
            get { return itemCount; }
            set { itemCount = value; }
        }

        virtual public int CurrentPageIndex {
            get { return currentPageIndex; }
            set { currentPageIndex = value; }
        }

        public string EmptyText {
            set { emptyText = value; }
        }

        public void SetPage(int index) {
            OnPageIndexChanged(new DataGridPageChangedEventArgs(null, index));
        }

        override protected void OnLoad(EventArgs e) {
            if (Visible) {
                string page = Context.Request[KEY_PAGE];
                int index = (page != null) ? int.Parse(page) : 0;
                SetPage(index);
            }
        }


        /// <summary>
        /// Overriden method to control how the page is rendered
        /// </summary>
        /// <param name="writer"></param>
        override protected void Render(HtmlTextWriter writer) {

            //Check there is some data attached
            if (ItemCount == 0) {
                writer.Write(emptyText);
                return;
            }

            //Mask the query
            string query = Context.Request.Url.Query.Replace(COMMA, AMP);
            query = RX.Replace(query, string.Empty);

           
            // Write out the first part of the control, the table header
            writer.Write(HTML1);

            // Call the inherited method
            base.Render(writer);
            
            // Write out a table row closure
            writer.Write(HTML2);

            //Determin whether next and previous buttons are required
            //Previous button?
            if (currentPageIndex > 0)
                writer.Write(string.Format(LINK_PREV, (currentPageIndex - 1) + query));

            //Close the table data tag
            writer.Write(HTML3);

            //Next button?
            if (currentPageIndex < PageCount)
                writer.Write(string.Format(LINK_MORE, (currentPageIndex + 1) + query));

            //Close the table
            writer.Write(HTML4);
        }

        override protected void OnDataBinding(EventArgs e) {

            //Work out which items we want to render to the page
            int start = CurrentPageIndex * pageSize;
            int size = Math.Min(pageSize, ItemCount - start);

            IList page = new ArrayList();

            //Add the relevant items from the datasource
            for (int i = 0; i < size; i++)
                page.Add(dataSource[start + i]);

            //set the base objects datasource
            base.DataSource = page;
            base.OnDataBinding(e);

        }

        public event DataGridPageChangedEventHandler PageIndexChanged;

        virtual protected void OnPageIndexChanged(DataGridPageChangedEventArgs e) {
            if (PageIndexChanged != null)
                PageIndexChanged(this, e);
        }
    }
}

CustomGrid.cs

using System;
using System.Collections;
using System.Collections.Specialized;
using System.Text;
using System.Text.RegularExpressions;
using System.Web.UI;
using System.Web.UI.WebControls;

namespace PetShop.Web {

    public class CustomGrid : Repeater {
        //Static constants
        protected const string HTML1 = "<table cellpadding=0 cellspacing=0><tr><td colspan=2>";
        protected const string HTML2 = "</td></tr><tr><td class=paging align=left>";
        protected const string HTML3 = "</td><td align=right class=paging>";
        protected const string HTML4 = "</td></tr></table>";
        private static readonly Regex RX = new Regex(@"^&page=\d+", RegexOptions.Compiled);
        private const string LINK_PREV = "<a href=?page={0}>&#060;&nbsp;Previous</a>";
        private const string LINK_MORE = "<a href=?page={0}>More&nbsp;&#062;</a>";
        private const string KEY_PAGE = "page";
        private const string COMMA = "?";
        private const string AMP = "&";

        protected string emptyText;
        private IList dataSource;
        private int pageSize = 10;
        private int currentPageIndex;
        private int itemCount;

        override public object DataSource {
            set {
                //This try catch block is to avoid issues with the VS.NET designer
                //The designer will try and bind a datasource which does not derive from ILIST
                try {
                    dataSource = (IList)value;
                    ItemCount = dataSource.Count;
                }
                catch {
                    dataSource = null;
                    ItemCount = 0;
                }
            }
        }

        public int PageSize {
            get { return pageSize; }
            set { pageSize = value; }
        }

        protected int PageCount {
            get { return (ItemCount - 1) / pageSize; }
        }

        virtual protected int ItemCount {
            get { return itemCount; }
            set { itemCount = value; }
        }

        virtual public int CurrentPageIndex {
            get { return currentPageIndex; }
            set { currentPageIndex = value; }
        }

        public string EmptyText {
            set { emptyText = value; }
        }

        public void SetPage(int index) {
            OnPageIndexChanged(new DataGridPageChangedEventArgs(null, index));
        }

        override protected void OnLoad(EventArgs e) {
            if (Visible) {
                string page = Context.Request[KEY_PAGE];
                int index = (page != null) ? int.Parse(page) : 0;
                SetPage(index);
            }
        }

        /// <summary>
        /// Overriden method to control how the page is rendered
        /// </summary>
        /// <param name="writer"></param>
        override protected void Render(HtmlTextWriter writer) {

            //Check there is some data attached
            if (ItemCount == 0) {
                writer.Write(emptyText);
                return;
            }

            //Mask the query
            string query = Context.Request.Url.Query.Replace(COMMA, AMP);
            query = RX.Replace(query, string.Empty);

            // Write out the first part of the control, the table header
            writer.Write(HTML1);

            // Call the inherited method
            base.Render(writer);

            // Write out a table row closure
            writer.Write(HTML2);

            //Determin whether next and previous buttons are required
            //Previous button?
            if (currentPageIndex > 0)
                writer.Write(string.Format(LINK_PREV, (currentPageIndex - 1) + query));

            //Close the table data tag
            writer.Write(HTML3);

            //Next button?
            if (currentPageIndex < PageCount)
                writer.Write(string.Format(LINK_MORE, (currentPageIndex + 1) + query));

            //Close the table
            writer.Write(HTML4);
        }

        override protected void OnDataBinding(EventArgs e) {

            //Work out which items we want to render to the page
            int start = CurrentPageIndex * pageSize;
            int size = Math.Min(pageSize, ItemCount - start);

            IList page = new ArrayList();

            //Add the relevant items from the datasource
            for (int i = 0; i < size; i++)
                page.Add(dataSource[start + i]);

            //set the base objects datasource
            base.DataSource = page;
            base.OnDataBinding(e);

        }

        public event DataGridPageChangedEventHandler PageIndexChanged;

        virtual protected void OnPageIndexChanged(DataGridPageChangedEventArgs e) {
            if (PageIndexChanged != null)
                PageIndexChanged(this, e);
        }
    }
}

As you can see, apart from inheriting from different controls (DataList versus Repeater), the rest of the code is identical.
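One possible way to remove this kind of duplication, sketched under the assumption that the two controls have to keep their different base classes, is to move the shared paging state into a small helper that both controls delegate to. PagingState and its members are hypothetical names, not part of PetShop, and the rendering code is left out so that the sketch does not depend on System.Web.

```csharp
using System;
using System.Collections;

// Hypothetical helper, not part of PetShop: holds the paging state and
// arithmetic that CustomList and CustomGrid currently both duplicate.
public sealed class PagingState {
    private IList dataSource = new ArrayList();

    public int PageSize { get; set; } = 10;
    public int CurrentPageIndex { get; set; }
    public int ItemCount { get { return dataSource.Count; } }

    // Index of the last page, same formula as the original PageCount property.
    public int PageCount { get { return (ItemCount - 1) / PageSize; } }

    // Mirrors the intent of the original DataSource setter: anything that is
    // not an IList (for example what the designer binds) is treated as empty.
    public void Bind(object value) {
        dataSource = value as IList ?? new ArrayList();
    }

    // Mirrors OnDataBinding: returns only the items of the current page.
    public IList CurrentPageItems() {
        int start = CurrentPageIndex * PageSize;
        int size = Math.Max(0, Math.Min(PageSize, ItemCount - start));
        IList page = new ArrayList();
        for (int i = 0; i < size; i++)
            page.Add(dataSource[start + i]);
        return page;
    }
}
```

Each control would then keep a PagingState field and forward DataSource, PageSize and OnDataBinding to it, so the paging logic exists in one place only.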

 

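Conclusion

The Features document describes the settings in more detail, such as what to ignore and the threshold, that is, how many duplicated lines a block needs before it is reported. One thing worth mentioning: Simian can look past differences in variable names. Even when the variables are named differently, as long as the rest of the structure and logic is the same, Simian can still flag the code as duplicated.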
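To make the idea of ignoring variable names concrete, here is a minimal sketch of one way such a comparison could work: replace every identifier with a placeholder before comparing lines. This is only an illustration and not Simian's actual algorithm; DuplicateSketch, Normalize and SameShape are hypothetical names, and the keyword list is deliberately tiny.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Illustration only: NOT how Simian is implemented.
public static class DuplicateSketch {
    // Tiny keyword list, just enough for the demo; a real tool needs a real lexer.
    private static readonly HashSet<string> Keywords = new HashSet<string> {
        "int", "for", "if", "return", "new"
    };
    private static readonly Regex Identifier = new Regex(@"\b[A-Za-z_]\w*\b");

    // Replace every non-keyword identifier with "ID" and collapse whitespace.
    public static string Normalize(string line) {
        string replaced = Identifier.Replace(line,
            m => Keywords.Contains(m.Value) ? m.Value : "ID");
        return Regex.Replace(replaced, @"\s+", " ").Trim();
    }

    // Two blocks have the same shape when their normalized non-blank lines match.
    public static bool SameShape(string[] a, string[] b) {
        return Significant(a).SequenceEqual(Significant(b));
    }

    private static IEnumerable<string> Significant(string[] lines) {
        return lines.Select(Normalize).Where(l => l.Length > 0);
    }
}
```

With this normalization, "int start = CurrentPageIndex * pageSize;" and "int first = pageNo * perPage;" both become "int ID = ID * ID;", so the two lines compare as equal even though no variable name matches.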

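With tools like this doing the scanning, code review has one less thing to worry about. Once the quality metrics are defined, leave it to CI to generate the reports and to enforce the check-in policy. You will know right away which code is too repetitive and needs refactoring.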
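As a rough sketch of what such a CI check could look like, the code below reads a Simian report and decides whether a check-in stays within a duplication budget. It assumes a plain-text report in which each duplicated block starts with a line such as "Found 70 duplicate lines in the following files:"; that wording is an assumption and should be verified against real output from your Simian version and formatter. DuplicationGate, LargestBlock, Passes and maxBlockLines are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DuplicationGate {
    // ASSUMPTION: each duplicated block in the report starts with a line that
    // reads "Found <N> duplicate lines in the following files:".
    private static readonly Regex BlockHeader =
        new Regex(@"^Found (\d+) duplicate lines in the following files:",
                  RegexOptions.Multiline);

    // Size of the largest duplicated block in the report, or 0 if there is none.
    public static int LargestBlock(string report) {
        int largest = 0;
        foreach (Match m in BlockHeader.Matches(report))
            largest = Math.Max(largest, int.Parse(m.Groups[1].Value));
        return largest;
    }

    // Hypothetical policy: reject the check-in when any block exceeds the limit.
    public static bool Passes(string report, int maxBlockLines) {
        return LargestBlock(report) <= maxBlockLines;
    }
}
```

A build step could load the log file with File.ReadAllText, call Passes with the agreed limit, and return a non-zero exit code when it is false so that the CI server marks the build as failed.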

By the way, teachers should really use this to scan homework; they would know immediately whether students have been copying code from each other.

Addendum
During development, if you would rather not write the results to a file, you can also view them directly in the Output window:

[screenshot: Simian results shown in the Output window]


For blog and course updates, please visit the new site: http://tdd.best/