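[Tool] Similarity Analysis: Introduction to Simian
Preface
How do you find out which code in a system is duplicated? Unfortunately, I have not found a tool for this in Visual Studio.
Simian, the tool this post introduces, is a static code analysis tool that scans for duplicated code (which usually means copy/paste code).
Duplicated code is very likely a code smell, which makes it a candidate for refactoring. With a scanning tool plugged into your CI infrastructure, every check-in tells you whether the latest version of the system has picked up new smells.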
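Overview
- Simian website: http://www.harukizaemon.com/simian/
- Commercial use requires a paid license; for non-commercial and educational use it can be used on a trial basis.
- It ships as a .jar or an .exe and runs from the command line. Registering it as a Visual Studio external tool makes it more convenient to use (though you still have to type the arguments yourself).
- It supports quite a few languages, C# among them.
- For the arguments, see the Command Line Interface section of http://www.harukizaemon.com/simian/installation.html.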
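Example
- After downloading Simian, open Visual Studio, choose "Tools => External Tools", and add a new external tool. For the command, enter the path to your Simian executable, for example "C:\Users\91\Desktop\simian-2.3.33\bin\simian-2.3.33.exe". For the arguments, follow your own needs and the Command Line Interface documentation linked in the overview above, for example "-formatter=vs:c:\temp\joey_simian.log -language=cs $(ProjectDir)/**/*.cs". This writes the analysis result to c:\temp\joey_simian.log, sets the language to C#, and scans every *.cs file under the project that the current file belongs to. (To exclude a pattern such as test code, use the -excludes argument, for example "-includes=**/*.cs -excludes=**/*Test.cs".)
- Then, in Visual Studio, choose "Tools => Simian solution" (the external tool just added) to run the analysis with those arguments.
- PetShop serves as the sample here. Every *.cs file in the whole solution was scanned, and the last part of the output is used for the discussion that follows.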
The largest duplicate reported is 70 lines of code: lines 13 to 145 of Web\App_Code\CustomList.cs are the same as lines 13 to 143 of Web\App_Code\CustomGrid.cs.
Let's check the actual files:
CustomList.cs
using System;
using System.Collections;
using System.Collections.Specialized;
using System.Text;
using System.Text.RegularExpressions;
using System.Web.UI;
using System.Web.UI.WebControls;
namespace PetShop.Web {
public class CustomList : DataList {
//Static constants
protected const string HTML1 = "<table cellpadding=0 cellspacing=0><tr><td colspan=2>";
protected const string HTML2 = "</td></tr><tr><td class=paging align=left>";
protected const string HTML3 = "</td><td align=right class=paging>";
protected const string HTML4 = "</td></tr></table>";
private static readonly Regex RX = new Regex(@"^&page=\d+", RegexOptions.Compiled);
private const string LINK_PREV = "<a href=?page={0}>< Previous</a>";
private const string LINK_MORE = "<a href=?page={0}>More ></a>";
private const string KEY_PAGE = "page";
private const string COMMA = "?";
private const string AMP = "&";
protected string emptyText;
private IList dataSource;
private int pageSize = 10;
private int currentPageIndex;
private int itemCount;
override public object DataSource {
set {
//This try catch block is to avoid issues with the VS.NET designer
//The designer will try and bind a datasource which does not derive from ILIST
try {
dataSource = (IList)value;
ItemCount = dataSource.Count;
}
catch {
dataSource = null;
ItemCount = 0;
}
}
}
public int PageSize {
get { return pageSize; }
set { pageSize = value; }
}
protected int PageCount {
get { return (ItemCount - 1) / pageSize; }
}
virtual protected int ItemCount {
get { return itemCount; }
set { itemCount = value; }
}
virtual public int CurrentPageIndex {
get { return currentPageIndex; }
set { currentPageIndex = value; }
}
public string EmptyText {
set { emptyText = value; }
}
public void SetPage(int index) {
OnPageIndexChanged(new DataGridPageChangedEventArgs(null, index));
}
override protected void OnLoad(EventArgs e) {
if (Visible) {
string page = Context.Request[KEY_PAGE];
int index = (page != null) ? int.Parse(page) : 0;
SetPage(index);
}
}
/// <summary>
/// Overriden method to control how the page is rendered
/// </summary>
/// <param name="writer"></param>
override protected void Render(HtmlTextWriter writer) {
//Check there is some data attached
if (ItemCount == 0) {
writer.Write(emptyText);
return;
}
//Mask the query
string query = Context.Request.Url.Query.Replace(COMMA, AMP);
query = RX.Replace(query, string.Empty);
// Write out the first part of the control, the table header
writer.Write(HTML1);
// Call the inherited method
base.Render(writer);
// Write out a table row closure
writer.Write(HTML2);
//Determin whether next and previous buttons are required
//Previous button?
if (currentPageIndex > 0)
writer.Write(string.Format(LINK_PREV, (currentPageIndex - 1) + query));
//Close the table data tag
writer.Write(HTML3);
//Next button?
if (currentPageIndex < PageCount)
writer.Write(string.Format(LINK_MORE, (currentPageIndex + 1) + query));
//Close the table
writer.Write(HTML4);
}
override protected void OnDataBinding(EventArgs e) {
//Work out which items we want to render to the page
int start = CurrentPageIndex * pageSize;
int size = Math.Min(pageSize, ItemCount - start);
IList page = new ArrayList();
//Add the relevant items from the datasource
for (int i = 0; i < size; i++)
page.Add(dataSource[start + i]);
//set the base objects datasource
base.DataSource = page;
base.OnDataBinding(e);
}
public event DataGridPageChangedEventHandler PageIndexChanged;
virtual protected void OnPageIndexChanged(DataGridPageChangedEventArgs e) {
if (PageIndexChanged != null)
PageIndexChanged(this, e);
}
}
}
CustomGrid.cs
using System;
using System.Collections;
using System.Collections.Specialized;
using System.Text;
using System.Text.RegularExpressions;
using System.Web.UI;
using System.Web.UI.WebControls;
namespace PetShop.Web {
public class CustomGrid : Repeater {
//Static constants
protected const string HTML1 = "<table cellpadding=0 cellspacing=0><tr><td colspan=2>";
protected const string HTML2 = "</td></tr><tr><td class=paging align=left>";
protected const string HTML3 = "</td><td align=right class=paging>";
protected const string HTML4 = "</td></tr></table>";
private static readonly Regex RX = new Regex(@"^&page=\d+", RegexOptions.Compiled);
private const string LINK_PREV = "<a href=?page={0}>< Previous</a>";
private const string LINK_MORE = "<a href=?page={0}>More ></a>";
private const string KEY_PAGE = "page";
private const string COMMA = "?";
private const string AMP = "&";
protected string emptyText;
private IList dataSource;
private int pageSize = 10;
private int currentPageIndex;
private int itemCount;
override public object DataSource {
set {
//This try catch block is to avoid issues with the VS.NET designer
//The designer will try and bind a datasource which does not derive from ILIST
try {
dataSource = (IList)value;
ItemCount = dataSource.Count;
}
catch {
dataSource = null;
ItemCount = 0;
}
}
}
public int PageSize {
get { return pageSize; }
set { pageSize = value; }
}
protected int PageCount {
get { return (ItemCount - 1) / pageSize; }
}
virtual protected int ItemCount {
get { return itemCount; }
set { itemCount = value; }
}
virtual public int CurrentPageIndex {
get { return currentPageIndex; }
set { currentPageIndex = value; }
}
public string EmptyText {
set { emptyText = value; }
}
public void SetPage(int index) {
OnPageIndexChanged(new DataGridPageChangedEventArgs(null, index));
}
override protected void OnLoad(EventArgs e) {
if (Visible) {
string page = Context.Request[KEY_PAGE];
int index = (page != null) ? int.Parse(page) : 0;
SetPage(index);
}
}
/// <summary>
/// Overriden method to control how the page is rendered
/// </summary>
/// <param name="writer"></param>
override protected void Render(HtmlTextWriter writer) {
//Check there is some data attached
if (ItemCount == 0) {
writer.Write(emptyText);
return;
}
//Mask the query
string query = Context.Request.Url.Query.Replace(COMMA, AMP);
query = RX.Replace(query, string.Empty);
// Write out the first part of the control, the table header
writer.Write(HTML1);
// Call the inherited method
base.Render(writer);
// Write out a table row closure
writer.Write(HTML2);
//Determin whether next and previous buttons are required
//Previous button?
if (currentPageIndex > 0)
writer.Write(string.Format(LINK_PREV, (currentPageIndex - 1) + query));
//Close the table data tag
writer.Write(HTML3);
//Next button?
if (currentPageIndex < PageCount)
writer.Write(string.Format(LINK_MORE, (currentPageIndex + 1) + query));
//Close the table
writer.Write(HTML4);
}
override protected void OnDataBinding(EventArgs e) {
//Work out which items we want to render to the page
int start = CurrentPageIndex * pageSize;
int size = Math.Min(pageSize, ItemCount - start);
IList page = new ArrayList();
//Add the relevant items from the datasource
for (int i = 0; i < size; i++)
page.Add(dataSource[start + i]);
//set the base objects datasource
base.DataSource = page;
base.OnDataBinding(e);
}
public event DataGridPageChangedEventHandler PageIndexChanged;
virtual protected void OnPageIndexChanged(DataGridPageChangedEventArgs e) {
if (PageIndexChanged != null)
PageIndexChanged(this, e);
}
}
}
As you can see, apart from inheriting from a different control (DataList versus Repeater), the two files are identical.
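One way to remove this kind of duplication is sketched here. Because CustomList derives from DataList and CustomGrid from Repeater, the two controls cannot share a base class, so the shared paging state and arithmetic could instead move into a helper that both controls delegate to. This is only an illustration under assumptions: `PagingHelper` and `GetCurrentPage` are hypothetical names that do not exist in PetShop, and the System.Web rendering code is left out so that the sample runs as a plain console program.

```csharp
using System;
using System.Collections;

// Hypothetical helper, not part of PetShop. It holds the paging state and
// arithmetic that CustomList and CustomGrid each duplicate today.
public class PagingHelper
{
    public int PageSize { get; set; } = 10;
    public int ItemCount { get; set; }
    public int CurrentPageIndex { get; set; }

    // Same formula as the original PageCount property: the index of the last page.
    public int PageCount
    {
        get { return (ItemCount - 1) / PageSize; }
    }

    // Mirrors the body of OnDataBinding: copy only the items of the current page.
    public IList GetCurrentPage(IList dataSource)
    {
        int start = CurrentPageIndex * PageSize;
        int size = Math.Min(PageSize, ItemCount - start);
        IList page = new ArrayList();
        for (int i = 0; i < size; i++)
            page.Add(dataSource[start + i]);
        return page;
    }
}

public static class PagingDemo
{
    public static void Main()
    {
        IList items = new ArrayList();
        for (int i = 0; i < 25; i++)
            items.Add(i);

        var paging = new PagingHelper { ItemCount = items.Count, CurrentPageIndex = 2 };
        IList page = paging.GetCurrentPage(items);

        Console.WriteLine(paging.PageCount); // prints 2 (pages are indexed 0..2)
        Console.WriteLine(page.Count);       // prints 5 (items 20..24)
    }
}
```

With a helper like this, each control would keep only its own base-class specific overrides and forward PageSize, ItemCount, and CurrentPageIndex to the shared instance.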
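Conclusion
The Features document describes the settings in more detail, such as what to ignore and the threshold, that is, how many duplicated lines it takes before a block is reported. It is worth noting that Simian can look past differences in variable names: even when the variables are named differently, if the rest of the structure and logic is the same, Simian can still report the code as duplicate.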
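To make the idea of ignoring variable names concrete, here is a minimal sketch of one way such a comparison could work: replace every identifier with a placeholder before comparing lines. This is not Simian's actual algorithm, which the source does not describe; `IdentifierNormalizer`, its keyword list, and the sample lines are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class IdentifierNormalizer
{
    // A tiny, incomplete keyword list, just enough for the demo lines below.
    private static readonly HashSet<string> Keywords =
        new HashSet<string> { "int", "string", "return", "if", "for", "new" };

    private static readonly Regex Identifier =
        new Regex(@"\b[A-Za-z_][A-Za-z0-9_]*\b", RegexOptions.Compiled);

    // Replace every non-keyword identifier with "ID" and drop all whitespace,
    // so that lines differing only in naming or spacing compare as equal.
    public static string Normalize(string line)
    {
        string replaced = Identifier.Replace(
            line, m => Keywords.Contains(m.Value) ? m.Value : "ID");
        return Regex.Replace(replaced, @"\s+", "");
    }
}

public static class NormalizeDemo
{
    public static void Main()
    {
        string a = IdentifierNormalizer.Normalize("int total = count + 1;");
        string b = IdentifierNormalizer.Normalize("int sum   = n + 1;");
        string c = IdentifierNormalizer.Normalize("int sum = n - 1;");

        Console.WriteLine(a == b); // prints True: only the names differ
        Console.WriteLine(a == c); // prints False: the operator differs
    }
}
```

A real detector would work on proper tokens and on runs of consecutive lines instead of single lines, but the normalization step is the part that makes renamed copies match.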
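Scanning with tools like this takes one more concern off the plate during code review. Once the quality metrics are defined, leave it to CI to generate the reports and enforce the check-in policy, and you know right away which code is too repetitive and needs refactoring.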
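As a rough sketch of what such a CI gate might look like, the program below reads a duplication report, extracts the total number of duplicate lines, and returns a non-zero exit code when a threshold is exceeded. Everything specific here is an assumption: the summary line format in the regular expression, the sample report text, and the names `DuplicationGate` and `Evaluate` are invented for illustration, so the pattern must be adjusted to whatever your Simian formatter actually writes.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class DuplicationGate
{
    // ASSUMPTION: the report contains a summary line shaped like
    // "Found 120 duplicate lines in 4 blocks in 3 files".
    // Check your own report and change this pattern if it differs.
    private static readonly Regex Summary = new Regex(
        @"Found (\d+) duplicate lines in (\d+) blocks in (\d+) files");

    // Returns the total duplicate line count, or -1 when no summary line is found.
    public static int CountDuplicateLines(string report)
    {
        Match m = Summary.Match(report);
        return m.Success ? int.Parse(m.Groups[1].Value) : -1;
    }

    // Returns a process exit code: 0 lets the build pass, 1 fails it.
    public static int Evaluate(string report, int maxDuplicateLines)
    {
        int found = CountDuplicateLines(report);
        if (found < 0)
            return 1; // treat an unreadable report as a failure
        return found <= maxDuplicateLines ? 0 : 1;
    }

    // Usage: DuplicationGate <reportPath> <maxDuplicateLines>
    // Without arguments, an invented sample report is evaluated instead.
    public static int Main(string[] args)
    {
        string report = args.Length >= 2
            ? File.ReadAllText(args[0])
            : "Found 120 duplicate lines in 4 blocks in 3 files";
        int threshold = args.Length >= 2 ? int.Parse(args[1]) : 100;

        int exitCode = Evaluate(report, threshold);
        Console.WriteLine(exitCode == 0 ? "Duplication within limit" : "Duplication gate failed");
        return exitCode;
    }
}
```

Failing on an unreadable report is a deliberate choice in this sketch: a gate that silently passes when the report format changes would hide exactly the regressions it is meant to catch.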
By the way, teachers should use this to scan homework: they would know right away whether students have been copying code from one another.
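Addendum
During development, if you would rather not write the result to a file, you can also view the analysis result directly in the Output window.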