Changes between Version 2 and Version 3 of SuffixTreeBasedDuplicateCodeAnalysis


Timestamp:
May 26, 2012, 3:11:55 PM (13 years ago)
Author:
manualwiki
Comment:

--

  • SuffixTreeBasedDuplicateCodeAnalysis

    v2 v3  
    11= Duplicate code analysis = 
     2 
     3Large software systems often contain duplicate code, a computer  
     4programming term for a sequence of source code that occurs more than once. There  
     5are two ways in which two code sequences can be duplicates of each other:  
     6syntactically and functionally. This new feature detects syntactically  
     7similar duplicates that differ only in atoms, constants (integer, float and  
     8string) and the names of variables. Many identical or very similar code  
     9fragments can result from copying and pasting, for example. 
    210 
    311== Parameters == 
    412 
    5 ||Parameter||Description||Default value||Example|| 
    6 ||files||files in which the search is carried out||all files from the database||{files,["/usr/local/lib/erlang/lib/mnesia-4.5/src/mnesia_log.erl","/usr/local/lib/erlang/lib/mnesia-4.5/src/mnesia_lib.erl"]}|| 
    7 ||minlen||minimal length of a clone(in tokens)||10||{minlen,50}|| 
    8 ||minnum||minimal number of clones in one clone group||2||{minnum,5}|| 
    9 ||overlap||scale of the overlap||0||{overlap,1}|| 
     13Parameters are passed as a proplist. The possible properties are summarized  
     14in the table below. All the properties are optional. 
    1015 
    11 * {files, Files} 
    12   Files = [ Module::atom(), Filepath::string() | File::string() | RegExp::string() ] 
    13 * {minlen, integer()} 
    14 * {minnum, integer()} 
    15 * {overlap, integer()} 
    16 * {output, Filename::string()} 
    17 * {name, atom()} 
     16||=Parameter=||=Description=||=Type=||=Default value=||=Example=|| 
     17||files||files in which the search is carried out||[Module::atom() [[BR]] |Filepath::string() [[BR]] |RegExp::string() [[BR]] |File::string()]||all files from the database||{files,["/usr/local/lib/erlang/lib/mnesia-4.5/src/mnesia_log.erl", module]}|| 
     18||minlen||minimal length of a clone (length is in tokens)||integer()||10||{minlen,50}|| 
     19||minnum||minimal number of clones in one clone group||integer()||2||{minnum,5}|| 
     20||overlap||maximum length by which duplicates can overlap each other (length is in tokens)||integer()||0||{overlap,1}|| 
     21||output||name of the file in which to save the result of the analysis||string()||-||{output,"result.txt"}|| 
     22||name||name of the result in the table||atom()||the string "temp" concatenated with the timestamp||{name,referl}|| 
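
The types in the table can be checked before a proplist is handed to the analysis. The following is only a minimal sketch: the module `dup_opts` and its `validate/1` function are hypothetical helpers, not part of RefactorErl, and the requirement that the integer parameters be non-negative is an assumption that the table does not state.

```erlang
-module(dup_opts).
-export([validate/1]).

%% Hypothetical helper (not part of RefactorErl): checks a proplist
%% against the types listed in the parameter table.
%% Returns ok, or {error, {invalid_options, BadEntries}}.
validate(Opts) when is_list(Opts) ->
    case [O || O <- Opts, not valid(O)] of
        []  -> ok;
        Bad -> {error, {invalid_options, Bad}}
    end.

%% files: a list whose elements are module names (atoms) or strings
%% (file paths or regular expressions).
valid({files, Files}) when is_list(Files) ->
    lists:all(fun(F) -> is_atom(F) orelse io_lib:printable_list(F) end,
              Files);
%% minlen, minnum, overlap: integers. Rejecting negative values is an
%% assumption made for this sketch.
valid({Key, N}) when Key =:= minlen; Key =:= minnum; Key =:= overlap ->
    is_integer(N) andalso N >= 0;
%% output: a file name given as a string.
valid({output, File}) ->
    io_lib:printable_list(File);
%% name: an atom.
valid({name, Name}) ->
    is_atom(Name);
%% Anything else is not a known property.
valid(_) ->
    false.
```

For instance, `dup_opts:validate([{minlen, "50"}])` reports `{minlen, "50"}` as invalid, because the table requires an integer.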
     23 
     24== Interfaces == 
     25 
     26=== Web interface === 
     27See [wiki:WebInterface/CodeDuplicates Duplicate code analysis] on the web interface. 
     28 
     29=== Console interface === 
     30 
     31The ri module currently has two interface functions: search_duplicates/0  
     32and search_duplicates/1. The first uses the default values of the parameters.  
     33The second takes the proplist described above as its parameter. Both interface  
     34functions report the progress of the analysis. The result is  
     35the list of clone groups. Every clone is defined by the path of its file and  
     36the start and end positions (line and column number). 
     37An example can be found at the bottom of the page. 
    1838 
    1939== Examples == 
    2040 
    2141{{{#!erlang 
    22 ri:search_duplicates([ 
    23         {files, ["/home/csibe/Downloads/examples/one.erl", 
    24                          "^(/home)[0-9a-zA-Z/_]+(/src)$"]}]). 
     42ri:search_duplicates(). 
    2543}}} 
    2644 
    2745{{{#!erlang 
    2846ri:search_duplicates([ 
    29         {files, ["/home/csibe/Downloads/examples/one.erl", 
    30                          "^(/home)[0-9a-zA-Z/_]+(/src)$"]}, 
     47        {files, ["/home/user/dups/dup1.erl", 
     48                 "/home/[0-9a-zA-Z/_.\-]+/src"]}, 
    3149        {minlen, 50}, 
    32         {overlap, 1}]). 
     50        {minnum, 3}, 
     51        {overlap, 1}, 
     52        {output, "result.txt"}, 
     53        {name, filtered}]). 
    3354}}} 
     55 
     56Small example of the return value: 
     57 
     58{{{#!erlang 
     59ri:search_duplicates([{files,[dup1, dup2]}]). 
     60 
     61Initial clone detection started. 
     62Initial clone detection finished. 
     63Trimming clones started. 
     64Trimming clones finished. 
     65Filter clones finished. 
     66Calculating positions started. 
     67Calculating positions finished. 
     68[[[{filepath,"/home/user/dups/dup1.erl"}, 
     69   {startpos,{15,1}}, 
     70   {endpos,{17,24}}], 
     71  [{filepath,"/home/user/dups/dup2.erl"}, 
     72   {startpos,{5,1}}, 
     73   {endpos,{7,24}}]], 
     74 [[{filepath,"/home/user/dups/dup1.erl"}, 
     75   {startpos,{8,1}}, 
     76   {endpos,{14,11}}], 
     77  [{filepath,"/home/user/dups/dup2.erl"}, 
     78   {startpos,{12,1}}, 
     79   {endpos,{18,11}}]]] 
     80}}}
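
Since the return value is a list of clone groups, and every clone is a proplist with the keys `filepath`, `startpos` and `endpos`, it can be processed with the standard `proplists` and `lists` modules. The sketch below is a minimal illustration that relies only on the structure shown in the example above: the module `dup_report` and the function `summary/1` are hypothetical names, not part of RefactorErl.

```erlang
-module(dup_report).
-export([summary/1]).

%% Hypothetical helper (not part of RefactorErl): reduces the clone
%% groups returned by ri:search_duplicates/0,1 to
%% [{GroupNo, [{FilePath, StartLine, EndLine}]}].
summary(CloneGroups) ->
    Numbered = lists:zip(lists:seq(1, length(CloneGroups)), CloneGroups),
    [{No, [clone_lines(Clone) || Clone <- Group]} || {No, Group} <- Numbered].

%% Extracts the file path and the line range of a single clone;
%% the column numbers are dropped.
clone_lines(Clone) ->
    File = proplists:get_value(filepath, Clone),
    {StartLine, _StartCol} = proplists:get_value(startpos, Clone),
    {EndLine, _EndCol} = proplists:get_value(endpos, Clone),
    {File, StartLine, EndLine}.
```

Applied to the return value shown above, the first group becomes `{1, [{"/home/user/dups/dup1.erl",15,17}, {"/home/user/dups/dup2.erl",5,7}]}`.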