| 2 | |
| 3 | Large software systems often contain duplicate code, a computer programming |
| 4 | term for a sequence of source code that occurs more than once. Two code |
| 5 | sequences can be duplicates of each other in two ways: syntactically and |
| 6 | functionally. This feature detects syntactically similar duplicates, which |
| 7 | differ only in atoms, constants (integer, float and string) and variable |
| 8 | names. Many identical or very similar code fragments arise from copying and |
| 9 | pasting code, for example. |
5 | | ||Parameter||Description||Default value||Example|| |
6 | | ||files||files in which the search is carried out||all files from the database||{files,["/usr/local/lib/erlang/lib/mnesia-4.5/src/mnesia_log.erl","/usr/local/lib/erlang/lib/mnesia-4.5/src/mnesia_lib.erl"]}|| |
7 | | ||minlen||minimal length of a clone(in tokens)||10||{minlen,50}|| |
8 | | ||minnum||minimal number of clones in one clone group||2||{minnum,5}|| |
9 | | ||overlap||scale of the overlap||0||{overlap,1}|| |
| 13 | Parameters are passed as a proplist. The possible properties are summarized |
| 14 | in the table below. All the properties are optional. |
11 | | * {files, Files} |
12 | | Files = [ Module::atom(), Filepath::string() | File::string() | RegExp::string() ] |
13 | | * {minlen, integer()} |
14 | | * {minnum, integer()} |
15 | | * {overlap, integer()} |
16 | | * {output, Filename::string()} |
17 | | * {name, atom()} |
| 16 | ||=Parameter=||=Description=||=Type=||=Default value=||=Example=|| |
| 17 | ||files||files in which the search is carried out||[Module::atom() [[BR]] |Filepath::string() [[BR]] |RegExp::string() [[BR]] |File::string()]||all files from the database||{files,["/usr/local/lib/erlang/lib/mnesia-4.5/src/mnesia_log.erl", module]}|| |
| 18 | ||minlen||minimum length of a clone (in tokens)||integer()||10||{minlen,50}|| |
| 19 | ||minnum||minimum number of clones in one clone group||integer()||2||{minnum,5}|| |
| 20 | ||overlap||maximum length by which duplicates may overlap each other (in tokens)||integer()||0||{overlap,1}|| |
| 21 | ||output||name of the file in which to save the result of the analysis||string()||-||{output,"result.txt"}|| |
| 22 | ||name||name of the result in the table||atom()||the string "temp" concatenated with the timestamp||{name,referl}|| |
| 23 | |
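As a rough sketch of how the properties in the table combine, the call below restricts the search to two modules, raises the minimum clone length and saves the result to a file. The module names `dup1` and `dup2` and the file name are placeholders, and the call assumes a running RefactorErl shell with those modules already in the database.

{{{#!erlang
%% Hypothetical option values; every property may be omitted.
Options = [{files, [dup1, dup2]},     % search only these two modules
           {minlen, 50},              % clones of at least 50 tokens
           {minnum, 2},               % at least 2 clones per clone group
           {overlap, 0},              % clones may not overlap each other
           {output, "result.txt"}],   % save the result to this file
ri:search_duplicates(Options).
}}}
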
| 24 | == Interfaces == |
| 25 | |
| 26 | === Web interface === |
| 27 | See [wiki:WebInterface/CodeDuplicates Duplicate code analysis] on the web interface. |
| 28 | |
| 29 | === Console interface === |
| 30 | |
| 31 | The ri module currently has two interface functions: search_duplicates/0 and |
| 32 | search_duplicates/1. The first uses the default parameter values. The second |
| 33 | takes a proplist as its argument, as described above. Both functions report |
| 34 | the progress of the analysis. The result is the list of clone groups. Each |
| 35 | clone is described by the path of its file and its start and end positions |
| 36 | (line and column number). |
| 37 | An example can be found at the bottom of the page. |
| 55 | |
| 56 | A small example of a call, its progress messages and its return value: |
| 57 | |
| 58 | {{{#!erlang |
| 59 | ri:search_duplicates([{files,[dup1, dup2]}]). |
| 60 | |
| 61 | Initial clone detection started. |
| 62 | Initial clone detection finished. |
| 63 | Trimming clones started. |
| 64 | Trimming clones finished. |
| 65 | Filter clones finished. |
| 66 | Calculating positions started. |
| 67 | Calculating positions finished. |
| 68 | [[[{filepath,"/home/user/dups/dup1.erl"}, |
| 69 | {startpos,{15,1}}, |
| 70 | {endpos,{17,24}}], |
| 71 | [{filepath,"/home/user/dups/dup2.erl"}, |
| 72 | {startpos,{5,1}}, |
| 73 | {endpos,{7,24}}]], |
| 74 | [[{filepath,"/home/user/dups/dup1.erl"}, |
| 75 | {startpos,{8,1}}, |
| 76 | {endpos,{14,11}}], |
| 77 | [{filepath,"/home/user/dups/dup2.erl"}, |
| 78 | {startpos,{12,1}}, |
| 79 | {endpos,{18,11}}]]] |
| 80 | }}} |
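
The nested list returned above can be post-processed with ordinary list functions. The following is a minimal sketch and not part of RefactorErl: the module `dup_report` and its functions are hypothetical, and it assumes the result has exactly the shape shown in the example (a list of clone groups, each a list of proplists with `filepath`, `startpos` and `endpos`).

{{{#!erlang
-module(dup_report).
-export([summarize/1, print/1]).

%% Turn each clone group into a list of {Path, StartLine, EndLine} tuples.
summarize(CloneGroups) ->
    [[clone_lines(Clone) || Clone <- Group] || Group <- CloneGroups].

%% Extract the file path and the line range from one clone proplist.
clone_lines(Clone) ->
    Path = proplists:get_value(filepath, Clone),
    {StartLine, _StartCol} = proplists:get_value(startpos, Clone),
    {EndLine, _EndCol} = proplists:get_value(endpos, Clone),
    {Path, StartLine, EndLine}.

%% Print one numbered block per clone group.
print(CloneGroups) ->
    Numbered = lists:zip(lists:seq(1, length(CloneGroups)),
                         summarize(CloneGroups)),
    lists:foreach(
      fun({N, Group}) ->
              io:format("Clone group ~p:~n", [N]),
              [io:format("  ~s, lines ~p-~p~n", [P, S, E]) || {P, S, E} <- Group]
      end,
      Numbered).
}}}

With the example result bound to `Groups`, `dup_report:summarize(Groups)` would return two groups: the first pairs lines 15-17 of `dup1.erl` with lines 5-7 of `dup2.erl`, and the second pairs lines 8-14 of `dup1.erl` with lines 12-18 of `dup2.erl`.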