Install
To install RefactorErl you can use the referl script located in the bin directory of the release. Unzip the release and type the following command:
bin/referl -build tool
To use the web interface of the tool, you have to provide the path to the Yaws ebin directory during compilation:
bin/referl -build tool -yaws_path path_to_yaws_ebin
System requirements: Install page
Build configuration options: Parameters of the referl script
Starting the tool
To start the tool with the default config use: bin/referl. This will store the source code representation in a Mnesia database. However, to decrease memory footprint and speed up the tool you may want to use the Kyoto Cabinet backend of RefactorErl:
bin/referl -db kcmini
For further options, please check the StartUp page.
Interfaces: ri, web
RefactorErl provides several user interfaces.
The interactive Erlang shell interface -- ri -- gives you simple, command-based (function call) usage. To get help on the parameters of a function, use the helper functions such as ri:h() or ri:*_h().
If you have successfully installed Yaws on your machine, you can use the web interface of the tool as well. When you compile the tool, you have to provide the path to the Yaws ebin directory: bin/referl -build tool -yaws_path path_to_yaws_ebin
When you start the web interface with ri:start_web/1, you have to configure the web server. Without configuration, the web server is started at localhost:8001.
Building a database
For a first time user, we suggest using the ri interface to build the database of RefactorErl from the source code. There are several options to analyse the source files. For details see the file management page. Here we discuss some basic scenarios.
You can add files using the ri:add/* functions. To add a single module just provide the path to the file as an argument: ri:add("path_to_file")
The same command can be used to add directories recursively: ri:add("path_to_dir")
If your include files are located in a separate directory, please add it as an include environment: ri:addenv(include, "path_to_include_dir")
If your software follows the Erlang application hierarchy and its applications are located in a library directory:
- Add the path to the application library as an environment variable: ri:addenv(appbase, "path_to_my_lib")
- Add the files using a subkey of the application base path and the name of the application: ri:add(my_lib, my_application)
- This mode helps RefactorErl to find the include files in the appropriate include folders.
Using the tool
RefactorErl, as its name suggests, started as a refactoring project for Erlang. Over the years of development, the focus of the project shifted towards code comprehension and software maintenance support. The main features of the tool are:
- Refactoring
- Semantic queries
- Dependency detection and visualisation
- Software complexity metrics
- Bad smell detection
- Checking design rules
- Vulnerability detection
- Clustering
- Duplicated code detection
- etc
Demo
In the next examples, we will use the source of the Mnesia database as the target software.
- starting the tool: bin/referl -db kcmini
- building the database:
Eshell V10.6.1 (abort with ^G)
(refactorerl@localhost)1> ri:ls().
{{ok,[]},{error,[]}}
ok
(refactorerl@localhost)2> ri:envs().
output = original
appbase = "/usr/local/Cellar/erlang/22.2.1/lib/erlang/lib"
ok
(refactorerl@localhost)3> ri:add(erlang, mnesia).
Adding: /usr/local/Cellar/erlang/22.2.1/lib/erlang/lib/mnesia-4.16.2/src
| 5.20 kB/s >>>>>>>>>>>>>>>>>>>| [ 420/ 420] mnesia.erl
| 6.78 kB/s >>>>>>>>>>>>>>>>>>>| [ 5/ 5] mnesia_app.erl
| 8.12 kB/s >>>>>>>>>>>>>>>>>>>| [ 3/ 3] mnesia_backend_type.erl
| 5.13 kB/s >>>>>>>>>>>>>>>>>>>| [ 12/ 12] mnesia_backup.erl
|>>>>>>>>>>>> 6.05 kB/s| [ 48/ 87] mnesia_bup.erl
....
(refactorerl@localhost)4> ri:ls().
{{ok,["/usr/local/Cellar/erlang/22.2.1/lib/erlang/lib/mnesia-4.16.2/src/mnesia_app.erl",
"/usr/local/Cellar/erlang/22.2.1/lib/erlang/lib/mnesia-4.16.2/src/mnesia_controller.erl",
"/usr/local/Cellar/erlang/22.2.1/lib/erlang/lib/mnesia-4.16.2/src/mnesia_ext_sup.erl",
"/usr/local/Cellar/erlang/22.2.1/lib/erlang/lib/mnesia-4.16.2/src/mnesia_frag_hash.erl",
....
ri:ls() lists the content of the database, that is, the files analysed so far. ri:envs() lists the environment variables of RefactorErl. Note that the path of the Erlang installation library was already set, so we can use its subkey to add the mnesia application: ri:add(erlang, mnesia).
The query ri:q("mods.loc:sum"). returns the total number of lines of code analysed:
(refactorerl@localhost)5> ri:q("mods.loc:sum").
sum = 24299
Using the query language
The semantic query language of RefactorErl can be used to gather information about the source code according to the interest of the developer.
The query language was designed according to the syntactic/semantic entities of the Erlang language. So it introduces files, macros, modules, functions, expressions, records, record fields, etc.
The detailed description of the queries can be found here. The entities, their properties and selectors are described here, but you can also use the ? selector in ri: ri:q("mods.?"). For further examples please check this page.
When you build a query, you need an initial selector to start from. It can be either a position-based entity selection such as @fun -- the function pointed at in the web interface -- or a global starting point such as mods -- all analysed modules.
Once you have selected an entity, you can ask for a property of that entity:
- mods.name -- the name of the module
or ask its connected entities:
- mods.funs -- functions defined in the modules
You can also filter the entities:
- mods[name=foo].funs.calls -- the functions that are called from the functions of the foo module
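The structure described above (an initial selector, an optional filter, then a chain of selectors or properties) can be sketched as a small string-building helper. This is only an illustration: the module query_sketch and the function build/3 are hypothetical and not part of RefactorErl, and the helper only covers name=value filters with atom values on the initial selector. The resulting string is the kind of argument that ri:q/1 takes.

```erlang
-module(query_sketch).
-export([build/3]).

%% Hypothetical helper, NOT part of RefactorErl.
%% Composes a semantic query string from:
%%   Initial - the initial selector, e.g. mods
%%   Filters - [{Property, Value}] equality filters on the initial selector
%%   Chain   - further selectors or properties, e.g. [funs, calls]
-spec build(atom(), [{atom(), atom()}], [atom()]) -> string().
build(Initial, Filters, Chain) ->
    lists:flatten([atom_to_list(Initial),
                   filter(Filters),
                   [[$. | atom_to_list(S)] || S <- Chain]]).

%% Renders the filter list as "[k1=v1, k2=v2]", or as "" when it is empty.
filter([]) ->
    "";
filter(Filters) ->
    Conds = [atom_to_list(K) ++ "=" ++ atom_to_list(V) || {K, V} <- Filters],
    "[" ++ lists:join(", ", Conds) ++ "]".
```

For example, query_sketch:build(mods, [{name, foo}], [funs, calls]) produces the string "mods[name=foo].funs.calls".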
In the following, we will show some queries on the previously built database from the source of Mnesia.
Detecting relations, gathering information about the source code
- ri:q("mods[name=mnesia_log].funs"). -- lists all the functions from the mnesia_log module
- ri:q("mods[name=mnesia_log].funs.name"). -- lists the names of the functions from the mnesia_log module
- ri:q("mods[name=mnesia_log].funs.refs"). -- lists the references (the function applications) of the functions from the mnesia_log module
- ri:q("mods[name=mnesia_log].funs[.refs]"). -- lists the functions that are referenced (called) somewhere ([.refs] behaves like an embedded query: if its result is empty, the filter is false, otherwise it is true)
- mods[name=mnesia_log].funs[name=open_log] -- search the definitions of the mnesia_log:open_log functions
- mods[name=mnesia_log].funs[name=open_log, arity=4] -- search the definition of the mnesia_log:open_log/4 function
- mods[name=mnesia_log].funs[name=open_log].refs -- search the references of the mnesia_log:open_log functions
- mods[name=mnesia_log].funs[name=open_log].called_by -- search the functions that call the mnesia_log:open_log functions
- @fun.called_by -- search the functions that call the pointed function
- mods[name=mnesia_log].funs[name=open_log].calls -- search the functions that are called from the mnesia_log:open_log functions
- @fun.calls -- search the functions that are called from the pointed function
- mods.records[name=mnesia_select] -- list the mnesia_select record definition
- files.records[name=mnesia_select] -- list the mnesia_select record definition
- mods.records[name=mnesia_select].refs -- list the usages of the mnesia_select record
- @record.refs -- list the usages of the pointed record
- mods.records[name=mnesia_select].field[name=orig].refs -- list the mnesia_select record expressions where the orig field is used
- @field.refs -- list the references of the pointed record field
- mods.funs.exprs.sub[type=atom, value=mnesia_tid_locks] -- list all mnesia_tid_locks atoms
- mods.funs.exprs.sub[type=string, value~"Error message.*"] -- list all strings that contain the "Error message" string
- mods[name=mnesia_log].funs.exprs.sub[type=tuple,[.sub[index=1].origin[type=atom,value=backup_args]]] -- list the tuples whose first element can be the backup_args atom, such as the tuple argument in foo({backup_args, Sth1, Sth2}, Sth3)
- @expr.origin -- list the possible values of the pointed expression
- mods[name=mnesia_log].funs[name=open_log].refs[.param[index=1].origin[type=atom, value=decision_log]] -- list the applications of the mnesia_log:open_log function where the value of the first parameter (the name of the table) can be decision_log
- mods[name=mnesia].funs[name=foldr].called_by -- list the functions that call mnesia:foldr
- mods[name=mnesia].funs[name=foldr].called_by[mod /= mnesia] -- list the callers of the foldr function that are in modules other than the defining mnesia module
- mods.macros -- list all defined macros
- mods.macro[name="DEBUG_TAB"].refs -- list the usages of the DEBUG_TAB macro
- @macro.refs -- list the applications of the pointed macro
- @expr.macro_value -- show the expanded value of a macro application
Once you run the ri:anal_dyn() command, the tool analyses the dynamic function calls as well. This allows the query language to list the dynamic references of the functions too:
- mods.funs.dynrefs
- mods[name=mnesia].funs[name=write, arity=1].dynrefs -- list the apply or MFA calls that refer to the mnesia:write/1 function
- @fun.dynrefs -- list the dynamic references of the pointed function
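To make the term dynamic call concrete, the following sketch shows one static call and two dynamic forms of the same call in plain Erlang. The module dyn_call_sketch and its function names are made up for illustration. Which of these forms a particular RefactorErl version reports through dynrefs is an assumption that should be checked against the tool.

```erlang
-module(dyn_call_sketch).
-export([static_call/1, apply_call/1, mfa_call/1]).

%% Static call: module and function are known from the source text.
static_call(List) ->
    lists:reverse(List).

%% Dynamic call through apply/3: the target is given as data.
apply_call(List) ->
    apply(lists, reverse, [List]).

%% Dynamic call through variables holding the module and function name.
mfa_call(List) ->
    Mod = lists,
    Fun = reverse,
    Mod:Fun(List).
```

All three functions reverse their argument; only the first one refers to lists:reverse/1 in a way that is visible without data-flow analysis.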
Checking design rules
- mods.funs.is_tail_rec -- list whether a function is tail recursive
- mods.funs[loc>50] -- lists long functions
- mods.funs[max_depth_of_cases>3] -- list the functions that are too deeply nested
- mods.funs[branches_of_recursion>5] -- list the functions that are considered too complex because they have more than 5 recursive branches
- mods[max_length_of_line>80] -- lists the modules containing lines longer than 80 characters
- mods.funs[max_length_of_line>80] -- lists the functions containing lines longer than 80 characters
- mods.export_all_used or mods[export_all_used] -- list the modules containing the export_all compile attribute
- mods[name=M].funs[.called_by[module/=M]] -- list the functions that are used outside of the defining module
- mods.funs.calls[name=format, module=io] -- list the io:format calls, if any exist
- mods.funs[.calls[name=format, module=io]] -- list the functions that contain io:format calls
- mods[.funs.calls[name=format, module=io]] -- list the modules that contain io:format calls
- mods[name~"xyz.*"][.funs.calls[not (module~"xyz.*")]] -- list all the modules starting with xyz that contain calls to modules that do not start with xyz
- mods[name~"xyz.*"].funs[.calls[not ((module~"xyz.*") or (module in | erlang,lists |)) ]] -- the same as before, but allowing a few library modules
- mods.funs.exprs.sub[type=integer, not .macro] -- list the integers that are not defined in a macro
- mods.funs.vars[name~"atom"], mods.funs.vars[name~"string"], mods.funs.vars[name~"list"] -- list variable names containing the text atom, string, list, etc.
- mods.funs.exprs.sub[type=tuple,[.sub[index>5]]] -- list big tuples
- mods.funs[.exprs.sub[type=tuple,[.sub[index>5]]]] -- list functions containing big tuples
- mods[no_tabs=false] -- list modules containing tabs
- mods.funs.exprs.sub[type=catch_expr] -- list the expressions where catch is used
- mods.funs[.exprs.sub[type=catch_expr]] -- list the functions containing catch
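As a point of reference for two of the rules above, the sketch below contains a function that is not tail recursive, a tail-recursive variant of it, and a function using a catch expression. The module design_rule_sketch is hypothetical. How RefactorErl reports each function for is_tail_rec and catch_expr is not shown here and should be verified by running the queries on such a module.

```erlang
-module(design_rule_sketch).
-export([sum/1, sum_tail/1, safe_div/2]).

%% Not tail recursive: the addition is performed after the
%% recursive call returns.
sum([]) -> 0;
sum([H | T]) -> H + sum(T).

%% Tail recursive: the recursive call is the last operation,
%% the partial result is carried in the accumulator.
sum_tail(List) -> sum_tail(List, 0).

sum_tail([], Acc) -> Acc;
sum_tail([H | T], Acc) -> sum_tail(T, H + Acc).

%% Uses an old-style catch expression; a failing division
%% yields an {'EXIT', Reason} tuple instead of an exception.
safe_div(A, B) ->
    case catch A div B of
        {'EXIT', _Reason} -> error;
        Result -> Result
    end.
```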
Detecting vulnerabilities
- Check the presentation about the vulnerability checkers: http://plc.inf.elte.hu/erlang/dl/secure_coding_presentation.pdf
- Check a submitted paper about them: http://plc.inf.elte.hu/erlang/dl/submitted_paper.pdf
- mods.funs.unsecure_calls -- Lists all the possible vulnerabilities
- mods.funs.unsecure_interoperability -- Lists interoperability related weaknesses
- mods.funs.unsecure_concurrency -- Identifies concurrency related issues
- mods.funs.unsecure_os_call -- Checks for OS injection
- mods.funs.unsecure_port_creation -- Identifies port creation related issues
- mods.funs.unsecure_file_operation -- Lists insecure file handling
- mods.funs.unstable_call -- Shows possible atom exhaustion
- mods.funs.nif_calls -- Identifies insecure NIF calls
- mods.funs.unsecure_port_drivers -- Lists insecure ddll usage
- mods.funs.decommissioned_crypto -- Lists the legacy functions from the crypto module
- mods.funs.unsecure_compile_operations -- Shows insecure compile/code loading related operations
- mods.funs.unsecure_process_linkage -- Lists insecure process linkage
- mods.funs.unsecure_prioritization -- Identifies insecure process prioritization
- mods.funs.unsecure_ets_traversal -- Lists insecure ETS traversal
- mods.funs.unsafe_network -- Checks for insecure kernel related operations
- mods.funs.unsecure_xml_usage -- Identifies insecure XML parsing
- mods.funs.unsecure_communication -- Lists insecure communication related settings
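As background for the atom exhaustion item in the list above: atoms are not garbage collected in Erlang, so converting arbitrary external input to atoms can eventually fill the atom table. The sketch below contrasts list_to_atom/1 with list_to_existing_atom/1. The module atom_sketch is hypothetical, and whether the unstable_call checker flags exactly this pattern is an assumption.

```erlang
-module(atom_sketch).
-export([risky/1, safer/1]).

%% Creates a new atom for every distinct input string.
%% With untrusted input this can exhaust the atom table.
risky(Name) ->
    list_to_atom(Name).

%% Only succeeds for atoms that already exist in the VM,
%% so external input cannot grow the atom table.
safer(Name) ->
    try list_to_existing_atom(Name) of
        Atom -> {ok, Atom}
    catch
        error:badarg -> {error, unknown_atom}
    end.
```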
Features of the web interface
Database browser
- The web interface of the tool provides a view where the content of the database can be observed.
- You can search for a certain module or header
- and view the content of the source in the code browser
Code browser
- The web interface provides a code browser to check the source code with clickable code parts
- built-in queries can be started from the source that navigates to the queries tab
- clickable entities are functions, records, variables, macros, etc.
- built-in dynamic function calls
- user-defined function references
- references, definition, macro value on right-click
- provides a function quick list view
- plain text search is available
- semantic search from one file (position-based)
- you can point at an entity and write a query starting with @ -- @expr, @fun, etc.
- queries are shared with other users
Dependency graph view
- Dependencies are defined through function calls between modules
- You can customize the graph by filtering the included modules, excluding library dependencies, searching cyclic dependencies, etc.
Duplicated code detection
- Several duplicated code search algorithms are supported
- Parameters can be set based on the selected algorithm to customize the search
- Searching on the full database may take a long time