Version 9 (modified by manualwiki, 12 years ago)

Metric Queries

The metric query language is designed to query metric information about Erlang programs. Metrics can also be used as properties in our semantic query language.

Defined metrics

module_sum

The domain of the query is a module. Gives the sum of the chosen structural complexity metrics measured on the module's functions. The metrics to be summed are given in a list, in any number and order.

line_of_code

The domain of the query is a module or a function. Gives the number of lines in a part of the text, a function, or a module. Empty lines are not included in the sum. Because the number of lines can be measured on several functions or modules, and the system can return the sum of these, the number of lines of the whole loaded program text can also be queried.
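
The counting rule above can be illustrated with a minimal sketch. This is not RefactorErl's implementation: the module name loc_sketch and both function names are hypothetical, and treating whitespace-only lines as empty is an assumption.

```erlang
-module(loc_sketch).
-export([line_of_code/1, total/1]).

%% Count the non-empty lines of a source text given as a string.
%% Assumption: a line that contains only whitespace counts as empty.
line_of_code(Text) ->
    Lines = string:split(Text, "\n", all),
    length([L || L <- Lines, string:trim(L) =/= ""]).

%% Sum the metric over several texts, e.g. the texts of all loaded modules.
total(Texts) ->
    lists:sum([line_of_code(T) || T <- Texts]).
```

Calling total/1 on the texts of every loaded module corresponds to querying the line count of the whole loaded program.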

char_of_code

The domain of the query is a module or a function. Gives the number of characters in the program text. This metric can measure the code of both functions and modules, and with the help of aggregating functions we can query the total and average number of characters in a cluster or in the whole source text.

number_of_fun

The domain of the query is a module. This metric gives the number of functions implemented in the given module. Functions that are not defined in the module are not counted.

number_of_macros

The domain of the query is a module. This metric gives the number of macros defined in the given module or modules. It is also possible to query the number of implemented macros in a module.

number_of_records

The domain of the query is a module. This metric gives the number of records defined in a module. It is also possible to query the number of implemented records in a module.

included_files

The domain of the query is a module. This metric gives the number of visible header files in a module.

imported_modules

The domain of the query is a module. This metric gives the number of imported modules used in the given module. Qualified calls (calls of the form module:function) are not counted.

number_of_funpath

The domain of the query is a module. Gives the total number of function paths in a module. Besides the internal function links, the metric also counts the external paths, that is, the paths that lead outward from the module. It is very similar to the cohesion metric.

function_calls_in

The domain of the query is a module. Gives the number of function calls into a module from other modules. It cannot be applied to a single function; for that we use the calls_for/1 function.

function_calls_out

The domain of the query is a module. Gives the number of all function calls from a module towards other modules. It cannot be applied to a single function; for that we use the calls_from/1 function.

cohesion

The domain of the query is a module. Gives the number of call-paths between functions that call each other. By call-path we mean that a function f1 calls f2 (e.g. f1()->f2().). If f2 also calls f1, the two calls still count as one call-path.
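
The rule that mutual calls count as one call-path can be sketched as follows. The module cohesion_sketch and its input format (a list of {Caller, Callee} pairs) are hypothetical and chosen only for illustration; how RefactorErl treats a function calling itself is not stated here, so this sketch simply counts such a pair once.

```erlang
-module(cohesion_sketch).
-export([call_paths/1]).

%% Calls is a list of {Caller, Callee} pairs between functions of one module.
%% Each pair is put into a canonical order so that {f1, f2} and {f2, f1}
%% become the same term; duplicates are then dropped and the rest counted.
call_paths(Calls) ->
    Normalised = [normalise(C) || C <- Calls],
    length(lists:usort(Normalised)).

normalise({A, B}) when A =< B -> {A, B};
normalise({A, B}) -> {B, A}.
```

For example, call_paths([{f1, f2}, {f2, f1}]) evaluates to 1.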

function_sum

The domain of the query is a function. Gives the sum of the function's complexity metrics, which characterizes the complexity of the function. It can be calculated from several metrics together. The metrics that constitute the sum can be defined by enumerating them in the referl_metrics module.

max_depth_of_calling

The domain of the query is a module or a function. Gives the length of the deepest function call-chain. The depth of calling in the following example is 3.

...
f([A|B], Acc) ->
    Acc0 = exec(A, Acc),
    f(B, Acc0);
f([], Acc0) ->
    Acc0.

exec(A, Acc) ->
    io:format("~w", [A]),
    A + Acc.
...
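
One way to arrive at the value 3 for this example is to walk the call graph and take the longest chain f, exec, io:format. The sketch below assumes the call graph is available as a map from a function to the list of functions it calls; the module call_depth_sketch, the map format, and the rule of skipping functions already on the current chain (so that recursion terminates) are all assumptions, not RefactorErl's algorithm.

```erlang
-module(call_depth_sketch).
-export([max_depth/2]).

%% Graph maps a function identifier to the list of functions it calls.
%% A function with no (unvisited) callees has depth 1.
max_depth(Fun, Graph) ->
    depth(Fun, Graph, [Fun]).

depth(Fun, Graph, Path) ->
    %% Ignore callees already on the current chain, e.g. recursive calls.
    Callees = [C || C <- maps:get(Fun, Graph, []),
                    not lists:member(C, Path)],
    1 + lists:max([0 | [depth(C, Graph, [C | Path]) || C <- Callees]]).
```

With the graph #{{f,2} => [{exec,2}, {f,2}], {exec,2} => [{io,format,2}]}, max_depth({f,2}, Graph) evaluates to 3.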

max_depth_of_cases

The domain of the query is a module or a function. For a function, gives the maximum depth to which case control structures are embedded in one another. For a module, it measures the same over all the functions in the module. Measurement is not interrupted by case expressions that are not themselves embedded in a case structure. However, the following embedding does not increase the result.

...
A = case B of
        1 -> 2;
        2 -> ok
    end
...

min_depth_of_cases

The domain of the query is a module or a function. For a function, gives the minimum of the maximum depths to which case control structures are embedded in one another. For a module, it measures the same over all the functions in the module. Measurement is not interrupted by case expressions that are not themselves embedded in a case structure. However, the following embedding does not increase the result.

...
A = case B of
        1 -> 2;
        2 -> ok
    end
...

max_depth_of_structs

The domain of the query is a module or a function. Gives the maximum depth to which control structures (block, case, fun, if, receive, try) are embedded in a function. For a module, it measures the same over all the functions in the module.

number_of_funclauses

The domain of the query is a module or a function. Gives the number of clauses of a function. It counts all distinct clauses, but does not add the clauses of functions with the same name and a different arity to the sum. The number of function clauses in the following example is 2.

...
f(Fun, [H|Tail]) ->
    Fun(H),
    f(Fun, Tail);

f(_, []) ->
    ok.

f(A, B, C) ->
    A + B + C.
...
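
Counting the clauses of one function can be sketched with the standard erl_scan and erl_parse modules. The module name funclauses_sketch is hypothetical, and parsing the text of a single function is a simplification; RefactorErl works on its own program graph.

```erlang
-module(funclauses_sketch).
-export([number_of_funclauses/1]).

%% Source is the text of one function definition, ending with a full stop.
%% The parsed form has the shape {function, Anno, Name, Arity, Clauses}.
number_of_funclauses(Source) ->
    {ok, Tokens, _End} = erl_scan:string(Source),
    {ok, {function, _, _Name, _Arity, Clauses}} = erl_parse:parse_form(Tokens),
    length(Clauses).
```

Because a function form is identified by name and arity together, a function with the same name and a different arity is a separate form and does not contribute to the count.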

branches_of_recursion

The domain of the query is a module or a function. Gives the number of recursive branches of a function, that is, how many times the function calls itself, and not the number of clauses in its definition. For quicksort in the following example, the number of branches of recursion is 2.

quicksort([H|T]) ->
    {Smaller_Ones,Larger_Ones} = split(H,T,{[],[]}),
    lists:append( quicksort(Smaller_Ones),
                  [H | quicksort(Larger_Ones)]
                );
quicksort([]) -> [].

split(Pivot, [H|T], {Acc_S, Acc_L}) ->
    if Pivot > H -> New_Acc = { [H|Acc_S] , Acc_L };
       true      -> New_Acc = { Acc_S , [H|Acc_L] }
    end,
    split(Pivot,T,New_Acc);
split(_,[],Acc) -> Acc.
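
The idea of counting self-calls can be sketched with the standard syntax tools. The module rec_branches_sketch is hypothetical; it only recognises direct, unqualified calls with the function's own name and arity, which is an assumption about what counts as a branch of recursion.

```erlang
-module(rec_branches_sketch).
-export([branches_of_recursion/1]).

%% Source is the text of one function definition, ending with a full stop.
branches_of_recursion(Source) ->
    {ok, Tokens, _End} = erl_scan:string(Source),
    {ok, Form} = erl_parse:parse_form(Tokens),
    {function, _, Name, Arity, _Clauses} = Form,
    %% Visit every node of the syntax tree and count the self-calls.
    erl_syntax_lib:fold(
      fun(Node, Count) ->
              case is_self_call(Node, Name, Arity) of
                  true -> Count + 1;
                  false -> Count
              end
      end, 0, Form).

%% True for an application whose operator is the plain atom Name and
%% which has exactly Arity arguments.
is_self_call(Node, Name, Arity) ->
    case erl_syntax:type(Node) of
        application ->
            Op = erl_syntax:application_operator(Node),
            Args = erl_syntax:application_arguments(Node),
            erl_syntax:type(Op) =:= atom
                andalso erl_syntax:atom_value(Op) =:= Name
                andalso length(Args) =:= Arity;
        _ ->
            false
    end.
```

Applied to the text of a quicksort function that calls itself twice in its first clause, the sketch yields 2.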

calls_for_function

The domain of the query is a function. This metric gives the number of calls to the given function. It is not equivalent to the number of other functions calling the function, because each of these other functions can refer to the measured one more than once.

calls_from_function

The domain of the query is a function. This metric gives the number of calls from a given function, namely how many times the function refers to another one (the result includes recursive calls as well).

number_of_funexpr

The domain of the query is a module or a function. Gives the number of function expressions in a module or function. It does not count the calls of function expressions, only their creation. In the next example the number of function expressions is 1.

...
F = fun(A) -> A + 1 end,
F(1),
F2 = fun a/1,
...

number_of_messpass

The domain of the query is a module or a function. For a function, it measures the number of code snippets that implement message passing in the function; for a module, it measures the total number of these in all of the module's functions.

fun_return_points

The domain of the query is a module or a function. The metric gives the number of possible return points of the given function (or of the functions of the given module).

average_size

The domain of the query is a module or a function. Gives the average value of the given complexity metric (e.g. the average branches_of_recursion calculated over the functions of the given module).

max_length_of_line

The domain of the query is a module or a function. It gives the length of the longest line of the given module or function.

average_length_of_line

The domain of the query is a module or a function. It gives the average length of the lines within the given module or function.
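
The two line-length metrics can be sketched over a plain string. The module line_length_sketch is hypothetical, and whether empty lines take part in the average is an assumption (here they do).

```erlang
-module(line_length_sketch).
-export([max_length_of_line/1, average_length_of_line/1]).

%% Lengths of all lines of Text, split at newline characters.
lengths(Text) ->
    [length(L) || L <- string:split(Text, "\n", all)].

max_length_of_line(Text) ->
    lists:max(lengths(Text)).

%% The average is returned as a float.
average_length_of_line(Text) ->
    Ls = lengths(Text),
    lists:sum(Ls) / length(Ls).
```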

no_space_after_comma

The domain of the query is a module or a function. It gives the number of places in the text of the given module or function where a comma or a semicolon is not followed by whitespace.
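
A minimal sketch of this check on a plain string follows. The module comma_sketch is hypothetical, and two simplifications are assumptions: a separator that is the last character of the text is not counted, and separators inside string literals or comments are not treated specially.

```erlang
-module(comma_sketch).
-export([no_space_after_comma/1]).

no_space_after_comma(Text) ->
    count(Text, 0).

%% Look at each character together with its successor.
count([C, Next | Rest], N) ->
    Missing = lists:member(C, ",;")
        andalso not lists:member(Next, " \t\r\n"),
    case Missing of
        true -> count([Next | Rest], N + 1);
        false -> count([Next | Rest], N)
    end;
%% Zero or one character left: nothing more to check.
count(_, N) ->
    N.
```

For the text "f(A,B) ->\n    g(A, B),\n    ok;" only the comma in f(A,B) is reported, so the result is 1.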

is_tail_recursive

The domain of the query is a function. It returns 1 if the given function is tail recursive, 0 if it is recursive but not tail recursive, and -1 if it is not a recursive function (both direct and indirect recursion are examined). If we use this metric from the semantic query language, the result is converted to the atom tail_rec, non_tail_rec or non_rec.
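
The three cases can be illustrated with small functions, together with the conversion from the numeric result to the atoms used by the semantic query language. The module tail_rec_sketch and all function names in it are hypothetical; the comments state the value we would expect from the definition above.

```erlang
-module(tail_rec_sketch).
-export([sum/1, sum_acc/2, len/1, to_atom/1]).

%% Recursive but not tail recursive: the addition is performed after the
%% recursive call returns. Expected metric value: 0.
sum([H | T]) -> H + sum(T);
sum([]) -> 0.

%% Tail recursive: the recursive call is the last expression evaluated.
%% Expected metric value: 1.
sum_acc([H | T], Acc) -> sum_acc(T, Acc + H);
sum_acc([], Acc) -> Acc.

%% Not recursive. Expected metric value: -1.
len(L) -> length(L).

%% Conversion of the numeric result to the atom form.
to_atom(1) -> tail_rec;
to_atom(0) -> non_tail_rec;
to_atom(-1) -> non_rec.
```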

mcCabe

McCabe cyclomatic complexity metric. Available for modules and functions. We define it on the control flow graph of a function as the number of different execution paths of the function, namely the number of different outputs of the function.

otp_used

Gives the number of OTP callback modules used in modules.

Aggregations, filters on query results

We can extend our queries with filters. A filter can be a formatting or aggregating function, or a definition of the structure of the result.

The possible aggregation filters are listed below:

  • max: maximum on the result list
  • tolist: default return value of the query
  • totext: string format of the result
  • fmaxname: maximum with the name of the node
  • avg: average on the result list
  • min: minimum of the result list
  • sum: sum of the result list
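
What the numeric filters compute can be sketched on a result list of {NodeName, Value} pairs. The module agg_sketch and this result format are assumptions made for illustration; totext is left out because its exact output format is not described here.

```erlang
-module(agg_sketch).
-export([aggregate/2]).

%% Results is a non-empty list of {NodeName, Value} pairs.
aggregate(max, Results) -> lists:max(values(Results));
aggregate(min, Results) -> lists:min(values(Results));
aggregate(sum, Results) -> lists:sum(values(Results));
aggregate(avg, Results) -> lists:sum(values(Results)) / length(Results);
%% fmaxname: the maximum together with the name of its node.
aggregate(fmaxname, Results) ->
    Max = lists:max(values(Results)),
    lists:keyfind(Max, 2, Results);
%% tolist: the result list is returned unchanged.
aggregate(tolist, Results) -> Results.

values(Results) -> [V || {_Name, V} <- Results].
```

For the result list [{a, 3}, {b, 7}, {c, 5}], the sum filter gives 15 and fmaxname gives {b, 7}.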

Examples

Simple query

With the following query we count the number of functions of the modules given in the list.

show number_of_fun for module ('a','b')

where

  • number_of_fun: a function giving the number of functions
  • module: the type of node in the query
  • ('a','b'): contains the names of the modules on which we calculate the metric. If the type of the node is function, each element of the list must consist of the name of the module in which the function is defined, the name of the function, and its arity. In this case too, the list can have more than one element. For example, the tuple {'test','f',1} identifies a function that is defined in the test module, whose name is f and whose arity is 1.

Advanced query

In the next example we determine the number of recursive calls of two functions defined in the a module, that is, the number of branches on which each function calls itself, and we sum up the two results with the help of the sum aggregating function.

show branches_of_recursion for function ({'a','f',1},{'a','g',0}) sum

At the end of a query we can place filters, which either filter the results received at the output or act as aggregating functions that change the result of the query.

Using metric queries from different interfaces

Metric queries can be executed from the console interface. To learn about the usage, please visit this page. Metrics can also be used as properties in our semantic query language, which is available from every interface.

Metric analyser mode

RefactorErl supports querying bad smells in the code. This feature depends on the metric analyser, which is disabled by default.

The metric analyser mode has to be enabled manually. To learn about the commands, please visit the following page.

When metrics mode is turned on, RefactorErl initializes the internal metrics representation by creating the necessary tables, and loading them with the available module and function nodes. It also calculates the initial values of the metrics.

The limits of the metrics can currently be configured by editing the file metricmod.defs. An example of the current format of the file is shown below:

{module_metrics, [{line_of_code, {100, 1000}},
                  {char_of_code, {100, 60000}},
                  {number_of_fun, {0, 10}}, ...]}.
{function_metrics, [{line_of_code, {0, 20}},
                    {char_of_code, {0, 600}},
                    {function_sum, {0, infty}}, ...]}.

The file contains two Erlang terms, one for the module level metrics and another for the function level metrics. The metric analyser has built-in defaults; any options given here override the defaults. For a given metric, a lower and an upper limit can be given. In the example above, the limits on the lines of code of a module are overridden so that a module is considered correct only if its number of lines is between 100 and 1000.

For example, if module a has 5 lines of code and the configured range is 10..20, the value does not fit the range, so module a has a bad smell, which can be queried.
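
The range check behind this example can be sketched as follows. The module limits_sketch and its functions are hypothetical, the limits are assumed to be inclusive, and infty is taken to mean that there is no upper limit.

```erlang
-module(limits_sketch).
-export([within/2, bad_smells/2]).

%% True if Value lies in the range {Low, High}; both ends are assumed to be
%% inclusive, and the atom infty as High means "no upper limit".
within(Value, {Low, High}) ->
    Value >= Low andalso (High =:= infty orelse Value =< High).

%% Measured is a list of {Metric, Value} pairs, Limits is a list of
%% {Metric, {Low, High}} pairs. Returns the measurements that fall outside
%% their configured range; metrics without a configured limit are ignored.
bad_smells(Measured, Limits) ->
    [{Metric, Value} || {Metric, Value} <- Measured,
                        out_of_range(Metric, Value, Limits)].

out_of_range(Metric, Value, Limits) ->
    case lists:keyfind(Metric, 1, Limits) of
        {Metric, Range} -> not within(Value, Range);
        false -> false
    end.
```

With these definitions, within(5, {10, 20}) is false, which matches the module a example above.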