Impact Analysis Heuristics

JRipples provides several modules that implement impact analysis (IA) heuristics. The heuristics estimate the change probability of each component and guide programmers to the components most likely to change during IA. Their results are displayed in the third column of the JRipples views. A short overview of these heuristics follows.

Important note. The supported heuristics operate at the granularity of classes and might not be available for components of other granularities. Also, the DBH and CCIR heuristics compute change probabilities relative to a selected active component. By default, the active component is the component whose mark was updated most recently; it can also be selected manually, in the same way as for the Active class filter menu.

Dependency Based Heuristic (DBH)

Dependency based heuristics examine interactions between the source code of classes: the more two classes interact, the more a change in one class is likely to propagate to the other. The interactions considered by these heuristics are the source code dependencies that can be extracted using static program analysis; these might include method calls, aggregation, inheritance and so forth.

An extensive survey of static dependency based heuristics for IA is given in [1]. The PIM heuristic (Method Invocations between classes, accounting for Polymorphism) performed particularly well in that evaluation and is selected for our case study.

Formally, let C(x,y) be a function that analyzes source code and counts the number of times a class x calls methods of a class y as well as methods of the base classes of y that are overloaded in y. Then, PIM(x,y)=C(x,y)+C(y,x).

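We further normalize PIM and refer to the result as the Dependency Based Heuristic (DBH): DBH=PIM / (PIM+1). Because PIM is a non-negative count, DBH is 0 when the two classes do not call each other and approaches 1 as the number of calls grows.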
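The two formulas above can be illustrated with a minimal sketch. It assumes the call counts C(x,y) have already been extracted by static analysis; the class names and counts below are made up for illustration and are not produced by JRipples.

```python
from collections import Counter

# Hypothetical call counts: calls[(x, y)] stands for C(x, y), the number of
# times class x calls methods of class y. Names and numbers are invented.
calls = Counter({
    ("Order", "Invoice"): 3,
    ("Invoice", "Order"): 1,
    ("Order", "Logger"): 1,
})


def pim(x, y):
    """PIM(x, y) = C(x, y) + C(y, x); symmetric in x and y."""
    # A Counter returns 0 for pairs that were never recorded.
    return calls[(x, y)] + calls[(y, x)]


def dbh(x, y):
    """DBH = PIM / (PIM + 1), which maps a count in [0, inf) into [0, 1)."""
    p = pim(x, y)
    return p / (p + 1)


print(dbh("Order", "Invoice"))   # PIM = 4
print(dbh("Order", "Logger"))    # PIM = 1
print(dbh("Invoice", "Logger"))  # PIM = 0, no interaction
```

With "Order" as the active component, "Invoice" would receive a higher score than "Logger" because the two classes exchange more calls.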

Change Request to Class Similarity Evaluated by Information Retrieval Heuristic (RCIR)

A change request is a document that describes an update required in a software system; it typically consists of a text describing the details of this update. The heuristic in [2] uses Information Retrieval (IR) [3] to compare the text of a change request to the text in the source code of classes and is based on the following assumption: terms appearing in the text of the change request also appear in the source code of the impacted classes.

The heuristic pre-processes the source code of classes to identify meaningful words; this may include splitting composite identifiers, removing language-specific stop-words, and so forth. The similarity between the text of the change request and the source code of each class identifies the relevant classes. Formally, let IR be a function that measures the similarity between the text of two documents on a scale between 0 and 1. Then, for a class c and a change request r, IR(c,r) is the value of the heuristic for the class c.
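The pre-processing and scoring steps described above can be sketched as follows. This is an illustration under stated assumptions, not the method of [2]: cosine similarity over raw term counts stands in for the IR function, the stop-word list is a tiny made-up sample, and the class sources and change request are invented.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list (assumption); a real list would be much larger.
STOP_WORDS = {"the", "a", "to", "of", "in", "is",
              "public", "void", "class", "return", "int", "new"}


def terms(text):
    """Split composite identifiers into lowercase words and drop stop-words."""
    # Insert a space at camelCase boundaries: "printInvoice" -> "print Invoice".
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)
    words = re.findall(r"[A-Za-z]+", spaced)
    return [w.lower() for w in words if w.lower() not in STOP_WORDS]


def ir(doc_a, doc_b):
    """Stand-in for IR: cosine similarity of term-count vectors, in [0, 1]."""
    a, b = Counter(terms(doc_a)), Counter(terms(doc_b))
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical change request r and class sources c.
change_request = "Add tax to the printed invoice total"
classes = {
    "InvoicePrinter": "class InvoicePrinter { void printInvoice(Invoice invoice)"
                      " { printTotal(invoice.getTotal()); } }",
    "UserSession": "class UserSession { void logoutUser(User user)"
                   " { session.close(); } }",
}

# IR(c, r) for every class c.
scores = {name: ir(source, change_request) for name, source in classes.items()}
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(name, round(score, 3))
```

In this example "InvoicePrinter" shares the terms "invoice" and "total" with the change request and is ranked first, while "UserSession" shares no terms and scores 0.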

Class To Class Similarity Evaluated By Information Retrieval Heuristic (CCIR)

The lsisem heuristic [4],[5] is based on the idea that classes sharing similar terms in their source code are likely to implement the same concepts; therefore, these classes are also likely to change together.

The lsisem heuristic pre-processes the source code of classes in the same way as RCIR to identify meaningful words. The change probability for a pair of classes is then measured based on the similarity of the source code of these classes. Formally, let IR be a function that measures the similarity between the text of two documents on a scale between 0 and 1. Then, for a class x and a class y, lsisem(x,y)=IR(x,y).
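As a rough sketch of how such pairwise scores could rank classes against an active class, consider the example below. It assumes the terms of each class have already been extracted, uses cosine similarity over term counts as a stand-in for the IR function of [4],[5], and works on invented class names and terms; the helper name ccir_ranking is hypothetical.

```python
import math
from collections import Counter


def ir(terms_a, terms_b):
    """Stand-in for IR: cosine similarity of term-count vectors, in [0, 1]."""
    a, b = Counter(terms_a), Counter(terms_b)
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values()) *
                     sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical, already pre-processed terms for each class.
class_terms = {
    "Invoice": ["invoice", "total", "tax", "line", "item"],
    "InvoicePrinter": ["invoice", "print", "total", "format"],
    "UserSession": ["user", "session", "login"],
}


def ccir_ranking(active, class_terms):
    """Score every other class y as IR(active, y), highest score first."""
    scores = {
        name: ir(class_terms[active], other_terms)
        for name, other_terms in class_terms.items()
        if name != active
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


ranking = ccir_ranking("Invoice", class_terms)
for name, score in ranking:
    print(name, round(score, 3))
```

Because the score is symmetric in x and y, choosing a different active class only changes which row of the pairwise similarity table is displayed.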

References