Mohsen Vakilian's Blog

April 25, 2008

Change Distilling

Filed under: evolution — mohsenvakilian @ 9:52 pm

The ChangeDistiller is an Eclipse plugin based on an algorithm described in a paper entitled “Change Distilling: Tree Differencing for Fine Grained Source Code Change Extraction“. It’s an improvement over the existing algorithm by Chawathe et al. for change detection in hierarchically structured information.

Chawathe et al. extract source code changes based on tree differencing. The algorithm splits the problem into two tasks:

  • Finding a “good” matching between the nodes of the trees T1 and T2.
  • Finding a minimum “conforming” edit script that transforms T1 into T2, given the computed matching.

Chawathe et al. define two fundamental matching criteria (one for leaves and one for inner nodes) necessary for the algorithm to produce a “good” matching set with which a minimum conforming edit script is achieved. The assumption behind these matching criteria is that:

For each leaf in T1 only one similar leaf in T2 can be found.

The problem is that this assumption fails for source codes. Thus, we don’t get a good matching due to mismatch propagation. Consequently, the resulting edit script won’t be optimum anymore. When the assumption fails, the algorithm applies a postprocessing step. However, Fluri et al. claim that there are a number of tree constellations in which the postprocessing step doesn’t improve the matching.

To solve these shortcomings, several options are proposed:

  1. Using best match instead of first match for leaves
  2. Using bigrams as string similarity matching instead of Levenshtein
  3. Trying out different values for thresholds
  4. Enabling dynamic thresholds
  5. Using inner node similarity weighting

They evaluated each configuration against the benchmark they had collected from ArgoUML, Azureus, and JDT.

In their implementation, they use the Eclipse compare plug-in to extract the methods and attributes that have changed. This prefiltering results lead to smaller trees for comparison. ASTs of both revisions are created using the AST visitor from JDT. And, the intermediate ASTs T1 and T2 are then fed into the change distilling algorithm. The output is a set of basic tree edit operations.

In spite of various improvements they used, the algorithm is still limited in finding the appropriate number of move operations. They tried to make their edit script optimum but not so readable. The script is described in terms of basic AST operations but it could be presented in a more intuitive way.



  1. Hi, Mohsen,

    Have you tried “ChangeDistiller”?
    I can’t find any download options from the links above.
    If you know how to download this Eclipse plugin, please let me know.
    Thank you!

    – Cox

    Comment by Cox Chen — May 6, 2008 @ 9:07 am | Reply

  2. I contacted Beat Fluri and he provided me the source code for the algorithm. But, he told me that running the distiller on arbitrary projects needs the whole Evolizer framework. And, Evolizer wasn’t available for public at that time.

    Comment by mohsenvakilian — May 6, 2008 @ 9:51 am | Reply

  3. Thank you for your information.

    Comment by Cox Chen — May 6, 2008 @ 10:01 pm | Reply

  4. Hi Mohsen,

    May you please provide me the contact details of Beat Fluri or kindly send me the source code to wzedan[at]


    Comment by Waleed Zedan — June 26, 2008 @ 3:19 am | Reply

  5. You may follow the first link listed by google when you search for “Beat Fluri” to find his homepage.

    Comment by mohsenvakilian — June 26, 2008 @ 9:01 pm | Reply

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

Create a free website or blog at

%d bloggers like this: