Update README.md
tobhey authored Aug 12, 2021
1 parent e75b86d commit fb2c98a
Showing 1 changed file with 8 additions and 2 deletions.
10 changes: 8 additions & 2 deletions README.md
@@ -20,6 +20,8 @@ Open the [Jupyter notebook](./finegrained-traceability.ipynb) in Google Colab an

See [INSTALL](./INSTALL.md) for the local installation.

## Process

Since instantiating the fastText model and the spaCy lemmatizer is expensive, the trace link recovery process is split into two phases: precalculation and trace link processing.
The following image shows a simplified architecture of the process.

@@ -56,12 +58,16 @@ This phase consists of the following activities:
* Apply similarity threshold filters
* Do the evaluation: calculate F1 or Mean Average Precision (MAP)
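
The last two activities can be illustrated with a minimal sketch. It is not the repository's implementation: the function names, the `(requirement, code element)` link representation, and the choice of `>=` for the threshold comparison are all assumptions made for illustration, and the MAP variant shown is the common textbook definition, which may differ in detail from the one used in the paper.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical link representation: (requirement_id, code_element_id).
Link = Tuple[str, str]


def filter_links(similarities: Dict[Link, float], threshold: float) -> Set[Link]:
    """Keep only candidate links whose similarity reaches the threshold (assumed >=)."""
    return {link for link, score in similarities.items() if score >= threshold}


def f1_score(predicted: Set[Link], gold: Set[Link]) -> float:
    """Harmonic mean of precision and recall against the gold standard links."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def mean_average_precision(
    rankings: Dict[str, List[str]], gold: Dict[str, Set[str]]
) -> float:
    """Average, over all queries with gold links, of the average precision of each ranking."""
    average_precisions = []
    for query, ranked_candidates in rankings.items():
        relevant = gold.get(query, set())
        if not relevant:
            continue  # queries without gold links do not contribute
        hits = 0
        precision_sum = 0.0
        for rank, candidate in enumerate(ranked_candidates, start=1):
            if candidate in relevant:
                hits += 1
                precision_sum += hits / rank
        average_precisions.append(precision_sum / len(relevant))
    if not average_precisions:
        return 0.0
    return sum(average_precisions) / len(average_precisions)
```

F1 is computed on the thresholded set of links, whereas MAP works on the ranked candidate list per requirement, so it does not need a final cut-off.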

There is a corresponding [TraceabilityRunner](./TraceabilityRunner.py) for each configuration in the paper, which performs this setup automatically. Since the precalculated files are also included in the repository, you only have to instantiate a `TraceabilityRunner` and call the `calculate_f1` or the `calculate_map` method. See [App.py](./App.py) or the [Jupyter notebook](./finegrained-traceability.ipynb) for an example.

**Note 1:** Both `calculate_f1` and `calculate_map` have the optional parameters `matrix_file_path` and `artifact_map_file_path` to locate the precalculated files. The default value is `None` which means that the precalculated files are loaded from their default locations.
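
The `None` default described in Note 1 follows a common Python pattern, sketched below. The helper `resolve_matrix_path` and the path in `DEFAULT_MATRIX_FILE` are hypothetical and exist only for this illustration; the repository's actual default locations are not shown here.

```python
from pathlib import Path
from typing import Optional

# Hypothetical default location, chosen only for this illustration.
DEFAULT_MATRIX_FILE = Path("precalculated") / "matrix.csv"


def resolve_matrix_path(matrix_file_path: Optional[str] = None) -> Path:
    """Return the explicit path if one is given, otherwise the default location."""
    if matrix_file_path is None:
        return DEFAULT_MATRIX_FILE
    return Path(matrix_file_path)
```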

**Note 2:** Itrust does not use use case templates. Therefore, running Itrust with any `BaseLineUCT*Runner` will not work.

## Step-by-Step

A step-by-step description of how to reproduce the results in the paper is given in the [Jupyter notebook](./finegrained-traceability.ipynb).

## Datasets

This repository contains four datasets:
@@ -86,4 +92,4 @@ The datasets can be found in the [datasets](./datasets/) folder.

The original SMOS and eAnci datasets can be attributed to Gethers et al., On integrating orthogonal information retrieval methods to improve traceability recovery. In 2011 27th IEEE International Conference on Software Maintenance (ICSM), Sep. 2011. Available: https://doi.org/10.1109/ICSM.2011.6080780

The original eTour dataset was provided for the TEFSE challenge at the 6th International Workshop on Traceability in Emerging Forms of Software Engineering (TEFSE), 2011, and was retrieved from http://coest.org/
