adelorenz edited this page Apr 28, 2015 · 1 revision

# Results summary

The application summarizes the result of an evolution in JSON:

  • best solution,
  • performance of the solution,
  • execution times,
  • list of the strings extracted by the solution,
  • full experiment configuration,
  • the list of dataset examples.

An example result report in JSON follows (repetitive parts have been replaced with ...); the individual fields are described afterwards:

{
  "datasetName": "Log/MAC",
  "methodDescription": "Web config",
  "comment": "This is an important experiment with an interesting configuration",
  "experimentDate": "Jul 24, 2014 3:45:30 PM",
  "machineHardwareSpecifications": "Intel(R) Core(TM)2 Duo CPU 	T6600  @ 2.20GHz",
  "bestSolution": {
    "solution": "(\\w++:\\w++:\\w++:\\w++:\\w++:\\w++)",
    "solutionJS": "((?=(\\w+))\\2:(?=(\\w+))\\3:(?=(\\w+))\\4:(?=(\\w+))\\5:(?=(\\w+))\\6:(?=(\\w+))\\7)",
    "trainingPerformances": {
    "match precision": 1.0,
    "character accuracy": 1.0,
    "character precision": 1.0,
    "match recall": 1.0,
    "character recall": 1.0,
    "match f-measure": 1.0
    },
    "validationPerformances": {
    "match precision": 1.0,
    "character accuracy": 1.0,
    "character precision": 1.0,
    "match recall": 1.0,
    "character recall": 1.0,
    "match f-measure": 1.0
    },
    "learningPerformances": {
    "match precision": 1.0,
    "character accuracy": 1.0,
    "character precision": 1.0,
    "match recall": 1.0,
    "character recall": 1.0,
    "match f-measure": 1.0
    },
    "fitness": [
    0.0,
    0.0,
    31.0
    ]
  },
  "bestExtractions": [
    [
    {"start": 119,"end": 136},
    {"start": 161,"end": 178}
    ],
    [
    {"start": 120,"end": 137},{"start": 162,"end": 179}
    ], ...
  ],
  "bestExtractionsStrings": [
    [
    "5c:0a:5b:63:4a:82",
    "50:06:04:92:53:44"
    ],
    [
    "00:21:dd:bc:95:44",
    "00:14:31:83:c6:8d"
    ], ...
  ],
  "examples": [
    {
    "string": "Jan 12 06:26:19: ACCEPT service http from 119.63.193.196 to firewall(pub-nic), prefix: \"none\" (in: eth0 119.63.193.196(5c:0a:5b:63:4a:82):4399 -> 140.105.63.164(50:06:04:92:53:44):80 TCP flags: ****S* len:60 ttl:32)",
    "match": … (refer to the dataset definition for the full example format)
  ],
  "bestStats": [
    {"fp": 0,"tp": 2},
    {"fp": 0,"tp": 2},
    {"fp": 0,"tp": 2}
  ],
  "overallExecutionTimeMillis": 5845,
  "numberMatches": 6,
  "numberUnmatches": 9,
  "numberMatchedChars": 102,
  "numberUnmatchedChars": 523,
  "numberAnnotatedChars": 625,
  "numberAllChars": 625,
  "numberTrainingMatches": 4,
  "numberTrainingUnmatches": 6,
  "characterEvaluations": 11234000,
  "configuration": {
    … (details of the evolution configuration)
  }
}
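
The relation between bestExtractions, bestExtractionsStrings and the example string can be checked with a short script. The sketch below is only an illustration and is not part of the application: the helper names slice_extractions and find_extractions are hypothetical, and Python is used only because its re module accepts the lookahead and backreference constructs of solutionJS. In this example the boundaries behave as half-open ranges (start inclusive, end exclusive).

```python
import re

# Example string from the report above (JSON escapes decoded).
example = (
    'Jan 12 06:26:19: ACCEPT service http from 119.63.193.196 to '
    'firewall(pub-nic), prefix: "none" (in: eth0 '
    '119.63.193.196(5c:0a:5b:63:4a:82):4399 -> '
    '140.105.63.164(50:06:04:92:53:44):80 TCP flags: ****S* len:60 ttl:32)'
)

# First entries of "bestExtractions" and "bestExtractionsStrings".
extractions = [{"start": 119, "end": 136}, {"start": 161, "end": 178}]
expected = ["5c:0a:5b:63:4a:82", "50:06:04:92:53:44"]

# "solutionJS" from the report (JSON escapes decoded). It uses only
# lookaheads and backreferences, with no possessive quantifiers.
solution_js = (
    r"((?=(\w+))\2:(?=(\w+))\3:(?=(\w+))\4:"
    r"(?=(\w+))\5:(?=(\w+))\6:(?=(\w+))\7)"
)


def slice_extractions(text, bounds):
    """Turn start/end boundaries into substrings (end is exclusive)."""
    return [text[b["start"]:b["end"]] for b in bounds]


def find_extractions(pattern, text):
    """Apply a regex and report each match as start/end boundaries."""
    return [{"start": m.start(), "end": m.end()}
            for m in re.finditer(pattern, text)]


sliced = slice_extractions(example, extractions)
found = find_extractions(solution_js, example)
print(sliced)
print(found)
```

Here the sliced substrings coincide with the first entry of bestExtractionsStrings, and applying solutionJS to the example string yields the same boundaries as the first entry of bestExtractions.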

In detail:

  • datasetName is a copy of the dataset name (when provided).
  • methodDescription contains optional details about the evolution, i.e., “Web config” when using the web application and “Console config” when using the console application.
  • comment contains optional operator notes about the evolution result; they are requested when the evolution starts.
  • experimentDate is the start date of the experiment.
  • machineHardwareSpecifications describes the hardware the evolution ran on.
  • bestSolution describes the generated solution:
    • solution is the regex in Java format;
    • solutionJS is the same regex in JavaScript format;
    • trainingPerformances, validationPerformances and learningPerformances give the performance separately for the training set, the validation set and the learning set (i.e., the union of training and validation). match precision, match recall and match f-measure are computed at the match level; character precision, character recall and character accuracy are computed at the level of individual characters.
  • bestExtractions lists the extractions of the bestSolution, one inner list per example; each extraction is represented as a pair of boundaries (start, end) in the corresponding example.
  • bestExtractionsStrings is the same as above, except that each extraction is given as a substring.
  • bestExtractionsStats (it appears as bestStats in the example above) provides a list of per-example stats: the number of true positive and false positive extractions (an extraction is a true positive when it equals one match).
  • examples is a copy of the dataset examples (this field is optional for the console application and can be omitted).
  • overallExecutionTimeMillis is the overall execution time in milliseconds.
  • characterEvaluations is the overall number of character evaluations performed by the regex engine (this is the most computationally intensive task in the application).
  • numberMatches, numberUnmatches, numberMatchedChars, numberUnmatchedChars, numberAnnotatedChars, numberAllChars, numberTrainingMatches, numberTrainingUnmatches are quantitative indices describing the dataset used.
  • configuration is a copy of the evolution configuration, including the number of generations, population size, termination criteria settings and GP engine internal details.
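
To show how the per-example stats relate to the match-level performance figures, the sketch below aggregates tp and fp counts into precision, recall and F-measure. It assumes the standard information-retrieval definitions (precision = tp / (tp + fp), recall = tp / number of matches, F-measure = harmonic mean of the two); this page does not spell out the formulas the application uses, and match_metrics is a hypothetical helper rather than part of the application.

```python
def match_metrics(stats, number_matches):
    """Aggregate per-example tp/fp counts into (precision, recall, f-measure).

    Assumes the standard definitions; ratios with a zero denominator
    are reported as 0.0.
    """
    tp = sum(s["tp"] for s in stats)
    fp = sum(s["fp"] for s in stats)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / number_matches if number_matches else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure


# Values copied from the example report: "bestStats" and "numberMatches".
best_stats = [{"fp": 0, "tp": 2}, {"fp": 0, "tp": 2}, {"fp": 0, "tp": 2}]
metrics = match_metrics(best_stats, 6)
print(metrics)  # (1.0, 1.0, 1.0)
```

With the example values, all six matches are extracted and there are no false positives, which is consistent with the match precision, match recall and match f-measure of 1.0 reported under learningPerformances.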