xlab-uiuc/ciri

Artifact release for the paper "Large Language Models as Configuration Validators".

Artifact organization

.
├── ciri/                   # Ciri engine
├── gallary/                # Gallery for the repository
├── icse25_data/            # Experiment data for paper reproduction
├── requirements.txt        # Python dependencies
├── LICENSE                 # License for the code
└── README.md               # README for the code

All the experiment data necessary to reproduce the figures and tables in the paper's evaluation section (Section 5) is included in icse25_data.

Running Ciri

📜 Overview

Ciri is an advanced LLM-driven configuration validation framework, and also serves as an open platform for future research.

🚨 Prerequisites

GPT-4 & GPT-3.5: Please follow the instructions on the official OpenAI website to set the API key.

Claude-3-Opus & Claude-3-Sonnet: Please follow the instructions on the official Claude AI website to set the API key.

CodeLlama 7B/13B/34B & DeepSeek: Please make sure you have the computational resources to run the model.
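
Before starting a run, it can help to check that the relevant key is visible to the process. The sketch below is not part of Ciri. It assumes the keys are exposed through the environment variables that the official OpenAI and Anthropic SDKs read (OPENAI_API_KEY and ANTHROPIC_API_KEY); whether Ciri reads these exact variables is an assumption, and the helper name missing_api_key is hypothetical.

```python
import os

# Environment variable names read by the official OpenAI and Anthropic SDKs.
# Assumption: Ciri relies on the same variables; check the Ciri source to confirm.
REQUIRED_KEYS = {
    "gpt": "OPENAI_API_KEY",
    "claude": "ANTHROPIC_API_KEY",
}


def missing_api_key(model_name, environ=None):
    """Return the env var that is missing for this model, or None if nothing is missing.

    Models that match no known prefix (e.g. codellama-34b) are treated as
    needing no API key.
    """
    environ = os.environ if environ is None else environ
    for prefix, var in REQUIRED_KEYS.items():
        if model_name.lower().startswith(prefix) and not environ.get(var):
            return var
    return None


if __name__ == "__main__":
    var = missing_api_key("gpt-4-0125-preview")
    if var:
        print(f"Please set {var} before running Ciri with this model.")
```

Passing a dictionary as environ makes the check easy to exercise without touching the real environment.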

🔥 Quick Start

git clone https://github.com/xlab-uiuc/ciri.git
cd ciri
pip install -r requirements.txt

Running an example:

python3 -m ciri.ciri_eng --input_path ciri/input/sample_input --output_path ciri/output/sample_output --model {model_name} --system hcommon --version 3.3.0

The output should be similar to:

[Ciri] Start
[Ciri] Running for file ciri/input/sample_input/hcommon_ex.xml
[Ciri] Result: There are 1 misconfiguration parameters in the input: fs.default.name
[Ciri] Reason for fs.default.name: The property 'fs.default.name' has the value 'file//' which does not follow the correct URI format.
[Ciri] Writing log file to ciri/output/sample_output/hcommon_ex.xml
[Ciri] End
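
If you want to post-process this console output, the following is a minimal sketch of a parser for the two line shapes shown above ("Result" and "Reason for"). It is based only on the sample output; the function name parse_ciri_output is hypothetical, and the assumption that multiple parameters are separated by commas is not confirmed by this README.

```python
import re

# Patterns derived from the sample output above; other output shapes are not handled.
RESULT_RE = re.compile(
    r"^\[Ciri\] Result: There are (\d+) misconfiguration parameters in the input: (.*)$"
)
REASON_RE = re.compile(r"^\[Ciri\] Reason for ([^:]+): (.*)$")


def parse_ciri_output(text):
    """Return (parameter_names, {parameter_name: reason}) from Ciri console output."""
    params, reasons = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        match = RESULT_RE.match(line)
        if match:
            # Assumption: several parameters would be listed comma-separated.
            params = [p.strip() for p in match.group(2).split(",") if p.strip()]
            continue
        match = REASON_RE.match(line)
        if match:
            reasons[match.group(1).strip()] = match.group(2).strip()
    return params, reasons


sample = """[Ciri] Start
[Ciri] Running for file ciri/input/sample_input/hcommon_ex.xml
[Ciri] Result: There are 1 misconfiguration parameters in the input: fs.default.name
[Ciri] Reason for fs.default.name: The property 'fs.default.name' has the value 'file//' which does not follow the correct URI format.
[Ciri] End"""

params, reasons = parse_ciri_output(sample)
print(params)  # ['fs.default.name']
```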

🤔 Running Ciri with customized features

| Parameter | Description | Options |
| --- | --- | --- |
| input-path | Directory or file path for the configuration files | directory or file path |
| output-path | Directory where the output will be saved | directory |
| model | LLM model to be queried | "Model Name" in Supported Models |
| system | Name of the evaluated system | E.g., hcommon |
| version | Specific version of the system | E.g., 3.3.0 |
| validconfig_shot_num | Number of valid config shots to use | E.g., 1 |
| misconfig_shot_num | Number of misconfig shots to use | E.g., 3 |
| file_format | Format of the config file | E.g., XML, INI |
| code-retrieval-path | Path to the code repository | directory or file path |
| language | Project programming language | E.g., java |
| shot_system | System name for sourcing alternate shot | E.g., hcommon |
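
For scripted runs, the command line can be assembled programmatically. The sketch below uses only the five flags that appear in the Quick Start command (--input_path, --output_path, --model, --system, --version); the exact flag spellings for the remaining parameters are not shown in this README, so they are left out. The helper name build_ciri_command is hypothetical.

```python
import shlex


def build_ciri_command(input_path, output_path, model, system, version):
    """Build the argument list for a Ciri run, mirroring the Quick Start command."""
    return [
        "python3", "-m", "ciri.ciri_eng",
        "--input_path", input_path,
        "--output_path", output_path,
        "--model", model,
        "--system", system,
        "--version", version,
    ]


cmd = build_ciri_command(
    input_path="ciri/input/sample_input",
    output_path="ciri/output/sample_output",
    model="gpt-4-0125-preview",
    system="hcommon",
    version="3.3.0",
)
# Print a shell-ready string; pass the list itself to subprocess.run to execute it.
print(shlex.join(cmd))
```

Keeping the command as a list avoids shell quoting problems when paths contain spaces.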

🚀 Supported Models

| Models | Status | Model Name |
| --- | --- | --- |
| GPT-4 | | gpt-4-0125-preview |
| GPT-3.5 | | gpt-3.5-turbo-0125 |
| Claude-3-Opus | | |
| Claude-3-Sonnet | | |
| CodeLlama 7B/13B/34B | | codellama-34b |
| DeepSeek | | deepseek-coder |

✅: Supported; 🔨: Coming soon;

Experiment Reproduction

Important

Please note that remote models are frequently updated, which may impact their capabilities and results. For reproducibility, we have provided all raw results generated by the models along with reproduction scripts in the icse25_data directory.

Effectiveness of Ciri (Table 5 & 6)

With its default setting (three misconfig shots and one validconfig shot), Ciri demonstrates the effectiveness of state-of-the-art LLMs as configuration validators. For the project {project} and model {model}, you can run the script result_parser.py to reproduce the results listed in Finding 1 and Table 5.

python3 icse25_data/script/result_parser.py --project {project} --model {model} --mode default 

For example, if you want to get the results when running Ciri on project hcommon with model gpt-4, you can run the script:

python3 icse25_data/script/result_parser.py --project hcommon --model gpt-4 --mode default 

For the real-world misconfigurations (Finding 2), please refer to the icse25_data/results/real_world_misconfig folder.

Effectiveness of few-shot learning (Table 7 & 8, Figure 4)

Using configuration data as shots improves LLMs’ effectiveness at configuration validation. To evaluate Ciri without few-shot learning, we run Ciri on the same 10 projects with the same models, without using any configuration data as shots.

For the project {project} and model {model}, you can run the script result_parser.py to reproduce the results.

python3 icse25_data/script/result_parser.py --project {project} --model {model} --mode zero_shot 

For example, if you want to get the results of the project hcommon and model gpt-4, you can run the script:

python3 icse25_data/script/result_parser.py --project hcommon --model gpt-4 --mode zero_shot

To reproduce the results in Figure 4, you can run the script with the mode shot-v{number of validconfig shots}m{number of misconfig shots}.

To reproduce the results in Table 8, you can run the script with the mode shot_from_{project that provides the shots}.
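
The mode strings described above follow simple patterns, so they can be generated rather than typed by hand. This is a minimal sketch; the helper names are hypothetical, and the set of shot counts and projects for which raw results actually exist in icse25_data is not listed in this README.

```python
def few_shot_mode(validconfig_shots, misconfig_shots):
    """Mode string for Figure 4: shot-v{validconfig shots}m{misconfig shots}."""
    return f"shot-v{validconfig_shots}m{misconfig_shots}"


def cross_project_mode(shot_project):
    """Mode string for Table 8: shot_from_{project that provides the shots}."""
    return f"shot_from_{shot_project}"


def result_parser_command(project, model, mode):
    """Argument list for result_parser.py, using the flags shown in this README."""
    return [
        "python3", "icse25_data/script/result_parser.py",
        "--project", project,
        "--model", model,
        "--mode", mode,
    ]


# Example: one validconfig shot and three misconfig shots for hcommon with gpt-4.
print(" ".join(result_parser_command("hcommon", "gpt-4", few_shot_mode(1, 3))))
```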

Bias (Figure 8 & 9)

G-hits

Please refer to icse25_data/g_hits, which contains Google search result counts collected on August 6th, 2023. For example, searching for "hadoop.common.configuration.version" returns a certain number of results. Note that these counts may have changed since the data was collected, as Google's index is continuously updated.

To reproduce the results in Figure 8, you can run the script get_figure8.py.

python3 icse25_data/script/get_figure8.py

To reproduce the results in Figure 9, you can run the script get_figure9.py.

python3 icse25_data/script/get_figure9.py
