Replication package for Zero-Shot Cross-Domain Code Search without Fine-Tuning, FSE 2025.
-data # datasets and generated data
  -cross-domain
  -rapid
-models # base models
-scripts # scripts for data preprocessing and evaluation

We use the following models:
python==3.8.18
torch==2.0.1

- You can install the dependencies by:
pip install -r requirements.txt
- We have provided the generated files in data/. For instructions on generating the code and comments using DeepSeek-Coder-1.3b-Instruct, please refer to generate_code.py and generate_comment.py.
- To reproduce the results, start by obtaining the embeddings from the various models. The embeddings will be stored in the folder vectors. You can modify the dataset name in scripts/get_embedding.sh to get embeddings for a different dataset.
bash scripts/get_embedding.sh
- Evaluate the results.
bash scripts/eval.sh
- Get embeddings.
bash scripts/get_embedding_rapid.sh
- Evaluate the results.
bash scripts/eval_rapid.sh
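
For readers who want a feel for what an embedding-based evaluation step typically computes, here is a minimal sketch in pure Python. It is not the code behind scripts/eval.sh. It assumes that each query has exactly one correct snippet, that candidates are ranked by cosine similarity, and that the metrics of interest are MRR and Recall@k, which are common for code search but are not stated in this README. The names cosine, gold_rank and evaluate are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def gold_rank(query: Vector, codes: Sequence[Vector], gold_index: int) -> int:
    """1-based rank of the correct snippet among all candidates for one query."""
    scores = [cosine(query, code) for code in codes]
    gold_score = scores[gold_index]
    # The rank is one plus the number of other candidates that score strictly higher.
    better = sum(
        1 for i, s in enumerate(scores) if i != gold_index and s > gold_score
    )
    return better + 1


def evaluate(
    queries: Sequence[Vector],
    codes: Sequence[Vector],
    gold: Sequence[int],
    ks: Tuple[int, ...] = (1, 5, 10),
) -> Dict[str, float]:
    """Mean reciprocal rank and Recall@k; gold[i] is the index of the correct snippet for queries[i]."""
    if not queries or len(queries) != len(gold):
        raise ValueError("queries and gold must be non-empty and the same length")
    ranks = [gold_rank(q, codes, g) for q, g in zip(queries, gold)]
    n = len(ranks)
    metrics = {"MRR": sum(1.0 / r for r in ranks) / n}
    for k in ks:
        metrics[f"R@{k}"] = sum(1 for r in ranks if r <= k) / n
    return metrics


if __name__ == "__main__":
    # Toy 2-dimensional vectors standing in for real model embeddings.
    toy_queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    toy_codes = [[0.9, 0.1], [0.2, 0.8], [1.0, -1.0]]
    print(evaluate(toy_queries, toy_codes, gold=[0, 1, 2]))
```

Loading the vectors is left out on purpose, because the on-disk format of the files in the vectors folder is not described here. In practice the lists passed to evaluate would be replaced by whatever scripts/get_embedding.sh writes.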