Skip to content
This repository has been archived by the owner on Oct 3, 2021. It is now read-only.

create preprocessing_script.sh #1303

Open
wants to merge 12 commits into
base: master
Choose a base branch
from
Original file line number Diff line number Diff line change
@@ -0,0 +1,92 @@
#!/bin/bash
kfriedberger marked this conversation as resolved.
Show resolved Hide resolved

kfriedberger marked this conversation as resolved.
Show resolved Hide resolved
# This file is part of the SV-Benchmarks collection of verification tasks:
# https://github.com/sosy-lab/sv-benchmarks
# SPDX-FileCopyrightText: 2020 The SV-Benchmarks Community
# SPDX-License-Identifier: CC0-1.0

# This script can only be called from sv-benchmarks/c/Juliet_Test/Juliet_Test_Suite_v1.3_for_C_Cpp/C/testcases
kfriedberger marked this conversation as resolved.
Show resolved Hide resolved
# Running this script the .c files in each CWE directory are preprocessed and a _bad.i , _good.i and .yml file are created for each .c file.
kfriedberger marked this conversation as resolved.
Show resolved Hide resolved
# The preprocessed/yml files are saved in the preprocessed folder in each CWE directory in testcases.
# Files with a windows.h header can only be preprocessed on a windows system.


# options for preprocessing
OPTIONS="-E -P -D INCLUDEMAIN=1 -m64 -I "../testcasesupport""

kfriedberger marked this conversation as resolved.
Show resolved Hide resolved
# property_file
property_file=" "

#for loop runs over all folders in testcases
for folder in */
kfriedberger marked this conversation as resolved.
Show resolved Hide resolved
do
# get the matching property_file for the CWE-Id
cwe_folder="$(basename $folder)" # get the plain directory name
cwe="${cwe_folder%${cwe_folder#??????}}" # extract the first 6 chars from the string
kfriedberger marked this conversation as resolved.
Show resolved Hide resolved

case "$cwe" in

"CWE191")
property_file="../../properties/no-overflow.prp"
;;

"CWE119" | "CWE125" | "CWE415" | "CWE401" | "CWE416" | "CWE762" | "CWE787" | "CWE843")
property_file="../../properties/valid-memorysafety.prp"
;;

*)
property_file=""
;;

esac

# if propertyfile is empty then skip the directory otherwise analyse/preprocess the directory
if [ -z "$property_file" ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Can't this logging and continue action just simply happen in the default case of the case instruction? This would simplify the code, no need to check for the content of "$property_file", so readability would be improved

then
echo "SKIPPING directory '${folder}' because of missing propertyfile"
continue
fi

echo "PROCESSING directory '${folder}' with property file '${property_file}'"

output_folder="${folder}/preprocessed"
mkdir -p "${output_folder}"
kfriedberger marked this conversation as resolved.
Show resolved Hide resolved

#while loop through all c files in the current folder
find "${folder}" -iname "*.c" | while read f
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: make sure that the files really contain a main function

Context: Unfortunately not all of those c files seem to contain a main function. For example for CWE191 I found 418 .c files mentioning "main" and 198 without that:

user@machine: /sv-benchmarks/c/Juliet_Test/Juliet_Test_Suite_v1.3_for_C_Cpp/C/testcases/CWE191_Integer_Underflow/s01 $ grep main *.c -l |wc -l
418
user@machine: /sv-benchmarks/c/Juliet_Test/Juliet_Test_Suite_v1.3_for_C_Cpp/C/testcases/CWE191_Integer_Underflow/s01 $ grep main *.c -L |wc -l
198

For verification tasks I think we need a main function. Those seem to be all which end in [0-9a].c. Those that end in [b-z].c seem to be the ones without main function. At least in this folder at which I looked.

do
echo -ne "." # one dot per file to see progress
kfriedberger marked this conversation as resolved.
Show resolved Hide resolved

# get the plain file name
file="$(basename $f)"
badfile="${file%.c}_bad"
goodfile="${file%.c}_good"

# get the preprocessed file with OMITGOOD=1/OMITBAD=1
gcc -D OMITGOOD=1 $OPTIONS "$f" -o "${output_folder}/${badfile}.i"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: why do we need to omit the good code for generating the "bad" tasks? Couldn't we just leave that code in there or maybe have two versions of the bad code, one with the good behavior still there and one without it?

gcc -D OMITBAD=1 $OPTIONS "$f" -o "${output_folder}/${goodfile}.i"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: determine whether preprocessing actually succeeded and only write a yml file in this case, i.e., if the return code is 0.


#write the .yml files
echo "format_version: '2.0'
input_files: '${badfile}.i'
properties:
- property_file: '$property_file'
expected_verdict: false
options:
language: C
data_model: LP64
" > "${output_folder}/${badfile}.yml"

echo "format_version: '2.0'
input_files: '${goodfile}.i'
properties:
- property_file: '$property_file'
expected_verdict: true
options:
language: C
data_model: LP64
" > "${output_folder}/${goodfile}.yml"
kfriedberger marked this conversation as resolved.
Show resolved Hide resolved

done
echo "" # newline after the dots above
done