two projects working
pbharrin committed Sep 18, 2023
1 parent 6a84c66 commit 38dd734
Showing 3 changed files with 29 additions and 12 deletions.
17 changes: 17 additions & 0 deletions evals/EVAL_NEW_CODE_RESULTS.md
@@ -12,3 +12,20 @@
|:---------------------------|:-------------|:------------------------------------|:-------|
| projects/password_gen_eval | password_gen | check_executable_exits_normally ||
| projects/password_gen_eval | password_gen | check_executable_satisfies_function ||
## 2023-09-18

### Existing Code Evaluation Summary:

| Project | Evaluation | All Tests Pass |
|:----------------------------|:-------------------|:-----------------|
| projects/currency_converter | currency_converter ||
| projects/password_gen_eval | password_gen ||

### Detailed Test Results:

| Project | Evaluation | Test | Pass |
|:----------------------------|:-------------------|:------------------------------------|:-------|
| projects/currency_converter | currency_converter | check_executable_exits_normally ||
| projects/currency_converter | currency_converter | check_executable_satisfies_function ||
| projects/password_gen_eval | password_gen | check_executable_exits_normally ||
| projects/password_gen_eval | password_gen | check_executable_satisfies_function ||
2 changes: 1 addition & 1 deletion evals/evals_new_code.py
@@ -58,7 +58,7 @@ def single_evaluate(eval_ob: dict) -> list[bool]:
process.wait() # we want to wait until it finishes.

print("running tests on the newly generated code")
- # TODO: test the code we should have an executable name
+ # test the code with the executable name in the config file
evaluation_results = []
for test_case in eval_ob["expected_results"]:
print(f"checking: {test_case['type']}")
22 changes: 11 additions & 11 deletions evals/new_code_eval.yaml
@@ -10,14 +10,14 @@ evaluations:
executable_name: "python currency.py"
executable_arguments: "USD CAD 10"
output_satisfies: "tf = lambda a : a.replace('.', '').isnumeric()"
- # - name: password_gen
- # project_root: "projects/password_gen_eval"
- # code_prompt: "Create a password generator CLI tool in Python that generates strong, random passwords based on user-specified criteria, such as length and character types (letters, numbers, symbols). The password generator should be a python program named passwordgenerator.py with two arguments: length, and character types. The character types argument can be one or more of the the following: l for lowercase, u for uppercase, d for digits, and s for symbols."
- # expected_results:
- # - type: check_executable_exits_normally
- # executable_name: "python passwordgenerator.py"
- # executable_arguments: "10 d"
- # - type: check_executable_satisfies_function
- # executable_name: "python passwordgenerator.py"
- # executable_arguments: "10 d"
- # output_satisfies: "tf = lambda a : len(a) == 10"
+ - name: password_gen
+ project_root: "projects/password_gen_eval"
+ code_prompt: "Create a password generator CLI tool in Python that generates strong, random passwords based on user-specified criteria, such as length and character types (letters, numbers, symbols). The password generator should be a python program named passwordgenerator.py with two arguments: length, and character types. The character types argument can be one or more of the the following: l for lowercase, u for uppercase, d for digits, and s for symbols."
+ expected_results:
+ - type: check_executable_exits_normally
+ executable_name: "python passwordgenerator.py"
+ executable_arguments: "10 d"
+ - type: check_executable_satisfies_function
+ executable_name: "python passwordgenerator.py"
+ executable_arguments: "10 d"
+ output_satisfies: "tf = lambda a : len(a) == 10"
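
The two test types in this config, `check_executable_exits_normally` and `check_executable_satisfies_function`, could be evaluated roughly as sketched below. This is an illustration only, not the code in `evals/evals_new_code.py`: the helper names (`load_predicate`, `run_command`, `check_exits_normally`, `check_satisfies`) are hypothetical, and it assumes that the `output_satisfies` string is executed as Python and that the name `tf` it binds is called on the program's stripped standard output.

```python
import shlex
import subprocess
import sys
from typing import Callable, Optional


def load_predicate(source: str, name: str = "tf") -> Callable[[str], bool]:
    """Execute a snippet such as "tf = lambda a : len(a) == 10" and
    return the callable it binds to `name` (assumed to be `tf`)."""
    namespace: dict = {}
    # exec runs arbitrary code, so this is only acceptable for trusted configs.
    exec(source, namespace)
    return namespace[name]


def run_command(
    executable_name: str,
    executable_arguments: str,
    cwd: Optional[str] = None,
) -> subprocess.CompletedProcess:
    """Run e.g. "python passwordgenerator.py" with "10 d" and capture its output."""
    command = shlex.split(executable_name) + shlex.split(executable_arguments)
    return subprocess.run(
        command, cwd=cwd, capture_output=True, text=True, timeout=30
    )


def check_exits_normally(result: subprocess.CompletedProcess) -> bool:
    """A zero exit status is treated as a normal exit."""
    return result.returncode == 0


def check_satisfies(result: subprocess.CompletedProcess, output_satisfies: str) -> bool:
    """Apply the configured predicate to the program's stripped stdout."""
    predicate = load_predicate(output_satisfies)
    return bool(predicate(result.stdout.strip()))


if __name__ == "__main__":
    # Stand-in for a generated program: print a fixed 10-character string.
    demo = run_command(shlex.quote(sys.executable), "-c \"print('1234567890')\"")
    print(check_exits_normally(demo))
    print(check_satisfies(demo, "tf = lambda a : len(a) == 10"))
```

Stripping stdout before applying the predicate matters for the `len(a) == 10` check, since a trailing newline would otherwise make a 10-character password count as 11 characters.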
