Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add final evaluation #84

Merged
merged 8 commits into from
Oct 15, 2024
Merged

feat: add final evaluation #84

merged 8 commits into from
Oct 15, 2024

Conversation

leseb
Copy link
Collaborator

@leseb leseb commented Oct 10, 2024

8ff3780 feat: add final evaluation
113078c wip:final eval
fd52c08 update sdg generated line in knowledge-yaml
fc7067b fix typo on task generate
339ae90 fix: do not use backticks
5e74166 fix: add final details to push model to S3
a28d739 bulk: mt-bench eval, final eval and trained model push to S3
0885fa5 chore: remove helpers directory

commit 8ff3780
Author: Sébastien Han [email protected]
Date: Thu Oct 10 17:39:25 2024 +0200

feat: add final evaluation

Signed-off-by: Sébastien Han <[email protected]>

commit 113078c
Author: Michael Clifford [email protected]
Date: Thu Oct 10 20:04:04 2024 -0400

wip:final eval

Signed-off-by: Michael Clifford <[email protected]>

commit fd52c08
Author: sallyom [email protected]
Date: Fri Oct 11 13:21:38 2024 -0400

update sdg generated line in knowledge-yaml

Signed-off-by: sallyom <[email protected]>

commit fc7067b
Author: Sébastien Han [email protected]
Date: Mon Oct 14 09:40:50 2024 +0200

fix typo on task generate

It used to be `/data/generated`.

Signed-off-by: Sébastien Han <[email protected]>

commit 339ae90
Author: Sébastien Han [email protected]
Date: Mon Oct 14 09:41:33 2024 +0200

fix: do not use backticks

The shell from the python executor will interpret the content of the
backtick.

Signed-off-by: Sébastien Han <[email protected]>

commit 5e74166
Author: Sébastien Han [email protected]
Date: Mon Oct 14 09:43:24 2024 +0200

fix: add final details to push model to S3

Signed-off-by: Sébastien Han <[email protected]>

commit a28d739
Author: Sébastien Han [email protected]
Date: Mon Oct 14 23:00:15 2024 +0200

bulk: mt-bench eval, final eval and trained model push to S3

- do not print final eval scores in logs
- use the correct model location for final push
- fix job/cr watch
- add progress bar for download/upload
- add a few globale variables for eval steps
- handle the case where pods are removed but the job exists

Signed-off-by: Sébastien Han <[email protected]>

commit 0885fa5
Author: Sébastien Han [email protected]
Date: Tue Oct 15 17:12:32 2024 +0200

chore: remove helpers directory

THe functions from the helper are embedded into the components
themselves. So they are left unused now.

Signed-off-by: Sébastien Han <[email protected]>

updated_lines.append(new_line)

if changed:
with open(file_path, "w", encoding="utf-8") as file:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MichaelClifford the "utf-8" to fix the check_printable error, 🙏

@sallyom sallyom force-pushed the final-eval branch 2 times, most recently from 55dfafd to 64c9224 Compare October 11, 2024 21:27
@leseb leseb marked this pull request as ready for review October 14, 2024 07:44
@leseb leseb force-pushed the final-eval branch 5 times, most recently from bc44932 to 6aa2b1b Compare October 14, 2024 21:04
Copy link
Member

@tumido tumido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few nits. Haven't test run the code though. 🙂

Comment on lines 54 to 71
if gpu_count > 0:
command = [
sys.executable,
"-m",
"vllm.entrypoints.openai.api_server",
"--model",
model_path,
"--tensor-parallel-size",
str(gpu_count),
]
else:
command = [
sys.executable,
"-m",
"vllm.entrypoints.openai.api_server",
"--model",
model_path,
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, making this less repetitive

Suggested change
if gpu_count > 0:
command = [
sys.executable,
"-m",
"vllm.entrypoints.openai.api_server",
"--model",
model_path,
"--tensor-parallel-size",
str(gpu_count),
]
else:
command = [
sys.executable,
"-m",
"vllm.entrypoints.openai.api_server",
"--model",
model_path,
]
command = [
sys.executable,
"-m",
"vllm.entrypoints.openai.api_server",
"--model",
model_path,
]
if gpu_count > 0:
command += [
"--tensor-parallel-size",
str(gpu_count),
]

eval/final/components.py Show resolved Hide resolved
leseb and others added 8 commits October 15, 2024 17:16
Signed-off-by: Sébastien Han <[email protected]>
Signed-off-by: Michael Clifford <[email protected]>
It used to be `/data/generated`.

Signed-off-by: Sébastien Han <[email protected]>
The shell from the python executor will interpret the content of the
backtick.

Signed-off-by: Sébastien Han <[email protected]>
- do not print final eval scores in logs
- use the correct model location for final push
- fix job/cr watch
- add progress bar for download/upload
- add a few globale variables for eval steps
- handle the case where pods are removed but the job exists

Signed-off-by: Sébastien Han <[email protected]>
THe functions from the helper are embedded into the components
themselves. So they are left unused now.

Signed-off-by: Sébastien Han <[email protected]>
Copy link
Member

@tumido tumido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@leseb leseb merged commit 3784007 into opendatahub-io:main Oct 15, 2024
1 check passed
@leseb leseb deleted the final-eval branch October 15, 2024 15:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants