added system prompt for openai #2145
Conversation
Thanks for updating these two models! I left a small comment.
Other than that, I believe we will need to look at the following:
With HuggingFace, we might need to change the formatting such that we use messages instead:
messages = [
    {
        "role": "system",
        "content": MY_SYSTEM_PROMPT,
    },
    {"role": "user", "content": prompt},
]
For LangChain, I'm okay with skipping this for now since it uses a more complex structure and the API here might need to be updated (see https://python.langchain.com/v0.1/docs/modules/model_io/chat/quick_start/#messages-in---message-out).
For llama-cpp-python, it seems straightforward and we should use .create_chat_completion instead of directly calling the model.
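As a rough sketch of what that switch could look like (the names `llm`, `MY_SYSTEM_PROMPT`, and `prompt` are placeholders here, not the actual BERTopic code):

```python
# Hedged sketch: moving llama-cpp-python from a direct model call to
# create_chat_completion. All names below are illustrative placeholders.
MY_SYSTEM_PROMPT = "You are an assistant that extracts high-level topics from texts."
prompt = "Keywords: apple, fruit, orchard\nTopic name:"

messages = [
    {"role": "system", "content": MY_SYSTEM_PROMPT},
    {"role": "user", "content": prompt},
]

# With a loaded Llama instance this would be roughly:
#   response = llm.create_chat_completion(messages=messages)
# Chat completions return an OpenAI-style dict; a response could look like:
response = {"choices": [{"message": {"content": "Fruit orchards"}}]}
label = response["choices"][0]["message"]["content"].strip()
```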
@MaartenGr I updated the code for llama-cpp
Apologies for the late reply, I was off for the last couple of weeks.
I left some comments here and there to make sure we are on the same page for all changes.
bertopic/representation/_llamacpp.py
Outdated
[DOCUMENTS]
Keywords: [KEYWORDS]
Provide the extracted topic name directly without any explanation."""
Let's structure this the same way as was originally done in _cohere.py
bertopic/representation/_cohere.py
Outdated
Topic name:"""
Provide the topic name directly without any explanation."""

DEFAULT_SYSTEM_PROMPT = "You are designated as an assistant that identify and extract high-level topics from texts."
Suggested change:
- DEFAULT_SYSTEM_PROMPT = "You are designated as an assistant that identify and extract high-level topics from texts."
+ DEFAULT_SYSTEM_PROMPT = "You are an assistant that extracts high-level topics from texts."
bertopic/representation/_llamacpp.py
Outdated
Based on the above information, can you give a short label of the topic?
A: """
DEFAULT_SYSTEM_PROMPT = "You are designated as an assistant that identify and extract high-level topics from texts."
Suggested change:
- DEFAULT_SYSTEM_PROMPT = "You are designated as an assistant that identify and extract high-level topics from texts."
+ DEFAULT_SYSTEM_PROMPT = "You are an assistant that extracts high-level topics from texts."
bertopic/representation/_llamacpp.py
Outdated
  pipeline_kwargs: Mapping[str, Any] = {},
  nr_docs: int = 4,
  diversity: float = None,
- doc_length: int = None,
+ doc_length: int = 100,
Let's keep the value as it was, otherwise we have to do this for all instances of doc_length
across all models.
bertopic/representation/_llamacpp.py
Outdated
        }
    ], ** self.pipeline_kwargs
)
label = topic_description["choices"][0]["message"]["content"].strip().replace("Topic name: ", "")
I think we can remove .replace("Topic name: ", "")
if the prompt is updated (see comment above).
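For illustration, assuming the prompt is restructured so the model is asked to return only the label, the post-processing could shrink to the following (a sketch with a stand-in response dict, not the merged code):

```python
# Sketch: why .replace("Topic name: ", "") becomes unnecessary once the
# prompt asks for the label directly. The dicts below are stand-ins for
# real chat-completion responses.

# Old behavior: the model may echo the "Topic name: " prefix, so it is stripped.
topic_description = {"choices": [{"message": {"content": "Topic name: Fruit orchards"}}]}
old_label = topic_description["choices"][0]["message"]["content"].strip().replace("Topic name: ", "")

# With an updated prompt, the model would return just the label:
topic_description = {"choices": [{"message": {"content": "Fruit orchards"}}]}
new_label = topic_description["choices"][0]["message"]["content"].strip()
```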
@MaartenGr Thanks for your comments, I incorporated them and updated the code.
@Leo-LiHao Thanks! I noticed a couple of changes to creating the prompt that we might want in all other models, but it's alright to keep that as is for now. At some point, I might make a utility function for creating the prompt that is used across all models to make sure everything remains stable. For now, let's merge!
Ah, I see the linting has an error. Could you check that? EDIT: I updated the settings for the repo so that the checks will automatically run.
@MaartenGr Thank you! I fixed the lint error and it is ready for merge now.
Awesome, thanks for taking the time! I just merged it 🥳
What does this PR do?
Add system prompt for OpenAI Representation model, see Issue #2146
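In spirit, the feature lets a user-provided system prompt accompany the user prompt in the chat messages sent to OpenAI. A minimal sketch (the client call is shown as a comment and the model name is a placeholder; this is not the merged BERTopic API):

```python
# Hedged sketch of the feature: a configurable system prompt alongside the
# generated user prompt for an OpenAI chat completion. Names are illustrative.
system_prompt = "You are an assistant that extracts high-level topics from texts."
user_prompt = "Keywords: apple, fruit, orchard\nTopic name:"

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt},
]

# With an OpenAI client, these messages would then be sent as, roughly:
#   client.chat.completions.create(model="gpt-4o-mini", messages=messages)
```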