
added system prompt for openai #2145

Merged (5 commits, Jan 11, 2025)

Conversation

@Leo-LiHao (Contributor) commented Sep 12, 2024

What does this PR do?

Add system prompt for OpenAI Representation model, see Issue #2146

Before submitting

  • This PR fixes a typo or improves the docs (if yes, ignore all other checks!).
  • Did you read the contributor guideline?
  • Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes (if applicable)?
  • Did you write any new necessary tests?

@MaartenGr (Owner) left a comment:
Thanks for updating these two models! I left a small comment.

Other than that, I believe we will need to look at the following:

With HuggingFace, we might need to change the formatting such that we use messages instead:

messages = [
    {
        "role": "system",
        "content": MY_SYSTEM_PROMPT,
    },
    {"role": "user", "content": prompt},
]

For LangChain, I'm okay with skipping this for now since it uses a more complex structure and the API here might need to be updated (see https://python.langchain.com/v0.1/docs/modules/model_io/chat/quick_start/#messages-in---message-out).

For llama-cpp-python, it seems straightforward and we should use .create_chat_completion instead of directly calling the model.

(review thread on bertopic/representation/_cohere.py, outdated; resolved)
@Leo-LiHao (Contributor, Author) commented:

@MaartenGr I updated the code for llama-cpp

@MaartenGr (Owner) left a comment:

Apologies for the late reply, I was off for the last couple of weeks.

I left some comments here and there to make sure we are on the same page for all changes.

(review thread on bertopic/representation/_cohere.py, outdated; resolved)
[DOCUMENTS]
Keywords: [KEYWORDS]
Provide the extracted topic name directly without any explanation."""
@MaartenGr (Owner) commented:
Let's structure this the same way as was originally done in _cohere.py

Topic name:"""
Provide the topic name directly without any explanation."""
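For context, BERTopic prompts use `[DOCUMENTS]` and `[KEYWORDS]` placeholders as shown in the hunk above; a minimal sketch of how such a template gets filled (the `fill_prompt` helper is hypothetical, not the library's actual implementation) might be:

```python
prompt_template = """[DOCUMENTS]
Keywords: [KEYWORDS]
Provide the topic name directly without any explanation."""

def fill_prompt(template, documents, keywords):
    """Substitute BERTopic-style placeholders with a topic's documents and keywords."""
    joined_docs = "\n".join(f"- {doc}" for doc in documents)
    return template.replace("[DOCUMENTS]", joined_docs).replace("[KEYWORDS]", ", ".join(keywords))

prompt = fill_prompt(
    prompt_template,
    ["Meat is bad for the environment.", "Eating less beef helps."],
    ["meat", "beef", "eating"],
)
```

Ending the template with a bare instruction (rather than a `Topic name:` stub) means the model's reply needs no prefix stripping.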

DEFAULT_SYSTEM_PROMPT = "You are designated as an assistant that identify and extract high-level topics from texts."
@MaartenGr (Owner) commented:

Suggested change:
- DEFAULT_SYSTEM_PROMPT = "You are designated as an assistant that identify and extract high-level topics from texts."
+ DEFAULT_SYSTEM_PROMPT = "You are an assistant that extracts high-level topics from texts."


Based on the above information, can you give a short label of the topic?
A: """
DEFAULT_SYSTEM_PROMPT = "You are designated as an assistant that identify and extract high-level topics from texts."
@MaartenGr (Owner) commented:

Suggested change:
- DEFAULT_SYSTEM_PROMPT = "You are designated as an assistant that identify and extract high-level topics from texts."
+ DEFAULT_SYSTEM_PROMPT = "You are an assistant that extracts high-level topics from texts."

  pipeline_kwargs: Mapping[str, Any] = {},
  nr_docs: int = 4,
  diversity: float = None,
- doc_length: int = None,
+ doc_length: int = 100,
@MaartenGr (Owner) commented:

Let's keep the value as it was, otherwise we have to do this for all instances of doc_length across all models.

(review thread on bertopic/representation/_llamacpp.py; resolved)
}
], **self.pipeline_kwargs
)
label = topic_description["choices"][0]["message"]["content"].strip().replace("Topic name: ", "")
@MaartenGr (Owner) commented:

I think we can remove .replace("Topic name: ", "") if the prompt is updated (see comment above).
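To illustrate the point with a mocked response (the dict below is invented for illustration; real responses contain more fields), once the prompt no longer elicits a `Topic name:` prefix, a plain `.strip()` suffices:

```python
# Mocked chat-completion response in the OpenAI / llama-cpp-python shape (illustrative only).
topic_description = {
    "choices": [
        {"message": {"role": "assistant", "content": "  Environmental impact of eating meat\n"}}
    ]
}

# With the updated prompt the model returns the bare label, so no
# .replace("Topic name: ", "") is needed; stripping whitespace is enough.
label = topic_description["choices"][0]["message"]["content"].strip()
```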

(review thread on bertopic/representation/_llamacpp.py, outdated; resolved)
(review thread on bertopic/representation/_openai.py, outdated; resolved)
(review thread on bertopic/representation/_utils.py, outdated; resolved)
@Leo-LiHao (Contributor, Author) commented:

> Apologies for the late reply, I was off for the last couple of weeks.
>
> I left some comments here and there to make sure we are on the same page for all changes.

@MaartenGr Thanks for your comments, I incorporated them and updated the code.

@MaartenGr (Owner) commented:

@Leo-LiHao Thanks! I noticed a couple of changes to creating the prompt that we might want in all other models but it's alright to keep that as is for now. At some point, I might make a utility function for creating the prompt that is used across all models to make sure everything remains stable.

For now, let's merge!

@MaartenGr (Owner) commented Jan 8, 2025:

Ah, I see the linting has an error. Could you check that?

EDIT: I updated the settings for the repo so that the checks will automatically run.

@Leo-LiHao (Contributor, Author) commented:

@MaartenGr Thank you! I fixed the lint error and it is ready for merge now.

@MaartenGr MaartenGr merged commit 5cad563 into MaartenGr:master Jan 11, 2025
5 checks passed
@MaartenGr (Owner) commented:

Awesome, thanks for taking the time! I just merged it 🥳
