😍 Contributing to GPTCache

Before contributing to GPTCache, it is recommended to read the usage doc and the example doc. These two articles explain how to use GPTCache and what the parameters of the related functions mean.

While contributing, pay close attention to parameter types, because no type restrictions are currently enforced.

Note that development MUST be based on the dev branch.

First, decide which part you want to contribute to:

Add a method to pre-process the llm request
Add a cache storage type
Add a vector store type
Add a new data manager
Add an embedding function
Add a similarity evaluation function
Add a method to post-process the cache answer list
Add a new process in handling ChatGPT requests

Lazy import and automatic installation

For newly added third-party dependencies, lazy import and automatic installation are required. Implementation consists of the following steps:

Lazy import

# In the __init__.py file of the directory that contains the new file
__all__ = ['Milvus']

from gptcache.utils.lazy_import import LazyImport

milvus = LazyImport('milvus', globals(), 'gptcache.cache.vector_data.milvus')


def Milvus(**kwargs):
    return milvus.Milvus(**kwargs)

Automatic installation

# 2.1 Add the import method
# add new method to utils/__init__.py
__all__ = ['import_pymilvus']

from gptcache.utils.dependency_control import prompt_install


def import_pymilvus():
    try:
        # pylint: disable=unused-import
        import pymilvus
    except ModuleNotFoundError as e:  # pragma: no cover
        prompt_install('pymilvus')
        import pymilvus  # pylint: disable=ungrouped-imports

# 2.2 use the import method in your file
from gptcache.utils import import_pymilvus
import_pymilvus()
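
To illustrate why lazy import keeps optional dependencies out of the import path, here is a minimal sketch of the general technique. The `LazyModule` class is hypothetical and is not GPTCache's `LazyImport`; it only shows the idea of deferring the real import until the first attribute access, using the standard-library `json` module as a stand-in for a third-party package.

```python
import importlib


class LazyModule:
    """Hypothetical stand-in for a lazy-import helper (not GPTCache's LazyImport)."""

    def __init__(self, module_name):
        self._module_name = module_name
        self._module = None  # nothing is imported yet

    def _load(self):
        # Import the real module only once, on first use.
        if self._module is None:
            self._module = importlib.import_module(self._module_name)
        return self._module

    def __getattr__(self, name):
        # Only called for attributes that are not found on the proxy itself.
        return getattr(self._load(), name)


# `json` stands in for an optional third-party dependency.
lazy_json = LazyModule("json")
loaded_before = lazy_json._module is not None  # nothing imported so far
encoded = lazy_json.dumps({"cached": True})    # first access triggers the import
loaded_after = lazy_json._module is not None
```

With this pattern, users who never touch the new store never pay for (or need to install) its dependency.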

Add a method to pre-process the llm request

Refer to the implementation of Pre.

Check the input parameters: data is the original request dictionary object
Implement the pre-processing method
Add a usage example to the example directory and add the corresponding content to example.md and README.md

# The original OpenAI request
import openai

openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)

# Pre-processing function for the OpenAI request; it returns the content of the last message
def last_content(data, **_):
    return data.get("messages")[-1]["content"]
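
As a quick check of the contract described above, the sketch below calls `last_content` on a plain request dictionary and adds one more pre-processor in the same shape. `last_user_content` is a hypothetical example written for illustration; it is not part of GPTCache.

```python
def last_content(data, **_):
    # Same function as above: content of the final message in the request.
    return data.get("messages")[-1]["content"]


def last_user_content(data, **_):
    """Hypothetical pre-processor: content of the most recent 'user' message."""
    for message in reversed(data.get("messages", [])):
        if message.get("role") == "user":
            return message["content"]
    return ""


request = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"},
    ],
}

last_text = last_content(request)
last_user_text = last_user_content({"messages": request["messages"][:2]})
```

Both functions accept the request dictionary plus arbitrary keyword arguments and return the text that should be used for the cache lookup.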

Add a cache storage type

Refer to the implementation of SQLDataBase.

Implement the CacheStorage interface
Make sure newly added third-party libraries are lazily imported and automatically installed
Add the new store to the CacheBase method
Add a usage example to the example directory and add the corresponding content to example.md and README.md

Add a vector store type

Refer to the implementation of milvus.

Implement the VectorBase interface
Make sure newly added third-party libraries are lazily imported and automatically installed
Add the new store to the VectorBase method
Add a usage example to the example directory and add the corresponding content to example.md and README.md

Add a new data manager

Refer to the implementation of MapDataManager and SSDataManager.

Implement the DataManager interface
Add the new data manager to the get_data_manager method
Add a usage example to the example directory and add the corresponding content to example.md and README.md

Add an embedding function

Refer to the implementation of cohere or openai.

Add a new Python file to the embedding directory
Make sure newly added third-party libraries are lazily imported and automatically installed
Implement the embedding function and make sure its output dimension is correct
Add a usage example to the example directory and add the corresponding content to example.md and README.md
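
The point about the output dimension can be illustrated with a toy embedding. This is a minimal sketch under stated assumptions: the class `ToyHashEmbedding` is hypothetical, and the names `to_embeddings` and `dimension` are assumptions chosen for illustration, so check the existing embedding files for the exact interface. The vectors carry no semantic meaning; the sketch only shows that every call returns a vector of the declared length.

```python
import hashlib


class ToyHashEmbedding:
    """Hypothetical embedding; `to_embeddings` and `dimension` are assumed names."""

    def __init__(self, dimension=8):
        # SHA-256 yields 32 bytes, so this toy supports at most 32 dimensions.
        if not 1 <= dimension <= 32:
            raise ValueError("dimension must be between 1 and 32")
        self._dimension = dimension

    def to_embeddings(self, data, **_):
        digest = hashlib.sha256(data.encode("utf-8")).digest()
        # Map the first `dimension` bytes to floats in [0, 1].
        return [byte / 255.0 for byte in digest[: self._dimension]]

    @property
    def dimension(self):
        return self._dimension


encoder = ToyHashEmbedding(dimension=8)
vector = encoder.to_embeddings("Where was it played?")
dimension_matches = len(vector) == encoder.dimension
```

The vector store is typically created with a fixed dimension, so a mismatch between the declared and the actual vector length tends to surface as an insertion or search error.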

Add a similarity evaluation function

Refer to the implementation of SearchDistanceEvaluation or OnnxModelEvaluation.

Implement the SimilarityEvaluation interface
Check the range of the return value; the range method returns the minimum and maximum values
Check the input parameters of evaluation; you can learn more about them in the user view model

rank = chat_cache.evaluation_func({
    "question": pre_embedding_data,
    "embedding": embedding_data,
}, {
    "question": cache_question,
    "answer": cache_answer,
    "search_result": cache_data,
}, extra_param=context.get('evaluation', None))

Make sure newly added third-party libraries are lazily imported and automatically installed
Implement the similarity evaluation function
Add a usage example to the example directory and add the corresponding content to example.md and README.md
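
The call shown above passes two dictionaries and expects a score that lies inside the bounds reported by the range method. A minimal sketch of that contract follows. `ExactMatchEvaluation` is a hypothetical class that deliberately does not subclass the real interface so that it stays self-contained; the method names `evaluation` and `range` follow the description above.

```python
class ExactMatchEvaluation:
    """Hypothetical evaluation: 1.0 when the questions match exactly, else 0.0."""

    def evaluation(self, src_dict, cache_dict, **_):
        # src_dict describes the incoming request, cache_dict the cached entry.
        return 1.0 if src_dict["question"] == cache_dict["question"] else 0.0

    def range(self):
        # Minimum and maximum score that evaluation() can return.
        return 0.0, 1.0


evaluator = ExactMatchEvaluation()
hit_score = evaluator.evaluation(
    {"question": "Where was it played?", "embedding": None},
    {"question": "Where was it played?", "answer": "Arlington, Texas.", "search_result": None},
    extra_param=None,
)
miss_score = evaluator.evaluation(
    {"question": "Where was it played?", "embedding": None},
    {"question": "Who won?", "answer": "The Dodgers.", "search_result": None},
)
low, high = evaluator.range()
```

Keeping every returned score within the reported range matters because a similarity threshold is usually interpreted relative to those bounds.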

Add a method to post-process the cache answer list

Refer to the implementation of first or random_one.

Check the input parameters; you can learn more about them in the adapter
Make sure newly added third-party libraries are lazily imported and automatically installed
Implement the post-processing method
Add a usage example to the example directory and add the corresponding content to example.md and README.md

import random


# Get the most similar one from multiple results
def first(messages):
    return messages[0]


# Randomly fetch one of many results
def random_one(messages):
    return random.choice(messages)
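
A new post-processor follows the same shape: it receives the list of cached answers and returns one of them. The sketch below adds a hypothetical `longest` function for illustration; it is not part of GPTCache.

```python
def longest(messages):
    """Hypothetical post-processor: pick the longest cached answer."""
    return max(messages, key=len)


cached_answers = ["Texas.", "It was played in Arlington, Texas."]
picked = longest(cached_answers)
```

Like `first`, this assumes the list is non-empty; `max` raises `ValueError` on an empty list.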

Add a new process in handling ChatGPT requests

Make sure you clearly understand the current process; refer to the adapter
Add a new process
Make sure all examples work properly
