Structured extraction in Python, powered by OpenAI's function calling API, designed for simplicity, transparency, and control.
This library is built to interact with OpenAI's function call API from Python code, using Python structs and objects. It's designed to be intuitive and easy to use while giving great visibility into how we call OpenAI.
This library depends only on Pydantic and OpenAI.
To get started with OpenAI Function Call, install it using pip. Run the following command in your terminal:
$ pip install instructor
To simplify your work with OpenAI models and streamline the extraction of Pydantic objects from prompts, we offer a patching mechanism for the `ChatCompletion` class. Here's a step-by-step guide:
First, import the required libraries and apply the patch function to the OpenAI module. This exposes new functionality with the response_model parameter.
import openai
from pydantic import BaseModel
from instructor import patch
patch()
Create a Pydantic model to define the structure of the data you want to extract. This model will map directly to the information in the prompt.
class UserDetail(BaseModel):
    name: str
    age: int
Use the openai.ChatCompletion.create method to send a prompt and extract the data into the Pydantic object. The response_model parameter specifies the Pydantic model to use for extraction.
user: UserDetail = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    response_model=UserDetail,
    messages=[
        {"role": "user", "content": "Extract Jason is 25 years old"},
    ]
)
You can then validate the extracted data by asserting the expected values. Adding the type annotation also gives you IDE benefits such as spell check and autocompletion!
assert user.name == "Jason"
assert user.age == 25
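To make the patching idea concrete, here is a minimal sketch of how a wrapper could add a `response_model` parameter to a completion function. It is an illustration under stated assumptions, not the library's actual implementation: `fake_create` and `with_response_model` are hypothetical names, and the canned reply only mimics the shape of a function-call response so the example runs without network access.

```python
import json

from pydantic import BaseModel


class UserDetail(BaseModel):
    name: str
    age: int


def fake_create(**kwargs):
    """Hypothetical stand-in for the API call; returns a canned reply."""
    arguments = json.dumps({"name": "Jason", "age": 25})
    return {"choices": [{"message": {"function_call": {"arguments": arguments}}}]}


def with_response_model(create):
    """Wrap `create` so it accepts an optional `response_model` keyword."""

    def wrapper(response_model=None, **kwargs):
        if response_model is None:
            # No model requested: behave exactly like the wrapped function.
            return create(**kwargs)
        # Ask for a function call named after the class. A real wrapper
        # would also pass the model's JSON schema in `functions`.
        kwargs["function_call"] = {"name": response_model.__name__}
        response = create(**kwargs)
        message = response["choices"][0]["message"]
        # Pydantic validates the decoded arguments against the field types.
        return response_model(**json.loads(message["function_call"]["arguments"]))

    return wrapper


create = with_response_model(fake_create)
user = create(
    response_model=UserDetail,
    messages=[{"role": "user", "content": "Extract Jason is 25 years old"}],
)
print(user)
```

Calling `create` without `response_model` returns the raw reply unchanged, so existing call sites keep working.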
If you want more control than just passing a single class, we can use the OpenAISchema, which extends BaseModel.
This quick start guide contains the following sections:
- Defining a schema
- Adding Additional Prompting
- Calling the ChatCompletion
- Deserializing back to the instance
OpenAI Function Call allows you to leverage OpenAI's powerful language models for function calls and schema extraction. This guide provides a quick start for using OpenAI Function Call.
To begin, let's define a schema using OpenAI Function Call. A schema describes the structure of the input and output data for a function. In this example, we'll define a simple schema for a User object:
from instructor import OpenAISchema
class UserDetails(OpenAISchema):
    name: str
    age: int
In this schema, we define a UserDetails class that extends OpenAISchema. We declare two fields, name and age, of type str and int respectively.
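The declared types are enforced when an instance is created. The sketch below uses a plain `pydantic.BaseModel` as a stand-in for OpenAISchema (an assumption made so the example runs without the library) to show a compatible value being coerced and an incompatible one being rejected.

```python
from pydantic import BaseModel, ValidationError


class UserDetails(BaseModel):  # stand-in for OpenAISchema (assumption)
    name: str
    age: int


# A numeric string is coerced to the declared int type.
user = UserDetails(name="John Doe", age="30")
print(type(user.age).__name__, user.age)

# A value that cannot be read as an integer is rejected.
try:
    UserDetails(name="John Doe", age="thirty")
    rejected = False
except ValidationError as error:
    rejected = True
    print(error)
```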
To enhance the performance of the OpenAI language model, you can add additional prompting in the form of docstrings and field descriptions. They can provide context and guide the model on how to process the data.
!!! note "Using patch"
    These docstrings and field descriptions are powered by pydantic.BaseModel, so they'll work via the patching approach as well.
from instructor import OpenAISchema
from pydantic import Field
class UserDetails(OpenAISchema):
    "Correctly extracted user information"
    name: str = Field(..., description="User's full name")
    age: int
In this updated schema, we use the Field class from pydantic to add a description to the name field. The description provides information about the field, giving even more context to the language model.
!!! note "Code, schema, and prompt"
    We can run openai_schema to see exactly what the API will see. Notice how the docstrings, attributes, types, and field descriptions are now part of the schema. This reflects one of this library's core philosophies.
class UserDetails(OpenAISchema):
    "Correctly extracted user information"
    name: str = Field(..., description="User's full name")
    age: int
UserDetails.openai_schema
{
    "name": "UserDetails",
    "description": "Correctly extracted user information",
    "parameters": {
        "type": "object",
        "properties": {
            "name": {
                "description": "User's full name",
                "type": "string"
            },
            "age": {
                "type": "integer"
            }
        },
        "required": [
            "age",
            "name"
        ]
    }
}
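Most of this information is already available from Pydantic's own JSON schema. The sketch below shows one way a function schema of this shape could be assembled from a plain `BaseModel`. The helper `to_function_schema` is hypothetical and is not the library's implementation; Pydantic's raw schema also carries `title` keys that the output above does not show.

```python
import json

from pydantic import BaseModel, Field


class UserDetails(BaseModel):  # stand-in for OpenAISchema (assumption)
    "Correctly extracted user information"
    name: str = Field(..., description="User's full name")
    age: int


def to_function_schema(model):
    """Hypothetical helper: reshape a Pydantic JSON schema into a function schema."""
    # Pydantic v2 exposes model_json_schema(); v1 exposes schema().
    make_schema = getattr(model, "model_json_schema", None) or model.schema
    schema = make_schema()
    return {
        "name": schema["title"],
        # The class docstring becomes the function description.
        "description": schema.get("description", ""),
        "parameters": {
            "type": "object",
            "properties": schema["properties"],
            "required": sorted(schema.get("required", [])),
        },
    }


function_schema = to_function_schema(UserDetails)
print(json.dumps(function_schema, indent=2))
```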
With the schema defined, let's proceed with calling the ChatCompletion API using the defined schema and messages.
import openai
from instructor import OpenAISchema
from pydantic import Field

class UserDetails(OpenAISchema):
    "Correctly extracted user information"
    name: str = Field(..., description="User's full name")
    age: int

# Force the model to call the function described by the schema.
completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",
    functions=[UserDetails.openai_schema],
    function_call={"name": UserDetails.openai_schema["name"]},
    messages=[
        {"role": "system", "content": "Extract user details from my requests"},
        {"role": "user", "content": "My name is John Doe and I'm 30 years old."},
    ],
)
In this example, we make a call to the ChatCompletion API by providing the model name (gpt-3.5-turbo-0613) and a list of messages. The messages consist of a system message and a user message. The system message sets the context by requesting user details, while the user message provides the input with the user's name and age.
Note that we have omitted the additional parameters that can be included in the API request, such as temperature, max_tokens, and n. These parameters can be customized according to your requirements.
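One way to keep those optional parameters manageable is to build the request as a plain dictionary first. This is a minimal sketch: `build_request` is a hypothetical helper, and the inline `user_schema` is a hand-written placeholder standing in for `UserDetails.openai_schema`.

```python
def build_request(schema, messages, **options):
    """Hypothetical helper: assemble keyword arguments for a completion call."""
    request = {
        "model": "gpt-3.5-turbo-0613",
        "functions": [schema],
        "function_call": {"name": schema["name"]},
        "messages": messages,
    }
    # Optional settings such as temperature, max_tokens, and n are merged last.
    request.update(options)
    return request


# Placeholder schema standing in for UserDetails.openai_schema.
user_schema = {
    "name": "UserDetails",
    "parameters": {"type": "object", "properties": {}},
}

request = build_request(
    user_schema,
    [{"role": "user", "content": "My name is John Doe and I'm 30 years old."}],
    temperature=0,
    max_tokens=200,
    n=1,
)
print(sorted(request))
```

The resulting dictionary can then be unpacked into the call, for example `openai.ChatCompletion.create(**request)`.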
To deserialize the response from the ChatCompletion API back into an instance of the UserDetails class, we can use the from_response method.
user = UserDetails.from_response(completion)
print(user.name) # Output: John Doe
print(user.age) # Output: 30
By calling UserDetails.from_response, we create an instance of the UserDetails class using the response from the API call. Subsequently, we can access the extracted user details through the name and age attributes of the user object.
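As a rough picture of what such a method has to do, the sketch below reads the function-call arguments out of a completion and hands them to Pydantic. `SchemaSketch` is a hypothetical base class, not the library's OpenAISchema, and the `completion` dictionary is hand-written to mimic the response shape so the example runs offline.

```python
import json

from pydantic import BaseModel


class SchemaSketch(BaseModel):
    """Hypothetical base class used only for this illustration."""

    @classmethod
    def from_response(cls, completion):
        message = completion["choices"][0]["message"]
        # The arguments arrive as a JSON-encoded string, not as a dict.
        arguments = json.loads(message["function_call"]["arguments"])
        # Constructing the model validates the fields and their types.
        return cls(**arguments)


class UserDetails(SchemaSketch):
    name: str
    age: int


# Hand-written stand-in for an API response.
completion = {
    "choices": [
        {
            "message": {
                "function_call": {
                    "name": "UserDetails",
                    "arguments": '{"name": "John Doe", "age": 30}',
                }
            }
        }
    ]
}

user = UserDetails.from_response(completion)
print(user.name, user.age)
```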
Everything is designed for you to get the best developer experience possible, with the best editor support, including autocompletion and even inline errors.
This quick start guide provided you with a basic understanding of how to use OpenAI Function Call for schema extraction and function calls. You can now explore more advanced use cases and creative applications of this library.
Since UserDetails is both an OpenAISchema and a pydantic.BaseModel, you can use inheritance and nesting to create more complex models while avoiding code duplication:
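from typing import List

class UserDetails(OpenAISchema):
    name: str = Field(..., description="User's full name")
    age: int

# Inheritance: reuse the base fields and add a new one.
class UserWithAddress(UserDetails):
    address: str

# Nesting: fields can themselves be UserDetails instances.
class UserWithFriends(UserDetails):
    best_friend: UserDetails
    friends: List[UserDetails]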
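Nested models are built from nested data in a single step. The sketch below again uses a plain `pydantic.BaseModel` in place of OpenAISchema (an assumption so it runs without the library), and the `payload` values are made up for illustration.

```python
from typing import List

from pydantic import BaseModel, Field


class UserDetails(BaseModel):  # stand-in for OpenAISchema (assumption)
    name: str = Field(..., description="User's full name")
    age: int


class UserWithFriends(UserDetails):
    best_friend: UserDetails
    friends: List[UserDetails]


# Illustrative data in the shape a function call could return.
payload = {
    "name": "John Doe",
    "age": 30,
    "best_friend": {"name": "Jane Roe", "age": 29},
    "friends": [
        {"name": "Jane Roe", "age": 29},
        {"name": "Sam Poe", "age": 31},
    ],
}

user = UserWithFriends(**payload)
# Nested dictionaries are converted into UserDetails instances.
print(type(user.best_friend).__name__, len(user.friends))
```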
If you have any questions, feel free to leave an issue or reach out to the library's author on Twitter. For a more comprehensive solution with additional features, consider checking out MarvinAI.
To see more examples of how we can create interesting models check out some examples.
This project is licensed under the terms of the MIT License.
$ instructor jobs create-from-file data.jsonl
OpenAI Fine Tuning Job Monitoring
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ ┃ ┃ ┃ Completion ┃ ┃ ┃ ┃ ┃
┃ Job ID ┃ Status ┃ Creation Time ┃ Time ┃ Model Name ┃ File ID ┃ Epochs ┃ Base Model ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ ftjob-PWo6uwk… │ 🚫 cancelled │ 2023-08-23 │ N/A │ │ file-F7lJg6Z4… │ 3 │ gpt-3.5-turbo-… │
│ │ │ 23:10:54 │ │ │ │ │ │
│ ftjob-1whjva8… │ 🚫 cancelled │ 2023-08-23 │ N/A │ │ file-F7lJg6Z4… │ 3 │ gpt-3.5-turbo-… │
│ │ │ 22:47:05 │ │ │ │ │ │
│ ftjob-wGoBDld… │ 🚫 cancelled │ 2023-08-23 │ N/A │ │ file-F7lJg6Z4… │ 3 │ gpt-3.5-turbo-… │
│ │ │ 22:44:12 │ │ │ │ │ │
│ ftjob-yd5aRTc… │ ✅ succeeded │ 2023-08-23 │ 2023-08-23 │ ft:gpt-3.5-tur… │ file-IQxAUDqX… │ 3 │ gpt-3.5-turbo-… │
│ │ │ 14:26:03 │ 15:02:29 │ │ │ │ │
└────────────────┴──────────────┴────────────────┴────────────────┴─────────────────┴────────────────┴────────┴─────────────────┘
Automatically refreshes every 5 seconds, press Ctrl+C to exit