A library that makes working with batch inference of LLM calls in AWS easier. Support is currently limited to Anthropic models.
- A `.env` file with an entry for `AWS_PROFILE=` (a minimal example follows this list). This profile should have sufficient permissions to execute a batch inference job. [find the link]
- A role with the required permissions [find details]
- An S3 bucket to store the inputs and outputs for the job.
  - Inputs will be written to `{s3_bucket}/input/{job_name}.jsonl`
  - Outputs will be written to `{s3_bucket}/output/{job_id}/{job_name}.jsonl.out` and `{s3_bucket}/output/{job_id}/manifest.json.out`
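For reference, a minimal `.env` might contain just the profile entry (the profile name below is a placeholder; use a profile from your own AWS config):

```
# .env -- the profile name here is a placeholder
AWS_PROFILE=my-bedrock-profile
```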
See:
- `batch_inference_example()`: for an example of free text responses
- `structured_batch_inference_example()`: for an example of structured responses à la instructor
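The example functions above show this library's interface. As a rough, hedged sketch of what happens under the hood, the following uses plain boto3 directly; the bucket, job name, role ARN, model ID, and prompts are all placeholder values, not anything this library mandates. Each input line pairs a `recordId` with an Anthropic Messages request body, and the job is started with Bedrock's `create_model_invocation_job`:

```python
import json

import boto3

# Placeholder values -- substitute your own bucket, role ARN, and model ID.
BUCKET = "my-batch-bucket"
JOB_NAME = "my-batch-job"
ROLE_ARN = "arn:aws:iam::123456789012:role/bedrock-batch-role"
MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"

session = boto3.Session(profile_name="my-bedrock-profile")

# One JSON object per line: a recordId plus an Anthropic Messages request body.
records = [
    {
        "recordId": f"rec-{i:04d}",
        "modelInput": {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": prompt}],
        },
    }
    for i, prompt in enumerate(["What is batch inference?", "Summarise JSONL for me."])
]

# Upload the input file to the layout described above.
session.client("s3").put_object(
    Bucket=BUCKET,
    Key=f"input/{JOB_NAME}.jsonl",
    Body="\n".join(json.dumps(r) for r in records).encode("utf-8"),
)

# Start the batch job; Bedrock writes results under output/{job_id}/.
# Note: Bedrock enforces quotas here, e.g. a minimum number of records per job.
response = session.client("bedrock").create_model_invocation_job(
    jobName=JOB_NAME,
    roleArn=ROLE_ARN,
    modelId=MODEL_ID,
    inputDataConfig={
        "s3InputDataConfig": {"s3Uri": f"s3://{BUCKET}/input/{JOB_NAME}.jsonl"}
    },
    outputDataConfig={"s3OutputDataConfig": {"s3Uri": f"s3://{BUCKET}/output/"}},
)
print(response["jobArn"])
```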
To install the dev and test dependencies: `pip install -e ".[test,dev]"`
- Example data folder
- Tests
- `utils` module for creating the required AWS role etc. (a hypothetical sketch follows)
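The `utils` module is still to be written; as a purely hypothetical sketch of what a role-creation helper could look like with plain boto3 (the function name, role name, and policy scope are illustrative, not part of this library):

```python
import json

import boto3


def create_batch_inference_role(role_name: str, s3_bucket: str) -> str:
    """Create an IAM role that Bedrock can assume to run a batch job.

    Hypothetical helper: names and policy scope are illustrative only.
    """
    iam = boto3.client("iam")

    # Bedrock must be allowed to assume the role.
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "bedrock.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }
    role = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(trust_policy),
    )

    # The job reads inputs from and writes outputs to the bucket.
    s3_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{s3_bucket}",
                    f"arn:aws:s3:::{s3_bucket}/*",
                ],
            }
        ],
    }
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=f"{role_name}-s3-access",
        PolicyDocument=json.dumps(s3_policy),
    )
    return role["Role"]["Arn"]
```

The trust policy is the important part: Bedrock assumes the role on your behalf to read inputs from and write outputs to the bucket.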