Changelog

4.35.1 (2024-09-17)

Bug Fixes

4.35.0 (2024-09-17)

Features

4.34.0 (2024-09-17)

Features

Documentation

  • create groq_tracing_tutorial.ipynb (#4615) (5883c5a)
  • quickstart: convert Phoenix inferences instructions to notebook (#4593) (1e9541a)

4.33.2 (2024-09-12)

Bug Fixes

  • get_retrieved_documents should handle missing values (#4599) (4f604b1)

Documentation

4.33.1 (2024-09-05)

Bug Fixes

  • Fix typo and update ensure dataloader results ordering (#4527) (21d71d1)

4.33.0 (2024-09-04)

Features

  • auth: add user api keys table (#4473) (7c1334d)
  • incorporate schema for postgresql and add integration test (#4474) (cd64a99)
  • onboarding: add bedrock (#4465) (b03901b)
  • render db schema in welcome message when applicable (#4479) (ecdf039)
  • Return Query from DeleteSystemApiKey mutation (#4432) (b0639e0)

Bug Fixes

Documentation

4.32.0 (2024-08-29)

Features

Bug Fixes

  • error message for PHOENIX_PORT env vars auto-generated by kubernetes (#4422) (63d0adb)
  • scaffolder should incorporate port from command line (#4415) (0678c86)
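
Background for the PHOENIX_PORT fix above: Kubernetes injects Docker-link-style variables for every Service, so a Service named phoenix yields PHOENIX_PORT=tcp://<ip>:<port> inside pods, which collides with an application setting of the same name. The sketch below shows the general kind of check such a fix implies. The helper name parse_port and the message wording are hypothetical illustrations, not Phoenix's actual code.

```python
def parse_port(raw: str) -> int:
    """Parse a port number, with a targeted error for Kubernetes-style values.

    Hypothetical helper for illustration only; not part of Phoenix.
    """
    if raw.isdecimal():
        port = int(raw)
        if 0 < port < 65536:
            return port
        raise ValueError(f"port out of range: {raw!r}")
    if "://" in raw:
        # Kubernetes sets <SERVICE>_PORT to a URL such as tcp://10.0.0.1:6006.
        raise ValueError(
            f"{raw!r} looks like a service URL, not a port number; "
            "it may have been auto-generated by Kubernetes for a Service "
            "named 'phoenix'"
        )
    raise ValueError(f"invalid port: {raw!r}")
```

Telling the user where the unexpected value probably came from is the point of the design: a bare "invalid port" message gives no hint that the variable was never set by hand.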

Documentation

4.31.0 (2024-08-28)

Features

  • ui: add the ability to turn off auto-refresh of projects (#4414) (4a792d2)
  • vision: show images in a gallery, expandable images (#4407) (9e2d67f)

Bug Fixes

  • annotation events should refresh trace project (#4412) (3a18c13)
  • experiments: ensure compare experiments page does not break for experiments that contain a large number of examples (#4402) (71484e0)
  • use dataloader for experiment run annotations (#4397) (5582ce6)

Documentation

4.31.0 (2024-08-27)

Features

  • vision: show images in a gallery, expandable images (#4407) (9e2d67f)

Bug Fixes

  • annotation events should refresh trace project (#4412) (3a18c13)

Documentation

4.30.2 (2024-08-27)

Bug Fixes

  • experiments: ensure compare experiments page does not break for experiments that contain a large number of examples (#4402) (71484e0)
  • use dataloader for experiment run annotations (#4397) (5582ce6)

4.30.1 (2024-08-26)

Bug Fixes

  • improve timeout error message for query_spans method (#4391) (81811f1)

4.30.0 (2024-08-26)

Features

4.29.0 (2024-08-26)

Features

Bug Fixes

  • docker: support arm64 architecture in docker images (#4386) (1b6eec8)

Documentation

4.28.1 (2024-08-23)

Bug Fixes

4.28.0 (2024-08-23)

Features

Documentation

4.27.0 (2024-08-22)

Features

  • Add list_experiments client method (#4271) (a063d83)
  • Add fixtures only for new DBs, add flag to force fixture ingestion (#4315) (ef4adcd)
  • auth: minimal login page (#4320) (764f359)
  • experiments: add the ability to copy experiment IDs to the clipboard (#4317) (589ac03)
  • onboarding demo projects (#4262) (74dd3c7)

Bug Fixes

  • handle None values in the reference column in the get_qa_with_reference helper (#4309) (58685b7)

4.26.0 (2024-08-21)

Features

  • auth: add login/logout routes and createUser mutation (#4293) (a3ff0f6)

Bug Fixes

  • postgresql driver name for db migrations (#4304) (9e683f2)

4.25.0 (2024-08-21)

Features

Bug Fixes

  • conditionally display re-ranker queries in span details (#4263) (248d61b)
  • python: application launch on Windows (#4276) (9ede0a3)

Documentation

  • add haystack to README (a244cdd)
  • Add human feedback notebook tutorial (#4257) (d4c200f)
  • add LLM fixtures for demo dataset (fine-tuning dataset), fix demo notebook (#4286) (9f54510)
  • Add Phoenix Llamaindex RAG Demo notebook + chunks + questions (#4202) (7f1b817)
  • fix variable name typo in run experiments doc (#4249) (9745754)

4.24.0 (2024-08-15)

Features

  • auth: add user role, exclude system in user lists (#4229) (fb18ab6)
  • auth: user / system api key resolvers (#4232) (c7b939e)
  • experiments: ability to specify concurrency in run_experiment and evaluate_experiment (#4189) (8239d3a)

Documentation

  • Add multimodal image reasoning tutorial with llama index (#4210) (d24712e)

4.23.0 (2024-08-13)

Features

Bug Fixes

  • Propagate span annotation metadata to examples on all mutations (#4195) (181e021)
  • UI: show IO if embedding span is missing embeddings (#4218) (5bc97ff)

4.22.1 (2024-08-12)

Bug Fixes

  • experiments: evaluate_experiment on existing experiment runs (#4204) (515e195)
  • remove skep_deps_check param on phoenix.instrumentors (#4205) (7a9ad5e)

4.22.0 (2024-08-09)

Features

  • UI: annotation filter actions on span/trace tables (#4194) (0301696)

4.21.0 (2024-08-08)

Features

  • annotations: add cta for span annotations (#4160) (ce22de5)

4.20.2 (2024-08-07)

Bug Fixes

  • add token count columns to spans table to improve projects page query performance (#4135) (8c713e3)

Documentation

  • examples: add feedback to manually-instrumented-chatbot example (#4020) (86b299f)

4.20.1 (2024-08-06)

Bug Fixes

4.20.0 (2024-08-06)

Features

  • Add span annotations to dataset example metadata (#4123) (a16dd57)

Bug Fixes

  • use dataloader for span annotations ([#4006]) (ab53325)
  • ensure rest api urls include custom root path (#4137) (9550a7e)

Documentation

  • add more videos to docs (GITBOOK-787) (cb9ee71)
  • Added Prompt flow documentation with example (GITBOOK-781) (2772397)
  • minor fixes to the quickstart (GITBOOK-786) (04a8ea0)
  • Update Tracing Integrations to match standard format (GITBOOK-784) (dedf969)

4.19.0 (2024-08-02)

Features

  • annotations: show all annotations in annotation summary in project page header (#4119) (5b7264e)

4.18.0 (2024-08-02)

Features

  • Add annotation summaries to projects (#4108) (5aa79c4)
  • annotations: add ability to edit human span annotations (#4111) (67cb9a2)
  • session: support a slug to the session.view (#4114) (9305f8a)

Bug Fixes

  • set higher lower-bound for OpenInference packages (#4117) (cfcbf58)

Documentation

4.17.0 (2024-08-02)

Features

  • annotations: add feedback column to spans / traces tables with all annotations (#4100) (193b309)
  • annotations: update RetrievalEvaluationLabel styles to match AnnotationLabel (#4101) (eef32df)
  • ui: condensed trace tree (#4099) (548f685)

Bug Fixes

4.16.1 (2024-07-31)

Bug Fixes

  • process annotation insertions after spans (#4084) (5b1a709)

4.16.0 (2024-07-30)

Features

  • Add sort order argument to Span- and Trace- Annotation fields (#4079) (cf3b37c)
  • add trace stream toggle into preferences context (#4035) (bc3be3e)
  • allow retries for annotation insertions when the corresponding span/trace does not exist (#4026) (13af3b5)
  • annotations: add feedback tab to span details (#4069) (8dc9672)
  • annotations: default collapse annotation explanations (#4081) (dbf3ee4)
  • annotations: make color for evaluation summaries consistent with table (#4082) (70a8b5a)
  • annotations: migrate span eval labels to use AnnotationLabel (#4068) (6219e91)
  • trace: add a span aside with timing info and feedback (#4071) (275ad73)
  • ui: tracing getting started button (#4067) (9eba5eb)

Bug Fixes

4.15.0 (2024-07-26)

Features

  • Add containedInDataset boolean field to gql Spans (#4015) (3c096ca)
  • annotations: add annotation macro and filter condition snippet to project page (#4024) (acc2ff1)
  • annotations: refetch annotations on annotation changes (#3980) (9ba7cb9)
  • datasets: add dataset edit UI and dataset metadata on create (#4005) (d80c438)
  • trace: UI lazy loading of spans (#4014) (ab4fafa)
  • Version mismatch checks (#3989) (8454183)

Bug Fixes

Documentation

4.14.1 (2024-07-23)

Bug Fixes

  • remove clean step from release pipeline and remove rimraf dev dep (#3975) (5e91f8f)

4.14.0 (2024-07-23)

Features

Bug Fixes

  • get_dataset_versions client method does not break on mixed timestamp formats (#3973) (40c448b)
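
As an illustration of the mixed-timestamp problem mentioned above, here is a minimal sketch of a tolerant parser. It assumes the two formats in play are ISO 8601 with and without fractional seconds; the changelog does not say which formats were involved, and parse_timestamp is a hypothetical name rather than a Phoenix function.

```python
from datetime import datetime

# Assumed formats: ISO 8601 with and without fractional seconds.
_FORMATS = ("%Y-%m-%dT%H:%M:%S.%f%z", "%Y-%m-%dT%H:%M:%S%z")


def parse_timestamp(text: str) -> datetime:
    """Try each known format in turn; hypothetical helper, not Phoenix's code."""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized timestamp: {text!r}")
```

Trying a short list of explicit formats keeps the behavior the same across Python versions, whereas datetime.fromisoformat accepts a narrower set of inputs on older interpreters.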

Documentation

  • Fix error in LlamaIndex Quickstart (GITBOOK-750) (40c5b28)
  • Fix images in Custom Task Evaluation (GITBOOK-749) (ee7365e)

4.13.1 (2024-07-22)

Bug Fixes

4.13.0 (2024-07-22)

Features

  • annotations: graphql resolver for all annotation names on a project (#3931) (e7b87b2)

Bug Fixes

  • experiments: ensure experiments table appears even when an experiment has no runs (#3942) (175c268)

Documentation

  • Add API docstrings for experiment evaluators module (#3944) (53079ce)
  • api ref sidebar overhaul (0614255)
  • api ref updates and docstring fixes (e089f99)
  • small fixes for datasets and experiments quickstart notebook (#3934) (e24d721)
  • Update README.md (7836779)

4.12.0 (2024-07-18)

Features

4.11.0 (2024-07-18)

Features

Bug Fixes

  • flatten sequence attribute when value is ndarray (which is not Sequence) (#3926) (a361f87)
  • initialize tracer provider for internal server instrumentation (#3921) (c59af75)
  • security fix for braces (#3924) (c2595c6)

4.10.1 (2024-07-16)

Bug Fixes

4.10.0 (2024-07-16)

Features

  • Add GQL mutations for Span + Trace Annotations (#3891) (78e7e3b)
  • Add REST routes for span and trace annotations (#3869) (43eede1)
  • annotations: ability to copy span and trace IDs (49085c4)

4.9.0 (2024-07-10)

Features

Bug Fixes

  • graphql: clear project when end_time is UNSET (#3879) (7c77a73)
  • remove phoenix.datasets imports (12adc6a)

Reverts

Documentation

  • api reference overhaul modules (e3b9c7f)

4.8.1 (2024-07-09)

Bug Fixes

Documentation

  • api ref clean up (a5d87cc)
  • updated index for api reference (9646ee3)

4.8.0 (2024-07-08)

Features

  • experiments: REST endpoint to delete dataset (#3853) (3c7ede2)

4.7.2 (2024-07-08)

Bug Fixes

4.7.1 (2024-07-04)

Bug Fixes

  • ensure experiment error messages work on python 3.8 and 3.9 (#3841) (2595cfb)

4.7.0 (2024-07-03)

Features

Bug Fixes

  • allow invocations of OpenAIModel without api key (#3820) (4dd8c0e)

4.6.3 (2024-07-03)

Bug Fixes

4.6.2 (2024-07-02)

Bug Fixes

  • experiments: order annotations by name to make output deterministic (#3806) (256035f)

4.6.1 (2024-07-02)

Documentation

4.6.0 (2024-07-02)

Features

  • create_evaluator decorators (#3642) (56acddd)
  • ability to clear data older than X date, fix DB constraint errors for span.id from datasets to projects (#3670) (993ad5d)
  • add annotations resolver on DatasetRun type (#3473) (c677091)
  • Add basic evaluators for string experiment outputs (#3534) (85bec41)
  • add dataset-related tables (#3169) (b164dfe)
  • add experiment-related tables and migrations (#3381) (b08e8d4)
  • add experiments resolver to DatasetExample gql type (#3446) (f526025)
  • add graphql resolver for adding spans to datasets (#3205) (b80979e)
  • Add LLM evaluators (#3571) (032672b)
  • add patchDatasetExamples mutation (#3343) (9ffe198)
  • add project resolver on span (#3406) (b64d78b)
  • Add relevance evaluator (#3604) (da4a6b3)
  • add runs resolver on Experiment type (#3465) (8140957)
  • add span resolver on DatasetExample gql type (#3394) (6c46d50)
  • auth: ability to set headers via environment variables (ff5b64d)
  • compareExperiments resolver (#3481) (2becd18)
  • dataset example slideover (#3325) (c64f99b)
  • dataset: gql dataset versions connection (#3222) (de28b12)
  • datasets: add reference as alias of expected for evaluator argument bindings (#3790) (fdd070a)
  • datasets: add client method for appending to datasets (#3659) (9c444a8)
  • datasets: add dataframe transformation to dataset (#3736) (fb5730a)
  • datasets: add example modal (#3424) (e52867c)
  • datasets: add graphql field from trace to project (#3606) (7a54241)
  • datasets: add jsonl to download menu (#3495) (fcd6c27)
  • datasets: add pagination to dataset examples table (#3299) (33d7a74)
  • datasets: add sequence number for experiments of the same dataset (#3486) (1a692cf)
  • datasets: add span to dataset from the trace page (#3230) (945af8c)
  • datasets: add the ability to create a dataset dynamically (#3712) (81c0cae)
  • datasets: allow unrecognized parameters in the evaluator function with default values (#3674) (8b97a5e)
  • datasets: capture traces from experiments and their evaluations (#3579) (1917cd7)
  • datasets: create dataset UI (#3217) (5183620)
  • datasets: dataset upload endpoint (plus fixtures) (#3183) (626f18d)
  • datasets: datasets graphql (#3192) (1697d96)
  • datasets: datasets page (#3172) (89305fe)
  • datasets: Delete dataset mutation (#3321) (053fa31)
  • datasets: Delete dataset UI (#3336) (202e9f8)
  • datasets: Delete examples (#3352) (42ab894)
  • datasets: delete examples mutation (#3324) (febea33)
  • datasets: deny v1 routes and gql mutations if readonly (#3501) (de376cf)
  • datasets: Display latest version (#3373) (66cd6a8)
  • datasets: download csv button (#3312) (e5b83a2)
  • datasets: download dataset as CSV text file (#3250) (9629d39)
  • datasets: download jsonl for openai (#3493) (e4412ef)
  • datasets: example and experiment count on datasets table (#3447) (2e3413a)
  • datasets: example experiment runs (#3476) (db592a8)
  • datasets: expose the API playgrounds (#3204) (da1416b)
  • datasets: get_dataset_by_name (726d97d)
  • datasets: gql dataset create (#3203) (679a868)
  • datasets: gql for adding examples (#3266) (4049228)
  • datasets: gql resolver for dataset example count (#3437) (862bb1f)
  • datasets: gql resolver for experiment count (#3443) (5b6bc5c)
  • datasets: gql resolver returns examples in descending order (#3448) (624ba10)
  • datasets: JSON endpoint to get dataset versions (#3323) (fec38ff)
  • datasets: link to view source span (#3413) (faa925e)
  • datasets: multi-select on span / traces tables (#3236) (160c4e6)
  • datasets: navigate to examples if no experiments exist (cbbed30)
  • datasets: post the result of each experiment/evaluation run immediately when it finishes (#3666) (4e21d2c)
  • datasets: print experiment summaries (#3709) (7c70afa)
  • datasets: print the URL to the dataset when uploaded (#3647) (76439cf)
  • datasets: python instructions (#3569) (ee0788a)
  • datasets: routing for examples and experiment pages (#3470) (141b90c)
  • datasets: show example details in a slide-over (b1a1317)
  • datasets: sort by name and createdAt (79f8c88)
  • datasets: sort on version (#3370) (41348cf)
  • datasets: spans as examples (#3279) (1d46c42)
  • datasets: synchronously upload dataset examples returning dataset_id in JSON (#3347) (c32ac4d)
  • datasets: UI to edit a dataset example (#3376) (3950256)
  • datasets: upload JSON for dataset examples (#3658) (47ef311)
  • datasets: usability enhancements (#3773) (912dc9b)
  • datasets: version history modal (#3444) (86755a4)
  • display average run latency in the experiments table (#3743) (cfaafd5)
  • error rate resolver on Experiment type (#3588) (ceaea16)
  • Experiments improvements (#3638) (bd85bea)
  • experiments: add experiment name (#3512) (801ac29)
  • experiments: add the ability to view an experiment's traces (#3603) (084a0c6)
  • experiments: comparison details slideover (74d1bd0)
  • experiments: delete experiments ui (623805c)
  • experiments: delete experiments ui (b942b59)
  • experiments: detail view for comparison (ebc4aa1)
  • experiments: evaluator icon and ingestion (#3639) (70ba085)
  • experiments: evaluator trace slide-over (#3680) (2df5b9d)
  • experiments: experiment error rate column (#3657) (41d354f)
  • experiments: experiment evaluation summaries in the table (#3575) (85c457a)
  • experiments: experiments compare table (47af587)
  • experiments: experiments table (#3454) (a9981da)
  • experiments: full-text toggle for experiments table (537ed97)
  • experiments: gql resolver for experiments (#3404) (6d70786)
  • experiments: Implement run_experiment (#3471) (87a0501)
  • experiments: navigation to experiments view (#3509) (a293f7e)
  • experiments: run count resolver on experiments (#3679) (2444f42)
  • experiments: show run count (#3690) (2c79a78)
  • experiments: show trace slide-over on experiment page (#3640) (8457cb5)
  • experiments: ability to view evaluator traces (811290e)
  • experiments: add the ability to view experiment metadata in full (#3686) (3560e1d)
  • experiments: minimum viable dialog showing how to run an experiment (#3704) (4fb13b8)
  • experiments: Switch UI to use experiment name (#3523) (a953231)
  • gql resolver for dataset examples (#3238) (fa0b4d2)
  • Implement GET /datasets/id and GET /datasets (#3197) (36abede)
  • Implement experiments REST API (#3411) (d369fb3)
  • implement get_dataset method on phoenix.Client (#3490) (09fb3f0)
  • implement initial experiment evals (#3526) (b6fabdf)
  • implement patchDataset mutation (#3457) (a0240b3)
  • Improve task argument binding and document run_experiment (#3789) (0b64cbe)
  • List Dataset Examples (#3271) (d5f4391)
  • resolvers for experiment annotation aggregations (#3549) (227e6e0)
  • Support repetitions for experiment runs (#3532) (7942694)
  • ui: display examples in dataset page (#3277) (829746a)
  • Unify run_experiment and evaluate_experiment (#3585) (7e1ffb6)

Bug Fixes

  • add tiebreak to versions resolver (#3488) (ac23ec7)
  • Address relevance eval feedback (#3609) (b231169)
  • datasets: allow duplicate keys for csv upload (#3464) (a0a5b25)
  • datasets: api spec for upload endpoint (#3213) (b719267)
  • datasets: bug with json upload (#3663) (d667b8f)
  • datasets: colab usage of dataset.examples should no longer be list (#3781) (4f148ae)
  • datasets: filter examples by dataset in gql (#3330) (e5606e7)
  • datasets: free up the output keyword as attribute of experiment run objects (#3793) (6b4db71)
  • datasets: get metadata as {} when its value is None in JSON (#3555) (6249ebe)
  • datasets: json return payload for upload csv endpoint (#3364) (4a1d063)
  • datasets: make tests pass with new client (5cfdc5b)
  • datasets: missing annotation trace id (#3664) (d800e36)
  • datasets: reconcile Dataset methods (#3508) (43db5bc)
  • datasets: select nested rows on traces (#3489) (0bdb860)
  • datasets: show full bar on evals of all 1s (#3733) (3faa051)
  • datasets: squash experiment run output by "result" key for graphql query (#3672) (20dba43)
  • datasets: typo on dict type for typed dict (#3684) (5e8e9a3)
  • datasets: update span kind for evaluator with semantic conventions v0.1.9 (#3667) (ff2de45)
  • ensure patches are sorted in numeric patch order (#3379) (70facf1)
  • experiments: Improve the performance of the table (#3732) (8e33b77)
  • experiments: fix colab links (#3637) (841ac0d)
  • fix annotation trace ts errors (8314aa5)
  • json cell for experiment metadata (#3556) (f9e2b6d)
  • openapi import error (#3619) (1f81c05)
  • openapi yaml parsing for containers (#3788) (959abf7)
  • order runs in descending order in runs resolver on Experiment type (#3480) (e1818b7)
  • resolve sqlalchemy warning regarding remote (#3522) (cd15d9b)
  • style and type errors (#3540) (2cba662)
  • switch to upload_dataset for examples (#3783) (bea7c2f)
  • ui: right align numeric columns (#3587) (781ae7a)

Documentation

  • Added more detail prepping and exporting eval data to the Bring Your Own Evaluator section (GITBOOK-704) (96a312b)
  • api-ref: fix readthedocs build issues (#3706) (0827726)
  • Cleanup datasets section (GITBOOK-694) (18a4d5b)
  • Datasets documentation (GITBOOK-697) (8148f67)
  • Datasets review - fixing typos, syntax, labels, links (GITBOOK-702) (fcb56ee)
  • datasets tutorials and quickstart (#3734) (cfa641c)
  • datasets: print useful URLs, disable repetitions (#3583) (14c7d9f)
  • experiments: prompt template iteration for summarization task (#3669) (0842df4)
  • experiments: txt2sql (#3626) (33cd194)
  • experiments: txt2sql (#3714) (b083159)
  • fix creating datasets (GITBOOK-701) (9b83b1d)
  • fix typos (GITBOOK-698) (d413e54)
  • GPT-4o first set (GITBOOK-695) (8dff0bf)
  • No subject (GITBOOK-696) (88859e1)
  • No subject (GITBOOK-699) (9beed78)
  • No subject (GITBOOK-700) (5ac466c)
  • No subject (GITBOOK-703) (f04e9c5)
  • No subject (GITBOOK-707) (2237a88)
  • notebook: datasets and experiments quickstart (#3703) (991df49)
  • placeholders for experiments (GITBOOK-705) (1f7d183)
  • readthedocs (71fceab)
  • rest api guidance (#3314) (0309017)
  • small fixes (GITBOOK-706) (297458e)
  • small fixes (GITBOOK-708) (4990aa5)
  • sphinx api-ref for readthedocs (0bcccbd)
  • update dataset creation (GITBOOK-711) (51c5ea1)
  • use kwargs with datasets (#3748) (530b2c6)
  • use kwargs with datasets (#3748) (#3749) (599e340)

4.5.0 (2024-06-21)

Features

4.4.3 (2024-06-17)

Bug Fixes

4.4.2 (2024-06-13)

Bug Fixes

4.4.1 (2024-06-11)

Bug Fixes

4.4.0 (2024-06-10)

Features

  • ui: add filter snippets for metadata and substring search (#3451) (2c37be4)

4.3.1 (2024-06-10)

Bug Fixes

4.3.0 (2024-06-07)

Features

Bug Fixes

Documentation

  • minimum working example with a local llm (#3348) (e4c657c)

4.2.4 (2024-05-28)

Bug Fixes

Documentation

  • Update langchain dependencies in tutorials (#3316) (e403652)

4.2.3 (2024-05-23)

Bug Fixes

4.2.2 (2024-05-23)

Bug Fixes

4.2.1 (2024-05-23)

Bug Fixes

4.2.0 (2024-05-23)

Features

  • docker image runs as root by default with tags for nonroot and debug images (#3282) (7178c25)

4.1.3 (2024-05-22)

Bug Fixes

  • need to check ".get()" because attribute may not be a dict (#3267) (3917fcc)

4.1.2 (2024-05-20)

Bug Fixes

  • join on trace_id in get_qa_with_reference (#3248) (a88d4ff)

Documentation

4.1.1 (2024-05-17)

Bug Fixes

4.1.0 (2024-05-17)

Features

  • Add ASGI root path parameter to Phoenix server (#3186) (e27cc5d)

Documentation

4.0.3 (2024-05-13)

Bug Fixes

  • Always wait a small amount of time between inserts (#3168) (6e18e3c)

4.0.2 (2024-05-11)

Bug Fixes

  • Bulk inserter begins first insert immediately (#3151) (7e17cb2)
  • unflatten attributes when loading spans from trace_dataset (#3170) (a165023)

4.0.1 (2024-05-09)

Bug Fixes

  • coerce input.value to string at ingestion (#3147) (3742ea7)

Documentation

4.0.0 (2024-05-09)

⚠ BREAKING CHANGES

  • Remove experimental module (#2945)

Features

  • Add log_traces method that sends TraceDataset traces to Phoenix (#2897) (c8f9ed2)
  • add a last N time range selector on project / projects pages (#2907) (3c115f8)
  • add bedrock claude tracing tutorial (#2919) (b8b5240)
  • add default limit to /v1/spans and corresponding client methods (#3026) (e5698d7)
  • add gradient start/end to projects table (#2956) (5b6b217)
  • add grpc endpoint (#2232) (8bbd136)
  • Add indexes on Annotation tables (#3082) (682ecee)
  • Add indexes on spans table (#3098) (12d2574)
  • add opentelemetry trace instrumentation for Phoenix server (#2990) (6ed494e)
  • Add SQL and Code Functionality Eval Templates (#2861) (c7d776a)
  • add trace and document evals to GET v1/evaluations (#2910) (79229f2)
  • Add user frustration eval (#2928) (406938b)
  • Added support for default_headers for azure_openai. (#2917) (6ee5f24)
  • convert graphql api to pull trace evaluations from db (#2867) (11aa455)
  • Deprecate datasets module, rename to inferences (#2785) (4987ea3)
  • experimental: postgres support (a2657d4)
  • fetch annotation names (#2964) (6c5d25d)
  • fetch document retrieval metrics per span using SQL (#2960) (9fdb765)
  • graphql api pulls from db for document evaluations (#2865) (e4b667d)
  • grpc interceptor for prometheus (#3056) (610c8fa)
  • ingest document evals (#2847) (f3fde50)
  • ingest pyarrow span evals into sqlite (#2837) (3a6666c)
  • ingest trace annotations (#2852) (792f674)
  • make graphql api for span evaluations read from database (#2860) (5adf750)
  • move document evaluation summary to pull from db (#2888) (73ca2d7)
  • openapi ui for api exploration (#3041) (5b22961)
  • persistence: add support for sorting by eval scores and labels (#2977) (44c3068)
  • persistence: bulk inserter for spans (#2808) (9ce841e)
  • persistence: clear project (#2976) (665c166)
  • persistence: clear traces UI (#2988) (a717ff6)
  • persistence: dataloader for document retrieval metrics (#2978) (f55c458)
  • persistence: dataloader for span descendants (#2980) (d8e10d4)
  • persistence: ensure migrations run for ThreadSession (#2855) (ec4fea7)
  • persistence: fetch latency_ms percentiles using sql with dataloaders (#2818) (48d4643)
  • persistence: fetch streaming_last_updated_at (#2819) (d665e49)
  • persistence: get or delete projects using sql (#2839) (527b9a9)
  • persistence: json binary for postgres (#2849) (29351bf)
  • persistence: launch app with persist (#2817) (add6103)
  • persistence: make launch_app runnable on tmp directory (#2851) (f41e922)
  • persistence: span annotation tables (#2788) (874c61e)
  • persistence: span query DSL with SQL (#2911) (7c01420)
  • persistence: sql sorting for spans (#2823) (eeafb64)
  • persistence: use sqlean v3.45.1 as sqlite engine (#2947) (3b202d7)
  • Remove experimental module (#2945) (01758cf)
  • restrict project metrics to be last 7 days (#2896) (066bc16)
  • span filtering by span evaluations (#2923) (4458ec4)
  • Support basic auth (#3061) (3202256)
  • support for span evaluations to get evaluations endpoint (#2900) (379e336)
  • support pagination on spans resolver (#3046) (2113c5c)
  • Update API for OpenAPI compliance (#2866) (0db65d8)
  • Update eval summaries to use persistence (#2920) (06eb320)

Bug Fixes

3.25.0 (2024-05-06)

Features

Bug Fixes

Documentation

  • development: make it explicit that you need to run pnpm build (#3035) (672cbed)

3.24.0 (2024-04-22)

Features

Bug Fixes

  • ensure recent version of opentelemetry-proto is used (#2948) (33647f5)

3.23.0 (2024-04-19)

Features

  • Added support for default_headers for azure_openai. (#2917) (6ee5f24)

Bug Fixes

Documentation

3.22.0 (2024-04-16)

Features

  • Add log_traces method that sends TraceDataset traces to Phoenix (#2897) (c8f9ed2)

3.21.0 (2024-04-12)

Features

  • Add SQL and Code Functionality Eval Templates (#2861) (c7d776a)

3.20.0 (2024-04-10)

Features

  • Deprecate datasets module, rename to inferences (#2785) (4987ea3)

Documentation

  • dockerize manual instrumentation example (#2797) (651efbe)
  • remove experimental tags in code (4c4a832)

3.19.4 (2024-04-04)

Bug Fixes

  • switch license format in toml (5c6f345)

Documentation

  • fix qa with reference tutorial (e1db1ce)
  • fix qa with reference tutorial (ba24950)
  • make dockerhub URL go to public (6650f67)
  • manually instrumented chatbot (#2730) (46be32b)

3.19.3 (2024-03-30)

Bug Fixes

  • ui: show formatted JSON for attributes (0d1b719)
  • ui: show formatted JSON for attributes (09ad1be)

3.19.2 (2024-03-29)

Bug Fixes

  • ui: broken context for markdown (556e901)

3.19.1 (2024-03-29)

Bug Fixes

  • UI: color rotation for markdown (3184359)

3.19.0 (2024-03-29)

Features

  • gql: add trace node and trace evaluations (#2662) (a985684)

3.18.1 (2024-03-28)

Bug Fixes

  • ignore docs/ directory when formatting (#2714) (1340f74)
  • repair frontend build step in release pipeline (#2716) (796eb6a)

3.18.0 (2024-03-28)

Features

3.17.1 (2024-03-24)

Bug Fixes

  • long project names do not overflow and squash project icon (#2686) (b77bfaa)

Documentation

  • Add mistral (GITBOOK-594) (78676af)
  • add mistral instrumentation to notebook (#2681) (54dc47d)
  • add mistral instrumentor to mistral tutorial (#2682) (13fc1f8)
  • Evals Structure! (GITBOOK-547) (ac23311)
  • fix missing parentheses (GITBOOK-571) (2353953)
  • Mistral (GITBOOK-595) (f245844)
  • No subject (GITBOOK-597) (b6196ac)
  • No subject (GITBOOK-598) (f6a2bd6)
  • Remove pinecone notebook (#2665) (9f1c1d4)
  • trace a deployed app (GITBOOK-593) (08623ea)

3.17.0 (2024-03-21)

Features

  • Add response_format argument to MistralAIModel (#2660) (7da51af)
  • evals: Add Mistral as an eval model (#2640) (c13ab6b)

Documentation

3.16.3 (2024-03-20)

Bug Fixes

3.16.2 (2024-03-20)

Bug Fixes

Documentation

3.16.1 (2024-03-19)

Bug Fixes

  • trace: eliminate truth ambiguity with non-empty numpy arrays (#2626) (be8ce7d)

Documentation

3.16.0 (2024-03-15)

Features

3.15.1 (2024-03-15)

Bug Fixes

  • handle numpy types in json.dumps for gql (#2600) (13cce4f)

Documentation

3.15.0 (2024-03-14)

Features

  • launch_app() with experimental span storage using environment variables for storage path and storage type enums (#2564) (8a0b572)
  • project archiving and deletion (#2585) (121f904)

Bug Fixes

  • projects: the home page should direct you to the projects page if there are multiple projects with data (#2586) (ced4e75)
  • use environment variable for project name (#2590) (e2ace76)

Documentation

3.14.2 (2024-03-14)

Bug Fixes

3.14.1 (2024-03-14)

Bug Fixes

3.14.0 (2024-03-14)

Features

  • experimental span storage with append-only text files (#2553) (909672b)

Bug Fixes

  • sagemaker: graphql base url was incorrect for sagemaker jupyterlab (#2572) (7ecf46e)

3.13.1 (2024-03-13)

Bug Fixes

3.13.0 (2024-03-13)

Features

  • add arize-phoenix support for python 3.12 (#2555) (aac0cd5)

3.12.0 (2024-03-13)

Features

Bug Fixes

  • prevent browser caching of static assets (#2549) (038e56e)

3.11.1 (2024-03-12)

Bug Fixes

3.11.0 (2024-03-11)

Features

  • graphql: embed project inside graphql span as private attribute (#2522) (9be1afa)
  • trace: context manager to pause tracing (#2520) (6bf7232)

Bug Fixes

Documentation

  • Update pyproject.toml with proper byline (4fdf710)

3.10.0 (2024-03-09)

Features

  • projects: add support for the PHOENIX_PROJECT_NAME param (#2515) (6f24786)
  • show first non-empty project (#2508) (54a2834)

Bug Fixes

  • support minimal llama-index installations (#2516) (2469677)

Documentation

3.9.0 (2024-03-08)

Features

  • ui: copy to clipboard for prompt template etc. (#2496) (9b853d0)

3.8.0 (2024-03-07)

The Phoenix evals module is graduating out of experimental! You can now install Phoenix evals as a standalone package with pip install arize-phoenix-evals or you can include the new version of phoenix.evals along with the Phoenix install with pip install -U arize-phoenix[evals]. Swapping to the new evals module includes a few small breaking changes which might require some migration work. Details can be found in MIGRATION.md.

phoenix.experimental.evals is being deprecated and will remain in Phoenix for about a month before being removed.
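
During the deprecation window, code that has to run against both old and new installs can try the new module path first and fall back to the old one. The sketch below shows one way to do that. Only the two module paths come from the note above; first_importable is a hypothetical helper, and the breaking changes described in MIGRATION.md may still require call-site updates.

```python
import importlib
from types import ModuleType
from typing import Optional, Sequence


def first_importable(names: Sequence[str]) -> Optional[ModuleType]:
    """Return the first module in `names` that imports cleanly, else None."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


# Prefer the graduated module; fall back to the deprecated location.
evals = first_importable(["phoenix.evals", "phoenix.experimental.evals"])
```

If neither path is importable, evals is None, so callers should check for that before use.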

Features

Documentation

3.7.0 (2024-03-07)

Features

3.6.0 (2024-03-06)

Features

Bug Fixes

3.5.0 (2024-03-05)

Features

Bug Fixes

  • Properly define BedrockModel (#2425) (81a720c)
  • remove computed attributes from exported dataframe (#2366) (1de1415)
  • turn span_kind enums into string because it's not serializable by pyarrow (#2438) (50c7eb0)
  • update rag and llm ops notebooks (#2442) (adf1b2b)

Documentation

  • evals: update tracing tutorials with arize-phoenix-evals (#2386) (1af8187)
  • log information about the server at startup (#2445) (6d410c1)
  • update readme for phoenix.evals, fix llama-index example (#2435) (dfffaad)

3.4.1 (2024-02-29)

Bug Fixes

3.4.0 (2024-02-28)

Features

  • Add phoenix.evals bridge to phoenix and add evals extra install (#2389) (d8b9054)

Bug Fixes

  • remove run_relevance_evals and fix import issues (#2375) (9a97e62)
  • traces: add y scroll on trace tree (#2399) (9c4f6b9)

Documentation

3.3.0 (2024-02-23)

Features

Bug Fixes

  • use static version in pyproject.toml for packages (#2346) (ef2148c)

Documentation

3.2.1 (2024-02-16)

Bug Fixes

Documentation

  • update notebooks for px.Client().log_evaluations() (#2311) (a3ca311)

Miscellaneous Chores

3.2.0 (2024-02-16)

Features

Bug Fixes

3.1.2 (2024-02-15)

Bug Fixes

  • allow json string for metadata span attribute (#2301) (ec7fbe2)
  • ui: safely parse JSON and fallback to string for span attributes (#2293) (e43cdbb)
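
The "parse JSON, fall back to string" behavior named in the UI fix above is a common pattern. The fix itself lives in the front end; the sketch below only restates the idea in Python, and parse_attribute is a hypothetical name, not Phoenix's implementation.

```python
import json
from typing import Any


def parse_attribute(value: str) -> Any:
    """Return the decoded JSON value, or the raw string if it is not valid JSON."""
    try:
        return json.loads(value)
    except (TypeError, ValueError):
        # json.JSONDecodeError is a subclass of ValueError.
        return value
```

With this approach a malformed attribute degrades to plain text rather than raising an exception in the span view.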

Documentation

3.1.1 (2024-02-15)

Bug Fixes

  • cast message to string in vertexai model (86947a2)

Documentation

3.1.0 (2024-02-15)

Features

Bug Fixes

  • set global session to None if it fails to start (#2286) (6752fd2)
  • trace: Make dataset IDs unique by instance for TraceDataset (#2254) (1ac170f)

Documentation

  • trace: refactor llama-index tutorials to use 0.10.0 (#2277) (055b8d6)

3.0.3 (2024-02-13)

Bug Fixes

  • trace: perform library version compatibility on llama_index (#2272) (89bc510)

3.0.2 (2024-02-13)

Bug Fixes

  • run_evals correctly falls back to default responses on error (#2233) (4b2bd39)

3.0.1 (2024-02-09)

Bug Fixes

3.0.0 (2024-02-09)

⚠ BREAKING CHANGES

  • replace Phoenix tracers with OpenInference instrumentors (#2190)

Features

  • replace Phoenix tracers with OpenInference instrumentors (#2190) (b983c70)

2.11.1 (2024-02-09)

Bug Fixes

  • ui: add last_hour, fix end of hour rounding (#2247) (aa4efaf)

2.11.0 (2024-02-08)

Features

Bug Fixes

  • evals: properly use kw args for models in notebooks (#2235) (7bd59d5)

2.10.0 (2024-02-07)

Features

  • embeddings: add search by text and ID on selection (#2219) (99c480c)

Bug Fixes

  • endpoint for client inside ProcessSession (#2211) (82e279e)
  • trace: return to /tracing url when dismissing trace slide over (#2222) (ee4ced3)
  • traces: warn if collector endpoint is set but launch app is called (#2209) (eb97b8d)

Documentation

  • custom instrumentation (GITBOOK-495) (3310ba6)
  • update px.Client (GITBOOK-494) (61b427c)

2.9.4 (2024-02-06)

Bug Fixes

  • disregard active session if endpoint is provided to px.Client (#2206) (6ec0d23)

2.9.3 (2024-02-05)

Bug Fixes

2.9.2 (2024-02-05)

Bug Fixes

2.9.1 (2024-02-05)

Bug Fixes

Documentation

2.9.0 (2024-02-05)

Features

  • phoenix client get_evaluations() and get_trace_dataset() (#2154) (29800e4)
  • phoenix client get_spans_dataframe() and query_spans() (#2151) (e44b948)

2.8.0 (2024-02-02)

Features

Bug Fixes

  • broken link and openinference links (#2144) (01fb046)
  • databricks check crashes in python console (#2152) (5aeeeff)
  • default collector endpoint breaks on windows (#2161) (f1a2007)
  • Do not retry when context window has been exceeded (#2126) (ff6df1f)
  • remove hyphens from span_id in legacy evaluation fixtures (#2153) (fae859d)

Documentation

  • add docker badge (e584ed8)
  • Add terminal running steps (GITBOOK-441) (91c6b24)
  • No subject (GITBOOK-442) (5c4eb6c)
  • No subject (GITBOOK-443) (11f46cb)
  • No subject (GITBOOK-444) (fcf2bc9)
  • update badge (ddcecea)
  • update prompt to reflect rails (GITBOOK-445) (dea6dd6)

Miscellaneous Chores

2.7.0 (2024-01-24)

Features

  • persistence: add a PHOENIX_WORKING_DIR env var for setting up a… (#2121) (5fbb2e6)

2.6.0 (2024-01-23)

Features

Bug Fixes

  • Clean up vertex clients after event loop closure (#2102) (202c7ea)
  • Determine default async concurrency on a per-model basis (#2096) (b44d8aa)
  • Resolves Bedrock model compatibility issues (#2114) (c4a5343)
  • show localhost when the notebook is running locally (#2090) (095298d)

Documentation

2.5.0 (2024-01-16)

Features

Bug Fixes

  • Adjust evaluation templates and rails for Gemini compatibility (#2075) (3a7bfd2)

2.4.1 (2024-01-11)

Bug Fixes

  • traces: prevent missing key exception when extracting invocation parameters in llama-index (#2076) (5cc9560)

2.4.0 (2024-01-10)

Features

Bug Fixes

  • Handle missing vertex candidates (#2055) (1d0475a)
  • OpenAI clients are not cleaned up after calls to llm_classify (#2068) (3233d56)
  • traces: remove nan from log_evaluations (#2056) (df9ed5c)

Documentation

2.3.0 (2024-01-08)

Features

Bug Fixes

Documentation

  • Add demo link, examples getting started (GITBOOK-396) (e987315)
  • Add Evaluating Traces Section (GITBOOK-386) (7d72029)
  • Add evaluations section for results (GITBOOK-387) (2e74be0)
  • Add final thoughts to evaluation (GITBOOK-405) (20eab16)
  • add import statement (GITBOOK-408) (23247d7)
  • add link (GITBOOK-403) (0be280a)
  • eval concepts typo (GITBOOK-394) (7c80d4b)
  • eval concepts typos (GITBOOK-393) (62bc99f)
  • evaluation concepts typo fix (GITBOOK-390) (2cbc1dc)
  • Extract Data from Spans (GITBOOK-383) (440f530)
  • fix broken section link (GITBOOK-409) (fee537b)
  • fix typos (GITBOOK-391) (c8f5a55)
  • fix typos (GITBOOK-402) (3cd973d)
  • fix typos (GITBOOK-406) (eaa9bea)
  • fix typos (GITBOOK-407) (cad4820)
  • Initial draft of evaluation core concept (GITBOOK-385) (67369cf)
  • Log Evaluations (GITBOOK-389) (369d79d)
  • No subject (GITBOOK-399) (94df884)
  • Re-arrange nav (GITBOOK-398) (54a87eb)
  • Remove the word golden, simplify title (GITBOOK-395) (a2233b2)
  • simplify concepts (GITBOOK-384) (c38f6c2)
  • Simplify examples page (GITBOOK-400) (6144158)
  • Trace Evaluations Section (GITBOOK-388) (2ffa800)
  • Update SECURITY.md (#2029) (363e891)

2.2.1 (2023-12-28)

Bug Fixes

  • Do not retry if eval was successful when using SyncExecutor (#2016) (a869190)
  • ensure float values are properly encoded by otel tracer (#2024) (b12a894)
  • ensure llamaindex spans are correctly encoded (#2023) (3ca6262)
  • Use separate versioning file (#2020) (f38eedf)

2.2.0 (2023-12-22)

Features

  • Add support for Google's Gemini models via Vertex python sdk (#2008) (caf826c)
  • Support first-party Anthropic python SDK (#2004) (a323283)

2.1.0 (2023-12-21)

Features

  • instantiate evaluators by criteria (#1983) (9c72616)
  • support function calling for run_evals (#1978) (8be325c)
  • traces: add v1/traces HTTP endpoint to handle ExportTraceServiceRequest (#1968) (3c94dea)
  • traces: add retrieval summary to header (#2006) (8af0582)
  • traces: evaluation summary on the header (#2000) (965beb0)

Bug Fixes

2.0.0 (2023-12-20)

⚠ BREAKING CHANGES

  • Update llm_classify and llm_generate interfaces (#1974)

Features

Bug Fixes

Documentation

1.9.0 (2023-12-11)

Features

Documentation

  • Add LLM Tracing+Evals notebook with keyless example (#1928) (4c4aac6)

1.8.0 (2023-12-10)

Features

1.7.0 (2023-12-09)

Features

Bug Fixes

  • traces: span evaluations missing from the header (#1908) (5ace81e)

1.6.0 (2023-12-08)

Features

  • openai streaming spans show up in the ui (#1888) (ffa1d41)
  • support instrumentation for openai synchronous streaming (#1879) (b6e8c73)
  • traces: display document retrieval metrics on trace details (#1902) (0c35229)
  • traces: filterable span and document evaluation summaries (#1880) (f90919c)
  • traces: graphql query for document evaluation summary (#1874) (8a6a063)

Documentation

1.5.1 (2023-12-06)

Bug Fixes

1.5.0 (2023-12-06)

Features

  • evals: Human vs AI Evals (#1850) (e96bd27)
  • semantic conventions for tool_calls array in OpenAI ChatCompletion messages (#1837) (c079f00)
  • support asynchronous chat completions for openai instrumentation (#1849) (f066e10)
  • traces: document retrieval metrics based on document evaluation scores (#1826) (3dfb7bd)
  • traces: document retrieval metrics on trace / span tables (#1873) (733d233)
  • traces: evaluation annotations on traces for associating spans with eval metrics (#1693) (a218a65)
  • traces: server-side span filter by evaluation result values (#1858) (6b05f96)
  • traces: span evaluation summary (aggregation metrics of scores and labels) (#1846) (5c5c3d6)

Bug Fixes

Documentation

  • RAG evaluation notebook using traces (#1857) (4b67805)
  • Retrieval Chunks (GITBOOK-372) (39976d3)

1.4.0 (2023-11-30)

Features

  • propagate error status codes to parent spans for improved visibility into trace exceptions (#1824) (1a234e9)

1.3.0 (2023-11-30)

Features

  • Add OpenAI Rate limiting (#1805) (115e044)
  • evals: show span evaluations in trace details slideout (#1810) (4f0e4dc)
  • evaluation ingestion (no user-facing feature is added) (#1764) (7c4039b)
  • feature flags context (#1802) (a2732cd)
  • Implement asynchronous submission for OpenAI evals (#1754) (30c011d)
  • reference link correctness evaluation prompt template (#1771) (bf731df)
  • traces: configurable endpoint for the exporter (#1795) (8515763)
  • traces: display document evaluations alongside the document (#1823) (2ca3613)
  • traces: server-side sort of spans by evaluation result (score or label) (#1812) (d139693)
  • traces: show all evaluations in the table (#1819) (2b27333)
  • traces: Trace page header with latency, status, and evaluations (#1831) (1d88efd)

Bug Fixes

  • enhance llama-index callback support for exception events (#1814) (8db01df)
  • pin llama-index temporarily (#1806) (d6aa76e)
  • remove sklearn metrics not available in sagemaker (#1791) (20ab6e5)
  • traces: convert (non-list) iterables to lists during protobuf construction due to potential presence of ndarray when reading from parquet files (#1801) (ca72747)
  • traces: make column selector sync'd between tabs (#1816) (125431a)

Documentation

  • Environment documentation (GITBOOK-370) (dbbb0a7)
  • Explanations (GITBOOK-371) (5f33da3)
  • No subject (GITBOOK-369) (656b5c0)
  • sync for 1.3 (#1833) (4d01e83)
  • update default value of variable in run_relevance_eval (GITBOOK-368) (d5bcaf8)

1.2.1 (2023-11-18)

Bug Fixes

1.2.0 (2023-11-17)

Features

Bug Fixes

  • unpin llama-index version in tutorial notebooks (#1766) (5ff74e3)

Documentation

1.1.1 (2023-11-16)

Bug Fixes

1.1.0 (2023-11-14)

Features

Documentation

  • evals: document llm_generate with output parser (#1741) (1e70ec3)

1.0.0 (2023-11-10)

⚠ BREAKING CHANGES

  • models: openAI 1.0 (#1716)

Features

0.1.1 (2023-11-09)

Bug Fixes

  • traces: handle AIMessageChunk in langchain tracer by matching prefix in name (#1724) (8654c0a)

0.1.0 (2023-11-08)

Features

  • add long-context evaluators, including map reduce and refine patterns (#1710) (0c3b105)
  • traces: span table column visibility controls (#1687) (559852f)

Bug Fixes

Documentation