Releases: Arize-ai/phoenix
v0.0.33
What's Changed
- feat(tracing): launch_app with traces by @mikeldking in #1044
- feat: add langchain tracer (experimental) by @RogerHYang in #1049
- fix: remove copy= from set_axis for older pandas by @RogerHYang in #1054
- ci: remove unused code by @RogerHYang in #1055
- feat(gql): functionality flags to drive UX by @mikeldking in #1046
- feat(tracing): spans table by @mikeldking in #1061
- feat: llamaindex callback skeleton by @axiomofjoy in #1058
- feat(tracing): graphQL span time bookends and sorting by @mikeldking in #1064
- feat(tracing): add latency column by @mikeldking in #1071
- fix: var not defined for langchain notebook by @pbadhe in #1075
- feat: semantic convention for input and output span attributes by @RogerHYang in #1076
- fix: Tz naive timestamps conversion to local by @pbadhe in #1066
- fix: snap to end of day by @mikeldking in #1086
- fix(tracing): launch fix for traces by @mikeldking in #1095
- ci: remove mistaken pnpm lock file, clear dependabot security by @mikeldking in #1099
- feat: TraceDataset.from_spans class method by @axiomofjoy in #1096
- docs: llama index tracing internal notebook by @mikeldking in #1101
- docs: ragas tutorial by @mikeldking in #1104
- fix(datasets): Show inner error messages in DatasetError.str by @alexmojaki in #1102
- feat: langchain callback notebook by @axiomofjoy in #1100
- fix: convert timedelta to milliseconds by @RogerHYang in #1106
- feat(gql): span attributes as json string by @RogerHYang in #1107
- feat(gql): filter spans by trace id by @RogerHYang in #1108
- fix: add unknown span kind enum by @RogerHYang in #1109
- fix: remove extra print statement by @RogerHYang in #1112
- feat: semantic convention for prompt template by @RogerHYang in #1111
- docs: sentiment classification tutorial update by @axiomofjoy in #893
- fix: langchain tracer prompt template issue by @axiomofjoy in #1114
- feat(trace): exception event by @RogerHYang in #1113
- feat(tracing): llama_index IO by @mikeldking in #1110
- chore: migrate react-table by @mikeldking in #1116
- chore: use getpass on ragas tutorial by @mikeldking in #1115
- feat(tracing): span table with sorting and pagination by @mikeldking in #1118
- feat(trace): semantic convention for token counts by @RogerHYang in #1128
- fix: table sort icon arrow direction by @RogerHYang in #1130
- feat(tracing): trace details by @mikeldking in #1132
- feat(trace): gql token count by @RogerHYang in #1131
- feat(trace): gql descendant spans list by @RogerHYang in #1133
- feat(tracing): Expose get spans method by @mikeldking in #1143
- fix: restricting pydantic version for unit tests with llama-index by @RogerHYang in #1145
- feat: add llm attributes to langchain tracer by @axiomofjoy in #1119
- feat(tracing): tree view of traces by @mikeldking in #1134
- feat: llm message attribute on llamaindex callback by @axiomofjoy in #1135
- feat(trace): gql span input/output by @RogerHYang in #1147
- feat(trace): nested attributes by @RogerHYang in #1149
- feat(tracing): traces table by @mikeldking in #1148
- ci: bump llamaIndex by @mikeldking in #1144
- ci: restore random fixture by @RogerHYang in #1154
- feat(tracing): span status gql and table view UI by @mikeldking in #1155
- docs: Zilliz <> Arize Notebook for Search and Retrieval by @hakantekgul in #1158
- feat(tracing): span events and exceptions highlighting by @mikeldking in #1171
- refactor(tracing): make SpanIO type have mimetype value pair be required by @mikeldking in #1178
- feat: semantic conventions for tools, langchain implementation and tests by @axiomofjoy in #1177
- fix: add invocation parameters to llm spans only by @axiomofjoy in #1180
- feat(tracing): info and IO details by @mikeldking in #1179
- docs: Update milvus_llamaindex_search_and_retrieval_tutorial.ipynb by @hakantekgul in #1187
- feat(trace): gql cumulative token count by @RogerHYang in #1184
- feat(trace): gql count and timestamp bookends by @RogerHYang in #1181
- chore: make langchain notebook and scripts a bit more re-usable by @mikeldking in #1188
- feat(traces): add tracing overview stats by @mikeldking in #1193
- feat(trace): semantic convention for retrieval documents by @RogerHYang in #1197
- chore(tracing): feature parity for llama-index tracing notebook by @mikeldking in #1196
- fix(notebook): KeyError in os.environ by @RogerHYang in #1212
- fix: include 33rc2 in pip install for traces by @RogerHYang in #1214
- fix(notebook): pip install chromadb by @RogerHYang in #1215
- fix: remove chroma dep from langchain callback tutorial by @axiomofjoy in #1217
- fix: disable sorting as a stopgap by @RogerHYang in #1219
- chore: rename callback notebooks by @axiomofjoy in #1221
- fix(css): don't flex hidden tab by @mikeldking in #1223
- docs: langchain tracing notebook with google palm by @axiomofjoy in #1225
- ci: exclude scripts/data from type checks by @RogerHYang in #1216
- fix(trace): allow null end times by @RogerHYang in #1228
- fix: pandas `.apply` returns DataFrame when it should return Series by @RogerHYang in #1229
- ci: upgrade dependencies by @RogerHYang in #1230
- fix: resolve type issue by @axiomofjoy in #1232
- feat: embedding attribute semantic conventions and llamaindex implementation by @axiomofjoy in #1169
- feat(trace): add attributes to events by @RogerHYang in #1172
- chore: notebooks to wrangle ms marco and wiki qa datasets by @axiomofjoy in #1239
- feat: download evaluation dataset by @axiomofjoy in #1241
- feat: Add binary evals using OpenAI model by @fjcasti1 in #1246
- chore: convert llm_eval_binary to synchronous function by @axiomofjoy in #1248
- fix: change tuple to list by @RogerHYang in #1249
- refactor(tracing): change llm messages to a list of objects by @mikeldking in #1254
- feat: add run_eval function by @axiomofjoy in #1250
- fix(trace): make timestamps timezone aware for llama-index callback by @RogerHYang in #1256
- feat(trace): span streaming, filtering, and exporting by @RogerHYang in #1235
- feat(trace): token counts for llama-index callback by @RogerHYang in #1257
- docs: llamaindex tracing tutorial notebook and evals notebook by @axiomofjoy in #1244
- chore: llama-index build scripts by @axiomofjoy in #1067
- fix: pin pandas version greater than or equal to 1.5.0 by @axiomofjoy in #1279
- feat(tracing): llm messages by @mikeldking in #1251
- fix(trace): column getter in SpanSort by ...
v0.0.32
What's Changed
- fix: missing init for conda build by @RogerHYang in #1042
Full Changelog: v0.0.31...v0.0.32
v0.0.31
What's Changed
- docs: llamaindex notebook cleanup by @axiomofjoy in #995
- fix: manually sample document data for llamaindex tutorial by @axiomofjoy in #998
- docs: adjust pinecone tutorial to use corpus by @mikeldking in #996
- docs: add in gif for dimension details and psi by @axiomofjoy in #1021
- docs: llamaindex tutorial notebook tweaks by @axiomofjoy in #1022
- fix(app): re-pin modules by @mikeldking in #1025
- feat(traces): move single model view under /model by @mikeldking in #1027
- feat(embeddings): point scale slider by @mikeldking in #1030
- feat(tracing): define span schema by @mikeldking in #1024
- chore: update components by @mikeldking in #1034
- feat(tracing): strawman tracer implementation by @mikeldking in #1035
- chore(tracing): traces fixtures by @mikeldking in #1038
- docs: update DEVELOPMENT.md publishing instructions by @axiomofjoy in #1039
- feat: add experimental sub-module and refactor notebooks by @axiomofjoy in #1029
- docs: v0.0.31 by @axiomofjoy in #1040
Full Changelog: 0.0.30...v0.0.31
0.0.30
Search And Retrieval Troubleshooting
This release adds troubleshooting workflows for search and retrieval! If you are building an LLM-powered application that uses RAG (retrieval-augmented generation), poor retrieval can be detrimental to the user experience of your app. Phoenix now supports passing in your knowledge base as a corpus dataset so that you can inspect how your retrieval system queries your vector store for relevant documents. Phoenix automatically computes the distance between your query and document embeddings, helping you quickly identify slices of your data that represent user queries not covered by your vector store. It also overlays the retrieval connections on the point cloud so you can see which vector store clusters your retriever is pulling data from. For all the details, check out our notebooks that cover search and retrieval!
phoenix_rag.mp4
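To make the query-to-document distance idea concrete, here is a minimal sketch in plain Python. It is not Phoenix's implementation: the function names (`euclidean_distance`, `mean_retrieval_distance`) and the toy embeddings are hypothetical, and averaging over the retrieved documents is an assumption made for illustration. The intuition is that a query far from everything it retrieved is a candidate for "not covered by the vector store".

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_retrieval_distance(query_embedding, retrieved_embeddings):
    """Average distance from a query to the documents retrieved for it.

    Assumption for this sketch: a simple mean over retrieved documents.
    """
    distances = [euclidean_distance(query_embedding, doc) for doc in retrieved_embeddings]
    return sum(distances) / len(distances)


# Hypothetical toy data: each query maps to (query embedding, retrieved doc embeddings).
queries = {
    "well_covered": ([0.0, 0.0], [[0.0, 0.0], [3.0, 4.0]]),
    "poorly_covered": ([10.0, 10.0], [[0.0, 0.0], [3.0, 4.0]]),
}

scores = {name: mean_retrieval_distance(q, docs) for name, (q, docs) in queries.items()}

# Queries with the largest distance come first: these are the ones to inspect.
worst_first = sorted(scores, key=scores.get, reverse=True)
```

Sorting by the largest distance first surfaces the queries whose retrieved context is least similar to the question being asked.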
What's Changed
- chore: docs to main (#882) by @mikeldking in #883
- feat: add corpus points to umap by @RogerHYang in #917
- ci: bump hdbscan to deal with cython builds by @mikeldking in #936
- feat(embeddings): relationships lines by @mikeldking in #909
- fix: Only initialize the corpus model if it exists by @mikeldking in #946
- fix: pin HDBSCAN to cython3 compatible version by @mikeldking in #944
- fix(fixtures): fixture reference schema incorrectly falling back to p… by @mikeldking in #949
- fix: restore truthiness by @RogerHYang in #952
- feat: display corpus points as octahedrons by @mikeldking in #954
- feat: primary to corpus ratio by @RogerHYang in #957
- fix: add document text to event (if the event is a retrieved document record) by @RogerHYang in #961
- feat: allow string only responses (i.e. without embedding) by @RogerHYang in #956
- feat: show retrieval in the slide-over by @mikeldking in #955
- feat(embeddings): show corpus percent query by @mikeldking in #968
- fix: wrong ratio calculation for % query by @mikeldking in #973
- feat: calculate euclidean distance retrieval metric time series against the corpus dataset by @RogerHYang in #972
- feat(embeddings): show retrieval distance timeseries by @mikeldking in #974
- fix: suppress numpy runtime warnings about empty inputs by @RogerHYang in #975
- fix(embeddings): metric selector fix for retrieval metrics by @mikeldking in #977
- feat: allow iso 8601 timestamps by @axiomofjoy in #962
- feat(datasets): make corpus schema declaration more semantic by @mikeldking in #978
- docs: docs sync to main, Jul 25, 2023 by @mikeldking in #979
- feat: implement phoenix.Dataset.from_open_inference class method by @axiomofjoy in #965
- docs: adjust notebook text for LLM analysis using GPT by @amank94 in #994
- docs: Update langchain_pinecone_search_and_retrieval_tutorial.ipynb by @arizedatngo in #991
- docs: LlamaIndex tutorial enhancements by @axiomofjoy in #971
New Contributors
- @amank94 made their first contribution in #994
- @arizedatngo made their first contribution in #991
Full Changelog: v0.0.28...0.0.30
v0.0.28
What's Changed
- fix: restrict scikit-learn version by @RogerHYang in #904
- feat(retrieval): relationships schema and parsing by @mikeldking in #902
- chore: update web deps by @mikeldking in #905
- ci: add fixture for semantic search by @RogerHYang in #906
- chore: add wiki retrieval relationship to fixture by @mikeldking in #910
- ci: remove unused code by @RogerHYang in #912
- fix: output `display_name` for dataset name in export by @RogerHYang in #911
- fix(embeddings): color by score values in the point-cloud by @mikeldking in #916
Full Changelog: v0.0.27...v0.0.28
v0.0.27
📥 Download all clusters
You can now export all clusters to the notebook or to a parquet file from the UI! This enables you to explore the clusters back in the notebook or in an environment of your choosing!
export_all.mp4
What's Changed
- docs: langchain pinecone tutorial by @axiomofjoy in #886
- docs: llamaindex search and retrieval tutorial by @axiomofjoy in #891
- feat(embeddings): export all clusters by @mikeldking in #895
- chore: change readme gif by @mikeldking in #903
Full Changelog: v0.0.26...v0.0.27
v0.0.26
✨ Tooltips on the point-cloud!
tooltip.mp4
What's Changed
- feat(embeddings): point-cloud tooltips by @mikeldking in #890
Full Changelog: v0.0.25...v0.0.26
v0.0.25
What's Changed
- chore: bump point-cloud deps by @mikeldking in #867
- feat(gql): export multiple clusters at the same time by @RogerHYang in #868
- docs: Phoenix access from remote server ngrok + ssh by @pbadhe in #880
- chore: docs to main by @mikeldking in #882
- fix(embeddings): disable metric selector while loading points by @mikeldking in #884
- fix: replace IntEnum with Enum for DatasetRole by @RogerHYang in #889
Full Changelog: v0.0.24...v0.0.25
v0.0.24
This release updates Phoenix's capabilities for cluster-based analysis - providing more metrics to help you assess the performance and data quality of your unstructured data.
✨ Cluster Performance Metrics
Clusters can now be analyzed for model performance degradation! Our new release includes `accuracy_score` as a model performance metric. Using accuracy as the base metric on the embedding projection allows you to drill into clusters that map to bad predictions quicker than ever before. Finding pockets of bad performance is as simple as picking the metric and sorting the clusters by worst performing. If you are using Phoenix to identify production data that should be re-labeled and fed back into your training pipeline, this is the feature for you.
cluster_performance.mp4
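The "pick the metric, sort by worst performing" workflow can be sketched outside the UI as follows. This is an illustrative example only, not Phoenix code: the record layout `(cluster_id, predicted, actual)` and the function name `accuracy_by_cluster` are assumptions.

```python
from collections import defaultdict


def accuracy_by_cluster(records):
    """Return {cluster_id: accuracy} for (cluster_id, predicted, actual) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for cluster_id, predicted, actual in records:
        total[cluster_id] += 1
        correct[cluster_id] += int(predicted == actual)
    return {cluster_id: correct[cluster_id] / total[cluster_id] for cluster_id in total}


# Hypothetical records: cluster "c1" contains mostly wrong predictions.
records = [
    ("c0", "pos", "pos"),
    ("c0", "neg", "neg"),
    ("c1", "pos", "neg"),
    ("c1", "neg", "neg"),
    ("c1", "pos", "neg"),
    ("c1", "neg", "pos"),
]

scores = accuracy_by_cluster(records)

# Ascending sort puts the worst-performing cluster first.
worst_first = sorted(scores, key=scores.get)
```

The points in the first clusters of `worst_first` are natural candidates for re-labeling.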
✨ Cluster Data Quality / Custom Metrics
Clusters can now be analyzed via ad-hoc metrics! You can now calculate the average of any numeric feature, tag, prediction, or actual sent into Phoenix. This means you can now find "low-quality" clusters via the heuristic of your choosing! Below is an example of how `precision@k` for document retrieval (from a vector store) is used to identify clusters of chatbot queries that are failing to provide a good answer. The neat thing about this feature is that you can use Phoenix to build your own EDA heuristic! Care about rouge score or LLM-assisted evaluations? You can now use these to analyze your embeddings and to discover anomalies by simply sorting your clusters! This feature gives you, the data scientist, a powerful tool to formulate bespoke heuristics for identifying clusters of low performance, quality, and/or drift. We hope you like it!
context_retrieval.mp4
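As a rough sketch of the custom-metric heuristic described above, the snippet below computes precision@k per query and then averages it per cluster. It is not taken from Phoenix: the names `precision_at_k` and `mean_metric_by_cluster`, the 0/1 relevance flags, and the sample rows are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def precision_at_k(relevance, k):
    """Fraction of the top-k retrieved documents that are relevant.

    `relevance` is a ranked list of 0/1 flags, one per retrieved document.
    """
    return sum(relevance[:k]) / k


def mean_metric_by_cluster(rows):
    """Average a numeric value per cluster for (cluster_id, value) rows."""
    grouped = defaultdict(list)
    for cluster_id, value in rows:
        grouped[cluster_id].append(value)
    return {cluster_id: mean(values) for cluster_id, values in grouped.items()}


# Hypothetical queries: (cluster_id, relevance flags of the retrieved documents).
queries = [
    ("c0", [1, 1]),
    ("c0", [1, 0]),
    ("c1", [0, 0]),
    ("c1", [0, 1]),
]

rows = [(cluster_id, precision_at_k(flags, k=2)) for cluster_id, flags in queries]
cluster_scores = mean_metric_by_cluster(rows)

# Lowest average precision@k first: clusters where retrieval is failing.
lowest_first = sorted(cluster_scores, key=cluster_scores.get)
```

Any other numeric column (a rouge score, an LLM-assisted evaluation score) could be passed through `mean_metric_by_cluster` in the same way.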
What's Changed
- docs: dolly vs. pythia by @axiomofjoy in #818
- feat: data quality metric by cluster by @RogerHYang in #804
- feat(dimensions): Add the ability to filter by data_type by @mikeldking in #822
- feat(embeddings): metric selector by @mikeldking in #821
- fix: nan bug for gql by @RogerHYang in #832
- feat: add stand-alone clusters endpoint for GraphQL query by @RogerHYang in #831
- feat(embeddings): cluster sorting by @mikeldking in #830
- chore: make placeholder text more obvious by @mikeldking in #833
- fix: change float16 to float32 as dtype for the nan series by @RogerHYang in #837
- fix: return nan on NotImplementedError (when binning on np.float16) by @RogerHYang in #838
- docs: sync 06-09-2023 by @mikeldking in #840
- feat(gql): add prediction id to event metadata by @RogerHYang in #843
- fix: coerce lists to arrays by @RogerHYang in #845
- feat: add performance metrics to each cluster by @RogerHYang in #828
- feat: accuracy timeseries by @RogerHYang in #842
- feat(embeddings): cluster data quality metrics by @mikeldking in #846
- docs: Update DEVELOPMENT.md with pypi publish changes. by @mikeldking in #849
- fix(embeddings): always place clusters with empty metrics at the bottom by @mikeldking in #850
- fix: show not found error when server is no longer running by @mikeldking in #853
- fix: guess whether a column contains any vector or all scalars by @RogerHYang in #854
- chore: camel-case metrics by @mikeldking in #856
- fix: skip empty interval bin with infinity endpoints (when all data are missing values) by @RogerHYang in #857
- feat(embeddings): cluster performance metrics by @mikeldking in #855
- fix(embeddings): force re-render clusters when opacity changes by @mikeldking in #858
- feat: show prediction id in selection details by @RogerHYang in #860
- fix: hide data quality metrics if empty by @mikeldking in #861
- fix: use random init when spectral init (the default) cannot be used by @RogerHYang in #862
- fix: replace NaT (Not a Time) with now (when dataset is empty) by @RogerHYang in #863
- fix(ui): cleanup event details for llm use-case by @mikeldking in #865
Full Changelog: 0.0.23...v0.0.24
0.0.23
❇️ HDBSCAN Tuning! ❇️
Dynamically adjust HDBSCAN parameters to get your clusters just right.
hdbscan_short.mp4
What's Changed
- ci: convert numpy scalars before graphql sees them by @RogerHYang in #788
- feat: add dimension filters to graphql model endpoint by @RogerHYang in #796
- ci: rename function by @RogerHYang in #797
- fix: numba.jit() deprecation warning by @pbadhe in #799
- feat: hdbscan tuning by @mikeldking in #798
- fix: single point selection selecting the wrong ID by @mikeldking in #803
- fix(embeddings): keep point-selection possible during move by @mikeldking in #805
- feat(embeddings): show the dataset in the table by @mikeldking in #812
Full Changelog: v0.0.22...0.0.23