Showing 252 changed files with 13,658 additions and 3,392 deletions.
# Environment variables

Weave provides a set of environment variables to configure and optimize its behavior. You can set these variables in your shell or within scripts to control specific functionality.
```bash
# Example of setting environment variables in the shell;
# `export` makes them visible to child processes such as your Python program
export WEAVE_PARALLELISM=10          # Controls the number of parallel workers
export WEAVE_PRINT_CALL_LINK=false   # Disables call link output
```
```python
# Example of setting environment variables in Python
import os

os.environ["WEAVE_PARALLELISM"] = "10"
os.environ["WEAVE_PRINT_CALL_LINK"] = "false"
```
## Environment variables reference
| Variable Name           | Description                                                      |
|-------------------------|------------------------------------------------------------------|
| WEAVE_CAPTURE_CODE      | If set to `false`, disables code capture for `weave.op`.         |
| WEAVE_DEBUG_HTTP        | If set to `1`, turns on HTTP request and response logging for debugging. |
| WEAVE_DISABLED          | If set to `true`, disables all tracing to Weave.                 |
| WEAVE_PARALLELISM       | In evaluations, the number of examples to evaluate in parallel. `1` runs examples sequentially. Default value is `20`. |
| WEAVE_PRINT_CALL_LINK   | If set to `false`, suppresses call URL printing. Default value is `false`. |
| WEAVE_TRACE_LANGCHAIN   | When set to `false`, explicitly disables global tracing for LangChain. |
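
As a quick illustration, here is a minimal sketch (the project name and op are placeholders) that uses `WEAVE_DISABLED` to turn tracing off for a one-off run. Setting the variable before calling `weave.init()` ensures tracing never starts:

```python
import os

# Set before weave.init() so tracing is disabled for this run
os.environ["WEAVE_DISABLED"] = "true"

import weave

weave.init("my-project")  # placeholder project name; no traces will be recorded

@weave.op
def add(a: int, b: int) -> int:
    return a + b

print(add(1, 2))  # runs normally, but the call is not logged to Weave
```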
# Microsoft Azure

Weights & Biases integrates with Microsoft Azure OpenAI services, helping teams manage, debug, and optimize their Azure AI workflows at scale. This guide introduces the W&B integration, what it means for Weave users, its key features, and how to get started.
## Key features

- **LLM evaluations**: Evaluate and monitor LLM-powered applications using Weave, optimized for Azure infrastructure.
- **Seamless integration**: Deploy W&B Models on a dedicated Azure tenant with built-in integrations for Azure AI Studio, Azure ML, Azure OpenAI Service, and other Azure AI services.
- **Enhanced performance**: Use Azure’s infrastructure to train and deploy models faster, with auto-scaling clusters and optimized resources.
- **Scalable experiment tracking**: Automatically log hyperparameters, metrics, and artifacts for Azure AI Studio and Azure ML runs.
- **LLM fine-tuning**: Fine-tune models with W&B Models.
- **Central repository for models and datasets**: Manage and version models and datasets with W&B Registry and Azure AI Studio.
- **Collaborative workspaces**: Support teamwork with shared workspaces, experiment commenting, and Microsoft Teams integration.
- **Governance framework**: Ensure security with fine-grained access controls, audit trails, and Microsoft Entra ID integration.
## Getting started

To use W&B with Azure, add the W&B integration via the [Azure Marketplace](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/weightsandbiasesinc1641502883483.weights_biases_for_azure?tab=Overview).

For a detailed guide describing how to integrate Azure OpenAI fine-tuning with W&B, see [Integrating Weights & Biases with Azure AI Services](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/weights-and-biases-integration).
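Once the integration is in place, tracing Azure OpenAI calls with Weave follows the usual Weave pattern. The sketch below is illustrative rather than canonical: it assumes you have an existing Azure OpenAI deployment plus the `openai` and `weave` Python packages installed, and the project name, API version, and deployment name are placeholders.

```python
import os
import weave
from openai import AzureOpenAI

# Start capturing traces; the project name is a placeholder
weave.init("azure-openai-demo")

# Assumes AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT are set in your environment
client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="my-deployment-name",  # your Azure OpenAI deployment name (placeholder)
    messages=[{"role": "user", "content": "Say hello from Azure."}],
)
print(response.choices[0].message.content)
```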
## Learn more

- [Weights & Biases + Microsoft Azure Overview](https://wandb.ai/site/partners/azure)
- [How W&B and Microsoft Azure Are Empowering Enterprises](https://techcommunity.microsoft.com/blog/azure-ai-services-blog/how-weights--biases-and-microsoft-azure-are-empowering-enterprises-to-fine-tune-/4303716)
- [Microsoft Azure OpenAI Service Documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/)
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# NVIDIA NIM

Weave automatically tracks and logs LLM calls made via the [ChatNVIDIA](https://python.langchain.com/docs/integrations/chat/nvidia_ai_endpoints/) library, after `weave.init()` is called.
## Tracing

It’s important to store traces of LLM applications in a central database, both during development and in production. You’ll use these traces for debugging and to help build a dataset of tricky examples to evaluate against while improving your application.
<Tabs groupId="programming-language" queryString>
<TabItem value="python" label="Python" default>
Weave can automatically capture traces for the [ChatNVIDIA python library](https://python.langchain.com/docs/integrations/chat/nvidia_ai_endpoints/).

Start capturing by calling `weave.init(<project-name>)` with a project name of your choice.
```python
from langchain_nvidia_ai_endpoints import ChatNVIDIA
import weave

client = ChatNVIDIA(model="mistralai/mixtral-8x7b-instruct-v0.1", temperature=0.8, max_tokens=64, top_p=1)
# highlight-next-line
weave.init('emoji-bot')

messages = [
    {
        "role": "system",
        "content": "You are AGI. You will be provided with a message, and your task is to respond using emojis only."
    }
]

response = client.invoke(messages)
```
</TabItem>
<TabItem value="typescript" label="TypeScript">
```plaintext
This feature is not available in TypeScript yet since this library is only in Python.
```
</TabItem>
</Tabs>

![chatnvidia_trace.png](imgs/chatnvidia_trace.png)
## Track your own ops

<Tabs groupId="programming-language" queryString>
<TabItem value="python" label="Python" default>
Wrapping a function with `@weave.op` starts capturing inputs, outputs, and app logic so you can debug how data flows through your app. You can deeply nest ops and build a tree of functions that you want to track. This also starts automatically versioning code as you experiment, to capture ad-hoc details that haven't been committed to git.

Simply create a function decorated with [`@weave.op`](/guides/tracking/ops) that calls into the [ChatNVIDIA python library](https://python.langchain.com/docs/integrations/chat/nvidia_ai_endpoints/).

In the example below, we have two functions wrapped with `@weave.op`. This helps us see how intermediate steps, like the retrieval step in a RAG app, affect how our app behaves.
```python
# highlight-next-line
import weave
from langchain_nvidia_ai_endpoints import ChatNVIDIA
import requests, random

PROMPT = """Emulate the Pokedex from early Pokémon episodes. State the name of the Pokemon and then describe it.
Your tone is informative yet sassy, blending factual details with a touch of dry humor. Be concise, no more than 3 sentences."""
POKEMON = ['pikachu', 'charmander', 'squirtle', 'bulbasaur', 'jigglypuff', 'meowth', 'eevee']
client = ChatNVIDIA(model="mistralai/mixtral-8x7b-instruct-v0.1", temperature=0.7, max_tokens=100, top_p=1)

# highlight-next-line
@weave.op
def get_pokemon_data(pokemon_name):
    # highlight-next-line
    # This is a step within your application, like the retrieval step within a RAG app
    url = f"https://pokeapi.co/api/v2/pokemon/{pokemon_name}"
    response = requests.get(url)
    if response.status_code == 200:
        data = response.json()
        name = data["name"]
        types = [t["type"]["name"] for t in data["types"]]
        species_url = data["species"]["url"]
        species_response = requests.get(species_url)
        evolved_from = "Unknown"
        if species_response.status_code == 200:
            species_data = species_response.json()
            if species_data["evolves_from_species"]:
                evolved_from = species_data["evolves_from_species"]["name"]
        return {"name": name, "types": types, "evolved_from": evolved_from}
    else:
        return None

# highlight-next-line
@weave.op
def pokedex(name: str, prompt: str) -> str:
    # highlight-next-line
    # This is your root op that calls out to other ops
    # highlight-next-line
    data = get_pokemon_data(name)
    if not data:
        return "Error: Unable to fetch data"

    messages = [
        {"role": "system", "content": prompt},
        {"role": "user", "content": str(data)}
    ]

    response = client.invoke(messages)
    return response.content

# highlight-next-line
weave.init('pokedex-nvidia')
# Get data for a specific Pokémon
pokemon_data = pokedex(random.choice(POKEMON), PROMPT)
```
Navigate to Weave and click `get_pokemon_data` in the UI to see the inputs and outputs of that step.
</TabItem>
<TabItem value="typescript" label="TypeScript">
```plaintext
This feature is not available in TypeScript yet since this library is only in Python.
```
</TabItem>
</Tabs>

![nvidia_pokedex.png](imgs/nvidia_pokedex.png)
## Create a `Model` for easier experimentation

<Tabs groupId="programming-language" queryString>
<TabItem value="python" label="Python" default>
Organizing experimentation is difficult when there are many moving pieces. By using the [`Model`](/guides/core-types/models) class, you can capture and organize the experimental details of your app, like your system prompt or the model you're using. This helps you organize and compare different iterations of your app.

In addition to versioning code and capturing inputs/outputs, [`Model`](/guides/core-types/models)s capture structured parameters that control your application’s behavior, making it easy to find what parameters worked best. You can also use Weave Models with `serve` and [`Evaluation`](/guides/core-types/evaluations)s.

In the example below, you can experiment with `model` and `system_message`. Every time you change one of these, you'll get a new _version_ of `GrammarCorrectorModel`.
```python
import weave
from langchain_nvidia_ai_endpoints import ChatNVIDIA

weave.init('grammar-nvidia')

class GrammarCorrectorModel(weave.Model):  # Change to `weave.Model`
    system_message: str

    @weave.op()
    def predict(self, user_input):  # Change to `predict`
        client = ChatNVIDIA(model="mistralai/mixtral-8x7b-instruct-v0.1", temperature=0, max_tokens=100, top_p=1)

        messages = [
            {
                "role": "system",
                "content": self.system_message
            },
            {
                "role": "user",
                "content": user_input
            }
        ]

        response = client.invoke(messages)
        return response.content

corrector = GrammarCorrectorModel(
    system_message="You are a grammar checker, correct the following user input.")
result = corrector.predict("That was so easy, it was a piece of pie!")
print(result)
```
</TabItem>
<TabItem value="typescript" label="TypeScript">
```plaintext
This feature is not available in TypeScript yet since this library is only in Python.
```
</TabItem>
</Tabs>

![chatnvidia_model.png](imgs/chatnvidia_model.png)
## Usage information

The ChatNVIDIA integration supports `invoke`, `stream`, and their async variants. It also supports tool use. As ChatNVIDIA is meant to be used with many types of models, it does not have function calling support.
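
For example, a minimal streaming sketch might look like the following (assuming `NVIDIA_API_KEY` is set in your environment; the project name is a placeholder reused from the example above):

```python
from langchain_nvidia_ai_endpoints import ChatNVIDIA
import weave

weave.init('emoji-bot')  # placeholder project name

client = ChatNVIDIA(model="mistralai/mixtral-8x7b-instruct-v0.1", max_tokens=64)

# Stream the response chunk by chunk; streamed calls are traced like `invoke`
for chunk in client.stream([{"role": "user", "content": "Describe Pikachu using emojis."}]):
    print(chunk.content, end="", flush=True)
print()
```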