[Refactor] Improve Structure and Clarity of CONTRIBUTING.md #1084

Merged
merged 2 commits into from
Oct 22, 2024
247 changes: 173 additions & 74 deletions CONTRIBUTING.md
@@ -1,104 +1,203 @@
# Contributing
# Contributing to BAML

First off, thanks for your interest in contributing to BAML! We appreciate all the help we can get in making it the best way to build any AI agents or applications.

Before contributing, do try to let us know what task you want to take on, and let us know in our [Discord](https://discord.gg/BTNBeXGuaS) #contributing channel.
## Table of Contents

Here is our guide on getting setup.
- [How to Contribute](#how-to-contribute)
- [Join our Community](#join-our-community)
- [Check Existing Issues](#check-existing-issues)
- [Creating an Issue](#creating-an-issue)
- [Fork the Repository](#fork-the-repository)
- [Submit a Pull Request (PR)](#submit-a-pull-request-pr)
- [Examples of Merged PRs](#examples-of-merged-prs)
- [Setting up the BAML Compiler and Runtime](#setting-up-the-baml-compiler-and-runtime)
- [Compiler Architecture Overview](#compiler-architecture-overview)
- [Steps to Build and Test Locally](#steps-to-build-and-test-locally)
- [Running Integration Tests](#running-integration-tests)
- [Python Integration Tests](#python-integration-tests)
- [TypeScript Integration Tests](#typescript-integration-tests)
- [OpenAPI Server Tests](#openapi-server-tests)
- [Grammar Testing](#grammar-testing)
- [VSCode Extension Testing](#vscode-extension-testing)
- [Testing PromptFiddle.com](#testing-prompfiddlecom)

### Compiler Architecture

## How to Contribute

1. **Join our Community**:

- Please join our [Discord](https://discord.gg/BTNBeXGuaS) and introduce yourself in the `#contributing` channel. Let us know what you're interested in working on, and we can help you get started.

2. **Check Existing Issues**:

- Look at the [issue tracker](https://github.com/BoundaryML/baml/issues) and find an issue to work on.
Issues labeled `good first issue` are a good place to start.

3. **Creating an Issue**:

- If you find a bug or have a feature request, please tell us about it in the Discord channel and then open a new issue. Make sure to provide enough details and include a clear title.

4. **Fork the Repository**:

- Fork the repository and clone your fork locally. Work on your changes in a feature branch.

5. **Submit a Pull Request (PR)**:

- Submit your pull request with a clear description of the changes you've made. Make sure to reference the issue you're working on.

### Examples of Merged PRs:

- **Fix parsing issues**: [PR #1031](https://github.com/BoundaryML/baml/pull/1031)

- **Coerce integers properly**: [PR #1023](https://github.com/BoundaryML/baml/pull/1023)

- **Fix syntax highlighting and a grammar parser crash**: [PR #1013](https://github.com/BoundaryML/baml/pull/1013)

- **Implement literal types (e.g., `sports "SOCCER" | "BASKETBALL"`)**: [PR #978](https://github.com/BoundaryML/baml/pull/978)

- **Fix issue with OpenAI provider**: [PR #896](https://github.com/BoundaryML/baml/pull/896)

- **Implement `map` type**: [PR #797](https://github.com/BoundaryML/baml/pull/797)



## Setting up the BAML Compiler and Runtime

#### Compiler Architecture Overview

<TBD — we will write more details here>

- baml-cli / VSCode generates `baml_client` which contains all the interfaces people use to call the `baml-runtime`
- Pest grammar → AST (build diagnostics for linter) → IntermediateRepr
- baml-runtime parses baml files + builds and calls LLM endpoints using internal LLM providers, then parses the data into “jsonish”, and finally coerces that jsonish into the schema.
- `baml-cli` / VSCode generates `baml_client`, containing all the interfaces people use to call the `baml-runtime`.

- **Pest grammar -> AST (build diagnostics for linter) -> Intermediate Representation (IR)**: The stages a BAML file passes through when it is compiled.

- **`baml-runtime`**: Parses BAML files, builds and calls LLM endpoints using internal LLM providers, parses the data into JSONish, and coerces that JSONish into the schema.


## Example feature PRs
### Steps to Build and Test Locally

1. Fix parsing issues:
1. https://github.com/BoundaryML/baml/pull/1031/files
2. Coerce ints properly ($3,000 → 3000) https://github.com/BoundaryML/baml/pull/1023
2. Fix syntax highlighting and a grammar parser crash https://github.com/BoundaryML/baml/pull/1013/files
3. Implement literal types like `sports "SOCCER" | "BASKETBALL"` https://github.com/BoundaryML/baml/pull/978
4. Fix issue with openai provider https://github.com/BoundaryML/baml/pull/896/files
5. Implement `map` type https://github.com/BoundaryML/baml/pull/797 (see list of items in the PR)
1. Install Rust

## Setting up the compiler / runtime in `engine`
2. Run `cargo build` in `engine/` and make sure everything builds on your machine.

3. Run some unit tests:
- `cd engine/baml-lib/baml/` and run `cargo test` to execute grammar linting tests.

1. Install rust
2. run `cargo build` in `engine/` and make sure everything builds on your machine.
3. run some of the unit tests:
1. `cd engine/baml-lib/baml && cargo test` will run some of our grammar linting tests for example.
4. Run the integration tests.

## Running Integration Tests

Setup your environment variables in an .env file with

OPENAI_API_KEY=”your key” (you mainly just need this one).

Make sure your shell reads these env variables setup so it injects them into the test process, since some of the test scripts don’t try to load these from any .env file yet and just assume the process has them. You can try to use [dotenv-cli](https://www.npmjs.com/package/dotenv-cli)

1. **Python**
1. Install poetry [https://python-poetry.org/docs/](https://python-poetry.org/docs/)
2. `cd integ-tests/python`
3. `poetry shell` (install `poetry` if you don’t have it, and python 3.8)
4. `poetry lock && poetry install`
5. `env -u CONDA_PREFIX poetry run maturin develop --manifest-path ../../engine/language_client_python/Cargo.toml` (this builds the compiler and injects the package into the virtual env)
6. `poetry run baml-cli generate --from ../baml_src` (generate the baml_client)
7. `poetry run python -m pytest -s`
1. run a specific test: `poetry run python -m pytest -s -k "my_test_name"`
2. **TypeScript**
1. Install pnpm [https://pnpm.io/installation](https://pnpm.io/installation)
2. before that, run `pnpm i` in the `engine/language_client_typescript`
3. `cd integ-tests/typescript`
4. `pnpm i`
5. `pnpm build:debug` (builds your new compiler changes)
6. `pnpm generate` (generates baml_client for your tests with any new changes)
7. `pnpm integ-tests` or `pnpm integ-tests -t "my test name"`
3. Ruby
4. **OpenAPI server:**
1. `cd engine/baml-runtime/tests`
2. `cargo test --features internal`
3. This will run the baml-serve server locally, and ping it. You may need to change the PORT variable for your new test to use a different port (we don’t have a good way of autoselecting a port.
4. To test a particular OpenAPI client (TBD instructions)

### Testing grammar changes

1. Use the playground in [https://pest.rs/](https://pest.rs/) to test your grammar with the new syntax
1. modify the existing `.pest` file to update the grammar
2. modify the AST parsing of the new grammar
3. modify the IR (IntermediateRepr)
4. ensure you pass all the existing `cargo test` validations in `engine/baml-lib/`
5. ensure integ tests still pass.
2. We also have a grammar for [`promptfiddle.com`](http://promptfiddle.com) syntax rendering that uses Lezer that you may have to modify. There’s other playground websites for Lezer you can check out.

### Testing VSCode Extension
1. Set up your environment variables in a `.env` file with:

- `OPENAI_API_KEY="your key"` (you mainly just need this one).

2. Ensure the environment variables are injected into the test process, since some of the test scripts don't load them from a `.env` file and assume the process already has them. You can use [dotenv-cli](https://www.npmjs.com/package/dotenv-cli) to do this.
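
As a minimal sketch of one way to get the variables into the test process, a POSIX-style shell can auto-export everything it sources from the `.env` file. The temporary directory and the `sk-placeholder` value below are illustrative assumptions, not part of the BAML setup; in practice you would source your own `.env` from the repository.

```shell
# Sketch only: create a throwaway .env with a placeholder value (not a real key).
workdir="$(mktemp -d)"
printf 'OPENAI_API_KEY=sk-placeholder\n' > "$workdir/.env"

# Auto-export every variable assigned while the file is sourced,
# then switch auto-export off again.
set -a
. "$workdir/.env"
set +a

# Child processes (such as the test runners) now inherit the variable.
sh -c 'echo "child sees OPENAI_API_KEY: ${OPENAI_API_KEY:+set}"'
```

This only works if the `.env` file uses plain shell assignment syntax with straight quotes. With dotenv-cli installed, prefixing a command as `dotenv -- <command>` should have a similar effect for that single command.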


### Python Integration Tests

1. Install poetry [https://python-poetry.org/docs/](https://python-poetry.org/docs/)

2. Navigate to the Python integration tests: `cd integ-tests/python/`

3. Run the following commands:
- `poetry shell`
- `poetry lock && poetry install`
- `env -u CONDA_PREFIX poetry run maturin develop --manifest-path ../../engine/language_client_python/Cargo.toml`
- `poetry run baml-cli generate --from ../baml_src`
- `poetry run python -m pytest -s`
- To run a specific test: `poetry run python -m pytest -s -k "my_test_name"`


### TypeScript Integration Tests

1. Install pnpm: [https://pnpm.io/installation](https://pnpm.io/installation)

2. Navigate to the Language Client TypeScript directory and install dependencies:
- `cd engine/language_client_typescript/`
- `pnpm i`

3. Navigate to the TypeScript integration tests:
- `cd integ-tests/typescript/`

4. Run the following commands:

- `pnpm i` (install dependencies)
- `pnpm build:debug` (builds your new compiler changes)
- `pnpm generate` (generates `baml_client` for your tests with any new changes)
- `pnpm integ-tests` or `pnpm integ-tests -t "my test name"`


### OpenAPI Server Tests

1. Navigate to the test directory:
- `cd engine/baml-runtime/tests/`

2. Run tests with:

- `cargo test --features internal`

This will run the baml-serve server locally, and ping it. You may need to change the PORT variable for your new test to use a different port (we don’t have a good way of autoselecting a port).

> Instructions for testing a particular OpenAPI client are TBD.

## Grammar Testing

1. Test new syntax in the [pest playground](https://pest.rs/).

2. Update the following:

- **Pest grammar**: Modify the `.pest` file.
- **AST parsing**: Update the AST parsing of the new grammar.
- **IR**: Modify the Intermediate Representation (IR).

3. Ensure all tests pass:

- Run `cargo test` in `engine/baml-lib/`
- Ensure integration tests still pass.

4. Modify the grammar for the [PromptFiddle.com](http://PromptFiddle.com) syntax rendering that uses Lezer, if necessary.


## VSCode Extension Testing

This requires a macOS or Linux machine, since we symlink some playground files between the [PromptFiddle.com](http://PromptFiddle.com) website app and the VSCode extension itself.

**Note:** If you are just making changes to the VSCode extension UI, you may want to go to the section: Testing Promptfiddle.com
**Note:** If you are just making changes to the VSCode extension UI, you may want to go to the section: [Testing PromptFiddle.com](#testing-prompfiddlecom).

1. `cd typescript`
2. `pnpm i`
3. `npx turbo build --force`
4. Go to VSCode → Run and Debug (play button near extensions button) → Launch VSCode extension (press play button)
1. This launches a new VSCode window in Debug mode.
2. You can try and open up a simple baml project in this window (read our quickstart to setup a simple project, or clone `baml-examples` repo)
5. Reload the extension (command + shift + p) when you change any core logic in the extension, or just close and open the playground if you rebuild the playground.
1. Navigate to the TypeScript directory:
- `cd typescript/`

To rebuild the playground UI
2. Install dependencies:
- `pnpm i`

3. Build and launch the extension:
- `npx turbo build --force`
- Open VSCode and go to the Run and Debug section (play button near the extensions button).
- Select "Launch VSCode Extension" and press the play button.
- This will open a new VSCode window in Debug mode.
- You can open a simple BAML project in this window (refer to our quickstart guide to set up a simple project, or clone the `baml-examples` repository).

4. Reload the extension:
- Use `Command + Shift + P` to reload the extension when you change any core logic.
- Alternatively, close and reopen the playground if you rebuild the playground.


To rebuild the playground UI:

1. `cd typescript/vscode-ext/packages/web-panel`
2. `pnpm build`
3. Close and open the playground in your “Debug mode VSCode window”

### Testing [prompfiddle.com](http://prompfiddle.com)
## Testing [prompfiddle.com](http://prompfiddle.com)

This is useful if you want to iterate faster on the Extension UI, since it supports hot-reloading.

1. `cd typescript/fiddle-frontend`
2. `pnpm dev`
3. Modify the files in `typescript/playground-common`.
1. Navigate to the Fiddle Frontend directory:
- `cd typescript/fiddle-frontend`

2. Start the dev server:
- `pnpm dev`

3. Modify the files in `typescript/playground-common`

4. Use the `vscode-` prefixed tailwind classes to get proper colors.
5 changes: 2 additions & 3 deletions engine/language_client_codegen/src/openapi.rs
@@ -1,5 +1,4 @@
use std::collections::HashMap;
use std::{path::PathBuf, process::Command};
use std::path::PathBuf;

use anyhow::{Context, Result};
use baml_types::{BamlMediaType, FieldType, LiteralValue, TypeValue};
@@ -8,7 +7,7 @@ use internal_baml_core::ir::{
repr::{Function, IntermediateRepr, Node, Walker},
ClassWalker, EnumWalker,
};
use serde::{Deserialize, Serialize};
use serde::Serialize;
use serde_json::json;

use crate::dir_writer::{FileCollector, LanguageFeatures, RemoveDirBehavior};
3 changes: 1 addition & 2 deletions engine/language_client_codegen/src/ruby/field_type.rs
@@ -1,6 +1,5 @@
use std::collections::HashSet;

use baml_types::{BamlMediaType, FieldType, LiteralValue, TypeValue};
use baml_types::{BamlMediaType, FieldType, TypeValue};

use super::ruby_language_features::ToRuby;

3 changes: 1 addition & 2 deletions engine/language_client_codegen/src/ruby/generate_types.rs
@@ -1,4 +1,3 @@
use std::collections::HashSet;

use anyhow::Result;

@@ -30,7 +29,7 @@ pub(crate) struct RubyStreamTypes<'ir> {
partial_classes: Vec<PartialRubyStruct<'ir>>,
}

/// The Python class corresponding to Partial<TypeDefinedInBaml>
/// The Python class corresponding to Partial<TypeDefinedjInBaml>
Contributor

typo!

Contributor Author

@hellovai in the next PR, I will correct it, my bad

struct PartialRubyStruct<'ir> {
name: &'ir str,
// the name, and the type of the field
1 change: 0 additions & 1 deletion engine/language_client_codegen/src/ruby/mod.rs
@@ -9,7 +9,6 @@ use anyhow::Result;
use indexmap::IndexMap;
use ruby_language_features::ToRuby;

use either::Either;

use internal_baml_core::ir::repr::IntermediateRepr;

1 change: 0 additions & 1 deletion engine/language_client_codegen/src/typescript/mod.rs
@@ -4,7 +4,6 @@ mod typescript_language_features;
use std::path::PathBuf;

use anyhow::Result;
use either::Either;
use indexmap::IndexMap;
use internal_baml_core::{
configuration::GeneratorDefaultClientMode,