docs: add ADLS backend docs #2906

Closed
wants to merge 12 commits into from
57 changes: 57 additions & 0 deletions docs/integrations/object-storage/adls.md
@@ -0,0 +1,57 @@
# Azure ADLS Storage Backend

`delta-rs` offers native support for using Microsoft Azure Data Lake Storage (ADLS) as an object storage backend.

You don’t need to install any extra dependencies to read/write Delta tables to ADLS with engines that use `delta-rs`. You do need to configure your ADLS access credentials correctly.

## Passing Credentials Explicitly

You can pass ADLS credentials to your query engine explicitly.

For Polars, you do this with the `storage_options` keyword, as demonstrated in the examples below. Polars forwards these credentials to the `object_store` library that it uses for cloud storage access under the hood. Read the [`object_store` documentation](https://docs.rs/object_store/latest/object_store/azure/enum.AzureConfigKey.html#variants) for more information on defining specific credentials.

## Example: Write Delta table to ADLS with Polars

Using Polars, you can write a Delta table to ADLS directly like this:

```python
import polars as pl

df = pl.DataFrame({"foo": [1, 2, 3, 4, 5]})

# define container name (replace the placeholder with your own value)
container = "<container_name>"

# define credentials (replace the placeholders with your own values)
storage_options = {
    "ACCOUNT_NAME": "<account_name>",
    "ACCESS_KEY": "<access_key>",
}

# write Delta to ADLS
df.write_delta(
    f"abfs://{container}/delta_table",
    storage_options=storage_options,
)
```

## Example with pandas

For libraries without a direct `write_delta` method (like pandas), you can use the `write_deltalake` function from the `deltalake` library:

```python
import pandas as pd
from deltalake import write_deltalake

df = pd.DataFrame({"foo": [1, 2, 3, 4, 5]})

# `container` and `storage_options` are defined as in the Polars example
write_deltalake(
    f"abfs://{container}/delta_table_pandas",
    df,
    storage_options=storage_options,
)
```
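
Reading works the same way as writing. The following is a minimal sketch of loading the table written in the pandas example back into a DataFrame, assuming the same `container` and `storage_options` values. The helper names `table_uri` and `read_table` are hypothetical and only used for illustration; `DeltaTable` and its `to_pandas` method come from the `deltalake` library.

```python
def table_uri(container: str, table_path: str) -> str:
    """Build the abfs:// URI for a table path inside a container."""
    return f"abfs://{container}/{table_path}"


def read_table(container: str, table_path: str, storage_options: dict):
    """Load a Delta table from ADLS into a pandas DataFrame."""
    # Imported here so that table_uri can be used without deltalake installed.
    from deltalake import DeltaTable

    dt = DeltaTable(
        table_uri(container, table_path),
        storage_options=storage_options,
    )
    return dt.to_pandas()
```

For example, `read_table(container, "delta_table_pandas", storage_options)` should return the rows written above, provided the credentials grant read access.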

## Using Local Authentication

If your local session is authenticated using the Azure CLI then you can write Delta tables directly to ADLS. Read more about this in the [Azure CLI documentation](https://learn.microsoft.com/en-us/cli/azure/).
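
As a minimal sketch of this route, the options below omit the access key and instead ask the underlying `object_store` client to obtain a token from the CLI session. The `account_name` and `use_azure_cli` keys are taken from the `object_store` `AzureConfigKey` documentation; treat their exact spelling as an assumption to verify against the version you have installed. The helper names `azure_cli_storage_options` and `write_with_cli_auth` are hypothetical, and the sketch assumes you have already run `az login`.

```python
def azure_cli_storage_options(account_name: str) -> dict[str, str]:
    """Build storage options that rely on an existing `az login` session."""
    return {
        "account_name": account_name,
        # Assumption: `use_azure_cli` is the object_store config key that
        # enables Azure CLI token acquisition; values are passed as strings.
        "use_azure_cli": "true",
    }


def write_with_cli_auth(container: str, account_name: str) -> None:
    """Write a small Delta table to ADLS without an explicit access key."""
    # Imported here so the helper above can be used without these packages.
    import pandas as pd
    from deltalake import write_deltalake

    df = pd.DataFrame({"foo": [1, 2, 3, 4, 5]})
    write_deltalake(
        f"abfs://{container}/delta_table_cli",
        df,
        storage_options=azure_cli_storage_options(account_name),
    )
```

Keeping the option building separate from the write call makes it easy to switch between key-based and CLI-based authentication without touching the rest of the code.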
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -82,6 +82,7 @@ nav:
- api/exceptions.md
- Integrations:
- Object Storage:
- integrations/object-storage/adls.md
- integrations/object-storage/hdfs.md
- integrations/object-storage/s3.md
- integrations/object-storage/s3-like.md