
Commit 398fdcf

docs: Adding blog post for Ray (#5705)
Signed-off-by: ntkathole <[email protected]>
1 parent be6e6c2 commit 398fdcf

8 files changed, +332 -23 lines


docs/getting-started/components/compute-engine.md
Lines changed: 1 addition & 1 deletion

@@ -25,7 +25,7 @@ engines.
 | SnowflakeComputeEngine | Runs on Snowflake, designed for scalable feature generation using Snowflake SQL. | ✅ | |
 | LambdaComputeEngine | Runs on AWS Lambda, designed for serverless feature generation. | ✅ | |
 | FlinkComputeEngine | Runs on Apache Flink, designed for stream processing and real-time feature generation. | ❌ | |
-| RayComputeEngine | Runs on Ray, designed for distributed feature generation and machine learning workloads. | | |
+| RayComputeEngine | Runs on Ray, designed for distributed feature generation and machine learning workloads. | | |
 ```

 ### Batch Engine

docs/getting-started/genai.md
Lines changed: 19 additions & 0 deletions

@@ -104,6 +104,24 @@ This integration enables:
 - Efficiently materializing features to vector databases
 - Scaling RAG applications to enterprise-level document repositories

+### Scaling with Ray Integration
+
+Feast integrates with Ray to enable distributed processing for RAG applications:
+
+* **Ray Compute Engine**: Distributed feature computation using Ray's task and actor model
+* **Ray Offline Store**: Process large document collections and generate embeddings at scale
+* **Ray Batch Materialization**: Efficiently materialize features from offline to online stores
+* **Distributed Embedding Generation**: Scale embedding generation across multiple nodes
+
+This integration enables:
+- Distributed processing of large document collections
+- Parallel embedding generation for millions of text chunks
+- Kubernetes-native scaling for RAG applications
+- Efficient resource utilization across multiple nodes
+- Production-ready distributed RAG pipelines
+
+For detailed information on building distributed RAG applications with Feast and Ray, see [Feast + Ray: Distributed Processing for RAG Applications](https://feast.dev/blog/feast-ray-distributed-processing/).
+
 ## Model Context Protocol (MCP) Support

 Feast supports the Model Context Protocol (MCP), which enables AI agents and applications to interact with your feature store through standardized MCP interfaces. This allows seamless integration with LLMs and AI agents for GenAI applications.

@@ -158,6 +176,7 @@ For more detailed information and examples:
 * [RAG Tutorial with Docling](../tutorials/rag-with-docling.md)
 * [RAG Fine Tuning with Feast and Milvus](../../examples/rag-retriever/README.md)
 * [Milvus Quickstart Example](https://github.com/feast-dev/feast/tree/master/examples/rag/milvus-quickstart.ipynb)
+* [Feast + Ray: Distributed Processing for RAG Applications](https://feast.dev/blog/feast-ray-distributed-processing/)
 * [MCP Feature Store Example](../../examples/mcp_feature_store/)
 * [MCP Feature Server Reference](../reference/feature-servers/mcp-feature-server.md)
 * [Spark Data Source](../reference/data-sources/spark.md)
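
To make the materialization path described in the new "Scaling with Ray Integration" section concrete, here is a minimal sketch (not part of this commit) of the SDK calls involved. It assumes a repo whose `feature_store.yaml` is already configured for the Ray offline store and compute engine, and a hypothetical `document_embeddings` feature view keyed by `document_id`; both names are illustrative.

```python
from datetime import datetime, timedelta

from feast import FeatureStore

# Assumes the feature_store.yaml in the current directory is configured for Ray
# (offline store + batch engine), e.g. a repo created from the Ray templates.
store = FeatureStore(repo_path=".")

# Materialize the last day of embeddings from the offline store to the online
# store; with the Ray compute engine configured, this step runs distributed.
end = datetime.utcnow()
store.materialize(start_date=end - timedelta(days=1), end_date=end)

# Read features back for a document entity (feature reference is illustrative).
features = store.get_online_features(
    features=["document_embeddings:embedding"],
    entity_rows=[{"document_id": "doc-001"}],
).to_dict()
print(features)
```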

docs/reference/compute-engine/ray.md
Lines changed: 43 additions & 0 deletions

@@ -2,6 +2,24 @@

 The Ray compute engine is a distributed compute implementation that leverages [Ray](https://www.ray.io/) for executing feature pipelines including transformations, aggregations, joins, and materializations. It provides scalable and efficient distributed processing for both `materialize()` and `get_historical_features()` operations.

+## Quick Start with Ray Template
+
+### Ray RAG Template - Batch Embedding at Scale
+
+For RAG (Retrieval-Augmented Generation) applications with distributed embedding generation:
+
+```bash
+feast init -t ray_rag my_rag_project
+cd my_rag_project/feature_repo
+```
+
+The Ray RAG template demonstrates:
+- **Parallel Embedding Generation**: Uses Ray compute engine to generate embeddings across multiple workers
+- **Vector Search Integration**: Works with Milvus for semantic similarity search
+- **Complete RAG Pipeline**: Data → Embeddings → Search workflow
+
+The Ray compute engine automatically distributes the embedding generation across available workers, making it ideal for processing large datasets efficiently.
+
 ## Overview

 The Ray compute engine provides:

@@ -365,6 +383,8 @@ batch_engine:

 ### With Feature Transformations

+#### On-Demand Transformations
+
 ```python
 from feast import FeatureView, Field
 from feast.types import Float64

@@ -385,4 +405,27 @@ features = store.get_historical_features(
 )
 ```

+#### Ray Native Transformations
+
+For distributed transformations that leverage Ray's dataset and parallel processing capabilities, use `mode="ray"` in your `BatchFeatureView`:
+
+```python
+# Feature view with Ray transformation mode
+document_embeddings_view = BatchFeatureView(
+    name="document_embeddings",
+    entities=[document],
+    mode="ray",  # Enable Ray native transformation
+    ttl=timedelta(days=365),
+    schema=[
+        Field(name="document_id", dtype=String),
+        Field(name="embedding", dtype=Array(Float32), vector_index=True),
+        Field(name="movie_name", dtype=String),
+        Field(name="movie_director", dtype=String),
+    ],
+    source=movies_source,
+    udf=generate_embeddings_ray_native,
+    online=True,
+)
+```
+
 For more information, see the [Ray documentation](https://docs.ray.io/en/latest/) and [Ray Data guide](https://docs.ray.io/en/latest/data/getting-started.html).
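
The snippet above references `generate_embeddings_ray_native` but the commit does not show its body. The sketch below is one plausible shape for such a UDF, under the assumption that a `mode="ray"` transformation receives and returns a `ray.data.Dataset`; the embedding logic is a stand-in, and the exact UDF contract should be checked against the Ray compute engine reference.

```python
import numpy as np
import pandas as pd
import ray.data


# Hypothetical body for the generate_embeddings_ray_native UDF referenced above.
# Assumption: with mode="ray", the UDF is handed a ray.data.Dataset and returns one.
def generate_embeddings_ray_native(ds: ray.data.Dataset) -> ray.data.Dataset:
    def add_embeddings(batch: pd.DataFrame) -> pd.DataFrame:
        # Stand-in embedding: hash-seeded random 384-dim vectors. Replace with a
        # real model call (e.g. a sentence-transformers encoder) in practice.
        batch["embedding"] = [
            np.random.default_rng(abs(hash(text)) % (2**32))
            .random(384)
            .astype("float32")
            .tolist()
            for text in batch["movie_name"]
        ]
        return batch

    # map_batches runs the function in parallel across the Ray cluster's workers.
    return ds.map_batches(add_embeddings, batch_format="pandas")
```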

docs/reference/offline-stores/README.md
Lines changed: 4 additions & 0 deletions

@@ -45,3 +45,7 @@ Please see [Offline Store](../../getting-started/components/offline-store.md) fo
 {% content-ref url="mssql.md" %}
 [mssql.md](mssql.md)
 {% endcontent-ref %}
+
+{% content-ref url="ray.md" %}
+[ray.md](ray.md)
+{% endcontent-ref %}

docs/reference/offline-stores/overview.md
Lines changed: 22 additions & 22 deletions

@@ -26,33 +26,33 @@ The first three of these methods all return a `RetrievalJob` specific to an offl
 ## Functionality Matrix

 There are currently four core offline store implementations: `DaskOfflineStore`, `BigQueryOfflineStore`, `SnowflakeOfflineStore`, and `RedshiftOfflineStore`.
-There are several additional implementations contributed by the Feast community (`PostgreSQLOfflineStore`, `SparkOfflineStore`, and `TrinoOfflineStore`), which are not guaranteed to be stable or to match the functionality of the core implementations.
+There are several additional implementations contributed by the Feast community (`PostgreSQLOfflineStore`, `SparkOfflineStore`, `TrinoOfflineStore`, and `RayOfflineStore`), which are not guaranteed to be stable or to match the functionality of the core implementations.
 Details for each specific offline store, such as how to configure it in a `feature_store.yaml`, can be found [here](README.md).

 Below is a matrix indicating which offline stores support which methods.

-| | Dask | BigQuery | Snowflake | Redshift | Postgres | Spark | Trino | Couchbase |
-| :-------------------------------- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
-| `get_historical_features` | yes | yes | yes | yes | yes | yes | yes | yes |
-| `pull_latest_from_table_or_query` | yes | yes | yes | yes | yes | yes | yes | yes |
-| `pull_all_from_table_or_query` | yes | yes | yes | yes | yes | yes | yes | yes |
-| `offline_write_batch` | yes | yes | yes | yes | no | no | no | no |
-| `write_logged_features` | yes | yes | yes | yes | no | no | no | no |
+| | Dask | BigQuery | Snowflake | Redshift | Postgres | Spark | Trino | Couchbase | Ray |
+| :-------------------------------- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
+| `get_historical_features` | yes | yes | yes | yes | yes | yes | yes | yes | yes |
+| `pull_latest_from_table_or_query` | yes | yes | yes | yes | yes | yes | yes | yes | yes |
+| `pull_all_from_table_or_query` | yes | yes | yes | yes | yes | yes | yes | yes | yes |
+| `offline_write_batch` | yes | yes | yes | yes | no | no | no | no | yes |
+| `write_logged_features` | yes | yes | yes | yes | no | no | no | no | yes |


 Below is a matrix indicating which `RetrievalJob`s support what functionality.

-| | Dask | BigQuery | Snowflake | Redshift | Postgres | Spark | Trino | DuckDB | Couchbase |
-| --------------------------------- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
-| export to dataframe | yes | yes | yes | yes | yes | yes | yes | yes | yes |
-| export to arrow table | yes | yes | yes | yes | yes | yes | yes | yes | yes |
-| export to arrow batches | no | no | no | yes | no | no | no | no | no |
-| export to SQL | no | yes | yes | yes | yes | no | yes | no | yes |
-| export to data lake (S3, GCS, etc.) | no | no | yes | no | yes | no | no | no | yes |
-| export to data warehouse | no | yes | yes | yes | yes | no | no | no | yes |
-| export as Spark dataframe | no | no | yes | no | no | yes | no | no | no |
-| local execution of Python-based on-demand transforms | yes | yes | yes | yes | yes | no | yes | yes | yes |
-| remote execution of Python-based on-demand transforms | no | no | no | no | no | no | no | no | no |
-| persist results in the offline store | yes | yes | yes | yes | yes | yes | no | yes | yes |
-| preview the query plan before execution | yes | yes | yes | yes | yes | yes | yes | no | yes |
-| read partitioned data | yes | yes | yes | yes | yes | yes | yes | yes | yes |
+| | Dask | BigQuery | Snowflake | Redshift | Postgres | Spark | Trino | DuckDB | Couchbase | Ray |
+| --------------------------------- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| export to dataframe | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
+| export to arrow table | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
+| export to arrow batches | no | no | no | yes | no | no | no | no | no | no |
+| export to SQL | no | yes | yes | yes | yes | no | yes | no | yes | no |
+| export to data lake (S3, GCS, etc.) | no | no | yes | no | yes | no | no | no | yes | yes |
+| export to data warehouse | no | yes | yes | yes | yes | no | no | no | yes | no |
+| export as Spark dataframe | no | no | yes | no | no | yes | no | no | no | no |
+| local execution of Python-based on-demand transforms | yes | yes | yes | yes | yes | no | yes | yes | yes | yes |
+| remote execution of Python-based on-demand transforms | no | no | no | no | no | no | no | no | no | no |
+| persist results in the offline store | yes | yes | yes | yes | yes | yes | no | yes | yes | yes |
+| preview the query plan before execution | yes | yes | yes | yes | yes | yes | yes | no | yes | yes |
+| read partitioned data | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
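
All of the capabilities compared in these matrices are reached through the `RetrievalJob` returned by `get_historical_features`, regardless of which offline store (including Ray) backs it. A minimal sketch follows; the `driver_hourly_stats` feature view and `driver_id` entity come from the standard Feast quickstart, not from this commit.

```python
from datetime import datetime

import pandas as pd
from feast import FeatureStore

store = FeatureStore(repo_path=".")

# Entity rows to join features onto, point-in-time correct as of each timestamp.
entity_df = pd.DataFrame(
    {
        "driver_id": [1001, 1002],
        "event_timestamp": [datetime(2025, 1, 1), datetime(2025, 1, 2)],
    }
)

job = store.get_historical_features(
    entity_df=entity_df,
    features=["driver_hourly_stats:conv_rate", "driver_hourly_stats:avg_daily_trips"],
)

df = job.to_df()        # "export to dataframe" row in the matrix
table = job.to_arrow()  # "export to arrow table" row in the matrix
```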

docs/reference/offline-stores/ray.md
Lines changed: 17 additions & 0 deletions

@@ -5,6 +5,23 @@

 The Ray offline store is a data I/O implementation that leverages [Ray](https://www.ray.io/) for reading and writing data from various sources. It focuses on efficient data access operations, while complex feature computation is handled by the [Ray Compute Engine](../compute-engine/ray.md).

+## Quick Start with Ray Template
+
+The easiest way to get started with the Ray offline store is to use the built-in Ray template:
+
+```bash
+feast init -t ray my_ray_project
+cd my_ray_project/feature_repo
+```
+
+This template includes:
+- Pre-configured Ray offline store and compute engine setup
+- Sample feature definitions optimized for Ray processing
+- Demo workflow showcasing Ray capabilities
+- Resource settings for local development
+
+The template provides a complete working example with sample datasets and demonstrates both Ray offline store data I/O operations and Ray compute engine distributed processing.
+
 ## Overview

 The Ray offline store provides:
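
A typical first run after `feast init -t ray my_ray_project` might look like the sketch below; the registered view names depend on what the template generates, which this commit does not show, so the example lists them instead of hard-coding any.

```python
from datetime import datetime

from feast import FeatureStore

# Run from inside my_ray_project/feature_repo after `feast apply`.
store = FeatureStore(repo_path=".")

# Inspect what the template registered; names depend on the template contents.
for fv in store.list_feature_views():
    print(fv.name)

# Load everything materializable up to now into the online store. With the Ray
# offline store configured, the reads behind this call are handled by Ray.
store.materialize_incremental(end_date=datetime.utcnow())
```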
