
Resize and update formatting for the data dictionary page tables under Hubble #1082

Closed
wants to merge 9 commits into from
2 changes: 2 additions & 0 deletions .gitignore
@@ -23,6 +23,8 @@ yarn-error.log*

*.info.mdx

.tool-versions

# non-production openrpc.json files
/openrpc/*openrpc.json

14 changes: 14 additions & 0 deletions config/sidebars.ts
@@ -26,6 +26,7 @@ const sidebars: SidebarsConfig = {
{ type: 'ref', id: 'data/rpc/README', label: 'Soroban RPC'},
{ type: 'ref', id: 'data/hubble/README', label: 'Hubble'},
{ type: 'ref', id: 'data/horizon/README', label: 'Horizon'},
{ type: 'ref', id: 'data/galexie/README', label: 'Galexie'},
],
tools: [
{
@@ -74,6 +75,19 @@ const sidebars: SidebarsConfig = {
collapsible: false,
},
],
galexie: [
{
type: 'category',
label: 'Galexie',
items: [
{
type: "autogenerated",
dirName: "data/galexie",
},
],
collapsible: false,
},
],
soroban_rpc: [
{
type: "category",
2 changes: 1 addition & 1 deletion docs/README.mdx
@@ -20,7 +20,7 @@ Information on how to issue assets on the Stellar network and create custom smar

### [Data](/docs/data/README.mdx)

Discover various data availability options: RPC, Hubble, and Horizon.
Discover various data availability options: RPC, Hubble, Horizon, and Galexie.

### [Tools](/docs/tools/README.mdx)

4 changes: 2 additions & 2 deletions docs/build/guides/dapps/frontend-guide.mdx
@@ -743,7 +743,7 @@ This is made possible by using the `server.getEvents` method which allows you to

We will be editing the `CounterPage` component to read events from the counter smart contract immediately the page loads to get the initial counter value and update instead of using "Unknown". Before we continue, please take a look at the [contract code](https://github.com/stellar/soroban-examples/blob/main/events/src/lib.rs). In the contract code, an event named `increment` is emitted whenever the `increment` function is called. It is published over 2 topics, `increment` and `COUNTER` and we need to listen to these topics to get the events.

The topics are stored in a data type called `symbol` and we will need to convert both `increment` and `COUNTER` to `symbol` before we can use them in the [`server.getEvents`](https://developers.stellar.org/docs/data/rpc/api-reference/methods/getEvents) method. By default, soroban RPCs keep track of events for 24 hours and you can query events that happened within the last 24 hours, so if you need to store events for longer, you may need to make use of an [indexer](/docs/tools/developer-tools/data-indexers).
The topics are stored in a data type called `symbol` and we will need to convert both `increment` and `COUNTER` to `symbol` before we can use them in the [`server.getEvents`](https://developers.stellar.org/docs/data/rpc/api-reference/methods/getEvents) method. Soroban RPCs keep track of events for at most 7 days, so you can only query events that happened within the last 7 days; if you need to store events for longer, you may need to make use of an [indexer](/docs/tools/developer-tools/data-indexers).

To use events, we edit our counter page and add the following code:

@@ -841,5 +841,5 @@ State Archival is a characteristic of soroban contracts where some data stored o

A few things to note about data retention in soroban contracts:

- Events Data can be queried within 24 hours of the event happening. So you may need an indexer to store events for longer periods.
- Events data can be queried within 7 days of the event happening. So you may need an indexer to store events for longer periods.
- Transaction data is stored with a retention period of 1440 ledgers. This means after 1440 ledgers, the transaction data cannot be queried using the RPC. Again, you may need an indexer to store transaction data for longer periods.
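To put these retention windows in wall-clock terms, here is a minimal Python sketch; the ~6-second average ledger close time is the same approximation used in the ingestion guide, an estimate rather than a protocol guarantee:

```python
# Back-of-the-envelope retention windows, assuming an average
# ledger close time of roughly 6 seconds (an estimate, not a guarantee).
LEDGER_CLOSE_SECONDS = 6

def ledgers_to_hours(ledgers: int, close_seconds: int = LEDGER_CLOSE_SECONDS) -> float:
    """Convert a retention window expressed in ledgers to approximate hours."""
    return ledgers * close_seconds / 3600

def days_to_ledgers(days: int, close_seconds: int = LEDGER_CLOSE_SECONDS) -> int:
    """Convert a retention window expressed in days to an approximate ledger count."""
    return int(days * 24 * 3600 / close_seconds)

print(ledgers_to_hours(1440))  # transaction retention: ~2.4 hours
print(days_to_ledgers(7))      # event retention: ~100800 ledgers
```

So the 1440-ledger transaction window is only a couple of hours, while the 7-day event window spans roughly 100,800 ledgers — both small enough that an indexer is the right tool for anything longer-lived.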
8 changes: 4 additions & 4 deletions docs/build/guides/events/ingest.mdx
@@ -1,12 +1,12 @@
---
title: Ingest events published from a contract
description: Use Soroban RPC's getEvents method for querying events, with a 24-hour retention window
description: Use Soroban RPC's getEvents method for querying events, with a 7-day retention window
---

import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";

Soroban RPC provides a `getEvents` method which allows you to query events from a smart contract. However, the data retention window for these events is roughly 24 hours. If you need access to a longer-lived record of these events you'll want to "ingest" the events as they are published, maintaining your own record or database as events are ingested.
Soroban RPC provides a `getEvents` method which allows you to query events from a smart contract. However, the data retention window for these events is 7 days at most. If you need access to a longer-lived record of these events you'll want to "ingest" the events as they are published, maintaining your own record or database as events are ingested.

There are many strategies you can use to ingest and keep the events published by a smart contract. Among the simplest might be using a community-developed tool such as [Mercury](https://mercurydata.app) which will take all the infrastructure work off your plate for a low subscription fee.

@@ -139,13 +139,13 @@ We are making some assumptions here. We'll assume that your contract sees enough

<TabItem value="python" label="Python">

If we start from scratch, there is no known ledger so we can try to ingest roughly the last 24 hours assuming a ledger closes every 6s.
If we start from scratch, there is no known ledger, so we can try to ingest roughly the last 7 days, assuming a ledger closes every 6s.

```python
import stellar_sdk

soroban_server = stellar_sdk.SorobanServer()
ledger = soroban_server.get_latest_ledger().sequence-int(3600 / 6 * 24)
ledger = soroban_server.get_latest_ledger().sequence-int(3600 / 6 * 24 * 7)
```

Later on, we will be able to start from the latest ingested ledger by making a query to our DB.
56 changes: 30 additions & 26 deletions docs/data/README.mdx
@@ -9,19 +9,19 @@ There are several products to choose from when interacting with the Stellar Netw
This section will walk you through the differences between the various platforms and tools, what platform or tool is best for what use case, and then link to their various documentation locations.

- **[RPC](#rpc)** - live network gateway
- **[Horizon](#horizon)** - API for network state data
- **Galexie** - exports raw ledger metadata files
- **[Hubble](#hubble)** - analytics database for network data

| Features | RPC | Horizon | Galexie | Hubble |
| ----------------------- | --- | ------- | ------- | ------ |
| Real-time Data | ✅ | ✅ | ✅ | ❌ |
| Historical Data | ❌ | ❌\* | ✅ | ✅ |
| Smart Contracts | ✅ | ❌ | ✅ | ✅ |
| API | ✅ | ✅ | ❌ | ❌ |
| Transaction Submission | ✅ | ✅ | ❌ | ❌ |
| Curated and Parsed Data | ❌ | ✅ | ❌ | ✅ |
| Ad Hoc Data Analysis | ❌ | ❌ | ❌ | ✅ |
- **[Horizon](#horizon)** - API for network state data
- **[Galexie](#galexie)** - exports raw ledger metadata files

| Features | RPC | Hubble | Horizon | Galexie |
| ----------------------- | --- | ------ | ------- | ------- |
| Real-time Data | ✅ | ❌ | ✅ | ✅ |
| Historical Data | ❌ | ✅ | ❌\* | ✅ |
| Smart Contracts | ✅ | ✅ | ❌ | ✅ |
| API | ✅ | ❌ | ✅ | ❌ |
| Transaction Submission | ✅ | ❌ | ✅ | ❌ |
| Curated and Parsed Data | ❌ | ✅ | ✅ | ❌ |
| Ad Hoc Data Analysis | ❌ | ✅ | ❌ | ❌ |

\*_Please note that Horizon can provide full historical data but is not the recommended tool for full historical data access._

@@ -39,20 +39,6 @@ If the RPC does not otherwise serve your needs, please tell us why in the [Stell

You have the option of [setting up your own RPC instance](./rpc/admin-guide.mdx) or using a publicly available service from [an infrastructure provider](./rpc/rpc-providers.mdx).

## [Data Indexers](../tools/developer-tools/data-indexers.mdx)

Data indexers are specialized tools that process and index blockchain data, making it more accessible and queryable to end users. They transform raw blockchain data into a more structured format that’s easier for end users to interact with.

Data indexers have advanced querying capabilities and enhanced analytics. They provide features such as statistical analysis of blockchain activity, visualization of transaction flows, or tracking DeFi metrics — capabilities that go beyond basic transaction lookup for current or historical state data.

Data indexers are a potentially more user-friendly, cost-effective choice for users. Check out several available data indexers for the Stellar network in our [Tools section](../tools/developer-tools/data-indexers.mdx).

## [Analytics Platforms](../tools/developer-tools/analytics-platforms.mdx)

Analytics Platforms are specialized tools that process and make historical Stellar network data available. The Stellar network data is loaded into database tables for large data analytics using SQL. Users can create complex ad hoc analysis, dashboarding, and curate actionable data insights (e.g., business intelligence or business analytics).

Check out several available analytics platforms for the Stellar network in our [Tools section](../tools/developer-tools/analytics-platforms.mdx).

## [Hubble](./hubble/README.mdx)

Hubble is an SDF-maintained, open-source, publicly available BigQuery data warehouse that provides a complete, holistic historical record of the Stellar network. It is a read-only platform and does not have the capability to send transactions to the network like you can with RPC.
@@ -70,3 +56,21 @@ Horizon is an API for accessing and interacting with the Stellar network data. I
Horizon stores three types of data (current state, historical state, and derived state) in one database, and the data is available in real-time for transactional use, which makes Horizon more expensive and resource-intensive to operate. If you’re considering using Horizon over the RPC, let us know in the [Stellar Developer Discord](https://discord.gg/stellardev) or file an issue in the [RPC repo](https://github.com/stellar/soroban-rpc) and let us know why!

You can [run your own instance of Horizon](./horizon/admin-guide/README.mdx) or use one of the publicly available Horizon services from [these infrastructure providers](./horizon/horizon-providers.mdx).

## [Galexie](./galexie/README.mdx)

Galexie is a tool for exporting Stellar ledger metadata to external data storage. Learn more about its [use cases](./galexie/README.mdx) and how to [run](./galexie/admin_guide/README.mdx) your own instance of Galexie.

## [Data Indexers](../tools/developer-tools/data-indexers.mdx)

Data indexers are specialized tools that process and index blockchain data, making it more accessible and queryable to end users. They transform raw blockchain data into a more structured format that’s easier for end users to interact with.

Data indexers have advanced querying capabilities and enhanced analytics. They provide features such as statistical analysis of blockchain activity, visualization of transaction flows, or tracking DeFi metrics — capabilities that go beyond basic transaction lookup for current or historical state data.

Data indexers are a potentially more user-friendly, cost-effective choice for users. Check out several available data indexers for the Stellar network in our [Tools section](../tools/developer-tools/data-indexers.mdx).

## [Analytics Platforms](../tools/developer-tools/analytics-platforms.mdx)

Analytics Platforms are specialized tools that process and make historical Stellar network data available. The Stellar network data is loaded into database tables for large data analytics using SQL. Users can create complex ad hoc analysis, dashboarding, and curate actionable data insights (e.g., business intelligence or business analytics).

Check out several available analytics platforms for the Stellar network in our [Tools section](../tools/developer-tools/analytics-platforms.mdx).
37 changes: 37 additions & 0 deletions docs/data/galexie/README.mdx
@@ -0,0 +1,37 @@
---
title: Galexie Introduction
sidebar_position: 0
---

## What is Galexie?

Galexie is a tool that extracts, processes, and exports Stellar ledger metadata to external storage, creating a data lake of pre-processed ledger metadata. Galexie is the foundation of the Composable Data Pipeline (CDP) and serves as the first step in extracting raw Stellar ledger metadata and making it accessible. Learn more about CDP’s benefits and applications in this [blog post](https://stellar.org/blog/developers/composable-data-platform).

## What Are the Key Features of Galexie?

Galexie is designed to streamline the export of ledger metadata through a simple, user-friendly interface. Its key features include:

- Exporting Stellar ledger metadata to cloud storage
- Exporting a configurable range of ledgers, or continuously streaming new ledgers as they are created on the Stellar network
- Exporting ledger metadata in XDR, Stellar Core’s native format
- Compressing data before export to optimize storage efficiency in the data lake

![](/assets/galexie-architecture.png)

## Why XDR Format?

Exporting data in XDR—the native Stellar Core format—enables Galexie to preserve full transaction metadata, ensuring data integrity while keeping storage efficient. The XDR format maintains compatibility with all Stellar components, providing a solid foundation for applications that require consistent access to historical data. Refer to the [XDR](/docs/learn/encyclopedia/data-format/xdr) documentation for more information on this format.

## Why Run Galexie?

Galexie enables you to make a copy of Stellar ledger metadata over which you have complete control. It can continuously sync your data lake with the latest ledger data, freeing you from tedious data ingestion and letting you focus on building customized applications that consume and analyze the exported data.

## What Can You Do with the Data Lake Created by Galexie?

Once data is stored in the cloud, it becomes easily accessible for integration with modern data processing and analytics tools, enabling various workflows and insights.

The pre-processed ledger data exported by Galexie can be utilized across various applications, such as:

- Analytics Tools: Analyze trends over time.
- Audit Applications: Retrieve historical transaction data for auditing and compliance.
- Monitoring Systems: Create tools to track network metrics.
6 changes: 6 additions & 0 deletions docs/data/galexie/admin_guide/README.mdx
@@ -0,0 +1,6 @@
---
title: Admin Guide
sidebar_position: 15
---

This guide provides step-by-step instructions on installing and running Galexie.
46 changes: 46 additions & 0 deletions docs/data/galexie/admin_guide/configuring.mdx
@@ -0,0 +1,46 @@
---
title: Configuring
sidebar_position: 20
---

# Configuring

## Steps to Configure Galexie

1. **Copy the Sample Configuration**

Start with the provided sample file, [`config.example.toml`](https://github.com/stellar/go/blob/master/services/galexie/config.example.toml).

2. **Rename and Update the Configuration**

Rename the file to `config.toml` and adjust settings as needed.

- **Key Settings Include:**

- **Google Cloud Storage (GCS) Bucket**

Specify the GCS bucket where Galexie will export Stellar ledger data. Update `destination_bucket_path` to the complete path of your GCS bucket, including subpaths if applicable.

```toml
destination_bucket_path = "stellar-network-data/testnet"
```

- **Stellar Network**

Set the Stellar network to be used in creating the data lake.

```toml
network = "testnet"
```

- **Data Organization (Optional)**

Configure how the exported data is organized in the GCS bucket. The example below stores one ledger per file and groups files into partitions (directories) of 64,000 files each.

```toml
# Number of ledgers stored in each file
ledgers_per_file = 1

# Number of files per partition/directory
files_per_partition = 64000
```
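Putting these settings together, a minimal `config.toml` could look like the sketch below. The values are illustrative assumptions taken from the snippets above — consult [`config.example.toml`](https://github.com/stellar/go/blob/master/services/galexie/config.example.toml) for the authoritative list of supported keys:

```toml
# Target GCS bucket (including any subpath) for exported ledger data
destination_bucket_path = "stellar-network-data/testnet"

# Stellar network to export (e.g., "testnet")
network = "testnet"

# Data organization within the bucket
ledgers_per_file = 1
files_per_partition = 64000

# Port for the HTTP admin endpoint that publishes metrics (see Monitoring)
admin_port = 6061
```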
12 changes: 12 additions & 0 deletions docs/data/galexie/admin_guide/installing.mdx
@@ -0,0 +1,12 @@
---
title: Installing
sidebar_position: 30
---

# Installing

To install Galexie, retrieve the Docker image from the [Stellar Docker Hub registry](https://hub.docker.com/r/stellar/stellar-galexie) using the following command:

```shell
docker pull stellar/stellar-galexie
```
52 changes: 52 additions & 0 deletions docs/data/galexie/admin_guide/monitoring.mdx
@@ -0,0 +1,52 @@
---
title: Monitoring
sidebar_position: 50
---

# Monitoring

### Metrics

Galexie publishes metrics through an HTTP-based admin endpoint, which makes it easier to monitor its performance. This endpoint is configurable in the `config.toml` file, where you can specify the port on which metrics are made available. The data is exposed in Prometheus format, enabling easy integration with existing monitoring and alerting systems.

The admin port can be configured in the `config.toml` file by setting the `admin_port` variable. By default, `admin_port` is set to `6061`:

```toml
# Admin port configuration
# Specifies the port for hosting the HTTP service that publishes metrics.
admin_port = 6061
```

With this configuration, the URL to access the metrics endpoint will be:

```
http://<host>:6061/metrics
```

Galexie emits several application-specific metrics to help track the export process:

- `galexie_last_exported_ledger`: The sequence number of the most recently exported ledger.
- `galexie_uploader_put_duration_seconds`: The time taken to upload objects to the data lake.
- `galexie_uploader_object_size_bytes`: Compressed and uncompressed sizes of the objects being uploaded.
- `galexie_upload_queue_length`: Number of objects currently queued and waiting to be uploaded.

In addition to these application-specific metrics, Galexie also exports system metrics (e.g., CPU, memory, open file descriptors) and Stellar Core ingestion metrics such as `galexie_ingest_ledger_fetch_duration_seconds`.

Use these metrics to build queries that monitor Galexie’s performance and export process. Here are a few examples of useful queries:

- Export Times: Query `galexie_uploader_put_duration_seconds` to monitor average upload times.
- Queue Length: Use `galexie_upload_queue_length` to view the number of objects waiting to be uploaded.
- Latest Exported Ledger: Track `galexie_last_exported_ledger` to ensure that ledger exports are up to date.
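As a quick illustration, the metrics endpoint can also be polled with nothing more than the Python standard library. The sketch below assumes the default `admin_port` of `6061` and a locally running instance; the hostname, port, and helper names are illustrative, not part of Galexie itself:

```python
# Minimal sketch: scrape Galexie's Prometheus-format metrics endpoint
# and extract a single metric value using only the standard library.
import re
import urllib.request
from typing import Optional

def parse_metric(metrics_text: str, name: str) -> Optional[float]:
    """Return the first sample value for `name` from Prometheus text format."""
    # Matches lines like: galexie_last_exported_ledger 1234567
    # or with labels:     galexie_upload_queue_length{state="pending"} 3
    pattern = rf"^{re.escape(name)}(?:\{{[^}}]*\}})?\s+([0-9.eE+-]+)$"
    match = re.search(pattern, metrics_text, flags=re.MULTILINE)
    return float(match.group(1)) if match else None

def fetch_last_exported_ledger(host: str = "localhost", port: int = 6061) -> Optional[float]:
    """Fetch the metrics page and return galexie_last_exported_ledger, if present."""
    with urllib.request.urlopen(f"http://{host}:{port}/metrics") as resp:
        return parse_metric(resp.read().decode(), "galexie_last_exported_ledger")
```

A scheduled check like this can feed a simple alert (for example, page if the last exported ledger stops advancing), though for production use a Prometheus scrape plus the Grafana dashboard below is the more robust path.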

For a quick start, download our pre-built Grafana dashboard for Galexie [here](https://grafana.com/grafana/dashboards/22285-stellar-galexie/). This dashboard provides pre-configured queries and visualizations to help you monitor Galexie's health. You can customize it to fit your specific needs.

### Logging

Galexie emits logs to stdout and generates a log line for every object being exported to help monitor progress.

Example logs:

```
INFO[2024-11-07T17:40:37.795-08:00] Uploading: FFFFFF37--200-299/FFFFFF37--200.xdr.zstd pid=98734 service=galexie
INFO[2024-11-07T17:40:37.892-08:00] Uploaded FFFFFF37--200-299/FFFFFF37--200.xdr.zstd successfully pid=98734 service=galexie
```