Skip to content

Runs database

Matt Kramer edited this page Apr 2, 2025 · 2 revisions

Runs database

Versions

2x2

Current release, sqlite: https://portal.nersc.gov/project/dune/data/2x2/DB/RunsDB/releases/mx2x2runs_v0.2.sqlite

Current release, spreadsheet: https://docs.google.com/spreadsheets/d/1RI-YyXi994TlXT4gldL99KqONsuyatkjcjKzxupKn3g

FSD

Current release, sqlite: https://portal.nersc.gov/project/dune/data/FSD/run_db/fsd_run_db.20241216.sqlite

Current release, spreadsheet: https://docs.google.com/spreadsheets/d/1RaFJW97BC08XAlMV6Gp7mqad2rtdB11g

Basic instructions

The sqlite file can be opened with the sqlite3 tool on the command line, or the sqlite3 built-in Python module, or the sqlalchemy library, or the sqlitebrowser GUI, etc. You can also use pandas (see example below).

The schema should be reasonably self-explanatory. The spreadsheet is a decent way to get familiar with it.

pandas example

Make sure that you've got sqlalchemy installed in addition to pandas.

import datetime as DT
import pandas as pd

db = 'sqlite:///mx2x2runs_v0.1.sqlite'

# Get the global subrun table as a DataFrame
all_subruns = pd.read_sql('All_global_subruns', db)

# Get the unix timestamps for the "sandbox" period
# (first 2 hours of June 9 in Chicago)
t_start = DT.datetime.fromisoformat('2024-07-09T00:00:00-05:00').timestamp()
t_end = DT.datetime.fromisoformat('2024-07-09T02:00:00-05:00').timestamp()

# Get the sandbox subruns
sandbox_subruns = all_subruns[(all_subruns['start_time_unix'] >= t_start) &
                              (all_subruns['end_time_unix'] <= t_end)]

# Get the CRS table
crs_table = pd.read_sql('CRS_summary', db)

# Print some info on all of the CRS files in the sandbox
for _id, subrun in sandbox_subruns.iterrows():
    crs_info = crs_table[(crs_table['run'] == subrun.crs_run) &
                         (crs_table['subrun'] == subrun.crs_subrun)].iloc[0]
    print(subrun.crs_run, subrun.crs_subrun, crs_info.filename, crs_info.msg_rate)

Clone this wiki locally