
HEASARC Data Access on SciServer


  • Description: A general overview of accessing data on SciServer.
  • Level: Intermediate.
  • Data: Access XTE data on Eta Car as an example.
  • Requirements: pyvo.
  • Credit: Tess Jaffe (Sep 2021).
  • Support: Contact the HEASARC helpdesk.
  • Last verified to run: 02/28/2024.

1. Introduction

This notebook presents a tutorial on how to access HEASARC data using the virtual observatory (VO) Python client pyvo.

We handle the general case of using the Table Access Protocol (TAP) to query any information about the HEASARC tables. A more specific data access tutorial, for when the table is already known, is given in the notebook on Finding and Downloading Data.

The case will be illustrated by querying for XTE observations of Eta Car.

Running On Sciserver:
The notebook requires pyvo, which on SciServer is available in the heasoft conda kernel. Make sure you run the notebook using that kernel by selecting it in the top right.

2. Module Imports

We need the following python modules:

import sys
import os
import pyvo
from astropy.coordinates import SkyCoord
import requests
import glob
import numpy as np

3. Get the HEASARC TAP service

We can use the Virtual Observatory interfaces to the HEASARC to find the data we're interested in. Specifically, we want to look at the observation tables. So first we get a list of all the tables HEASARC serves and then look for the ones related to RXTE.

3.1 Find the Tables

We start with the Registry of all VO services. The HEASARC table service is using the same backend as our Xamin web interface, the same database that Browse also uses.

tap_services = pyvo.regsearch(servicetype='tap', keywords=['heasarc'])

We then ask the service for all of the tables that are available at the HEASARC:

heasarc_tables = tap_services[0].service.tables

And then we look for the ones related to XTE:

for tablename in heasarc_tables.keys():
    if "xte" in tablename:  
        print(" {:20s} {}".format(tablename, heasarc_tables[tablename].description))
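The filtering step above can be sketched as a small reusable helper. This is a minimal illustration, not part of the original notebook: `find_tables` and `example_tables` are hypothetical names, and the dictionary is a stand-in with placeholder descriptions so that the example runs without a network connection. In the notebook, the mapping would instead be built from `heasarc_tables`.

```python
def find_tables(table_descriptions, keyword):
    """Return sorted (name, description) pairs whose table name contains `keyword`.

    The match is case-insensitive.
    """
    keyword = keyword.lower()
    return sorted(
        (name, desc)
        for name, desc in table_descriptions.items()
        if keyword in name.lower()
    )


# Hypothetical stand-in for a mapping such as
# {name: heasarc_tables[name].description for name in heasarc_tables.keys()}
example_tables = {
    "xtemaster": "placeholder description A",
    "xteindex": "placeholder description B",
    "swiftmastr": "placeholder description C",
}

matches = find_tables(example_tables, "XTE")
for name, desc in matches:
    print(" {:20s} {}".format(name, desc))
```

Lowercasing both sides makes the search insensitive to how the keyword is typed.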

The xtemaster catalog is the one that we're interested in.

Let's see what this table has in it. The same information is available in the table description on the website:

https://heasarc.gsfc.nasa.gov/W3Browse/all/xtemaster.html

for column in heasarc_tables['xtemaster'].columns:
    print("{:20s} {}".format(column.name, column.description))

3.2 Build a Search Query

We're interested in Eta Carinae, and we want to get the RXTE cycle, proposal, observation ID, etc. for every observation RXTE took of this source. We search by position rather than by name, in case the name has been entered differently, which can happen.

The following constructs a query in the ADQL language to select the columns (target_name, cycle, prnb, obsid, time, exposure, ra, dec) where the point defined by the observation's RA and DEC lies inside a circle defined by our chosen source position.

The results will be sorted by time. See the NAVO website for more information on how to use these services with Python and how to construct ADQL queries for catalog searches.

You can also find more details on using these services in the NASA Virtual Observatory (NAVO) workshop tutorials.
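To make the positional constraint concrete: requiring that a point lies inside a circle on the sky amounts to requiring that its angular separation from the circle's center does not exceed the radius (in degrees). The following is a minimal pure-Python sketch of that test using the haversine formula. The function names are hypothetical and the coordinates are approximate, illustrative values; the actual selection is done by the TAP service, not by this code.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the valid range.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def in_cone(ra, dec, center_ra, center_dec, radius_deg=0.1):
    """True if (ra, dec) lies within radius_deg of the cone center."""
    return angular_separation_deg(ra, dec, center_ra, center_dec) <= radius_deg


# Approximate, illustrative cone center (assumed values, not resolved by name).
center_ra, center_dec = 161.265, -59.684

inside = in_cone(161.265, -59.634, center_ra, center_dec)   # offset 0.05 deg in Dec
outside = in_cone(161.265, -59.484, center_ra, center_dec)  # offset 0.2 deg in Dec
print(inside, outside)
```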

# Get the coordinate for Eta Car
pos = SkyCoord.from_name("eta car")
query = """SELECT target_name, cycle, prnb, obsid, time, exposure, ra, dec 
    FROM public.xtemaster as cat 
    where 
    contains(point('ICRS',cat.ra,cat.dec),circle('ICRS',{},{},0.1))=1 
    and 
    cat.exposure > 0 order by cat.time
    """.format(pos.ra.deg, pos.dec.deg)
results = tap_services[0].search(query).to_table()
results
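If you expect to run the same kind of cone search against other tables or positions, the query string can be assembled by a small helper. The sketch below is one possible way to do this; `build_cone_query` and its parameters are hypothetical names introduced here, and it assumes the table has `ra`, `dec`, and `exposure` columns like xtemaster. It only builds the string and does not contact the service.

```python
def build_cone_query(table, columns, ra_deg, dec_deg, radius_deg=0.1, order_by="time"):
    """Build an ADQL cone-search query string.

    Assumes `table` has ra, dec and exposure columns. Values are interpolated
    directly, so only pass trusted table and column names.
    """
    return (
        "SELECT {cols} FROM {table} AS cat "
        "WHERE CONTAINS(POINT('ICRS', cat.ra, cat.dec), "
        "CIRCLE('ICRS', {ra}, {dec}, {radius})) = 1 "
        "AND cat.exposure > 0 "
        "ORDER BY cat.{order}"
    ).format(
        cols=", ".join(columns),
        table=table,
        ra=ra_deg,
        dec=dec_deg,
        radius=radius_deg,
        order=order_by,
    )


# Illustrative coordinates; in the notebook these would come from
# pos.ra.deg and pos.dec.deg.
example_query = build_cone_query(
    "public.xtemaster",
    ["target_name", "cycle", "prnb", "obsid", "time", "exposure", "ra", "dec"],
    161.265,
    -59.684,
)
print(example_query)
```

The resulting string could then be passed to the same `search` call used above.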

4. Using Xamin's API

An alternative method to access the data is to use the Xamin API specifically. Xamin is the main web portal for accessing HEASARC data, and it offers an API that can be used to query the same tables.

The base URL for the Xamin query servlet is given below. We will query it using the requests module.

https://heasarc.gsfc.nasa.gov/xamin/QueryServlet?

And it takes the options:

  • table: e.g., "table=xtemaster"
  • constraint: e.g., "obsid=10004-01-40-00"
  • object: e.g., "object=andromeda" or "object=10.68,41.27"

So we can do:

url = "https://heasarc.gsfc.nasa.gov/xamin/QueryServlet?products&"
result = requests.get(url,params = {"table":"xtemaster",
                                    "object":"eta car",
                                    "resultmax":"10"
                                   })
result.text.split('\n')[0:2]
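To see how the parameters above end up in the request, you can encode them yourself with the standard library. This is only an offline sketch for inspection: it sends nothing, and the exact URL that requests assembles may differ in minor details (for example, how it joins the parameters to the existing query string).

```python
from urllib.parse import urlencode

base_url = "https://heasarc.gsfc.nasa.gov/xamin/QueryServlet?products&"
params = {"table": "xtemaster", "object": "eta car", "resultmax": "10"}

# urlencode percent-encodes values; spaces become '+'.
query_string = urlencode(params)
full_url = base_url + query_string
print(full_url)
```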

You can then construct a file list from the second-to-last field in each row, the *obs_root.
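Pulling the second-to-last field out of each row could look like the sketch below. Note the assumptions: the field delimiter (`|`), the single header row, and the sample text are all made up for illustration and are not taken from real Xamin output. Inspect the actual `result.text` and adjust `sep` and `skip_header` before relying on it.

```python
def second_to_last_fields(text, sep="|", skip_header=1):
    """Return the second-to-last field of each non-empty data row.

    `sep` and `skip_header` are assumptions about the response layout.
    """
    rows = [line for line in text.split("\n") if line.strip()]
    fields = []
    for line in rows[skip_header:]:
        parts = [part.strip() for part in line.split(sep)]
        if len(parts) >= 2:
            fields.append(parts[-2])
    return fields


# Made-up sample text; NOT real Xamin output.
sample_text = "col_a|col_b|obs_root|col_d\n1|x|/path/one|y\n2|x|/path/two|y\n"
roots = second_to_last_fields(sample_text)
print(roots)
```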

5. Obtain the Data

If you know the structure of the mission data, you can take the list of observations from XTE above and find the specific files of the type you want for each of those observations.

For example, let's collect all the standard product light curves for RXTE. (These are described on the RXTE analysis pages.)

A second approach is to use the Xamin portal, find the data products, and obtain the links there.

Yet another approach is to use the VO datalinks service (via pyvo) to find the links to the data. An example of how to do this is shown in the Finding and Downloading Data notebook.

We are working on making more ways to find the data products from the notebook.

The following will use the first approach.

# obtain information about the observations
ids = np.unique( results['cycle','prnb','obsid','time'])
ids.sort(order='time')
ids
# Construct a file list.
rootdir = "/FTP"
rxtedata = "rxte/data/archive"
filenames = []

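for k in range(len(ids)):
    # Build the expected path of the standard-product light curve
    # for this observation.
    fname = "{}/{}/AO{}/P{}/{}/stdprod/xp{}_n2a.lc.gz".format(
        rootdir,
        rxtedata,
        ids['cycle'][k],
        ids['prnb'][k],
        ids['obsid'][k],
        ids['obsid'][k].replace('-', ''))

    # glob returns an empty list if the file does not exist.
    f = glob.glob(fname)
    if len(f) > 0:
        filenames.append(f[0])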

print("Found {} out of {} files".format(len(filenames),len(ids)))
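The path pattern used in the loop above can be isolated in a small function, which makes it easier to check a single observation by hand. This is a sketch: `stdprod_lightcurve_path` is a hypothetical helper, and the cycle and proposal number in the example are illustrative values chosen to match the obsid shown earlier, not results of the query.

```python
def stdprod_lightcurve_path(cycle, prnb, obsid,
                            rootdir="/FTP", archive="rxte/data/archive"):
    """Return the expected path of the standard-product light curve.

    Follows the same layout as the loop above:
    <rootdir>/<archive>/AO<cycle>/P<prnb>/<obsid>/stdprod/xp<obsid without dashes>_n2a.lc.gz
    """
    obsid = str(obsid)
    return "{}/{}/AO{}/P{}/{}/stdprod/xp{}_n2a.lc.gz".format(
        rootdir, archive, cycle, prnb, obsid, obsid.replace("-", ""))


# Illustrative values only.
example_path = stdprod_lightcurve_path(1, 10004, "10004-01-40-00")
print(example_path)
```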

On SciServer, the data can be copied directly from the mounted archive located under /FTP/.
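A minimal sketch of such a copy, using only the standard library, is shown below. The helper name `copy_to_local` and the default destination directory `data` are assumptions made for illustration. In the notebook, the input would be the `filenames` list built above.

```python
import os
import shutil


def copy_to_local(filenames, dest_dir="data"):
    """Copy each file into dest_dir and return the list of new paths.

    Files with the same base name overwrite each other in dest_dir.
    """
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for src in filenames:
        dst = os.path.join(dest_dir, os.path.basename(src))
        shutil.copy(src, dst)
        copied.append(dst)
    return copied

# Example usage in the notebook (not executed here):
# local_files = copy_to_local(filenames, dest_dir="data")
```

Since the archive is already mounted, copying is only needed if you want a private working copy. Reading the files in place is also possible.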