- Description: A longer introduction to pySAS on SciServer.
- Level: Beginner
- Data: XMM observation of NGC 3079 (obsid=0802710101)
- Requirements: Must be run using the HEASARCv6.33.1 image. Run in the (xmmsas) conda environment on SciServer. You should see (xmmsas) at the top right of the notebook. If not, click there and select (xmmsas).
- Credit: Ryan Tanner (April 2024)
- Support: XMM-Newton GOF Helpdesk
- Last verified to run: 1 May 2024, for SAS v21
This tutorial provides a much more detailed explanation of how to use pySAS than the one found in the Short pySAS Introduction, but like the Short Intro it only covers how to download observation data files, how to calibrate the data, and how to run any SAS task through pySAS. For explanations of how to use different SAS tasks inside of pySAS, see the example notebooks provided. A tutorial on how to learn to use SAS and pySAS for XMM analysis can be found in The XMM-Newton ABC Guide.
SAS tasks used in this tutorial:

- sasver (Documentation for sasver)
- startsas (Documentation for startsas)
- cifbuild (Documentation for cifbuild)
- odfingest (Documentation for odfingest)
- emproc (Documentation for emproc)
- epproc (Documentation for epproc)
- rgsproc (Documentation for rgsproc)
- omichain (Documentation for omichain)

Useful links:

- pysas Documentation
- pysas on GitHub
- Common SAS Threads
- Users' Guide to the XMM-Newton Science Analysis System (SAS)
- The XMM-Newton ABC Guide
- XMM Newton GOF Helpdesk - Link to form to contact the GOF Helpdesk.
When running this notebook inside Sciserver, make sure the HEASARC data drive is mounted when initializing the Sciserver compute container. See details here.
Running Outside Sciserver:
This notebook was designed to run on SciServer, but an equivalent notebook can be found on GitHub. You will need to install the development version of pySAS found on GitHub (pySAS on GitHub). There are installation instructions on GitHub, and example notebooks can be found inside the directory named 'examples'.
Let's begin by asking three questions:
- What XMM-Newton Observation data do I want to process?
- Which directory will contain the XMM-Newton Observation data I want to process?
- Which directory am I going to use to work with (py)SAS?
For the first question, you will need an Observation ID. In this tutorial we use the ObsID 0802710101.
For the second question, you will also have to choose a directory for your data (data_dir). You can set your data directory to any path you want, but for now we will use your scratch space.
For the third question, a working directory will automatically be created for each ObsID, as explained below. You can change this manually, but using the default is recommended.
import os
import pysas
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)
# To get your user name. Or you can just put your user name in the path for your data.
from SciServer import Authentication as auth
usr = auth.getKeystoneUserWithToken(auth.getToken()).userName
data_dir = os.path.join('/home/idies/workspace/Temporary/',usr,'scratch/xmm_data')
obsid = '0802710101'
By running the cell below, an Observation Data File (odf) object is created. By itself it doesn't do anything, but it has several helpful functions to get your data ready to analyse.
odf = pysas.odfcontrol.ODFobject(obsid)
When you run the cell below, the following things will happen:

1. odfcompile will check if data_dir exists, and if not it will create it.
2. Inside data_dir, odfcompile will create a directory named with the ObsID (i.e. $data_dir/0802710101/).
3. Inside of that, odfcompile will create two directories:

    a. $data_dir/0802710101/ODF, where the observation data files are kept.

    b. $data_dir/0802710101/work, where the ccf.cif, *SUM.SAS, and output files are kept.

4. odfcompile will automatically transfer the data for obsid to $data_dir/0802710101/ODF from the HEASARC archive.
5. odfcompile will run cifbuild and odfingest.
That is it! Your data is now calibrated and ready for use with all the standard SAS commands!
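The directory layout described above can be sketched in plain Python. This is only an illustration (pysas builds these paths for you, and the data_dir value here is a hypothetical SciServer scratch path):

```python
import os

# Hypothetical data directory; on SciServer this lives under your scratch space.
data_dir = '/home/idies/workspace/Temporary/myuser/scratch/xmm_data'
obsid = '0802710101'

# The directories odfcompile creates for this ObsID.
obs_dir = os.path.join(data_dir, obsid)   # $data_dir/0802710101/
odf_dir = os.path.join(obs_dir, 'ODF')    # raw observation data files
work_dir = os.path.join(obs_dir, 'work')  # ccf.cif, *SUM.SAS, and output files

print(odf_dir)
print(work_dir)
```

These are the same paths you can later retrieve from the odf object itself as odf.odf_dir and odf.work_dir.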
odf.odfcompile(data_dir=data_dir,repo='sciserver',overwrite=False)
If you need to include options for either or both cifbuild and odfingest, these can be passed to odfcompile using the inputs cifbuild_opts='Insert options here' and odfingest_opts='Insert options here'.
Another important input is overwrite=True/False. If set to True, it will erase all data, including any previous analysis output, in the ObsID directory (i.e. $data_dir/0802710101/) and download the original files again.
You can also choose the level of data products you download. If you set level=ODF, it will download the raw, uncalibrated data and recalibrate it. If you set level=PPS, it will download previously calibrated data products that can be used directly for analysis.
The odf object will also store some useful information for analysis. For example, it stores data_dir, odf_dir, and work_dir:
print("Data directory: {0}".format(odf.data_dir))
print("ODF directory: {0}".format(odf.odf_dir))
print("Work directory: {0}".format(odf.work_dir))
The location and name of important files are also stored in a Python dictionary in the odf object.
instrument_files = list(odf.files.keys())
print(instrument_files,'\n')
for instrument in instrument_files:
print(f'File Type: {instrument}')
print('>>> {0}'.format(odf.files[instrument]),'\n')
If you want more information on the function odfcompile, run the cell below to see the function documentation.
odf.odfcompile?
Now we are ready to execute any SAS task needed to analyze our data. To execute any SAS task within a notebook, we need to import from pysas a component known as Wrapper. The following cell shows how to do that:
from pysas.wrapper import Wrapper as w
Any SAS task accepts arguments, which can be either options (e.g. --version, which shows the task's version) or parameters with the format param=value. When the task is invoked from the command line, these arguments follow the name of the task. In a notebook, however, we have to pass them to the task in a different way: using a Python list, whose name you are free to choose. Let the name of this list be inargs.
To pass the option --version to the task to be executed, we must define inargs as,
inargs = ['--version']
To execute the task, we will use the Wrapper component imported earlier from pysas, as w (which is a sort of alias), as follows,
t = w('sasver', inargs)
In Python terms, t is an instantiation of the object Wrapper (or its alias w).
To run sasver (click here for sasver documentation), we can now do as follows:
t.run()
This output is equivalent to having run sasver on the command line with the argument --version.
Each SAS task, regardless of the task being a Python task or not, accepts a predefined set of options. To list which are these options, we can always invoke the task with option --help (or -h as well).
With sasver, as with some other SAS tasks, we can define inargs as an empty list, which is equivalent to running the task on the command line without options, like this:
inargs = []
t = w('sasver', inargs)
t.run()
That is indeed the desired output of the task sasver.
A similar result can be achieved by combining all the previous steps into a single expression, like this,
w('sasver', []).run()
The output of sasver provides useful information on which version of SAS is being run and which SAS environment variables are defined.
Note: It is important to always use [ ] when passing parameters to a task when using the wrapper, as parameters and options must be passed in the form of a list. For example, w('evselect', ['-h']).run() will execute the SAS task evselect with the option -h.
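The option-versus-parameter convention can be illustrated with a small helper. This build_inargs function is hypothetical (it is not part of pysas) and simply shows how a mix of options and param=value pairs maps onto the list the Wrapper expects:

```python
# Hypothetical helper (not part of pysas): build an inargs list from
# command-line-style options and param=value keyword arguments.
def build_inargs(options=None, **params):
    """Return a list of task arguments in the form the Wrapper expects."""
    args = list(options or [])
    args += ['{0}={1}'.format(key, value) for key, value in params.items()]
    return args

print(build_inargs(['-h']))
print(build_inargs(table='events.fits', withfilteredset='yes'))
```

The first call produces ['-h'], the second ['table=events.fits', 'withfilteredset=yes'], both ready to pass as the second argument of w(task, inargs).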
As noted earlier, we can list all options available to any SAS task with option --help (or -h),
w('sasver', ['-h']).run()
As explained in the help text shown here, if the task had any available parameters, we would get a listing of them immediately after the help text.
As shown in the text above, the task sasver has no parameters.
This depends on your experience level with SAS and what you are using the data for. For a tutorial on preparing and filtering your data for analysis or to make images see The XMM-Newton ABC Guide, or check out any of the example notebooks.
In the next cells we show how to run four typical SAS tasks, three procs and one chain, to process exposures taken with the EPIC pn and MOS instruments, the RGS, and the OM. You can run these SAS tasks to see what they do. Some of them may take some time to run.
os.chdir(odf.work_dir)
inargs = []
w('epproc', inargs).run()
The most common SAS tasks to run are: epproc, emproc, rgsproc, and omichain. Each one can be run without inputs (but some inputs are needed for more advanced analysis).
You can list all input arguments available to any SAS task with the option '--help' (or '-h'):
w('epproc', ['-h']).run()
Here is an example of how to apply a "standard" filter. This is equivalent to running the following SAS command:
evselect table=unfiltered_event_list.fits withfilteredset=yes \
    expression='(PATTERN <= 12)&&(PI in [200:4000])&&#XMMEA_EM' \
    filteredset=filtered_event_list.fits filtertype=expression keepfilteroutput=yes \
    updateexposure=yes filterexposure=yes
The input arguments should be in a list, with each input argument a separate string. Note: some inputs require single quotes to be preserved in the string. This can be done by using double quotes to form the string, i.e. "expression='(PATTERN <= 12)&&(PI in [200:4000])&&#XMMEA_EM'". An explanation of this filter, and other filters, can be found in The XMM-Newton ABC Guide.
unfiltered_event_list = "3278_0802710101_EMOS1_S001_ImagingEvts.ds"
inargs = ['table={0}'.format(unfiltered_event_list),
'withfilteredset=yes',
"expression='(PATTERN <= 12)&&(PI in [200:4000])&&#XMMEA_EM'",
'filteredset=filtered_event_list.fits',
'filtertype=expression',
'keepfilteroutput=yes',
'updateexposure=yes',
'filterexposure=yes']
w('evselect', inargs).run()
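The quoting rule above can be checked in plain Python, independently of SAS: wrapping the argument in double quotes keeps the single quotes as literal characters inside the string, so the task receives the expression value still quoted.

```python
# Double quotes form the Python string; the single quotes inside survive
# as literal characters that the SAS task will see.
expr_arg = "expression='(PATTERN <= 12)&&(PI in [200:4000])&&#XMMEA_EM'"

# Split only on the first '=' so the '<=' inside the value is untouched.
name, value = expr_arg.split('=', 1)
print(name)   # expression
print(value)  # '(PATTERN <= 12)&&(PI in [200:4000])&&#XMMEA_EM'
```

If the single quotes were dropped, the shell-level expansion performed when the task runs could mangle characters like & and #, which is why the inner quotes must be preserved.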
For convenience there is a function called basic_setup which will run odfcompile and then run both epproc and emproc. This copies the data into your personal data space, calibrates it, and runs two of the most common SAS tasks, all with a single command.
odf = pysas.odfcontrol.ODFobject(obsid)
odf.basic_setup(data_dir=data_dir,overwrite=False,repo='sciserver',rerun=False)
Running basic_setup(data_dir=data_dir,overwrite=False,repo='sciserver',rerun=True) is the same as running the following commands:
odf.odfcompile(data_dir=data_dir,overwrite=False,repo='sciserver')
w('epproc',[]).run()
w('emproc',[]).run()
Using the function odf.basic_setup with rerun=False will check if epproc or emproc have already been run and will not overwrite existing output files. If rerun=True, then previous output files will be ignored and overwritten. After running basic_setup there will be more files listed in the odf.files dictionary.
instrument_files = list(odf.files.keys())
print(instrument_files,'\n')
for instrument in instrument_files:
print(f'File Type: {instrument}')
print('>>> {0}'.format(odf.files[instrument]),'\n')
For more information see the function documentation.
odf.basic_setup?
If you want to just copy the raw data, and not do anything with it, you can use the function download_data. The function takes obsid and data_dir (both required) and copies the data from the HEASARC archive on SciServer. If the directory data_dir does not exist, it will create it. It will also create a subdirectory for the obsid. WARNING: This function will silently erase any prior data in the directory $data_dir/obsid/.
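Given that warning, a cautious pattern is to check whether the ObsID directory already exists before calling download_data. The check itself is plain Python (the data_dir value here is a hypothetical scratch path, and the actual download call is left commented out as a sketch):

```python
import os

obsid = '0802710101'
data_dir = '/home/idies/workspace/Temporary/myuser/scratch/xmm_data'  # hypothetical path
obs_dir = os.path.join(data_dir, obsid)

# download_data silently erases $data_dir/obsid/, so refuse to run if it exists.
if os.path.exists(obs_dir):
    print('Directory {0} already exists; skipping download.'.format(obs_dir))
else:
    print('Safe to download into {0}'.format(obs_dir))
    # pysas.odfcontrol.download_data(obsid, data_dir, repo='sciserver')
```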
pysas.odfcontrol.download_data(obsid,data_dir,repo='sciserver')