Python for SPARTA
This file contains some of the main routines of the Python analysis package for SPARTA.
Code examples
The main purpose of this module is to translate SPARTA output from its HDF5 format into a Python
structure of dictionaries and structured arrays. In this section, we explore some of the
functionality of the load() function through code examples. First, we need to import the
sparta module. We also set a default filename variable:
from sparta_tools import sparta
filename = '/Users/you/some/dir/sparta.hdf5'
Basic structure
We can now attempt to execute the load function. It always returns a dictionary:
dic = sparta.load(filename)
>>> sparta.load: Loading file /Users/you/some/dir/sparta.hdf5.
>>> sparta.load: Loading 25181 halos from SPARTA file...
print(dic.keys())
>>> dict_keys(['anl_prf', 'tcr_ptl', 'halos', 'config', 'simulation'])
If we execute the load function without any parameters besides the filename, the function loads
all data from the SPARTA file. For large files, this can take a long time or even exceed the
available memory! In the case above, the file contained a total of 25181 halos. Here, the term
“halo” means a branch in a merger tree, i.e., the history of a halo over time. This history begins
when the halo is first detected by the halo finder and ends when the halo disappears, generally
through merging into another, larger halo or because the simulation ends. Besides the halo data,
the file seems to contain particle tracer information (the tcr_ptl sub-dictionary) as well as
a density profile analysis (anl_prf). See the Introduction for an explanation of these
abbreviations.
In addition, the dictionary always contains the config and simulation sub-dictionaries,
which have the same content as the eponymous groups in the HDF5 file (see The SPARTA HDF5 output format).
Let’s explore their content a little:
for k in sorted(dic['config'].keys()):
    x = dic['config'][k]
    if not isinstance(x, dict):
        print('%-30s %-40s' % (k, str(x)))
>>> cat_halo_jump_tol_box 0.03
>>> cat_halo_jump_tol_phys 10.0
>>> ...
>>> snap_path b'/Users/you/snapdir_%04d/snapshot_%04d.%d'
>>> snap_sim_type 0
>>> tcr_ptl_create_radius 2.0
>>> tcr_ptl_delete_radius 3.0
The dictionary contains all user-defined configuration parameters (see Run-time configuration parameters). Note that we did not display the entries that are themselves dictionaries. Those contain the config parameters for particular Results or Analyses:
for k in sorted(dic['config'].keys()):
    x = dic['config'][k]
    if isinstance(x, dict):
        print(k)
>>> anl_prf
>>> res_oct
For example, for the orbit counting results that are contained in this example file:
for k in sorted(dic['config']['res_oct'].keys()):
    print('%-30s %-40s' % (k, str(dic['config']['res_oct'][k])))
>>> res_oct_max_norbit 3
Similarly, the simulation sub-dictionary contains information about the simulation that SPARTA
was run on:
for k in sorted(dic['simulation'].keys()):
    print('%-20s %-40s' % (k, str(dic['simulation'][k])))
>>> Omega_L 0.73
>>> Omega_m 0.27
>>> box_size 62.5
>>> ...
Loading specific halos
Warning
The order of the returned halo data is the order in which the halos appear in the
SPARTA file, not the order in which they are requested (if the halo_ids parameter is used).
This ordering also means that the order may vary between two otherwise identical SPARTA runs because the output ordering is non-deterministic. When comparing two files, one must always match the halo IDs.
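Because the returned order follows the file rather than the request, it can help to look up which returned entry belongs to each requested ID. The following is a minimal sketch on synthetic data, not part of the SPARTA package. It assumes that the ID history of the loaded halos is available as a 2D integer array with one row per halo and one column per snapshot; the name id_history, the use of -1 for snapshots where a halo does not exist, and the helper rows_for_ids are hypothetical.

```python
import numpy as np

# Hypothetical ID histories in file order: one row per halo, one column per
# snapshot. The value -1 (an assumption of this sketch) marks snapshots where
# the halo did not exist.
id_history = np.array([
    [-1, 205, 310],
    [101, 201, 301],
    [102, 203, -1],
])

# Catalog IDs as they might be passed via halo_ids (at any snapshot).
requested = np.array([203, 301])


def rows_for_ids(id_history, requested):
    """For each requested ID, return the row whose history contains it (-1 if none)."""
    rows = np.full(len(requested), -1, dtype=int)
    for i, halo_id in enumerate(requested):
        hits = np.nonzero(np.any(id_history == halo_id, axis=1))[0]
        if hits.size > 0:
            rows[i] = hits[0]
    return rows


rows = rows_for_ids(id_history, requested)
print(rows)
```

Indexing any per-halo array with rows then yields the data in the order of the request instead of the order of the file.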
Function documentation
| load() | Load the contents of a SPARTA HDF5 results file. |
| findHalos() | Find halos in a SPARTA file according to certain criteria, output them as an ID list. |
| matchAnalyses() | Find matches between the halo IDs of two sets of analyses and return the matched arrays. |
| haloIsHost() | Decide whether a halo was a host given a SPARTA status. |
| haloIsSub() | Decide whether a halo was a subhalo given a SPARTA status. |
| haloIsSubPermanently() | Decide whether a halo was a subhalo for more than one snapshot given a SPARTA status. |
| haloIsGhost() | Decide whether a halo was a ghost given a SPARTA status. |
- sparta_tools.sparta.load(filename=None, hdf5_file=None, halo_ids=None, halo_mask=None, load_halo_data=True, analyses=None, tracers=None, results=None, anl_match=None, anl_pad_unmatched=True, res_match=None, res_pad_unmatched=True, log_level=1)
Load the contents of a SPARTA HDF5 results file.
- Parameters
- filename: str
The path to the SPARTA file. Either this field or hdf5_file must not be None.
- hdf5_file: HDF5 file object
Sometimes multiple load operations need to be performed, in which case the user may prefer not to keep opening and closing the HDF5 file. If a valid file object is passed, that file object is used and the filename parameter is ignored.
- halo_ids: array_like
If this field is None, the results for all halos are loaded. If it contains the catalog IDs of one or multiple halos (at any snapshot!), only the results for those halos will be loaded. The order of the returned halos may not be the same as the order of this input list!
- halo_mask: array_like
If this field is None, the results for all halos are loaded (unless a selection is made with the halo_ids parameter instead). If not None, the parameter must be a numpy array with n_halos entries, where True means a halo is loaded. Such an array can, for example, be generated with the findHalos() function, and speeds up loading because the IDs do not have to be searched in the halo ID array.
- load_halo_data: bool
If True, the properties of halos (such as the histories of their ID, radius, and status) will be loaded, otherwise they will be omitted.
- analyses: array_like
If None, all analyses are loaded. Otherwise a list of analysis names to load, with the names corresponding to the abbreviated analysis names used in the sparta results file (e.g. “rsp”, see the introduction section of the documentation).
- tracers: array_like
If None, all tracers are loaded. Otherwise a list of tracer names to load, with the names corresponding to the abbreviated tracer names used in the sparta results file (e.g. “ptl” or “sho”, see the introduction section of the documentation).
- results: array_like
If None, all tracer results are loaded. Otherwise a list of result names to load, with the names corresponding to the abbreviated result names used in the sparta results file (e.g. “ifl” or “sbk”, see the introduction section of the documentation).
- anl_match: array_like
If None, no matching is performed. Otherwise, the parameter can be the name of one analysis (e.g. ‘rsp’) or a list of analyses (e.g. [‘rsp’]) for which matching is performed. This means that the halo and analysis arrays have the same dimension, i.e. that each halo is assigned exactly one analysis of each of the given types. If a halo has more than one analysis of a given type, the first one is returned. The anl_pad_unmatched parameter determines what happens when halos do not have an analysis.
- anl_pad_unmatched: bool
If we are matching analyses (see above), we can either discard halos that do not have the analyses in question (anl_pad_unmatched = False), or we can pad the analysis arrays with empty elements (anl_pad_unmatched = True). Such padded elements can easily be identified by their ID field, which is halo_id = -1.
- res_match: array_like
If None, no matching is performed. Otherwise, the parameter must be a list of results with at least two elements, e.g. res_match = ['ifl', 'sbk']. In that case, the listed results are matched by their tracer ID. Since all result arrays are sorted by tracer ID, the matched arrays will naturally be in the same order. The res_pad_unmatched parameter determines what happens to results that do not have a counterpart. If a tracer does not have one of the matched results, no matching is performed and a warning is output.
- res_pad_unmatched: bool
If matching results (see above), we can either discard (res_pad_unmatched = False) all unmatched results (i.e. results from tracers that do not have all of the matched types), or we can keep them (res_pad_unmatched = True). In the latter case, in order to maintain the synched array ordering of the result arrays, we need to insert void records where results are missing. The void records can easily be identified by their tracer_id = -1 value.
- log_level: int
If zero, no output is generated. One is the default level of output; greater numbers lead to very detailed output, including timing information.
- Returns
- dic: dictionary
A dictionary containing essentially the same file structure as the HDF5 file, depending on the options chosen.
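The padding convention described for anl_pad_unmatched can be illustrated with a small sketch on synthetic data. Only the halo_id field and its padding value of -1 come from the parameter description above; the array contents and the value field are hypothetical stand-ins, and the array is built by hand rather than returned by load().

```python
import numpy as np

# Hypothetical analysis array as it might look after matching with
# anl_pad_unmatched = True. Only 'halo_id' is taken from the documentation;
# 'value' is a stand-in for whatever fields the analysis provides.
anl = np.array(
    [(11, 1.5), (-1, 0.0), (13, 2.5)],
    dtype=[('halo_id', 'i8'), ('value', 'f8')],
)

# Padded (empty) elements are identified by halo_id = -1.
valid = anl['halo_id'] != -1

print(valid)
print(anl['value'][valid])
```

Since matched halo and analysis arrays have the same dimension, the same boolean mask could also be applied to the corresponding halo array. Void result records could presumably be filtered in the same way using tracer_id.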
- sparta_tools.sparta.findHalos(filename=None, hdf5_file=None, cuts=[], log_level=1)
Find halos in a SPARTA file according to certain criteria, output them as an ID list.
The result of this function can be used as input to other functions such as load(). By default, the function tries to find the quantity passed in each cut in the structured halo array in the SPARTA file. Some quantities are automatically generated, namely M200m (from R200m) and N200m (from M200m). Each cut dictionary must contain the following entries:
q: the identifier of the quantity to be cut on, e.g. R200m
min: the minimum value of this quantity
max: the maximum value of this quantity
Optional parameters:
a / t / z / snap: the time at which the cut is considered
a_max / t_max / z_max / snap_max: if passed, make a cut between t and t_max
A special case is a cut on the halo status, that is, whether a halo was a host or subhalo or ghost. In that case, the keyword include must be in the dictionary, and contain a list of statuses to exclude which can be hosts, subs, or ghosts.
- Parameters
- filename: str
The path to the SPARTA file. Either this field or hdf5_file must not be None.
- hdf5_file: HDF5 file object
Sometimes multiple load operations need to be performed, in which case the user may prefer not to keep opening and closing the HDF5 file. If a valid file object is passed, that file object is used and the filename parameter is ignored.
- cuts: array_like
A list of dictionaries, where each entry corresponds to a cut.
- log_level: int
Output level
- Returns
- ids: array_like
A list of halo IDs.
- mask: array_like
A boolean array of dimension n_halos which can be used to speed up the load() function.
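To make the cut format and the two return values more concrete, here is a conceptual sketch on synthetic data. It is not the implementation of findHalos(): the numerical values are placeholders with unspecified units, the stand-in arrays halo_id and R200m are hypothetical, and treating min and max as inclusive bounds is an assumption.

```python
import numpy as np

# A cut list in the format described above: R200m between a minimum and a
# maximum at redshift zero. Values and units are placeholders.
cuts = [{'q': 'R200m', 'min': 100.0, 'max': 500.0, 'z': 0.0}]

# Stand-in data: one ID and one value of the quantity per halo at the
# chosen time.
halo_id = np.array([7, 8, 9, 10])
quantities = {'R200m': np.array([50.0, 150.0, 480.0, 900.0])}

# Combine all cuts into a single boolean mask with n_halos entries.
# Inclusive bounds are an assumption of this sketch.
mask = np.ones(len(halo_id), dtype=bool)
for cut in cuts:
    q = quantities[cut['q']]
    mask &= (q >= cut['min']) & (q <= cut['max'])

# The selected IDs correspond to the True entries of the mask.
ids = halo_id[mask]

print(mask)
print(ids)
```

In this picture, ids plays the role of the returned ID list and mask the role of the boolean array that can be passed to load() as halo_mask.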
- sparta_tools.sparta.matchAnalyses(anl1, anl2)
Find matches between the halo IDs of two sets of analyses and return the matched arrays.
The order of halos in SPARTA arrays is, essentially, random because it depends on the processes to which halos are assigned, and thus the computing architecture. When comparing the results of two SPARTA runs, we must match the halo IDs. This function returns matched arrays of analyses. Their size may or may not be equal to the size of the input arrays, depending on whether all analyses have matches.
- Parameters
- anl1: structured array
A set of halo analyses as returned by the load() function. For example, if dic was returned by the load() function, the profiles analysis (if it exists) can be found in dic['anl_prf'], which is a structured array that can serve as input to this function.
- anl2: structured array
See above, for a second SPARTA file.
- Returns
- anl1_matched: structured array
The analyses in anl1 that have matches in anl2.
- anl2_matched: structured array
The analyses in anl2 that have matches in anl1, in the same order as the returned anl1_matched.
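The idea of matching two analysis arrays by halo ID can be sketched with plain numpy. This is an illustration of the concept on synthetic data, not the implementation of matchAnalyses(): it uses np.intersect1d, assumes that each halo ID appears at most once per array, and orders the output by ID, which may differ from the order the library returns. The value field is a hypothetical stand-in.

```python
import numpy as np

dtype = [('halo_id', 'i8'), ('value', 'f8')]

# Two hypothetical analysis arrays from different runs, in different order
# and with partially different halos.
anl1 = np.array([(30, 3.0), (10, 1.0), (20, 2.0)], dtype=dtype)
anl2 = np.array([(20, 2.2), (40, 4.4), (30, 3.3)], dtype=dtype)

# Find the IDs present in both arrays and their positions in each.
common, idx1, idx2 = np.intersect1d(
    anl1['halo_id'], anl2['halo_id'], return_indices=True
)

# Both matched arrays now refer to the same halos, element by element.
anl1_matched = anl1[idx1]
anl2_matched = anl2[idx2]

print(anl1_matched['halo_id'])
print(anl2_matched['halo_id'])
```

Halos 10 and 40 have no counterpart and are dropped, so the matched arrays are shorter than the inputs.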
- sparta_tools.sparta.haloIsHost(status)
Decide whether a halo was a host given a SPARTA status.
This function refers to the status field output in the halos group in the SPARTA output file, or the sparta_status field in MORIA output files. The final_status fields take on different meanings and cannot be evaluated with this function.
- Parameters
- status: array_like
One integer or a numpy array of integers indicating a SPARTA status.
- Returns
- is_host: array_like
Boolean number or array with the same dimensions as status, True if the status indicates that a halo was a host.
- sparta_tools.sparta.haloIsSub(status)
Decide whether a halo was a subhalo given a SPARTA status.
This function refers to the status field output in the halos group in the SPARTA output file, or the sparta_status field in MORIA output files. The final_status fields take on different meanings and cannot be evaluated with this function.
- Parameters
- status: array_like
One integer or a numpy array of integers indicating a SPARTA status.
- Returns
- is_sub: array_like
Boolean number or array with the same dimensions as status, True if the status indicates that a halo was a subhalo.
- sparta_tools.sparta.haloIsSubPermanently(status)
Decide whether a halo was a subhalo for more than one snapshot given a SPARTA status.
The distinction whether a halo is a subhalo for one or multiple snapshots may seem insignificant, but SPARTA treats fly-through events where a halo is a sub for only one snapshot somewhat differently.
This function refers to the status field output in the halos group in the SPARTA output file, or the sparta_status field in MORIA output files. The final_status fields take on different meanings and cannot be evaluated with this function.
- Parameters
- status: array_like
One integer or a numpy array of integers indicating a SPARTA status.
- Returns
- is_sub_permanently: array_like
Boolean number or array with the same dimensions as status, True if the status indicates that a halo was a subhalo for more than one snapshot.
- sparta_tools.sparta.haloIsGhost(status)
Decide whether a halo was a ghost given a SPARTA status.
This function refers to the status field output in the halos group in the SPARTA output file, or the sparta_status field in MORIA output files. The final_status fields take on different meanings and cannot be evaluated with this function.
- Parameters
- status: array_like
One integer or a numpy array of integers indicating a SPARTA status.
- Returns
- is_ghost: array_like
Boolean number or array with the same dimensions as status, True if the status indicates that a halo was a ghost.