Console output¶
SPARTA outputs information about its progress to the console, but interpreting this output can be non-trivial. This document gives some output examples and explains their meaning.
Startup¶
At the beginning, SPARTA outputs information about the compile-time and run-time configuration:
--------------------------------------------------------------------------------------------
[Main] COMPILER SETTINGS
--------------------------------------------------------------------------------------------
[Main] User-defined output settings
[Main] OUTPUT_TRACER_PARTICLES 0
[Main] OUTPUT_TRACER_SUBHALOS 1
[Main] OUTPUT_RESULT_INFALL 1
[Main] OUTPUT_RESULT_INFALL_SMR 0
...
[Main] Actions to be performed
[Main] DO_ANALYSIS_ANY 1
[Main] DO_ANALYSIS_RSP 1
[Main] DO_ANALYSIS_PROFILES 0
[Main] DO_ANALYSIS_PROFILES_ALL 0
...
The OUTPUT_* switches are set by the user (see Compiling SPARTA); the DO_* switches are worked out by the code itself. They indicate the actions that will actually be performed; many actions are switched on internally because certain output fields require them. The second section gives the full run-time configuration, whether set by the user or left at its default:
--------------------------------------------------------------------------------------------
[Main] RUN-TIME CONFIGURATION
--------------------------------------------------------------------------------------------
[Main] Input (halo catalogs)
[Main] catalog_path /some/dir/catalogs
[Main] cat_type 0
[Main] cat_n_chunks 1
[Main] catalog field id id
[Main] catalog field desc_id desc_id
...
Snapshots¶
The following examples assume the default log_level
of 1 (see Run-time configuration parameters). If this
level is increased, there is additional output, but that is mostly designed for debugging
purposes. In some of the output, we have removed the leading “[Main] [SN 28]” indicators to avoid
line breaking. Each snapshot’s output is split into sections, starting with basic information:
[Main] [SN 28] Snapshot: 29/30, file 057, a = 0.20416, z = 3.89812, t = 1.64698
In this case, the snapshot index (28) does not match up with the file index (57), which is allowed. The time is measured since the Big Bang, like all times in SPARTA. The second section gives us an idea of how many halos the code is tracking:
[Main] [SN 28] Halos (alive): 9249 total, 8601 connected, 766 new, 8986 hosts (97.2 %), 263 subs (2.8 %)
[Main] [SN 28] Halos (ended): 113 total, 85 discarded, 28 saved (178 cumulative)
The live halos were either connected from a previous snapshot or newly created. Halo results are written to file as soon as a halo ends in order to free memory as early as possible. However, not all halos are written to file, for example if they never reached a critical mass. The number of halos actually saved is given by the cumulative number at the end.
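If you want to scrape these numbers from a log file, a minimal Python sketch could look like the following (the regular expressions assume exactly the line format shown above and may need adjusting for other log levels or code versions; the function and field names are chosen here purely for illustration):
import re

# Minimal sketch for scraping the per-snapshot header and halo counts from a
# SPARTA log file. The regular expressions mirror the example lines above.
snap_re = re.compile(
    r"\[SN\s*(\d+)\] Snapshot: (\d+)/(\d+), file (\d+), "
    r"a = ([\d.]+), z = ([\d.]+), t = ([\d.]+)")
alive_re = re.compile(
    r"Halos \(alive\): (\d+) total, (\d+) connected, (\d+) new, "
    r"(\d+) hosts .*?, (\d+) subs")

def parse_log(path):
    """Return a list of dicts, one per snapshot, with basic halo statistics."""
    snapshots = []
    with open(path) as f:
        for line in f:
            m = snap_re.search(line)
            if m:
                snapshots.append({"index": int(m.group(1)), "a": float(m.group(5)),
                                  "z": float(m.group(6)), "t": float(m.group(7))})
            m = alive_re.search(line)
            if m and snapshots:
                snapshots[-1]["n_halos"] = int(m.group(1))
                snapshots[-1]["n_hosts"] = int(m.group(4))
                snapshots[-1]["n_subs"] = int(m.group(5))
    return snapshots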
The following output lines give more detailed statistics of the tracers SPARTA is currently tracking. Each type of tracer (particles and subhalos in this case) gets its own set of lines. First, the particle tracer lines will typically look something like this:
[Main] [SN 28] Tracer ptl:
Active 857149 (5.1 %), ignoring 385042 (2.3 %), processed 1404859 (8.4 %), deleted 81300 (0.5 %)
Of proc'd: new 173086 (12.3 %), rec'd 0 (0.0 %), ign 304779 (21.7 %), tr 734640 (52.3 %), n-tr 192354 (13.7 %)
Of del'td: host-sub 24948 (30.7 %), escpd 151 (0.2 %), unconn 14276 (17.6 %), all done 41925 (51.6 %)
Subhalo tags: 9941, ignored 1017, created 1368, result 2303
Here, active means that SPARTA is actively tracking these tracers, that is, there is a tracer object in memory and we will try to connect this tracer in the next snapshot. To avoid excessive tracking, SPARTA keeps a list of tracers to ignore, for example particles that have already orbited over a halo and for which all analyses have been finished. The number of processed tracers is the total number of tracers that were considered in this snapshot.
For particle tracers, the percentages in the first line refer to the total number of particles in the simulation. It is not uncommon for SPARTA to track a significant fraction of all particles! The second line tells us what happened to all processed particle tracers, with percentages relative to the processed total (a short consistency check of this breakdown follows the lists below), namely whether:
a new tracer was created
a tracer was recreated, meaning it had been a tracer before, was deleted, but has come back and needs to be analyzed again
a tracer was ignored, typically because it was found on the ignore list
a tracer simply continued from one snapshot to the next (“tracked”)
a tracer did not yet exist and was not created, for whatever reason (“not tracked”).
The following line tells us why particle tracers were deleted, with percentages relative to the deleted total:
because the halo became a subhalo (what exactly happens to particle tracers in this case depends on the settings)
because they escaped, meaning they strayed too far from the halo
because they could not be found in the halo in the next snapshot (“unconn”; typically this also means the particle left the halo)
because all their tracer results had finished (“all done”).
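As a quick sanity check, these breakdown lines are internally consistent: the “Of proc'd” and “Of del'td” categories add up to the processed and deleted totals given in the first line, and the percentages are fractions of those totals. A minimal sketch in Python, using the numbers from the example output above:
# Consistency check using the example numbers above: the "Of proc'd"
# categories sum to the processed total and the "Of del'td" categories sum
# to the deleted total; the percentages in those lines are fractions of
# these sums.
processed = {"new": 173086, "rec'd": 0, "ign": 304779, "tr": 734640, "n-tr": 192354}
deleted = {"host-sub": 24948, "escpd": 151, "unconn": 14276, "all done": 41925}

assert sum(processed.values()) == 1404859    # "processed" in the first line
assert sum(deleted.values()) == 81300        # "deleted" in the first line

for label, n in processed.items():
    # Reproduces the percentages 12.3, 0.0, 21.7, 52.3, 13.7 from the example.
    print(f"{label:10s} {100.0 * n / sum(processed.values()):5.1f} %")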
The final line relating to particle tracers tells us how many particles were tagged as belonging to a subhalo. Some of these tags were ignored because the particle was on the ignore list; in some cases a tracer was created due to the tag; and in some cases an existing result was altered to incorporate the tag information (“result”). These are technical distinctions and typically not important.
The next lines tell us about the statistics of subhalo tracers:
[Main] [SN 28] Tracer sho:
Active 429, new 130, deleted 93
Of del'td: host-sub 14 (15.1 %), escpd 0 (0.0 %), unconn 57 (61.3 %), all done 22 (23.7 %)
The fields have the same meaning as for particle tracers, but there are fewer scenarios. As tracers are analyzed, each halo accumulates tracer results:
[Main] [SN 28] Results:
tcr res current sn-new cm-new sn-fld cm-fld sn-svd cm-svd fld-f svd-f
ptl ifl 842775 97395 929445 0 0 0 0 0.000 0.000
ptl sbk 108471 17222 111760 4737 36875 0 0 0.248 0.000
sho ifl 944 130 1036 0 0 10 56 0.000 0.054
sho sbk 20 5 20 2 8 0 0 0.286 0.000
There is one line for each combination of tracer and result type. The columns are (the last two follow from the cumulative counters, as the short sketch after this list shows):
current: results currently kept in memory
sn-new: results that were successfully created during this snapshot
cm-new: cumulative sum of all results that have been successfully created
sn-fld: number of results that failed during this snapshot
cm-fld: cumulative number of all failed results
sn-svd: results saved to file during this snapshot
cm-svd: cumulative number saved to file
fld-f: fraction of total events that failed
svd-f: fraction of successful results that were saved to file (not all halos are saved).
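A minimal sketch in Python that reproduces the fld-f and svd-f values of the “ptl sbk” and “sho ifl” rows in the example table (the formulas are inferred from the column definitions above):
# fld-f and svd-f from the cumulative counters of a single table row.
def fractions(cm_new, cm_fld, cm_svd):
    fld_f = cm_fld / (cm_new + cm_fld)   # fraction of all events that failed
    svd_f = cm_svd / cm_new              # fraction of successful results saved
    return fld_f, svd_f

print(fractions(cm_new=111760, cm_fld=36875, cm_svd=0))   # ptl sbk row -> (0.248..., 0.0)
print(fractions(cm_new=1036, cm_fld=0, cm_svd=56))        # sho ifl row -> (0.0, 0.054...)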
We now move from statistics about the objects SPARTA is tracking to diagnostics of efficiency, memory, and timing:
[Main] [SN 28] Efficiency: Volume ratio 1.04 [1.03..1.04], snap files read 4 [4..4]
[Main] [SN 28] Exchanged 17 halos, 0 sub pointers, 1942 particles, 0 subhalos
One indicator of the efficiency of the domain decomposition is how large a volume the average process needs to consider compared to its share of the box. In an ideal world, that number would be close to unity (i.e., if the box were cleanly split between processes). In reality, and depending on the file format, the volume ratio can be significantly larger than one because the volumes around halos overlap, or because the files are not split into rectangular volumes. The number of snapshot files read per process tells us how many files the particles on a given process were distributed over.
The second line appears if domain decomposition is turned on. The number of exchanged halos should always be much smaller than the total number, otherwise the domain decomposition is failing.
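As a purely illustrative sketch of what the volume ratio measures (this is not SPARTA's actual computation, and the numbers below are made up to reproduce the 1.04 in the example line), the ratio compares the volume a process must consider to an equal share of the box:
# Hypothetical illustration of the volume-ratio diagnostic.
def volume_ratio(process_volume, box_volume, n_proc):
    # Volume this process must consider, relative to an equal share of the box.
    return process_volume / (box_volume / n_proc)

# A made-up process that covers 130 (Mpc/h)^3 of a 500 (Mpc/h)^3 box shared
# by 4 processes: 130 / 125 = 1.04, as in the example line above.
print(volume_ratio(130.0, 500.0, 4))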
We now move on to timing. The following line contains basic timing information:
[Main] [SN 28] Timing: Total 3.6, balance 1.40 (indep. 2.3, idle 0.3)
The total number tells us how long it took SPARTA to complete this snapshot. Note that this time
can vary drastically, from almost zero (if only halos but no particles are considered, for example)
to ~hours for heavy computations. If log_level_timing
is set higher than zero, we also get a
much more detailed accounting of where SPARTA spent its time during this snapshot. The statistics
refer to the process that took the longest overall, but if the run is reasonably well balanced,
they should be representative of the other processes as well:
[Main] [SN 28] On max. proc (003) Time (s) %
[Main] [SN 28] ------------------------------------------------------------
[Main] [SN 28] Reading catalog (main) 0.1 4.0
[Main] [SN 28] Exchange 0.0 1.3
[Main] [SN 28] Reading snapshots 0.7 37.0
[Main] [SN 28] Process snap (indep.) 1.0 57.0
[Main] [SN 28] Building tree 0.8 45.9
[Main] [SN 28] Tree searches 0.0 1.9
[Main] [SN 28] Particle preparation 0.1 3.9
[Main] [SN 28] Tracer results 0.0 1.6
[Main] [SN 28] Tracer ptl result sbk 0.0 1.3
[Main] [SN 28] Analyses 0.0 1.3
[Main] [SN 28] Analysis hps 0.0 1.3
[Main] [SN 28] ------------------------------------------------------------
[Main] [SN 28] Timing total 1.8 100.0
The listing contains only fields that took up at least 1% of the total run time during this
snapshot. By setting log_level_timing
to 2, you can get the full listing, but that will
generally not be more informative than the reduced listing (unless you want to know exactly how
much time a less-relevant action took up).
The following section tells us about the memory consumption of SPARTA, which is extremely important because excessive memory consumption can cause a job to crash. Memory is called persistent in this context if it persists between snapshots:
[Main] [SN 28] Pers. memory: Total 328 MB (82 per proc), halos 83 (25.3 %), tracers ptl 122 (37.2 %)
[Main] [SN 28] Results ifl 37 (11.4 %) sbk 4 (1.3 %)
The output gives a rough indication of which fields are responsible for the bulk of the memory consumption (for more detailed output, see below). In particular, the halos fraction refers to the actual halo objects, but does not include their tracers, which are listed separately. Tracer results are yet another category.
However, we care not only about how much memory is allocated for tracers and results, but also about how efficiently it is used. Since the large arrays in SPARTA are typically dynamically allocated, they carry some unused memory to avoid frequent re-allocations. The next few lines tell us the fraction of memory in the dynamic arrays that is actually used:
[Main] [SN 28] Memory eff.: Tracer ptl 0.697, ignore list 0.374, results: ifl 0.696 sbk 0.597
[Main] [SN 28] Tracer sho 0.014, ignore list 0.014, results: ifl 0.075 sbk 0.053
[Main] [SN 28] Analyses: rsp --- hps 1.000
We see that the large particle and particle result arrays are filled to about 70% in this particular run. The subhalo arrays are less efficient, but they are also very small in comparison. The analysis arrays are allocated perfectly (in the case of hps) because there is exactly one per halo, and no splashback analyses have been allocated yet because they are computed at the end of a halo’s lifetime.
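As a generic illustration of why this efficiency stays below one (this is not SPARTA's actual allocator; the initial capacity and growth factor below are made up), a dynamically grown array always carries some unused capacity:
# Generic illustration: an array that grows its capacity by a fixed factor
# always has some unused slots, so the reported efficiency (used / allocated)
# is below one.
def grown_capacity(n_used, initial=16, growth=1.5):
    cap = initial
    while cap < n_used:
        cap = int(cap * growth)
    return cap

n_used = 857149                             # e.g., the active particle tracers above
cap = grown_capacity(n_used)
print(f"efficiency = {n_used / cap:.3f}")   # < 1, analogous to the 0.697 reported above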
Finally, we may want a more detailed accounting of exactly which parts of the code consume the bulk
of the memory. Depending on the log_level_memory
parameter, the final lines give us an
impression of the memory consumption of SPARTA:
[Main] [SN 28] Memory: Total current 0.3 GB (82.0 MB per proc), peak 1.1 GB (272.9 MB per proc)
[Main] [SN 28] Overall peak was 312925148, 298.4 MB on proc 3
[Main] [SN 28] Memory ID Description At overall peak: Byte MB Fraction
[Main] [SN 28] DA_HALO Halo objects 24832000 23.7 7.9 %
[Main] [SN 28] DA_TRACER Tracer objects 34080800 32.5 10.9 %
[Main] [SN 28] DA_RESIFL Infall result objects 10175360 9.7 3.3 %
[Main] [SN 28] DA_ALHPS Haloprops analysis objects 20228208 19.3 6.5 %
[Main] [SN 28] PARTICLES Raw particle data 182641248 174.2 58.4 %
[Main] [SN 28] PARTICLEMESSAGES Particle exchange packets 36458368 34.8 11.7 %
There is an important distinction between the memory currently occupied (which contains only objects that persist between snapshots, since this output refers to the end of the snapshot), and peak memory consumption, which would be responsible for out-of-memory crashes. As the latter is typically more relevant, the output focuses on peak consumption and tells us
the total peak and how much that would be per process if the memory were shared equally
what the actual peak on one process was, and which process
which fields occupied significant memory at that peak.
For example, the output above tells us that, at peak, raw particle data read from snapshot files and particle-data messages between processes were responsible for about 70% of SPARTA’s memory consumption.
Note that the peak memory consumption refers to the peak at any time, not necessarily at the given snapshot. Typically, the memory consumption rises with time as more and more halos form. However, in some cases, the load balancing algorithm shifts memory away from the peak-consumption process and the peak decreases for one or a few snapshots. In that case, the above lines will simply be repeated at the following snapshot(s).
Final output¶
Once all snapshots have been processed, SPARTA will (depending on the settings) output final timing and memory information, e.g.:
[Main] All times are in seconds.
[Main] Task Time Per snap Level % Total %
[Main] --------------------------------------------------------------------------
[Main] Initialization 0.6 0.3
[Main] Snapshots 165.4 99.5
[Main] Statistics 0.0 0.0 0.0 0.0
[Main] Connection requests 0.0 0.0 0.0 0.0
[Main] Reading catalog (main) 4.9 0.1 2.9 2.9
[Main] Connections 0.2 0.0 0.1 0.1
[Main] Load balancing 12.7 0.2 7.7 7.6
[Main] Exchange 4.6 0.1 2.8 2.8
[Main] Halo box 0.4 0.0 0.2 0.2
[Main] Reading snapshots 39.4 0.7 23.9 23.7
[Main] Process snap (indep.) 84.7 1.4 51.3 51.0
[Main] Building tree 57.2 1.0 68.2 34.4
[Main] Sorting 2.2 0.0 2.6 1.3
[Main] Tagging subhalos 4.5 0.1 5.4 2.7
[Main] Halo loop 0.0 0.0 0.0 0.0
[Main] Tree searches 4.8 0.1 5.7 2.9
[Main] Mass profile 0.2 0.0 0.2 0.1
[Main] Particle preparation 8.4 0.1 10.0 5.0
[Main] Subhalo preparation 0.0 0.0 0.0 0.0
[Main] Tracer results 2.8 0.0 3.3 1.7
[Main] Tracer ptl result ifl 0.4 0.0 15.7 0.3
[Main] Tracer ptl result sbk 2.3 0.0 84.3 1.4
[Main] Mark ending tracers 0.2 0.0 0.2 0.1
[Main] Particle post-processing 0.0 0.0 0.0 0.0
[Main] Subhalo post-processing 0.0 0.0 0.0 0.0
[Main] Tracer ignore lists 0.1 0.0 0.1 0.1
[Main] Delete tracers 0.2 0.0 0.3 0.1
[Main] Find ended halos 0.0 0.0 0.0 0.0
[Main] Analyses 3.3 0.1 4.0 2.0
[Main] Analysis rsp 0.6 0.0 18.6 0.4
[Main] Analysis hps 2.7 0.0 81.4 1.6
[Main] Idle 17.8 0.3 10.7 10.7
[Main] Write results 0.3 0.0 0.2 0.2
[Main] Delete ended halos 0.3 0.0 0.2 0.2
[Main] Write restart files 0.0 0.0 0.0 0.0
[Main] Finalize 0.2 0.1
[Main] --------------------------------------------------------------------------
[Main] Total 166.2 100.0
Unlike the timing information output at each snapshot, this contains a full listing of all active parts of the code (but not parts that were excluded at compile time). The level percentage refers to the fraction of the time taken by the next-higher (less indented) task. For example, all level percentages under “Process snap” sum to 100%, as do those of the individual tracer results. The final column shows the fraction of the total runtime that each part of the code occupied. In this case, the main takeaway is that loading particles and building the particle tree took up about 60% of the code’s runtime (a fairly typical situation).
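As a small worked example of how the two percentage columns relate (using the “Reading snapshots” row from the listing above; small differences to the printed values are due to the rounding of the times shown):
# Level % is relative to the parent task, Total % to the overall run time.
time_reading = 39.4     # "Reading snapshots"
time_parent = 165.4     # its parent, "Snapshots"
time_total = 166.2      # "Total"

print(f"level % = {100 * time_reading / time_parent:.1f}")   # ~23.8 (printed: 23.9)
print(f"total % = {100 * time_reading / time_total:.1f}")    # ~23.7 (printed: 23.7)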
The final output is a full accounting of SPARTA’s memory consumption:
Memory: Peak total consumption was 2.5 GB (634.6 MB per proc).
Overall peak was 751869780, 717.0 MB on proc 2. Memory consuming fields:
---------------------------------------------------------------------------------------------------------
| Memory ID | Description | Highest consumption | At overall peak |
| | | Byte MB | Byte MB Fraction |
---------------------------------------------------------------------------------------------------------
| DA_HALO | Halo objects | 48896000 46.6 | 48896000 46.6 6.5 % |
| DA_TRACER | Tracer objects | 178834032 170.5 | 171008552 163.1 22.7 % |
| DA_RESIFL | Infall result objects | 65665920 62.6 | 62944640 60.0 8.4 % |
| DA_RESSBK | Splashback result objects | 17882880 17.1 | 16944960 16.2 2.3 % |
| DA_ALRSP | Rsp analysis objects | 19035712 18.2 | 0 0.0 0.0 % |
| DA_ALHPS | Haloprops analysis objects | 47311264 45.1 | 46366320 44.2 6.2 % |
| DA_INT | Integer lists | 16000 0.0 | 0 0.0 0.0 % |
| DA_SUBPOINTER | Subhalo lists | 49056 0.0 | 47328 0.0 0.0 % |
| DA_ID | Ignored ID lists | 10168800 9.7 | 9722400 9.3 1.3 % |
| BOXES | Snapshot file boundaries | 38400 0.0 | 38400 0.0 0.0 % |
| HDF5BUFFER | HDF5 read/write buffer | 28360 0.0 | 0 0.0 0.0 % |
| CATREADING | Halo catalog read/exchange | 422776 0.4 | 0 0.0 0.0 % |
| SNAPREADBUF | Snapshot read buffer | 13374192 12.8 | 987932 0.9 0.1 % |
| PARTICLES | Raw particle data | 344000448 328.1 | 308899424 294.6 41.1 % |
| PARTICLEMESSAGES | Particle exchange packets | 92870656 88.6 | 86013824 82.0 11.4 % |
| OUTPUTBUF | Output buffers | 51559980 49.2 | 0 0.0 0.0 % |
| EXCHANGEBUF | Exchange buffers | 17152800 16.4 | 0 0.0 0.0 % |
| TREE | Tree calculations | 65879304 62.8 | 0 0.0 0.0 % |
| TREEPOT | Tree potential | 123584 0.1 | 0 0.0 0.0 % |
| DOMAINDECOMP | Domain decomposition | 42416 0.0 | 0 0.0 0.0 % |
| HALOLOGIC | Halo/subhalo handling | 191352 0.2 | 0 0.0 0.0 % |
| TRACERHANDLING | Tracer handling | 25834 0.0 | 0 0.0 0.0 % |
| TCRPTL | Tracer: particles | 718128 0.7 | 0 0.0 0.0 % |
| RESSBK | Result: splashback | 24 0.0 | 0 0.0 0.0 % |
| ANLRSP | Analysis: Rsp | 267266 0.3 | 0 0.0 0.0 % |
| ANLHPS | Analysis: Halo properties | 276436 0.3 | 0 0.0 0.0 % |
---------------------------------------------------------------------------------------------------------
Again, here we include all memory categories that contributed any memory consumption, even if only a small fraction. The “highest consumption” columns refer to the peak consumption in each field, whereas the columns on the right refer to the consumption at the overall peak on any one core. The former give us a feeling for which fields are the important consumers, whereas the latter allow us to diagnose the causes of possible out-of-memory situations. The two listings can be quite different; note, for example, that the various output and exchange buffers can be significant memory consumers, but they are zero at peak because the peak occurred during snapshot loading rather than during the exchange or output steps.