Console output

SPARTA outputs information about its progress to the console, but interpreting this output can be non-trivial. This document gives some output examples and explains their meaning.

Startup

At the beginning, SPARTA outputs information about the compile-time and run-time configuration:

--------------------------------------------------------------------------------------------
[Main] COMPILER SETTINGS
--------------------------------------------------------------------------------------------
[Main] User-defined output settings
[Main]     OUTPUT_TRACER_PARTICLES                            0
[Main]     OUTPUT_TRACER_SUBHALOS                             1
[Main]     OUTPUT_RESULT_INFALL                               1
[Main]     OUTPUT_RESULT_INFALL_SMR                           0
...
[Main] Actions to be performed
[Main]     DO_ANALYSIS_ANY                                    1
[Main]     DO_ANALYSIS_RSP                                    1
...

The OUTPUT_* switches are set by the user (see Compiling SPARTA), whereas the DO_* switches are derived by the code itself. They indicate which actions the code will actually perform; many actions are switched on internally because certain output fields require them. The second section gives the full run-time configuration, whether set by the user or left at its default:

--------------------------------------------------------------------------------------------
[Main] RUN-TIME CONFIGURATION
--------------------------------------------------------------------------------------------
[Main] Input (halo catalogs)
[Main]     catalog_path                  /some/dir/catalogs
[Main]     cat_type                      0
[Main]     cat_n_chunks                  1
[Main]     catalog field id              id
[Main]     catalog field desc_id         desc_id
...

Snapshots

The following examples assume the default log_level of 1 (see Run-time configuration parameters). If this level is increased, there is additional output, but that is mostly designed for debugging purposes. In some of the output, we have removed the leading “[Main] [SN 28]” indicators to avoid line breaking. Each snapshot’s output is split into sections, starting with basic information:

[Main] [SN  28] Snapshot:       29/30, file 057, a = 0.20416, z = 3.89812, t = 1.64698

In this case, the snapshot index (28) does not match up with the file index (57), which is allowed. The time is measured since the Big Bang, like all times in SPARTA. The second section gives us an idea of how many halos the code is tracking:
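As a quick consistency check, the scale factor and redshift on this line obey the standard relation z = 1/a − 1 (the time column additionally depends on the cosmology, so it cannot be checked this simply). For example:

```python
# The scale factor a and redshift z in the snapshot line are related by
# z = 1/a - 1; the values above are consistent to rounding precision.
a = 0.20416
z = 1.0 / a - 1.0
print(f"z = {z:.5f}")  # matches the logged z = 3.89812
```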

[Main] [SN  28] Halos (alive):  9249 total, 8601 connected, 766 new, 8986 hosts (97.2 %), 263 subs (2.8 %)
[Main] [SN  28] Halos (ended):  113 total, 85 discarded, 28 saved (178 cumulative)

The live halos were either connected from a previous snapshot or newly created. Halo results are written to file as soon as a halo ends in order to free the associated memory as early as possible. However, not all halos are written to file, for example if they never reached a critical mass. The number of saved halos is given by the cumulative number at the end.
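If one wants to track halo statistics across a run, these counts can be extracted with a simple parser. The regular expression below is a hypothetical sketch matched to the line format shown above; the field names are our own, not part of SPARTA:

```python
import re

# Hypothetical parser for the "Halos (ended)" log line shown above.
line = "[Main] [SN  28] Halos (ended):  113 total, 85 discarded, 28 saved (178 cumulative)"
m = re.search(r"Halos \(ended\):\s+(\d+) total, (\d+) discarded, "
              r"(\d+) saved \((\d+) cumulative\)", line)
total, discarded, saved, cumulative = map(int, m.groups())
assert total == discarded + saved  # ended halos are either discarded or saved
print(saved, cumulative)
```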

The following output lines give more detailed statistics on the tracers SPARTA is currently tracking. Each type of tracer (particles and subhalos in this case) gets its own lines. First, the particle tracer lines will typically look something like this:

[Main] [SN  28] Tracer ptl:
    Active 857149 (5.1 %), ignoring 385042 (2.3 %), processed 1404859 (8.4 %), deleted 81300 (0.5 %)
    Of proc'd: new 173086 (12.3 %), rec'd 0 (0.0 %), ign 304779 (21.7 %), tr 734640 (52.3 %), n-tr 192354 (13.7 %)
    Of del'td: host-sub 24948 (30.7 %), escpd 151 (0.2 %), unconn 14276 (17.6 %), all done 41925 (51.6 %)
    Subhalo tags: 9941, ignored 1017, created 1368, result 2303

Here, active means that SPARTA is actively tracking these tracers, that is, there is a tracer object in memory and we will try to connect this tracer in the next snapshot. To avoid excessive tracking, SPARTA keeps a list of tracers to ignore, for example particles that have already orbited over a halo and for which all analyses have been finished. The number of processed tracers is the total number of tracers that were considered in this snapshot.

For particle tracers, all percentages refer to the total number of particles in the simulation. It is not uncommon for SPARTA to track a significant fraction of all particles! The second line tells us what happened to all processed particle tracers, namely whether:

  • a new tracer was created

  • a tracer was recreated, meaning it had been a tracer before, was deleted, but has come back and needs to be analyzed again

  • a tracer was ignored, typically because it was found on the ignore list

  • a tracer simply continued from one snapshot to the next (“tracked”)

  • a tracer did not yet exist and is not being created for whatever reason (“not tracked”).

The following line tells us why particle tracers were deleted:

  • because the halo became a subhalo (what exactly happens to particle tracers in this case depends on the settings)

  • because they escaped, meaning they strayed too far from their halo

  • because they could not be found in the halo in the next snapshot (“unconn”, typically also because the particle left the halo)

  • because all their tracer results had finished (“all done”).

The final line relating to particle tracers tells us how many particles were tagged as belonging to a subhalo. Some of these tags were ignored because the particle was on the ignore list; in some cases a tracer was created due to the tag, and in some cases an existing result was altered to incorporate the tag information (“result”). These are technical distinctions and typically not important.
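The per-category counts are internally consistent: the categories on the processed line sum to the processed total, and the deletion reasons sum to the deleted total. A quick check with the numbers above:

```python
# Consistency check on the particle-tracer statistics shown above.
processed = 1404859
new, recreated, ignored, tracked, not_tracked = 173086, 0, 304779, 734640, 192354
assert new + recreated + ignored + tracked + not_tracked == processed

deleted = 81300
host_sub, escaped, unconnected, all_done = 24948, 151, 14276, 41925
assert host_sub + escaped + unconnected + all_done == deleted
print("counts are consistent")
```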

The next lines tell us about the statistics of subhalo tracers:

[Main] [SN  28] Tracer sho:
    Active 429, new 130, deleted 93
    Of del'td: host-sub 14 (15.1 %), escpd 0 (0.0 %), unconn 57 (61.3 %), all done 22 (23.7 %)

The fields have the same meaning as for particle tracers, but there are fewer scenarios. As tracers are analyzed, each halo accumulates tracer results:

[Main] [SN  28] Results:
    tcr  res   current    sn-new    cm-new    sn-fld    cm-fld    sn-svd    cm-svd   fld-f   svd-f
    ptl  ifl    842775     97395    929445         0         0         0         0   0.000   0.000
    ptl  sbk    108471     17222    111760      4737     36875         0         0   0.248   0.000
    sho  ifl       944       130      1036         0         0        10        56   0.000   0.054
    sho  sbk        20         5        20         2         8         0         0   0.286   0.000

There is one line for each combination of tracer type and result type. The columns give:

  • current: results currently kept in memory

  • sn-new: results that were successfully created during this snapshot

  • cm-new: cumulative sum of all results that have been successfully created

  • sn-fld: number of results that failed during this snapshot

  • cm-fld: cumulative number of all failed results

  • sn-svd: results saved to file during this snapshot

  • cm-svd: cumulative number saved to file

  • fld-f: fraction of total events that failed

  • svd-f: fraction of successful events that was saved (not all halos are saved to file).
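The two fractions appear to follow directly from the cumulative columns: fld-f = cm-fld / (cm-new + cm-fld) and svd-f = cm-svd / cm-new. This is our reading of the numbers above, not a documented formula; the “ptl sbk” and “sho ifl” rows reproduce it:

```python
# Inferred (not documented) relations between the Results table columns,
# checked against the rows shown above.
def fld_f(cm_new, cm_fld):
    # fraction of all events (successes + failures) that failed
    return cm_fld / (cm_new + cm_fld)

def svd_f(cm_new, cm_svd):
    # fraction of successful events that were saved to file
    return cm_svd / cm_new

assert round(fld_f(111760, 36875), 3) == 0.248  # ptl sbk row
assert round(svd_f(1036, 56), 3) == 0.054       # sho ifl row
```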

We now move from statistics about the objects SPARTA is tracking to diagnostics of efficiency, memory, and timing:

[Main] [SN  28] Efficiency:     Volume ratio 1.04 [1.03..1.04], snap files read 4 [4..4]
[Main] [SN  28]                 Exchanged 17 halos, 0 sub pointers, 1942 particles, 0 subhalos

One indicator of the efficiency of the domain decomposition is how large a volume the average process needs to consider compared to its fair share of the box. Ideally, that number would be close to unity (i.e., the box would be cleanly split between processes). In reality, and depending on the file format, the volume ratio can be significantly larger than one because the volumes around halos overlap, or because the files are not split into rectangular volumes. The number of snapshot files read per process tells us how many files the particles on a given process were distributed over.
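Based on this description, the ratio for each process is its considered volume divided by an equal share of the box; the log line quotes the mean and the min/max range. A sketch under that reading, with made-up volumes:

```python
# Illustrative computation of the volume ratio (our reading of the text, not
# SPARTA's actual code): each process's considered volume relative to its
# fair share of the box. The volumes below are made up.
def volume_ratio(process_volumes, box_volume):
    fair_share = box_volume / len(process_volumes)
    ratios = [v / fair_share for v in process_volumes]
    return sum(ratios) / len(ratios), min(ratios), max(ratios)

mean, lo, hi = volume_ratio([26.0, 25.8, 26.0, 26.2], 100.0)
print(f"Volume ratio {mean:.2f} [{lo:.2f}..{hi:.2f}]")
```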

The second line appears if domain decomposition is turned on. The number of exchanged halos should always be much smaller than the total number, otherwise the domain decomposition is failing.

We now move from efficiency to timing. The following line contains basic timing information:

[Main] [SN  28] Timing:         Total 3.6, balance 1.40 (indep. 2.3, idle 0.3)

The total number tells us how long it took SPARTA to complete this snapshot. Note that this time can vary drastically, from almost zero (if only halos but no particles are considered, for example) to hours for heavy computations. If log_level_timing is set higher than zero, we also get a much more detailed accounting of where SPARTA spent its time during this snapshot. The statistics refer to the process that took the longest overall, but if the run is reasonably well balanced, they should be representative of the other processes as well:

[Main] [SN  28]                 On max. proc (003)         Time (s)          %
[Main] [SN  28]                 ------------------------------------------------------------
[Main] [SN  28]                 Reading catalog (main)          0.1        4.0
[Main] [SN  28]                 Exchange                        0.0        1.3
[Main] [SN  28]                 Reading snapshots               0.7       37.0
[Main] [SN  28]                 Process snap (indep.)           1.0       57.0
[Main] [SN  28]                   Building tree                   0.8       45.9
[Main] [SN  28]                   Tree searches                   0.0        1.9
[Main] [SN  28]                   Particle preparation            0.1        3.9
[Main] [SN  28]                   Tracer results                  0.0        1.6
[Main] [SN  28]                     Tracer ptl result sbk           0.0        1.3
[Main] [SN  28]                   Analyses                        0.0        1.3
[Main] [SN  28]                     Analysis hps                    0.0        1.3
[Main] [SN  28]                 ------------------------------------------------------------
[Main] [SN  28]                 Timing total                    1.8      100.0

The listing contains only fields that took up at least 1% of the total run time during this snapshot. By setting log_level_timing to 2, you can get the full listing, but that will generally not be more informative than the reduced listing (unless you want to know exactly how much time a less-relevant action took up).

The following section tells us about the memory consumption of SPARTA, an extremely important number because large memory consumption can cause a job to crash. Memory is called persistent in this context if it persists between snapshots:

[Main] [SN  28] Pers. memory:   Total 328 MB (82 per proc), halos 83 (25.3 %), tracers ptl 122 (37.2 %)
[Main] [SN  28]                 Results ifl 37 (11.4 %) sbk 4 (1.3 %)

The output gives a rough indication of which fields are responsible for the bulk of the memory consumption (for more detailed output, see below). In particular, the halos fraction refers to the actual halo objects, but does not include their tracers which are listed separately. Tracer results are yet another category.
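The per-process figure and the percentages follow from the total: 328 MB over what is evidently a four-process run gives 82 MB each, and each category's share is its size divided by the total:

```python
# Relating the persistent-memory numbers above: 328 MB total, evidently
# across 4 processes, with per-category fractions of the total.
total_mb = 328
print(total_mb / 4)             # 82.0 MB per process
print(f"{83 / total_mb:.1%}")   # halos: 25.3%
print(f"{122 / total_mb:.1%}")  # tracers ptl: 37.2%
```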

However, we care not only about how much memory is allocated for tracers and results, but also about how efficiently it is allocated. Since the large arrays in SPARTA are typically dynamically allocated, they have some unused memory to avoid frequent re-allocations. The next few lines tell us the fraction of memory in the dynamic arrays that is actually used:

[Main] [SN  28] Memory eff.:    Tracer ptl 0.697, ignore list 0.374, results: ifl 0.696 sbk 0.597
[Main] [SN  28]                 Tracer sho 0.014, ignore list 0.014, results: ifl 0.075 sbk 0.053
[Main] [SN  28]                 Analyses: rsp   --- hps 1.000

We see that the large particle and particle result arrays are used to about 70% in this particular run. The subhalo arrays are less efficient, but they are also very small in comparison. The analysis arrays are allocated perfectly (in the case of hps) because there is exactly one per halo, and no splashback analyses have been allocated yet because they are computed at the end of a halo’s lifetime.
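The efficiency is simply used size divided by allocated capacity. The minimal growth-factor array below is a hypothetical sketch (not SPARTA's actual allocator) that illustrates why the fraction typically hovers somewhat below one:

```python
# Hypothetical growth-factor dynamic array illustrating memory efficiency
# (used / allocated); SPARTA's actual allocation strategy may differ.
class DynArray:
    def __init__(self, growth=1.5):
        self.used, self.capacity, self.growth = 0, 16, growth

    def append_n(self, n):
        # grow capacity geometrically to avoid frequent re-allocations
        self.used += n
        while self.used > self.capacity:
            self.capacity = int(self.capacity * self.growth)

    @property
    def efficiency(self):
        return self.used / self.capacity

arr = DynArray()
arr.append_n(857149)  # e.g., the number of active particle tracers above
print(f"efficiency = {arr.efficiency:.3f}")
```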

Finally, we may want a more detailed accounting of exactly which parts of the code consume the bulk of the memory. Depending on the log_level_memory parameter, the final lines give us an impression of the memory consumption of SPARTA:

[Main] [SN  28] Memory:  Total current 0.3 GB (82.0 MB per proc), peak 1.1 GB (272.9 MB per proc)
[Main] [SN  28]          Overall peak was 312925148, 298.4 MB on proc 3
[Main] [SN  28]          Memory ID             Description        At overall peak:  Byte      MB Fraction
[Main] [SN  28]          DA_HALO               Halo objects                     24832000    23.7    7.9 %
[Main] [SN  28]          DA_TRACER             Tracer objects                   34080800    32.5   10.9 %
[Main] [SN  28]          DA_RESIFL             Infall result objects            10175360     9.7    3.3 %
[Main] [SN  28]          DA_ALHPS              Haloprops analysis objects       20228208    19.3    6.5 %
[Main] [SN  28]          PARTICLES             Raw particle data               182641248   174.2   58.4 %
[Main] [SN  28]          PARTICLEMESSAGES      Particle exchange packets        36458368    34.8   11.7 %

There is an important distinction between the memory currently occupied (which contains only objects that persist between snapshots, since this output refers to the end of the snapshot) and the peak memory consumption, which is what would be responsible for out-of-memory crashes. As the latter is typically more relevant, the output focuses on peak consumption and tells us:

  • the total peak and how much that would be per process if the memory was shared equally

  • what the actual peak on one process was, and which process

  • which fields occupied significant memory at that peak.

For example, the output above tells us that, at peak, raw particle data read from snapshot files and particle-data messages between processes were responsible for about 70% of SPARTA’s memory consumption.
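The ~70% figure follows from the byte counts in the table: raw particle data plus particle exchange packets, divided by the overall peak:

```python
# Verifying the ~70% claim from the byte counts in the peak-memory table above.
overall_peak = 312925148       # bytes, overall peak on proc 3
particles = 182641248          # PARTICLES: raw particle data
particle_messages = 36458368   # PARTICLEMESSAGES: exchange packets
fraction = (particles + particle_messages) / overall_peak
print(f"{fraction:.1%}")  # about 70%
```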

Note that the peak memory consumption refers to the peak at any time, not necessarily at the given snapshot. Typically, the memory consumption rises with time as more and more halos form. However, in some cases, the load balancing algorithm shifts memory away from the peak-consumption process and the peak decreases for one or a few snapshots. In that case, the above lines will simply be repeated at the following snapshot(s).

Final output

Once all snapshots have been processed, SPARTA will (depending on the settings) output final timing and memory information, e.g.:

[Main] All times are in seconds.
[Main] Task                               Time   Per snap    Level %    Total %
[Main] --------------------------------------------------------------------------
[Main] Initialization                  0.6                              0.3
[Main] Snapshots                     165.4                             99.5
[Main]   Statistics                      0.0        0.0        0.0        0.0
[Main]   Connection requests             0.0        0.0        0.0        0.0
[Main]   Reading catalog (main)          4.9        0.1        2.9        2.9
[Main]   Connections                     0.2        0.0        0.1        0.1
[Main]   Load balancing                 12.7        0.2        7.7        7.6
[Main]   Exchange                        4.6        0.1        2.8        2.8
[Main]   Halo box                        0.4        0.0        0.2        0.2
[Main]   Reading snapshots              39.4        0.7       23.9       23.7
[Main]   Process snap (indep.)          84.7        1.4       51.3       51.0
[Main]     Building tree                  57.2        1.0       68.2       34.4
[Main]     Sorting                         2.2        0.0        2.6        1.3
[Main]     Tagging subhalos                4.5        0.1        5.4        2.7
[Main]     Halo loop                       0.0        0.0        0.0        0.0
[Main]     Tree searches                   4.8        0.1        5.7        2.9
[Main]     Mass profile                    0.2        0.0        0.2        0.1
[Main]     Particle preparation            8.4        0.1       10.0        5.0
[Main]     Subhalo preparation             0.0        0.0        0.0        0.0
[Main]     Tracer results                  2.8        0.0        3.3        1.7
[Main]       Tracer ptl result ifl           0.4        0.0       15.7        0.3
[Main]       Tracer ptl result sbk           2.3        0.0       84.3        1.4
[Main]     Mark ending tracers             0.2        0.0        0.2        0.1
[Main]     Particle post-processing        0.0        0.0        0.0        0.0
[Main]     Subhalo post-processing         0.0        0.0        0.0        0.0
[Main]     Tracer ignore lists             0.1        0.0        0.1        0.1
[Main]     Delete tracers                  0.2        0.0        0.3        0.1
[Main]     Find ended halos                0.0        0.0        0.0        0.0
[Main]     Analyses                        3.3        0.1        4.0        2.0
[Main]       Analysis rsp                    0.6        0.0       18.6        0.4
[Main]       Analysis hps                    2.7        0.0       81.4        1.6
[Main]   Idle                           17.8        0.3       10.7       10.7
[Main]   Write results                   0.3        0.0        0.2        0.2
[Main]   Delete ended halos              0.3        0.0        0.2        0.2
[Main]   Write restart files             0.0        0.0        0.0        0.0
[Main] Finalize                        0.2                              0.1
[Main] --------------------------------------------------------------------------
[Main] Total                         166.2                            100.0

Unlike the timing information output at each snapshot, this is a full listing of all active parts of the code (but not of parts that were excluded at compile time). The level percentage refers to the fraction of the next-higher (less indented) level; for example, all level percentages under “Process snap (indep.)” sum to 100%, as do those of the individual tracer results. The final column shows the fraction of the total runtime each part of the code occupied. In this case, the main takeaway is that loading particles and building the particle tree took up about 60% of the code's runtime (a fairly typical situation).
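Since sibling percentages sum to 100%, each level % is the entry's time divided by the sum of the times of its siblings at the same level. A sketch, using the rounded times printed in the table (which reproduce the listed percentages only approximately):

```python
# Level % = entry time / sum of sibling times at the same level, so siblings
# sum to 100%. Times (s) are from the "Process snap (indep.)" block above;
# zero-time siblings are omitted since they do not change the sum.
siblings = {
    "Building tree": 57.2, "Sorting": 2.2, "Tagging subhalos": 4.5,
    "Tree searches": 4.8, "Mass profile": 0.2, "Particle preparation": 8.4,
    "Tracer results": 2.8, "Mark ending tracers": 0.2,
    "Tracer ignore lists": 0.1, "Delete tracers": 0.2, "Analyses": 3.3,
}
total = sum(siblings.values())
for name, t in siblings.items():
    print(f"{name:22s} {100 * t / total:5.1f} %")
```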

The final output is a full accounting of SPARTA’s memory consumption:

Memory: Peak total consumption was 2.5 GB (634.6 MB per proc).
Overall peak was 751869780, 717.0 MB on proc 2. Memory consuming fields:
---------------------------------------------------------------------------------------------------------
| Memory ID        | Description                |  Highest consumption  |         At overall peak       |
|                  |                            |       Byte        MB  |     Byte         MB  Fraction |
---------------------------------------------------------------------------------------------------------
| DA_HALO          | Halo objects               |   48896000      46.6  |   48896000     46.6     6.5 % |
| DA_TRACER        | Tracer objects             |  178834032     170.5  |  171008552    163.1    22.7 % |
| DA_RESIFL        | Infall result objects      |   65665920      62.6  |   62944640     60.0     8.4 % |
| DA_RESSBK        | Splashback result objects  |   17882880      17.1  |   16944960     16.2     2.3 % |
| DA_ALRSP         | Rsp analysis objects       |   19035712      18.2  |          0      0.0     0.0 % |
| DA_ALHPS         | Haloprops analysis objects |   47311264      45.1  |   46366320     44.2     6.2 % |
| DA_INT           | Integer lists              |      16000       0.0  |          0      0.0     0.0 % |
| DA_SUBPOINTER    | Subhalo lists              |      49056       0.0  |      47328      0.0     0.0 % |
| DA_ID            | Ignored ID lists           |   10168800       9.7  |    9722400      9.3     1.3 % |
| BOXES            | Snapshot file boundaries   |      38400       0.0  |      38400      0.0     0.0 % |
| HDF5BUFFER       | HDF5 read/write buffer     |      28360       0.0  |          0      0.0     0.0 % |
| CATREADING       | Halo catalog read/exchange |     422776       0.4  |          0      0.0     0.0 % |
| SNAPREADBUF      | Snapshot read buffer       |   13374192      12.8  |     987932      0.9     0.1 % |
| PARTICLES        | Raw particle data          |  344000448     328.1  |  308899424    294.6    41.1 % |
| PARTICLEMESSAGES | Particle exchange packets  |   92870656      88.6  |   86013824     82.0    11.4 % |
| OUTPUTBUF        | Output buffers             |   51559980      49.2  |          0      0.0     0.0 % |
| EXCHANGEBUF      | Exchange buffers           |   17152800      16.4  |          0      0.0     0.0 % |
| TREE             | Tree calculations          |   65879304      62.8  |          0      0.0     0.0 % |
| TREEPOT          | Tree potential             |     123584       0.1  |          0      0.0     0.0 % |
| DOMAINDECOMP     | Domain decomposition       |      42416       0.0  |          0      0.0     0.0 % |
| HALOLOGIC        | Halo/subhalo handling      |     191352       0.2  |          0      0.0     0.0 % |
| TRACERHANDLING   | Tracer handling            |      25834       0.0  |          0      0.0     0.0 % |
| TCRPTL           | Tracer: particles          |     718128       0.7  |          0      0.0     0.0 % |
| RESSBK           | Result: splashback         |         24       0.0  |          0      0.0     0.0 % |
| ANLRSP           | Analysis: Rsp              |     267266       0.3  |          0      0.0     0.0 % |
| ANLHPS           | Analysis: Halo properties  |     276436       0.3  |          0      0.0     0.0 % |
---------------------------------------------------------------------------------------------------------

Here, in contrast, we include all memory categories that contributed any memory consumption, even if only a small fraction. The “highest consumption” columns refer to the peak consumption in each field, whereas the columns on the right refer to the consumption at the overall peak on any one core. The former give us a feeling for which fields are important consumers, whereas the latter allow us to diagnose the causes of possible out-of-memory situations. The two listings can be quite different; note, for example, that the various output and exchange buffers can be significant memory consumers, but are zero at the peak because the peak occurred during snapshot loading rather than during the exchange or output steps.