Sort out subdetector outputs in DAQReader

Luca Scotto Lavina requested to merge sort_in_daqreader into master

Created by: JelleAalbers

This:

  • Lets DAQReader sort/split raw records from different subdetectors into different data types.
  • Makes a start on splitting off the XENON1T and XENONnT contexts explicitly.

After this, DAQReader will be a multi-output plugin, producing raw_records, raw_records_lowgain, raw_records_mv, and raw_records_aqmon. If any of them has no entries (e.g. MV for a non-linked TPC run) the data types will still exist, but all chunks will be empty. On disk, they get a folder with only a metadata.json file (no empty files).
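The multi-output behavior described above can be sketched in plain Python. This is not the real strax plugin API, just an illustration of the channel-splitting logic; the channel ranges and record layout below are made-up assumptions, not XENONnT's actual channel map:

```python
# Hypothetical channel ranges per output data type (illustrative only).
SUBDETECTOR_CHANNEL_RANGES = {
    "raw_records": (0, 493),          # TPC PMTs (assumed range)
    "raw_records_lowgain": (500, 752),
    "raw_records_mv": (800, 883),     # muon veto
    "raw_records_aqmon": (900, 999),  # acquisition monitor
}

def split_by_subdetector(records):
    """Sort records (dicts with a 'channel' key) into one list per data type.

    Every data type is always present in the result, so downstream
    consumers see an (empty) output even when a subdetector was not
    linked for this run -- mirroring the "all chunks will be empty"
    behavior described above.
    """
    out = {name: [] for name in SUBDETECTOR_CHANNEL_RANGES}
    for r in records:
        for name, (lo, hi) in SUBDETECTOR_CHANNEL_RANGES.items():
            if lo <= r["channel"] <= hi:
                out[name].append(r)
                break
    return out

chunk = [{"channel": 10}, {"channel": 810}, {"channel": 950}]
result = split_by_subdetector(chunk)
# All four outputs exist; raw_records_lowgain is empty for this chunk.
```

In the actual plugin the same routing would happen on numpy record arrays rather than dicts, but the invariant is the same: the set of output data types is fixed, independent of which subdetectors produced data.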

Sorting out the different subdetectors in DAQReader seems preferable over several alternatives (to me -- happy to hear other perspectives!):

  • Having multiple DAQReaders (what I had initially in mind) would put the burden on redax to sort things in different (sub)folders on ceph. Strax would also have a big headache trying to piece together the many independent source plugins producing data with not-perfectly aligned chunks (because final chunk breaks are set by where the data shows gaps).
  • Sorting in PulseProcessing is what we currently do. This means PulseProcessing shows up in the muon veto and acquisition monitor dependency chain, so when we e.g. change the TPC hitfinder threshold all muon veto processing would have to be redone from scratch. That can't be right.
  • Letting the first plugin of each subsystem select a channel range would mean you still have to (download and) go through the full raw_records, even to reprocess very light data such as the acquisition monitor or muon veto.
  • Having separate channel-selector plugins that do no processing for each subsystem would be almost equivalent to what this PR does -- except that you would save the intermediate unsorted raw_records unnecessarily. Having many plugins depend on a big monolithic raw_records could also increase strax's memory usage, depending on the conditions under which numpy arrays get copied internally.
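The "final chunk breaks are set by where the data shows gaps" constraint from the first alternative can be sketched as follows. This is a simplified stand-in for strax's actual break-finding, assuming pulses are given as (start, end) time intervals sorted by start:

```python
def find_break_after(intervals, min_gap):
    """Return a time at which the stream can safely be broken:
    a point followed by at least min_gap with no data.

    intervals: list of (start, end) pulse times, sorted by start.
    min_gap: minimum quiet period required for a chunk break
             (an illustrative parameter, not strax's real one).
    """
    covered_until = intervals[0][1]
    for start, end in intervals[1:]:
        if start - covered_until >= min_gap:
            # Found a gap wide enough: break right after the last pulse.
            return covered_until
        covered_until = max(covered_until, end)
    # No internal gap found; break after all data.
    return covered_until

# Pulses at [0, 5], [6, 10], then a quiet stretch before [20, 25]:
# the first sufficiently wide gap ends coverage at t=10.
```

This also shows why many independent source plugins are painful: each would find its own gaps, so their chunk boundaries would not line up, and strax would have to reconcile not-perfectly-aligned chunks across sources.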

This does not consider the neutron veto yet. I'm starting to agree with @darrylmasson that the most elegant solution would be a single DAQReader plugin that can handle everything. We would have to make a few modifications to deal with the different digitizer time resolution, and (if I understood correctly) the fact that the data will appear in a different folder. We'd also have to consider if the data is still sparse enough to look for simultaneous chunk breaks.
