HDF5 Schema

This document defines the on-disk HDF5 layout for rustpix event data and histograms. The schema is designed for scipp compatibility via NeXus, using NXevent_data for events and NXdata for histograms.

Goals

  • Bounded-memory processing for large TPX3 datasets
  • scipp-compatible layout via NeXus (NXevent_data + NXdata)
  • Clear units and metadata to support TOF ↔ eV conversion
  • Optional fields (time_over_threshold, chip_id, cluster_id) can be omitted entirely; readers must not assume they are present

File Structure

/
  rustpix_format_version = "0.1"
  entry/                     (NXentry)
    hits/                    (NXevent_data) [optional]
    neutrons/                (NXevent_data) [optional]
    histogram/               (NXdata)       [optional]
    metadata/                (group)        [optional]

File-Level Conventions

  • Root group has attribute: rustpix_format_version = "0.1"
  • All groups use NX_class attributes where applicable
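  • Units are stored in a units attribute on each dataset, e.g. units = "ns", "pixel", or "deg"
  • Data is written in the writer's native byte order; HDF5 records the byte order in the datatype and converts on read, so files remain portable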

Event Data (NXevent_data)

Event groups follow the NeXus NXevent_data base class, used by SNS/ISIS event files and expected by Mantid.

Required Datasets

Name               Type  Shape  Units  Description
event_id           i32   (N)    id     Detector element ID
event_time_offset  u64   (N)    ns     Time-of-flight relative to pulse

Pulse Indexing (for pulsed sources)

Name             Type  Shape  Units  Description
event_time_zero  u64   (J)    ns     Start time of each pulse
event_index      i32   (J)    id     Index of the first event of each pulse
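
As an illustration of how the pulse-indexing datasets fit together, the sketch below assumes the usual NXevent_data reading, in which event_index[j] is the position of the first event belonging to pulse j. The helper name pulse_slice and the sample values are hypothetical and are not part of rustpix.

```python
def pulse_slice(event_index, n_events, pulse):
    """Return the (start, stop) bounds of the events recorded during one pulse."""
    start = event_index[pulse]
    # The last pulse runs to the end of the event arrays.
    if pulse + 1 < len(event_index):
        stop = event_index[pulse + 1]
    else:
        stop = n_events
    return start, stop


# Hypothetical sample data: 3 pulses (J = 3) and 6 events (N = 6).
event_time_zero = [0, 16_666_667, 33_333_334]    # ns, start of each pulse
event_index = [0, 3, 3]                          # pulse 1 recorded no events
event_time_offset = [100, 250, 900, 40, 60, 75]  # ns, relative to the pulse

# Absolute event time = pulse start + offset within that pulse.
absolute_time = []
for pulse, t0 in enumerate(event_time_zero):
    start, stop = pulse_slice(event_index, len(event_time_offset), pulse)
    absolute_time.extend(t0 + dt for dt in event_time_offset[start:stop])
```

A pulse with no events shows up as two consecutive equal entries in event_index, which yields an empty slice rather than an error.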

Optional Datasets

Name                 Type  Shape  Units  Description
time_over_threshold  u64   (N)    ns     ToT in nanoseconds
chip_id              u8    (N)    id     Chip identifier
cluster_id           i32   (N)    id     Cluster assignment
n_hits               u16   (N)    count  Hits per neutron
x                    u16   (N)    pixel  Global pixel X
y                    u16   (N)    pixel  Global pixel Y

Cluster ID Convention

  • cluster_id >= 0: Valid cluster index
  • cluster_id = -1: Unclustered / noise
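
A reader consuming cluster_id would typically separate clustered hits from noise before any per-cluster work. The following is a minimal sketch of that convention; the function name and the sample array are illustrative assumptions.

```python
from collections import Counter

UNCLUSTERED = -1  # sentinel defined by the convention above


def cluster_sizes(cluster_id):
    """Count hits per valid cluster, skipping unclustered/noise hits."""
    return Counter(cid for cid in cluster_id if cid >= 0)


# Hypothetical per-hit cluster assignments.
cluster_id = [0, 0, -1, 1, 0, -1, 1]
sizes = cluster_sizes(cluster_id)
noise_hits = sum(1 for cid in cluster_id if cid == UNCLUSTERED)
```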

Event ID Mapping

For imaging data, event_id maps to pixel coordinates:

event_id = y * x_size + x

Group attributes x_size and y_size define the detector dimensions.
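
The mapping and its inverse can be sketched as follows. The function names are hypothetical; x_size is the value read from the group attribute of the same name.

```python
def encode_event_id(x, y, x_size):
    """Flatten pixel coordinates into a row-major event_id."""
    return y * x_size + x


def decode_event_id(event_id, x_size):
    """Recover (x, y) from an event_id written with the same x_size."""
    y, x = divmod(event_id, x_size)
    return x, y


# Example with an assumed detector width of 512 pixels.
eid = encode_event_id(x=3, y=2, x_size=512)
xy = decode_event_id(eid, x_size=512)
```

Only x_size is needed to decode; y_size is useful for validating that a decoded y lies inside the detector.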

Histogram Data (NXdata)

Histogram data is stored in a single NXdata group named histogram.

Group Attributes

NX_class = "NXdata"
signal = "counts"
axes = ["rot_angle", "y", "x", "time_of_flight"]
rot_angle_indices = 0
y_indices = 1
x_indices = 2
time_of_flight_indices = 3

Required Datasets

Name            Type  Shape         Units  Description
counts          u64   (R, Y, X, E)  count  Histogram counts
rot_angle       f64   (R)           deg    Rotation angle
y               f64   (Y)           pixel  Y axis
x               f64   (X)           pixel  X axis
time_of_flight  f64   (E)           ns     TOF axis
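
To make the layout concrete, here is a rough sketch of writing such a group with h5py and NumPy. It is not the rustpix writer: the function name, the in-memory file, and the sample shapes are assumptions chosen for illustration, and all axes are written as bin centers.

```python
import h5py
import numpy as np

AXES = ["rot_angle", "y", "x", "time_of_flight"]
UNITS = {"rot_angle": "deg", "y": "pixel", "x": "pixel", "time_of_flight": "ns"}


def write_histogram(entry, counts, axis_values):
    """Create /entry/histogram as an NXdata group with the attributes listed above."""
    grp = entry.create_group("histogram")
    grp.attrs["NX_class"] = "NXdata"
    grp.attrs["signal"] = "counts"
    # Store the axis names as an array of variable-length strings.
    grp.attrs.create("axes", AXES, dtype=h5py.string_dtype())
    ds = grp.create_dataset("counts", data=np.asarray(counts, dtype=np.uint64))
    ds.attrs["units"] = "count"
    for dim, name in enumerate(AXES):
        grp.attrs[f"{name}_indices"] = dim
        axis = grp.create_dataset(
            name, data=np.asarray(axis_values[name], dtype=np.float64)
        )
        axis.attrs["units"] = UNITS[name]
    return grp


# In-memory file so the sketch leaves nothing on disk.
with h5py.File("example.h5", "w", driver="core", backing_store=False) as f:
    f.attrs["rustpix_format_version"] = "0.1"
    entry = f.create_group("entry")
    entry.attrs["NX_class"] = "NXentry"
    counts = np.zeros((1, 4, 4, 10), dtype=np.uint64)  # (R, Y, X, E)
    axis_values = {
        "rot_angle": [0.0],
        "y": np.arange(4),
        "x": np.arange(4),
        "time_of_flight": np.linspace(0.0, 9.0e6, 10),
    }
    hist = write_histogram(entry, counts, axis_values)
    stored_shape = hist["counts"].shape
    stored_signal = hist.attrs["signal"]
    stored_tof_index = int(hist.attrs["time_of_flight_indices"])
    stored_tof_units = hist["time_of_flight"].attrs["units"]
```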

Axis Representation

  • Centers: axis length = N, where N is the number of bins along that dimension; axis_mode = "centers"
  • Edges: axis length = N+1 (one boundary more than the number of bins); axis_mode = "edges"
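
The two representations can be told apart from the axis length alone, and edges can be reduced to centers by taking midpoints. The sketch below shows both; the helper names and sample values are hypothetical.

```python
def axis_mode(axis_length, n_bins):
    """Classify an axis as 'centers' or 'edges' from its length."""
    if axis_length == n_bins:
        return "centers"
    if axis_length == n_bins + 1:
        return "edges"
    raise ValueError(
        f"axis length {axis_length} does not match {n_bins} bins"
    )


def edges_to_centers(edges):
    """Midpoint of each pair of adjacent bin edges (N+1 values -> N values)."""
    return [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]


# Three TOF bins described by four edges, in ns.
tof_edges = [0.0, 1000.0, 2000.0, 3000.0]
tof_centers = edges_to_centers(tof_edges)
```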

Optional Energy Axis

If flight_path_m and tof_offset_ns are provided:

Name       Type  Shape  Units  Description
energy_eV  f64   (E)    eV     Derived energy axis

Conversion Metadata

Stored as attributes at /entry:

Attribute         Type    Description
flight_path_m     f64     Effective flight path length
tof_offset_ns     f64     Instrument TOF window shift
energy_axis_kind  string  Typically "tof"

TOF to Energy Conversion

Using the non-relativistic relation:

E = (m_n / 2) * (L / t)²

where:
  t = (event_time_offset + tof_offset_ns) * 1e-9  [seconds]
  L = flight_path_m  [meters]
  m_n = neutron mass  [kilograms]
  E = kinetic energy  [joules]; divide by 1.602176634e-19 J/eV to obtain eV
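
The conversion can be sketched in a few lines of Python. The function name is hypothetical, the constants are the CODATA 2018 values, and the 25 m flight path in the usage line is an arbitrary illustrative number, not an instrument setting taken from rustpix.

```python
NEUTRON_MASS_KG = 1.67492749804e-27    # CODATA 2018
ELEMENTARY_CHARGE_C = 1.602176634e-19  # exact SI value; numerically J per eV


def tof_to_energy_ev(event_time_offset_ns, flight_path_m, tof_offset_ns=0.0):
    """Non-relativistic neutron kinetic energy in eV from time of flight."""
    t = (event_time_offset_ns + tof_offset_ns) * 1e-9  # seconds
    if t <= 0:
        raise ValueError("corrected time of flight must be positive")
    velocity = flight_path_m / t                        # m/s
    energy_joule = 0.5 * NEUTRON_MASS_KG * velocity ** 2
    return energy_joule / ELEMENTARY_CHARGE_C


# A neutron arriving 1 ms after the pulse over a 25 m path: roughly 3.27 eV.
energy = tof_to_energy_ev(1_000_000, 25.0)
```

Because E scales with 1/t², a fixed error in tof_offset_ns matters most for short flight times, that is, at the high-energy end of the axis.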

Metadata Group

/entry/metadata may contain:

  • Detector config (chip transforms, pixel size)
  • Clustering config
  • Extraction config
  • Processing provenance (git sha, rustpix version)
  • Instrument context (facility, run ID)

Preferred storage: UTF-8 string dataset named metadata_json.

ORNL SNS/HFIR NXsnsevent Schema

rustpix also supports export in the ORNL SNS/HFIR NXsnsevent schema used by instruments such as VENUS (BL10). This is a separate export mode, available in the GUI (as "HDF5 (SNS NXsnsevent)") and in the CLI (selected by the .nxs.h5 output file extension).

SNS File Structure

/
  entry/                           (NXentry, definition="NXsnsevent")
    run_number                     (string)
    experiment_identifier          (string, e.g. "IPTS-35004")
    start_time                     (string, ISO 8601)
    end_time                       (string, ISO 8601)
    duration                       (f64, seconds)
    proton_charge                  (f64, picoCoulombs)
    total_counts                   (u64)
    total_pulses                   (u64)
    bank{N}_events/                (NXevent_data)
      event_id                     (u32, N) [units=""]
      event_time_offset            (f32, N) [units="microsecond"]
      event_time_zero              (f64, P) [units="second"]
      event_index                  (u64, P)
      total_counts                 (u64, scalar)
    instrument/                    (NXinstrument)
      name                         (string, e.g. "VENUS")
      beamline                     (string, e.g. "BL10")
      bank{N}/                     hard link to /entry/bank{N}_events
    DASlogs/                       (NXcollection)
    sample/                        (NXsample)

Key Differences from Generic Schema

Aspect       Generic rustpix                 SNS NXsnsevent
Pixel ID     i32, y * x_size + x             u32, bank_offset + row * width + col
TOF          u64 nanoseconds                 f32 microseconds
Pulse time   u64 nanoseconds                 f64 seconds (from run start)
Event index  i32                             u64
Groups       /entry/hits/, /entry/neutrons/  /entry/bank{N}_events/
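
The unit and type changes in the table imply a small loss of precision for TOF, since f32 carries only 24 bits of mantissa. The sketch below illustrates the two time conversions using only the standard library; the function names are hypothetical, and run_start_ns is an assumed reference time in the same clock as the generic pulse times.

```python
import struct


def to_f32(value):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def tof_ns_to_us_f32(tof_ns):
    """Generic u64 nanoseconds -> SNS f32 microseconds."""
    return to_f32(tof_ns / 1000.0)


def pulse_ns_to_seconds(pulse_ns, run_start_ns):
    """Generic u64 nanoseconds -> SNS f64 seconds measured from run start."""
    return (pulse_ns - run_start_ns) * 1e-9


# A TOF near the end of a 60 Hz frame (about 16.67 ms).
tof_us = tof_ns_to_us_f32(16_666_667)
rounding_error_ns = abs(tof_us * 1000.0 - 16_666_667)

pulse_s = pulse_ns_to_seconds(2_500_000_000, run_start_ns=500_000_000)
```

At this magnitude adjacent f32 values are about 2 ns apart, so the round trip is accurate to roughly 1 ns; longer flight times lose proportionally more.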

VENUS Pixel ID Mapping

For VENUS bank100 (512x512 TPX3 detector):

event_id = 1,000,000 + row * 512 + col
Range: 1,000,000 to 1,262,143

Note: rustpix internally uses 514x514 coordinates (with 2-pixel chip gaps), whereas the SNS schema uses a 512x512 pixel grid. The SNS writer handles the 514x514→512x512 gap-pixel remapping automatically via SnsBankConfig.gap_columns/gap_rows, so callers should provide logical detector coordinates and must not pre-remap them.
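
For reference, the bank100 ID arithmetic can be sketched as below. Here row and col are coordinates in the 512x512 SNS grid, meaning after gap remapping; the remapping itself is handled by the SNS writer and is not reproduced. The function names are hypothetical.

```python
BANK_OFFSET = 1_000_000  # first pixel ID of VENUS bank100
WIDTH = 512
HEIGHT = 512


def venus_event_id(row, col):
    """Pixel ID for a position in the 512x512 SNS grid."""
    if not (0 <= row < HEIGHT and 0 <= col < WIDTH):
        raise ValueError(f"({row}, {col}) is outside the {HEIGHT}x{WIDTH} grid")
    return BANK_OFFSET + row * WIDTH + col


def venus_row_col(event_id):
    """Inverse mapping: pixel ID back to (row, col)."""
    local = event_id - BANK_OFFSET
    if not 0 <= local < WIDTH * HEIGHT:
        raise ValueError(f"{event_id} is outside the bank100 ID range")
    return divmod(local, WIDTH)


first_id = venus_event_id(0, 0)
last_id = venus_event_id(511, 511)
```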

Format Selection Guide

rustpix supports two HDF5 export schemas. Choose based on your use case:

Aspect           Generic NeXus (.h5)                        SNS NXsnsevent (.nxs.h5)
Best for         Analysis with scipp, Mantid, custom tools  Compatibility with ORNL SNS/HFIR workflows
Schema           NXevent_data                               NXsnsevent
TOF units        Nanoseconds (u64)                          Microseconds (f32)
Pixel ID         y * x_size + x                             bank_offset + row * width + col
Run metadata     Minimal                                    Full (run number, IPTS, proton charge, timestamps)
Instrument info  None                                       Instrument name, beamline

How to Select

CLI:

  • Generic NeXus: rustpix process input.tpx3 -o output.h5
  • SNS NXsnsevent: rustpix process input.tpx3 -o output.nxs.h5 --run-number 12345
  • Or use --format to override: -f hdf5 or -f sns-hdf5

GUI:

  • File > Export, then choose "HDF5 (NeXus)" or "HDF5 (SNS NXsnsevent)"

Python:

  • Generic NeXus: output_path="neutrons.h5"
  • SNS NXsnsevent: output_path="output.nxs.h5"
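
The selection rule described above (extension decides, an explicit format overrides) can be sketched as a small function. This is an illustration of the rule only; select_schema is a hypothetical helper and does not reflect how rustpix implements the choice internally.

```python
VALID_FORMATS = ("hdf5", "sns-hdf5")  # the two values accepted by --format


def select_schema(output_path, fmt=None):
    """Pick an export schema from an explicit format or the file extension."""
    if fmt is not None:
        if fmt not in VALID_FORMATS:
            raise ValueError(f"unknown format: {fmt!r}")
        return fmt  # an explicit format takes precedence over the extension
    if output_path.endswith(".nxs.h5"):
        return "sns-hdf5"
    return "hdf5"


generic = select_schema("output.h5")
sns = select_schema("output.nxs.h5")
forced = select_schema("output.h5", fmt="sns-hdf5")
```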

Implementation Notes

Chunking Strategy

  • Events: Chunk along event dimension, 50k–200k events per chunk
  • Histograms: Chunk along slowest-changing dimensions (e.g., rot_angle)

Compression

  • Start with gzip level 1–4 as a balance between file size and I/O throughput
  • Use shuffle + compression for integer datasets
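
As a rough h5py sketch of these recommendations, the helper below creates a resizable, chunked event dataset with the shuffle filter and gzip level 1. The chunk size of 100,000 events is simply a value inside the suggested range, and the helper name, in-memory file, and sample data are assumptions for illustration.

```python
import h5py
import numpy as np

CHUNK_EVENTS = 100_000  # inside the suggested 50k-200k range


def create_event_dataset(group, name, data, units):
    """Create a 1-D event dataset, chunked along the event dimension."""
    data = np.asarray(data)
    chunk = (min(CHUNK_EVENTS, max(1, data.shape[0])),)
    ds = group.create_dataset(
        name,
        data=data,
        chunks=chunk,
        maxshape=(None,),      # resizable, so events can be appended in batches
        shuffle=True,          # byte shuffle helps integer data compress
        compression="gzip",
        compression_opts=1,    # low gzip level favors write speed
    )
    ds.attrs["units"] = units
    return ds


# In-memory file so the sketch leaves nothing on disk.
with h5py.File("events.h5", "w", driver="core", backing_store=False) as f:
    hits = f.create_group("entry/hits")
    hits.attrs["NX_class"] = "NXevent_data"
    tof = np.arange(250_000, dtype=np.uint64)
    ds = create_event_dataset(hits, "event_time_offset", tof, "ns")
    chunk_shape = ds.chunks
    codec = ds.compression
    level = ds.compression_opts
    shuffled = ds.shuffle
```

Leaving the event dimension unlimited is what allows a bounded-memory writer to resize the dataset and append one batch at a time instead of holding all events in memory.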

Data Types

  • Use u64 for timestamps in ns to prevent overflow (a u32 nanosecond counter wraps after about 4.29 s)
  • Use f64 for coordinates requiring sub-pixel precision