HDF5 Schema
This document defines the on-disk HDF5 layout for rustpix event data and histograms. The schema is designed for scipp compatibility via NeXus, using NXevent_data for events and NXdata for histograms.
Goals
- Bounded-memory processing for large TPX3 datasets
- scipp-compatible layout via NeXus (NXevent_data + NXdata)
- Clear units and metadata to support TOF ↔ eV conversion
- Optional fields (tot, chip_id, cluster_id) are truly optional
File Structure
/
  rustpix_format_version = "0.1"
  entry/ (NXentry)
    hits/ (NXevent_data) [optional]
    neutrons/ (NXevent_data) [optional]
    histogram/ (NXdata) [optional]
    metadata/ (group) [optional]
File-Level Conventions
- Root group has attribute: rustpix_format_version = "0.1"
- All groups use NX_class attributes where applicable
- Units are stored as dataset attributes: units = "ns", "pixel", "deg", etc.
- Endianness is native (HDF5 handles portability)
Event Data (NXevent_data)
Event groups follow the NeXus NXevent_data base class, used by SNS/ISIS event files and expected by Mantid.
Required Datasets
| Name | Type | Shape | Units | Description |
|---|---|---|---|---|
| event_id | i32 | (N) | id | Detector element ID |
| event_time_offset | u64 | (N) | ns | Time-of-flight relative to pulse |
Pulse Indexing (for pulsed sources)
| Name | Type | Shape | Units | Description |
|---|---|---|---|---|
| event_time_zero | u64 | (J) | ns | Start time of each pulse |
| event_index | i32 | (J) | id | Index into event arrays |
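To illustrate how the pulse-indexing arrays relate to the event arrays, the sketch below slices out the events of one pulse. It assumes the usual NeXus convention that event_index[j] is the position of the first event of pulse j, so pulse j covers event_index[j] up to event_index[j+1] and the last pulse runs to the end of the event arrays. The helper name events_for_pulse and all sample values are illustrative, not part of rustpix.

```python
def events_for_pulse(event_index, n_events, j):
    """Return the (start, stop) slice of the event arrays belonging to pulse j.

    Assumes event_index[j] is the index of the first event of pulse j.
    """
    start = event_index[j]
    # The last pulse extends to the end of the event arrays.
    stop = event_index[j + 1] if j + 1 < len(event_index) else n_events
    return start, stop


# Illustrative values: J = 3 pulses, N = 7 events in total.
event_time_zero = [0, 16_666_667, 33_333_333]            # ns, one per pulse
event_index = [0, 3, 3]                                   # pulse 1 has no events
event_time_offset = [120, 450, 900, 80, 300, 610, 990]    # ns, one per event

pulse_slices = [
    events_for_pulse(event_index, len(event_time_offset), j)
    for j in range(len(event_index))
]
```

An empty pulse shows up as a slice whose start equals its stop, which is why event_index may contain repeated values.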
Optional Datasets
| Name | Type | Shape | Units | Description |
|---|---|---|---|---|
| time_over_threshold | u64 | (N) | ns | ToT in nanoseconds |
| chip_id | u8 | (N) | id | Chip identifier |
| cluster_id | i32 | (N) | id | Cluster assignment |
| n_hits | u16 | (N) | count | Hits per neutron |
| x | u16 | (N) | pixel | Global pixel X |
| y | u16 | (N) | pixel | Global pixel Y |
Cluster ID Convention
- cluster_id >= 0: Valid cluster index
- cluster_id = -1: Unclustered / noise
Event ID Mapping
For imaging data, event_id maps to pixel coordinates:
event_id = y * x_size + x
Group attributes x_size and y_size define the detector dimensions.
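The mapping and its inverse can be sketched as follows. The function names are hypothetical, and the x_size of 514 used in the example is only an illustrative value; a reader should take x_size from the group attribute.

```python
def encode_event_id(x, y, x_size):
    """Flatten pixel coordinates into event_id = y * x_size + x."""
    return y * x_size + x


def decode_event_id(event_id, x_size):
    """Recover (x, y) from an event_id written with the same x_size."""
    y, x = divmod(event_id, x_size)
    return x, y


# Illustrative detector width; read the real value from the x_size attribute.
x_size = 514
event_id = encode_event_id(x=3, y=2, x_size=x_size)
x, y = decode_event_id(event_id, x_size)
```

Decoding only needs x_size; y_size is useful for bounds checks and for sizing images.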
Histogram Data (NXdata)
Histogram data is stored in a single NXdata group named histogram.
Group Attributes
NX_class = "NXdata"
signal = "counts"
axes = ["rot_angle", "y", "x", "time_of_flight"]
rot_angle_indices = 0
y_indices = 1
x_indices = 2
time_of_flight_indices = 3
Required Datasets
| Name | Type | Shape | Units | Description |
|---|---|---|---|---|
| counts | u64 | (R, Y, X, E) | count | Histogram counts |
| rot_angle | f64 | (R) | deg | Rotation angle |
| y | f64 | (Y) | pixel | Y axis |
| x | f64 | (X) | pixel | X axis |
| time_of_flight | f64 | (E) | ns | TOF axis |
Axis Representation
- Centers: axis length = N, axis_mode = "centers"
- Edges: axis length = N+1, axis_mode = "edges"
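A reader can cross-check an axis against the matching dimension of counts. The sketch below is a minimal illustration with hypothetical helper names; it infers the mode from the lengths alone and converts edges to centers by taking midpoints, which assumes that bin centers are midpoints.

```python
def infer_axis_mode(axis_len, n_bins):
    """Return "centers" or "edges" for an axis describing n_bins bins."""
    if axis_len == n_bins:
        return "centers"
    if axis_len == n_bins + 1:
        return "edges"
    raise ValueError(
        f"axis length {axis_len} matches neither {n_bins} nor {n_bins + 1}"
    )


def edges_to_centers(edges):
    """Midpoint of each pair of neighbouring bin edges."""
    return [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]


# Illustrative TOF axis in ns for a histogram with 2 bins along that dimension.
tof_edges = [0.0, 10.0, 20.0]
mode = infer_axis_mode(len(tof_edges), n_bins=2)
tof_centers = edges_to_centers(tof_edges)
```

Comparing the inferred mode with the stored axis_mode value is a cheap consistency check when opening a file.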
Optional Energy Axis
If flight_path_m and tof_offset_ns are provided, the histogram group may also contain:
| Name | Type | Shape | Units | Description |
|---|---|---|---|---|
| energy_eV | f64 | (E) | eV | Derived energy axis |
Conversion Metadata
Stored as attributes at /entry:
| Attribute | Type | Description |
|---|---|---|
| flight_path_m | f64 | Effective flight path length |
| tof_offset_ns | f64 | Instrument TOF window shift |
| energy_axis_kind | string | Typically "tof" |
TOF to Energy Conversion
Using the non-relativistic relation:
E = (m_n / 2) * (L / t)²
where:
t = (event_time_offset + tof_offset_ns) * 1e-9 [seconds]
L = flight_path_m [meters]
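m_n = neutron mass [kilograms]
With these SI units E is in joules; divide by the elementary charge (1.602176634e-19 J per eV) to obtain energy_eV.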
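The conversion can be sketched in a few lines of Python. The function name is hypothetical and the constants are the CODATA 2018 neutron mass and the exact SI value of the electronvolt in joules; this is an illustration of the formula above, not rustpix's implementation.

```python
NEUTRON_MASS_KG = 1.67492749804e-27   # CODATA 2018 neutron mass
JOULES_PER_EV = 1.602176634e-19       # exact by SI definition


def tof_to_energy_ev(event_time_offset_ns, flight_path_m, tof_offset_ns=0.0):
    """Non-relativistic neutron energy in eV: E = (m_n / 2) * (L / t)^2."""
    t = (event_time_offset_ns + tof_offset_ns) * 1e-9  # seconds
    if t <= 0:
        raise ValueError("corrected time of flight must be positive")
    velocity = flight_path_m / t                        # metres per second
    energy_joules = 0.5 * NEUTRON_MASS_KG * velocity**2
    return energy_joules / JOULES_PER_EV


# Illustrative check: 22 m in 10 ms is 2200 m/s, a thermal neutron
# with an energy of roughly 25.3 meV.
thermal_energy_ev = tof_to_energy_ev(10_000_000, flight_path_m=22.0)
```

Because E scales as 1/t², an increasing TOF axis maps to a decreasing energy axis, so energy_eV runs in the opposite direction to time_of_flight.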
Metadata Group
/entry/metadata may contain:
- Detector config (chip transforms, pixel size)
- Clustering config
- Extraction config
- Processing provenance (git sha, rustpix version)
- Instrument context (facility, run ID)
Preferred storage: UTF-8 string dataset named metadata_json.
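As a rough illustration, the JSON payload could be produced as below. Every key name and value here is a hypothetical placeholder; the schema only fixes the dataset name (metadata_json) and its UTF-8 string encoding.

```python
import json

# Hypothetical key names and placeholder values, for illustration only.
metadata = {
    "detector": {"pixel_size_um": 55.0},
    "clustering": {"algorithm": "example"},
    "provenance": {"git_sha": "0000000", "rustpix_version": "0.0.0"},
    "instrument": {"facility": "example-facility", "run_id": "example-run"},
}

# sort_keys gives a stable representation across runs; ensure_ascii=False
# keeps non-ASCII text as real UTF-8 instead of \u escapes.
metadata_json = json.dumps(metadata, sort_keys=True, ensure_ascii=False)

# Bytes that would be stored in the string dataset, and the reader side.
encoded = metadata_json.encode("utf-8")
round_trip = json.loads(encoded.decode("utf-8"))
```

Keeping everything in one JSON string lets the metadata evolve without changing the HDF5 layout.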
ORNL SNS/HFIR NXsnsevent Schema
rustpix also supports export in the ORNL SNS/HFIR NXsnsevent schema used by instruments such as VENUS (BL10). This is a separate export mode, available in the GUI (as "HDF5 (SNS NXsnsevent)") and in the CLI (via the .nxs.h5 file extension).
SNS File Structure
/
  entry/ (NXentry, definition="NXsnsevent")
    run_number (string)
    experiment_identifier (string, e.g. "IPTS-35004")
    start_time (string, ISO 8601)
    end_time (string, ISO 8601)
    duration (f64, seconds)
    proton_charge (f64, picoCoulombs)
    total_counts (u64)
    total_pulses (u64)
    bank{N}_events/ (NXevent_data)
      event_id (u32, N) [units=""]
      event_time_offset (f32, N) [units="microsecond"]
      event_time_zero (f64, P) [units="second"]
      event_index (u64, P)
      total_counts (u64, scalar)
    instrument/ (NXinstrument)
      name (string, e.g. "VENUS")
      beamline (string, e.g. "BL10")
      bank{N}/ hard link to /entry/bank{N}_events
    DASlogs/ (NXcollection)
    sample/ (NXsample)
Key Differences from Generic Schema
| Aspect | Generic rustpix | SNS NXsnsevent |
|---|---|---|
| Pixel ID | i32, y * x_size + x | u32, bank_offset + row * width + col |
| TOF | u64 nanoseconds | f32 microseconds |
| Pulse time | u64 nanoseconds | f64 seconds (from run start) |
| Event index | i32 | u64 |
| Groups | /entry/hits/, /entry/neutrons/ | /entry/bank{N}_events/ |
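The unit changes in the table can be sketched as below. The helper names are hypothetical, and run_start_ns is an assumption: it stands for whatever reference the generic pulse times are measured from (pass 0 if they are already relative to the run start). Rounding to single precision is emulated with the standard struct module.

```python
import struct


def to_f32(value):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def tof_ns_to_us_f32(tof_ns):
    """Generic u64 nanoseconds -> SNS f32 microseconds."""
    return to_f32(tof_ns / 1000.0)


def pulse_ns_to_seconds(pulse_ns, run_start_ns):
    """Generic u64 nanoseconds -> SNS f64 seconds from run start."""
    return (pulse_ns - run_start_ns) / 1e9


tof_us = tof_ns_to_us_f32(1_500)                                 # 1.5 us
pulse_s = pulse_ns_to_seconds(3_000_000_000, 1_000_000_000)      # 2.0 s
late_tof_us = tof_ns_to_us_f32(16_666_667)                       # rounded
```

A single-precision value has a 24-bit significand, so for TOF values between 16,384 and 32,768 microseconds the representable steps are about 0.002 microseconds (roughly 2 ns) apart. The conversion to the SNS schema is therefore slightly lossy for late events.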
VENUS Pixel ID Mapping
For VENUS bank100 (512x512 TPX3 detector):
event_id = 1,000,000 + row * 512 + col
Range: 1,000,000 to 1,262,143
Note: rustpix internally uses 514x514 coordinates (with 2-pixel chip gaps), whereas the SNS schema uses a 512x512 pixel grid. The SNS writer handles the 514x514→512x512 gap-pixel remapping automatically via SnsBankConfig.gap_columns/gap_rows, so callers should provide logical detector coordinates and must not pre-remap them.
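The bank100 formula and its inverse can be sketched as follows. The function names are hypothetical, and row and col are coordinates on the 512x512 SNS grid, that is, after the writer's gap remapping. The sketch only illustrates the arithmetic; callers of the SNS writer still pass logical detector coordinates as described in the note.

```python
BANK_OFFSET = 1_000_000   # first pixel ID of VENUS bank100
WIDTH = 512
HEIGHT = 512


def venus_event_id(row, col):
    """event_id = 1,000,000 + row * 512 + col on the 512x512 grid."""
    if not (0 <= row < HEIGHT and 0 <= col < WIDTH):
        raise ValueError(f"({row}, {col}) is outside the 512x512 grid")
    return BANK_OFFSET + row * WIDTH + col


def venus_row_col(event_id):
    """Inverse mapping: recover (row, col) from a bank100 event_id."""
    return divmod(event_id - BANK_OFFSET, WIDTH)


first_id = venus_event_id(0, 0)
last_id = venus_event_id(HEIGHT - 1, WIDTH - 1)
```

The first and last IDs reproduce the documented range of 1,000,000 to 1,262,143.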
Format Selection Guide
rustpix supports two HDF5 export schemas. Choose based on your use case:
| | Generic NeXus (.h5) | SNS NXsnsevent (.nxs.h5) |
|---|---|---|
| Best for | Analysis with scipp, Mantid, custom tools | Compatibility with ORNL SNS/HFIR workflows |
| Schema | NXevent_data | NXsnsevent |
| TOF units | Nanoseconds (u64) | Microseconds (f32) |
| Pixel ID | y * x_size + x | bank_offset + row * width + col |
| Run metadata | Minimal | Full (run number, IPTS, proton charge, timestamps) |
| Instrument info | None | Instrument name, beamline |
How to Select
CLI:
- Generic NeXus: rustpix process input.tpx3 -o output.h5
- SNS NXsnsevent: rustpix process input.tpx3 -o output.nxs.h5 --run-number 12345
- Or use --format to override: -f hdf5 or -f sns-hdf5
GUI:
- File > Export, then choose "HDF5 (NeXus)" or "HDF5 (SNS NXsnsevent)"
Python:
- Generic NeXus: output_path="neutrons.h5"
- SNS NXsnsevent: output_path="output.nxs.h5"
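The extension-based selection described above can be sketched as a small function. This is a hypothetical illustration of the rule, not rustpix's actual code; the only names taken from this document are the format values hdf5 and sns-hdf5 and the two file extensions.

```python
def select_schema(output_path, format_override=None):
    """Pick the export schema from an explicit override or the file extension."""
    if format_override is not None:
        if format_override not in ("hdf5", "sns-hdf5"):
            raise ValueError(f"unknown format: {format_override!r}")
        return format_override
    # Check the longer suffix first: "out.nxs.h5" also ends with ".h5".
    if output_path.endswith(".nxs.h5"):
        return "sns-hdf5"
    if output_path.endswith(".h5"):
        return "hdf5"
    raise ValueError(f"cannot infer format from {output_path!r}")


generic = select_schema("output.h5")
sns = select_schema("output.nxs.h5")
forced = select_schema("output.h5", format_override="sns-hdf5")
```

The order of the suffix checks matters, because every .nxs.h5 path is also a .h5 path.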
Implementation Notes
Chunking Strategy
- Events: Chunk along event dimension, 50k–200k events per chunk
- Histograms: Chunk along slowest-changing dimensions (e.g., rot_angle)
Compression
- Start with gzip level 1–4 for balanced I/O
- Use shuffle + compression for integer datasets
Data Types
- Use u64 for timestamps in ns to prevent overflow
- Use f64 for coordinates requiring sub-pixel precision