Introduction
Rustpix is a high-performance pixel detector data processing library for neutron imaging. It parses Timepix3 (TPX3) data at more than 96 million hits per second and provides multiple clustering algorithms, centroid extraction, and Python bindings.
Features
- Fast TPX3 Processing: Parallel packet parsing with memory-mapped I/O
- Multiple Clustering Algorithms:
  - ABS (Adjacency-Based Search) - 8-connectivity clustering
  - DBSCAN - Density-based with spatial indexing
  - Grid - Parallel grid-based clustering
- Streaming Architecture: Process files larger than RAM
- Python Bindings: Thin wrappers with NumPy integration
- CLI Tool: Command-line interface for batch processing
- GUI Application: Interactive analysis with real-time visualization
- Multiple Output Formats: HDF5, Arrow, CSV
Performance
- Throughput: 96M+ hits/sec for TPX3 parsing on modern hardware
- Memory: Streaming architecture processes files larger than RAM
- Parallel: Multi-threaded clustering with rayon
- Optimized: SIMD-friendly data layouts (SoA)
Getting Started
Choose the interface that best fits your workflow:
- Python API - For scripting and integration with scientific Python stack
- CLI Tool - For batch processing and shell scripts
- GUI Application - For interactive exploration and analysis
Workspace Structure
Rustpix is organized as a Rust workspace with multiple crates:
| Crate | Description |
|---|---|
| rustpix-core | Core traits and types |
| rustpix-tpx | TPX3 packet parser and hit types |
| rustpix-algorithms | Clustering algorithms (ABS, DBSCAN, Graph, Grid) |
| rustpix-io | File I/O with memory-mapped reading |
| rustpix-python | Python bindings (PyO3) |
| rustpix-cli | Command-line interface |
| rustpix-gui | GUI application (egui) |
Installation
Rustpix can be installed in several ways depending on your needs:
| Method | Best For |
|---|---|
| Python (pip) | Python scripting, Jupyter notebooks |
| macOS (Homebrew) | GUI application on macOS |
| Rust (cargo) | CLI tool, Rust library development |
| From Source | Development, custom builds |
Quick Install
Python Users
pip install rustpix
macOS Users (GUI)
brew tap ornlneutronimaging/rustpix
brew install --cask rustpix
Rust Users (CLI)
cargo install rustpix-cli
System Requirements
- Python: 3.11 or later
- macOS: Big Sur (11.0) or later, Apple Silicon (ARM64)
- Rust: 1.70 or later (for building from source)
Python Installation
pip
The recommended way to install rustpix for Python is via pip:
pip install rustpix
This installs pre-built wheels for:
- Linux (x86_64, glibc 2.28+)
- macOS (ARM64 and x86_64)
- Windows (x86_64)
Verify Installation
import rustpix
print(rustpix.__version__)
Virtual Environment (Recommended)
We recommend using a virtual environment:
# Create environment
python -m venv .venv
source .venv/bin/activate # Linux/macOS
# or: .venv\Scripts\activate # Windows
# Install
pip install rustpix
With Scientific Stack
For data analysis workflows, install alongside NumPy and other tools:
pip install rustpix numpy matplotlib h5py
Jupyter Notebooks
Rustpix works well in Jupyter notebooks:
pip install rustpix jupyterlab
jupyter lab
Troubleshooting
Python Version
Rustpix requires Python 3.11 or later. Check your version:
python --version
Wheels Not Available
If no wheel is available for your platform, pip will attempt to build from source, which requires Rust. See From Source for build instructions.
macOS Installation (Homebrew)
GUI Application
Install the rustpix GUI application via Homebrew:
# Add the tap
brew tap ornlneutronimaging/rustpix
# Install the GUI app
brew install --cask rustpix
Launch
After installation, launch from:
- Spotlight: Search for "Rustpix"
- Applications: Find Rustpix in /Applications
- Terminal: open -a Rustpix
Requirements
- macOS Big Sur (11.0) or later
- Apple Silicon (ARM64) architecture
Note: Intel Mac support is available via the CLI tool or Python package.
Updating
brew upgrade --cask rustpix
Uninstalling
brew uninstall --cask rustpix
Gatekeeper Notice
On first launch, macOS may show a security warning. The Homebrew installation automatically handles the quarantine attribute, but if you see a warning:
- Go to System Preferences > Security & Privacy
- Click Open Anyway for Rustpix
Rust Installation (cargo)
CLI Tool
Install the command-line interface via cargo:
cargo install rustpix-cli
This installs the rustpix binary to ~/.cargo/bin/.
Verify Installation
rustpix --version
rustpix --help
Library Usage
Add rustpix crates to your Rust project:
# Core types and traits
cargo add rustpix-core
# Clustering algorithms
cargo add rustpix-algorithms
# TPX3 parsing
cargo add rustpix-tpx
# File I/O
cargo add rustpix-io
Example Cargo.toml
[dependencies]
rustpix-core = "1.0"
rustpix-algorithms = "1.0"
rustpix-tpx = "1.0"
rustpix-io = "1.0"
API Documentation
Rust API documentation for each crate is available on docs.rs.
Requirements
- Rust 1.70 or later
- For HDF5 support: HDF5 libraries (automatically handled via static linking)
Building From Source
Prerequisites
Clone Repository
git clone https://github.com/ornlneutronimaging/rustpix
cd rustpix
Using Pixi (Recommended)
Pixi manages all dependencies automatically:
# Install pixi
curl -fsSL https://pixi.sh/install.sh | bash
# Install dependencies and build
pixi install
pixi run build
Available Tasks
pixi run test # Run all tests
pixi run clippy # Run linter
pixi run gui # Launch GUI (release mode)
pixi run gui-debug # Launch GUI (debug mode)
pixi run docs # Build Rust documentation
Using Cargo
Build without pixi:
# Build all crates
cargo build --release --workspace
# Run tests
cargo test --workspace
# Build CLI
cargo build --release -p rustpix-cli
# Build GUI
cargo build --release -p rustpix-gui
Python Bindings
Build Python package with maturin:
# Install maturin
pip install maturin
# Build and install in development mode
cd rustpix-python
maturin develop --release
Or build a wheel:
maturin build --release -m rustpix-python/Cargo.toml
pip install target/wheels/rustpix-*.whl
Development Setup
For iterative development, use the debug build, which compiles faster:
pixi install
pixi run gui-debug # Faster builds, debug symbols
Troubleshooting
HDF5 Linking Errors
HDF5 is statically linked by default. If you encounter issues:
# Ensure HDF5 dev packages are installed
# Ubuntu/Debian:
sudo apt install libhdf5-dev
# macOS:
brew install hdf5
Python Not Found
Ensure Python 3.11+ is available:
python3 --version
# or with pixi:
pixi run python --version
Python API
The rustpix Python package provides thin wrappers around the high-performance Rust core. Data is returned as NumPy arrays or PyArrow Tables for seamless integration with the scientific Python ecosystem.
Overview
| Function | Description |
|---|---|
| read_tpx3_hits | Read all hits from a TPX3 file |
| stream_tpx3_hits | Stream hits in batches |
| process_tpx3_neutrons | Process hits into neutron events |
| stream_tpx3_neutrons | Stream neutron events in batches |
| cluster_hits | Cluster an existing HitBatch |
Data Types
HitBatch
Contains raw detector hits with the following fields:
| Field | Type | Description |
|---|---|---|
| x | uint16 | X coordinate (pixels) |
| y | uint16 | Y coordinate (pixels) |
| tof | uint32 | Time-of-flight (25 ns ticks) |
| tot | uint16 | Time-over-threshold (charge proxy) |
| timestamp | uint32 | Raw timestamp |
| chip_id | uint8 | Detector chip ID |
| cluster_id | int32 | Cluster assignment (-1 if unclustered) |
Note: The tof field is stored in 25 ns tick units for efficiency. To convert to nanoseconds: tof_ns = tof * 25
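The tick conversion can be sketched in plain Python as below. The helper names are illustrative and are not part of the rustpix API; only the 25 ns tick size comes from the note above.

```python
TICK_NS = 25  # one tof tick is 25 ns, as stated in the note above


def tof_ticks_to_ns(ticks: int) -> int:
    """Convert a tof value from 25 ns ticks to nanoseconds."""
    return ticks * TICK_NS


def tof_ticks_to_ms(ticks: int) -> float:
    """Convert a tof value from 25 ns ticks to milliseconds."""
    return ticks * TICK_NS / 1_000_000


print(tof_ticks_to_ns(4))  # 100
```

If tof is a NumPy uint32 array, cast it to uint64 before multiplying (for example tof.astype("uint64") * 25), because values above roughly 171 million ticks would otherwise wrap around in 32-bit arithmetic.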
NeutronBatch
Contains processed neutron events:
| Field | Type | Description |
|---|---|---|
| x | float64 | Centroid X (sub-pixel resolution) |
| y | float64 | Centroid Y (sub-pixel resolution) |
| tof | uint32 | Time-of-flight (25 ns ticks) |
| tot | uint16 | Total charge (sum of hit ToT) |
| n_hits | uint16 | Number of hits in cluster |
| chip_id | uint8 | Detector chip ID |
Note: The tof field is stored in 25 ns tick units. To convert to nanoseconds: tof_ns = tof * 25
Output Formats
Both HitBatch and NeutronBatch support:
# Convert to NumPy dict of arrays
data = batch.to_numpy()
# Convert to PyArrow Table (requires pyarrow)
table = batch.to_arrow()
Algorithms
Three clustering algorithms are available:
| Algorithm | Description | Best For |
|---|---|---|
| abs | Adjacency-Based Search (8-connectivity) | General use, balanced |
| dbscan | Density-based spatial clustering | Noisy data |
| grid | Parallel grid-based clustering | Large datasets |
Specify with the algorithm keyword argument:
neutrons = rustpix.process_tpx3_neutrons(
    "data.tpx3",
    algorithm="abs"  # or "dbscan", "grid"
)
Next Steps
- Quick Start - Basic usage examples
- Configuration - Detailed configuration options
Quick Start
Reading Hits
Load all hits from a TPX3 file into memory:
import rustpix
# Read all hits
hits = rustpix.read_tpx3_hits("data.tpx3")
# Convert to NumPy arrays
data = hits.to_numpy()
print(f"Loaded {len(data['x'])} hits")
# Access individual arrays
x = data['x'] # uint16
y = data['y'] # uint16
tof = data['tof'] # uint32, 25ns ticks (multiply by 25 for nanoseconds)
tot = data['tot'] # uint16
Streaming Hits
For large files, stream hits in batches:
import rustpix
for batch in rustpix.stream_tpx3_hits("large_data.tpx3"):
    data = batch.to_numpy()
    process_batch(data)
Processing Neutrons
Convert hits to neutron events using clustering:
import rustpix
# Configure clustering
clustering = rustpix.ClusteringConfig(
    radius=5.0,               # spatial epsilon (pixels)
    temporal_window_ns=75.0,  # temporal epsilon (nanoseconds)
    min_cluster_size=1
)
# Configure centroid extraction
extraction = rustpix.ExtractionConfig(
    super_resolution_factor=8.0,
    weighted_by_tot=True,
    min_tot_threshold=10
)
# Process file (returns single batch)
neutrons = rustpix.process_tpx3_neutrons(
    "data.tpx3",
    clustering_config=clustering,
    extraction_config=extraction,
    algorithm="abs",
    collect=True
)
# Convert to NumPy
data = neutrons.to_numpy()
print(f"Found {len(data['x'])} neutron events")
Streaming Neutrons
Stream neutron events for large files:
import rustpix
clustering = rustpix.ClusteringConfig(radius=5.0, temporal_window_ns=75.0)
# Stream neutrons (default mode)
for batch in rustpix.stream_tpx3_neutrons(
    "large_data.tpx3",
    clustering_config=clustering
):
    data = batch.to_numpy()
    save_batch(data)
Or use process_tpx3_neutrons without collect=True:
# Streaming is the default
for batch in rustpix.process_tpx3_neutrons(
    "large_data.tpx3",
    clustering_config=clustering
):
    process_batch(batch.to_numpy())
Clustering Hits
Cluster an existing HitBatch:
import rustpix
# Read hits
hits = rustpix.read_tpx3_hits("data.tpx3")
# Cluster
clustering = rustpix.ClusteringConfig(radius=5.0, temporal_window_ns=75.0)
neutrons = rustpix.cluster_hits(
    hits,
    clustering_config=clustering,
    algorithm="dbscan"
)
data = neutrons.to_numpy()
PyArrow Integration
Export to PyArrow for Parquet, Arrow IPC, or DataFrame conversion:
import rustpix
neutrons = rustpix.process_tpx3_neutrons("data.tpx3", collect=True)
# Convert to PyArrow Table
table = neutrons.to_arrow()
# Save as Parquet
import pyarrow.parquet as pq
pq.write_table(table, "neutrons.parquet")
# Convert to Pandas
df = table.to_pandas()
VENUS Detector Defaults
For VENUS detector at SNS:
import rustpix
# Use VENUS-specific defaults
detector = rustpix.DetectorConfig.venus_defaults()
clustering = rustpix.ClusteringConfig.venus_defaults()
extraction = rustpix.ExtractionConfig.venus_defaults()
neutrons = rustpix.process_tpx3_neutrons(
    "venus_data.tpx3",
    detector_config=detector,
    clustering_config=clustering,
    extraction_config=extraction,
    collect=True
)
Out-of-Core Processing
For files larger than RAM:
import rustpix
# Configure memory-bounded processing
for batch in rustpix.stream_tpx3_neutrons(
    "huge_file.tpx3",
    clustering_config=rustpix.ClusteringConfig(),
    memory_fraction=0.5,  # Use up to 50% of RAM
    parallelism=4,        # Worker threads
    async_io=True         # Async reader pipeline
):
    save_batch(batch.to_numpy())
Configuration
DetectorConfig
Configure detector-specific parameters:
import rustpix
config = rustpix.DetectorConfig(
    tdc_frequency_hz=60.0,  # TDC frequency (Hz)
    enable_missing_tdc_correction=True,
    chip_size_x=256,        # Chip width in pixels
    chip_size_y=256,        # Chip height in pixels
    chip_transforms=None    # Custom chip transformations
)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| tdc_frequency_hz | float | 60.0 | TDC frequency in Hz |
| enable_missing_tdc_correction | bool | True | Correct for missing TDC packets |
| chip_size_x | int | 256 | Chip width in pixels |
| chip_size_y | int | 256 | Chip height in pixels |
| chip_transforms | list | None | Chip coordinate transformations |
Chip Transforms
Chip transforms are 2x2 affine matrices plus translation:
# Transform tuple: (a, b, c, d, tx, ty)
# x' = a*x + b*y + tx
# y' = c*x + d*y + ty
config = rustpix.DetectorConfig(
    chip_transforms=[
        (1, 0, 0, 1, 0, 0),    # Identity for chip 0
        (1, 0, 0, 1, 256, 0),  # Chip 1 offset by 256 in X
    ]
)
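To see what a transform tuple does to a coordinate, the two equations above can be applied by hand. The following is a minimal sketch in plain Python; apply_chip_transform is an illustrative helper, not a rustpix function, and the 180-degree rotation tuple is a hypothetical example for a 256x256 chip.

```python
from typing import Tuple

Transform = Tuple[int, int, int, int, int, int]  # (a, b, c, d, tx, ty)


def apply_chip_transform(t: Transform, x: int, y: int) -> Tuple[int, int]:
    """Map chip-local (x, y) to global coordinates using x' = a*x + b*y + tx, y' = c*x + d*y + ty."""
    a, b, c, d, tx, ty = t
    return a * x + b * y + tx, c * x + d * y + ty


shift_x = (1, 0, 0, 1, 256, 0)     # chip placed 256 pixels to the right
rot180 = (-1, 0, 0, -1, 255, 255)  # hypothetical: chip mounted upside down

print(apply_chip_transform(shift_x, 10, 20))  # (266, 20)
print(apply_chip_transform(rot180, 0, 0))     # (255, 255)
```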
Presets
# VENUS detector defaults
config = rustpix.DetectorConfig.venus_defaults()
# Load from JSON
config = rustpix.DetectorConfig.from_json('{"tdc_frequency_hz": 60.0}')
ClusteringConfig
Configure the clustering algorithm:
config = rustpix.ClusteringConfig(
    radius=5.0,
    temporal_window_ns=75.0,
    min_cluster_size=1,
    max_cluster_size=None
)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| radius | float | 5.0 | Spatial epsilon in pixels |
| temporal_window_ns | float | 75.0 | Temporal epsilon in nanoseconds |
| min_cluster_size | int | 1 | Minimum hits per cluster |
| max_cluster_size | int | None | Maximum hits per cluster (optional) |
Tuning Tips
- radius: Larger values merge more hits. Start with 5.0 for typical neutron events.
- temporal_window_ns: Should match detector timing characteristics. 75ns works for most TPX3 setups.
- min_cluster_size: Set to 2+ to filter noise (single-hit events).
- max_cluster_size: Set to filter large background events (e.g., gamma showers).
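The effect of the two size limits can be illustrated on a plain list of cluster labels. This is a sketch of the filtering idea only, not the rustpix implementation; the function name is hypothetical, and it follows the convention that a label of -1 means unclustered.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def clusters_passing_size_limits(
    cluster_ids: Iterable[int],
    min_cluster_size: int = 1,
    max_cluster_size: Optional[int] = None,
) -> Dict[int, int]:
    """Return {cluster_id: hit_count} for clusters whose size is within the limits."""
    sizes = Counter(cid for cid in cluster_ids if cid >= 0)  # skip unclustered hits
    return {
        cid: n
        for cid, n in sizes.items()
        if n >= min_cluster_size and (max_cluster_size is None or n <= max_cluster_size)
    }


labels = [0, 0, 0, 1, 2, 2, -1, 3, 3, 3, 3, 3, 3]
print(clusters_passing_size_limits(labels, min_cluster_size=2, max_cluster_size=5))
# {0: 3, 2: 2}
```

Here cluster 1 is dropped as a single-hit event and cluster 3 is dropped as too large.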
ExtractionConfig
Configure centroid extraction:
config = rustpix.ExtractionConfig(
    super_resolution_factor=8.0,
    weighted_by_tot=True,
    min_tot_threshold=10
)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| super_resolution_factor | float | 8.0 | Sub-pixel resolution multiplier |
| weighted_by_tot | bool | True | Weight centroid by ToT (charge) |
| min_tot_threshold | int | 10 | Filter hits below this ToT |
Super Resolution
The super_resolution_factor controls sub-pixel precision:
- 1.0: Integer pixel coordinates
- 8.0: 1/8 pixel precision (default)
- 16.0: 1/16 pixel precision
ToT Weighting
When weighted_by_tot=True, the centroid is computed as:
x_centroid = Σ(x_i * tot_i) / Σ(tot_i)
y_centroid = Σ(y_i * tot_i) / Σ(tot_i)
This improves resolution by weighting toward hits with higher charge deposition.
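The centroid formula can be checked with a few hits in plain Python. This is a minimal sketch of the arithmetic only; the function name is illustrative and it does not model min_tot_threshold or super_resolution_factor.

```python
from typing import Sequence, Tuple


def tot_weighted_centroid(hits: Sequence[Tuple[float, float, float]]) -> Tuple[float, float]:
    """Compute the ToT-weighted centroid of hits given as (x, y, tot) tuples."""
    total = sum(tot for _, _, tot in hits)
    if total <= 0:
        raise ValueError("total ToT must be positive")
    cx = sum(x * tot for x, _, tot in hits) / total
    cy = sum(y * tot for _, y, tot in hits) / total
    return cx, cy


# Two hits in one row: the hit with three times the charge pulls the centroid toward x = 10.
print(tot_weighted_centroid([(10, 20, 30), (11, 20, 10)]))  # (10.25, 20.0)
```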
Algorithm-Specific Parameters
Pass algorithm parameters as keyword arguments:
# ABS algorithm
neutrons = rustpix.process_tpx3_neutrons(
    "data.tpx3",
    algorithm="abs",
    abs_scan_interval=1000,  # Scan interval for ABS
    collect=True
)
# DBSCAN algorithm
neutrons = rustpix.process_tpx3_neutrons(
    "data.tpx3",
    algorithm="dbscan",
    dbscan_min_points=2,  # Min points for core sample
    collect=True
)
# Grid algorithm
neutrons = rustpix.process_tpx3_neutrons(
    "data.tpx3",
    algorithm="grid",
    grid_cell_size=10,  # Grid cell size in pixels
    collect=True
)
Out-of-Core Processing Parameters
For streaming with memory constraints:
for batch in rustpix.stream_tpx3_neutrons(
    "huge_file.tpx3",
    clustering_config=rustpix.ClusteringConfig(),
    # Memory management
    out_of_core=True,                   # Enable (default: True for streaming)
    memory_fraction=0.5,                # Target 50% of available RAM
    memory_budget_bytes=4_000_000_000,  # Or explicit 4GB limit
    # Parallelism
    parallelism=4,                      # Worker threads
    queue_depth=8,                      # Pipeline queue depth
    async_io=True                       # Async I/O pipeline
):
    process_batch(batch)
| Parameter | Type | Default | Description |
|---|---|---|---|
| out_of_core | bool | True | Enable out-of-core processing |
| memory_fraction | float | 0.5 | Fraction of RAM to target |
| memory_budget_bytes | int | Auto | Explicit memory budget |
| parallelism | int | CPU count | Worker thread count |
| queue_depth | int | 8 | Pipeline queue depth |
| async_io | bool | True | Enable async I/O |
Command Line Interface
The rustpix CLI provides batch processing capabilities for TPX3 files.
Installation
cargo install rustpix-cli
Or build from source:
cargo build --release -p rustpix-cli
Commands
| Command | Description |
|---|---|
| process | Process TPX3 files to extract neutron events |
| info | Show information about a TPX3 file |
| benchmark | Benchmark clustering algorithms |
Quick Examples
# Process a file
rustpix process input.tpx3 -o output.csv
# Show file info
rustpix info input.tpx3
# Benchmark algorithms
rustpix benchmark input.tpx3
# Get help
rustpix --help
rustpix process --help
Output Formats
The output format is determined by file extension:
| Extension | Format |
|---|---|
| .csv | Comma-separated values with header |
| .bin, .dat | Binary format (compact) |
| Other | Binary format (default) |
See Commands Reference for detailed usage.
Commands Reference
rustpix process
Process TPX3 files to extract neutron events.
rustpix process [OPTIONS] -o <OUTPUT> <INPUT>...
Arguments
| Argument | Description |
|---|---|
| <INPUT>... | Input TPX3 file(s) |
Options
| Option | Default | Description |
|---|---|---|
| -o, --output <PATH> | Required | Output file path |
| -a, --algorithm <ALGO> | abs | Clustering algorithm (abs, dbscan, grid) |
| --radius <FLOAT> | 5.0 | Spatial radius for clustering (pixels) |
| --temporal-window-ns <FLOAT> | 75.0 | Temporal window for clustering (nanoseconds) |
| --min-cluster-size <INT> | 1 | Minimum cluster size |
| --out-of-core <BOOL> | true | Enable out-of-core processing |
| --memory-fraction <FLOAT> | 0.5 | Fraction of available memory to use |
| --memory-budget-bytes <INT> | Auto | Explicit memory budget in bytes |
| --parallelism <INT> | Auto | Worker threads for processing |
| --queue-depth <INT> | 2 | Pipeline queue depth |
| --async-io <BOOL> | false | Enable async I/O pipeline |
| -v, --verbose | Off | Verbose output |
Examples
# Basic processing
rustpix process input.tpx3 -o output.csv
# Process multiple files
rustpix process file1.tpx3 file2.tpx3 -o combined.csv
# Use DBSCAN with custom parameters
rustpix process input.tpx3 -o output.csv \
--algorithm dbscan \
--radius 3.0 \
--temporal-window-ns 50.0
# Verbose output with parallel processing
rustpix process input.tpx3 -o output.bin \
--verbose \
--parallelism 8 \
--async-io true
# Memory-constrained processing
rustpix process huge_file.tpx3 -o output.csv \
--memory-fraction 0.3 \
--out-of-core true
rustpix info
Display information about a TPX3 file.
rustpix info <INPUT>
Example
$ rustpix info data.tpx3
File: data.tpx3
Size: 104857600 bytes (104.86 MB)
Packets: 6553600
Hits: 5242880
TOF range: 0 - 16666666
X range: 0 - 511
Y range: 0 - 511
rustpix benchmark
Benchmark clustering algorithms on a TPX3 file.
rustpix benchmark [OPTIONS] <INPUT>
Options
| Option | Default | Description |
|---|---|---|
| -i, --iterations <INT> | 3 | Number of benchmark iterations |
Example
$ rustpix benchmark data.tpx3 --iterations 5
Benchmarking with 5242880 hits, 5 iterations
Algorithm | Mean Time (ms) | Min Time (ms) | Max Time (ms)
-----------------------------------------------------------------
ABS | 245.32 | 238.45 | 256.78
DBSCAN | 1234.56 | 1198.23 | 1287.34
Grid | 312.45 | 298.12 | 334.56
rustpix out-of-core-benchmark
Benchmark out-of-core processing modes.
rustpix out-of-core-benchmark [OPTIONS] <INPUT>
Options
| Option | Default | Description |
|---|---|---|
| -a, --algorithm <ALGO> | abs | Clustering algorithm |
| --radius <FLOAT> | 5.0 | Spatial radius (pixels) |
| --temporal-window-ns <FLOAT> | 75.0 | Temporal window (ns) |
| --min-cluster-size <INT> | 1 | Minimum cluster size |
| -i, --iterations <INT> | 3 | Number of iterations |
| --memory-fraction <FLOAT> | 0.5 | Memory fraction |
| --parallelism <INT> | Auto | Worker threads |
| --queue-depth <INT> | 2 | Queue depth |
| --async-io <BOOL> | false | Enable async I/O |
Example
$ rustpix out-of-core-benchmark data.tpx3 --parallelism 4 --async-io true
Out-of-core benchmark (3 iterations)
Single-thread avg: 12.345s
Multi-thread avg: 4.567s (threads: 4, async: true)
Speedup: 2.70x
Environment Variables
The CLI respects standard environment variables:
| Variable | Description |
|---|---|
| RAYON_NUM_THREADS | Override default thread count for parallel processing |
GUI Application
The rustpix GUI provides interactive visualization and analysis of TPX3 data.
Installation
macOS (Homebrew)
brew tap ornlneutronimaging/rustpix
brew install --cask rustpix
From Source
# Using pixi
pixi run gui
# Using cargo
cargo run --release -p rustpix-gui
Features
- Interactive file loading: Open TPX3 files via file dialog or drag-and-drop
- Real-time visualization: View hits and neutron events on 2D detector maps
- Algorithm selection: Choose between ABS, DBSCAN, and Grid clustering
- Parameter tuning: Adjust clustering parameters with immediate visual feedback
- ROI selection: Define regions of interest for focused analysis
- Export options: Save processed data to HDF5, CSV, TIFF, and other formats
- Memory monitoring: Track memory usage during processing
Launching
macOS
- Spotlight: Search for "Rustpix"
- Applications: Find in /Applications
- Terminal: open -a Rustpix
From Source
# Release mode (faster)
pixi run gui
# Debug mode (faster compilation)
pixi run gui-debug
Workflow
1. Load Data
- Click File > Open or drag a .tpx3 file onto the window
- Wait for the file to load (progress shown in status bar)
- Raw hits appear in the visualization panel
2. Configure Processing
- Select clustering algorithm from the dropdown
- Adjust parameters:
  - Radius: Spatial clustering distance (pixels)
  - Temporal Window: Time clustering window (nanoseconds)
  - Min Cluster Size: Filter small clusters
3. Process
- Click Process to run clustering
- Neutron events appear in the visualization
- Statistics shown in the info panel
4. Analyze
- Pan/Zoom: Mouse wheel and drag to navigate
- ROI: Draw regions of interest for statistics
- Histogram: View ToF and spatial distributions
5. Export
- Click File > Export
- Choose format:
  - HDF5: Full data with metadata
  - CSV: Simple tabular export
  - TIFF: Image export
- Select output location
Keyboard Shortcuts
| Shortcut | Action |
|---|---|
| Cmd+O | Open file |
| Cmd+S | Save/Export |
| Cmd+Q | Quit |
| Space | Toggle processing |
| R | Reset view |
| Escape | Cancel ROI selection |
System Requirements
- macOS Big Sur (11.0) or later
- Apple Silicon (ARM64) recommended
- 8GB RAM minimum (16GB recommended for large files)
- OpenGL 3.3 or later
Troubleshooting
App Won't Open (macOS)
If macOS blocks the app:
- Go to System Preferences > Security & Privacy
- Click Open Anyway for Rustpix
Out of Memory
For very large files:
- Enable streaming mode in preferences
- Reduce the loaded time range
- Use the CLI for batch processing instead
Slow Rendering
- Reduce the number of displayed points (use downsampling)
- Close other applications to free GPU memory
- Try the CLI for processing, GUI for visualization only
Clustering Algorithms
Rustpix provides three clustering algorithms for grouping detector hits into neutron events. Each algorithm has different performance characteristics and is suited for different use cases.
Overview
| Algorithm | Complexity | Best For | Parallelism |
|---|---|---|---|
| ABS | O(n) average | General use, balanced performance | Single-threaded |
| DBSCAN | O(n log n) | Noisy data, irregular clusters | Single-threaded |
| Grid | O(n) | Large datasets, parallel processing | Multi-threaded |
ABS (Adjacency-Based Search)
The default algorithm. Uses 8-connectivity search to find adjacent pixels within temporal and spatial thresholds.
How It Works
- Hits are processed in time order
- For each hit, search for neighbors within radius and temporal window
- Group connected hits into clusters using flood-fill
- Periodically scan for completed clusters (configurable interval)
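The steps above can be sketched in plain Python. This is a simplified illustration, not the rustpix implementation: it uses union-find in place of flood fill (both yield connected components), omits the periodic scan for completed clusters, and assumes that "within radius" means Chebyshev distance, which matches 8-connectivity when radius is 1. The function name is hypothetical.

```python
from typing import Dict, List, Tuple

Hit = Tuple[int, int, float]  # (x, y, time_ns)


def cluster_adjacent(hits: List[Hit], radius: float = 5.0, window_ns: float = 75.0) -> List[int]:
    """Return one cluster label per hit; hits linked in space and time share a label."""
    order = sorted(range(len(hits)), key=lambda i: hits[i][2])  # process in time order
    parent = list(range(len(hits)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    start = 0  # left edge of the sliding time window into `order`
    for pos, i in enumerate(order):
        xi, yi, ti = hits[i]
        while ti - hits[order[start]][2] > window_ns:
            start += 1
        for j in order[start:pos]:  # only earlier hits still inside the window
            xj, yj, _ = hits[j]
            if max(abs(xi - xj), abs(yi - yj)) <= radius:
                parent[find(i)] = find(j)

    roots: Dict[int, int] = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(hits))]


hits = [(10, 10, 0.0), (11, 10, 25.0), (100, 100, 30.0), (12, 11, 50.0), (10, 10, 500.0)]
print(cluster_adjacent(hits, radius=2.0))  # [0, 0, 1, 0, 2]
```

The last hit lands on the same pixel as the first but arrives 500 ns later, so it starts a new cluster.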
Parameters
| Parameter | Description | Typical Value |
|---|---|---|
radius | Maximum pixel distance | 5.0 |
temporal_window_ns | Maximum time difference | 75.0 ns |
abs_scan_interval | Hits between cluster scans | 100 |
When to Use
- General-purpose neutron imaging
- Files with moderate noise levels
- When processing speed is important
neutrons = rustpix.process_tpx3_neutrons(
    "data.tpx3",
    algorithm="abs",
    abs_scan_interval=1000,
    collect=True
)
DBSCAN
Density-Based Spatial Clustering of Applications with Noise. Groups points based on density reachability.
How It Works
- Build spatial index of all hits
- For each unvisited hit, find neighbors within epsilon
- If enough neighbors (min_points), start a cluster
- Recursively expand cluster with density-reachable points
- Points not in any cluster are marked as noise
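A textbook version of these steps fits in a few lines of Python. The sketch below is not the rustpix implementation: it replaces the spatial index with a brute-force neighbor search, expands clusters with a queue rather than recursion, and assumes that the neighbor count includes the hit itself (the scikit-learn convention). Noise is labeled -1.

```python
from collections import deque
from math import hypot
from typing import List, Optional, Tuple

Hit = Tuple[float, float, float]  # (x, y, time_ns)
NOISE = -1


def dbscan(hits: List[Hit], eps: float = 5.0, window_ns: float = 75.0,
           min_points: int = 2) -> List[int]:
    """Label each hit with a cluster index, or -1 for noise."""

    def neighbors(i: int) -> List[int]:
        xi, yi, ti = hits[i]
        return [j for j, (x, y, t) in enumerate(hits)
                if hypot(x - xi, y - yi) <= eps and abs(t - ti) <= window_ns]

    labels: List[Optional[int]] = [None] * len(hits)  # None means unvisited
    cluster = 0
    for i in range(len(hits)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_points:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable, but not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            more = neighbors(j)
            if len(more) >= min_points:  # j is a core point, keep expanding
                queue.extend(more)
        cluster += 1
    return [NOISE if label is None else label for label in labels]


hits = [(10, 10, 0), (11, 10, 25), (12, 10, 50), (200, 200, 10), (10, 11, 1000)]
print(dbscan(hits, eps=1.5))  # [0, 0, 0, -1, -1]
```

The third hit is not a direct neighbor of the first, but it is density-reachable through the second, so all three share a cluster.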
Parameters
| Parameter | Description | Typical Value |
|---|---|---|
radius | Epsilon (spatial search radius) | 5.0 |
temporal_window_ns | Temporal epsilon | 75.0 ns |
dbscan_min_points | Minimum neighbors for core point | 2 |
When to Use
- High noise environments
- When cluster shape is irregular
- When you need to identify noise points
neutrons = rustpix.process_tpx3_neutrons(
    "data.tpx3",
    algorithm="dbscan",
    dbscan_min_points=2,
    collect=True
)
Grid
Parallel grid-based clustering with spatial indexing.
How It Works
- Divide detector space into cells
- Assign hits to cells based on position
- Process cells in parallel using rayon
- Merge clusters that span cell boundaries
- Use union-find for efficient cluster merging
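The binning and merging steps can be sketched serially in Python. This is an illustration under stated assumptions, not the rustpix implementation: cells are visited one after another instead of in parallel with rayon, proximity is Euclidean, and the cell size must be at least the radius so that checking the 3x3 block of surrounding cells is sufficient. The function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Hit = Tuple[int, int, float]  # (x, y, time_ns)


def grid_cluster(hits: List[Hit], radius: float = 5.0, window_ns: float = 75.0,
                 cell_size: int = 32) -> List[int]:
    """Cluster hits by comparing each hit only with hits in nearby grid cells."""
    if cell_size < radius:
        raise ValueError("cell_size must be >= radius for a 3x3 cell search")
    parent = list(range(len(hits)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def close(a: Hit, b: Hit) -> bool:
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2 <= radius ** 2
                and abs(a[2] - b[2]) <= window_ns)

    cells: Dict[Tuple[int, int], List[int]] = defaultdict(list)
    for i, (x, y, _) in enumerate(hits):
        cells[(x // cell_size, y // cell_size)].append(i)

    for (cx, cy), members in cells.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                # .get avoids creating empty cells while iterating
                for j in cells.get((cx + dx, cy + dy), ()):
                    for i in members:
                        if i < j and close(hits[i], hits[j]):
                            parent[find(i)] = find(j)  # union across or within cells

    roots: Dict[int, int] = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(hits))]


hits = [(31, 10, 0.0), (33, 10, 20.0), (100, 100, 0.0), (31, 10, 500.0)]
print(grid_cluster(hits))  # [0, 0, 1, 2]
```

The first two hits fall in different cells (x = 31 and x = 33 with a cell size of 32) and are still merged, which is the boundary case the union-find step exists to handle.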
Parameters
| Parameter | Description | Typical Value |
|---|---|---|
radius | Maximum pixel distance | 5.0 |
temporal_window_ns | Maximum time difference | 75.0 ns |
grid_cell_size | Cell size in pixels | 32 |
When to Use
- Very large datasets
- Multi-core systems
- When throughput is critical
neutrons = rustpix.process_tpx3_neutrons(
    "data.tpx3",
    algorithm="grid",
    grid_cell_size=32,
    collect=True
)
Performance Comparison
Benchmark results on a typical neutron imaging dataset (5M hits):
| Algorithm | Time (ms) | Memory | Notes |
|---|---|---|---|
| ABS | ~250 | Low | Consistent, predictable |
| DBSCAN | ~1200 | Medium | Slower but noise-robust |
| Grid | ~300 | Medium | Scales with cores |
Choosing an Algorithm
Start with ABS (default)
│
├─ Too much noise? → Try DBSCAN
│
├─ Need more speed? → Try Grid
│ └─ (especially on multi-core systems)
│
└─ Results look good? → Stick with ABS
Parameter Tuning
Spatial Radius
- Too small: Clusters split into multiple events
- Too large: Separate events merged together
- Start with: 5.0 pixels, adjust based on results
Temporal Window
- Too small: Events spanning multiple TDC cycles split
- Too large: Unrelated events merged
- Start with: 75.0 ns (matches typical TPX3 timing)
Min Cluster Size
- 1: Accept all clusters (including noise)
- 2+: Filter single-hit noise events
- Typical: 1-3 depending on noise level
Technical Reference
This section provides detailed technical specifications for rustpix.
Contents
- HDF5 Schema - NeXus-compatible HDF5 file format specification
API Documentation
Rust API
Comprehensive Rust API documentation is available on docs.rs:
- rustpix-core - Core traits and types
- rustpix-algorithms - Clustering algorithms
- rustpix-tpx - TPX3 parser
- rustpix-io - File I/O
Python API
See the Python API chapter for comprehensive Python documentation.
Data Formats
Input: TPX3
Rustpix reads Timepix3 (TPX3) binary files. TPX3 files contain:
- Hit packets (pixel coordinates, timestamp, ToT)
- TDC packets (timing reference)
- Metadata headers
Output Formats
| Format | Extension | Description |
|---|---|---|
| HDF5 | .h5, .hdf5 | NeXus-compatible, recommended for large datasets |
| Arrow | .arrow | Apache Arrow IPC format |
| Parquet | .parquet | Columnar format, good for analytics |
| CSV | .csv | Human-readable, simple export |
| Binary | .bin, .dat | Compact, fastest I/O |
Performance Characteristics
Throughput
| Operation | Throughput | Notes |
|---|---|---|
| TPX3 parsing | 96M+ hits/sec | Memory-mapped, parallel |
| ABS clustering | ~20M hits/sec | Single-threaded |
| Grid clustering | ~15M hits/sec | Multi-threaded, scales with cores |
| DBSCAN clustering | ~4M hits/sec | Spatial index overhead |
Memory Usage
- Streaming mode: Bounded memory, configurable via memory_fraction
- Batch mode: ~100 bytes per hit for full processing
- HDF5 export: Chunked writes, minimal peak memory
Version Compatibility
| Rustpix Version | Python | Rust | macOS | Linux | Windows |
|---|---|---|---|---|---|
| 1.0.x | 3.11+ | 1.70+ | 11.0+ | glibc 2.28+ | 10+ |
HDF5 Schema
This document defines the on-disk HDF5 layout for rustpix event data and histograms. The schema is designed for scipp compatibility via NeXus, using NXevent_data for events and NXdata for histograms.
Goals
- Bounded-memory processing for large TPX3 datasets
- scipp-compatible layout via NeXus (NXevent_data + NXdata)
- Clear units and metadata to support TOF ↔ eV conversion
- Optional fields (tot, chip_id, cluster_id) are truly optional
File Structure
/
  rustpix_format_version = "0.1"
  entry/ (NXentry)
    hits/ (NXevent_data) [optional]
    neutrons/ (NXevent_data) [optional]
    histogram/ (NXdata) [optional]
    metadata/ (group) [optional]
File-Level Conventions
- Root group has attribute: rustpix_format_version = "0.1"
- All groups use NX_class attributes where applicable
- Units are stored as dataset attributes: units = "ns", "pixel", "deg", etc.
- Endianness is native (HDF5 handles portability)
Event Data (NXevent_data)
Event groups follow the NeXus NXevent_data base class, used by SNS/ISIS event files and expected by Mantid.
Required Datasets
| Name | Type | Shape | Units | Description |
|---|---|---|---|---|
| event_id | i32 | (N) | id | Detector element ID |
| event_time_offset | u64 | (N) | ns | Time-of-flight relative to pulse |
Pulse Indexing (for pulsed sources)
| Name | Type | Shape | Units | Description |
|---|---|---|---|---|
| event_time_zero | u64 | (J) | ns | Start time of each pulse |
| event_index | i32 | (J) | id | Index into event arrays |
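The relationship between the pulse arrays and the event arrays can be sketched as follows. The sketch assumes the standard NeXus convention that event_index[j] is the position of the first event belonging to pulse j; the helper names are illustrative and plain lists stand in for the HDF5 datasets.

```python
from typing import List, Sequence


def events_for_pulse(event_time_offset: Sequence[int], event_index: Sequence[int],
                     pulse: int) -> List[int]:
    """Return the time offsets of all events that belong to one pulse."""
    start = event_index[pulse]
    # The last pulse runs to the end of the event arrays.
    stop = event_index[pulse + 1] if pulse + 1 < len(event_index) else len(event_time_offset)
    return list(event_time_offset[start:stop])


def absolute_event_times(event_time_zero: Sequence[int], event_time_offset: Sequence[int],
                         event_index: Sequence[int]) -> List[int]:
    """Combine pulse start times and per-event offsets into absolute times (ns)."""
    out: List[int] = []
    for j, t0 in enumerate(event_time_zero):
        out.extend(t0 + dt for dt in events_for_pulse(event_time_offset, event_index, j))
    return out


event_time_zero = [0, 16_666_667]           # two pulses at roughly 60 Hz
event_index = [0, 3]                        # pulse 1 starts at event 3
event_time_offset = [100, 250, 900, 50, 75]

print(events_for_pulse(event_time_offset, event_index, 1))  # [50, 75]
print(absolute_event_times(event_time_zero, event_time_offset, event_index))
# [100, 250, 900, 16666717, 16666742]
```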
Optional Datasets
| Name | Type | Shape | Units | Description |
|---|---|---|---|---|
| time_over_threshold | u64 | (N) | ns | ToT in nanoseconds |
| chip_id | u8 | (N) | id | Chip identifier |
| cluster_id | i32 | (N) | id | Cluster assignment |
| n_hits | u16 | (N) | count | Hits per neutron |
| x | u16 | (N) | pixel | Global pixel X |
| y | u16 | (N) | pixel | Global pixel Y |
Cluster ID Convention
- cluster_id >= 0: Valid cluster index
- cluster_id = -1: Unclustered / noise
Event ID Mapping
For imaging data, event_id maps to pixel coordinates:
event_id = y * x_size + x
Group attributes x_size and y_size define the detector dimensions.
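The mapping and its inverse are a row-major index calculation. A minimal sketch, with illustrative function names:

```python
from typing import Tuple


def pixel_to_event_id(x: int, y: int, x_size: int) -> int:
    """Flatten pixel coordinates into an event_id (row-major order)."""
    return y * x_size + x


def event_id_to_pixel(event_id: int, x_size: int) -> Tuple[int, int]:
    """Recover (x, y) from an event_id."""
    y, x = divmod(event_id, x_size)
    return x, y


print(pixel_to_event_id(3, 2, x_size=512))  # 1027
print(event_id_to_pixel(1027, x_size=512))  # (3, 2)
```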
Histogram Data (NXdata)
Histogram data is stored in a single NXdata group named histogram.
Group Attributes
NX_class = "NXdata"
signal = "counts"
axes = ["rot_angle", "y", "x", "time_of_flight"]
rot_angle_indices = 0
y_indices = 1
x_indices = 2
time_of_flight_indices = 3
Required Datasets
| Name | Type | Shape | Units | Description |
|---|---|---|---|---|
| counts | u64 | (R, Y, X, E) | count | Histogram counts |
| rot_angle | f64 | (R) | deg | Rotation angle |
| y | f64 | (Y) | pixel | Y axis |
| x | f64 | (X) | pixel | X axis |
| time_of_flight | f64 | (E) | ns | TOF axis |
Axis Representation
- Centers: axis length = N, axis_mode = "centers"
- Edges: axis length = N+1, axis_mode = "edges"
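A reader of the file can tell the two representations apart by comparing the axis length with the number of bins along that dimension of counts. The sketch below shows that check and the usual midpoint conversion from edges to centers; both helper names are illustrative.

```python
from typing import List, Sequence


def infer_axis_mode(axis_len: int, n_bins: int) -> str:
    """Classify an axis as "centers" (length N) or "edges" (length N+1)."""
    if axis_len == n_bins:
        return "centers"
    if axis_len == n_bins + 1:
        return "edges"
    raise ValueError(f"axis length {axis_len} does not match {n_bins} bins")


def edges_to_centers(edges: Sequence[float]) -> List[float]:
    """Convert N+1 bin edges to N bin centers by taking midpoints."""
    return [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]


print(infer_axis_mode(4, 3))               # edges
print(edges_to_centers([0, 10, 20, 30]))   # [5.0, 15.0, 25.0]
```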
Optional Energy Axis
If flight_path_m and tof_offset_ns are provided:
| Name | Type | Shape | Units | Description |
|---|---|---|---|---|
| energy_eV | f64 | (E) | eV | Derived energy axis |
Conversion Metadata
Stored as attributes at /entry:
| Attribute | Type | Description |
|---|---|---|
| flight_path_m | f64 | Effective flight path length |
| tof_offset_ns | f64 | Instrument TOF window shift |
| energy_axis_kind | string | Typically "tof" |
TOF to Energy Conversion
Using the non-relativistic relation:
E = (m_n / 2) * (L / t)²
where:
t = (event_time_offset + tof_offset_ns) * 1e-9 [seconds]
L = flight_path_m [meters]
m_n = neutron mass
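As a worked example of the relation above, the following sketch converts a time-of-flight value to energy in eV. The function name is illustrative; the constants are the CODATA 2018 neutron mass and the exact SI value of the electronvolt.

```python
NEUTRON_MASS_KG = 1.67492749804e-27  # CODATA 2018
JOULE_PER_EV = 1.602176634e-19       # exact by SI definition


def tof_to_energy_ev(event_time_offset_ns: float, flight_path_m: float,
                     tof_offset_ns: float = 0.0) -> float:
    """Non-relativistic neutron energy: E = (m_n / 2) * (L / t)^2, returned in eV."""
    t = (event_time_offset_ns + tof_offset_ns) * 1e-9  # seconds
    if t <= 0:
        raise ValueError("corrected time-of-flight must be positive")
    velocity = flight_path_m / t                       # m/s
    return 0.5 * NEUTRON_MASS_KG * velocity ** 2 / JOULE_PER_EV


# A neutron covering 22 m in 10 ms travels at 2200 m/s, the thermal reference speed.
print(round(tof_to_energy_ev(1e7, 22.0), 4))  # 0.0253
```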
Metadata Group
/entry/metadata may contain:
- Detector config (chip transforms, pixel size)
- Clustering config
- Extraction config
- Processing provenance (git sha, rustpix version)
- Instrument context (facility, run ID)
Preferred storage: UTF-8 string dataset named metadata_json.
Implementation Notes
Chunking Strategy
- Events: Chunk along event dimension, 50k–200k events per chunk
- Histograms: Chunk along slowest-changing dimensions (e.g., rot_angle)
Compression
- Start with gzip level 1–4 for balanced I/O
- Use shuffle + compression for integer datasets
Data Types
- Use u64 for timestamps in ns to prevent overflow
- Use f64 for coordinates requiring sub-pixel precision