# PyStormTracker Roadmap

This document outlines the strategic plan for improving PyStormTracker's performance, CI/CD pipelines, and overall architecture, with a focus on high-resolution climate data scalability.

## 1. Performance & Scalability

* **Prevent CPU Oversubscription (Numba vs. Dask/MPI):**
    * *Current State:* Dask/MPI orchestrates processes, but Numba kernels lack explicit thread constraints. If `parallel=True` is used in Numba, each worker process can launch its own full-size thread pool, oversubscribing CPU cores and causing thrashing.
    * *Action:* Explicitly control thread topology inside worker tasks (e.g., `numba.set_num_threads(1)` when scaling via Dask/MPI processes).
* **Vectorize the `SimpleLinker`:**
    * *Current State:* Linking uses a vectorized Haversine matrix but remains $O(N \times M)$, which can become a bottleneck as trajectory counts scale.
    * *Action:* Leverage `scipy.spatial.cKDTree` for nearest-neighbor lookups across time steps, replacing the dense distance matrix with tree-based proximity queries implemented in C.
* **Manage Memory Pressure (Chunking) (Completed):**
    * Implemented time-chunking across backends to prevent memory exhaustion on large datasets. This maintains optimal block-IO performance by avoiding metadata/locking overhead.
* **Array-Backed Data Model (Completed):**
    * Transitioned from nested Python objects to flat, C-contiguous NumPy arrays for trajectories and centers.
* **JIT-Optimized Kernels (Completed):**
    * Implemented core mathematical filters (Laplacian, Extrema, MGE, CCL) in GIL-free Numba JIT.
* **GPU-Accelerated Preprocessing & Detection (Experimental):**
    * *Action:* Expand JAX-native capabilities beyond spherical harmonic transforms and kinematic derivatives to include local extrema detection and Laplacian filtering. This will enable full end-to-end GPU/TPU acceleration for high-resolution datasets.
    * *Status:* JAX-based spectral filtering and vector derivatives have been implemented as an experimental backend.

## 2. CI/CD & Testing

* **Implement Performance Regression Testing:**
    * *Current State:* No automated guardrails against JIT performance degradation.
    * *Action:* Integrate `pytest-benchmark` with a deterministic synthetic dataset fixture. Add a CI job that fails if Numba kernel execution time regresses significantly relative to `main`.
* **Dependency Audit:**
    * *Action:* Add a weekly scheduled CI run of `uv sync --resolution lowest-direct` combined with `pytest` to ensure minimum versions in `pyproject.toml` remain accurate.
* **Tiered Integration Testing (Completed):**
    * Implemented "Short" vs "Full" integration test suites to balance local dev speed with CI thoroughness.

## 3. Architecture

* **Idiomatic Xarray Integration (`apply_ufunc`):**
    * *Current State:* Xarray is primarily used as an I/O loader before dropping down to NumPy arrays and manual parallel orchestration.
    * *Action:* Wrap core Numba filters inside `xr.apply_ufunc(..., dask="parallelized")`. This allows Xarray to natively handle chunking and distributed execution.
* **Distributed Backends (Completed):**
    * Native support for Dask and MPI backends with **automatic environment detection** and fallback logic.
* **Modern CLI & API (Completed):**
    * Grouped, logical command-line interface with auto-configuration of parallel workers.
    * Flexible `Tracker` Protocol for cross-algorithm support.
* **Remote Data Support (Completed):**
    * Native support for remote Zarr datasets via HTTP, S3, and GS protocols with automatic format detection.

## 4. Distribution & Ecosystem

* **Modular Dependencies (Completed):**
    * Optional dependency groups (e.g., `[hodges]`, `[mpi]`, `[grib]`) to minimize build-time requirements and simplify installation in constrained environments like ReadTheDocs.
* **Conda-forge Distribution (Completed):**
    * Available on `conda-forge` for easy cross-platform installation.

## 5. Feature Implementation

* **HodgesTracker Integration (Completed):**
    * Native Python/Numba implementation of the Modified Greedy Exchange (MGE) algorithm with algorithmic parity to TRACK-1.5.2.
* **Preprocessing (Completed):**
* **HodgesTracker Refinement (In Progress):**
    * *Action:* Implement Dierckx B-spline surface fitting and evaluation in Numba to achieve bit-wise coordinate identity with the original TRACK software.
* **Postprocessing (Track Metrics):**
    * *Action:* Implement Accumulated Track Activity (ATA) and other storm track metrics from **Yau and Chang (2020)**.
* **JAX-Based Feature Detection (Proposed):**
    * *Action:* Develop JAX-native implementations of the extrema detection and intensity refinement kernels to support high-throughput, GPU-resident tracking pipelines.
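
As a rough illustration of the `scipy.spatial.cKDTree` action listed for `SimpleLinker` in section 1, the sketch below links centers between two time steps without building the full $O(N \times M)$ Haversine matrix. It is not the project's implementation: the function names `to_unit_vectors` and `link_nearest`, the `max_km` search radius, and the spherical-Earth radius are assumptions made for this example. Because `cKDTree` measures Euclidean distance, the sketch first maps latitude/longitude to 3-D unit vectors and converts the great-circle radius to an equivalent chord length, which also handles the dateline wrap.

```python
import numpy as np
from scipy.spatial import cKDTree

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius for this sketch


def to_unit_vectors(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to 3-D unit vectors."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    return np.column_stack(
        (np.cos(lat) * np.cos(lon), np.cos(lat) * np.sin(lon), np.sin(lat))
    )


def link_nearest(prev_lat, prev_lon, next_lat, next_lon, max_km):
    """Hypothetical helper: for each center at step t, return the index of
    the nearest center at step t+1 within max_km, or -1 if none exists."""
    next_xyz = to_unit_vectors(next_lat, next_lon)
    tree = cKDTree(next_xyz)
    # A great-circle distance d maps to chord length 2*sin(d / (2R))
    # on the unit sphere, so the tree can search in Euclidean space.
    max_chord = 2.0 * np.sin(max_km / (2.0 * EARTH_RADIUS_KM))
    _, idx = tree.query(
        to_unit_vectors(prev_lat, prev_lon), k=1, distance_upper_bound=max_chord
    )
    # cKDTree reports "no neighbor found" as an index equal to the point count.
    return np.where(idx == len(next_xyz), -1, idx)


# Three centers at step t and three at step t+1; the second pair
# straddles the dateline and the third has no neighbor within 500 km.
links = link_nearest(
    [10.0, 40.0, -30.0], [0.0, 179.9, 100.0],
    [40.2, 10.5, 60.0], [-179.9, 0.5, 60.0],
    max_km=500.0,
)
print(links.tolist())  # [1, 0, -1]
```

This only performs an independent nearest-neighbor lookup per center; resolving cases where two centers claim the same successor would still need whatever tie-breaking logic the linker applies.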