0.1.2
This release is mostly about honesty and stability: training you can reproduce, metrics that do not lie, and concurrency that does not lose your data.
Training
- Reproducible runs. Set training.seed to a fixed, non-zero value and the same dataset and settings produce the same model. Leave it at 0 for a fresh random seed each time.
- Class balancing. New training.balance-classes and training.class-weight-cap weight the minority class up, so a small cheater or legit sample is not drowned out by the majority.
- Honest metrics. When there is no validation holdout, the training message and the model metadata now say the numbers came from training data. No more treating a training-set score as an independent estimate.
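For readers wondering what capped class weighting can look like, here is a minimal sketch. It assumes inverse-frequency weights and assumes that training.class-weight-cap acts as an upper bound on any single class weight. The ClassWeights class and its balanced method are hypothetical names for illustration, and the plugin's actual formula may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: inverse-frequency class weights with an upper cap. */
final class ClassWeights {
    private ClassWeights() {}

    /**
     * weight(c) = min(cap, total / (numClasses * count(c))).
     * Assumption: a common "balanced" weighting scheme, not the plugin's confirmed formula.
     */
    static Map<String, Double> balanced(Map<String, Integer> counts, double cap) {
        int total = 0;
        for (int n : counts.values()) {
            total += n;
        }
        Map<String, Double> weights = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            int n = e.getValue();
            // An empty class would divide by zero, so it simply receives the cap.
            double raw = (n == 0) ? cap : (double) total / (counts.size() * (double) n);
            weights.put(e.getKey(), Math.min(cap, raw));
        }
        return weights;
    }
}
```

With 900 legit and 100 cheater windows and a cap of 3.0, the uncapped cheater weight would be 5.0, so the cap keeps a very small class from dominating the loss.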
Commands
- Added /lad models compare <a> <b> to put two models side by side.
- /lad models info now shows the metrics source, whether classes were balanced, and the training seed.
- /lad status and /lad dataset info read the dataset off the main thread.
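Reading the dataset off the main thread usually follows a "read on a worker, reply on the caller's thread" pattern. The sketch below shows that pattern in plain Java. OffThreadQuery and its parameters are hypothetical names, and the executor that hands work back to the server's main thread is an assumption left to the caller.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical helper: run a slow read on one executor, deliver the result on another. */
final class OffThreadQuery {
    private OffThreadQuery() {}

    /**
     * Runs slowRead on ioExecutor, then passes its result to reply on replyExecutor.
     * Assumption: in a server plugin, replyExecutor would schedule onto the main thread.
     */
    static <T> CompletableFuture<Void> run(Supplier<T> slowRead, Executor ioExecutor,
                                           Consumer<T> reply, Executor replyExecutor) {
        return CompletableFuture.supplyAsync(slowRead, ioExecutor)
                .thenAcceptAsync(reply, replyExecutor);
    }
}
```

The command handler returns immediately, and the reply (for example, a chat message with the row count) is sent once the read completes.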
Performance
- Predictions run on their own thread pool (performance.prediction-threads), separate from the dataset writer, so heavy inference does not block recording.
- Dataset counting and trimming stream the file line by line instead of loading every row into memory.
- Optional ping band (detector.min-ping-ms, detector.max-ping-ms) skips windows recorded while a player's ping is outside the configured range.
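Streaming a dataset line by line can be sketched as follows. This is an illustration under stated assumptions, not the plugin's code: it assumes one row per line with no header row, and DatasetFiles, countRows, and trimToLast are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: count and trim a line-oriented dataset without loading it into memory. */
final class DatasetFiles {
    private DatasetFiles() {}

    /** Counts rows by reading one line at a time. */
    static long countRows(Path file) throws IOException {
        long rows = 0;
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            while (in.readLine() != null) {
                rows++;
            }
        }
        return rows;
    }

    /** Keeps only the last `keep` rows: one pass to count, one pass to copy into a temp file. */
    static void trimToLast(Path file, long keep) throws IOException {
        long skip = Math.max(0, countRows(file) - keep);
        Path tmp = Files.createTempFile(file.toAbsolutePath().getParent(), "trim", ".tmp");
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
            String line;
            long index = 0;
            while ((line = in.readLine()) != null) {
                if (index++ >= skip) {
                    out.write(line);
                    out.newLine();
                }
            }
        }
        // Replace the original only after the trimmed copy is fully written.
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Memory use stays roughly constant regardless of dataset size, at the cost of reading the file twice when trimming.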
Fixes and hardening
- Alert history is now read and written under proper synchronization.
- Shutdown now stops predictions, drains queued writes, and then closes the writer, in that order, so recorded windows are not lost.
- Model files load through an allow-list deserializer that only resolves model and core JDK classes.
- Added a JUnit test suite for metrics, GCD math, prediction tracking, dataset parsing, and path handling.
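One common way to build an allow-list deserializer in Java is to override ObjectInputStream.resolveClass and reject any class name that is not on the list. The sketch below shows that technique. AllowListObjectInputStream and its prefix list are hypothetical, and these notes do not confirm which mechanism the plugin actually uses.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: an ObjectInputStream that only resolves classes with allowed name prefixes. */
final class AllowListObjectInputStream extends ObjectInputStream {
    private final List<String> allowedPrefixes;

    AllowListObjectInputStream(InputStream in, List<String> allowedPrefixes) throws IOException {
        super(in);
        this.allowedPrefixes = new ArrayList<>(allowedPrefixes);
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc) throws IOException, ClassNotFoundException {
        String name = desc.getName();
        // Array types (names starting with '[') are rejected here unless a prefix covers them.
        if (!isAllowed(name)) {
            throw new InvalidClassException(name, "class is not on the allow-list");
        }
        return super.resolveClass(desc);
    }

    private boolean isAllowed(String name) {
        for (String prefix : allowedPrefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

A caller would pass prefixes such as "java.lang." and "java.util." plus the model's own package. Anything else in the file fails with InvalidClassException before it is instantiated.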
Test on a local server before production, and keep backups of your models and datasets.