AI/ML · 2025
model-lab
On-device audio AI — MLX, Whisper & model evaluation infrastructure
Python · MLX · Whisper · Apple Silicon · CUDA · Jupyter
Problem
Selecting the right audio AI model requires systematic evaluation across accuracy, speed, cost, and hardware, not a handful of ad hoc tests. Teams need evaluation infrastructure to make deployment decisions with data rather than intuition.
Approach
Built model-lab as a repeatable evaluation framework: a shared test harness, identical metrics for every model, and automated comparison dashboards. Benchmarked across Apple Silicon (MPS), NVIDIA CUDA, and CPU backends, producing production-readiness grades for Whisper variants and LFM2.5-Audio.
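The core idea of a shared harness is that every (model, device) pair runs the same sample set through the same metric code. A minimal sketch of what that could look like for ASR, assuming hypothetical names (`EvalResult`, `run_eval`, a `transcribe` callable) and standard metrics — word error rate and real-time factor — not the actual model-lab API:

```python
import time
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalResult:
    model: str
    device: str
    wer: float   # word error rate, lower is better
    rtf: float   # real-time factor: processing time / audio duration

def word_error_rate(ref: str, hyp: str) -> float:
    """Word-level Levenshtein distance, normalized by reference length."""
    r, h = ref.split(), hyp.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(r)][len(h)] / max(len(r), 1)

def run_eval(model: str, device: str,
             transcribe: Callable[[bytes], str],
             samples: list[tuple[bytes, float, str]]) -> EvalResult:
    """Run one (model, device) pair over the shared sample set.

    Each sample is (audio_bytes, duration_seconds, reference_transcript).
    """
    total_wer, total_rtf = 0.0, 0.0
    for audio, duration_s, reference in samples:
        start = time.perf_counter()
        hypothesis = transcribe(audio)
        elapsed = time.perf_counter() - start
        total_wer += word_error_rate(reference, hypothesis)
        total_rtf += elapsed / duration_s
    n = len(samples)
    return EvalResult(model, device, total_wer / n, total_rtf / n)
```

Because the metric code lives in one place, results from MPS, CUDA, and CPU runs are directly comparable rows in the same dashboard.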
Result
Production-ready evaluation framework for ASR/TTS models with automated scorecards and multi-device benchmarking.
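A scorecard reduces the raw metrics to a deployment decision. One way to sketch that reduction, with illustrative thresholds (the grade boundaries here are assumptions, not model-lab's actual criteria):

```python
def readiness_grade(wer: float, rtf: float) -> str:
    """Map accuracy (WER) and speed (real-time factor) to a coarse
    production-readiness grade. Thresholds are illustrative only."""
    if wer <= 0.05 and rtf <= 0.5:
        return "A"  # accurate, faster than real time with headroom
    if wer <= 0.10 and rtf <= 1.0:
        return "B"  # usable for most production workloads
    if wer <= 0.20:
        return "C"  # acceptable accuracy, speed may be a bottleneck
    return "D"      # not production-ready
```

Emitting one grade per (model, device) pair keeps the final comparison table small enough to act on.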