
Quant Trader & AI Engineer
Quant Developer | ML & MLOps
Python • C++ • LangChain • PyTorch • Kubernetes
MS CS @ Northeastern
Building high-performance trading systems and production ML infrastructure — from microsecond execution engines to distributed GPU training platforms
I'm a Quant Trader and AI Engineer currently pursuing my Master's in Computer Science at Northeastern University (4.0 GPA), specializing in quantitative finance and production machine learning systems.
As a Quant Developer, I build trading infrastructure that operates at microsecond precision — from statistical arbitrage strategies achieving Sharpe ratios of 10.5 to low-latency market simulators processing 5.67M events per second.
On the ML & MLOps side, I architect production systems that scale. Most recently I led a team of 20 engineers at Webearl AI, where we built inference platforms serving 100K+ daily requests with sub-500ms latency. I've also built RAG systems with sub-10ms cached response times and distributed training platforms achieving a 10.6× speedup.
My tech stack spans the full spectrum: Python and C++ for core systems, PyTorch for ML modeling, LangChain for LLM applications, and Kubernetes for orchestration. I'm passionate about the intersection where quantitative rigor meets engineering excellence.
From quantitative trading strategies to production ML infrastructure
Ensemble trading strategy achieving a Sharpe ratio of 10.5, combining GPU-accelerated feature engineering over 5M ticks with a LightGBM ensemble and hedge ratio optimization
📈 Sharpe: 10.5 | 86% Win Rate | 54K ticks/sec
Multi-venue trading simulator processing 5.67M events/second with microsecond scheduling and 94% better slippage prediction across 8 exchanges
📈 5.67M events/sec | 0.10μs Latency | 94% Slippage Improvement
Retrieval-augmented generation with FAISS vector search, achieving sub-10ms cached responses and 100% relevance scores, with human feedback loops
📈 Sub-10ms Cached | 100% Relevance | 60% Hit Rate
Multi-GPU infrastructure achieving 10.6× training speedup (8h → 45min) through data parallelism with demonstrated linear scaling
📈 10.6× Speedup | 2.95× Measured | Linear Scaling
Adaptive order router achieving 74% average fill improvement through real-time latency tracking across 8 exchanges, rising to 87% improvement under network stress
📈 74% Avg Improvement | 87% Under Stress | 8 Venues
Distributed streaming pipeline achieving 333K+ events/second with <2s end-to-end latency, using Kafka and horizontally scaled async Python
📈 333K events/sec | <2s Latency | 5-Node Distributed
Whether you're looking for quantitative trading expertise, ML engineering, or technical leadership, I'd love to hear from you.
📍 Boston, MA
Currently: MS CS @ Northeastern University (4.0 GPA)
Open to: Quantitative Trading • ML Engineering • MLOps • Research Opportunities
© 2025 Shivang Raval. Built with React & Tailwind CSS.