Work on the frontier of AI research, systems architecture, and distributed engineering. We are looking for ambitious builders ready to redefine human-agent coordination.
Design stateful multi-agent workflows, tool-execution loops, and self-healing execution paths. Scale production systems built on frameworks such as LangGraph and CrewAI for global enterprises.
Optimize LLM compilation pipelines and low-level kernels. Write custom Triton kernels, tune speculative decoding parameters, and manage PagedAttention memory layouts in vLLM.
Build highly resilient, large-scale PyTorch clusters. Implement pipeline parallelism, tensor parallelism, and ZeRO-3 sharded data parallelism across high-speed InfiniBand and RoCE network fabrics.
Work directly alongside founding systems developers and research fellows. Research novel alignment pipelines, RLHF optimization mechanisms, and collaborative agent benchmarks.
Assist our MLOps team in managing Kubernetes GPU node scheduling and quotas, configuring Prometheus monitoring and dashboards, and maintaining a clean model weights registry in MLflow.