Company Overview At Orcrist Technologies, we are pioneering the future of data intelligence with our Orcrist Intelligence Platform (OIP). This secure, Kubernetes-native system is designed for deployment as Software as a Service (SaaS) or can be self-hosted/on-premises, accommodating even air-gapped missions. Our mission is to seamlessly integrate data processing, machine learning, and user-centric design for defense, law enforcement, and corporate teams. Role Overview As a Machine Learning Engineer, you will be instrumental in productionizing NLP, audio, and document models that enhance the insight experiences provided by OIP. You will take full ownership of model packaging, deployment, monitoring, and evaluation, collaborating closely with research and product teams to ensure the delivery of reliable insights globally. Your Responsibilities Package and deploy various models (ASR, translation, OCR, NER, summarization) utilizing Triton/KServe on Kubernetes. Develop evaluation pipelines (WER, BLEU, F1, latency, cost) and automate release gating processes. Manage streaming and batch inference through Kafka, Temporal, and backfill tooling. Monitor model performance and quality using Prometheus, Grafana, and Evidently; optimize inference costs and performance metrics. Collaborate with TypeScript teams to design payload schemas, contracts, and establish human-in-the-loop feedback mechanisms. Qualifications 4 to 8+ years of experience in ML engineering or MLOps, with a proven track record of deploying models to production. Proficiency in Python and frameworks such as PyTorch/Transformers, along with experience using Triton/KServe or similar technologies. Familiarity with Kubernetes, GitOps, CI/CD practices, and GPU workload management. Strong understanding of evaluation metrics, monitoring techniques, and annotation workflows. Eligibility to work in Germany; export-control screening may be required for certain projects. Preferred Qualifications Experience with Temporal, Beam/Flink, or Ray Serve; familiarity with ONNX/TensorRT optimization. Proficiency in the German language (B1+) and experience with defense or public safety datasets. Knowledge of WhisperX, DeepStream/GStreamer, or vector search integrations. What We Offer A modern MLOps stack including Triton, Temporal, Kafka, MLflow, Weights & Biases, Evidently, and Kubernetes. A remote-first work environment in Germany, with regular meetups in Berlin, 30 days of vacation, and a budget for equipment and learning.
Jan 13, 2026