About the job
As an AI Architect at CreativeChaos, you will spearhead the development of AI-native products. You will lead cross-functional Innovation Delivery Squads and oversee projects from inception to deployment across web, mobile, AI agents, and streaming backends. This is a hands-on technical leadership role: you will scope, design, and staff each product, then launch it and operate it at scale.
Key Responsibilities:
- Establish and manage squads through all phases: Discovery, Prototype, Product, Platform, and Site Reliability Engineering (SRE).
- Design and implement Retrieval-Augmented Generation (RAG) and agent systems: select models (e.g., Anthropic Claude, OpenAI, Google, or open-weight models such as Llama or Mistral), define the necessary tools/functions, and choose appropriate retrieval methods (Postgres + pgvector by default, scaling to Weaviate/Qdrant/Pinecone as needed).
- Ensure AI systems operate safely through evaluations and guardrails, structured outputs (JSON Schema), PII redaction, refusal policies, cost and latency budgets, and LLM observability.
- Take ownership of delivery outcomes, including Service Level Objectives (SLOs), quality, cost, and velocity; implement feature flags and canary releases.
- Engage with clients on discovery, scoping, Statements of Work (SoWs), project roadmaps, and Quarterly Business Reviews (QBRs).
- Recruit and mentor Tech Leads, Engineering Managers, and Product Managers, and strengthen engineering practices across teams.
Qualifications:
- 8–12+ years of engineering experience; at least 4 years in a leadership role overseeing multi-team delivery and deploying production web/mobile systems at scale.
- Demonstrated experience shipping a production AI application utilizing models such as Claude, GPT, Gemini, Llama, or Mistral, supported by a retrieval mechanism (pgvector or a vector database) and a foundational evaluation/guardrail pipeline.
- Experience in orchestrating tasks using tools like LangGraph/DSPy or Temporal for reliable workflows, as well as implementing rerankers (e.g., Cohere/Jina/Voyage) and prompt/tool versioning.
- Proficient in modern cloud and data technologies: serverless architectures, Kubernetes, Terraform, OpenTelemetry, and feature flags/experimentation.
- Strong client communication skills and solid commercial acumen (SoWs, staffing, resource utilization).
Technology Stack (Hands-on experience required):
- Models: Anthropic Claude, OpenAI, Google, open weights (Llama, Mistral).
- Orchestration & agents: LangGraph (or DSPy) for graph-based workflows; Temporal for durable, long-running tasks and SLAs.
- Retrieval: Postgres + pgvector (primary); Weaviate/Qdrant/Pinecone for scalability; hybrid search with OpenSearch/Typesense.
- Embeddings: OpenAI/Voyage/E5/BGE; rerankers: Cohere/Jina/Voyage.
- Guardrails & evaluations: JSON/Pydantic schemas, red-team sets, promptfoo/Ragas/DeepEval; content/PII filtering.
- Observability: OpenTelemetry traces including prompt/tool spans; Langfuse/Arize Phoenix (or equivalent) + Sentry/Grafana.
- Application & data: Next.js 15 (RSC), TypeScript/Go/Python; Postgres; Kafka/Redpanda/NATS; dbt/lakehouse (optional).
- Operations: Cloud Run/ECS/K8s; Terraform/OpenTofu; GitHub Actions; LaunchDarkly/Unleash; Statsig/GrowthBook.
