About the job
About Pathway
At Pathway, we are pioneering the next generation of AI models with our groundbreaking post-transformer architecture, which addresses the core memory limitations of traditional AI systems. Unlike conventional transformers, which start from scratch with each interaction, our model enables continuous learning, infinite-context reasoning, and real-time adaptability. We are not merely enhancing existing technology; we are defining the future of AI.
Our innovative architecture surpasses traditional Transformer models, offering enterprises unparalleled transparency in model operations. By seamlessly integrating our foundational model with the fastest data processing engine available, Pathway enables businesses to shift from merely optimizing past technologies to embracing a new era of contextualized, experience-driven intelligence. We are proud to partner with esteemed organizations such as NATO, La Poste, and Formula 1 racing teams.
Our team is led by CEO Zuzanna Stamirowska alongside AI experts including CTO Jan Chorowski, who pioneered the application of attention to speech recognition at Google Brain, and CSO Adrian Kosowski, a distinguished computer scientist and quantum physicist, keeping us at the forefront of AI innovation.
Pathway is supported by prominent investors and advisors, including Lukasz Kaiser, co-author of the Transformer model and a key figure in developing OpenAI’s reasoning models. Our headquarters is located in Palo Alto, California.
The Opportunity
We are looking for dedicated AI Benchmark & Dataset Engineering Interns to help define and execute our benchmarking process for model evaluation.
Your Responsibilities
- Identify and curate key public and client-driven benchmarks relevant to our target use cases and markets.
- Assess candidate benchmarks for clarity, data quality, evaluation methodologies, and alignment with our model roadmap.
- Execute benchmarks using baseline models to validate setups, discover edge cases, and mitigate risks in R&D processes.
- Prepare and deliver “benchmark-ready” packages to R&D, including specifications, data, evaluation scripts, expected metrics, and constraints.
- Establish and maintain a comprehensive vocabulary and documentation regarding benchmarks, datasets, and evaluation formats for use by GTM and R&D teams.
- Organize and track benchmark results and model leaderboards, and establish benchmarks for various customer scenarios.
- Contribute to demonstrations and public-facing proof points derived from benchmark results.
Your contributions will be pivotal in shaping our benchmarking processes for AI model evaluation, directly influencing our development strategies and market communication.
