About Us
Baton Corporation is at the forefront of technology development, creating and managing the entire tech stack for pump.fun, the leading memecoin launchpad currently in operation. Our systems are built for low latency and high throughput under continuous load, where precision is critical.
Your Role
As a Reinforcement Learning Engineer, you will take charge of a production trading system that deploys real capital. This is not a research-focused position; your primary responsibility will be to develop robust, measurable learning systems that operate safely within real-world constraints.
- Lead the development and deployment of an RL-driven trading agent that utilizes real capital to enhance trading volumes and user engagement within the memecoin ecosystem.
- Craft reward functions and policies that align with product objectives while implementing stringent downside risk measures.
- Create evaluation and validation frameworks, including simulations and offline analyses, to reduce dependency on live sequential testing.
- Manage the transition of an existing heuristic-based production system to a learning-based framework.
- Assume full ownership and technical leadership as the sole RL expert, overseeing everything from data and modeling to deployment, monitoring, and safety measures.
Qualifications
- Proven experience in deploying autonomous learning systems that directly manage capital, pricing, traffic, or resources, with the ability to articulate previous challenges and solutions.
- Experience designing and enforcing strict risk limits (capital caps, loss bounds, circuit breakers) in live systems, not just discussing risk-aware objectives in theory.
- Expertise in developing a policy evaluation loop from scratch (including simulators, replay systems, counterfactuals, and shadow deployments) before live implementation.
- Ability to make and justify difficult trade-offs (e.g., heuristic vs. RL, bandit vs. deep RL) based on empirical evidence rather than ideology.
- Experience as the sole owner of a complex ML system within a small team, without the support of extensive research organizations or infrastructure teams.
Work Environment
- We work on-site in our New York office.
- Expect long and unconventional hours.
- Be prepared for a fast-paced work environment.
- Expect a culture of intense collaboration and innovation.

