Andrew Gritsevskiy

Occupation	Entrepreneur, researcher
Known for	Founder of RunRL

Andrew Gritsevskiy is a researcher and entrepreneur and the founder of RunRL, a platform that provides reinforcement learning as a service for optimizing large language models (LLMs) and AI agents. The company is part of Y Combinator batch X25.^[1]

Career

Before founding RunRL, Gritsevskiy worked on several research and entrepreneurial projects. He co-founded Cavendish Labs, described as a research laboratory based in Vermont that addresses threats such as advanced AI and pandemics. He also ran Contramont Research, an AI safety lab, with several collaborators. His research interests have spanned reinforcement learning, quantum information, neuroscience, drug development, astronomy, and robotics. He is a co-author of a research paper on the construction of unelicitable backdoors in language models, published on arXiv.^[2]

Gritsevskiy also co-founded Lex.ma, a separate project, and has published several technical projects, including an implementation of an SHA transformer.

RunRL

RunRL is a San Francisco-based company with a small team that offers a reinforcement learning platform designed to improve the performance of LLMs and AI agents on specific tasks. The platform allows users to define custom reward functions that specify desired model behavior. It then applies reinforcement learning algorithms, described by the company as related to the techniques behind DeepSeek R1, to optimize model outputs against those rewards.
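As an illustration of what a custom reward function can look like in general, the sketch below scores a model completion by whether it parses as a JSON object and how many required keys it contains. The function name `json_reward`, its signature, and the scoring rule are assumptions made for this example; they are not taken from RunRL's documentation or API.

```python
from __future__ import annotations

import json


def json_reward(completion: str, required_keys: set[str]) -> float:
    """Hypothetical reward in [0, 1] for a JSON-extraction task.

    Returns 0.0 if the completion is not a JSON object, otherwise the
    fraction of required keys that appear in it.
    """
    try:
        parsed = json.loads(completion)
    except json.JSONDecodeError:
        return 0.0  # Not valid JSON at all.
    if not isinstance(parsed, dict):
        return 0.0  # Valid JSON, but not an object (e.g. a list or number).
    if not required_keys:
        return 1.0  # Nothing further to check.
    present = sum(1 for key in required_keys if key in parsed)
    return present / len(required_keys)


if __name__ == "__main__":
    keys = {"name", "age"}
    print(json_reward('{"name": "Ada", "age": 36}', keys))
    print(json_reward('{"name": "Ada"}', keys))
    print(json_reward("Sure, here is the data you asked for", keys))
```

A graded score such as this one gives a training algorithm partial credit for nearly correct outputs, which is often easier to optimize than an all-or-nothing signal.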

The company's workflow involves three steps: users define their task and submit prompts with custom reward functions; the platform runs reinforcement learning training; and the resulting model is deployed with improved performance on the specified task. RunRL markets the service as an alternative to prompt engineering, allowing users to train specialized models rather than relying on generic LLMs.
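The three steps can be mimicked with a toy example. The sketch below is not RunRL's implementation: it replaces the language model with a softmax distribution over a fixed list of candidate completions and applies an exact policy-gradient update instead of sampling from a model. All names (`sql_reward`, `train`, `deploy`, `CANDIDATES`) are hypothetical.

```python
import math


def softmax(logits):
    """Convert logits to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sql_reward(completion):
    """Step 1 (hypothetical reward): 1.0 if the text starts with SELECT."""
    return 1.0 if completion.strip().upper().startswith("SELECT") else 0.0


def train(candidates, reward_fn, steps=200, lr=1.0):
    """Step 2: raise the probability of high-reward candidates."""
    rewards = [reward_fn(c) for c in candidates]
    logits = [0.0] * len(candidates)
    for _ in range(steps):
        probs = softmax(logits)
        # Expected reward under the current policy, used as a baseline.
        baseline = sum(p * r for p, r in zip(probs, rewards))
        # Exact gradient of expected reward with respect to each logit.
        logits = [
            x + lr * p * (r - baseline)
            for x, p, r in zip(logits, probs, rewards)
        ]
    return softmax(logits)


def deploy(candidates, probs):
    """Step 3: serve the completion the trained policy prefers most."""
    best = max(range(len(candidates)), key=lambda i: probs[i])
    return candidates[best]


CANDIDATES = [
    "SELECT name FROM users;",
    "Sure! Here is the query you asked for.",
    "I cannot help with that.",
]

if __name__ == "__main__":
    final_probs = train(CANDIDATES, sql_reward)
    print(deploy(CANDIDATES, final_probs))
```

In a real system the policy is a neural network and the update is estimated from sampled completions, but the principle is the same: the reward function alone determines which behavior becomes more likely.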

RunRL's applications include chemistry models (such as generating drug candidates, where the company claims that a smaller model outperformed larger general-purpose models), web agents for automating online tasks, code generation (including training models to generate SQL or JSON from unstructured text), and voice agents. The platform includes features such as AgentFlow, which enables AI agents to self-improve based on custom criteria, and enterprise-oriented solutions including custom reward development.

References

[1] "RunRL". RunRL. Retrieved 2026-03-19.

[2] "RunRL". RunRL. Retrieved 2026-03-19.
