Mechanize, Inc.

We build environments and evals for frontier coding agents. In these environments, models carry out software engineering work such as building a feature, deploying an application, or debugging an issue in an unfamiliar codebase. A grader scores the model’s performance, and those scores serve as the signal for both reinforcement learning and evaluation.

Frontier models are already surprisingly good at writing code. Our engineers find where they still break down and build environments that reveal those limits. Our current focus is software engineering, but our long-term goal is the full automation of valuable work across the economy.

Careers

We’re hiring software engineers to design and build these environments. Apply here.

Learn how our interview process works, or read about what working here is like.

GBA Eval

GBA Eval is a benchmark that measures how well coding agents can write a Game Boy Advance emulator from scratch within 24 hours. It is similar to the environments we build and sell at Mechanize. You can view more here.

Essays

These essays explain our vision and how we think about AI, work, and the economy:

Press releases

Our press announcements are collected on the press releases page.

Mechanize is backed by Nat Friedman and Daniel Gross, Patrick Collison, Adam D’Angelo, Marco Mascorro, Dwarkesh Patel, Sholto Douglas, Devendra Chaplot, Alex Atallah, and Marcus Abramovitch.

For inquiries, you can reach us at our contact page.