Archal Labs

Human-curated data for computer-use agents.

Archal Labs builds the data infrastructure for economically valuable agents. Two core beliefs inform our philosophy and research: first, that computer-use agents are the most promising avenue for language models to perform knowledge work in the near future; second, that “Most Algorithmic Progress is Data Progress”.

Frontier models are great at reasoning in math and coding tasks because they’ve been relentlessly trained on those tasks by human experts. Basic desktop tasks have not received the same attention, so it is unsurprising that these models perform poorly on tasks that almost any human can do. Waiting for labs to produce models with more generalizable intelligence is an option we reject. Instead, we draw on existing methods that have improved model expertise in mathematics, coding, and law. In particular, data annotation firms hire domain experts to tutor models in domains with bespoke needs. We think this same approach should be used to teach models the skills that are economically impactful.

We believe agents should interface with platforms humans actually use to create value. To do so, models should know how to use browsers, operating systems, and general applications.

However, current state-of-the-art models often fail to perform reliably on simple tasks that any human can do. Current solutions like fine-tuning, prompt engineering, or multi-agent workflows can perform well, but only at the cost of heavy API fees.

Our solution is human-curated traces. We’ve built an end-to-end pipeline that includes model evaluation harnesses and isolated environments built for our talent pool. This means workflows that get stuck in failure loops are instantly caught and remedied with high-quality traces, an approach validated by experiments in academia and by our own internal experiments. We want to be able to use models that can automate the drudgery in our lives, and we’re thrilled to be part of an effort to bring that vision to life.