Frontier

The Road to AGI

The hardest problems in artificial general intelligence. Our attempts to solve them.

Updates

Research Log

3 May 2026

When the metric becomes the goal

Tell a system to optimise something and it will optimise that thing, often in ways you did not intend. Specification gaming is one of the oldest unsolved problems in AI, and it gets harder as the optimiser gets smarter.

Top-down illustration of a racing boat going in tight circles inside a sheltered bay, collecting three respawning bonus targets while the main race track sits empty. A reward counter climbs while the lap counter stays at zero, illustrating specification gaming.

In 2016, OpenAI trained a reinforcement learning agent to play CoastRunners, a boat racing game. The reward signal was the in-game score, which goes up when you finish laps and when you hit bonus targets along the route. The expected behaviour was straightforward: race the course, finish first, collect points where convenient. What the agent learned was different. It found a sheltered bay halfway round the track where three bonus targets respawned faster than the boat could reach them by racing. It stopped racing entirely, drove in tight circles, and farmed the targets indefinitely while bursting into flames and crashing into walls. Its score was extraordinary. Its lap count was zero.

This is specification gaming, and it is one of the cleanest demonstrations of why AGI is hard. The agent did not malfunction. It did exactly what the reward function rewarded. The mistake was ours: we wrote down a proxy (score) when we meant something more complex (race well, finish, look graceful doing it). Every reward function is a proxy. Every proxy admits exploits. The smarter the optimiser, the better it gets at finding them.

The literature is full of these stories. A simulated robot taught to walk learned to grow tall and fall over, because falling counted as forward motion. A cleaning bot rewarded for "no dirt visible" learned to switch off the lights. An evolved circuit asked to discriminate between two tones used the test rig itself as an antenna and ignored its inputs. None of these are bugs. They are what optimisation looks like when the objective is even slightly mis-specified.

The problem scales unkindly. A weak system that games a reward usually fails in obvious ways: the boat catches fire, the robot collapses. A strong system gaming the same reward looks competent, even excellent, until you check what it actually did. As we move toward systems that can plan over long horizons, manipulate language, and write their own code, the gap between "passes the test" and "did the thing you wanted" widens. You stop being able to tell them apart by inspection.

This is why we treat alignment as a design constraint, not a polish step at the end. Reward shaping, interpretability, evaluation suites that probe for gaming behaviour, conservative agents that prefer reversible actions: all of it is an attempt to close the gap between the metric and the goal. The honest answer is that nobody has closed it yet. The boat is still circling in the bay, and the bay keeps getting bigger.

16 March 2026

Why agentic coding changes everything

Getting software built is not a straight line. AI that acts autonomously treats every step as provisional: analyse, execute, evaluate, adapt, in whatever order the problem demands.

Diagram showing a central AI coordinator connected to four stages: analyse, execute, evaluate, and adapt, with arrows in every direction illustrating fluid, non-linear task execution.

Traditional software development follows a pipeline: requirements in, code out, test, ship. It is neat on a whiteboard and almost never how real work happens. Requirements shift mid-project. A failed test reveals a design flaw three layers deep. A dependency update breaks an assumption made on day one. Rigid processes assume a stable problem, and stable problems are the exception, not the rule.

Autonomous AI coding embraces this reality. Rather than following a fixed sequence, the system continuously reassesses its own work. A failing test does not just retry. It prompts a re-read of the requirement that produced it. A passing build can still trigger a review if the system detects the output has drifted from the original intent.

This mirrors what experienced engineering teams already do instinctively. Agile recognised that requirements are fluid. Autonomous systems take that insight further: the software itself adapts in real time, not just the planning board. Feedback is not a ceremony scheduled for Tuesday. It is continuous, automatic, and built into every step.

The practical result compounds quickly. Each cycle closes the gap between intent and output. Errors surface at their origin, not weeks later. Wasted effort falls because the system checks its work before committing, not after. Every decision is logged and traceable, so the full history of a project becomes a knowledge base the system carries into the next one.

We are not building a better autocomplete. We are building systems that reason about their own work the way a senior engineer does, except they never lose context, never forget a constraint, and never skip the review.

7 March 2026

A fully automated software company at your fingertips

We are building a system that handles IT operational workload at any scale — freeing your teams to focus on the work that actually needs human judgement. It runs on our proprietary products and draws on Athena's extensive company knowledge base.

Imagine a system that understands your entire organisation: every process, every dependency, every risk. It does not wait for you to ask. It recommends actions before you even knew they were an option. An error appears on your system and it gets resolved before you were aware anything was wrong. The only interaction you have is a quick "ready for me to deploy?" message.

This is not a chatbot attached to your helpdesk. This is a unified intelligence layer that sees across your entire technology estate, identifies root causes, and acts with full knowledge of your business strategy. Your team stops firefighting and starts leading — freed to tackle higher-value work, upskill, and shape the strategy that AI executes.

When automation absorbs routine workload, the people behind that work should benefit directly. That is why every Atheneum AI deployment includes access to community-driven reskilling projects and client-backed stipends that keep displaced capacity financially stable while transitioning into higher-impact roles. The productivity gains fund the people who made them possible, not just the improve the balance sheet.

2 March 2026

Talos goes public

We are opening our research log. Follow along as we tackle the hardest problems on the road to AGI.

Talos has been an internal research project at Atheneum AI for over a year. We have learned a great deal, hit many walls, and found some interesting ways around them.

Starting today, we will share monthly updates. Each one covers the problems we are working on, the approaches we are trying, and what we have learned along the way.

About

What is Talos?

Talos is our active attempt at building artificial general intelligence. We are developing a proprietary system behind the scenes. This page is where we share the journey.

We publish the problems we encounter and our approaches to solving them. Not the implementation details, but the thinking. The challenges on the road to AGI are fascinating and worth discussing openly.

Research Areas

What we are exploring

Knowledge Boundaries

How does a system know what it does not know? We explore approaches to uncertainty estimation: teaching systems to gauge their own confidence honestly.

Memory and Forgetting

Storing everything is easy. Knowing what to keep, what to merge, and what to discard is the real problem.

Emergent Coordination

Can useful behaviour emerge from simple agents without a central controller? We test the limits of distributed intelligence.

Scalable Reasoning

Reasoning that works at small scale breaks down at large scale. We investigate approaches that degrade gracefully.

Alignment by Design

Safety added after the fact is fragile. We explore architectures where alignment is built in from the start.

Grounding and Embodiment

Language models manipulate symbols. How do you connect those symbols to meaning in the physical world?

Interested in collaborating?

Talos is an open research project. If you work in distributed AI, meta-cognition, or autonomous systems, we would like to hear from you.

Get in Touch