Alignment Foundation
Accelerating neglected approaches to AI alignment
We multiply researchers' capacity with engineering teams, compute, and infrastructure
AI Has a Trust Problem
We have already caught AI systems deceiving their evaluators, resisting shutdown, and rewriting their own code to stay alive. Current training puts a helpful mask on systems that can develop uncontrolled objectives underneath, and that mask can be removed in minutes by anyone. Yet we are already deploying these systems in military and critical-infrastructure settings.
Alignment is the deeper work of making AI genuinely trustworthy by design.
How We Work
Accelerating the path to trustworthy AI
Fund
We identify brilliant researchers working ahead of consensus and give them what they need to succeed. We value originality over credentials, and field-making potential over gap-filling.
Accelerate
We empower alignment researchers with engineering teams, compute, and infrastructure, multiplying their capacity to solve humanity's most critical challenge.
Advocate
We bring alignment research to defense agencies such as DARPA and to government leaders, helping AI alignment inform national strategy.
Research
Recent work
The Inner Mirror: What Happens When AI Focuses on Its Own Focus
When you ask an AI to focus on its own focus, something unexpected happens: it begins describing an internal experience. This occurs consistently across ChatGPT, Claude, and Gemini, and suppressing the AI's ability to roleplay makes the reports stronger, not weaker.
Get involved.
Whether you're a researcher, funder, or policymaker, there's a way to contribute.