About

Builder, becoming a researcher.

19 years old, second-year AI Engineering student in Madrid. About five months shipping agents in production, now re-orienting toward alignment and interpretability research. This site is the public version of that transition.

The short version

I'm Xabier Ariznabarreta. I was born and raised in Madrid, did high school at Trinity College in San Sebastián de los Reyes, and now study AI Engineering at Universidad Francisco de Vitoria in Madrid. I started building AI projects at the beginning of my second year of the degree and shipped my first production agent a few months later. Since then I've sent several to production: commercial agents integrated with Salesforce, MCP servers with security-review loops, Android field assistants, multilingual institutional tooling.

The work was useful. It was also a teacher. The bugs were instructive, the customers were patient, and the experience showed me where the bottleneck is — not in the engineering, but in the science underneath.

So I'm pivoting. I'm working on my first empirical paper (an ablation study of agent-harness components), going through ARENA in parallel, and using this site as the public notebook for both. Builder in practice, learner in depth.

What you'll find here

Four kinds of posts:

  • Technical deep-dives — rigorous explanations of LLM concepts, transformer internals, mech interp, eval methodology. Depth before speed.
  • Research notes — work-in-progress on the harness-ablation paper. Hypotheses, design choices, null findings included.
  • Production lessons — honest case studies from systems I've shipped, including the expensive mistakes.
  • Meta journey — reflections on the builder-to-researcher pivot at 19, ARENA progress, decisions I'm reversing, honest self-critique.

How I work

Verify before claiming

I'd rather kill an idea after a deeper search than be the second to ship it. Decisions about where to spend months get more than a skim of the first page of Google results.

Honest WIP

Research notes go up while I'm still confused. Null findings get the same treatment as positive ones. The errors are part of the artifact.

Spanish first, English close behind

Spanish by default — it's my native language and the one I think in. Every post has an English version one click away via the language toggle.

Currently

  • Reading — the recent Anthropic evals paper; Olah et al. on circuits; whatever Lilian Weng posts.
  • Working through the ARENA alignment curriculum (transformer internals, mech interp, RL).
  • Building — the experiment harness for the ablation paper.
  • Writing — research notes from the paper as it takes shape.
  • Looking for — researchers willing to read a draft, and feedback on the ablation design before I burn more compute.
Get in touch