Personalization requires data
29 Dec 2025
4 minute read
In 2025, AI models learned to effectively search and process vast amounts of information to take actions. This has been most visible in coding, e.g. through harnesses like Claude Code, which have had a sizable impact on programmers’ workflows.
But this year of progress doesn’t seem to have had much of an effect on the personalization of our interactions with models, i.e. whether models understand the user’s context, what they care about, and their intentions, in a way that allows them to answer better. Most chatbot users’ personalization is still limited to a system prompt, and memory features don’t seem very effective at actually learning things about the user.
The reason machine learning has not succeeded at personalization is a lack of data (as it often is!). We do not have any validated ways of grading ML methods (training or outer-loop) for personalization. Getting good eval signals for personalization is hard, because grading a model’s personalization is intrinsically subjective and requires feedback at the level of the life and interactions that the model is being personalized to. There is no verified signal, and building a generic rubric seems hard. These facts do not mesh well with the current direction of machine learning, which is only now starting to go beyond verifiable rewards into rubrics, and is fueled by narratives of human replacement that make personalization seem beside the point (if I am building the recursively improving autonomous AGI, why do I need to make it personalized?). 1
Start by giving models more information #
But how do we obtain data for personalization? I think the first step to answering this question is having consumers of AI curate their personal data and then share it to enrich their interactions with AI systems.
Instead of just a system prompt, we could give models a searchable artifact of our writing, notes, and reading history: something agents can explore when your questions might benefit from context, and write to in order to remember things for later.
whorl - My first guess #
Over the break, I built a very simple software tool to do this. It’s called whorl, and you can install it here.
whorl is a local server that holds any text you give it (journal entries, website posts, documents, reading notes, etc.) and exposes an MCP interface that lets models search and query it. Point it at a folder or upload files.
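For intuition, here is a minimal sketch of what a server like this might look like, using the official Python MCP SDK’s FastMCP API. The folder path and tool surface are my own assumptions for illustration, not whorl’s actual implementation:

```python
# Minimal sketch of a whorl-like MCP server, assuming the official Python
# MCP SDK's FastMCP API. The folder path and tool name are hypothetical.
from pathlib import Path

from mcp.server.fastmcp import FastMCP

NOTES_DIR = Path("~/notes").expanduser()  # hypothetical folder of personal text

mcp = FastMCP("whorl-sketch")

@mcp.tool()
def search(query: str, max_results: int = 5) -> list[str]:
    """Return snippets from text files that mention the query."""
    hits: list[str] = []
    for path in sorted(NOTES_DIR.rglob("*.md")) + sorted(NOTES_DIR.rglob("*.txt")):
        text = path.read_text(errors="ignore")
        idx = text.lower().find(query.lower())
        if idx == -1:
            continue
        start = max(0, idx - 100)
        hits.append(f"{path.name}: ...{text[start:idx + 200]}...")
        if len(hits) >= max_results:
            break
    return hits

if __name__ == "__main__":
    mcp.run()  # speaks MCP over stdio, so a client like Claude Code can attach
```

The point of the design is that the model decides when to look things up, rather than everything being stuffed into the system prompt.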
I gave it my journals, website, and miscellaneous docs, and started using Claude Code with the whorl MCP. Its responses were much more personalized to my actual context and experiences.
Examples #
First I asked it:
do a deep investigation of this personal knowledge base, and make a text representation of the user. this is a text that another model could be prompted with, and would lead them to interacting with the user in a way the user would enjoy more
It ran a bunch of bash and search calls, thought for a bit, and then produced a detailed profile of me (linked here); my guess is that its quality beats many low-effort system prompts.
I’m an ML researcher, so I then asked it to recommend papers and explain the motivation for each recommendation. Many of these I’d already read, but it made some interesting suggestions, well above my usual experience with these kinds of prompts. See here.
These prompts are where the effect of personalization is clearest, but the artifact is also useful in general chat conversations, letting the model query and search for details that might be relevant.
It can also use the MCP to modify and correct the artifact provided to it, to optimize for later interactions, especially if you host a “user guide” there like the one I linked; a sketch of that write-back flow follows below. Intentionally sharing personal data artifacts is the first step toward having agents that understand you.
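To make the write-back idea concrete, here is a hedged sketch of what such a tool could look like, again assuming the FastMCP API; the file name and note format are hypothetical, not whorl’s actual behavior:

```python
# Hypothetical write-back tool: the model calls remember() to append a dated
# note that later sessions can find via search. Not whorl's actual tool surface.
from datetime import date
from pathlib import Path

from mcp.server.fastmcp import FastMCP

NOTES_DIR = Path("~/notes").expanduser()  # same hypothetical folder as before

mcp = FastMCP("whorl-writeback-sketch")

@mcp.tool()
def remember(note: str) -> str:
    """Append a dated note to the artifact for future retrieval."""
    memory = NOTES_DIR / "memory.md"  # hypothetical file the agent maintains
    with memory.open("a", encoding="utf-8") as f:
        f.write(f"- {date.today().isoformat()}: {note}\n")
    return f"Saved note to {memory.name}"

if __name__ == "__main__":
    mcp.run()
```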
Conclusion #
Personalization requires data. People need to invest in seeing what models can do with their current data, and in figuring out which flows and kinds of interactions this data is useful for, toward building technology that empowers humans to pursue their goals. whorl is a simple tool that makes that first step easy. People who have already created a bunch of content should use it to enhance their interactions with AIs.
---
1. I will soon write up more thoughts on why I think this narrative is misguided, both in terms of its impact on the world and for the development of AI technology. Thinking about data for personalization and how to do that kind of eval is deeply important for building technologies that empower us, and is something I think about at Fulcrum.