Memory in AI Coding Assistants: What Persists and What Doesn't

You spend an hour at the start of every new session re-explaining the same things to your AI coding assistant:
"This project uses the repository pattern. Don't use raw queries in handlers. All API responses follow this shape. We prefer explicit error returns over exceptions."
The AI nods, remembers for the session, then forgets everything the moment you close the window.
Next day. Same briefing. Same nods. Same forgetting.
This isn't a limitation of the AI's intelligence — it's a limitation of its memory architecture. Once you understand how AI memory actually works, you can stop repeating yourself and start building a tool that genuinely knows your codebase.
The Four Types of AI Memory
AI coding assistants don't have a single "memory." They have multiple distinct memory systems, each with different properties, lifetimes, and mechanisms.
1. In-Context Memory (Volatile)
This is the sliding window we discussed in the context post. Everything the AI knows during a session lives here — your conversation, pasted files, tool outputs, its own responses. It's fast, rich, and completely volatile: when the session ends, it's gone.
Lifetime: One session
What goes in: Everything you say and share
What goes out: Everything, on session close
Your control: High — you decide what to paste and say
In-context memory is the AI's working memory. It's powerful but temporary. Treating it as your only memory mechanism is like relying solely on RAM with no disk storage.
2. Injected Memory (Persistent, Manual)
This is the most practical and underused form of memory in AI coding workflows. It works by placing instructions and context into files that the tool automatically reads at the start of every session — before you type a single character.
Different tools use different file names:
| Tool | File |
|---|---|
| Claude Code | CLAUDE.md |
| Cursor | .cursorrules |
| GitHub Copilot | .github/copilot-instructions.md |
| Windsurf | .windsurfrules |
| Aider | CONVENTIONS.md |
These files get injected into the system prompt at session start. The AI reads them before anything else. This is the closest thing to permanent memory in most AI coding workflows.
Lifetime: Permanent (until you edit the file)
What goes in: Whatever you write into the file
What goes out: Nothing — it persists across sessions
Your control: Total — you're the author
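Conceptually, the injection step is simple: at session start the tool looks for its memory file and prepends the contents to the system prompt, before any of your messages. A minimal sketch of that mechanism (file names from the table above; the function itself is illustrative, not any tool's actual code):

```python
from pathlib import Path

# Tool-specific memory file names; a tool reads whichever one it owns.
MEMORY_FILES = ["CLAUDE.md", ".cursorrules", "CONVENTIONS.md"]

def build_system_prompt(repo_root: str, base_prompt: str) -> str:
    """Prepend any injected-memory files found in the repo root to the
    base system prompt, roughly how AI coding tools load them at start."""
    sections = [base_prompt]
    for name in MEMORY_FILES:
        path = Path(repo_root) / name
        if path.is_file():
            sections.append(f"# Project instructions ({name})\n{path.read_text()}")
    return "\n\n".join(sections)
```

The key property: this happens before your first message, so the AI never starts a session blank.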
3. External Memory (Persistent, Automated)
Some advanced AI coding tools connect to external retrieval systems — vector databases, code search indexes, embedding stores. When you ask a question, the tool searches this external store and pulls relevant context into the session automatically.
Examples:
- Codebase indexing in Cursor or Copilot — semantically searching your entire repo
- Knowledge bases in enterprise AI tools — past decisions, architecture docs, runbooks
- Embeddings-based search — finding the right file or function by meaning, not exact keyword
External memory can hold arbitrarily large amounts of information — far more than any context window. The trade-off is retrieval quality: you get back what the retrieval system thinks is relevant, not necessarily what you actually need.
Lifetime: As long as the index exists
What goes in: Indexed files, documents, code
What goes out: Retrieved chunks, not full content
Your control: Medium — you control what gets indexed
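The retrieval flow itself can be sketched in a few lines. The toy bag-of-words "embedding" below stands in for the neural embeddings real tools use; only the shape of the pipeline (embed, score, take top-k) is the point:

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words vector. Real tools use neural
    embeddings, but the retrieval flow is identical."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query. These chunks, not
    the whole index, are what gets injected into the session."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

Notice where the trade-off lives: `retrieve` returns whatever scores highest, which is the system's guess at relevance, not a guarantee of it.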
4. Agent Memory (Persistent, AI-Written)
This is the newest and most interesting type. In agentic workflows, the AI itself writes notes to files that persist across sessions — effectively creating its own memory.
Claude Code does this with its /memory system. The agent can write observations, decisions, and learned facts to markdown files. Future sessions read these files and pick up where the previous agent left off.
Lifetime: As long as the file exists
What goes in: Whatever the agent decides to record
What goes out: The agent's own notes, on next read
Your control: Low to medium — you review and edit, but the AI writes
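Under the hood, agent memory is just append-and-reread. A minimal sketch, using hypothetical `remember`/`recall` helpers rather than Claude Code's actual implementation:

```python
from datetime import date
from pathlib import Path

def remember(memory_file: Path, note: str) -> None:
    """Append a dated note, the way an agent might persist an observation
    at the end of a session."""
    entry = f"- [{date.today().isoformat()}] {note}\n"
    with memory_file.open("a") as f:
        f.write(entry)

def recall(memory_file: Path) -> str:
    """Read all accumulated notes back at the start of the next session."""
    return memory_file.read_text() if memory_file.exists() else ""
```

Because the notes are plain files in your repo, you can review, edit, and prune them like any other text.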
Why Injected Memory Is Your Most Powerful Tool Right Now
External memory and agent memory are powerful but require significant setup or specific tools. Injected memory — a plain text file in your repo — is available today, in every AI coding tool, with zero infrastructure.
And most developers don't use it.
Here's what a well-written CLAUDE.md (or equivalent) does for you:
Eliminates the briefing ritual. No more spending the first ten minutes of every session re-explaining your architecture. The AI already knows.
Keeps AI suggestions on-track. When the AI knows your conventions, it generates code that fits your codebase instead of code that looks like a Stack Overflow answer from a different project.
Survives team turnover. New team member onboarding with an AI? The injected memory file teaches the AI your project's rules. The AI teaches the new developer. The context file becomes living documentation.
Compounds over time. Every time you discover a new constraint or make a key decision, you add it to the file. The AI gets progressively smarter about your specific project.
What to Put in Your Memory File
The common mistake is writing a memory file that describes what the codebase does. That's documentation — and the AI can read your source files for that.
What the AI can't derive from source files: why decisions were made, what to avoid, and what you care about.
High-Value Content
Architecture decisions with rationale:

```
## Architecture
- Repository pattern for all database access. No raw SQL in handlers.
  Reason: we had a production incident from a handler doing direct queries —
  the abstraction layer gives us one place to add logging and retries.
- All errors are returned as values, not thrown as exceptions.
  Reason: consistent with our Go-influenced error handling style.
```

Conventions that aren't obvious from the code:

```
## Conventions
- API response shape: { data: T, error: string | null, meta: { page, total } }
- Date formatting: always ISO 8601. Never locale-specific formats.
- Feature flags live in lib/flags.ts. Never hardcode feature toggles inline.
```

What to avoid:

```
## Do Not
- Do not use useEffect for data fetching. We use React Query.
- Do not add new dependencies without checking with the team first.
- Do not use the UserContext directly — use the useCurrentUser() hook.
```

Current project state:

```
## Current Work
- Migrating from REST to tRPC. New endpoints use tRPC. Old ones being migrated gradually.
- Auth is being refactored. Don't touch auth/legacy/ — it's being replaced.
```

Low-Value Content (Skip These)
- File structure descriptions — the AI can read your directory tree
- Library version numbers — irrelevant to code generation
- How to run the project — put that in the README
- Generic best practices — the AI already knows these
Building Memory That Evolves With Your Project
A memory file written once and never updated is a memory file that lies.
The best teams treat the injected memory file like a living document — updated as the project evolves, reviewed periodically, and pruned when information becomes stale.
A simple workflow:
When to update the memory file:
- After any architectural decision
- When you reject an AI suggestion for a project-specific reason
- When you discover a constraint the AI keeps getting wrong
- After onboarding a new developer (what did they find confusing?)
- When deprecating something the AI keeps suggesting
When to prune the memory file:
- When a constraint no longer applies
- When you've migrated away from something you warned about
- When the file gets long enough that important items get lost
Keep it focused. A 50-line memory file that the AI reads carefully is more valuable than a 500-line one it skims.
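One way to hold that line is a small check you can run in CI or a pre-commit hook. The 50-line threshold and the default file name below are illustrative, not a standard:

```python
import sys
from pathlib import Path

MAX_LINES = 50  # illustrative threshold from the guideline above

def check_memory_file(path: str, max_lines: int = MAX_LINES) -> list[str]:
    """Return warnings if the memory file is missing or has grown too long."""
    p = Path(path)
    if not p.is_file():
        return [f"{path}: missing; the AI starts every session with no project memory"]
    line_count = len(p.read_text().splitlines())
    if line_count > max_lines:
        return [f"{path}: {line_count} lines; consider pruning below {max_lines}"]
    return []

if __name__ == "__main__":
    for warning in check_memory_file(sys.argv[1] if len(sys.argv) > 1 else "CLAUDE.md"):
        print(warning)
```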
Agent Memory: Letting the AI Remember for Itself
In Claude Code's memory system, the agent can write to memory files after a session. You can prompt it explicitly:
"Remember that we're using optimistic updates in this module — don't suggest adding loading states for user interactions."
The agent writes that to a memory file. Next session, it reads the file before you say anything. You never repeat yourself again.
This is powerful but requires oversight. AI-written memory can:
- Capture subtleties that you'd forget to write down yourself
- Compound across many sessions without manual effort
- Occasionally record something wrong or outdated
Review AI-written memory periodically the same way you'd review a junior developer's notes — with trust but verification.
Memory Across Tools: A Practical Comparison
| Memory Type | Claude Code | Cursor | Copilot | Aider |
|---|---|---|---|---|
| Injected files | CLAUDE.md | .cursorrules | copilot-instructions.md | CONVENTIONS.md |
| Codebase indexing | Via MCP tools | Built-in | Built-in | Via plugins |
| Agent self-writes | Yes (/memory) | No | No | No |
| Cross-session persistence | Via files | Via files | Via files | Via files |
All of them support injected memory. That's your baseline. Everything else is a layer on top.
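Since every tool reads a different file name, one approach on mixed-tool teams is to keep a single canonical file and copy it to each tool-specific name. A sketch, assuming a hypothetical canonical file named AI_MEMORY.md (the target names come from the table above; Copilot's nested path is omitted for brevity):

```python
import shutil
from pathlib import Path

CANONICAL = "AI_MEMORY.md"  # hypothetical canonical file; pick any name
TARGETS = ["CLAUDE.md", ".cursorrules", ".windsurfrules", "CONVENTIONS.md"]

def sync_memory_files(repo_root: str = ".") -> list[str]:
    """Copy the canonical memory file to each tool-specific file name,
    so every tool injects the same instructions."""
    root = Path(repo_root)
    src = root / CANONICAL
    if not src.is_file():
        raise FileNotFoundError(f"{CANONICAL} not found in {repo_root}")
    written = []
    for name in TARGETS:
        shutil.copyfile(src, root / name)
        written.append(name)
    return written
```

Run it after each edit to the canonical file (or wire it into a pre-commit hook) and the tools stay in sync automatically.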
The Mental Model: RAM vs Disk
The cleanest mental model for AI memory:
- In-context memory = RAM — fast, large enough for current work, volatile
- Injected memory = SSD — always available at boot, you manage what's on it
- External memory = NAS/cloud storage — huge capacity, slower retrieval
- Agent memory = log files — written automatically, needs periodic review
A program that uses only RAM and never persists to disk loses all state on restart. An AI that uses only in-context memory and no injected files loses all knowledge of your project every session.
You wouldn't build a program that way. Don't build your AI workflow that way either.
Summary and Key Takeaways
✅ AI coding assistants have four memory types: in-context, injected, external, and agent-written
✅ In-context memory is volatile — it dies with the session
✅ Injected memory (CLAUDE.md, .cursorrules, etc.) is persistent, manual, and available in every tool today
✅ Write memory files with decisions + rationale + what to avoid — not descriptions of what the code does
✅ A 50-line focused memory file beats a 500-line exhaustive one
✅ Treat memory files as living documents — update after decisions, prune when stale
✅ Agent self-written memory compounds over time — review it periodically
✅ Mental model: in-context = RAM, injected = SSD, external = cloud storage, agent = log files
Final Thought
Every hour you spend repeating yourself to an AI is an hour you could have spent writing one good memory file.
The AI isn't going to remember. That's not a flaw — it's the architecture. Your job is to build the persistence layer that makes it irrelevant.
Write the file. Keep it current. Stop briefing. Start building.
Related: Context Is Everything in AI Coding Assistants — the companion post on managing the in-session context window.