TL;DR
Background Agents in Cursor 3 let you delegate coding tasks asynchronously—like assigning tickets to AI team members that work independently while you focus elsewhere. This guide covers practical workflow patterns, parallel execution strategies, configuration best practices, and real limitations learned from daily use. Whether you're running agents locally in worktrees or remotely in the cloud, you'll learn how to structure your work for maximum async throughput.
💡 Tool tip: When building .cursor/rules or environment.json configs, validate your JSON syntax with our free JSON Formatter to catch errors before agents consume them.
Table of Contents
- What Background Agents Actually Are
- Setting Up the Agents Window
- Five Practical Workflow Patterns
- Configuration Deep Dive
- Performance and Benchmarks
- Background Agent vs Claude Code vs Copilot Agent
- Integrations and Multi-Platform Triggers
- Limitations and Gotchas
- Best Practices for Daily Use
- Key Takeaways
- FAQ
- Related Resources
What Background Agents Actually Are
Background Agents are autonomous coding units that execute tasks independently of your active editing session. Unlike inline completions or synchronous chat, they don't block your workflow—you describe a task, the agent goes to work, and you get notified when it's done or needs input.
The mental model is delegation, not direction: you hand off a desired outcome instead of steering each step.
Each agent gets its own isolated environment:
- Local worktree agents run in a separate Git worktree on your machine, preventing conflicts with your working directory
- Cloud agents spin up dedicated Ubuntu VMs with full dev environments—they keep running after you close your laptop
- Remote SSH agents execute on your own servers or dev machines
The key insight: agents don't just generate code snippets. They clone repos, install dependencies, write implementation code, run test suites, self-correct on failures, and submit pull requests. This is full-cycle development, not code completion.
New to AI agents? Start with our AI Agent glossary entry for foundational concepts.
Setting Up the Agents Window
The Agents Window is your control center for all running agents. Access it via Cmd+Shift+P → "Agents Window" (or the dedicated keyboard shortcut Cmd+Shift+A).
Interface Layout
The window presents each agent as a tab with real-time status:
| Status | Meaning |
|---|---|
| 🟢 Running | Agent is actively coding, testing, or building |
| 🟡 Waiting | Agent needs human input or approval |
| 🔵 Complete | Task finished—PR or diff ready for review |
| 🔴 Failed | Agent hit an unrecoverable error |
You can tile agents side-by-side to monitor multiple tasks simultaneously. Each agent tab shows:
- Current step (e.g., "Running test suite — 3/47 passing")
- File changes diff
- Terminal output
- Browser screenshots (for UI tasks)
Launching Your First Agent
// From Cursor's command palette or chat:
// "Create a Background Agent to add input validation to the user registration form"
// The agent will:
// 1. Create a new Git worktree
// 2. Analyze existing form code
// 3. Add validation logic with Zod schemas
// 4. Write unit tests
// 5. Run tests and fix failures
// 6. Present a diff for your review
For Python projects, the workflow is identical:
# Task: "Add retry logic with exponential backoff to all API client methods"
# The agent autonomously:
# 1. Identifies all API client files
# 2. Adds tenacity decorator with backoff
# 3. Writes tests verifying retry behavior
# 4. Runs pytest and fixes any failures
# 5. Submits changes for review
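To make the retry task above concrete, here is a minimal sketch of the kind of change an agent might produce. It deliberately swaps the tenacity decorator for a hand-rolled, standard-library-only decorator so the example stays self-contained; `retry_with_backoff`, `FlakyClient`, and the injectable `sleep` parameter are hypothetical names introduced for illustration.

```python
import time
from functools import wraps


def retry_with_backoff(max_attempts=3, base_delay=0.5, factor=2.0,
                       exceptions=(Exception,), sleep=time.sleep):
    """Retry the wrapped function, multiplying the wait by `factor` each time."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            delay = base_delay
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    sleep(delay)
                    delay *= factor
        return wrapper
    return decorator


class FlakyClient:
    """Hypothetical API client whose first two calls fail."""

    def __init__(self):
        self.calls = 0

    def fetch(self):
        self.calls += 1
        if self.calls < 3:
            raise ConnectionError("transient failure")
        return {"status": "ok"}


waits = []  # record the delays instead of actually sleeping
client = FlakyClient()
fetch_with_retry = retry_with_backoff(
    max_attempts=4, base_delay=0.5,
    exceptions=(ConnectionError,), sleep=waits.append,
)(client.fetch)
result = fetch_with_retry()
```

Passing `sleep` as a parameter is a design choice that lets a test verify the backoff schedule without waiting in real time, which is the sort of test step 3 in the list above asks for.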
Five Practical Workflow Patterns
These patterns emerge from daily use of Background Agents. Each addresses a specific development scenario.
Pattern 1: Assign and Forget — Refactoring Tasks
The simplest pattern. Describe a refactoring task and move on:
Task: "Refactor the payment service to use the Strategy pattern.
Extract each payment method (Stripe, PayPal, Apple Pay) into its own
strategy class. Maintain backward compatibility with existing tests."
This works best for:
- Mechanical refactors with clear rules
- Code style migrations (e.g., class → functional components)
- Dependency upgrades with predictable changes
When to use: Tasks where the desired end state is unambiguous.
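As a rough picture of the end state that refactoring task describes, the sketch below shows the Strategy pattern in Python. The class names, the `charge` method, and the string return values are all hypothetical stand-ins; a real payment service would call each provider's SDK.

```python
from abc import ABC, abstractmethod


class PaymentStrategy(ABC):
    """Common interface every payment method implements."""

    @abstractmethod
    def charge(self, amount_cents: int) -> str:
        ...


class StripeStrategy(PaymentStrategy):
    def charge(self, amount_cents: int) -> str:
        return f"stripe:{amount_cents}"  # placeholder for a real provider call


class PayPalStrategy(PaymentStrategy):
    def charge(self, amount_cents: int) -> str:
        return f"paypal:{amount_cents}"


class ApplePayStrategy(PaymentStrategy):
    def charge(self, amount_cents: int) -> str:
        return f"applepay:{amount_cents}"


class PaymentService:
    """Keeps one public entry point so existing callers and tests still work."""

    def __init__(self, strategies):
        self._strategies = dict(strategies)

    def pay(self, method: str, amount_cents: int) -> str:
        try:
            strategy = self._strategies[method]
        except KeyError:
            raise ValueError(f"unsupported payment method: {method}") from None
        return strategy.charge(amount_cents)


service = PaymentService({
    "stripe": StripeStrategy(),
    "paypal": PayPalStrategy(),
    "apple_pay": ApplePayStrategy(),
})
receipt = service.pay("stripe", 1999)
```

Keeping `PaymentService.pay` as the single entry point is what makes the "maintain backward compatibility" constraint checkable: existing tests keep calling the same method while the per-provider logic moves behind it.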
Pattern 2: Parallel Test Coverage Sprint
Launch multiple agents simultaneously, each covering a different module and each with its own scoped directive:
// Agent 1 task:
`Write comprehensive unit tests for src/auth/.
Target 90% branch coverage. Use Jest + Testing Library.
Follow existing test patterns in src/auth/__tests__/login.test.ts.
Mock external services using MSW.`
// Agent 2 task:
`Write integration tests for src/payments/.
Cover Stripe webhook handling, refund flows, and subscription lifecycle.
Use the test Stripe API keys from .env.test.`
When to use: When you need to rapidly boost test coverage before a release.
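If you launch several of these at once, it can help to generate the directives from one template so every agent gets the same structure. The following is only a sketch: `ScopedTask` and its fields are hypothetical, and the "only modify files under this directory" line is an assumed convention added here to keep parallel agents out of each other's way, not something Cursor requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedTask:
    """One agent's directive, limited to a single module."""
    module: str
    kind: str              # e.g. "unit" or "integration"
    coverage_target: int   # percent branch coverage
    notes: str = ""

    def render(self) -> str:
        lines = [
            f"Write {self.kind} tests for {self.module}.",
            f"Target {self.coverage_target}% branch coverage.",
            # Assumed convention: keep each agent inside its own module.
            "Only modify files under this directory and its tests.",
        ]
        if self.notes:
            lines.append(self.notes)
        return "\n".join(lines)


tasks = [
    ScopedTask("src/auth/", "unit", 90, "Mock external services using MSW."),
    ScopedTask("src/payments/", "integration", 80),
]
prompts = [task.render() for task in tasks]
```

Each entry in `prompts` can then be pasted into a separate agent.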
Pattern 3: Review Prep — Agent Pre-Reviews
Before requesting human review, have an agent analyze the PR:
Task: "Review the changes in PR #247. Check for:
1. Security issues (SQL injection, XSS, auth bypass)
2. Performance regressions (N+1 queries, missing indexes)
3. Missing error handling
4. Breaking API changes
Post findings as a summary."
This catches obvious issues before consuming a human reviewer's time. Bugbot automates this further—on every PR, it runs analysis using Learned Rules from your team's historical feedback.
When to use: Before every human code review. Zero friction, high signal.
Pattern 4: Worktree Isolation for Experiments
The /worktree command creates a fully isolated branch for speculative work:
/worktree Experiment with replacing Express with Hono for the API layer.
Migrate 3 representative endpoints. Benchmark request throughput.
Report results without merging.
The agent works in its own worktree. Your working directory stays untouched. If the experiment fails, discard it with zero cleanup.
When to use: Evaluating new libraries, architectural experiments, risky migrations you want to test before committing.
Pattern 5: Best-of-N Model Racing
The /best-of-n command runs the same task across multiple models simultaneously:
/best-of-n composer, claude-sonnet, gpt-5
Implement a rate limiter middleware using sliding window algorithm.
Must handle distributed scenarios with Redis.
Cursor creates a separate worktree per model. Once all complete, you compare implementations side-by-side and use /apply-worktree to merge the winner.
// Example output comparison:
// Composer 2: 47 lines, clean but basic sliding window
// Claude Sonnet: 89 lines, handles edge cases, includes Lua script
// GPT-5: 62 lines, TypeScript-native, good test coverage
// Choose: /apply-worktree claude-sonnet
When to use: Critical path code where you want to compare approaches before committing to one.
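For readers unfamiliar with the algorithm in the example task, here is a minimal single-process sketch of a sliding window rate limiter. It is not the distributed Redis version the task asks for, and it is not output from any of the models above; `SlidingWindowLimiter` and its parameters are hypothetical names, and timestamps are passed in explicitly to keep the example deterministic.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client in any `window_seconds` span."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self._hits = {}  # client id -> deque of accepted request timestamps

    def allow(self, client_id, now):
        hits = self._hits.setdefault(client_id, deque())
        # Drop timestamps that have slid out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(limit=2, window_seconds=10)
decisions = [limiter.allow("client-a", t) for t in (0, 1, 2, 10.5, 11.5)]
```

The third request is rejected because two requests already sit inside the 10-second window; later requests are accepted once the earlier timestamps expire. A distributed variant would typically keep the same logic but store the timestamps in a shared store such as a Redis sorted set.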
Configuration Deep Dive
Proper configuration is the difference between agents that "kind of work" and agents that consistently deliver production-quality code.
Project Rules (.cursor/rules)
This file tells agents your project's conventions—think of it as onboarding documentation for your AI team:
# .cursor/rules
project:
  name: "acme-api"
  language: "TypeScript"
  framework: "NestJS"
  test_runner: "Jest"
conventions:
  - "Use dependency injection via NestJS providers"
  - "All endpoints require @Auth() decorator"
  - "Database access only through repository pattern"
  - "Error responses use ProblemDetails (RFC 7807)"
  - "No console.log in production code—use Logger service"
testing:
  - "Unit tests in __tests__/ adjacent to source"
  - "Integration tests in test/ at project root"
  - "Use TestingModule.createTestingModule() for DI"
  - "Mock external APIs with nock"
code_style:
  - "Prefer explicit return types on public methods"
  - "Use barrel exports (index.ts) for module boundaries"
  - "Enum values in UPPER_SNAKE_CASE"
Cloud Environment Configuration
For cloud agents, define the development environment declaratively:
{
  "image": "node:20-bookworm",
  "setup": {
    "commands": [
      "npm ci",
      "npx prisma generate",
      "npx prisma db push --accept-data-loss"
    ]
  },
  "services": {
    "database": "postgres:16",
    "cache": "redis:7-alpine",
    "queue": "rabbitmq:3-management"
  },
  "secrets": ["DATABASE_URL", "STRIPE_SECRET_KEY", "JWT_SECRET"],
  "layer_cache": true
}
The layer_cache: true flag enables Docker layer caching, yielding ~70% faster rebuilds on subsequent agent runs.
Validate your environment.json with our JSON Formatter to catch syntax issues early.
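If you prefer to check the file from a script, a few lines of Python can catch both syntax errors and obvious type mistakes. This is a sketch only: the expected types below are inferred from the example config above, not from a published schema, and `check_environment` is a hypothetical helper.

```python
import json

# Assumed shapes, inferred from the example environment.json above.
EXPECTED_TYPES = {
    "image": str,
    "setup": dict,
    "services": dict,
    "secrets": list,
    "layer_cache": bool,
}


def check_environment(text):
    """Return a list of problems; an empty list means nothing was flagged."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"]
    if not isinstance(config, dict):
        return ["top-level value must be an object"]
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key in config and not isinstance(config[key], expected):
            problems.append(f"{key!r} should be {expected.__name__}")
    return problems


good = check_environment('{"image": "node:20-bookworm", "layer_cache": true}')
trailing_comma = check_environment('{"image": "node:20-bookworm",}')
wrong_type = check_environment('{"layer_cache": "yes"}')
```

A trailing comma, one of the most common hand-editing mistakes, is reported with its line and column, while a string where a boolean is expected is flagged by key name.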
Multi-Repo Setup (Cursor 3.4+)
Since the 3.4 update, agents can operate across multiple repositories:
# .cursor/multi-repo.yaml
repositories:
  - path: "./backend"
    rules: "./backend/.cursor/rules"
    language: "Python"
  - path: "./frontend"
    rules: "./frontend/.cursor/rules"
    language: "TypeScript"
  - path: "./shared-types"
    rules: null  # Uses root rules
    language: "TypeScript"

# Agent task spanning repos:
# "Update the User type in shared-types, then update both
# backend serializers and frontend components to match."
Build Secrets for Private Registries
If your project pulls packages from private registries:
{
  "build_secrets": {
    "NPM_TOKEN": "settings://secrets/npm-token",
    "GITHUB_PACKAGES_TOKEN": "settings://secrets/gh-packages"
  },
  "npmrc_template": "//npm.pkg.github.com/:_authToken=${GITHUB_PACKAGES_TOKEN}"
}
Secrets are injected at build time only—never written to disk in the agent's workspace.
Performance and Benchmarks
Real performance data from Cursor's published metrics and community testing:
Bugbot Resolution Rates
| Metric | Cursor Bugbot | GitHub Copilot | CodeRabbit |
|---|---|---|---|
| Bugs found per run | 0.70 avg | 0.43 avg | 0.51 avg |
| Resolution rate | 78.13% | 46.69% | 48.96% |
| User merge rate | 79% | N/A | N/A |
| Self-improvement | Learned Rules | None | Manual rules |
The 79% merge rate means that nearly 4 out of 5 bugs identified by Bugbot are resolved by developers at merge time—indicating high signal-to-noise ratio.
Cloud Agent Build Performance
| Operation | Without Cache | With Layer Cache | Improvement |
|---|---|---|---|
| Initial build | ~4.2 min | ~4.2 min | — |
| Subsequent builds | ~4.2 min | ~1.3 min | 70% faster |
| Dependency install | ~2.1 min | ~0.4 min | 81% faster |
| Test execution | ~1.8 min | ~1.8 min | No change |
Layer caching applies to npm ci, system package installs, and Prisma generation—anything that doesn't change between runs.
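The improvement column can be reproduced from the timings in the table. The short sketch below uses a hypothetical helper, `percent_faster`, and shows that the "70% faster" figure is a rounding of roughly 69%.

```python
def percent_faster(before_min, after_min):
    """Time saved as a whole-number percentage of the uncached duration."""
    return round((before_min - after_min) / before_min * 100)


subsequent_builds = percent_faster(4.2, 1.3)   # table: "70% faster"
dependency_install = percent_faster(2.1, 0.4)  # table: "81% faster"
```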
Token Economics
Understanding token usage helps optimize costs:
Average agent task: ~15,000-50,000 tokens consumed
Simple refactor: ~8,000 tokens
Test generation (one module): ~25,000 tokens
Full feature implementation: ~80,000-150,000 tokens
Composer 2's pricing ($0.50/M input, $2.50/M output standard) makes routine agent tasks extremely cheap—a typical refactoring costs less than $0.05.
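The arithmetic behind that estimate is easy to check. The sketch below uses the per-million-token prices quoted above; the split between input and output tokens (`output_share`) is an assumption, since the figures above only give totals.

```python
INPUT_USD_PER_M = 0.50    # quoted standard input price per million tokens
OUTPUT_USD_PER_M = 2.50   # quoted standard output price per million tokens


def estimate_cost(total_tokens, output_share):
    """Estimated USD cost, given the fraction of tokens that are output."""
    output_tokens = total_tokens * output_share
    input_tokens = total_tokens - output_tokens
    return (input_tokens * INPUT_USD_PER_M
            + output_tokens * OUTPUT_USD_PER_M) / 1_000_000


# Worst case for a simple refactor: treat all 8,000 tokens as output.
refactor_ceiling = estimate_cost(8_000, output_share=1.0)
# Large feature at the top of the range, assuming 20% of tokens are output.
feature_estimate = estimate_cost(150_000, output_share=0.2)
```

Even with every token billed at the output rate, an 8,000-token refactor comes to about $0.02, which is consistent with the "less than $0.05" claim. Under the assumed 20% output share, a 150,000-token feature comes to about $0.135.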
Background Agent vs Claude Code vs Copilot Agent
Choosing the right async coding tool depends on your workflow preferences:
| Feature | Cursor Background Agent | Claude Code | GitHub Copilot Agent |
|---|---|---|---|
| Interface | GUI (Agents Window) | Terminal CLI | VS Code + GitHub UI |
| Execution | Local worktree / Cloud VM / SSH | Local terminal / CI | GitHub-hosted runner |
| Parallel tasks | Multiple simultaneous agents | Single session (multi via SDK) | One per Issue |
| Trigger sources | IDE, Slack, GitHub, Linear, Mobile | Terminal, GitHub Actions, SDK | GitHub Issues, PR comments |
| Multi-repo | Native (3.4+) | Manual context | Single repo per agent |
| Max autonomy | Until PR submission | 7+ hours continuous | Until PR submission |
| Offline/local | Yes (worktree mode) | Yes (native) | No (cloud only) |
| Best for | Parallel orchestration, team workflows | Terminal power users, CI/CD | GitHub-native enterprises |
| Cost | Pro $20/mo (included) | Max $100/mo or API usage | Enterprise $39/user/mo |
For a comprehensive tool comparison, see our AI Coding Tools 2026 Comparison.
Understanding the LLM architectures behind these tools helps you pick the right model for each task.
Integrations and Multi-Platform Triggers
Cursor 3.4 introduced multi-platform agent triggers—you're no longer limited to the desktop IDE.
Microsoft Teams Integration
@Cursor in any Teams channel:
"@Cursor Fix the flaky test in auth.spec.ts — it fails on CI
but passes locally. Probably a timing issue."
The agent picks up the task, creates a cloud session, and posts results back to the Teams thread.
GitHub and Linear Triggers
# .github/workflows/cursor-agent.yml
on:
  issues:
    types: [labeled]
jobs:
  agent-fix:
    if: contains(github.event.label.name, 'cursor-agent')
    runs-on: ubuntu-latest
    steps:
      - uses: cursor/agent-action@v1
        with:
          task: ${{ github.event.issue.body }}
          project_rules: .cursor/rules
Mobile Access
Monitor and manage running agents from Cursor's mobile companion app:
- View agent progress and status
- Approve or reject pending changes
- Assign new tasks on the go
- Receive completion notifications
This means you can assign a bug fix to an agent while commuting—review and merge the result when you're back at your desk.
Limitations and Gotchas
Background Agents are powerful but not magic. These are real limitations learned from daily use:
When Agents Struggle
- Architectural decisions: Agents execute well on clearly scoped tasks. They're poor at deciding whether to use microservices vs. monolith, or choosing between event sourcing and CRUD. Keep strategic decisions human.
- Large context requirements: Very large monorepos can exceed the context window. If an agent needs to understand 50+ files simultaneously, it may miss connections. Break tasks into smaller units.
- Exploratory/creative coding: When the goal isn't clear ("make the UX feel better"), agents produce generic output. They need concrete acceptance criteria.
- Complex state machines: Multi-step business logic with subtle edge cases often requires iterative human-agent collaboration rather than pure delegation.
Cost Considerations
- Cloud agents on Pro plan are included but usage-capped
- Heavy parallel usage may require Pro+ ($60/mo) or Ultra ($200/mo)
- Composer 2 is cheap per-token, but 10 parallel agents burning tokens adds up
- Self-hosted cloud agents (Enterprise) remove per-usage costs but require infrastructure
Common Pitfalls
// ❌ Vague task description
"Make the code better"
// ✅ Specific, verifiable task
"Refactor src/services/email.ts:
- Extract HTML template rendering into a separate TemplateService
- Add retry logic for SMTP failures (3 attempts, exponential backoff)
- Add unit tests achieving >80% branch coverage
- Run existing tests to ensure no regressions"
# ❌ No acceptance criteria
"Add caching to the API"
# ✅ Clear scope and constraints
"""Add Redis caching to GET /api/products endpoint:
- Cache key: f"products:{category}:{page}"
- TTL: 300 seconds
- Invalidate on POST/PUT/DELETE to /api/products
- Add cache hit/miss metrics via StatsD
- Write integration test verifying cache behavior"""
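A task this specific maps almost line for line onto code. As an illustration, here is an in-memory sketch of the caching behavior the second task describes: the key format, the 300-second TTL, invalidation on writes, and hit/miss counters. It stands in for Redis and StatsD rather than using them, and `ProductCache` and its injectable `clock` are hypothetical names.

```python
import time


class ProductCache:
    """In-memory stand-in for the Redis cache described in the task."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)
        self.hits = 0       # a real service would emit these as metrics
        self.misses = 0

    @staticmethod
    def key(category, page):
        return f"products:{category}:{page}"

    def get(self, category, page):
        entry = self._entries.get(self.key(category, page))
        if entry is not None and entry[0] > self.clock():
            self.hits += 1
            return entry[1]
        self.misses += 1
        return None

    def set(self, category, page, value):
        expires_at = self.clock() + self.ttl
        self._entries[self.key(category, page)] = (expires_at, value)

    def invalidate_all(self):
        """Call after POST/PUT/DELETE to /api/products."""
        self._entries.clear()


now = [0.0]  # fake clock so expiry can be exercised without waiting
cache = ProductCache(ttl_seconds=300, clock=lambda: now[0])
first = cache.get("books", 1)      # nothing cached yet
cache.set("books", 1, ["novel"])
second = cache.get("books", 1)     # served from cache
now[0] = 301.0
third = cache.get("books", 1)      # entry has expired
```

Because every bullet in the task corresponds to a method or a counter here, a reviewer can verify the agent's output point by point, which is what makes the specific version of the task verifiable and the vague one not.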
Best Practices for Daily Use
Seven practices distilled from heavy daily use of Background Agents:
1. Write tasks like Jira tickets, not chat messages. Include context, acceptance criteria, and constraints. An agent that pauses in the Waiting state for clarification delivers nothing until you respond, so front-load what it needs.
2. Start with low-stakes tasks. Don't assign your first agent a critical production hotfix. Begin with test generation, documentation updates, or dependency upgrades—build trust incrementally.
3. Use /worktree for anything experimental. It's zero cost to discard. Treat worktrees like draft branches that never touch your working copy.
4. Leverage Learned Rules aggressively. Every time you review agent output and correct something, that feedback trains Bugbot. The more you interact, the better agents get at your codebase.
5. Batch related tasks for parallel execution. Instead of one agent doing "write tests for entire app," launch 4 agents each covering one module. Parallelism is your multiplier.
6. Review diffs, not full files. Agents show you exactly what changed. Focus your review energy on the diff—trust the surrounding context that didn't change.
7. Keep .cursor/rules updated. As your project evolves, update the rules file. It's the single most impactful lever for agent output quality. Think of it like prompt engineering for your entire project.
When managing configuration files across projects, use our YAML to JSON converter for quick format switching between rule file formats.
Key Takeaways
- Background Agents = async delegation: Assign tasks and get results without blocking your current work
- Five patterns cover most day-to-day use cases: Assign-and-forget, parallel test sprints, review prep, worktree experiments, best-of-n racing
- Configuration quality determines output quality: Invest in .cursor/rules and environment.json
- Parallel execution is the multiplier: 4 agents × 30 min = 2 hours of work in 30 minutes
- Know the boundaries: Agents excel at well-scoped implementation; struggle with ambiguity and architecture
- Bugbot's 78% resolution rate makes automated pre-review practically mandatory
- Multi-platform triggers (Teams, GitHub, Linear, mobile) mean agents work even when you're away from your IDE
For a deeper look at Cursor 3's architecture including Composer 2 and Canvases, see our Cursor 3 Cloud Agent & Composer Review.
Further Reading
- Explore other AI coding paradigms in our Spec Coding Guide.
- Learn about the evolution of context management in the Context Engineering Guide.
FAQ
What is a Background Agent in Cursor 3?
A Background Agent is an autonomous AI coding unit that runs tasks asynchronously in Cursor's Agents Window. Think of it like assigning a ticket to a junior developer—it works independently on code changes, tests, and PRs while you focus on other work. Agents can run locally in Git worktrees, in the cloud on isolated VMs, or on remote SSH hosts.
How many Background Agents can I run in parallel?
On the Pro plan ($20/mo), you can run multiple agents simultaneously; the practical limit depends on your plan's usage quota. Each agent operates in its own isolated environment (a Git worktree locally, a VM in the cloud), so parallel agents don't overwrite each other's working files. Pro+ and Ultra plans offer 3x and 20x usage respectively, enabling heavier parallel workloads.
What's the difference between Background Agent and Cloud Agent?
Background Agent is the broader concept—any agent running asynchronously. Cloud Agent is a specific type of Background Agent that runs on Cursor's remote VMs with full Ubuntu environments. You can also run Background Agents locally (in worktrees) or on remote SSH connections. Cloud Agents keep working after you close your laptop.
How do I configure my project for Background Agents?
Create a .cursor/rules file defining project conventions, then optionally add .cursor/environment.json for cloud environments. For multi-repo setups (Cursor 3.4+), configure each repository's rules independently. Use the Secrets tab in Settings for API keys and credentials—never commit secrets to config files.
Are Background Agents worth it for solo developers?
Absolutely. Solo developers benefit the most from parallel agent execution—you can have one agent writing tests, another refactoring a module, and a third updating docs, effectively multiplying your output. The key is writing clear task descriptions and reviewing agent output, not doing all the coding yourself.
Related Resources
Internal Tools
- JSON Formatter — Validate agent configuration files
- YAML to JSON — Convert between rule file formats
- Text Diff — Compare agent output across best-of-n runs
- Regex Tester — Test patterns used in agent rules
Related Articles
- Cursor 3 Cloud Agent & Composer Deep Dive — Architecture and feature overview
- Claude Code Agent Programming Guide — Terminal-native alternative
- AI Coding Tools 2026 Comparison — Full tool landscape
- From Programmer to Agent Shepherd — The shifting developer role
- Vibe Coding Complete Guide — Foundational coding philosophy
Glossary
- AI Agent — Autonomous AI systems that take actions
- LLM — Large Language Models powering agents
- Prompt Engineering — Crafting effective instructions
- Context Window — Token limits agents operate within
- Token — The unit of AI model input/output
- MCP — Model Context Protocol for tool integration