TL;DR

Background Agents in Cursor 3 let you delegate coding tasks asynchronously—like assigning tickets to AI team members that work independently while you focus elsewhere. This guide covers practical workflow patterns, parallel execution strategies, configuration best practices, and real limitations learned from daily use. Whether you're running agents locally in worktrees or remotely in the cloud, you'll learn how to structure your work for maximum async throughput.

💡 Tool tip: When building .cursor/rules or environment.json configs, validate your JSON syntax with our free JSON Formatter to catch errors before agents consume them.

What Background Agents Actually Are

Background Agents are autonomous coding units that execute tasks independently of your active editing session. Unlike inline completions or synchronous chat, they don't block your workflow—you describe a task, the agent goes to work, and you get notified when it's done or needs input.

The mental model is delegation, not direction:

graph LR
  A["Developer"] -->|"Assign task"| B["Agents Window"]
  B --> C["Agent 1: Local Worktree"]
  B --> D["Agent 2: Cloud VM"]
  B --> E["Agent 3: Remote SSH"]
  C -->|"Completes"| F["PR Ready"]
  D -->|"Completes"| F
  E -->|"Completes"| F
  F -->|"Review"| A

Each agent gets its own isolated environment:

  • Local worktree agents run in a separate Git worktree on your machine, preventing conflicts with your working directory
  • Cloud agents spin up dedicated Ubuntu VMs with full dev environments—they keep running after you close your laptop
  • Remote SSH agents execute on your own servers or dev machines

The key insight: agents don't just generate code snippets. They clone repos, install dependencies, write implementation code, run test suites, self-correct on failures, and submit pull requests. This is full-cycle development, not code completion.

New to AI agents? Start with our AI Agent glossary entry for foundational concepts.

Setting Up the Agents Window

The Agents Window is your control center for all running agents. Access it via Cmd+Shift+P → "Agents Window" (or the dedicated keyboard shortcut Cmd+Shift+A).

Interface Layout

The window presents each agent as a tab with real-time status:

| Status | Meaning |
| --- | --- |
| 🟢 Running | Agent is actively coding, testing, or building |
| 🟡 Waiting | Agent needs human input or approval |
| 🔵 Complete | Task finished; PR or diff ready for review |
| 🔴 Failed | Agent hit an unrecoverable error |

You can tile agents side-by-side to monitor multiple tasks simultaneously. Each agent tab shows:

  • Current step (e.g., "Running test suite — 3/47 passing")
  • File changes diff
  • Terminal output
  • Browser screenshots (for UI tasks)

Launching Your First Agent

typescript
// From Cursor's command palette or chat:
// "Create a Background Agent to add input validation to the user registration form"

// The agent will:
// 1. Create a new Git worktree
// 2. Analyze existing form code
// 3. Add validation logic with Zod schemas
// 4. Write unit tests
// 5. Run tests and fix failures
// 6. Present a diff for your review

For Python projects, the workflow is identical:

python
# Task: "Add retry logic with exponential backoff to all API client methods"

# The agent autonomously:
# 1. Identifies all API client files
# 2. Adds tenacity decorator with backoff
# 3. Writes tests verifying retry behavior
# 4. Runs pytest and fixes any failures
# 5. Submits changes for review
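
To make the retry behavior in that task concrete, here is a minimal standard-library sketch of exponential backoff. It is not what an agent would necessarily produce (the task text mentions tenacity), and `retry_with_backoff`, its parameters, and `flaky_fetch` are hypothetical names chosen for illustration. The `sleep` parameter is injectable so the delays can be observed without actually waiting.

```python
import functools
import time


def retry_with_backoff(max_attempts=3, base_delay=0.5, factor=2.0, sleep=time.sleep):
    """Retry the wrapped call, waiting base_delay * factor**n between attempts."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except ConnectionError:
                    # Give up and re-raise once the final attempt has failed.
                    if attempt == max_attempts - 1:
                        raise
                    sleep(base_delay * factor ** attempt)
        return wrapper
    return decorator


# Usage: a hypothetical client call that fails twice, then succeeds.
attempts = {"count": 0}
recorded_delays = []


@retry_with_backoff(max_attempts=3, base_delay=0.5, sleep=recorded_delays.append)
def flaky_fetch():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"
```

Only `ConnectionError` is retried here; which exceptions count as transient is a per-client decision that a task description should spell out.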

Five Practical Workflow Patterns

These patterns emerge from daily use of Background Agents. Each addresses a specific development scenario.

Pattern 1: Assign and Forget — Refactoring Tasks

The simplest pattern. Describe a refactoring task and move on:

code
Task: "Refactor the payment service to use the Strategy pattern.
Extract each payment method (Stripe, PayPal, Apple Pay) into its own
strategy class. Maintain backward compatibility with existing tests."

This works best for:

  • Mechanical refactors with clear rules
  • Code style migrations (e.g., class → functional components)
  • Dependency upgrades with predictable changes

When to use: Tasks where the desired end state is unambiguous.
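
For readers less familiar with the Strategy pattern named in that task, the target shape might look roughly like the sketch below. It is an illustration under assumptions, not the output of any agent: the class names, the `charge` method, and the string return values are all hypothetical.

```python
from __future__ import annotations

from typing import Protocol


class PaymentStrategy(Protocol):
    """Interface every payment method implements (hypothetical)."""

    def charge(self, amount_cents: int) -> str: ...


class StripeStrategy:
    def charge(self, amount_cents: int) -> str:
        return f"stripe:{amount_cents}"


class PayPalStrategy:
    def charge(self, amount_cents: int) -> str:
        return f"paypal:{amount_cents}"


class ApplePayStrategy:
    def charge(self, amount_cents: int) -> str:
        return f"applepay:{amount_cents}"


class PaymentService:
    """Delegates to a strategy looked up by name instead of branching on the method."""

    def __init__(self, strategies: dict[str, PaymentStrategy]) -> None:
        self._strategies = strategies

    def pay(self, method: str, amount_cents: int) -> str:
        try:
            strategy = self._strategies[method]
        except KeyError:
            raise ValueError(f"unsupported payment method: {method}") from None
        return strategy.charge(amount_cents)


service = PaymentService({
    "stripe": StripeStrategy(),
    "paypal": PayPalStrategy(),
    "applepay": ApplePayStrategy(),
})
```

Because the end state is this mechanical (one class per method, one lookup in the service), the existing tests give the agent a clear pass/fail signal.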

Pattern 2: Parallel Test Coverage Sprint

Launch multiple agents simultaneously, each covering a different module:

graph TD
  A["Developer: Sprint Plan"] -->|"Agent 1"| B["auth/ module tests"]
  A -->|"Agent 2"| C["payments/ module tests"]
  A -->|"Agent 3"| D["notifications/ module tests"]
  A -->|"Agent 4"| E["user-profiles/ module tests"]
  B --> F["Combined PR: +340 test cases"]
  C --> F
  D --> F
  E --> F

Each agent gets a scoped directive:

typescript
// Agent 1 task:
`Write comprehensive unit tests for src/auth/.
Target 90% branch coverage. Use Jest + Testing Library.
Follow existing test patterns in src/auth/__tests__/login.test.ts.
Mock external services using MSW.`

// Agent 2 task:
`Write integration tests for src/payments/.
Cover Stripe webhook handling, refund flows, and subscription lifecycle.
Use the test Stripe API keys from .env.test.`

When to use: When you need to rapidly boost test coverage before a release.

Pattern 3: Review Prep — Agent Pre-Reviews

Before requesting human review, have an agent analyze the PR:

code
Task: "Review the changes in PR #247. Check for:
1. Security issues (SQL injection, XSS, auth bypass)
2. Performance regressions (N+1 queries, missing indexes)
3. Missing error handling
4. Breaking API changes
Post findings as a summary."

This catches obvious issues before consuming a human reviewer's time. Bugbot automates this further—on every PR, it runs analysis using Learned Rules from your team's historical feedback.

When to use: Before every human code review. Zero friction, high signal.

Pattern 4: Worktree Isolation for Experiments

The /worktree command creates a fully isolated branch for speculative work:

code
/worktree Experiment with replacing Express with Hono for the API layer.
Migrate 3 representative endpoints. Benchmark request throughput.
Report results without merging.

The agent works in its own worktree. Your working directory stays untouched. If the experiment fails, discard it with zero cleanup.

When to use: Evaluating new libraries, architectural experiments, risky migrations you want to test before committing.
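
If you want to see what worktree isolation corresponds to in plain Git, the sketch below builds the two relevant commands. It assumes nothing about how Cursor implements /worktree internally; the helper names are hypothetical, and only the standard `git worktree add -b` and `git worktree remove --force` subcommands are used.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def worktree_add_cmd(repo: Path, branch: str, path: Path) -> list[str]:
    """Command that checks out a new branch into its own directory."""
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path)]


def worktree_remove_cmd(repo: Path, path: Path) -> list[str]:
    """Command that discards the worktree, even if it has uncommitted changes."""
    return ["git", "-C", str(repo), "worktree", "remove", "--force", str(path)]


def run(cmd: list[str]) -> None:
    """Execute a command and raise if Git reports an error."""
    subprocess.run(cmd, check=True)
```

Calling `run(worktree_add_cmd(...))` leaves your main checkout untouched. Removing the worktree deletes only the directory, so the experimental branch still exists until you delete it separately.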

Pattern 5: Best-of-N Model Racing

The /best-of-n command runs the same task across multiple models simultaneously:

code
/best-of-n composer, claude-sonnet, gpt-5
Implement a rate limiter middleware using sliding window algorithm.
Must handle distributed scenarios with Redis.

Cursor creates a separate worktree per model. Once all complete, you compare implementations side-by-side and use /apply-worktree to merge the winner.

typescript
// Example output comparison:
// Composer 2: 47 lines, clean but basic sliding window
// Claude Sonnet: 89 lines, handles edge cases, includes Lua script
// GPT-5: 62 lines, TypeScript-native, good test coverage

// Choose: /apply-worktree claude-sonnet

When to use: Critical path code where you want to compare approaches before committing to one.
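
For reference when comparing the candidates, this is roughly what the core of a sliding-window limiter looks like. The sketch is deliberately single-process and in-memory, so it does not cover the distributed Redis requirement in the task above; the class and method names are hypothetical, and the caller passes in the current time to keep the example deterministic.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within any `window_seconds` span."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        self._hits: dict[str, deque] = {}

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits.setdefault(key, deque())
        # Drop timestamps that have slid out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(limit=2, window_seconds=10.0)
```

A distributed version has to make the trim-count-append sequence atomic across processes, which is one of the things worth checking when you compare the implementations.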

Configuration Deep Dive

Proper configuration is the difference between agents that "kind of work" and agents that consistently deliver production-quality code.

Project Rules (.cursor/rules)

This file tells agents your project's conventions—think of it as onboarding documentation for your AI team:

yaml
# .cursor/rules
project:
  name: "acme-api"
  language: "TypeScript"
  framework: "NestJS"
  test_runner: "Jest"

conventions:
  - "Use dependency injection via NestJS providers"
  - "All endpoints require @Auth() decorator"
  - "Database access only through repository pattern"
  - "Error responses use ProblemDetails (RFC 7807)"
  - "No console.log in production code—use Logger service"

testing:
  - "Unit tests in __tests__/ adjacent to source"
  - "Integration tests in test/ at project root"
  - "Use TestingModule.createTestingModule() for DI"
  - "Mock external APIs with nock"

code_style:
  - "Prefer explicit return types on public methods"
  - "Use barrel exports (index.ts) for module boundaries"
  - "Enum values in UPPER_SNAKE_CASE"

Cloud Environment Configuration

For cloud agents, define the development environment declaratively:

json
{
  "image": "node:20-bookworm",
  "setup": {
    "commands": [
      "npm ci",
      "npx prisma generate",
      "npx prisma db push --accept-data-loss"
    ]
  },
  "services": {
    "database": "postgres:16",
    "cache": "redis:7-alpine",
    "queue": "rabbitmq:3-management"
  },
  "secrets": ["DATABASE_URL", "STRIPE_SECRET_KEY", "JWT_SECRET"],
  "layer_cache": true
}

The layer_cache: true flag enables Docker layer caching, yielding ~70% faster rebuilds on subsequent agent runs.

Validate your environment.json with our JSON Formatter to catch syntax issues early.
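
If you would rather check the file from a script, a small standard-library sketch is enough to catch syntax errors. Treat it as an assumption-laden example: `check_environment_config` is a hypothetical helper, and the choice of `image` and `setup` as required keys is taken from the sample above rather than from any published schema.

```python
import json

# Keys assumed to be required, based only on the sample config above.
REQUIRED_KEYS = ("image", "setup")


def check_environment_config(text: str) -> list:
    """Return a list of problems; an empty list means no issues were found."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]
    if not isinstance(config, dict):
        return ["top-level value must be an object"]
    return [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
```

Pass it the file contents, for example `check_environment_config(open(".cursor/environment.json").read())`, and fail your pre-commit hook if the list is non-empty.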

Multi-Repo Setup (Cursor 3.4+)

Since the 3.4 update, agents can operate across multiple repositories:

yaml
# .cursor/multi-repo.yaml
repositories:
  - path: "./backend"
    rules: "./backend/.cursor/rules"
    language: "Python"
    
  - path: "./frontend"
    rules: "./frontend/.cursor/rules"
    language: "TypeScript"
    
  - path: "./shared-types"
    rules: null  # Uses root rules
    language: "TypeScript"

# Agent task spanning repos:
# "Update the User type in shared-types, then update both
#  backend serializers and frontend components to match."

Build Secrets for Private Registries

If your project pulls packages from private registries:

json
{
  "build_secrets": {
    "NPM_TOKEN": "settings://secrets/npm-token",
    "GITHUB_PACKAGES_TOKEN": "settings://secrets/gh-packages"
  },
  "npmrc_template": "//npm.pkg.github.com/:_authToken=${GITHUB_PACKAGES_TOKEN}"
}

Secrets are injected at build time only—never written to disk in the agent's workspace.

Performance and Benchmarks

Real performance data from Cursor's published metrics and community testing:

Bugbot Resolution Rates

| Metric | Cursor Bugbot | GitHub Copilot | CodeRabbit |
| --- | --- | --- | --- |
| Bugs found per run | 0.70 avg | 0.43 avg | 0.51 avg |
| Resolution rate | 78.13% | 46.69% | 48.96% |
| User merge rate | 79% | N/A | N/A |
| Self-improvement | Learned Rules | None | Manual rules |

The 78.13% resolution rate means that nearly 4 out of 5 bugs identified by Bugbot are resolved by developers by merge time, which indicates a high signal-to-noise ratio.

Cloud Agent Build Performance

| Operation | Without Cache | With Layer Cache | Improvement |
| --- | --- | --- | --- |
| Initial build | ~4.2 min | ~4.2 min | No change |
| Subsequent builds | ~4.2 min | ~1.3 min | 70% faster |
| Dependency install | ~2.1 min | ~0.4 min | 81% faster |
| Test execution | ~1.8 min | ~1.8 min | No change |

Layer caching applies to npm ci, system package installs, and Prisma generation—anything that doesn't change between runs.
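
The improvement figures follow directly from the before and after times. As a quick check of the arithmetic (the function name is just for illustration):

```python
def percent_faster(before_min: float, after_min: float) -> float:
    """Time saved as a percentage of the uncached duration."""
    return (before_min - after_min) / before_min * 100


subsequent_build = percent_faster(4.2, 1.3)    # about 69, quoted above as roughly 70%
dependency_install = percent_faster(2.1, 0.4)  # about 81
```

Since the input times are themselves approximate, the percentages should be read as ballpark values.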

Token Economics

Understanding token usage helps optimize costs:

code
Average agent task: ~15,000-50,000 tokens consumed
Simple refactor: ~8,000 tokens
Test generation (one module): ~25,000 tokens
Full feature implementation: ~80,000-150,000 tokens

Composer 2's pricing ($0.50/M input, $2.50/M output standard) makes routine agent tasks extremely cheap—a typical refactoring costs less than $0.05.
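
To sanity-check that claim against the token ranges above, here is a small cost calculation using the quoted Composer 2 rates. The input/output split of a given task is an assumption you would need to measure; the sketch uses the worst case, where every token is billed at the output rate.

```python
INPUT_USD_PER_M = 0.50   # quoted standard input price per million tokens
OUTPUT_USD_PER_M = 2.50  # quoted standard output price per million tokens


def task_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one agent task at the rates above."""
    return (input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M) / 1_000_000


# Worst case for a simple refactor: all ~8,000 tokens billed as output.
simple_refactor_max = task_cost_usd(0, 8_000)      # 0.02 USD
# Worst case for a full feature at the top of the range.
full_feature_max = task_cost_usd(0, 150_000)       # 0.375 USD
```

Even the pessimistic bound for a simple refactor stays below $0.05, while a large feature at the top of the range can cost several times more. That is why parallel runs are worth budgeting for.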

Background Agent vs Claude Code vs Copilot Agent

Choosing the right async coding tool depends on your workflow preferences:

| Feature | Cursor Background Agent | Claude Code | GitHub Copilot Agent |
| --- | --- | --- | --- |
| Interface | GUI (Agents Window) | Terminal CLI | VS Code + GitHub UI |
| Execution | Local worktree / Cloud VM / SSH | Local terminal / CI | GitHub-hosted runner |
| Parallel tasks | Multiple simultaneous agents | Single session (multi via SDK) | One per Issue |
| Trigger sources | IDE, Slack, GitHub, Linear, Mobile | Terminal, GitHub Actions, SDK | GitHub Issues, PR comments |
| Multi-repo | Native (3.4+) | Manual context | Single repo per agent |
| Max autonomy | Until PR submission | 7+ hours continuous | Until PR submission |
| Offline/local | Yes (worktree mode) | Yes (native) | No (cloud only) |
| Best for | Parallel orchestration, team workflows | Terminal power users, CI/CD | GitHub-native enterprises |
| Cost | Pro $20/mo (included) | Max $100/mo or API usage | Enterprise $39/user/mo |

For a comprehensive tool comparison, see our AI Coding Tools 2026 Comparison.

Understanding the LLM architectures behind these tools helps you pick the right model for each task.

Integrations and Multi-Platform Triggers

Cursor 3.4 introduced multi-platform agent triggers—you're no longer limited to the desktop IDE.

Microsoft Teams Integration

code
@Cursor in any Teams channel:
"@Cursor Fix the flaky test in auth.spec.ts — it fails on CI
but passes locally. Probably a timing issue."

The agent picks up the task, creates a cloud session, and posts results back to the Teams thread.

GitHub and Linear Triggers

yaml
# .github/workflows/cursor-agent.yml
on:
  issues:
    types: [labeled]

jobs:
  agent-fix:
    if: contains(github.event.label.name, 'cursor-agent')
    runs-on: ubuntu-latest
    steps:
      - uses: cursor/agent-action@v1
        with:
          task: ${{ github.event.issue.body }}
          project_rules: .cursor/rules

Mobile Access

Monitor and manage running agents from Cursor's mobile companion app:

  • View agent progress and status
  • Approve or reject pending changes
  • Assign new tasks on the go
  • Receive completion notifications

This means you can assign a bug fix to an agent while commuting—review and merge the result when you're back at your desk.

Limitations and Gotchas

Background Agents are powerful but not magic. These are real limitations learned from daily use:

When Agents Struggle

  1. Architectural decisions — Agents execute well on clearly scoped tasks. They're poor at deciding whether to use microservices vs. monolith, or choosing between event sourcing and CRUD. Keep strategic decisions human.

  2. Large context requirements — Very large monorepos can exceed the context window. If an agent needs to understand 50+ files simultaneously, it may miss connections. Break tasks into smaller units.

  3. Exploratory/creative coding — When the goal isn't clear ("make the UX feel better"), agents produce generic output. They need concrete acceptance criteria.

  4. Complex state machines — Multi-step business logic with subtle edge cases often requires iterative human-agent collaboration rather than pure delegation.

Cost Considerations

  • Cloud agents on Pro plan are included but usage-capped
  • Heavy parallel usage may require Pro+ ($60/mo) or Ultra ($200/mo)
  • Composer 2 is cheap per-token, but 10 parallel agents burning tokens adds up
  • Self-hosted cloud agents (Enterprise) remove per-usage costs but require infrastructure

Common Pitfalls

typescript
// ❌ Vague task description
"Make the code better"

// ✅ Specific, verifiable task
"Refactor src/services/email.ts:
- Extract HTML template rendering into a separate TemplateService
- Add retry logic for SMTP failures (3 attempts, exponential backoff)
- Add unit tests achieving >80% branch coverage
- Run existing tests to ensure no regressions"

python
# ❌ No acceptance criteria
"Add caching to the API"

# ✅ Clear scope and constraints
"""Add Redis caching to GET /api/products endpoint:
- Cache key: f"products:{category}:{page}"
- TTL: 300 seconds
- Invalidate on POST/PUT/DELETE to /api/products
- Add cache hit/miss metrics via StatsD
- Write integration test verifying cache behavior"""
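
A spec this precise maps almost line for line onto code, which is what makes it a good agent task. The sketch below mirrors the key format, TTL, and invalidation rule from the example, with an in-memory dict standing in for Redis and plain counters standing in for StatsD. `ProductCache` and its methods are hypothetical names, and the injectable clock exists only to make expiry testable.

```python
import time


class ProductCache:
    TTL_SECONDS = 300

    def __init__(self, clock=time.monotonic) -> None:
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(category: str, page: int) -> str:
        return f"products:{category}:{page}"

    def get(self, category: str, page: int, loader):
        """Return the cached page, calling loader(category, page) on a miss."""
        cache_key = self.key(category, page)
        now = self._clock()
        entry = self._entries.get(cache_key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = loader(category, page)
        self._entries[cache_key] = (now + self.TTL_SECONDS, value)
        return value

    def invalidate_all(self) -> None:
        """Call after any POST/PUT/DELETE to /api/products."""
        self._entries.clear()
```

Each bullet in the task text becomes something a test can assert on, which is the property the vague version lacks.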

Best Practices for Daily Use

Seven practices distilled from heavy daily use of Background Agents:

1. Write tasks like Jira tickets, not chat messages. Include context, acceptance criteria, and constraints. An agent that has to pause for clarification sits in the Waiting state until you respond, which defeats the purpose of delegating asynchronously.

2. Start with low-stakes tasks. Don't assign your first agent a critical production hotfix. Begin with test generation, documentation updates, or dependency upgrades—build trust incrementally.

3. Use /worktree for anything experimental. It's zero cost to discard. Treat worktrees like draft branches that never touch your working copy.

4. Leverage Learned Rules aggressively. Every time you review agent output and correct something, that feedback trains Bugbot. The more you interact, the better agents get at your codebase.

5. Batch related tasks for parallel execution. Instead of one agent doing "write tests for entire app," launch 4 agents each covering one module. Parallelism is your multiplier.

6. Review diffs, not full files. Agents show you exactly what changed. Focus your review energy on the diff—trust the surrounding context that didn't change.

7. Keep .cursor/rules updated. As your project evolves, update the rules file. It's the single most impactful lever for agent output quality. Think of it like prompt engineering for your entire project.

When managing configuration files across projects, use our YAML to JSON converter for quick format switching between rule file formats.

Key Takeaways

  • Background Agents = async delegation: Assign tasks and get results without blocking your current work
  • Five patterns cover most everyday use cases: Assign-and-forget, parallel test sprints, review prep, worktree experiments, best-of-n racing
  • Configuration quality determines output quality: Invest in .cursor/rules and environment.json
  • Parallel execution is the multiplier: 4 agents × 30 min = 2 hours of work in 30 minutes
  • Know the boundaries: Agents excel at well-scoped implementation; struggle with ambiguity and architecture
  • Bugbot's 78% resolution rate makes automated pre-review practically mandatory
  • Multi-platform triggers (Teams, GitHub, Linear, mobile) mean agents work even when you're away from your IDE

For a deeper look at Cursor 3's architecture including Composer 2 and Canvases, see our Cursor 3 Cloud Agent & Composer Review.

FAQ

What is a Background Agent in Cursor 3?

A Background Agent is an autonomous AI coding unit that runs tasks asynchronously in Cursor's Agents Window. Think of it like assigning a ticket to a junior developer—it works independently on code changes, tests, and PRs while you focus on other work. Agents can run locally in Git worktrees, in the cloud on isolated VMs, or on remote SSH hosts.

How many Background Agents can I run in parallel?

On the Pro plan ($20/mo), you can run multiple agents simultaneously; the practical limit depends on your plan's usage quota. Each agent operates in its own isolated environment (a Git worktree, a cloud VM, or a remote SSH host), so agents don't overwrite each other's working files. Pro+ and Ultra plans offer 3x and 20x usage respectively, enabling heavier parallel workloads.

What's the difference between Background Agent and Cloud Agent?

Background Agent is the broader concept—any agent running asynchronously. Cloud Agent is a specific type of Background Agent that runs on Cursor's remote VMs with full Ubuntu environments. You can also run Background Agents locally (in worktrees) or on remote SSH connections. Cloud Agents keep working after you close your laptop.

How do I configure my project for Background Agents?

Create a .cursor/rules file defining project conventions, then optionally add .cursor/environment.json for cloud environments. For multi-repo setups (Cursor 3.4+), configure each repository's rules independently. Use the Secrets tab in Settings for API keys and credentials—never commit secrets to config files.

Are Background Agents worth it for solo developers?

Absolutely. Solo developers benefit the most from parallel agent execution—you can have one agent writing tests, another refactoring a module, and a third updating docs, effectively multiplying your output. The key is writing clear task descriptions and reviewing agent output, not doing all the coding yourself.

Glossary

  • AI Agent — Autonomous AI systems that take actions
  • LLM — Large Language Models powering agents
  • Prompt Engineering — Crafting effective instructions
  • Context Window — Token limits agents operate within
  • Token — The unit of AI model input/output
  • MCP — Model Context Protocol for tool integration