What is MCP Sampling?

MCP Sampling is a Model Context Protocol capability that allows an MCP Server to request language-model generation through the MCP Host or Client under user and client control.

Quick Facts

Specification	Official Specification

How It Works

MCP Sampling reverses the usual direction of model invocation. Instead of only the host calling server tools, a server can ask the host to perform a model completion for a specific purpose. This is useful when a server needs LLM assistance but should not hold model credentials or directly call a provider. Sampling must be treated carefully because it can amplify trust issues: the host should review the request, apply policy, preserve user consent, and avoid letting a server silently steer the model outside the user's intent.

Key Characteristics

Server-initiated request: lets an MCP Server ask the host side for model generation
Credential separation: avoids giving every MCP Server direct access to model provider keys
Policy-controlled: should remain subject to host, client, and user approval rules
Context-sensitive: sampling requests may include messages, system intent, and constraints
Security-sensitive: untrusted servers must not be allowed to manipulate model behavior silently

Common Use Cases

An MCP Server asking the host model to summarize retrieved records before returning them
A tool integration requesting model help to transform data into a user-readable explanation
Keeping model credentials centralized in the host rather than distributed across servers
Applying user-visible policy checks before a server-initiated generation occurs
Building advanced server workflows that need LLM reasoning without owning the LLM runtime

Example

Loading code...

Frequently Asked Questions

Why does MCP Sampling exist?

It allows servers to request model assistance without directly owning model credentials or bypassing host policy. The host remains the control point for model access.

Is Sampling the same as a tool call?

No. A tool call asks a server to perform an operation. Sampling asks the host side to perform language-model generation on behalf of a server request.

What should a host check before allowing Sampling?

The host should check server trust, user intent, request content, data sensitivity, model policy, token budget, and whether the request should be visible to or approved by the user.

Can Sampling create security risks?

Yes. A malicious or compromised server could try to influence model behavior, leak data through prompts, or generate misleading instructions. Hosts should apply strict policy and logging.

Related Tools

MCP Server Directory

Comprehensive directory of MCP (Model Context Protocol) servers, SDKs, clients, and tools. Discover official implementations, community servers, and development resources for building AI-powered applications with Claude and other LLMs.

JSON Formatter

Format, beautify, validate and minify JSON online for free. Features syntax highlighting, tree view, history tracking, and one-click copy. No signup required. 100% client-side processing for privacy.

AI Websites Directory

An authoritative, comprehensive, and continuously updated AI resources directory. It covers global and domestic model providers, open-source ecosystems, research indexes and leaderboards, developer platforms, and curated tool catalogs—helping you quickly discover, compare, and choose the right AI products and references. Supports keyword search and favorites, with clear category sections and an expanding dataset for better experience.

Related Terms

MCP

MCP (Model Context Protocol) is an open protocol standard introduced by Anthropic in 2024, enabling standardized connections between AI applications and external tools/data sources through JSON-RPC 2.0 specification, solving the fragmentation problem of AI Agent integration with heterogeneous systems.

LLM

LLM (Large Language Model) is a type of artificial intelligence model trained on massive amounts of text data to understand, generate, and manipulate human language with remarkable fluency and contextual awareness, powering applications from conversational AI to code generation.

Guardrails

Guardrails are safety mechanisms and constraints implemented in AI systems to prevent harmful, inappropriate, or unintended outputs while ensuring the model operates within acceptable boundaries.

Prompt Injection

Prompt Injection is a cybersecurity attack specifically targeting applications built on Large Language Models (LLMs). In this attack, a malicious user crafts an input designed to trick the LLM into ignoring its original System Prompt and safety guardrails, forcing it to execute the attacker's hidden instructions instead. This attack exploits a fundamental architectural flaw in current LLMs: the inability to strictly separate 'system control instructions' from 'user input data'.

MCP Protocol Deep Dive【2026】- The New Paradigm for Building AI Applications

Deep dive into MCP (Model Context Protocol) architecture and principles. Includes Server development tutorials, client comparisons, and complete code examples. Master the new paradigm of AI development!

2026-02-06

LLM Guardrails Engineering in Practice: How to Safely Deploy Large Models to Production [2026]

A deep dive into LLM Guardrails principles and engineering. Covers NeMo Guardrails, Guardrails AI, and Llama Guard. Includes Python/Node.js examples for building safe, reliable, and hallucination-free AI applications.

2026-04-25

The Art of AI Agent Tools: Best Practices for Writing High-Quality MCP Tools and Function Definitions

Master the craft of writing tools that LLMs can use reliably. This guide covers the anatomy of a great tool definition, 10 battle-tested best practices for MCP tools and function calling schemas, anti-patterns to avoid, testing strategies, and composition patterns — with before/after code examples.

2026-04-23