What is ChatModel?
ChatModel is an application-layer abstraction for invoking conversational language models through structured messages, normalized parameters, optional streaming, tool-call support, and provider-independent response handling.
How It Works
A ChatModel is the boundary between AI application orchestration and the specific model provider API. Instead of letting every workflow know the details of OpenAI-compatible endpoints, Claude-style messages, local model runtimes, retry behavior, or streaming chunk formats, a ChatModel exposes a stable contract for sending chat messages and receiving model outputs. A strong ChatModel abstraction should make model replacement, fallback, testing, tracing, and policy enforcement possible without rewriting the rest of the application.
Key Characteristics
- Message-oriented interface: operates on structured chat roles rather than raw text concatenation
- Provider normalization: hides differences in request fields, response objects, streaming chunks, and error shapes
- Capability surface: may expose tool calling, JSON or structured output, multimodal input, token accounting, and streaming
- Operational boundary: gives teams one place to apply timeout, retry, fallback, rate-limit, and tracing policies
- Testability: allows deterministic mocks and fixtures for workflows that otherwise depend on nondeterministic model output
Common Use Cases
- Switching between cloud LLMs and local models without changing orchestration code
- Injecting a model call into a Chain, Graph, Workflow, or Agent runtime
- Capturing token usage, latency, request IDs, and model errors in one consistent layer
- Testing RAG and tool-calling flows with mocked model responses
- Implementing model fallback when the primary provider fails or exceeds cost limits
Example
Loading code...Frequently Asked Questions
Is ChatModel the same as an LLM?
No. An LLM is the underlying model. A ChatModel is the application interface used to call a conversational model. The same ChatModel abstraction may wrap different providers, versions, or deployment modes.
Why not call the model provider SDK directly?
Direct SDK calls are acceptable in small experiments, but production systems benefit from a stable boundary. A ChatModel abstraction centralizes retries, timeouts, streaming, tracing, fallback, tests, and provider-specific normalization.
What should a production ChatModel expose?
At minimum, it should expose message input, model options, response content, errors, and provider metadata. Depending on the application, it may also expose streaming chunks, tool-call requests, token usage, structured output validation, and safety information.
Can a ChatModel be mocked?
Yes. One reason to define ChatModel as an interface is to replace the real model with deterministic fixtures during tests. This is important for evaluating orchestration logic without paying for model calls or depending on nondeterministic responses.