TL;DR
Eino's core design philosophy is component = interface—every capability unit (ChatModel, Tool, Retriever, etc.) is defined by a Go interface with clear input/output types and method signatures, with freely swappable implementations underneath. This article dissects all 9 core components in terms of design intent and practical usage, culminating in a complete Q&A Bot with search tool integration that demonstrates the full ChatModel → Tool → Retriever collaboration workflow.
This is the second article in the Eino Framework series. We recommend reading the overview first to understand Eino's overall architecture.
Table of Contents
- Key Takeaways
- Component Architecture Philosophy
- ChatModel Deep Dive
- Tool System
- Retriever & Vector Search
- Document Processing Pipeline
- Embedding & ChatTemplate
- Lambda Custom Nodes
- Practice: Build a Q&A Bot with Search Tool
- Best Practices
- FAQ
- Summary
- Related Resources
Key Takeaways
- Interface as contract: Each component defines capability boundaries through Go interfaces, enabling free implementation swapping
- ChatModel trifecta: Generate (synchronous), Stream (streaming), and BindTools (tool registration) cover all LLM interaction scenarios
- Tool = ToolInfo + execution logic: JSON Schema parameter descriptions let models understand when to call what
- Retriever unified abstraction: ElasticSearch / VikingDB implementations are interchangeable with zero upstream code changes
- Document Pipeline: Loader → Transformer → Indexer three-stage pipeline covers the entire knowledge ingestion workflow
- Lambda as glue: Any Go function can be wrapped as an orchestration node
Component Architecture Philosophy
Eino's component design follows a three-layer principle:
Design Principles:
| Principle | Description | Practical Effect |
|---|---|---|
| Interface-first | Define interface before implementation | Compile-time type safety |
| Explicit I/O | Strict parameter and return types | Fewer runtime errors |
| Option pattern | Variadic ...Option for runtime config | Flexible without bloat |
| Zero-coupling | Interface and implementation in separate packages | Import only what you need |
This design means that in AI Agent development, switching model providers or search engines requires only changing initialization code—business logic remains untouched.
Complete Component Overview
| Component | Purpose | Available Implementations |
|---|---|---|
| ChatModel | Interact with LLM: input Message[], output Message | OpenAI, Claude, Gemini, Ark, Ollama |
| Tool | Execute actions based on model output | Google Search, DuckDuckGo, Custom |
| Retriever | Fetch context for grounding | ElasticSearch, Volc VikingDB |
| ChatTemplate | Convert external input into prompt messages | DefaultChatTemplate |
| Document Loader | Load text from sources | WebURL, Amazon S3, File |
| Document Transformer | Transform/split text | HTMLSplitter, ScoreReranker |
| Indexer | Store and index documents | ElasticSearch, Volc VikingDB |
| Embedding | Text → vector | OpenAI, Ark |
| Lambda | Custom function node | Any Go function |
ChatModel Deep Dive
ChatModel is Eino's most critical component, responsible for all interactions with large language models.
Interface Definition
type ChatModel interface {
Generate(ctx context.Context, input []*schema.Message, opts ...Option) (*schema.Message, error)
Stream(ctx context.Context, input []*schema.Message, opts ...Option) (*schema.StreamReader[*schema.Message], error)
BindTools(tools []*schema.ToolInfo) error
}
Each method serves a distinct purpose:
| Method | Purpose | Typical Scenario |
|---|---|---|
| Generate | Synchronous complete response | One-shot Q&A, batch processing |
| Stream | Streaming token-by-token | Real-time chat UI, low TTFT |
| BindTools | Register available tools | Function Calling, Agent tool dispatch |
Multi-Provider Support
// OpenAI
model, _ := openai.NewChatModel(ctx, &openai.ChatModelConfig{
Model: "gpt-4o",
APIKey: os.Getenv("OPENAI_API_KEY"),
})
// Ollama local models
model, _ := ollama.NewChatModel(ctx, &ollama.ChatModelConfig{
Model: "llama3:70b",
BaseURL: "http://localhost:11434",
})
// Ark (ByteDance Volcano Engine)
model, _ := ark.NewChatModel(ctx, &ark.ChatModelConfig{
Model: "ep-xxx-endpoint",
APIKey: os.Getenv("ARK_API_KEY"),
})
Generate vs Stream Usage
// Synchronous - suited for background processing
message, err := model.Generate(ctx, []*schema.Message{
schema.SystemMessage("you are a helpful assistant."),
schema.UserMessage("what does the future AI App look like?"),
})
if err != nil {
log.Fatal(err)
}
fmt.Println(message.Content)
// Streaming - suited for real-time UI
reader, err := model.Stream(ctx, messages)
if err != nil {
log.Fatal(err)
}
defer reader.Close()
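for {
	chunk, err := reader.Recv()
	if errors.Is(err, io.EOF) {
		break // stream finished normally
	}
	if err != nil {
		log.Fatal(err) // chunk must not be used when Recv fails
	}
	fmt.Print(chunk.Content) // token-by-token output
}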
When you need to quickly validate JSON returned by an LLM, the JSON Formatter tool is handy for visual inspection.
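The receive loop is easier to reason about in isolation. The sketch below uses a hypothetical chunkReader in place of a real stream reader (it is not an Eino type) and shows the usual shape: treat io.EOF as normal completion, surface every other error, and accumulate chunks into the full reply.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// chunkReader is a hypothetical stand-in for a stream reader:
// Recv returns the next chunk, or io.EOF once the stream is drained.
type chunkReader struct {
	chunks []string
	pos    int
}

func (r *chunkReader) Recv() (string, error) {
	if r.pos >= len(r.chunks) {
		return "", io.EOF
	}
	c := r.chunks[r.pos]
	r.pos++
	return c, nil
}

// Collect drains the reader and joins all chunks into the full reply.
// Any error other than io.EOF is returned along with the partial text.
func Collect(r *chunkReader) (string, error) {
	var sb strings.Builder
	for {
		chunk, err := r.Recv()
		if errors.Is(err, io.EOF) {
			return sb.String(), nil
		}
		if err != nil {
			return sb.String(), err
		}
		sb.WriteString(chunk)
	}
}

func main() {
	full, err := Collect(&chunkReader{chunks: []string{"Hel", "lo, ", "Eino"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(full)
}
```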
Tool System
Tools bridge the gap between model "thinking" and "acting." When a model determines it needs external information or must execute an operation, it invokes a Tool through the Function Calling mechanism.
ToolInfo Definition
Each Tool describes its capabilities to the model via the ToolInfo struct:
type ToolInfo struct {
Name string // Tool name
Description string // Capability description (model uses this to decide when to call)
Parameters *schema.Schema // JSON Schema format parameter definition
}
Custom Tool Implementation
// Define a search tool
searchTool := &schema.ToolInfo{
Name: "web_search",
Description: "Search the web for current information about a topic",
Parameters: &schema.Schema{
Type: "object",
Properties: map[string]*schema.Schema{
"query": {
Type: "string",
Description: "The search query",
},
"max_results": {
Type: "integer",
Description: "Maximum number of results to return",
},
},
Required: []string{"query"},
},
}
// Register with the model
err := model.BindTools([]*schema.ToolInfo{searchTool})
Parameters use JSON Schema format—if you're unfamiliar with it, use the JSON Formatter tool to validate your schema structure.
Tool and Function Calling Relationship
Eino's Tool system works in concert with the model's Function Calling capability:
- Developers register tools via BindTools
- The model decides whether to invoke tools based on conversation context
- The framework parses the model's tool_call response and executes the corresponding Tool
- Execution results are passed back as ToolMessage
- The model generates a final answer based on tool results
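The parse-and-execute step in the middle of that workflow can be sketched as a small dispatcher. ToolCall, ToolResult, Handler, Dispatch, and webSearch are hypothetical names used only for illustration, and the sketch assumes the model delivers arguments as a JSON object encoded in a string.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolCall mirrors the shape of a model's tool_call: a name plus JSON arguments.
// It is a simplified stand-in, not Eino's schema type.
type ToolCall struct {
	ID        string
	Name      string
	Arguments string // JSON object encoded as a string
}

// ToolResult is what gets sent back to the model as a tool message.
type ToolResult struct {
	CallID  string
	Content string
}

// Handler executes one tool given its raw JSON arguments.
type Handler func(args json.RawMessage) (string, error)

// Dispatch looks up each requested tool and runs it. Failures are reported
// back as content so the model can recover instead of the loop aborting.
func Dispatch(registry map[string]Handler, calls []ToolCall) []ToolResult {
	results := make([]ToolResult, 0, len(calls))
	for _, call := range calls {
		handler, ok := registry[call.Name]
		if !ok {
			results = append(results, ToolResult{call.ID, "error: unknown tool " + call.Name})
			continue
		}
		out, err := handler(json.RawMessage(call.Arguments))
		if err != nil {
			out = "error: " + err.Error()
		}
		results = append(results, ToolResult{call.ID, out})
	}
	return results
}

// webSearch is a hypothetical handler that only echoes the parsed query.
func webSearch(args json.RawMessage) (string, error) {
	var p struct {
		Query string `json:"query"`
	}
	if err := json.Unmarshal(args, &p); err != nil {
		return "", err
	}
	if p.Query == "" {
		return "", fmt.Errorf("missing required field: query")
	}
	return "results for " + p.Query, nil
}

func main() {
	registry := map[string]Handler{"web_search": webSearch}
	calls := []ToolCall{{ID: "call_1", Name: "web_search", Arguments: `{"query":"eino"}`}}
	for _, r := range Dispatch(registry, calls) {
		fmt.Printf("%s -> %s\n", r.CallID, r.Content)
	}
}
```

Each result keeps the ID of the call that produced it, which is what lets the model match tool output to its own request.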
Retriever & Vector Search
The Retriever component provides standardized document retrieval for RAG (Retrieval-Augmented Generation) applications.
Interface Abstraction
type Retriever interface {
Retrieve(ctx context.Context, query string, opts ...Option) ([]*schema.Document, error)
}
Behind this clean interface lies complex vector search logic: Query → Embedding → ANN Search → Document Ranking.
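As a rough mental model of the scoring and ranking stages, the sketch below runs an exact (brute-force) cosine-similarity search over precomputed vectors. It swaps ANN for an exhaustive scan, skips the embedding call, and uses a hypothetical Doc type and TopK function, so it illustrates the idea rather than how any real backend works.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Doc is a simplified document with a precomputed embedding.
type Doc struct {
	ID     string
	Vector []float64
	Score  float64
}

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// TopK scores every document against the query vector (exact search,
// standing in for ANN) and returns the k best, highest score first.
func TopK(query []float64, docs []Doc, k int) []Doc {
	scored := make([]Doc, len(docs))
	copy(scored, docs) // leave the caller's slice untouched
	for i := range scored {
		scored[i].Score = cosine(query, scored[i].Vector)
	}
	sort.SliceStable(scored, func(i, j int) bool { return scored[i].Score > scored[j].Score })
	if k < 0 {
		k = 0
	}
	if k > len(scored) {
		k = len(scored)
	}
	return scored[:k]
}

func main() {
	docs := []Doc{
		{ID: "a", Vector: []float64{1, 0}},
		{ID: "b", Vector: []float64{0, 1}},
		{ID: "c", Vector: []float64{1, 1}},
	}
	for _, d := range TopK([]float64{1, 0.1}, docs, 2) {
		fmt.Printf("%s %.3f\n", d.ID, d.Score)
	}
}
```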
ElasticSearch Implementation
retriever, _ := elasticsearch.NewRetriever(ctx, &elasticsearch.RetrieverConfig{
Addresses: []string{"http://localhost:9200"},
Index: "knowledge_base",
TopK: 5,
// Supports hybrid search: vector + keyword
SearchMode: elasticsearch.HybridSearch,
})
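docs, err := retriever.Retrieve(ctx, "What are Eino's core components?")
if err != nil {
	log.Fatal(err)
}
for _, doc := range docs {
	// min (Go 1.21+) guards against documents shorter than 100 bytes
	fmt.Printf("Score: %.3f | Content: %s\n", doc.Score, doc.Content[:min(len(doc.Content), 100)])
}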
VikingDB Implementation
retriever, _ := vikingdb.NewRetriever(ctx, &vikingdb.RetrieverConfig{
Collection: "my_knowledge_base",
TopK: 5,
Region: "cn-beijing",
})
// Identical interface - no upstream code changes needed
docs, err := retriever.Retrieve(ctx, "Vector database selection recommendations")
Implementation Comparison
| Feature | ElasticSearch | Volc VikingDB |
|---|---|---|
| Deployment | Self-hosted / Managed | Volcano Engine cloud |
| Hybrid search | ✅ BM25 + Vector | ✅ Native support |
| Scale | Millions | Billions |
| Ops complexity | High | Low (fully managed) |
| Cost model | Resource-based | Pay-per-query |
Document Processing Pipeline
Before vector search can work, raw documents must pass through a standardized processing pipeline:
Document Loader — Data Ingestion
// Load from Web URL
loader, _ := weburl.NewLoader(&weburl.Config{
URL: "https://example.com/docs",
Timeout: 30 * time.Second,
})
docs, _ := loader.Load(ctx)
// Load from local files
loader, _ := file.NewLoader(&file.Config{
Path: "/data/knowledge/*.md",
})
docs, _ := loader.Load(ctx)
Document Transformer — Text Processing
// HTML splitter: semantic chunking
splitter, _ := htmlsplitter.NewTransformer(&htmlsplitter.Config{
ChunkSize: 512,
ChunkOverlap: 64,
})
chunks, _ := splitter.Transform(ctx, docs)
// Score Reranker: relevance-based reordering
reranker, _ := scorereranker.NewTransformer(&scorereranker.Config{
Model: "bge-reranker-v2",
TopN: 3,
})
ranked, _ := reranker.Transform(ctx, chunks)
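To show what the ChunkSize and ChunkOverlap settings mean in practice, here is a plain sliding-window splitter. It is a fixed-size character splitter, not the semantic HTML splitter above, and it assumes both values are measured in runes; SplitWithOverlap is a hypothetical helper.

```go
package main

import "fmt"

// SplitWithOverlap cuts text into windows of at most size runes, where each
// window starts (size - overlap) runes after the previous one.
func SplitWithOverlap(text string, size, overlap int) ([]string, error) {
	if size <= 0 || overlap < 0 || overlap >= size {
		return nil, fmt.Errorf("need size > 0 and 0 <= overlap < size, got %d and %d", size, overlap)
	}
	runes := []rune(text) // split on runes so multi-byte characters stay intact
	step := size - overlap
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break // the last window already reached the end of the text
		}
	}
	return chunks, nil
}

func main() {
	chunks, err := SplitWithOverlap("abcdefghij", 4, 1)
	if err != nil {
		panic(err)
	}
	fmt.Println(chunks) // prints [abcd defg ghij]
}
```

The overlap repeats the tail of one chunk at the head of the next, so a sentence that straddles a boundary still appears whole in at least one chunk when it is shorter than the overlap.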
Indexer — Storage and Indexing
indexer, _ := elasticsearch.NewIndexer(ctx, &elasticsearch.IndexerConfig{
Addresses: []string{"http://localhost:9200"},
Index: "knowledge_base",
})
// Batch index documents
err := indexer.Store(ctx, chunks)
Embedding & ChatTemplate
Embedding — Text Vectorization
The Embedding component converts text into high-dimensional vector representations—the foundation of vector search:
embedder, _ := openai.NewEmbedding(ctx, &openai.EmbeddingConfig{
Model: "text-embedding-3-small",
})
vectors, err := embedder.EmbedStrings(ctx, []string{
"Eino is a Go AI framework",
"ChatModel interface supports multiple models",
})
// vectors[0] = []float64{0.012, -0.034, ...} (1536 dimensions)
ChatTemplate — Prompt Assembly
ChatTemplate assembles external inputs (user questions, retrieved documents, etc.) into a standard Message list:
template := chattemplate.New(&chattemplate.Config{
Templates: []*schema.Message{
schema.SystemMessage("You are a professional technical assistant.\n\nReference materials:\n{{.context}}"),
schema.UserMessage("{{.question}}"),
},
})
messages, _ := template.Format(ctx, map[string]interface{}{
"context": retrievedDocs,
"question": "What are Eino's component design principles?",
})
// Outputs standard []*schema.Message, ready for ChatModel
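The placeholders in the example use Go template syntax, so the substitution step can be approximated with the standard text/template package. This is a sketch under that assumption; Message and Format here are stand-ins, and Eino's own template component may behave differently.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// Message is a simplified stand-in for a chat message.
type Message struct {
	Role    string
	Content string
}

// Format renders each message's Content as a Go template against vars.
// Missing keys are treated as errors rather than rendering "<no value>".
func Format(templates []Message, vars map[string]any) ([]Message, error) {
	out := make([]Message, 0, len(templates))
	for _, t := range templates {
		tmpl, err := template.New(t.Role).Option("missingkey=error").Parse(t.Content)
		if err != nil {
			return nil, err
		}
		var sb strings.Builder
		if err := tmpl.Execute(&sb, vars); err != nil {
			return nil, err
		}
		out = append(out, Message{Role: t.Role, Content: sb.String()})
	}
	return out, nil
}

func main() {
	msgs, err := Format([]Message{
		{Role: "system", Content: "Reference materials:\n{{.context}}"},
		{Role: "user", Content: "{{.question}}"},
	}, map[string]any{
		"context":  "[1] Eino components are Go interfaces.",
		"question": "What is a component?",
	})
	if err != nil {
		panic(err)
	}
	for _, m := range msgs {
		fmt.Printf("%s: %s\n", m.Role, m.Content)
	}
}
```

Failing on a missing key is a deliberate choice in this sketch: a prompt that silently contains "<no value>" is much harder to debug than an error at format time.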
Lambda Custom Nodes
Lambda is the "Swiss Army knife" of Eino's orchestration system, wrapping any Go function as an orchestratable node:
// Data formatting Lambda
formatNode := lambda.New(func(ctx context.Context, docs []*schema.Document) (string, error) {
var sb strings.Builder
for i, doc := range docs {
sb.WriteString(fmt.Sprintf("[%d] %s\n", i+1, doc.Content))
}
return sb.String(), nil
})
// Filtering Lambda
filterNode := lambda.New(func(ctx context.Context, msg *schema.Message) (*schema.Message, error) {
if len(msg.Content) > 10000 {
msg.Content = msg.Content[:10000] + "...(truncated)"
}
return msg, nil
})
Lambda nodes can be chained between any two nodes in an orchestration graph for data transformation, validation, logging, and other lightweight operations.
Practice: Build a Q&A Bot with Search Tool
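Here's a complete example that combines ChatModel and Tool to build a Q&A Bot with web search capabilities. A Retriever fits into the same flow: call Retrieve before the first Generate and place the returned documents in the system message.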
package main
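import (
	"context"
	"encoding/json"
	"fmt"
	"log"
	"os"
	// Provider implementations are published in the eino-ext repository
	"github.com/cloudwego/eino-ext/components/model/openai"
	"github.com/cloudwego/eino/schema"
)
// Define search tool execution logic.
// arguments is the raw JSON string produced by the model, e.g. {"query":"..."}.
func executeSearch(arguments string) string {
	var args struct {
		Query string `json:"query"`
	}
	if err := json.Unmarshal([]byte(arguments), &args); err != nil {
		return fmt.Sprintf("Invalid tool arguments: %v", err)
	}
	// In production, connect to a real search API
	return fmt.Sprintf("Search results: latest information about '%s'...", args.Query)
}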
func main() {
ctx := context.Background()
// 1. Initialize ChatModel
model, err := openai.NewChatModel(ctx, &openai.ChatModelConfig{
Model: "gpt-4o",
APIKey: os.Getenv("OPENAI_API_KEY"),
})
if err != nil {
log.Fatal(err)
}
// 2. Define and register Tool
searchTool := &schema.ToolInfo{
Name: "web_search",
Description: "Search the internet for current information",
Parameters: &schema.Schema{
Type: "object",
Properties: map[string]*schema.Schema{
"query": {
Type: "string",
Description: "Search query keywords",
},
},
Required: []string{"query"},
},
}
if err := model.BindTools([]*schema.ToolInfo{searchTool}); err != nil {
log.Fatal(err)
}
// 3. First round: model decides whether to call a tool
messages := []*schema.Message{
schema.SystemMessage("You are a helpful assistant that can search for the latest information."),
schema.UserMessage("What is the latest version of the Eino framework?"),
}
response, err := model.Generate(ctx, messages)
if err != nil {
log.Fatal(err)
}
// 4. Check if model requested tool invocation
	if len(response.ToolCalls) > 0 {
		// Append the assistant message (with tool_call) once, before any tool results
		messages = append(messages, response)
		for _, call := range response.ToolCalls {
			fmt.Printf("Model requests tool: %s, args: %s\n", call.Function.Name, call.Function.Arguments)
			// 5. Execute tool
			result := executeSearch(call.Function.Arguments)
			// 6. Pass tool result back, matched to the request by call ID
			messages = append(messages, schema.ToolMessage(result, call.ID))
		}
// 7. Model generates final answer based on tool results
finalResponse, err := model.Generate(ctx, messages)
if err != nil {
log.Fatal(err)
}
fmt.Println("Final answer:", finalResponse.Content)
} else {
fmt.Println("Direct answer:", response.Content)
}
}
If you need to convert Go structs to JSON format during debugging, the JSON to Go tool supports bidirectional conversion.
Best Practices
Component Selection Guidelines
| Scenario | Recommended Approach |
|---|---|
| Rapid prototyping | Ollama local model + File Loader |
| Production RAG | Ark/OpenAI Embedding + VikingDB Retriever |
| Agent tool chains | ChatModel.BindTools + custom Tool combination |
| Large document processing | WebURL Loader → HTMLSplitter → batch Indexer |
Error Handling Pattern
// Add timeout and retry for ChatModel calls
ctx, cancel := context.WithTimeout(ctx, 30*time.Second)
defer cancel()
var response *schema.Message
for retries := 0; retries < 3; retries++ {
response, err = model.Generate(ctx, messages)
if err == nil {
break
}
time.Sleep(time.Duration(retries+1) * time.Second)
}
Performance Optimization Tips
- Batch Embedding: Process multiple text segments in a single request to reduce API calls
- Stream-first: Always use Stream for user-facing scenarios to minimize Time-to-First-Token
- Retriever warm-up: Execute an empty query at startup to warm the connection pool
- Keep Lambda lightweight: Avoid heavy I/O operations in Lambda nodes; target < 10ms execution
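For the batch embedding tip, the grouping step might look like the sketch below. Batches is a hypothetical helper, and the batch size that a given embedding API accepts is something to check against that provider's limits.

```go
package main

import "fmt"

// Batches splits items into consecutive groups of at most size elements.
// The returned groups share memory with items; they are views, not copies.
func Batches[T any](items []T, size int) [][]T {
	if size <= 0 {
		return nil
	}
	var out [][]T
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		out = append(out, items[start:end])
	}
	return out
}

func main() {
	texts := []string{"a", "b", "c", "d", "e"}
	for i, batch := range Batches(texts, 2) {
		// In a real pipeline each batch would go into one embedding request.
		fmt.Printf("request %d: %v\n", i+1, batch)
	}
}
```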
FAQ
Q: How do I hot-swap models without restarting the service?
A: Leverage Go's interface semantics by maintaining a ChatModel variable at the upper layer and dynamically replacing the implementation via a config center:
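var currentModel ChatModel // interface variable; guard with a mutex if accessed concurrently
func switchModel(ctx context.Context, provider string) (err error) {
	var next ChatModel
	switch provider {
	case "openai":
		next, err = openai.NewChatModel(ctx, openaiConfig)
	case "ollama":
		next, err = ollama.NewChatModel(ctx, ollamaConfig)
	default:
		err = fmt.Errorf("unknown provider %q", provider)
	}
	if err != nil {
		return err // keep the previous model on failure
	}
	currentModel = next
	return nil
}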
Q: How do I handle Tool execution timeouts?
A: Use context-based timeouts with a fallback at the Tool execution layer:
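toolCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
defer cancel()
result, err := executeTool(toolCtx, toolCall)
if errors.Is(err, context.DeadlineExceeded) {
	result = "Tool execution timed out, please try answering without this tool"
} else if err != nil {
	result = fmt.Sprintf("Tool execution failed: %v", err)
}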
Q: How do I merge results from multiple Retrievers?
A: Use a Lambda node for result aggregation and deduplication:
mergeNode := lambda.New(func(ctx context.Context, results [][]*schema.Document) ([]*schema.Document, error) {
seen := make(map[string]bool)
var merged []*schema.Document
for _, docs := range results {
for _, doc := range docs {
if !seen[doc.ID] {
seen[doc.ID] = true
merged = append(merged, doc)
}
}
}
return merged, nil
})
Summary
Eino's component system is a battle-tested Go AI infrastructure validated at ByteDance scale. The core design can be summarized as:
- ChatModel unifies multi-model integration complexity—one interface covering sync, streaming, and tool binding
- Tool standardizes the Function Calling workflow, giving Agents the ability to "act"
- Retriever abstracts away vector search implementation differences, providing pluggable backends for RAG
- Document Pipeline covers the complete knowledge ingestion workflow from loading to indexing
- Lambda serves as glue, letting arbitrary logic fit into the orchestration graph
The next article explores how to use Eino's orchestration engines (Chain, Graph & Workflow) to assemble these components into complex AI applications.