TL;DR

Eino's core design philosophy is component = interface: every capability unit (ChatModel, Tool, Retriever, etc.) is defined by a Go interface with explicit input/output types and method signatures, and the implementations underneath are freely swappable. This article walks through all 9 core components, covering design intent and practical usage, and ends with a complete Q&A Bot that registers a search tool and demonstrates the ChatModel → Tool round trip.

This is the second article in the Eino Framework series. We recommend reading the overview first to understand Eino's overall architecture.


Table of Contents

  1. Key Takeaways
  2. Component Architecture Philosophy
  3. ChatModel Deep Dive
  4. Tool System
  5. Retriever & Vector Search
  6. Document Processing Pipeline
  7. Embedding & ChatTemplate
  8. Lambda Custom Nodes
  9. Practice: Build a Q&A Bot with Search Tool
  10. Best Practices
  11. FAQ
  12. Summary
  13. Related Resources

Key Takeaways

  • Interface as contract: Each component defines capability boundaries through Go interfaces, enabling free implementation swapping
  • ChatModel trifecta: Generate (synchronous), Stream (streaming), BindTools (tool registration) cover all LLM interaction scenarios
  • Tool = ToolInfo + execution logic: JSON Schema parameter descriptions let models understand when to call what
  • Retriever unified abstraction: ElasticSearch / VikingDB implementations are interchangeable with zero upstream code changes
  • Document Pipeline: Loader → Transformer → Indexer three-stage pipeline covers the entire knowledge ingestion workflow
  • Lambda as glue: Any Go function can be wrapped as an orchestration node

Component Architecture Philosophy

Eino's component design follows a three-layer principle:

graph TB
    A["Interface Layer"] --> B["Implementation Layer"]
    B --> C["Replaceable Layer"]
    A --> D["ChatModel Interface"]
    A --> E["Tool Interface"]
    A --> F["Retriever Interface"]
    D --> G["OpenAI"]
    D --> H["Claude"]
    D --> I["Ollama"]
    E --> J["Google Search"]
    E --> K["Custom Tool"]
    F --> L["ElasticSearch"]
    F --> M["VikingDB"]

Design Principles:

| Principle | Description | Practical Effect |
| --- | --- | --- |
| Interface-first | Define interface before implementation | Compile-time type safety |
| Explicit I/O | Strict parameter and return types | Fewer runtime errors |
| Option pattern | Variadic ...Option for runtime config | Flexible without bloat |
| Zero-coupling | Interface and implementation in separate packages | Import only what you need |

This design means that in AI Agent development, switching model providers or search engines requires only changing initialization code—business logic remains untouched.
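To make the interface-first and option-pattern principles concrete, here is a minimal standard-library sketch. The types `Message`, `Option`, `ChatModel`, `echoModel`, and `upperModel` are simplified stand-ins invented for illustration, not Eino's real definitions. The point is only that `Answer` compiles against the interface and never names a provider.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Message is a simplified stand-in for a chat message (not Eino's type).
type Message struct {
	Role    string
	Content string
}

// options holds per-call settings; each Option mutates it.
type options struct{ temperature float64 }

type Option func(*options)

func WithTemperature(t float64) Option { return func(o *options) { o.temperature = t } }

func applyOptions(opts []Option) options {
	o := options{temperature: 1.0} // default when the caller passes nothing
	for _, fn := range opts {
		fn(&o)
	}
	return o
}

// ChatModel is the contract that business logic depends on.
type ChatModel interface {
	Generate(ctx context.Context, input []Message, opts ...Option) (Message, error)
}

// echoModel and upperModel are toy "providers" behind the same interface.
type echoModel struct{}

func (echoModel) Generate(_ context.Context, in []Message, opts ...Option) (Message, error) {
	if len(in) == 0 {
		return Message{}, fmt.Errorf("empty input")
	}
	o := applyOptions(opts)
	text := fmt.Sprintf("%s (t=%.1f)", in[len(in)-1].Content, o.temperature)
	return Message{Role: "assistant", Content: text}, nil
}

type upperModel struct{}

func (upperModel) Generate(_ context.Context, in []Message, _ ...Option) (Message, error) {
	if len(in) == 0 {
		return Message{}, fmt.Errorf("empty input")
	}
	return Message{Role: "assistant", Content: strings.ToUpper(in[len(in)-1].Content)}, nil
}

// Answer is the business logic: it only knows the interface.
func Answer(ctx context.Context, m ChatModel, question string) (string, error) {
	out, err := m.Generate(ctx, []Message{{Role: "user", Content: question}}, WithTemperature(0.2))
	if err != nil {
		return "", err
	}
	return out.Content, nil
}

// demo runs the same business logic against both providers.
func demo() []string {
	var answers []string
	for _, m := range []ChatModel{echoModel{}, upperModel{}} {
		s, err := Answer(context.Background(), m, "hello")
		if err != nil {
			s = "error: " + err.Error()
		}
		answers = append(answers, s)
	}
	return answers
}

func main() {
	for _, s := range demo() {
		fmt.Println(s)
	}
}
```

Swapping providers here means changing one element of a slice. In a real service the equivalent change is the constructor call at startup.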

Complete Component Overview

| Component | Purpose | Available Implementations |
| --- | --- | --- |
| ChatModel | Interact with LLM: input Message[], output Message | OpenAI, Claude, Gemini, Ark, Ollama |
| Tool | Execute actions based on model output | Google Search, DuckDuckGo, Custom |
| Retriever | Fetch context for grounding | ElasticSearch, Volc VikingDB |
| ChatTemplate | Convert external input into prompt messages | DefaultChatTemplate |
| Document Loader | Load text from sources | WebURL, Amazon S3, File |
| Document Transformer | Transform/split text | HTMLSplitter, ScoreReranker |
| Indexer | Store and index documents | ElasticSearch, Volc VikingDB |
| Embedding | Text → vector | OpenAI, Ark |
| Lambda | Custom function node | Any Go function |

ChatModel Deep Dive

ChatModel is Eino's most critical component, responsible for all interactions with large language models.

Interface Definition

go
type ChatModel interface {
    Generate(ctx context.Context, input []*schema.Message, opts ...Option) (*schema.Message, error)
    Stream(ctx context.Context, input []*schema.Message, opts ...Option) (*schema.StreamReader[*schema.Message], error)
    BindTools(tools []*schema.ToolInfo) error
}

Each method serves a distinct purpose:

| Method | Purpose | Typical Scenario |
| --- | --- | --- |
| Generate | Synchronous complete response | One-shot Q&A, batch processing |
| Stream | Streaming token-by-token | Real-time chat UI, low TTFT |
| BindTools | Register available tools | Function Calling, Agent tool dispatch |

Multi-Provider Support

go
// OpenAI
openaiModel, err := openai.NewChatModel(ctx, &openai.ChatModelConfig{
    Model:  "gpt-4o",
    APIKey: os.Getenv("OPENAI_API_KEY"),
})
if err != nil {
    log.Fatal(err)
}

// Ollama local models
ollamaModel, err := ollama.NewChatModel(ctx, &ollama.ChatModelConfig{
    Model:   "llama3:70b",
    BaseURL: "http://localhost:11434",
})
if err != nil {
    log.Fatal(err)
}

// Ark (ByteDance Volcano Engine)
arkModel, err := ark.NewChatModel(ctx, &ark.ChatModelConfig{
    Model:  "ep-xxx-endpoint",
    APIKey: os.Getenv("ARK_API_KEY"),
})
if err != nil {
    log.Fatal(err)
}

Generate vs Stream Usage

go
// Synchronous - suited for background processing
message, err := model.Generate(ctx, []*schema.Message{
    schema.SystemMessage("you are a helpful assistant."),
    schema.UserMessage("what does the future AI App look like?"),
})
if err != nil {
    log.Fatal(err)
}
fmt.Println(message.Content)

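// Streaming - suited for real-time UI
reader, err := model.Stream(ctx, messages)
if err != nil {
    log.Fatal(err)
}
defer reader.Close()

for {
    chunk, err := reader.Recv()
    if errors.Is(err, io.EOF) {
        break // stream finished normally
    }
    if err != nil {
        log.Fatal(err) // chunk is nil here, so stop before dereferencing it
    }
    fmt.Print(chunk.Content) // token-by-token output
}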
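The receive loop is easy to get subtly wrong, so here is a self-contained sketch of the same pattern using only the standard library. `chunkReader` is a hypothetical stand-in for a stream reader whose `Recv` returns `io.EOF` at the end, and `collect` is an illustrative helper, not an Eino API. It forwards each chunk to a callback (for example, to flush it to the UI) and also returns the assembled text.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// chunkReader is a hypothetical stand-in for a stream reader: Recv returns
// io.EOF once every chunk has been delivered.
type chunkReader struct {
	chunks []string
	pos    int
}

func (r *chunkReader) Recv() (string, error) {
	if r.pos >= len(r.chunks) {
		return "", io.EOF
	}
	c := r.chunks[r.pos]
	r.pos++
	return c, nil
}

// collect drains the reader, calls onChunk for each piece, and returns the
// concatenated text. On a non-EOF error it returns the partial text so far.
func collect(r interface{ Recv() (string, error) }, onChunk func(string)) (string, error) {
	var sb strings.Builder
	for {
		c, err := r.Recv()
		if errors.Is(err, io.EOF) {
			return sb.String(), nil
		}
		if err != nil {
			return sb.String(), err
		}
		onChunk(c)
		sb.WriteString(c)
	}
}

func main() {
	r := &chunkReader{chunks: []string{"Hel", "lo", "!"}}
	full, err := collect(r, func(c string) { fmt.Print(c) })
	fmt.Println()
	fmt.Println("assembled:", full, "err:", err)
}
```

Keeping the assembled text is useful when the full reply must be stored in conversation history after it has been streamed to the user.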



Tool System

Tools bridge the gap between model "thinking" and "acting." When a model determines it needs external information or must execute an operation, it invokes a Tool through the Function Calling mechanism.

ToolInfo Definition

Each Tool describes its capabilities to the model via the ToolInfo struct:

go
type ToolInfo struct {
    Name        string          // Tool name
    Description string          // Capability description (model uses this to decide when to call)
    Parameters  *schema.Schema  // JSON Schema format parameter definition
}

Custom Tool Implementation

go
// Define a search tool
searchTool := &schema.ToolInfo{
    Name:        "web_search",
    Description: "Search the web for current information about a topic",
    Parameters: &schema.Schema{
        Type: "object",
        Properties: map[string]*schema.Schema{
            "query": {
                Type:        "string",
                Description: "The search query",
            },
            "max_results": {
                Type:        "integer",
                Description: "Maximum number of results to return",
            },
        },
        Required: []string{"query"},
    },
}

// Register with the model
if err := model.BindTools([]*schema.ToolInfo{searchTool}); err != nil {
    log.Fatal(err)
}

Parameters are described in JSON Schema format, so the model sees each argument's type, its description, and whether it is required.

Tool and Function Calling Relationship

sequenceDiagram
    participant User
    participant Model as ChatModel
    participant Tool as Tool Executor
    User->>Model: Send message
    Model->>Model: Decide if tool needed
    Model-->>Tool: Return tool_call (name + args)
    Tool->>Tool: Execute tool logic
    Tool-->>Model: Return tool result
    Model->>User: Generate final answer

Eino's Tool system works in concert with the model's Function Calling capability:

  1. Developers register tools via BindTools
  2. The model decides whether to invoke tools based on conversation context
  3. The framework parses the model's tool_call response and executes the corresponding Tool
  4. Execution results are passed back as ToolMessage
  5. The model generates a final answer based on tool results
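Steps 3 and 4 can be sketched with the standard library alone. Everything below (`ToolCall`, `ToolResult`, `registry`, `dispatch`, and the placeholder `webSearch`) is a hypothetical illustration of the dispatch idea, not Eino's implementation. It assumes the model sends arguments as a JSON string and expects exactly one result per call ID.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolCall mirrors the shape of a model's tool request (simplified).
type ToolCall struct {
	ID        string
	Name      string
	Arguments string // JSON-encoded arguments
}

// ToolResult is sent back to the model, keyed by the originating call ID.
type ToolResult struct {
	CallID  string
	Content string
}

type handler func(args json.RawMessage) (string, error)

// webSearch is a placeholder: it validates arguments but performs no search.
func webSearch(args json.RawMessage) (string, error) {
	var p struct {
		Query string `json:"query"`
	}
	if err := json.Unmarshal(args, &p); err != nil {
		return "", fmt.Errorf("bad arguments: %w", err)
	}
	if p.Query == "" {
		return "", fmt.Errorf("missing required field %q", "query")
	}
	return "results for " + p.Query, nil
}

// registry maps tool names (as declared to the model) to handlers.
var registry = map[string]handler{"web_search": webSearch}

// dispatch executes every call and always returns one result per call, so
// the model receives an error message rather than a missing tool response.
func dispatch(calls []ToolCall) []ToolResult {
	results := make([]ToolResult, 0, len(calls))
	for _, c := range calls {
		h, ok := registry[c.Name]
		if !ok {
			results = append(results, ToolResult{CallID: c.ID, Content: "error: unknown tool " + c.Name})
			continue
		}
		out, err := h(json.RawMessage(c.Arguments))
		if err != nil {
			out = "error: " + err.Error()
		}
		results = append(results, ToolResult{CallID: c.ID, Content: out})
	}
	return results
}

func main() {
	calls := []ToolCall{
		{ID: "call_1", Name: "web_search", Arguments: `{"query":"eino"}`},
		{ID: "call_2", Name: "calculator", Arguments: `{}`},
	}
	for _, r := range dispatch(calls) {
		fmt.Printf("%s -> %s\n", r.CallID, r.Content)
	}
}
```

Returning errors as tool output, instead of aborting, lets the model apologize, retry with different arguments, or answer without the tool.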

Retriever & Vector Search

The Retriever component provides standardized document retrieval for RAG (Retrieval-Augmented Generation) applications.

Interface Abstraction

go
type Retriever interface {
    Retrieve(ctx context.Context, query string, opts ...Option) ([]*schema.Document, error)
}

Behind this clean interface lies complex vector search logic: Query → Embedding → ANN Search → Document Ranking.
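The ranking stage of that chain can be illustrated with a brute-force search. This sketch is exact rather than approximate: it scores every document by cosine similarity and keeps the best k, whereas a production backend would use an ANN index to avoid the full scan. `Doc`, `cosine`, and `topK` are names made up for this example, and the query vector is assumed to have been produced by an embedding model already.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Doc is a toy document with a precomputed embedding vector.
type Doc struct {
	ID     string
	Vector []float64
	Score  float64
}

// cosine returns the cosine similarity of a and b. It returns 0 when the
// lengths differ or either vector has zero norm.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// topK scores every document against the query and returns the k best,
// highest score first. The input slice is not modified.
func topK(query []float64, docs []Doc, k int) []Doc {
	scored := make([]Doc, len(docs))
	copy(scored, docs)
	for i := range scored {
		scored[i].Score = cosine(query, scored[i].Vector)
	}
	sort.SliceStable(scored, func(i, j int) bool { return scored[i].Score > scored[j].Score })
	if k < 0 {
		k = 0
	}
	if k > len(scored) {
		k = len(scored)
	}
	return scored[:k]
}

func main() {
	docs := []Doc{
		{ID: "a", Vector: []float64{1, 0}},
		{ID: "b", Vector: []float64{0, 1}},
		{ID: "c", Vector: []float64{1, 1}},
	}
	for _, d := range topK([]float64{1, 0.1}, docs, 2) {
		fmt.Printf("%s %.3f\n", d.ID, d.Score)
	}
}
```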

ElasticSearch Implementation

go
retriever, err := elasticsearch.NewRetriever(ctx, &elasticsearch.RetrieverConfig{
    Addresses: []string{"http://localhost:9200"},
    Index:     "knowledge_base",
    TopK:      5,
    // Supports hybrid search: vector + keyword
    SearchMode: elasticsearch.HybridSearch,
})
if err != nil {
    log.Fatal(err)
}

docs, err := retriever.Retrieve(ctx, "What are Eino's core components?")
if err != nil {
    log.Fatal(err)
}
for _, doc := range docs {
    preview := []rune(doc.Content)
    if len(preview) > 100 {
        preview = preview[:100] // cap the preview without slicing past the end
    }
    fmt.Printf("Score: %.3f | Content: %s\n", doc.Score(), string(preview))
}

VikingDB Implementation

go
retriever, _ := vikingdb.NewRetriever(ctx, &vikingdb.RetrieverConfig{
    Collection: "my_knowledge_base",
    TopK:       5,
    Region:     "cn-beijing",
})

// Identical interface - no upstream code changes needed
docs, err := retriever.Retrieve(ctx, "Vector database selection recommendations")

Implementation Comparison

| Feature | ElasticSearch | Volc VikingDB |
| --- | --- | --- |
| Deployment | Self-hosted / Managed | Volcano Engine cloud |
| Hybrid search | ✅ BM25 + Vector | ✅ Native support |
| Scale | Millions | Billions |
| Ops complexity | High | Low (fully managed) |
| Cost model | Resource-based | Pay-per-query |

Document Processing Pipeline

Before vector search can work, raw documents must pass through a standardized processing pipeline:

graph LR
    A["Document Loader"] --> B["Document Transformer"]
    B --> C["Embedding"]
    C --> D["Indexer"]
    A1["WebURL / S3 / File"] --> A
    B1["HTMLSplitter / Reranker"] --> B
    C1["OpenAI / Ark Embedding"] --> C
    D1["ES / VikingDB"] --> D

Document Loader — Data Ingestion

go
// Load from Web URL
loader, _ := weburl.NewLoader(&weburl.Config{
    URL:     "https://example.com/docs",
    Timeout: 30 * time.Second,
})
docs, _ := loader.Load(ctx)

// Load from local files (new names, since := cannot redeclare loader and docs in the same scope)
fileLoader, _ := file.NewLoader(&file.Config{
    Path: "/data/knowledge/*.md",
})
fileDocs, _ := fileLoader.Load(ctx)

Document Transformer — Text Processing

go
// HTML splitter: semantic chunking
splitter, _ := htmlsplitter.NewTransformer(&htmlsplitter.Config{
    ChunkSize:    512,
    ChunkOverlap: 64,
})
chunks, _ := splitter.Transform(ctx, docs)

// Score Reranker: relevance-based reordering
reranker, _ := scorereranker.NewTransformer(&scorereranker.Config{
    Model: "bge-reranker-v2",
    TopN:  3,
})
ranked, _ := reranker.Transform(ctx, chunks)
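To show what a chunk size and a chunk overlap mean in practice, here is a minimal sliding-window splitter. It is a plain-text sketch under the assumption that both settings are counted in characters (runes). A real HTML splitter also respects document structure, which this example ignores. `splitWithOverlap` is a hypothetical helper, not part of Eino.

```go
package main

import "fmt"

// splitWithOverlap cuts text into windows of at most size runes. Each window
// starts size-overlap runes after the previous one, so neighbouring chunks
// share overlap runes of context.
func splitWithOverlap(text string, size, overlap int) ([]string, error) {
	if size <= 0 || overlap < 0 || overlap >= size {
		return nil, fmt.Errorf("need size > 0 and 0 <= overlap < size, got size=%d overlap=%d", size, overlap)
	}
	runes := []rune(text) // slice by rune so multi-byte characters stay intact
	step := size - overlap
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break // the final window already reached the end of the text
		}
	}
	return chunks, nil
}

func main() {
	chunks, err := splitWithOverlap("abcdefghij", 4, 1)
	if err != nil {
		panic(err)
	}
	for i, c := range chunks {
		fmt.Printf("chunk %d: %q\n", i, c)
	}
}
```

With size 4 and overlap 1, the last character of each chunk is repeated as the first character of the next, which is the same idea as the 512/64 setting at a readable scale.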

Indexer — Storage and Indexing

go
indexer, _ := elasticsearch.NewIndexer(ctx, &elasticsearch.IndexerConfig{
    Addresses: []string{"http://localhost:9200"},
    Index:     "knowledge_base",
})

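// Batch index documents; Store returns the IDs of the stored documents
ids, err := indexer.Store(ctx, chunks)
if err != nil {
    log.Fatal(err)
}
fmt.Printf("indexed %d chunks\n", len(ids))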

Embedding & ChatTemplate

Embedding — Text Vectorization

The Embedding component converts text into high-dimensional vector representations—the foundation of vector search:

go
embedder, _ := openai.NewEmbedding(ctx, &openai.EmbeddingConfig{
    Model: "text-embedding-3-small",
})

vectors, err := embedder.EmbedStrings(ctx, []string{
    "Eino is a Go AI framework",
    "ChatModel interface supports multiple models",
})
// vectors[0] = []float64{0.012, -0.034, ...} (1536 dimensions)

ChatTemplate — Prompt Assembly

ChatTemplate assembles external inputs (user questions, retrieved documents, etc.) into a standard Message list:

go
template := chattemplate.New(&chattemplate.Config{
    Templates: []*schema.Message{
        schema.SystemMessage("You are a professional technical assistant.\n\nReference materials:\n{{.context}}"),
        schema.UserMessage("{{.question}}"),
    },
})

messages, _ := template.Format(ctx, map[string]interface{}{
    "context":  retrievedDocs,
    "question": "What are Eino's component design principles?",
})
// Outputs standard []*schema.Message, ready for ChatModel
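The `{{.context}}` placeholders above follow Go's `text/template` syntax, so the assembly step can be approximated with the standard library. This is a sketch of the idea, not Eino's template engine. `Message` and `render` are hypothetical names, and the `missingkey=error` option is a choice made here so that a misspelled variable fails instead of silently rendering `<no value>`.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// Message is a simplified stand-in for a chat message.
type Message struct {
	Role    string
	Content string
}

// render treats each message's Content as a text/template and fills it
// with vars, returning new messages and leaving the inputs untouched.
func render(msgs []Message, vars map[string]any) ([]Message, error) {
	out := make([]Message, 0, len(msgs))
	for i, m := range msgs {
		t, err := template.New(fmt.Sprintf("msg%d", i)).Option("missingkey=error").Parse(m.Content)
		if err != nil {
			return nil, fmt.Errorf("message %d: %w", i, err)
		}
		var sb strings.Builder
		if err := t.Execute(&sb, vars); err != nil {
			return nil, fmt.Errorf("message %d: %w", i, err)
		}
		out = append(out, Message{Role: m.Role, Content: sb.String()})
	}
	return out, nil
}

func main() {
	msgs, err := render([]Message{
		{Role: "system", Content: "You are a professional technical assistant.\n\nReference materials:\n{{.context}}"},
		{Role: "user", Content: "{{.question}}"},
	}, map[string]any{
		"context":  "[1] Eino is a Go AI framework",
		"question": "What are Eino's component design principles?",
	})
	if err != nil {
		panic(err)
	}
	for _, m := range msgs {
		fmt.Printf("%s: %s\n", m.Role, m.Content)
	}
}
```

Note that `text/template` performs no escaping, so retrieved documents are inserted verbatim into the prompt.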

Lambda Custom Nodes

Lambda is the "Swiss Army knife" of Eino's orchestration system, wrapping any Go function as an orchestratable node:

go
// Data formatting Lambda
formatNode := lambda.New(func(ctx context.Context, docs []*schema.Document) (string, error) {
    var sb strings.Builder
    for i, doc := range docs {
        sb.WriteString(fmt.Sprintf("[%d] %s\n", i+1, doc.Content))
    }
    return sb.String(), nil
})

// Filtering Lambda
filterNode := lambda.New(func(ctx context.Context, msg *schema.Message) (*schema.Message, error) {
    if len(msg.Content) > 10000 {
        msg.Content = msg.Content[:10000] + "...(truncated)"
    }
    return msg, nil
})

Lambda nodes can be chained between any two nodes in an orchestration graph for data transformation, validation, logging, and other lightweight operations.
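The chaining idea can be sketched with Go generics and no framework at all. `Lambda` and `Then` below are hypothetical names chosen for this illustration. Eino's orchestration layer has its own node types, which this does not reproduce. What the sketch does show is how the compiler can verify that one node's output type matches the next node's input type.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Lambda is a hypothetical node type: any function from I to O.
type Lambda[I, O any] func(ctx context.Context, in I) (O, error)

// Then chains two nodes. It only compiles when first's output type equals
// second's input type, and it stops at the first error.
func Then[A, B, C any](first Lambda[A, B], second Lambda[B, C]) Lambda[A, C] {
	return func(ctx context.Context, in A) (C, error) {
		mid, err := first(ctx, in)
		if err != nil {
			var zero C
			return zero, err
		}
		return second(ctx, mid)
	}
}

// numberLines formats documents as a numbered list.
func numberLines(_ context.Context, docs []string) (string, error) {
	var sb strings.Builder
	for i, d := range docs {
		fmt.Fprintf(&sb, "[%d] %s\n", i+1, d)
	}
	return sb.String(), nil
}

// truncate returns a node that caps text at limit runes, cutting on rune
// boundaries so multi-byte characters are never split.
func truncate(limit int) Lambda[string, string] {
	return func(_ context.Context, s string) (string, error) {
		r := []rune(s)
		if len(r) > limit {
			return string(r[:limit]) + "...(truncated)", nil
		}
		return s, nil
	}
}

// demo builds a two-node pipeline and runs it once.
func demo() (string, error) {
	pipeline := Then(Lambda[[]string, string](numberLines), truncate(12))
	return pipeline(context.Background(), []string{"alpha", "beta"})
}

func main() {
	out, err := demo()
	fmt.Printf("%q %v\n", out, err)
}
```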


Practice: Build a Q&A Bot with Search Tool

The example below combines ChatModel and Tool into a Q&A Bot with web search capability:

go
package main

import (
    "context"
    "encoding/json"
    "fmt"
    "log"
    "os"

    "github.com/cloudwego/eino-ext/components/model/openai"
    "github.com/cloudwego/eino/schema"
)

// executeSearch runs the web_search tool. arguments is the raw JSON string
// produced by the model, for example {"query":"..."}.
func executeSearch(arguments string) string {
    var args struct {
        Query string `json:"query"`
    }
    if err := json.Unmarshal([]byte(arguments), &args); err != nil {
        return fmt.Sprintf("Invalid tool arguments: %v", err)
    }
    // In production, connect to a real search API
    return fmt.Sprintf("Search results: latest information about '%s'...", args.Query)
}

func main() {
    ctx := context.Background()

    // 1. Initialize ChatModel
    model, err := openai.NewChatModel(ctx, &openai.ChatModelConfig{
        Model:  "gpt-4o",
        APIKey: os.Getenv("OPENAI_API_KEY"),
    })
    if err != nil {
        log.Fatal(err)
    }

    // 2. Define and register Tool
    searchTool := &schema.ToolInfo{
        Name:        "web_search",
        Description: "Search the internet for current information",
        Parameters: &schema.Schema{
            Type: "object",
            Properties: map[string]*schema.Schema{
                "query": {
                    Type:        "string",
                    Description: "Search query keywords",
                },
            },
            Required: []string{"query"},
        },
    }

    if err := model.BindTools([]*schema.ToolInfo{searchTool}); err != nil {
        log.Fatal(err)
    }

    // 3. First round: model decides whether to call a tool
    messages := []*schema.Message{
        schema.SystemMessage("You are a helpful assistant that can search for the latest information."),
        schema.UserMessage("What is the latest version of the Eino framework?"),
    }

    response, err := model.Generate(ctx, messages)
    if err != nil {
        log.Fatal(err)
    }

    // 4. Check if model requested tool invocation
    if len(response.ToolCalls) > 0 {
        // Append the assistant message (with tool_call) once, before any tool results
        messages = append(messages, response)

        for _, call := range response.ToolCalls {
            fmt.Printf("Model requests tool: %s, args: %s\n", call.Function.Name, call.Function.Arguments)

            // 5. Execute tool with the parsed arguments
            result := executeSearch(call.Function.Arguments)

            // 6. Pass tool result back, linked to the request by call.ID
            messages = append(messages, schema.ToolMessage(result, call.ID))
        }

        // 7. Model generates final answer based on tool results
        finalResponse, err := model.Generate(ctx, messages)
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println("Final answer:", finalResponse.Content)
    } else {
        fmt.Println("Direct answer:", response.Content)
    }
}



Best Practices

Component Selection Guidelines

| Scenario | Recommended Approach |
| --- | --- |
| Rapid prototyping | Ollama local model + File Loader |
| Production RAG | Ark/OpenAI Embedding + VikingDB Retriever |
| Agent tool chains | ChatModel.BindTools + custom Tool combination |
| Large document processing | WebURL Loader → HTMLSplitter → batch Indexer |

Error Handling Pattern

go
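// Add a per-attempt timeout and retry for ChatModel calls.
// A single shared deadline would leave later attempts with no time budget.
var (
    response *schema.Message
    err      error
)
for retries := 0; retries < 3; retries++ {
    attemptCtx, cancel := context.WithTimeout(ctx, 30*time.Second)
    response, err = model.Generate(attemptCtx, messages)
    cancel()
    if err == nil {
        break
    }
    time.Sleep(time.Duration(retries+1) * time.Second) // linear backoff: 1s, 2s, 3s
}
if err != nil {
    log.Fatal(err) // all attempts failed
}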

Performance Optimization Tips

  1. Batch Embedding: Process multiple text segments in a single request to reduce API calls
  2. Stream-first: Always use Stream for user-facing scenarios to minimize Time-to-First-Token
  3. Retriever warm-up: Execute an empty query at startup to warm the connection pool
  4. Keep Lambda lightweight: Avoid heavy I/O operations in Lambda nodes; target < 10ms execution
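Tip 1 can be sketched as follows. `batches` and `embedAll` are hypothetical helpers, and the `embed` callback stands in for whatever embedding client is in use. The assumption is that the client accepts a slice of texts and returns one vector per text in the same order.

```go
package main

import (
	"errors"
	"fmt"
)

// batches splits items into consecutive groups of at most size elements.
func batches[T any](items []T, size int) [][]T {
	if size <= 0 {
		return nil
	}
	var out [][]T
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		out = append(out, items[start:end])
	}
	return out
}

// embedAll calls embed once per batch and concatenates the vectors in
// input order. embed is a stand-in for a real embedding client.
func embedAll(texts []string, size int, embed func([]string) ([][]float64, error)) ([][]float64, error) {
	if size <= 0 {
		return nil, errors.New("batch size must be positive")
	}
	vectors := make([][]float64, 0, len(texts))
	for _, b := range batches(texts, size) {
		v, err := embed(b)
		if err != nil {
			return nil, err
		}
		if len(v) != len(b) {
			return nil, fmt.Errorf("embedder returned %d vectors for %d texts", len(v), len(b))
		}
		vectors = append(vectors, v...)
	}
	return vectors, nil
}

// demo embeds five texts in batches of two with a fake embedder that
// returns each text's length as a one-dimensional vector.
func demo() (vectors [][]float64, calls int, err error) {
	fake := func(texts []string) ([][]float64, error) {
		calls++
		out := make([][]float64, len(texts))
		for i, t := range texts {
			out[i] = []float64{float64(len(t))}
		}
		return out, nil
	}
	vectors, err = embedAll([]string{"a", "bb", "ccc", "dddd", "eeeee"}, 2, fake)
	return vectors, calls, err
}

func main() {
	vectors, calls, err := demo()
	fmt.Println(len(vectors), "vectors in", calls, "requests, err:", err)
}
```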

FAQ

Q: How do I hot-swap models without restarting the service?

A: Leverage Go's interface semantics by maintaining a ChatModel variable at the upper layer and dynamically replacing the implementation via a config center:

go
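var mu sync.RWMutex        // guards currentModel
var currentModel ChatModel // interface variable

func switchModel(ctx context.Context, provider string) (err error) {
    var next ChatModel
    switch provider {
    case "openai":
        next, err = openai.NewChatModel(ctx, openaiConfig)
    case "ollama":
        next, err = ollama.NewChatModel(ctx, ollamaConfig)
    default:
        err = fmt.Errorf("unknown provider %q", provider)
    }
    if err != nil {
        return err // keep serving with the previous model
    }
    mu.Lock()
    currentModel = next
    mu.Unlock()
    return nil
}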

Q: How do I handle Tool execution timeouts?

A: Use context-based timeouts with a fallback at the Tool execution layer:

go
toolCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
defer cancel()

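result, err := executeTool(toolCtx, toolCall)
if errors.Is(err, context.DeadlineExceeded) {
    result = "Tool execution timed out, please try answering without this tool"
} else if err != nil {
    result = fmt.Sprintf("Tool execution failed: %v", err)
}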
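A runnable version of the same idea, using only the standard library, might look like this. `slowTool` and `callWithTimeout` are hypothetical names. The sketch assumes the tool honours its context. A tool that ignores cancellation would still block, and bounding it would require running it in a separate goroutine.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowTool returns a hypothetical tool that needs d to finish, unless its
// context ends first.
func slowTool(d time.Duration) func(context.Context) (string, error) {
	return func(ctx context.Context) (string, error) {
		select {
		case <-time.After(d):
			return "tool output", nil
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
}

// callWithTimeout runs tool under its own deadline and converts failures
// into text the model can read, so the conversation continues either way.
func callWithTimeout(ctx context.Context, limit time.Duration, tool func(context.Context) (string, error)) string {
	toolCtx, cancel := context.WithTimeout(ctx, limit)
	defer cancel()

	out, err := tool(toolCtx)
	switch {
	case err == nil:
		return out
	case errors.Is(err, context.DeadlineExceeded):
		return "Tool execution timed out, please try answering without this tool"
	default:
		return "Tool execution failed: " + err.Error()
	}
}

// demo runs one tool that finishes in time and one that does not.
func demo() (fast, slow string) {
	ctx := context.Background()
	fast = callWithTimeout(ctx, 200*time.Millisecond, slowTool(time.Millisecond))
	slow = callWithTimeout(ctx, 5*time.Millisecond, slowTool(time.Second))
	return fast, slow
}

func main() {
	fast, slow := demo()
	fmt.Println("fast:", fast)
	fmt.Println("slow:", slow)
}
```

Distinguishing a deadline from other failures keeps the fallback message accurate, which matters because the model reads that message when composing its answer.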

Q: How do I merge results from multiple Retrievers?

A: Use a Lambda node for result aggregation and deduplication:

go
mergeNode := lambda.New(func(ctx context.Context, results [][]*schema.Document) ([]*schema.Document, error) {
    seen := make(map[string]bool)
    var merged []*schema.Document
    for _, docs := range results {
        for _, doc := range docs {
            if !seen[doc.ID] {
                seen[doc.ID] = true
                merged = append(merged, doc)
            }
        }
    }
    return merged, nil
})

Summary

Eino's component system is a battle-tested Go AI infrastructure validated at ByteDance scale. The core design can be summarized as:

  • ChatModel unifies multi-model integration complexity—one interface covering sync, streaming, and tool binding
  • Tool standardizes the Function Calling workflow, giving Agents the ability to "act"
  • Retriever abstracts away vector search implementation differences, providing pluggable backends for RAG
  • Document Pipeline covers the complete knowledge ingestion workflow from loading to indexing
  • Lambda serves as glue, letting arbitrary logic fit into the orchestration graph

The next article explores how to use Eino's orchestration engines (Chain, Graph & Workflow) to assemble these components into complex AI applications.