LLM Fine-tuning & Deployment

Master LLM fine-tuning (LoRA/RLHF), quantization, and best practices for local and server-side production deployment.

6 Articles in This Series · Created on 2026-02-21

LLM Fine-Tuning: Full, LoRA & QLoRA Methods Compared

Fine-tune large language models with full fine-tuning, LoRA, or QLoRA. Includes Hugging Face code, data preparation, and when to choose fine-tuning vs RAG.

2026-02-21 · QubitTool Team

LoRA Fine-Tuning: QLoRA Setup & PEFT Guide

Fine-tune LLMs efficiently with LoRA and QLoRA. Step-by-step PEFT setup, key hyperparameters, and memory optimization for Hugging Face model customization.

2026-02-21 · QubitTool Team

What is RLHF? How ChatGPT Learns from Human Feedback

RLHF aligns AI with human preferences through reward modeling and PPO. Learn the technique behind ChatGPT and InstructGPT, and compare the RLHF and DPO approaches.

2026-02-21 · QubitTool Team

What is Model Quantization? INT8, GPTQ & AWQ Explained

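Model quantization can reduce LLM size by roughly 75% (for example, when 32-bit floating-point weights are stored as 8-bit integers), often with minimal quality loss. Learn the INT8/INT4, GPTQ, AWQ, and GGUF methods with practical code examples using llama.cpp.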

2026-02-21 · QubitTool Team

Ollama Advanced Practical Guide: Running and Fine-tuning Open Source LLMs Locally

With growing demand for data privacy and offline computing, running Large Language Models (LLMs) locally has become a top choice for many enterprises and developers. This article covers advanced Ollama usage, including custom Modelfiles, REST API integration, and lightweight fine-tuning with external data.

2026-04-03 · QubitTool Tech Team

WebLLM Practical Guide: Engineering Architecture for Running Large Language Models in the Browser

Explore how Large Language Models (LLMs) execute in the browser on top of WebGPU. This article details the WebLLM architecture and guides you through building an offline AI application with zero server inference costs, complete with model caching and VRAM optimization strategies.

2026-04-03 · QubitTool Tech Team
