What is Fine-tuning?

Fine-tuning is a transfer learning technique that adapts a pre-trained machine learning model to a specific task or domain by continuing the training process on a smaller, task-specific dataset. This approach leverages the general knowledge already captured in the pre-trained model while customizing its behavior for specialized applications.

Quick Facts

Created: Popularized with BERT (2018) and GPT models

How It Works

Fine-tuning has become a cornerstone technique in modern AI, particularly for large language models (LLMs) and computer vision models. The process takes a foundation model that has been trained on massive datasets and adjusts its parameters using domain-specific data, adapting a general-purpose model to a specific industry, language, writing style, or task.

There are two broad approaches. Full fine-tuning updates all model parameters. Parameter-efficient fine-tuning (PEFT) methods update only a small subset: LoRA (Low-Rank Adaptation) adds trainable low-rank matrices to attention layers, typically updating only 0.1-1% of parameters, and QLoRA combines LoRA with 4-bit quantization for memory-efficient fine-tuning on consumer hardware. These efficient methods have democratized model customization, enabling organizations to create specialized AI systems without extensive computational resources.

As a rough comparison: full fine-tuning offers the best performance but requires significant compute; LoRA provides roughly 90-95% of full fine-tuning performance at 10-20% of the cost; QLoRA enables fine-tuning on a single GPU with minimal quality loss.
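The parameter savings behind LoRA can be checked with simple arithmetic. The sketch below is illustrative only: the function name and the toy layer dimensions are assumptions, not tied to any particular library, but the formula (a rank-r pair of factors replacing a full weight matrix) is the core of the technique.

```python
# Minimal sketch of LoRA's parameter accounting: instead of updating a full
# d_out x d_in weight matrix W, train two low-rank factors B (d_out x r)
# and A (r x d_in), so only r * (d_in + d_out) parameters are trainable.

def lora_param_counts(d_in: int, d_out: int, r: int) -> tuple[int, int]:
    """Return (full_params, lora_params) for one linear layer."""
    full = d_in * d_out          # parameters updated by full fine-tuning
    lora = r * (d_in + d_out)    # parameters in the low-rank factors A and B
    return full, lora

# A 4096x4096 projection (a typical attention layer size) with rank r=8:
full, lora = lora_param_counts(4096, 4096, r=8)
print(full, lora, lora / full)   # LoRA trains well under 1% of the weights
```

With these toy dimensions the adapter holds 65,536 trainable parameters against 16,777,216 in the full matrix, about 0.4%, consistent with the 0.1-1% range quoted above.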

Key Characteristics

  • Leverages pre-trained knowledge through transfer learning
  • Requires significantly less data than training from scratch
  • Enables domain-specific adaptation and specialization
  • Parameter-efficient methods (LoRA, QLoRA) reduce computational costs
  • Preserves general capabilities while adding specialized skills
  • Supports instruction tuning for improved task following

Common Use Cases

  1. Creating domain-specific language models for legal, medical, or financial applications
  2. Customizing writing style and tone for brand-specific content generation
  3. Optimizing models for specific tasks like code generation or summarization
  4. Adapting multilingual models for low-resource languages
  5. Building specialized chatbots with company-specific knowledge

Example

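A minimal, dependency-free sketch of how a LoRA-style layer computes its output: the frozen base weight W is left untouched, and only the small factors A and B would be trained. All names, shapes, and values below are toy assumptions for illustration, not a real training setup.

```python
# LoRA-style forward pass for one linear layer:
#   y = W x + (alpha / r) * B (A x)
# W stays frozen; the rank-r path through A and B carries the learned update.

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, alpha, r):
    """Frozen base output plus scaled low-rank correction."""
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))        # rank-r update path
    scale = alpha / r
    return [b + scale * d for b, d in zip(base, delta)]

# 2x2 frozen identity weight with a rank-1 adapter:
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]                            # r x d_in  (1 x 2)
B = [[0.5], [0.5]]                          # d_out x r (2 x 1)
y = lora_forward(W, A, B, [2.0, 3.0], alpha=1.0, r=1)
print(y)   # [4.5, 5.5]: base output [2.0, 3.0] plus the low-rank delta
```

In practice libraries fold this into existing attention projections; the point of the sketch is that the base weights never change, which is why LoRA checkpoints are small and why the original capabilities are largely preserved.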

Frequently Asked Questions

How much data do I need for fine-tuning a language model?

The amount varies by task and method. For instruction tuning, quality matters more than quantity—even 1,000-10,000 high-quality examples can significantly improve performance. Parameter-efficient methods like LoRA can work with smaller datasets (hundreds to thousands of examples), while full fine-tuning typically benefits from larger datasets. Always prioritize data quality over quantity.

What is the difference between LoRA and full fine-tuning?

Full fine-tuning updates all model parameters, requiring significant compute and memory but potentially achieving the best performance. LoRA (Low-Rank Adaptation) only trains small adapter matrices added to attention layers, typically updating just 0.1-1% of parameters. LoRA uses much less memory, trains faster, and produces smaller checkpoint files while achieving 90-95% of full fine-tuning quality.

Can fine-tuning make a model forget its original capabilities?

Yes, this is called catastrophic forgetting. Aggressive fine-tuning on narrow data can degrade general capabilities. To mitigate this, use lower learning rates, include diverse training data, apply regularization techniques, or use parameter-efficient methods like LoRA that preserve most original weights. Mixing some general-purpose data with domain-specific data also helps.

When should I choose fine-tuning over few-shot prompting?

Choose fine-tuning when you need consistent behavior at scale, have specific formatting or style requirements, want to reduce per-query costs (shorter prompts), require improved performance on specialized tasks, or need to embed domain knowledge. Few-shot prompting is better for rapid prototyping, tasks with frequently changing requirements, or when you lack training data.

What hardware do I need to fine-tune large language models?

Hardware requirements depend on the method and model size. Full fine-tuning of 7B+ models typically requires multiple high-end GPUs (A100, H100) with 40GB+ VRAM each. QLoRA enables fine-tuning 7B-13B models on consumer GPUs with 24GB VRAM (RTX 3090/4090). For very large models (70B+), even QLoRA requires multiple GPUs or cloud instances with substantial memory.
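A back-of-the-envelope check of these VRAM figures, counting only the model weights. The function name and the bytes-per-parameter rules of thumb (2 bytes for fp16/bf16, 0.5 bytes for 4-bit quantization) are assumptions; optimizer state, gradients, and activations add substantially more on top.

```python
# Rough VRAM needed just to hold model weights, under common precision
# rules of thumb. Ballpark only: real fine-tuning needs extra memory for
# gradients, optimizer state, and activations.

def weight_memory_gb(n_params_billion: float, bytes_per_param: float) -> float:
    return n_params_billion * 1e9 * bytes_per_param / 1024**3

print(round(weight_memory_gb(7, 2.0), 1))   # 7B weights in fp16: ~13 GB
print(round(weight_memory_gb(7, 0.5), 1))   # 7B weights at 4-bit: ~3.3 GB
```

This is why a 7B model in half precision already strains a 24GB consumer GPU once training overhead is added, while 4-bit quantization leaves QLoRA enough headroom on the same card.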
