What is Adapter?
Adapter is a small trainable module added to a pretrained neural network so the model can be adapted without updating all original weights.
How It Works
Adapters are a family of parameter-efficient fine-tuning techniques. Instead of modifying every parameter in a large model, training updates a smaller set of inserted or attached parameters while the base model remains mostly frozen. This reduces memory, storage, and deployment cost, and makes it easier to maintain multiple task-specific variants. Adapter-style methods include bottleneck adapters, LoRA-style low-rank adapters, prompt adapters, and other PEFT variants.
Key Characteristics
- Adds a small trainable component to a larger frozen or mostly frozen model
- Reduces fine-tuning memory and storage compared with full fine-tuning
- Supports multiple task-specific variants on top of one base model
- May trade some peak quality for efficiency and operational simplicity
- Closely related to PEFT, LoRA, QLoRA, and low-rank adaptation
Common Use Cases
- Creating domain-specific variants of a shared LLM
- Fine-tuning on limited GPU memory
- Serving multiple customer-specific model behaviors
- Experimenting with task adaptation without copying full model weights
- Combining efficient training with faster rollback and versioning
Example
Loading code...Frequently Asked Questions
Is LoRA an adapter method?
Yes. LoRA is commonly treated as an adapter-style PEFT method because it adds trainable low-rank updates to a base model.
Why use adapters instead of full fine-tuning?
Adapters reduce training cost, storage, and operational complexity, especially when maintaining many variants.
Can adapters be merged into a base model?
Some adapter types, such as LoRA, can often be merged into base weights for deployment, depending on the framework.
Do adapters always match full fine-tuning quality?
Not always. They are efficient, but quality depends on task, rank or adapter size, data quality, and model architecture.