Low-Rank Adaptation (LoRA)
Author: CislunarSpace
Site: https://cislunarspace.cn
Definition
Low-Rank Adaptation (LoRA) is a Parameter-Efficient Fine-Tuning (PEFT) method proposed by Hu et al. (2021). The core idea of LoRA is that the weight updates in a pretrained model can be effectively approximated by a low-rank matrix. By freezing the original pretrained weights and injecting a pair of trainable low-rank decomposition matrices into each Transformer layer, LoRA achieves performance comparable to full fine-tuning while training only 0.1%–3% of the original model parameters.
Mathematical Principle
Given a pretrained weight matrix $W_0 \in \mathbb{R}^{d \times k}$ at some layer, LoRA decomposes the parameter update $\Delta W$ into a product of two low-rank matrices:

$$\Delta W = BA$$

where $B \in \mathbb{R}^{d \times r}$, $A \in \mathbb{R}^{r \times k}$, and rank $r \ll \min(d, k)$.

The forward pass becomes:

$$h = W_0 x + \Delta W x = W_0 x + BAx$$

Since $r$ is much smaller than $d$ and $k$, the number of trainable parameters is dramatically reduced. For example, with $d = k = 4096$ and $r = 8$, the original layer has ~16.8M parameters ($d \times k$), while LoRA requires training only ~65K parameters ($d \times r + r \times k$, ~0.4%).
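The parameter counts in the example above can be checked with a small numpy sketch (not the original implementation; shapes simply follow the notation in this section):

```python
import numpy as np

# Dimensions from the example above: d = k = 4096, rank r = 8
d, k, r = 4096, 4096, 8

rng = np.random.default_rng(0)
W0 = rng.normal(size=(d, k))             # frozen pretrained weight
A = rng.normal(scale=0.01, size=(r, k))  # Gaussian-initialized, trainable
B = np.zeros((d, r))                     # zero-initialized, trainable

def lora_forward(x):
    # h = W0 x + B A x  (two skinny matmuls instead of materializing BA)
    return W0 @ x + B @ (A @ x)

x = rng.normal(size=k)
# Because B = 0 at initialization, BA = 0 and the output matches the base model
assert np.allclose(lora_forward(x), W0 @ x)

base_params = d * k          # parameters in the original layer
lora_params = d * r + r * k  # trainable parameters under LoRA
print(base_params, lora_params, lora_params / base_params)
# 16777216 65536 0.00390625
```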
Training Process
LoRA training follows these steps:
- Freeze pretrained weights: All original parameters remain unchanged
- Inject low-rank matrices: Add trainable $A$ and $B$ matrices to each target layer (typically the Q, K, V, O projection matrices in attention layers)
- Initialization: $A$ is typically initialized with Gaussian random values and $B$ is initialized to zero, ensuring $\Delta W = BA = 0$ at the start of training
- Training: Only the $A$ and $B$ parameters are updated using standard gradient descent
- Inference merging: After training, merge the update into the original weights as $W = W_0 + BA$, introducing no additional inference latency
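The merging step in particular can be verified numerically: the unmerged two-path forward and the merged weight $W = W_0 + BA$ produce identical outputs, which is why merged LoRA adds no inference latency. A minimal numpy sketch with toy shapes (not the original training code):

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, r = 64, 64, 4

W0 = rng.normal(size=(d, k))              # frozen pretrained weight
A = rng.normal(scale=0.01, size=(r, k))   # Gaussian-initialized factor
B = rng.normal(size=(d, r))               # pretend training has updated B

x = rng.normal(size=k)

# Unmerged inference: base path plus low-rank path
h_unmerged = W0 @ x + B @ (A @ x)

# Merged for deployment: fold BA into the weight, single matmul
W_merged = W0 + B @ A
h_merged = W_merged @ x

assert np.allclose(h_unmerged, h_merged)
```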
Comparison with Full Fine-Tuning
| Feature | Full Fine-Tuning | LoRA |
|---|---|---|
| Trainable parameters | 100% | 0.1%–3% |
| Memory requirements | High | Low |
| Training speed | Slow | Fast |
| Inference latency | No additional delay | No additional delay (after merging) |
| Multi-task support | Requires multiple full model copies | Different low-rank matrices per task |
| Performance | Optimal | Near full fine-tuning |
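The multi-task row above can be illustrated by keeping one frozen base weight and swapping per-task $(B, A)$ pairs at serving time; the task names and shapes below are hypothetical, and this is a sketch rather than a production adapter-management scheme:

```python
import numpy as np

rng = np.random.default_rng(2)
d, k, r = 32, 32, 4

W0 = rng.normal(size=(d, k))  # one shared, frozen base weight

# Hypothetical per-task adapters: each task stores only B (d x r) and A (r x k)
adapters = {
    "summarization": (rng.normal(size=(d, r)), rng.normal(size=(r, k))),
    "classification": (rng.normal(size=(d, r)), rng.normal(size=(r, k))),
}

def forward(task, x):
    # Same base model, task-specific low-rank correction
    B, A = adapters[task]
    return W0 @ x + B @ (A @ x)

x = rng.normal(size=k)
# Swapping adapters changes behavior without duplicating W0
assert not np.allclose(forward("summarization", x), forward("classification", x))
```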
Comparison with P-tuning V2
Both LoRA and P-tuning V2 are parameter-efficient fine-tuning methods, but they differ in strategy:
| Feature | LoRA | P-tuning V2 |
|---|---|---|
| Parameter modification | Constructs low-rank matrices externally | Adds soft prompts and embedding layers internally |
| Modification location | Weight matrices at each target layer | Virtual prompts before input + embeddings at each layer |
| Inference | No overhead after weight merging | Requires processing additional soft prompt tokens |
| Typical application | ChatGLM3-6B fine-tuning | ChatGLM2-6B fine-tuning |
Application in Spacecraft Intention Recognition
In the study by Jing et al. (2025), LoRA was used to fine-tune the ChatGLM3-6B model for spacecraft intention recognition. The experiment used LoRA with a scaling factor of 32 and trained for only ~3,000 iterations. Results showed:
- The LoRA-fine-tuned ChatGLM3-6B achieved 99.90% accuracy under instruction prompts, the highest among all tested models
- Accuracy improved by 83.94% compared to the base model
- Robustness was close to the base model, with standard deviation increasing by only 1.25x
References
- Hu E J, Shen Y, Wallis P, et al. LoRA: Low-rank adaptation of large language models. arXiv:2106.09685, 2021.
- Jing H, Sun Q, Dang Z, Wang H. Intention Recognition of Space Noncooperative Targets Using Large Language Models. Space Sci. Technol. 2025;5:0271.
- Ling C, Zhao X, Lu J, et al. Domain specialization as the key to make large language models disruptive: A comprehensive survey. arXiv:2305.18703, 2023.
