Why Does LoRA Show 1.4B Trainable Params Instead of 38M When Fine-Tuning Gemma 3 4B?


I found a code snippet for fine-tuning the Gemma-3 4B model on an OCR dataset of handwritten math formulas paired with their LaTeX transcriptions. The original author shared results showing about **38 million trainable parameters** when using LoRA.

I copied the code exactly — without modifying even a single line — and ran it on Google Colab using the Unsloth library for LoRA fine-tuning. However, during training, it reports **1.4 billion trainable parameters** instead of 38 million.
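For context, a typical Unsloth LoRA setup for Gemma 3 looks roughly like the sketch below. The class, model name, rank, and `finetune_*` flags here are illustrative assumptions on my part, not necessarily the exact values in the code I ran:

```python
from unsloth import FastVisionModel

# Load Gemma 3 4B in 4-bit. Model name and options are assumptions,
# not necessarily the original author's exact values.
model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/gemma-3-4b-it",
    load_in_4bit=True,
    use_gradient_checkpointing="unsloth",
)

# Attach LoRA adapters. With a config like this, only the small adapter
# matrices should be trainable (tens of millions of parameters), while
# the ~4B base weights stay frozen.
model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers=True,      # OCR on images, so vision layers get adapters too
    finetune_language_layers=True,
    finetune_attention_modules=True,
    finetune_mlp_modules=True,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    random_state=3407,
)

# PEFT's built-in summary of trainable vs. total parameters.
model.print_trainable_parameters()
```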

I’m not sure why this huge difference is happening, especially since the code is identical to the original. Does anyone know what might be causing this mismatch?
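To see where the extra trainable parameters come from, this is the plain-PyTorch check I plan to run. It assumes `model` is the PEFT-wrapped model returned by `get_peft_model` (hypothetical name here), and groups trainable counts by top-level submodule:

```python
from collections import defaultdict

def summarize_trainable(model):
    """Group trainable parameter counts by top-level submodule name."""
    per_module = defaultdict(int)
    total, trainable = 0, 0
    for name, param in model.named_parameters():
        n = param.numel()
        total += n
        if param.requires_grad:
            trainable += n
            per_module[name.split(".")[0]] += n
    for module, count in sorted(per_module.items(), key=lambda kv: -kv[1]):
        print(f"{module:30s} {count:>14,d} trainable")
    print(f"{'TOTAL':30s} {trainable:>14,d} / {total:,d} "
          f"({100.0 * trainable / total:.2f}%)")

# summarize_trainable(model)  # `model` is the PEFT-wrapped model from get_peft_model
```

If the breakdown shows whole base-model blocks (rather than only `lora_A`/`lora_B` adapter weights) with `requires_grad=True`, that would explain a 1.4B count, but I don't see why that would happen with unchanged code.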
