Why Does LoRA Show 1.4B Trainable Params Instead of 38M When Fine-Tuning Gemma 3 4B?


I found a code snippet for fine-tuning the Gemma-3 4B model on an OCR dataset of handwritten math formulas paired with their LaTeX transcriptions. The original author shared results showing about **38 million trainable parameters** when using LoRA.

I copied the code exactly — without modifying even a single line — and ran it on Google Colab using the Unsloth library for LoRA fine-tuning. However, during training, it reports **1.4 billion trainable parameters** instead of 38 million.
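For context, a typical Unsloth LoRA setup for Gemma 3 looks roughly like the sketch below. The class, model name, rank, and `finetune_*` flags here are illustrative assumptions on my part, not necessarily the exact values in the code I ran:

```python
from unsloth import FastVisionModel

# Load Gemma 3 4B in 4-bit. Model name and options are assumptions,
# not necessarily the original author's exact values.
model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/gemma-3-4b-it",
    load_in_4bit=True,
    use_gradient_checkpointing="unsloth",
)

# Attach LoRA adapters. With a config like this, only the small adapter
# matrices should be trainable (tens of millions of parameters), while
# the ~4B base weights stay frozen.
model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers=True,      # OCR on images, so vision layers get adapters too
    finetune_language_layers=True,
    finetune_attention_modules=True,
    finetune_mlp_modules=True,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    random_state=3407,
)

# PEFT's built-in summary of trainable vs. total parameters.
model.print_trainable_parameters()
```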

I’m not sure why this huge difference is happening, especially since the code is identical to the original. Does anyone know what might be causing this mismatch?
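To see where the extra trainable parameters come from, this is the plain-PyTorch check I plan to run. It assumes `model` is the PEFT-wrapped model returned by `get_peft_model` (hypothetical name here), and groups trainable counts by top-level submodule:

```python
from collections import defaultdict

def summarize_trainable(model):
    """Group trainable parameter counts by top-level submodule name."""
    per_module = defaultdict(int)
    total, trainable = 0, 0
    for name, param in model.named_parameters():
        n = param.numel()
        total += n
        if param.requires_grad:
            trainable += n
            per_module[name.split(".")[0]] += n
    for module, count in sorted(per_module.items(), key=lambda kv: -kv[1]):
        print(f"{module:30s} {count:>14,d} trainable")
    print(f"{'TOTAL':30s} {trainable:>14,d} / {total:,d} "
          f"({100.0 * trainable / total:.2f}%)")

# summarize_trainable(model)  # `model` is the PEFT-wrapped model from get_peft_model
```

If the breakdown shows whole base-model blocks (rather than only `lora_A`/`lora_B` adapter weights) with `requires_grad=True`, that would explain a 1.4B count, but I don't see why that would happen with unchanged code.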
