In Minibase, training a model means fine-tuning an existing base model on your dataset. Right now, fine-tuning uses a method called LoRA (Low-Rank Adaptation). LoRA is an efficient approach that updates only a small number of parameters instead of retraining the whole model. This makes fine-tuning much faster, cheaper, and more resource-friendly, while still tailoring the model to your dataset.
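To make that idea concrete, here is a minimal PyTorch sketch of a LoRA-style layer: the original weights stay frozen, and only a small low-rank correction is trained. The rank and scaling values are illustrative, not Minibase's settings.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update (the LoRA idea)."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the original weights stay frozen
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))  # update starts at zero
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base output plus the scaled low-rank correction applied to x.
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)
```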
All models in Minibase—whether task-based, chat-based, language-based, or micro-based—can be fine-tuned using any dataset in the system. The choice of dataset and model combination depends on the kind of behavior you want to achieve.
How to Use
Step 1: Choose your model
Go to the Models tab.
Select the type of base model you want to fine-tune (Task, Language, Chat, or Micro).
Name your model and add an optional description.
Step 2: Select your dataset
Upload your dataset via the Datasets tab, or select an existing dataset from your organization.
Any dataset you upload into Minibase is compatible with any model type.
Choose a dataset that matches the problem you’re trying to solve. For example (a sample record follows this list):
Q&A dataset → Task-based model
Conversational dataset → Chat-based model
Translation dataset → Language-based model
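Minibase's exact dataset schema isn't covered here, but as a hypothetical illustration, a Q&A dataset is commonly stored as JSONL with one prompt/completion pair per line:

```jsonl
{"prompt": "What is the capital of France?", "completion": "Paris"}
{"prompt": "Who wrote Hamlet?", "completion": "William Shakespeare"}
```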
Step 3: Kick off training
Start fine-tuning by clicking Train model.
Behind the scenes, Minibase applies LoRA fine-tuning to adapt the base model to your dataset.
Training time depends on model size and dataset size. Larger models train more slowly but offer more capability.
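Minibase handles all of this for you, but for intuition, here is a rough sketch of how LoRA fine-tuning is typically set up with Hugging Face's peft library. The checkpoint name and hyperparameters below are illustrative, not Minibase's actual internals.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Illustrative checkpoint name; Minibase's actual base models are internal.
model = AutoModelForCausalLM.from_pretrained("base-model-checkpoint")

config = LoraConfig(
    r=8,                                  # rank of the low-rank update
    lora_alpha=16,                        # scaling factor for the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # which layers receive adapters
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```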
Step 4: Download or use your model
Once training is complete, you can:
Download the model as a LoRA adapter compatible with Hugging Face and other frameworks (see the loading sketch after this list).
Use it directly in Minibase to run inference, test in the playground, or integrate into your workflow.
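If you download the adapter, it can be attached to its base model with Hugging Face's peft library. A minimal sketch, assuming a causal language model; the checkpoint name and adapter directory are placeholders for your own:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "base-model-checkpoint"   # the base your model was fine-tuned from
ADAPTER_DIR = "./my-minibase-adapter"  # the downloaded LoRA adapter

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# Attach the LoRA adapter weights to the frozen base model.
model = PeftModel.from_pretrained(base, ADAPTER_DIR)

inputs = tokenizer("Hello, world", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```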
Tips & Best Practices
Match dataset to model: Even though any dataset works with any model, results are best when aligned. Use task-style datasets with task models, chat-style with chat models, etc.
Use clean, high-quality data: Fine-tuning amplifies your dataset. Typos, inconsistent formatting, or poor examples can lead to unreliable outputs.
Start small, then expand: Begin with a focused dataset. Evaluate your model’s behavior, then add more examples to improve coverage.
Experiment with naming and versions: Give models descriptive names (e.g., “Customer Support v1”) to keep track of experiments.
Troubleshooting
Training is too slow
Larger models (like Chat-based or Language-based) require more resources. For faster iterations, start with smaller models like Task-based or Micro-based.
My model isn't learning well.
Check dataset quality and size. Around 3,000 examples is a minimum for decent performance; ~10,000 is better for robust results. Make sure your dataset matches the type of task.
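A quick sanity check before training can catch malformed rows early. A minimal sketch, assuming a JSONL dataset with the hypothetical prompt/completion fields from the example above:

```python
import json

# Hypothetical path and field names; adjust to your dataset's schema.
with open("dataset.jsonl", encoding="utf-8") as f:
    rows = [json.loads(line) for line in f]

print(f"{len(rows)} examples")  # aim for ~3,000+; ~10,000 for robust results

# Flag rows that are missing fields or contain empty strings.
for i, row in enumerate(rows):
    if not row.get("prompt") or not row.get("completion"):
        print(f"Row {i} looks malformed: {row}")
```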