
Which base model should I use?

Minibase offers three base model sizes that differ in speed, capability, and resource use.

Written by Niko McCarty
Updated over 2 months ago

How to Use

TL;DR: Use Small (135M) for speed and simple tasks, Standard (360M) for balance and general use, or Large (1.7B) for complex, nuanced, and longer tasks.

Step 1: Start by assessing your task

  • Is it simple and repetitive, or does it require nuance and detailed responses?

  • How long are the outputs you expect (short answers vs. paragraphs)?

  • Will you be running the model on limited hardware, or do you have capacity for larger models?

Step 2: Match your needs to the model size

Small Base Model (135M)

  • Fastest and lightest option — runs smoothly on limited hardware.

  • Best for simple, structured tasks that don’t require nuanced reasoning.

  • Ideal use cases:

    • Text classification (e.g., labeling, spam detection).

    • Keyword or data extraction.

    • Simple factual Q&A with short answers.

  • Limitations: Struggles with long responses, subtle language, or multi-step reasoning.

Standard Base Model (360M)

  • Balanced model — more accurate and expressive than the Small model, but still efficient.

  • Great general-purpose choice for most fine-tuning projects.

  • Ideal use cases:

    • Instruction following with short or medium-length responses.

    • Sentiment analysis with multiple categories.

    • Text normalization or rewriting.

    • Multi-step but narrow reasoning (e.g., extract + reformat data).

  • Limitations: Not as strong for highly nuanced or long-form tasks.

Large Base Model (1.7B)

  • Most capable option — excels at nuance and detailed output.

  • Best for complex, varied, or open-ended tasks.

  • Ideal use cases:

    • Conversational agents or chatbots.

    • Summarization of longer text.

    • Knowledge-based Q&A where detail matters.

    • Complex classification or reasoning across longer context.

  • Limitations: Slower and requires more memory than smaller models.
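As a rough illustration, the matching in Steps 1 and 2 can be written as a small decision function. This is a minimal sketch, not part of Minibase: the function name, its arguments, and the rule that limited hardware caps the choice at the Standard model are all assumptions made for this example.

```python
def suggest_model(needs_nuance: bool, long_outputs: bool,
                  multi_step: bool, limited_hardware: bool) -> str:
    """Suggest a base model size from the Step 1 questions.

    Hypothetical helper: the decision order and the hardware rule
    below are illustrative assumptions, not Minibase behavior.
    """
    if needs_nuance or long_outputs:
        choice = "large"      # nuance and long-form output favor Large (1.7B)
    elif multi_step:
        choice = "standard"   # narrow multi-step tasks favor Standard (360M)
    else:
        choice = "small"      # simple, structured tasks favor Small (135M)

    # Assumption: on limited hardware, fall back from Large to Standard
    # and accept weaker long-form quality.
    if limited_hardware and choice == "large":
        choice = "standard"
    return choice


# Example: spam detection running on an edge device.
print(suggest_model(needs_nuance=False, long_outputs=False,
                    multi_step=False, limited_hardware=True))
```

Treat the result as a starting point only. Step 3 still applies if the first choice does not perform well.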

Step 3: Iterate if unsure

  • If you’re not sure which model to pick, start with a smaller model rather than the largest one.

  • Prototype with the Standard Base Model, then move to the Large Base Model if the outputs need more nuance or detail.

Tips & Best Practices

  • Prototype fast, then scale: Start with the Small or Standard model for quick feedback, then train a larger model if needed.

  • Consider deployment environment: Use the Small model for mobile or edge use, and larger models for server or cloud deployment.

  • Think about output length: Longer, more complex outputs generally require the Large Base Model.
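To get a feel for the deployment trade-off, you can estimate how much memory the model weights alone occupy: parameter count times bytes per parameter. The sketch below is a back-of-the-envelope calculation. The parameter counts come from the model names above, but the 16-bit and 32-bit precisions are assumptions, and actual memory use will be higher once activations and runtime overhead are included.

```python
# Parameter counts taken from the model names in this article.
PARAMS = {
    "Small": 135_000_000,
    "Standard": 360_000_000,
    "Large": 1_700_000_000,
}


def weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes).

    bytes_per_param=2 assumes 16-bit weights; use 4 for 32-bit.
    Covers weights only, not activations or runtime overhead.
    """
    return num_params * bytes_per_param / 1e9


for name, n in PARAMS.items():
    print(f"{name}: ~{weight_memory_gb(n):.2f} GB at 16-bit, "
          f"~{weight_memory_gb(n, 4):.2f} GB at 32-bit")
```

Under these assumptions, the Small model's weights fit in well under 1 GB, while the Large model needs several GB, which is why it is better suited to server or cloud deployment.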

Troubleshooting

I don’t know which model to start with

Pick the Standard Base Model — it’s the most balanced option and works for a wide variety of tasks.

The outputs are too shallow or simplistic

Move up one size (from Small to Standard, or from Standard to Large) for better reasoning and nuance.

My model runs too slowly

Switch down one size (from Large to Standard, or from Standard to Small) to improve speed.
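The last two fixes amount to moving one step up or down a three-rung ladder. Here is a minimal sketch of that rule. The function name and the symptom labels are hypothetical and chosen only for this example.

```python
# Model sizes ordered from fastest to most capable.
LADDER = ["small", "standard", "large"]


def next_model(current: str, symptom: str) -> str:
    """Return the model size to try next.

    Hypothetical helper. Symptom labels are illustrative:
      "too_shallow" -> step up one size (if not already at Large)
      "too_slow"    -> step down one size (if not already at Small)
    """
    i = LADDER.index(current)
    if symptom == "too_shallow":
        i = min(i + 1, len(LADDER) - 1)
    elif symptom == "too_slow":
        i = max(i - 1, 0)
    else:
        raise ValueError(f"unknown symptom: {symptom!r}")
    return LADDER[i]


print(next_model("standard", "too_shallow"))  # large
print(next_model("standard", "too_slow"))     # small
```

If you are already at the top or bottom of the ladder, the function returns the same size. At that point, changing model size will not help further.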
