The Chat-Based model is a compact conversational model designed for building interactive applications. With 1.1B parameters, it strikes a balance between efficiency and conversational ability, making it ideal for development, testing, and lightweight production use cases.
It is best suited for:
Conversational chatbots that need to handle back-and-forth dialogue.
Simple reasoning tasks where short chains of logic are needed.
Content generation (short passages, draft text, basic summaries).
Prototyping conversational flows before scaling to larger models.
This model provides a cost-effective way to explore dialogue systems without requiring large amounts of compute.
Model Specs
# of Parameters: ~1.1B
Download Size: ~2.1 GB
Max Sequence Length: 2,048 tokens
Context Window: 2,048 tokens
Compact enough to run on modest hardware while still delivering strong conversational performance.
How to Use
Step 1: Select the model
From the Models tab, choose the Chat-Based model for training.
Step 2: Upload your dataset
Provide examples in Instruction, Input, and Response format.
Example:
Instruction: “Answer conversationally.”
Input: “How’s the weather today?”
Response: “It looks like sunny skies and mild temperatures.”
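For reference, here is the same example as a single JSON-L record, the format Minibase ultimately trains on. The lowercase field names are an assumption for illustration; match whatever schema your dataset template specifies.

```json
{"instruction": "Answer conversationally.", "input": "How's the weather today?", "response": "It looks like sunny skies and mild temperatures."}
```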
Step 3: Train the model
Upload your data in CSV, Excel, JSON, or JSON-L format; Minibase automatically converts it to JSON-L for training.
Training uses LoRA fine-tuning for efficiency.
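Minibase handles the conversion automatically, but if you want to sanity-check your data before uploading, this minimal sketch performs the equivalent CSV to JSON-L transformation. The file names and column headers are assumptions based on the format described in Step 2.

```python
import csv
import json

# Sketch: convert a CSV of training examples into JSON-L, one record per line.
# Assumes columns named Instruction, Input, and Response, as in Step 2.
with open("examples.csv", newline="", encoding="utf-8") as src, \
        open("examples.jsonl", "w", encoding="utf-8") as dst:
    for row in csv.DictReader(src):
        record = {
            "instruction": row["Instruction"],
            "input": row["Input"],
            "response": row["Response"],
        }
        dst.write(json.dumps(record, ensure_ascii=False) + "\n")
```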
Step 4: Test in the browser
On the Model Details page, chat directly with your model.
Adjust Temperature to tune creativity and Max Tokens to control response length.
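Temperature works by rescaling the model's token probabilities before sampling: values below 1 sharpen the distribution toward the most likely token, while values above 1 flatten it and produce more varied output. A minimal, self-contained sketch of the mechanism:

```python
import math
import random

def sample_with_temperature(logits, temperature=0.7):
    """Sample a token index from raw logits after temperature scaling."""
    # Dividing by temperature: T < 1 sharpens the distribution, T > 1 flattens it.
    scaled = [x / temperature for x in logits]
    # Softmax, subtracting the max for numerical stability.
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(probs)), weights=probs, k=1)[0]

# Low temperature almost always picks the top token; high temperature
# spreads choices across the alternatives.
print(sample_with_temperature([2.0, 1.0, 0.1], temperature=0.2))
print(sample_with_temperature([2.0, 1.0, 0.1], temperature=1.5))
```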
Step 5: Deploy or download
Download in GGUF format to run locally or on your own servers.
Or deploy instantly with Minibase Cloud and call your model via API in Python or JavaScript.
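For the local option, the downloaded GGUF file can be run with an off-the-shelf runtime such as llama-cpp-python. This is a sketch, not Minibase's official runtime: the file name is a placeholder, and the same Temperature and Max Tokens knobs from Step 4 appear here as parameters.

```python
from llama_cpp import Llama

# Load the downloaded GGUF file; n_ctx matches the model's 2,048-token window.
llm = Llama(model_path="chat-based-q4_k_m.gguf", n_ctx=2048)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "How's the weather today?"}],
    temperature=0.7,  # creativity, as in Step 4
    max_tokens=128,   # response length cap, as in Step 4
)
print(response["choices"][0]["message"]["content"])
```

For the cloud option, a Python call might look like the sketch below. The endpoint URL, header, and payload fields are illustrative assumptions; copy the real values from your Minibase dashboard.

```python
import requests

# Placeholder endpoint and key: substitute the values from your dashboard.
resp = requests.post(
    "https://api.minibase.example/v1/models/my-chat-model/generate",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"prompt": "How's the weather today?", "temperature": 0.7, "max_tokens": 128},
    timeout=30,
)
print(resp.json())
```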
Tips & Best Practices
Best for conversation: Use it when you want multi-turn dialogue rather than one-off, single-turn answers.
Tune length & style: Adjust Max Tokens and Temperature to refine how verbose or creative it is.
Balanced quantization:
High (Q6_K): best quality, production-ready.
Medium (Q4_K_M): balanced, recommended default.
Low (Q4_K_S): fastest, best for mobile/edge.
Not for very long context: Use a larger model if you need to handle more than ~2k tokens.
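If you are unsure whether a prompt will fit, count tokens with the model's own tokenizer rather than guessing. This sketch assumes you have downloaded the GGUF file and uses llama-cpp-python; as a rough rule of thumb, English text runs about four characters per token.

```python
from llama_cpp import Llama

llm = Llama(model_path="chat-based-q4_k_m.gguf", n_ctx=2048)

prompt = "How's the weather today?"
n_tokens = len(llm.tokenize(prompt.encode("utf-8")))

# Prompt tokens plus Max Tokens for the reply must stay under 2,048.
print(f"{n_tokens} prompt tokens; {2048 - n_tokens} tokens left for the response")
```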
Troubleshooting
Model is slow or resource-heavy?
Despite its compact size, this model may run slower on very constrained devices. Try switching to a lower quantization level (e.g., Q4_K_S) for faster inference on small machines.
Chat is disconnected from previous context?
Include prior exchanges in your training examples or append recent conversation context in prompts to preserve state.
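A minimal sketch of the second approach, assuming a plain-text prompt with speaker labels (the exact turn format that works best depends on how your model was fine-tuned):

```python
def build_prompt(history, user_message, max_turns=6):
    """Concatenate recent turns so the model sees the conversation state."""
    # Keep only the most recent exchanges to respect the 2,048-token window.
    recent = history[-max_turns:]
    lines = [f"{speaker}: {text}" for speaker, text in recent]
    lines.append(f"User: {user_message}")
    lines.append("Assistant:")
    return "\n".join(lines)

history = [("User", "Hi!"), ("Assistant", "Hello! How can I help?")]
print(build_prompt(history, "What did I just say?"))
```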
It doesn’t know common facts like company founders or history?
A model this small has limited world knowledge and benefits from broad QA or Wikipedia-style data. Add training examples that cover the facts you expect it to answer, phrased similarly to your test prompts.
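For example, a pair of Wikipedia-style records in the same Instruction, Input, and Response schema (the lowercase field names remain an assumption):

```json
{"instruction": "Answer factually.", "input": "Who founded Microsoft?", "response": "Microsoft was founded by Bill Gates and Paul Allen in 1975."}
{"instruction": "Answer factually.", "input": "When did Apollo 11 land on the Moon?", "response": "Apollo 11 landed on the Moon on July 20, 1969."}
```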
Why does my chat feel flat or uninformative?
Conversational nuance improves with multi-turn fine-tuning. Include interactive dialogue examples—role-play, persona, or Q&A—to make responses livelier.
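One way to encode a multi-turn exchange in the same single-record schema is to fold the earlier turns into the Input field, as in this illustrative sketch (the speaker labels and field names are assumptions):

```json
{"instruction": "Continue the conversation as a friendly barista.", "input": "User: What do you recommend?\nAssistant: Our lavender latte is popular today.\nUser: Is it very sweet?", "response": "Not overly sweet. The lavender syrup is subtle, and I can use less if you'd like."}
```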