What Is Depth Upscaling?
Depth upscaling (DUS) is a model scaling technique developed by Upstage. Instead of training a larger model from scratch, it starts with a pretrained model (for SOLAR, a 32-layer Llama 2 architecture initialized with Mistral 7B weights), stacks two overlapping copies of its layers into a deeper network, and continues pretraining the result. Every layer in the deeper model starts from pretrained weights, a warm initialization that requires far less training compute than starting from random weights.
The recipe for SOLAR 10.7B:
- Take the 32-layer base model and make two copies
- Remove the last 8 layers from the first copy and the first 8 layers from the second
- Concatenate the two 24-layer stacks (total: 48 layers)
- Continue pretraining the 48-layer model, then fine-tune on high-quality instruction and alignment data
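The layer arithmetic can be sketched with plain Python lists. This is an illustrative sketch only, following the recipe described in the SOLAR paper (drop the last 8 layers from one copy and the first 8 from the other). Integer indices stand in for real transformer blocks, and `depth_upscale` is a hypothetical helper name, not part of any library.

```python
def depth_upscale(layers: list, m: int) -> list:
    """Stack two overlapping copies of `layers`, dropping m layers at the seam.

    The first copy loses its last m layers, the second copy loses its
    first m layers, and the two remainders are concatenated.
    """
    n = len(layers)
    if not 0 < m < n:
        raise ValueError("m must be between 1 and len(layers) - 1")
    top = layers[: n - m]   # first copy without its last m layers
    bottom = layers[m:]     # second copy without its first m layers
    return top + bottom


base = list(range(32))            # indices 0..31 stand in for 32 transformer blocks
scaled = depth_upscale(base, m=8)

print(len(scaled))      # 48
print(scaled[22:26])    # [22, 23, 8, 9] -> the seam between the two copies
```

With real weights, each duplicated block would need to be a deep copy so that the two instances of a layer can diverge during continued pretraining.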
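The resulting 48-layer model has about 10.7B parameters. That is roughly 1.5 times the transformer blocks of the 32-layer, 7B-class base, plus a single embedding matrix and output head, which are not duplicated. At 48 layers it is also deeper than Llama 2 13B, which has 40.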
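A rough parameter count shows where the 10.7B figure comes from. The sketch below assumes the published Mistral 7B configuration (hidden size 4096, MLP size 14336, 32 attention heads, 8 key/value heads, vocabulary 32000, untied input and output embeddings); the class and function names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    # Assumed values, taken from the Mistral 7B configuration.
    hidden: int = 4096
    intermediate: int = 14336
    n_heads: int = 32
    n_kv_heads: int = 8
    vocab: int = 32000


def layer_params(c: Config) -> int:
    """Parameters in one transformer block (grouped-query attention + gated MLP)."""
    head_dim = c.hidden // c.n_heads
    kv_dim = head_dim * c.n_kv_heads
    attn = 2 * c.hidden * c.hidden + 2 * c.hidden * kv_dim  # q, o + k, v projections
    mlp = 3 * c.hidden * c.intermediate                     # gate, up, down projections
    norms = 2 * c.hidden                                    # two RMSNorm weight vectors
    return attn + mlp + norms


def total_params(c: Config, n_layers: int) -> int:
    """Whole-model count: blocks + input embedding + output head + final norm."""
    embeddings = 2 * c.vocab * c.hidden
    return n_layers * layer_params(c) + embeddings + c.hidden


cfg = Config()
print(f"32 layers: {total_params(cfg, 32) / 1e9:.2f}B")  # 7.24B
print(f"48 layers: {total_params(cfg, 48) / 1e9:.2f}B")  # 10.73B
```

Only the per-layer term grows with depth, which is why going from 32 to 48 layers adds about 3.5B parameters rather than doubling the model.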
Hugging Face Leaderboard Performance
When SOLAR 10.7B was released in December 2023, it entered the top 10 of the Hugging Face Open LLM Leaderboard despite being the smallest model in that tier. The key results at the time of release:
| Benchmark | SOLAR 10.7B | Llama 2 70B | Mistral 7B |
|---|---|---|---|
| Average (4-task) | 74.2 | 67.9 | 60.1 |
| ARC | 66.5 | 67.3 | 59.9 |
| HellaSwag | 88.1 | 87.3 | 81.3 |
| MMLU | 65.5 | 68.9 | 64.2 |
| TruthfulQA | 76.8 | 44.9 | 45.5 |
The TruthfulQA score (76.8%) is particularly striking: Llama 2 70B scores 44.9% on the same benchmark. TruthfulQA is sensitive to instruction and alignment tuning, so this gap likely reflects the quality of the fine-tuning data as much as the architecture.
Korean and English Bilingual Strength
Upstage is a South Korean AI company, and SOLAR 10.7B was trained with strong Korean language data alongside English. This makes it notable among open-source models for Korean language tasks:
- Korean MMLU: outperforms models twice its size that were not specifically trained for Korean
- Korean instruction following: the instruct variant handles polite/formal Korean register correctly
- Code-switching (Korean + English in same conversation): handled gracefully
Using the Instruct Variant
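```python
from openai import OpenAI

# Upstage exposes an OpenAI-compatible endpoint, so the standard client works
# once base_url points at it.
client = OpenAI(
    api_key="YOUR_UPSTAGE_KEY",  # placeholder: replace with your own key
    base_url="https://api.upstage.ai/v1/solar",
)

response = client.chat.completions.create(
    model="solar-1-mini-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain the depth upscaling technique in simple terms."},
    ],
)

# The assistant's reply is the message content of the first choice.
print(response.choices[0].message.content)
```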
Self-Hosting with Ollama
```bash
ollama pull solar
ollama run solar
```
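Once the model is pulled, the local Ollama server can also be called from code. The sketch below uses only the Python standard library and assumes Ollama is listening on its default address (`http://localhost:11434`) with the model available under the tag `solar`. The helper names `build_request` and `generate` are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama's default local address and its /api/generate endpoint.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "solar") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "solar", timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `generate("Summarize depth upscaling in one sentence.")` returns the completion as a string. Setting `"stream": False` keeps the response to a single JSON object instead of a stream of chunks.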
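Ollama distributes quantized builds, so at 10.7B parameters SOLAR runs comfortably on a machine with 16GB of VRAM or 32GB of unified memory (for example, a MacBook Pro M2). With Q4_K_M quantization the weights take about 7GB, which makes it viable on consumer GPUs such as the RTX 3080 10GB. Unquantized FP16 weights would need roughly 21GB, and long contexts add KV-cache memory on top of the weights.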
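These memory figures follow from simple arithmetic: parameters times bits per weight, divided by 8. The sketch below is a back-of-the-envelope estimate for the weights only; it ignores the KV cache and runtime overhead, and the 4.85 bits per weight used for Q4_K_M is an approximate, assumed average rather than an exact figure.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 10.7e9  # SOLAR 10.7B

# Bits per weight for each format; the Q4_K_M value is an assumed average.
formats = [("FP16", 16.0), ("8-bit", 8.0), ("Q4_K_M (approx.)", 4.85)]

for name, bits in formats:
    print(f"{name}: {weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

The 4-bit estimate lands at about 6.5GB, which is consistent with the roughly 7GB figure once runtime overhead is included.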
Apache 2.0 License
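The base SOLAR 10.7B model is licensed under Apache 2.0, which permits commercial use, modification, and redistribution as long as the license text and existing notices are retained. This is an important distinction from Llama 2's custom license, which requires a separate license from Meta above 700 million monthly active users and adds an acceptable use policy. Check the model card of the exact variant you deploy: the instruct release on Hugging Face has been published under the non-commercial CC BY-NC 4.0 license.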
When to Choose SOLAR
- You need a 10B model that punches above its weight class on English and Korean
- You want permissive Apache 2.0 licensing (base model) for commercial use
- You are running on hardware with 7–16GB of VRAM
- You want a model that demonstrates the depth upscaling technique for your own fine-tuning research