The Open-Source Strategy
Together's bet that curated, optimized open-source models could be a viable alternative to closed-source APIs has shaped the company's entire product strategy. The thesis has aged well as Llama, Qwen, and DeepSeek have demonstrated capability competitive with closed-source frontier models for many use cases.
Why open-source models
Together's strategic bet was that customers would care about more than absolute frontier capability:
- Cost. Per-token pricing on optimized open-source serving can be 50-80% cheaper than equivalent OpenAI / Anthropic API pricing.
- Customization. Open-source models can be fine-tuned with direct access to the weights. Closed-source frontier models offer, at most, more restricted fine-tuning.
- Data control. Some customers don't want to send sensitive data through proprietary model APIs.
- Regulatory. Some sectors require model traceability that open-source enables.
- Optionality. Customers want to avoid lock-in to a single closed-source provider.
These motivations support a real market for open-source-hosted inference. Together's bet was that the market would be large enough to support a platform business.
The curation problem
Open-source model releases happen constantly. The Llama family alone has multiple variants per release, and Mistral, Qwen, and DeepSeek each release on their own cadence. Customers don't want to evaluate every model release.
Together's curation:
- Maintain a catalog of the genuinely competitive open-source models.
- Deprecate models that have been surpassed.
- Publish benchmarks and quality comparisons.
- Maintain instruction-tuned variants for chat-style use cases.
- Surface the right model for the right use case.
This is a real product investment that customers value. Doing it well requires deep model-evaluation expertise, which is exactly what Together's research lineage supports.
Performance gap
How does open-source quality compare to closed-source frontier?
- 2022 to early 2023: large gap. GPT-4 was meaningfully ahead of any open-source model.
- Late 2023 to 2024: gap narrowed. Llama 2 and 3, Mixtral, and others were competitive on many benchmarks.
- 2024 to 2026: gap varies by task. For many tasks, open-source is close to or matches the closed-source frontier. For some (reasoning-heavy, coding, agentic), closed-source retains a lead.
The narrowing gap is the strongest tailwind for Together's strategy. The remaining gap is the constraint.
Commercial advantages
From a customer's perspective, open-source-hosted inference at Together compares with closed-source frontier APIs as follows:
- Often 50-80% cheaper per token.
- Slightly different latency profile (variable depending on model and serving optimization).
- Can be much cheaper for high-volume workloads where the latency advantage of closed-source frontier isn't needed.
For workloads where open-source quality is "good enough", which covers many enterprise use cases, the cost savings make Together a strong economic choice.
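The per-token comparison can be made concrete with a little arithmetic. The sketch below is illustrative only: the prices and the monthly volume are hypothetical placeholders, not actual Together, OpenAI, or Anthropic pricing, and the helper names are invented for this example. It also assumes a single blended per-token price, whereas real price lists usually charge input and output tokens differently.

```python
def monthly_cost(tokens_per_month: int, price_per_million_usd: float) -> float:
    """USD cost of one month of traffic at a flat per-million-token price."""
    return tokens_per_month / 1_000_000 * price_per_million_usd


def savings_fraction(closed_price: float, open_price: float) -> float:
    """Fraction saved by paying open_price instead of closed_price."""
    return 1.0 - open_price / closed_price


# Hypothetical inputs, chosen only to illustrate the arithmetic.
TOKENS_PER_MONTH = 2_000_000_000  # 2B tokens per month
CLOSED_PRICE = 10.00              # USD per 1M tokens (placeholder)
OPEN_PRICE = 3.00                 # USD per 1M tokens (placeholder)

closed_cost = monthly_cost(TOKENS_PER_MONTH, CLOSED_PRICE)
open_cost = monthly_cost(TOKENS_PER_MONTH, OPEN_PRICE)
saved = savings_fraction(CLOSED_PRICE, OPEN_PRICE)

print(f"closed: ${closed_cost:,.0f}  open: ${open_cost:,.0f}  saved: {saved:.0%}")
```

With these placeholder numbers the saving falls inside the 50-80% band. Because the saved fraction is independent of volume, the absolute dollar difference grows linearly with token count, which is why the argument is strongest for high-volume workloads.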
Ecosystem positioning
Together has positioned itself as a friendly counterparty to the open-source ecosystem rather than a competitor:
- Contributing back to open-source projects (FlashAttention, etc.).
- Releasing models (RedPajama, etc.).
- Publishing research that benefits the broader community.
- Hosting models from many providers without playing favorites.
This posture builds goodwill that translates into customer trust. It also differentiates Together from managed-inference competitors that are more closed about their stack.
Risks
- Closed-source frontier reasserts dominance. If OpenAI, Anthropic, or Google maintains a large enough quality lead, customers stick with closed APIs.
- Open-source quality plateaus. Without continued investment from Meta and others, the catalog of competitive models could stagnate.
- Hyperscalers absorb open-source. AWS Bedrock, Azure AI, and GCP Vertex all offer open-source model hosting. Customers may default to their existing cloud rather than move to Together.
- Margin compression. As open-source inference becomes commoditized, prices fall and Together's margins compress.
Takeaway
The open-source strategy is the load-bearing strategic bet at Together. It has aged well so far; whether it continues to age well depends on the open-source ecosystem maintaining momentum and on Together capturing share faster than hyperscalers can absorb the same workloads. The next chapter looks at the research credibility that supports the strategy.