Hacker News
vessenes | 6 months ago | on: Qwen3-VL
Roughly 1/10 the cost of Opus 4.1 and 1/2 the cost of Sonnet 4 on a per-token inference basis. Impressive. I'd love to see a fast (Groq-style) version of this served. I wonder if the architecture is amenable.
petesergeant | 6 months ago
Cerebras are hosting other Qwen models via OpenRouter, so probably
aitchnyu | 6 months ago
Isn't it a 3x rate difference? $0.7 for Qwen3-VL vs $3 for Sonnet 4?
vessenes | 6 months ago
OpenRouter had $8-ish per 1M tokens for Qwen and $15 per 1M for Sonnet 4 when I checked.