Cheap · 256K context · Google

Google: Gemma 4 31B

Google: Gemma 4 31B is the kind of model you use when cost matters but you still need vision, tools, structured output, and a huge 256K context window. It is aggressively cheap: 100 short chats cost about $0.02, a long PDF plus questions about $0.01, and even 1,000 coding completions only around $0.19. The non-obvious win is agent work: a 50-step workflow is roughly $0.02, so this is a practical default for high-volume automation.

Best for

  • Reading long PDFs or mixed text-image inputs and answering follow-up questions cheaply.
  • Running tool-based agents that need structured output without burning your budget.
  • Handling large batches of coding completions where per-call cost matters more than prestige.
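To ground the agent bullet above: tool-based agents with structured output typically just add a `tools` field to the chat payload, OpenAI-style. A minimal sketch, where the tool name, its schema, and the model slug are illustrative assumptions, not taken from this listing:

```python
# Sketch of an OpenAI-style tool definition, as an agent loop would send it.
# The function name, parameters, and model slug are illustrative assumptions.
tool = {
    "type": "function",
    "function": {
        "name": "search_docs",  # hypothetical tool
        "description": "Search the indexed PDF corpus.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}

payload = {
    "model": "google/gemma-4-31b",  # assumed slug -- verify in the catalog
    "messages": [{"role": "user", "content": "Find the refund policy."}],
    "tools": [tool],
}
```

Each agent step is one such request plus a short tool result appended to `messages`, which is why 50 steps still total only tens of thousands of tokens.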

Not ideal for

  • Teams that want a model bundled with a major subscription plan, since no such bundles are listed in our catalog.
  • Use cases where you only need a tiny lightweight chat model and do not benefit from vision, tools, or long context.

What it costs in real life

Computed from OpenRouter API pricing ($0.14 input / $0.40 output per 1M tokens).

  • 100 short chats (50K in / 30K out): $0.02 — Cheap
  • 1 long PDF + questions (80K in / 5K out): $0.01 — Cheap
  • 1,000 coding completions (200K in / 400K out): $0.19 — Cheap
  • Agent workflow, 50 steps (50K in / 25K out): $0.02 — Cheap

Variants

  • Google: Gemma 4 31B (free) — 256K context, $0.00 input / $0.00 output per 1M tokens
  • Google: Gemma 4 31B — 256K context, $0.14 input / $0.40 output per 1M tokens

Frequently Asked Questions

Is Google: Gemma 4 31B worth it for everyday work?

Yes, if you care about cost and you regularly work with long documents, coding tasks, or agent flows. It covers more real work than its price suggests, especially when a 50-step agent workflow is only about $0.02.

How much does Google: Gemma 4 31B API cost?

The paid version is $0.14 per 1M input tokens and $0.40 per 1M output tokens. In practice that stays very low: 100 short chats cost around $0.02, and a long PDF plus questions is about $0.01.

How does Google: Gemma 4 31B compare for PDFs, coding, and automation?

It looks strongest as a budget multitool: long-context reading, coding completions, vision input, and tool use in one model family. If your work is high-volume and process-heavy, the low scenario costs make it easier to use this as a default instead of saving it only for special cases.

Capabilities

  • Vision
  • Tool calling
  • Structured output
  • Reasoning
  • Open weights
  • Long context

Cheapest access path

The cheapest route is the free variant of Google: Gemma 4 31B, listed at $0.00 input and $0.00 output per 1M tokens. If you need paid API usage, the standard version is still very cheap at $0.14 input and $0.40 output, and StackTrim AI can help you check whether you are paying elsewhere for similar capability.
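For paid or free API usage, OpenRouter exposes an OpenAI-compatible chat completions endpoint. A minimal sketch using only the standard library; the model slug below is an assumption (`:free` is OpenRouter's usual suffix for free variants), so verify the exact id in the catalog:

```python
# Minimal sketch of calling the free variant via OpenRouter's
# OpenAI-compatible chat completions endpoint.
import json
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL = "google/gemma-4-31b:free"  # assumed slug -- verify before use

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Assemble the HTTP request without sending it."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

def send(req: urllib.request.Request) -> dict:
    """Send the request and parse the JSON response."""
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Usage (requires a real key):
#   reply = send(build_request("Summarize this PDF.", os.environ["OPENROUTER_API_KEY"]))
#   print(reply["choices"][0]["message"]["content"])
```

Because the free variant shares the 256K context window, this is a reasonable way to evaluate long-document workloads before committing to the paid rates.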

cheap · vision · tools · structured output · reasoning · long context