Tag: anthropic

12 articles

Jun 11, 2026 · ai / claude

Claude Sonnet 4 for developers — what changed from Claude 3

Sonnet 4 is a reliability upgrade for agentic work, not a raw benchmark jump. What changed in the API, where reward hacking dropped 69%, and whether to upgrade now.

Jun 11, 2026 · ai-tools / llm

Context engineering in 2026 — six patterns that work

Context engineering decides what your model sees at inference. Six patterns with code: ordering, caching, compaction, sub-agent isolation, and more.

Jun 5, 2026 · ci / github-actions

Claude Code in CI — automated code review in GitHub Actions

Add automated Claude Code reviews to every PR with anthropics/claude-code-action@v1. Ten-minute setup, $0.01–$0.15 per review, catches real bugs before merge.

Jun 4, 2026 · ai-tools / llm

Prompt caching in 2026 — Anthropic, OpenAI, and Gemini compared

Prompt caching can cut costs by up to 90%. Anthropic requires explicit markers, OpenAI caches automatically, Gemini bills hourly. Here is which one fits your workload.

Jun 4, 2026 · llm / openai

LLM structured outputs: JSON mode, function calling, and Zod

Grammar-constrained sampling is the only reliable LLM primitive. How OpenAI, Anthropic, Zod, and Vercel AI SDK v6 compare — and where each still fails you.

Jun 4, 2026 · typescript / ai-agents

How to build an AI agent in TypeScript — tools, memory, MCP

Build a production TypeScript AI agent with @anthropic-ai/sdk v0.100.1: tool calling, the agentic loop, session and persistent memory, and an MCP server.

Jun 4, 2026 · claude / anthropic

Claude API 2026: Prompt Caching, Tool Use & Batches

A practical guide to the three Claude API features that separate toy prototypes from production integrations: prompt caching, tool use, and the Message Batches API.

Jun 4, 2026 · openrouter / llm

OpenRouter vs direct API — when the gateway pays off

OpenRouter wins for multi-model projects and automatic failover. Direct API wins at high volume or for compliance-critical workloads. Here is how to decide.

Jun 4, 2026 · claude / ai-tools

Claude Sonnet 4.6 for Coding — Is It Worth the Upgrade?

Sonnet 4.6 costs the same as 4.5, runs 28% cheaper than Sonnet 3.7, and extends context to 1M tokens. Here is who should upgrade and who should wait.

May 17, 2026 · llm / cost-optimization

LLM cost routing: when Haiku beats Opus and when it does not

Routing 1M classification tokens from Opus 4.7 to Haiku 4.5 saves $6.00, an 80% reduction. Here is the task taxonomy, the latency case, and the tools to implement it.

May 16, 2026 · claude / haiku

Claude Haiku 4.5 for Coding — Benchmark and Cost Guide

At $1/1M tokens and 93 t/s, Haiku 4.5 is the right model for bounded coding — 73.3% SWE-bench Verified, 55% win rate on PR reviews. Here is the task split.

May 16, 2026 · claude / ai-tools

Claude Opus 4.7 for Coding — When the Big Model Wins

Opus 4.7 leads SWE-bench Verified at 87.6% and scores 70% on CursorBench vs. 58% for Opus 4.6. It costs ~2× Sonnet 4.6 after the tokenizer uplift. Here is exactly when it is worth it.