Context length errors happen when the request's input, instructions, tool definitions, and expected output together exceed the model's context window.
Symptoms
- The API rejects a large request before generating output.
- A long chat history works until one more message is added.
- A coding agent fails after sending full files or logs in a single request.
Likely causes
- Prompt, conversation history or pasted logs are too large.
- The request includes unnecessary tool schemas or repeated context.
- The selected model has a smaller context window than expected.
Fix steps
- Remove duplicate context and irrelevant logs.
- Summarize earlier conversation history before sending it again.
- Split large files or tasks into smaller requests.
- Check the selected model family and its documented context window and output token limits.
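The trimming and splitting steps above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper names (`approx_tokens`, `trim_history`, `split_text`), the message shape (a dict with a `content` key), and the rule of thumb of roughly 4 characters per token are all assumptions. Use your provider's tokenizer when you need exact counts.

```python
def approx_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token. This is a crude estimate,
    # not a real tokenizer.
    return max(1, len(text) // 4)


def trim_history(messages: list[dict], budget: int) -> list[dict]:
    """Keep the most recent messages whose estimated size fits the budget."""
    kept = []
    used = 0
    # Walk backwards so the newest messages are kept first.
    for msg in reversed(messages):
        cost = approx_tokens(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


def split_text(text: str, max_tokens: int) -> list[str]:
    """Split text on line boundaries into chunks of about max_tokens each.

    A single line larger than max_tokens is kept whole as its own chunk.
    """
    chunks, current, used = [], [], 0
    for line in text.splitlines(keepends=True):
        cost = approx_tokens(line)
        if current and used + cost > max_tokens:
            chunks.append("".join(current))
            current, used = [], 0
        current.append(line)
        used += cost
    if current:
        chunks.append("".join(current))
    return chunks
```

Dropping the oldest messages is the simplest policy. Replacing them with a short summary, as suggested in the steps above, preserves more context at the cost of an extra request.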
Verify the fix
- Run the same request with a shorter input.
- Log approximate input size before sending.
- Keep a minimal failing payload for comparison.
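One way to log approximate input size before sending is sketched below. The payload shape, the function names, and the 4-characters-per-token heuristic are assumptions made for illustration. The estimate serializes the whole payload so that messages, instructions, and tool schemas are all counted.

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("context_budget")


def approx_request_tokens(payload: dict) -> int:
    # Assumption: roughly 4 characters per token over the serialized payload.
    return len(json.dumps(payload, ensure_ascii=False)) // 4


def log_request_size(payload: dict, context_window: int) -> int:
    """Log the estimated input size as a share of the context window."""
    estimate = approx_request_tokens(payload)
    log.info(
        "approx input tokens: %d of %d (%.0f%%)",
        estimate,
        context_window,
        100 * estimate / context_window,
    )
    return estimate
```

Logging the same estimate for the minimal failing payload and for the shortened request makes the two easy to compare.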
FAQ
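Can lowering max tokens fix a context length error?
Only when the input alone is smaller than the window. Lowering the output token limit helps if the input plus the requested output then fits within the model's context window; if the input by itself exceeds the window, it has no effect.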
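The budget arithmetic behind this answer can be sketched as a simple check. The function names and the numbers (an 8,000-token window) are hypothetical and chosen only to show the three cases.

```python
def fits_window(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    # The request fits only if input and requested output share the window.
    return input_tokens + max_output_tokens <= context_window


def max_output_allowed(input_tokens: int, context_window: int) -> int:
    # Largest output budget that still fits; 0 if the input alone is too big.
    return max(0, context_window - input_tokens)


# Hypothetical numbers for illustration only.
print(fits_window(7500, 1000, 8000))   # False: 8500 > 8000
print(fits_window(7500, 500, 8000))    # True: lowering the output budget helps
print(fits_window(8200, 0, 8000))      # False: the input alone exceeds the window
print(max_output_allowed(7500, 8000))  # 500
```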
Should I paste complete repositories into one request?
No. Send the smallest files and context needed for the current task.
Last updated: May 18, 2026