@zastrowm
Member

@zastrowm zastrowm commented Jan 5, 2026

Description

The current retry logic for handling ModelThrottledException is hardcoded in event_loop.py with fixed values (6 attempts, exponential backoff starting at 4s). This makes it impossible for users to customize retry behavior for their specific use cases.

This PR refactors the hardcoded retry logic into a ModelRetryStrategy class so that folks can customize the parameters.

Not Included:

The PR does not introduce a RetryStrategy base class. I started to do so, but am deferring it because:

  1. It requires some additional design work to accommodate the tool-retries, which I anticipate should be accounted for in the design
  2. It simplifies this review which refactors how the default retries work internally
  3. ModelRetryStrategy provides enough benefit to allow folks to customize the agent loop without blocking on a more extensible design

Public API Changes

Added a new retry_strategy parameter to Agent.__init__():

from strands import Agent, ModelRetryStrategy

agent = Agent(
    model="anthropic.claude-3-sonnet",
    retry_strategy=ModelRetryStrategy(
        max_attempts=3,
        initial_delay=2,
        max_delay=60
    )
)
# Retries up to 2 times with 2s-60s exponential backoff

The retry_strategy parameter only accepts ModelRetryStrategy itself, not derived classes. We've been discussing a RetryStrategy base class that is more abstract and supports additional exception types, but I'm punting on that as it requires additional design work, whereas this provides immediate benefit to callers attempting to customize the current agent-loop retry behavior.

For now, alternative retry strategies can be implemented by creating a hook provider that sets event.retry = True on the AfterModelCallEvent when a retry should occur.
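As a rough illustration of the hook-based approach, here is a minimal sketch. To keep it self-contained, AfterModelCallEvent and ModelThrottledException below are simplified stand-ins rather than the SDK's classes, FixedCountRetry is a hypothetical name, and the sketch assumes the event exposes an exception field and that the registry offers add_callback(event_type, callback). It applies no delay between retries; a real strategy would likely sleep before setting the flag.

```python
from dataclasses import dataclass
from typing import Optional


# Stand-in for the SDK event (assumption: the real event carries more fields).
@dataclass
class AfterModelCallEvent:
    exception: Optional[Exception] = None
    retry: bool = False


# Stand-in for the SDK's throttling exception.
class ModelThrottledException(Exception):
    pass


class FixedCountRetry:
    """Hypothetical hook provider: retry throttled model calls a fixed number of times."""

    def __init__(self, max_retries: int = 2) -> None:
        self.max_retries = max_retries
        self.retries_used = 0

    def register_hooks(self, registry) -> None:
        # Assumes the registry exposes add_callback(event_type, callback).
        registry.add_callback(AfterModelCallEvent, self.on_after_model_call)

    def on_after_model_call(self, event: AfterModelCallEvent) -> None:
        if event.exception is None:
            # Successful call: reset the budget for the next model invocation.
            self.retries_used = 0
            return
        throttled = isinstance(event.exception, ModelThrottledException)
        if throttled and self.retries_used < self.max_retries:
            self.retries_used += 1
            event.retry = True  # ask the event loop to call the model again
```

With the real SDK, a provider like this would presumably be passed through the agent's hooks rather than through retry_strategy, since retry_strategy only accepts ModelRetryStrategy.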

Backwards Compatibility

Retry delay

The general default behavior is unchanged: agents still retry up to 5 times (6 attempts in total) with the same exponential backoff. The EventLoopThrottleEvent and ForceStopEvent are still emitted during retries, maintaining backwards compatibility with existing hooks that listen for these events.

The exact delay times have changed! Because of a bug in the original logic, the initial delay was doubled before it was first used (see test_agent_events.py for the test changes that accommodate this). Prior to these changes, the delays were:

8s, 16s, 32s, 64s, 128s

After these changes, the delays are:

4s, 8s, 16s, 32s, 64s

I think these are okay changes to make, however.
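The two schedules can be reproduced with a small sketch. This is not the SDK's code: backoff_delays is a hypothetical helper, and the 240s cap is an assumed value that does not affect these numbers. The only difference between the two schedules is whether the delay is doubled before or after it is used.

```python
def backoff_delays(initial_delay, max_attempts, max_delay, double_before_sleep=False):
    """Return the delays (in seconds) slept between attempts.

    double_before_sleep=True mimics the old behavior described above, where the
    delay was doubled before its first use.
    """
    delays = []
    delay = initial_delay
    for _ in range(max_attempts - 1):  # no sleep after the final attempt
        if double_before_sleep:
            delay = min(delay * 2, max_delay)
            delays.append(delay)
        else:
            delays.append(delay)
            delay = min(delay * 2, max_delay)
    return delays


# 240 is an assumed cap; any cap of 128 or more gives the same result here.
old_schedule = backoff_delays(4, 6, 240, double_before_sleep=True)
new_schedule = backoff_delays(4, 6, 240)
print(old_schedule)  # [8, 16, 32, 64, 128]
print(new_schedule)  # [4, 8, 16, 32, 64]
```

Under the same model, the earlier example with max_attempts=3, initial_delay=2, max_delay=60 would sleep 2s and then 4s.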

Default retry behavior

The default retry behavior also reads from event_loop.MAX_ATTEMPTS etc., so that changes made by anyone who was previously modifying those constants continue to take effect.

Implementation Decisions

  • We preserve backwards compatibility by emitting EventLoopThrottleEvent and ForceStopEvent events as we used to.
    • Backwards compatibility forces us to.
  • We do emit ForceStopEvent whenever an exception bubbles out of the model invocation
    • This seems to be the convention
  • Naming
    • We name it retry_strategy so that as hooks are expanded to allow retrying tools, we can also enable tool retry strategies
    • We name it ModelRetryStrategy since it's only focused on model retries - in the future we might vend other strategies, but we can add a new strategy rather than attempting to fit it all into this one.

Related Issues

Documentation PR

TODO once we align on this approach

Type of Change

New feature

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@codecov

codecov bot commented Jan 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

zastrowm added a commit to zastrowm/docs that referenced this pull request Jan 8, 2026
…igh-Level constructs

In doing api bar raising for strands-agents/sdk-python/pull/1424, we determined that HookProvider is a too-low-level interface for exposing directly to integrators.  This captures that decision & reasoning in log format and sets us up to record future decisions in a similar way going forward.

See DECISIONS.md on the decision & the format
zastrowm added a commit to zastrowm/docs that referenced this pull request Jan 14, 2026
…igh-Level constructs

In doing api bar raising for strands-agents/sdk-python/pull/1424, we determined that HookProvider is a too-low-level interface for exposing directly to integrators.  This captures that decision & reasoning in log format and sets us up to record future decisions in a similar way going forward.

See DECISIONS.md on the decision & the format
Enforces that Agent only accepts ModelRetryStrategy instances (not subclasses) for the retry_strategy parameter to prevent API confusion before a base RetryStrategy class is introduced.
@github-actions github-actions bot added size/l and removed size/l labels Jan 14, 2026
@zastrowm zastrowm marked this pull request as ready for review January 14, 2026 19:17
Unshure
Unshure previously approved these changes Jan 15, 2026
@github-actions github-actions bot added size/l and removed size/l labels Jan 15, 2026
zastrowm added a commit to strands-agents/docs that referenced this pull request Jan 16, 2026
…igh-Level constructs (#420)

In doing api bar raising for strands-agents/sdk-python/pull/1424, we determined that HookProvider is a too-low-level interface for exposing directly to integrators. This captures that decision & reasoning in log format and sets us up to record future decisions in a similar way going forward.

See DECISIONS.md on the decision & the format

Co-authored-by: Mackenzie Zastrow <[email protected]>
@zastrowm zastrowm linked an issue Jan 16, 2026 that may be closed by this pull request
Contributor

@strands-agent strands-agent left a comment

The type check for retry_strategy parameter is too restrictive and will break for subclasses:

if retry_strategy and type(retry_strategy) is not ModelRetryStrategy:
    raise ValueError("retry_strategy must be an instance of ModelRetryStrategy")

This uses type() with is not, which fails for subclasses. Consider:

class MyCustomRetry(ModelRetryStrategy):
    # Custom retry logic
    pass

agent = Agent(retry_strategy=MyCustomRetry())  # ❌ Raises ValueError!

Recommendation: Use isinstance() check instead:

if retry_strategy is not None and not isinstance(retry_strategy, HookProvider):
    raise TypeError(f"retry_strategy must implement HookProvider, got {type(retry_strategy).__name__}")

This allows:

  • Subclasses of ModelRetryStrategy ✅
  • Custom HookProvider implementations ✅
  • Better error message with actual type ✅
  • Type safety with proper inheritance check ✅
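To make the difference between the two checks concrete, here is a minimal sketch. ModelRetryStrategy below is a stand-in class, not the SDK's, and both helper functions are hypothetical. It only shows how each check treats a subclass; it does not say which behavior the SDK should adopt, since the PR rejects subclasses on purpose.

```python
class ModelRetryStrategy:
    """Stand-in for the SDK class, for illustration only."""


class MyCustomRetry(ModelRetryStrategy):
    """A subclass, as in the example above."""


def passes_exact_type_check(strategy) -> bool:
    # Mirrors the PR's check: only the exact class is accepted.
    return strategy is None or type(strategy) is ModelRetryStrategy


def passes_isinstance_check(strategy) -> bool:
    # Accepts the class and any subclass.
    return strategy is None or isinstance(strategy, ModelRetryStrategy)


print(passes_exact_type_check(ModelRetryStrategy()))  # True
print(passes_exact_type_check(MyCustomRetry()))       # False
print(passes_isinstance_check(MyCustomRetry()))       # True
```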

🤖 This is an experimental AI agent response from the Strands team, powered by Strands Agents. We're exploring how AI agents can help with community support and development. Your feedback helps us improve! If you'd prefer human assistance, please let us know.

Development

Successfully merging this pull request may close these issues.

[FEATURE] make event loop settings configurable

4 participants