⚠ This page is served via a proxy. Original site: https://github.com
This service does not collect credentials or authentication data.
Skip to content

Conversation

@alexluong
Copy link
Collaborator

@alexluong alexluong commented Jan 21, 2026

Remove DeliveryEvent concept

This PR replaces the monolithic DeliveryEvent type with purpose-specific types that better represent the different stages of event delivery.

New Types

  • DeliveryTask - Message type for delivery processing queues. Flows from publishmqdeliverymq and from retry → deliverymq. Contains the Event, DestinationID, Attempt, and Manual flag. Provides IdempotencyKey() for deduplication.

  • RetryTask - Message type for scheduling retries. Contains only the metadata needed to re-fetch the event and re-attempt delivery (EventID, TenantID, DestinationID, Attempt). Converts to DeliveryTask when the retry fires.

  • LogEntry - Message type for the log queue. Contains both Event and Delivery for persistence to the logstore.

  • DeliveryRecord - Query result type returned by logstore. Contains Delivery with optional Event population for API responses.

Logstore Interface Changes

  • InsertManyDeliveryEventInsertMany(events, deliveries)
  • ListDeliveryEventListDelivery (returns DeliveryRecord)
  • RetrieveDeliveryEventRetrieveDelivery (returns DeliveryRecord)

Other Changes

  • Removes delivery_event_id column from database schema
  • Removes legacy /delivery-events API endpoints
  • Idempotency now uses event_id:destination_id format (with :manual suffix for manual retries)

@vercel
Copy link

vercel bot commented Jan 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Review Updated (UTC)
outpost-docs Ready Ready Preview, Comment Jan 27, 2026 7:13pm
outpost-website Ready Ready Preview, Comment Jan 27, 2026 7:13pm

Request Review

alexluong and others added 10 commits January 26, 2026 20:10
- Rename interface methods: InsertManyDeliveryEvent -> InsertMany,
  ListDeliveryEvent -> ListDelivery, RetrieveDeliveryEvent -> RetrieveDelivery
- Rename request types: ListDeliveryEventRequest -> ListDeliveryRequest, etc.
- Add DeliveryRecord type for query results with Event and Delivery
- Update memlogstore, pglogstore, chlogstore implementations
- Update all API handlers and tests to use new interface
- Remove DeliveryEventID field from Delivery struct

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Add DeliveryTask struct with IdempotencyKey() and RetryID() methods
- Update deliverymq to publish/consume DeliveryTask instead of DeliveryEvent
- Update publishmq to create and enqueue DeliveryTask
- Update RetryTask to convert to DeliveryTask
- Update API handlers, eventtracer, alert, emetrics to use DeliveryTask
- Fix Delivery fields (TenantID, Attempt, Manual) not being set before logging
- Add :manual suffix to idempotency key for manual retries

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Update chlogstore/README.md method names and SQL examples
- Update pglogstore/README.md method names
- Update tracer_test.go comment to reference DeliveryTask

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Preserves Event-Delivery pairing through the insert flow, eliminating
the need for eventMap reconstruction in ClickHouse implementation.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
* refactor: add event fetching to retry scheduler

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* fix: align mock eventGetter with logstore behavior

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* refactor: remove logStore from messagehandler

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* chore: remove dead eventGetter code from messagehandler tests

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* test: verify manual retry publishes full event data

Extends TestRetryDelivery to verify that the manual retry API publishes
a DeliveryTask with complete event data to deliveryMQ. This ensures
the deliverymq handler receives full event data for manual retries,
consistent with the scheduled retry flow which fetches event data
in the retry scheduler.

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* test: add failing tests for retry race condition

Add unit and e2e tests verifying that retries are not lost when the
retry scheduler queries logstore before the event has been persisted.

Test scenario:
1. Initial delivery fails, retry is scheduled
2. Retry scheduler queries logstore for event data
3. Event is not yet persisted (logmq batching delay)
4. Retry should remain in queue and be reprocessed later

Tests added:
- TestRetryScheduler_RaceCondition_EventNotYetPersisted (unit test)
- TestE2E_Regression_RetryRaceCondition (e2e test)

Also adds:
- RetryVisibilityTimeoutSeconds config option
- WithRetryVisibilityTimeout scheduler option
- mockDelayedEventGetter for simulating delayed persistence

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* fix: return error when event not found in logstore during retry

Return error instead of nil so the retry message stays in queue and
will be reprocessed after the visibility timeout.

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* test: improve flaky tests

* chore: dev yaml

* chore: `make test` skip cmd/e2e by default

---------

Co-authored-by: Claude Opus 4.5 <[email protected]>
alexluong and others added 8 commits January 27, 2026 01:28
Update test files to use the new Attempt naming:
- logstore/drivertest/*.go: AttemptFactory, ListAttempt, RetrieveAttempt
- deliverymq/*_test.go: AttemptStatus*, entry.Attempt
- apirouter/*_test.go: AttemptFactory, /attempts API paths
- logmq/batchprocessor_test.go: AttemptFactory, Attempt struct fields

All unit tests pass (1665 tests).

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Revert attempt_metadata → delivery_metadata in OpenAPI (matches Go code)
- Fix config: delivery_prefix → attempt_prefix, example att → atm
- Fix variable names: deliveryID → attemptID, attErr → atmErr
- Fix test names: TestListDeliveries → TestListAttempts, etc.
- Update doc links for renamed API endpoints
- Update comments and test descriptions

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Remove comments that simply restate what the next line of code does,
such as "// Create client" before NewClient() or "// Check if exists"
before .Exists(). Preserves all meaningful comments including godoc,
section headers, WHY explanations, and unit clarifications.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
refactor: rename delivery -> attempt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attempts belong to a destination, unclear to me why we removed /tenants/:tenant_id,/destinations/:dest_id/attempts

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that this is the same as /tenants/:tenant_id/attempts?destination_id=dest_id, so it's effectively syntactic sugar. I thought it was simpler to have only 1 way of doing things.

Not an issue to add this back, so let me know which you prefer.

Comment on lines +80 to 89
if err != nil {
// Transient error (DB connection issue, etc) - return error so scheduler retries
if logger != nil {
logger.Ctx(ctx).Error("failed to fetch event for retry",
zap.Error(err),
zap.String("event_id", retryTask.EventID),
zap.String("tenant_id", retryTask.TenantID),
zap.String("destination_id", retryTask.DestinationID))
}
return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens in this case? The message isn't removed from Redis and then what?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RSMQ supports visibility timeout, so it will be visible again for try later

TenantID string `json:"tenant_id"`
EventID string `json:"event_id"`
DestinationID string `json:"destination_id"`
AttemptNumber int `json:"attempt"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems a bit odd label, what about just Number?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will also include a migration to change field attempt -> number, right?

FWIW Hookdeck also calls it attempt_number (ref)

CleanShot 2026-01-29 at 01 12 20

const [retrying, setRetrying] = useState<boolean>(false);

const retryDelivery = useCallback(
const retryAttempt = useCallback(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hesitant on that terminology, you aren't really retrying an attempt, you retry a delivery with generates and attempt. Thoughts?

{ label: "Overview", path: "" },
{ label: "Settings", path: "/settings" },
{ label: "Deliveries", path: "/deliveries" },
{ label: "Attempts", path: "/attempts" },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think Attempts is the right term for the UI, end-users don't think in terms of Attempts. I would call it "Event deliveries" which shows a list of delivery attemtps. This is what Stripe does

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants