Notification Architecture
Architecture of the surface-aware notification system across MongoDB, API modules, and realtime services.
Component Map
Data Model
- Single notification document per logical event.
- surfaces[] indicates delivery targets.
- statusBySurface.<surface> tracks lifecycle timestamps and metadata.
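As a rough illustration of this shape, the sketch below models one document with a surfaces list and a per-surface status map. The surface names are taken from the channel configuration later in this document; the SurfaceStatus field names (deliveredAt, readAt, meta) and the markDelivered helper are assumptions made for the example, not the actual schema.

```typescript
// Surface names as they appear in the channel configuration of this document.
type Surface = "mobile_in_app" | "web_in_app" | "mobile_push";

// Hypothetical per-surface lifecycle record; field names are assumptions.
interface SurfaceStatus {
  deliveredAt?: Date;
  readAt?: Date;
  meta?: Record<string, unknown>;
}

// One document per logical event, fanned out to one or more surfaces.
interface NotificationDoc {
  userId: string;
  title: string;
  surfaces: Surface[];
  statusBySurface: Partial<Record<Surface, SurfaceStatus>>;
}

// Returns a copy with a delivery timestamp recorded for one surface.
// Surfaces that are not delivery targets are ignored (the input is returned as-is).
function markDelivered(doc: NotificationDoc, surface: Surface, at: Date): NotificationDoc {
  if (doc.surfaces.indexOf(surface) === -1) return doc;
  const statusBySurface: Partial<Record<Surface, SurfaceStatus>> = { ...doc.statusBySurface };
  statusBySurface[surface] = { ...statusBySurface[surface], deliveredAt: at };
  return { ...doc, statusBySurface };
}
```

Keeping the lifecycle state keyed by surface means a read on one surface does not need to touch the state of another.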
API Boundary
Read APIs (apps/api/src/modules/notification) are strict surface contracts.
Write/producer APIs (apps/api/src/modules/notifications) map business events to targets and dispatch.
Realtime Boundary
- Room key: notifications:{userId}:{surface}
- Join/sync handlers in gateway
- Publish helpers in realtime service
- Event history replay on reconnect
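The room key format above can be captured in a pair of small helpers. This is a minimal sketch: roomKey and parseRoomKey are hypothetical names, and the parser assumes that neither userId nor surface contains a colon.

```typescript
// Builds the room key in the documented format notifications:{userId}:{surface}.
function roomKey(userId: string, surface: string): string {
  return `notifications:${userId}:${surface}`;
}

// Hypothetical inverse of roomKey. Returns null for anything that does not
// match the three-part format (assumes ids and surfaces contain no ":").
function parseRoomKey(key: string): { userId: string; surface: string } | null {
  const [prefix, userId, surface, ...rest] = key.split(":");
  if (prefix !== "notifications" || !userId || !surface || rest.length > 0) {
    return null;
  }
  return { userId, surface };
}
```

For example, roomKey("42", "web_in_app") yields notifications:42:web_in_app, so each user gets a separate room per surface.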
Transactional Layer Architecture (NEW)
Architecture Overview
Layer Responsibilities
| Component | Responsibility |
|---|---|
| OrderNotificationService | Domain-specific entry point. Builds EventContext from order data. Resolves event type based on order_source. Wraps router call with fail-safe. |
| NotificationRouterService | Core orchestration. Looks up config from catalog. Manages Redis dedupe reservation. Dispatches to channels. Handles fallback on errors. |
| TransactionalNotificationCatalog | Static configuration. Maps transactional order/refund event types to template IDs, channels, and priorities. |
| NotificationRecipientResolverService | Resolves recipients from EventContext. Returns userId and optional email. |
| Template Renderers | Pure functions generating subject, html, text, previewText for each event type. |
| NotificationsService | Base service. Enqueues jobs to BullMQ. |
| NotificationsProcessor | Base processor. Handles MongoDB persistence, Firebase push, realtime events. |
Event Flow Sequence
- Context Build: OrderNotificationService queries order + customer data, builds EventContext
- Event Resolution: Resolves event type (e.g., ORDER_PAYMENT_RECEIVED_CHECKOUT) based on order_source
- Router Entry: Calls notificationRouter.routeEvent(event, context)
- Config Lookup: Router gets EventConfig from catalog (templateId, channels, priority, title)
- Dedupe Check: Router builds dedupe key, attempts Redis SET NX EX
  - If duplicate: return early (no send)
  - If Redis fails: allow send (fail-open)
- Recipient Resolution: Router calls recipientResolver.resolve(context)
- Channel Dispatch: For each channel:
  - Push: Build notification body, call sendToUser() with inAppTargets + pushTargets
  - Email: Render template, call sendEmailToUsers()
- Queue Enqueue: Both methods add jobs to BullMQ NOTIFICATIONS queue
- Processing: NotificationsProcessor handles delivery (MongoDB persist, Firebase push, realtime emit)
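The ordering of the router steps above, including the early returns, can be sketched as follows. This is a simplified illustration, not the actual NotificationRouterService: the RouterDeps interface is a set of hypothetical, synchronous stand-ins for the collaborators (the real calls are presumably asynchronous), and the "unknown_event" result for a missing catalog entry is an assumption.

```typescript
type Channel = "push" | "email";

interface EventContext { orderId: string; productId: string; userId?: string }
interface EventConfig { templateId: string; channels: Channel[]; priority: "high" | "normal"; title: string }
interface Recipient { userId: string; email?: string }

// Hypothetical, synchronous stand-ins for the collaborators named above.
interface RouterDeps {
  lookupConfig(event: string): EventConfig | undefined;
  reserveDedupe(event: string, ctx: EventContext): boolean; // false means duplicate
  resolveRecipients(ctx: EventContext): Recipient[];
  sendToUser(recipient: Recipient, cfg: EventConfig): void;
  sendEmailToUsers(recipients: Recipient[], cfg: EventConfig): void;
}

type RouteResult = "sent" | "unknown_event" | "duplicate" | "no_recipients";

function routeEvent(event: string, ctx: EventContext, deps: RouterDeps): RouteResult {
  // Config lookup
  const cfg = deps.lookupConfig(event);
  if (!cfg) return "unknown_event";

  // Dedupe check: return early when the event was already sent
  if (!deps.reserveDedupe(event, ctx)) return "duplicate";

  // Recipient resolution: nothing to dispatch without recipients
  const recipients = deps.resolveRecipients(ctx);
  if (recipients.length === 0) return "no_recipients";

  // Channel dispatch: both send methods are expected to enqueue jobs
  for (const channel of cfg.channels) {
    if (channel === "push") {
      recipients.forEach((r) => deps.sendToUser(r, cfg));
    } else {
      const withEmail = recipients.filter((r) => r.email !== undefined);
      if (withEmail.length > 0) deps.sendEmailToUsers(withEmail, cfg);
    }
  }
  return "sent";
}
```

Passing the collaborators in as an interface keeps the ordering logic testable with in-memory fakes.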
Deduplication Strategy
Two-layer deduplication:
- Router Layer (Redis):
  - Key: notification_sent:transactional:<event>:<orderId>:<productId>
  - TTL: 7 days
  - Method: SET key value EX ttl NX
  - Purpose: Prevent rapid duplicate sends within 7-day window
- Processor Layer (MongoDB):
  - Field: externalRef
  - Constraint: Unique partial index when externalRef is string
  - Purpose: Ensure idempotent persistence per recipient
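A minimal sketch of the router-layer reservation, including the fail-open path, might look like this. The DedupeStore interface is a hypothetical abstraction over the Redis call (setIfAbsent stands for SET key value EX ttl NX), InMemoryStore is an illustration-only stand-in for Redis, and the stored value "1" is an assumption.

```typescript
// Hypothetical store contract: resolves true when the key was newly set
// (the SET ... NX succeeded) and false when the key already existed.
interface DedupeStore {
  setIfAbsent(key: string, value: string, ttlSeconds: number): Promise<boolean>;
}

const DEDUPE_TTL_SECONDS = 7 * 24 * 60 * 60; // 7 days

function dedupeKey(event: string, orderId: string, productId: string): string {
  return `notification_sent:transactional:${event}:${orderId}:${productId}`;
}

// Resolves true when the send may proceed, false when it is a duplicate.
// A store failure is reported through `warn` and treated as "allow" (fail-open).
async function reserveSend(
  store: DedupeStore,
  event: string,
  orderId: string,
  productId: string,
  warn: (message: string) => void = () => {},
): Promise<boolean> {
  try {
    return await store.setIfAbsent(dedupeKey(event, orderId, productId), "1", DEDUPE_TTL_SECONDS);
  } catch (err) {
    warn(`dedupe store unavailable, allowing send: ${String(err)}`);
    return true;
  }
}

// In-memory stand-in for Redis, for illustration only.
class InMemoryStore implements DedupeStore {
  private expiry = new Map<string, number>();

  async setIfAbsent(key: string, _value: string, ttlSeconds: number): Promise<boolean> {
    const now = Date.now();
    const expiresAt = this.expiry.get(key);
    if (expiresAt !== undefined && expiresAt > now) return false; // still reserved
    this.expiry.set(key, now + ttlSeconds * 1000);
    return true;
  }
}
```

Because the reservation fails open, the MongoDB unique index on externalRef remains the backstop for idempotent persistence when Redis is unavailable.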
Source-Aware Event Resolution
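OrderNotificationService resolves the event type from the order's order_source field. Only checkout maps to the CHECKOUT variants; every other value, including admin_manual and null, falls back to the BUYNOW variants: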
| order_source | Order Placed Event | Payment Received Event |
|---|---|---|
| checkout | ORDER_PLACED_CHECKOUT | ORDER_PAYMENT_RECEIVED_CHECKOUT |
| buy_now | ORDER_PLACED_BUYNOW | ORDER_PAYMENT_RECEIVED_BUYNOW |
| admin_manual | ORDER_PLACED_BUYNOW | ORDER_PAYMENT_RECEIVED_BUYNOW |
| null/unknown | ORDER_PLACED_BUYNOW | ORDER_PAYMENT_RECEIVED_BUYNOW |
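The mapping in the table reduces to a single branch on order_source. The sketch below assumes a hypothetical resolveOrderEvent helper and an OrderStage type; only the event names and the fallback behavior come from the table.

```typescript
// Hypothetical stage discriminator for the two source-aware events.
type OrderStage = "placed" | "payment_received";

// Resolves the event name from the stage and the order_source value.
// Anything other than "checkout" (buy_now, admin_manual, null, unknown)
// falls back to the BUYNOW variant, as in the table above.
function resolveOrderEvent(stage: OrderStage, orderSource: string | null | undefined): string {
  const variant = orderSource === "checkout" ? "CHECKOUT" : "BUYNOW";
  return stage === "placed"
    ? `ORDER_PLACED_${variant}`
    : `ORDER_PAYMENT_RECEIVED_${variant}`;
}
```

Defaulting to the BUYNOW variant means an unexpected or missing order_source still produces a notification.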
Channel Configuration
All transactional events are configured with:
- Push: inAppTargets=["mobile_in_app", "web_in_app"], pushTargets=["mobile_push"]
- Email: HTML template with order details, CTAs
Priority mapping:
- High: Payment events (received/failed), Refunds
- Normal: Order placed, Order cancelled
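One way to express this configuration as a static catalog is sketched below. The event names, targets, and priority split follow this document; the template ID scheme (lower-cased event name), the placeholder title, and the configFor helper are assumptions for illustration only.

```typescript
type Priority = "high" | "normal";
type Channel = "push" | "email";

type TransactionalEvent =
  | "ORDER_PLACED_CHECKOUT"
  | "ORDER_PLACED_BUYNOW"
  | "ORDER_PAYMENT_RECEIVED_CHECKOUT"
  | "ORDER_PAYMENT_RECEIVED_BUYNOW"
  | "ORDER_PAYMENT_FAILED"
  | "ORDER_CANCELLED"
  | "ORDER_REFUND_REQUESTED"
  | "ORDER_REFUND_APPROVED"
  | "ORDER_REFUND_REJECTED";

// High: payment events and refunds. Normal: order placed and order cancelled.
const PRIORITY_BY_EVENT: Record<TransactionalEvent, Priority> = {
  ORDER_PLACED_CHECKOUT: "normal",
  ORDER_PLACED_BUYNOW: "normal",
  ORDER_PAYMENT_RECEIVED_CHECKOUT: "high",
  ORDER_PAYMENT_RECEIVED_BUYNOW: "high",
  ORDER_PAYMENT_FAILED: "high",
  ORDER_CANCELLED: "normal",
  ORDER_REFUND_REQUESTED: "high",
  ORDER_REFUND_APPROVED: "high",
  ORDER_REFUND_REJECTED: "high",
};

interface EventConfig {
  templateId: string;
  channels: Channel[];
  priority: Priority;
  title: string;
  inAppTargets: string[];
  pushTargets: string[];
}

function configFor(event: TransactionalEvent): EventConfig {
  return {
    templateId: event.toLowerCase(), // hypothetical template ID scheme
    channels: ["push", "email"],
    priority: PRIORITY_BY_EVENT[event],
    title: event, // placeholder; real titles would come from the catalog
    inAppTargets: ["mobile_in_app", "web_in_app"],
    pushTargets: ["mobile_push"],
  };
}
```

Typing the priority table as Record<TransactionalEvent, Priority> makes the compiler flag any event that is added to the union without a priority.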
Fail-Open Reliability Model
| Failure Scenario | Behavior |
|---|---|
| Redis unavailable for dedupe | Log warning, allow send (fail-open) |
| Recipient resolution empty | Log warning, skip dispatch |
| Template rendering fails | Use fallback minimal content, continue dispatch |
| Queue enqueue fails | Log error, do not propagate |
| Push delivery fails | Log error, notification persisted in MongoDB |
| Email delivery fails | Log error, notification persisted in MongoDB |
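Two rows of this table, the template fallback and the non-propagating enqueue, can be illustrated with small wrappers. This is a sketch under assumptions: renderWithFallback and enqueueSafely are hypothetical helpers, the shape of the minimal fallback content is invented for the example, and the synchronous callbacks stand in for calls that are presumably asynchronous in practice.

```typescript
// Output shape of the template renderers described in this document.
interface RenderedEmail { subject: string; html: string; text: string; previewText: string }

// Runs a renderer and substitutes minimal content when it throws, so that
// dispatch can continue. The title is assumed to be trusted catalog text,
// which is why it is placed into the HTML without escaping here.
function renderWithFallback(
  render: () => RenderedEmail,
  title: string,
  warn: (message: string) => void,
): RenderedEmail {
  try {
    return render();
  } catch (err) {
    warn(`template rendering failed, using fallback: ${String(err)}`);
    return { subject: title, html: `<p>${title}</p>`, text: title, previewText: title };
  }
}

// Runs an enqueue call and swallows any failure after logging it, so the
// error never reaches the order flow. Returns whether the job was enqueued.
function enqueueSafely(enqueue: () => void, logError: (message: string) => void): boolean {
  try {
    enqueue();
    return true;
  } catch (err) {
    logError(`queue enqueue failed: ${String(err)}`);
    return false;
  }
}
```

In both cases a notification failure is logged rather than propagated to the order flow that triggered it.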
Relationship to Base Infrastructure
The transactional layer does not replace the base infrastructure. Instead, it:
- Reuses NotificationsService for queue enqueue
- Reuses NotificationsProcessor for delivery
- Reuses MongoDB schemas for persistence
- Reuses Firebase for push delivery
- Reuses RealtimeService for in-app events
- Adds event standardization, deduplication, template rendering on top
Consumers outside the transactional order flow continue to call sendToUser() / sendEmailToUsers() directly, without going through the transactional router.
Event Coverage
| Category | Events |
|---|---|
| Order Lifecycle | ORDER_PLACED_CHECKOUT, ORDER_PLACED_BUYNOW, ORDER_PAYMENT_RECEIVED_CHECKOUT, ORDER_PAYMENT_RECEIVED_BUYNOW, ORDER_PAYMENT_FAILED, ORDER_CANCELLED |
| Refund Flow | ORDER_REFUND_REQUESTED, ORDER_REFUND_APPROVED, ORDER_REFUND_REJECTED |
Shared Outbox Cleanup Architecture (NEW)
All transactional notification events that persist through outbox_events participate in a shared cleanup lifecycle.
Key behaviors:
- Pending rows are never deleted.
- Retention and DLQ threshold are environment-configured.
- Cleanup runs via BullMQ job execution, preserving retry/backoff semantics.
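The first two behaviors can be sketched as an eligibility predicate that a cleanup job would apply to each row. Everything beyond "pending rows are never deleted" is an assumption here: the status values other than pending, the updatedAt field, and the retentionDays parameter (standing in for the environment-configured retention) are hypothetical, and the DLQ threshold is left out because its semantics are not described in this document.

```typescript
// Hypothetical row shape; only the "pending" status is taken from the text.
type OutboxStatus = "pending" | "processed" | "failed";

interface OutboxRow {
  id: string;
  status: OutboxStatus;
  updatedAt: Date;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// A row may be deleted only when it is not pending and is older than the
// retention window. retentionDays would be read from the environment.
function isCleanupEligible(row: OutboxRow, now: Date, retentionDays: number): boolean {
  if (row.status === "pending") return false; // pending rows are never deleted
  const ageMs = now.getTime() - row.updatedAt.getTime();
  return ageMs > retentionDays * MS_PER_DAY;
}
```

A cleanup job could apply this predicate (or an equivalent query filter) to each batch, so a retried run re-evaluates the same rule.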