Time-Windowed Join

Exclusive Normal 3.5 BasicsData Structures Google

Implement WindowJoiner, a time-windowed joiner between two event streams. Events are matched by key and timestamp proximity, with strict correctness rules.

This mirrors real streaming/analytics components and emphasizes state management, policy decisions, and temporal reasoning.


Event Streams

Stream A events

Each A event has:

  • key
  • t (timestamp)
  • a_val

Stream B events

Each B event has:

  • key
  • t (timestamp)
  • b_val

Join Rules

  • Events may arrive in any order.
  • Two events may join if:
    • Their keys are equal
    • abs(tA - tB) <= W
  • Each individual event may participate in at most one join.

Matching Policy

When multiple join candidates exist for the incoming event:

  1. Choose the match with the smallest absolute time difference.
  2. If still tied, choose the candidate with the earlier timestamp on the opposite stream.
  3. If still tied, choose the candidate that was inserted earlier into the buffer.

Once matched, both events become consumed and cannot be used again.


Time & Eviction Rules

  • Define current_time as the maximum timestamp observed so far across both streams.
  • Events with t < current_time - W can no longer be matched and must be evicted.
  • The implementation must keep memory bounded via eviction.

Constructor

WindowJoiner(W)

  • Initializes the joiner with window size W.

Methods

push_a(key, t, a_val)

  • Adds an event from stream A.
  • Returns all newly produced joined results caused by this call. If there are no newly produced results, print null.

push_b(key, t, b_val)

  • Adds an event from stream B.
  • Returns all newly produced joined results caused by this call. If there are no newly produced results, print null.

Joined Result Format

Each joined result must contain the key and both events’ timestamps and values:

  • key, tA, a_val, tB, b_val

If you output joined results, they should be represented consecutively and space-separated per result.


Notes

  • Do not emit duplicate joins.
  • The correctness of eviction and the matching policy is more important than raw performance.

Examples

Example 1
Input:WindowJoiner 3 push_a x 10 5 push_a y 20 1 push_b x 12 7 push_b y 30 9
Output:null null x 10 5 12 7 null
Accepted 1/7
Acceptance 14%
Loading editor...
Sample Input:
WindowJoiner 10 push_b x 14 3 push_b x 11 2 push_a x 10 1
Expected Output:
null null x 10 1 11 2