The case for a Custom Window in Flink: Expanding your streaming use-cases

Apr 22, 2025

You’ve probably heard that windows are a cornerstone of streaming analytics. They help us group data that just keeps flowing. But how do these windows actually work, and why would we choose one style over another? Let’s walk through the classics—Tumbling, Sliding, and Session—then see why a Rolling window (our custom star) can sometimes be the better fit.

A Quick Refresher on Why Windows Matter

Streaming data never stops. There’s always a new message, a new event, or a reading from that old hardware sensor nobody has replaced yet. Without windows, we’d drown in a flood of unbounded data. Windows let us slice this stream into chunks we can understand.

Before diving into each window style, here’s a tiny timeline of the events we’ll reference:

Stream of Events

| **T (s)** | **Events** |
|-----------|------------|
| 0         | e₁         |
| 2         | e₂, e₃     |
| 5         | e₄         |
| 7         | e₅, e₆     |
| 10        | e₇         |
| 14        | e₈         |

We’ll reuse this humble list to see how each window type groups data over time.

Tumbling Windows: A Clear‑Cut Approach

Think of a tumbling window as a neat conveyor belt. Every few seconds (or minutes, or hours), the belt moves, sealing off the batch of items that just arrived, and a fresh belt section starts collecting new events.

How it works: You define a fixed length. Data that arrives during that length belongs to the same window. Once that period ends, the window closes and a new one begins immediately.
When to use it: If you need tidy, non-overlapping intervals—like computing daily counts of website visits or hourly sums of sensor readings—tumbling windows feel straightforward and easy.

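import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

DataStream<Event> stream = ...;
stream
  .keyBy(Event::getUserId)                               // one set of windows per user
  .window(TumblingEventTimeWindows.of(Time.seconds(5)))  // fixed, non-overlapping 5-second windows
  .reduce(new MyReduceFunction());                       // combine events incrementally per window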

Result on our timeline (5‑second blocks):

| Window  | Events       |
|---------|--------------|
| [0–4]   | e₁,  e₂,  e₃ |
| [5–9]   | e₄,  e₅,  e₆ |
| [10–14] | e₇,  e₈      |
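The grouping above comes down to one line of arithmetic: round each timestamp down to the nearest multiple of the window size. Here is a minimal plain-Java sketch of that idea, with no Flink dependency. The class and method names (`TumblingSketch`, `windowStart`, `assign`) are made up for illustration, and timestamps are in whole seconds to match the timeline.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch of tumbling-window assignment; not Flink code. */
final class TumblingSketch {

    /** Start of the window containing ts; the window covers [start, start + size). */
    static long windowStart(long ts, long size) {
        // floorMod keeps the rounding correct for negative timestamps too.
        return ts - Math.floorMod(ts, size);
    }

    /** Groups event names by window start, ordered by window start. */
    static Map<Long, List<String>> assign(long[] times, String[] names, long size) {
        Map<Long, List<String>> windows = new TreeMap<>();
        for (int i = 0; i < times.length; i++) {
            windows.computeIfAbsent(windowStart(times[i], size), k -> new ArrayList<>())
                   .add(names[i]);
        }
        return windows;
    }

    public static void main(String[] args) {
        long[] times = {0, 2, 2, 5, 7, 7, 10, 14};
        String[] names = {"e1", "e2", "e3", "e4", "e5", "e6", "e7", "e8"};
        // Prints {0=[e1, e2, e3], 5=[e4, e5, e6], 10=[e7, e8]}
        System.out.println(assign(times, names, 5));
    }
}
```

Each event lands in exactly one window, which is why tumbling windows are the cheapest of the built-in styles.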

Sliding Windows: Overlapping Time Slices (That Don’t Actually Slide!)

Despite the name, sliding windows are really a parade of fixed windows created at a regular cadence. Each new window covers the last N seconds of data, so they overlap heavily.

How it works: You define a window length and a slide interval. Whenever the clock advances by that slide interval, a new window is defined covering the last “length” of time.
When to use it: When you need partial overlaps. Maybe you want a running average that updates every five seconds, but it still covers the last minute of data.

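import org.apache.flink.streaming.api.windowing.assigners.SlidingEventTimeWindows;

DataStream<Event> stream = ...;
stream
  .keyBy(Event::getUserId)
  .window(SlidingEventTimeWindows.of(Time.seconds(5), Time.seconds(1)))  // size = 5 s, slide = 1 s
  .sum("amount");                                                        // sums the "amount" field of Event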

Because the window is five seconds and the slide is one, a single event may appear in five different windows. New windows start at T = 0 s, 1 s, 2 s, 3 s, 4 s … forever, each overlapping the previous by four seconds. That means more triggers and more work, but also more granular insight.

| Window  | Events       |
|---------|--------------|
| [0–4]   | e₁,  e₂,  e₃ |
| [1–5]   | e₂,  e₃,  e₄ |
| [2–6]   | e₂,  e₃,  e₄ |
| [3–7]   | e₄,  e₅,  e₆ |
| [4–8]   | e₄,  e₅,  e₆ |
| [5–9]   | e₄,  e₅,  e₆ |
| [6–10]  | e₅,  e₆,  e₇ |
| [7–11]  | e₅,  e₆,  e₇ |
| [8–12]  | e₇           |
| [9–13]  | e₇           |
| [10–14] | e₇,  e₈      |

Heads‑up: Short slides can stress Flink because every record is routed to many overlapping windows.

To recap the example: a five-second window with a one-second slide opens a new window at T = 0 s, 1 s, 2 s, 3 s, and so on. Each window overlaps its predecessor by four seconds, so consecutive windows collect mostly the same events, and every event triggers updates in several windows.

In practice, this creates performance overhead when your slide is much shorter than your window, because Flink enumerates every window a record belongs to. A one-hour window with a one-second slide, for example, assigns each record to 3,600 windows, so you end up computing overlapping results again and again. Other systems tackle this with more advanced data structures, but this is how Flink does it under the hood.
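To make the "one record, many windows" cost concrete, here is a minimal plain-Java sketch that lists every window a single timestamp falls into. It follows the same idea as Flink's sliding assigner but is not Flink code; `SlidingSketch` and `windowStartsFor` are names invented for this example, and timestamps are in whole seconds.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of sliding-window assignment; not Flink code. */
final class SlidingSketch {

    /** Starts of all windows [start, start + size) that contain ts, newest first. */
    static List<Long> windowStartsFor(long ts, long size, long slide) {
        List<Long> starts = new ArrayList<>();
        // The newest window containing ts starts at ts rounded down to the slide.
        long lastStart = ts - Math.floorMod(ts, slide);
        // Walk backwards one slide at a time while the window still covers ts.
        for (long start = lastStart; start > ts - size; start -= slide) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // e5 arrives at T = 7 with size = 5 s, slide = 1 s.
        System.out.println(windowStartsFor(7, 5, 1));          // [7, 6, 5, 4, 3]
        // A one-minute window sliding every five seconds.
        System.out.println(windowStartsFor(7, 60, 5).size());  // 12
    }
}
```

The number of windows per record is roughly size divided by slide. Note that for e₁ at T = 0 the sketch also returns window starts from -4 to -1; windows that begin before the first event still contain it, even though the table above starts at [0–4].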

Session Windows: Activity‑Driven Buckets

Sometimes, time isn’t the right divider—activity is. A session window stays open while a user (or device) is active. If there’s a lull longer than a configured gap, the session closes.

How it works: The window remains open as long as there’s activity. Once the silence lasts longer than a configured gap, the “session timeout” (say, 30 seconds), the window closes and its result is finalized.
When to use it: When you want to identify, classify, and measure user behavior and engagement. Whether you’re tracking chat messages, page views, or other bursts of usage, session windows help you isolate and analyze periods of active participation.

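import org.apache.flink.streaming.api.windowing.assigners.EventTimeSessionWindows;

DataStream<Event> stream = ...;
stream
  .keyBy(Event::getUserId)                                   // sessions are tracked per user
  .window(EventTimeSessionWindows.withGap(Time.seconds(3)))  // close after 3 s of inactivity
  .apply(new MyWindowFunction());                            // called once per closed session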

On our timeline, we get two sessions:

| T (s)   | Events    | Why?                                 |
|---------|-----------|--------------------------------------|
| 0       | e₁        | first event starts session₁          |
| 13      | e₁ ... e₇ | 3 s gap after e₇, so session₁ closes |
| 14      | e₈        | e₈ opens session₂                    |
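Under the hood, a session is just a window that keeps stretching while events keep arriving. The plain-Java sketch below gives every event a provisional window of [ts, ts + gap) and merges windows that overlap or touch. It assumes events arrive in timestamp order and that events exactly one gap apart stay in the same session, which is what the table above shows for e₃ and e₄. `SessionSketch` and its members are illustrative names, not Flink classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of gap-based session merging; not Flink code. */
final class SessionSketch {

    /** A session covering [start, end) plus the number of events inside it. */
    static final class Session {
        final long start;
        long end;
        int count = 1;

        Session(long start, long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + ") x" + count;
        }
    }

    /** Builds sessions from timestamps that are already sorted ascending. */
    static List<Session> sessions(long[] sortedTimes, long gap) {
        List<Session> out = new ArrayList<>();
        Session current = null;
        for (long ts : sortedTimes) {
            if (current != null && ts <= current.end) {
                // Still inside (or touching) the open session: extend it.
                current.end = Math.max(current.end, ts + gap);
                current.count++;
            } else {
                // Silence lasted longer than the gap: start a new session.
                current = new Session(ts, ts + gap);
                out.add(current);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long[] times = {0, 2, 2, 5, 7, 7, 10, 14};
        // Prints [[0, 13) x7, [14, 17) x1]
        System.out.println(sessions(times, 3));
    }
}
```

The first session ends at 13, three seconds after e₇, and holds seven events. The second starts with e₈ at 14 and would stay open until event time passes 17.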

Rolling Windows: One Buffer, Zero Hassle

Why invent a “rolling” window when sliding exists? Because sliding windows replicate work: each overlap means more triggers, more windows, and more state. A rolling window gives you a continuously updated view over the last N seconds without multiplying overlapping windows.

What Makes It Tick?

Our RollingWindowProcessFunction tracks events in a keyed list, prunes anything older than the window size, and emits results on a schedule you pick (say every event‑time second).

public abstract class RollingWindowProcessFunction<KEY, EVENT extends TimestampedEvent, OUTPUT>
    extends KeyedProcessFunction<KEY, EVENT, OUTPUT> {
  // windowSize → how far back we keep events
  // emitInterval → how often to emit results
  // … state setup …
}

On the same sample stream, it behaves like this (trailing five‑second view, emit on every event):

| Emit  T | Period     | Events       |
|---------|------------|--------------|
| 0       | [–4 …  0]  | e₁           |
| 2       | [–2 …  2]  | e₁,  e₂,  e₃ |
| 5       | [1 …  5]   | e₂,  e₃,  e₄ |
| 7       | [3 …  7]   | e₄,  e₅,  e₆ |
| 10      | [6 …  10]  | e₅,  e₆,  e₇ |
| 14      | [10 …  14] | e₇,  e₈      |
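The buffer-and-prune idea behind the table can be sketched in a few lines of plain Java. This is not the RollingWindowProcessFunction itself: it is a standalone illustration with invented names (`RollingSketch`, `onEvent`), it assumes events arrive in timestamp order, and it emits on every event rather than on a timer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Illustrative sketch of a rolling (trailing) window; not Flink code. */
final class RollingSketch {

    static final class Event {
        final long ts;
        final String name;

        Event(long ts, String name) {
            this.ts = ts;
            this.name = name;
        }
    }

    private final long windowSize;
    private final Deque<Event> buffer = new ArrayDeque<>();

    RollingSketch(long windowSize) {
        this.windowSize = windowSize;
    }

    /** Adds an in-order event, prunes old ones, and returns the current view. */
    List<String> onEvent(long ts, String name) {
        buffer.addLast(new Event(ts, name));
        // Keep only events with a timestamp in [ts - windowSize + 1, ts].
        while (!buffer.isEmpty() && buffer.peekFirst().ts <= ts - windowSize) {
            buffer.removeFirst();
        }
        List<String> view = new ArrayList<>();
        for (Event e : buffer) {
            view.add(e.name);
        }
        return view;
    }

    public static void main(String[] args) {
        RollingSketch rolling = new RollingSketch(5);
        long[] times = {0, 2, 2, 5, 7, 7, 10, 14};
        String[] names = {"e1", "e2", "e3", "e4", "e5", "e6", "e7", "e8"};
        for (int i = 0; i < times.length; i++) {
            System.out.println(times[i] + " -> " + rolling.onEvent(times[i], names[i]));
        }
    }
}
```

The sketch prints one view per event, so timestamps with two events produce two lines; the table shows the view after the last event at each timestamp. Every event is stored once and removed once, no matter how often results are emitted.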

One window is constantly updated, and there is no overlapping copy‑paste. Perfect for dashboards that always show “the last X minutes” with minimal overhead.

Everything in One Glance

Feeling visual? Here’s how every window type evolves for our example stream:

[Figure: tumbling, sliding, session, and rolling windows on the example stream]

Final Thoughts

Windows give shape to the unstoppable flow of streaming data. Tumbling and sliding windows are your bread‑and‑butter—tidy blocks or overlapping slices, respectively. Session windows adapt to user activity. And if you need rolling insight—always focusing on a trailing chunk of time without duplicating effort—a custom rolling window is a simple, elegant answer.

Choose the style that matches your data patterns, performance needs, and the story you want to tell, then let your streams roll. Happy streaming! 🚀