Scroll as a playhead

Thumbnail strips usually aren't the interesting part of an image browser. You scroll, the list moves, you stop. Nothing to think about.

I opened an image in Midjourney's lightbox and accidentally bumped the scroll wheel. The strip did something I wasn't expecting. It scrubbed through the images like a timeline. I didn't realize what was happening until I was already doing it, and by then it felt like the obvious way a thumbnail rail should work.

Watch the strip on the right. Scrolling doesn't step through images one at a time. It flows. The thumbnails near the center of the strip are slightly bigger and brighter than the rest. And when scrolling stops, the strip doesn't freeze wherever momentum left it. It eases into the nearest image.

I was building an image viewer for Batch Banana and wanted this exact feel. So I slowed the video down and started pulling it apart.

Pulling it apart

The size shift caught my eye first. I thought it was a hover effect, but it happens during scroll with no cursor near the strip. It's tied to position. Whichever thumbnail sits closest to the center of the viewport is the biggest. The ones a few spots away are slightly smaller. Everything beyond that rests at a baseline size. As you scroll, this "bright zone" slides along with you.

Then the snap. While scrolling, the strip moves freely. No resistance, no stepping, no pagination. But the moment the wheel goes idle, the strip eases to the nearest thumbnail. It never rests between two images.

One number drives everything

Start with the positioning. The thumbnails themselves don't move. They have fixed positions in a vertical list. What changes is which part of the list is visible in the viewport.

If each thumbnail is 48 pixels tall with 2 pixels of gap, the centers are easy to compute. The first thumbnail's center is at Y=24. The second is at Y=74. The third at Y=124. These never change. They're known the moment the list of images is known.
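As a quick check on those numbers, here is a minimal sketch of the center calculation. The function name and constants are my own, chosen to match the 48-pixel tile and 2-pixel gap in the example.

```typescript
// Assumed layout constants, taken from the example in the text.
const THUMB_SIZE = 48; // tile height in px
const GAP = 2;         // vertical gap between tiles in px

// Center Y of every tile in list space. Depends only on the tile count.
function tileCenters(count: number, thumbSize = THUMB_SIZE, gap = GAP): number[] {
  const pitch = thumbSize + gap; // distance between neighboring centers
  return Array.from({ length: count }, (_, i) => i * pitch + thumbSize / 2);
}

const firstThree = tileCenters(3);
console.log(firstThree); // [ 24, 74, 124 ]
```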

The part that moves is the offset: which Y coordinate in the list is currently at the center of the viewport. Call that offset focusY.

When focusY is 24, the first thumbnail sits in the middle of the screen. When focusY is 74, the second one does. Set focusY to any value between them, and the strip rests at that exact intermediate position. No snapping, no stepping. Continuous.

Now the scaling. The thumbnails near the focus are bigger. "Near" means within some distance in Y-space. A thumbnail whose center is 10 pixels from focusY is nearly full scale. A thumbnail 80 pixels away is at its resting size. So I need a radius: a zone around focusY where the scaling effect is active. Inside the zone, tiles scale up based on how close they are. Outside, they sit at baseline.

The snap runs on the same data. When the wheel goes idle, find the center closest to focusY and animate toward it.

Stripping it down

No scroll, no fish-eye. Just a slider that controls focusY, and fifty tiles that follow.

Drag the slider slowly. Watch the tiles shift. Now click any tile directly. focusY jumps to that tile's center and the rest of the rail rearranges around it.

The positioning math is one line:

y = tileCenter - focusY + viewportHeight / 2 - thumbSize / 2

Subtract the focus point from the tile's center to get its offset from the focal position. Add half the viewport height to anchor the focused tile in the middle of the visible area. Subtract half the thumbnail size to convert from the tile's center to its top edge, which is what y positions. Every tile runs this same formula, with only tileCenter changing.
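The formula is easy to sanity-check with a couple of numbers. The sketch below wraps it in a function; the 600-pixel viewport is an illustrative value, not one taken from the real component.

```typescript
// Top edge of a tile in viewport coordinates (px).
// Parameter names mirror the formula; the values used below are illustrative.
function tileTop(
  tileCenter: number,
  focusY: number,
  viewportHeight: number,
  thumbSize: number,
): number {
  return tileCenter - focusY + viewportHeight / 2 - thumbSize / 2;
}

// 600px viewport, 48px tiles, focus on the second tile (center 74).
// The focused tile is centered at y = 300, so its top edge is 300 - 24 = 276.
const focusedTop = tileTop(74, 74, 600, 48);   // 276
const neighborTop = tileTop(124, 74, 600, 48); // 326, one 50px step below
```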

This version uses React state for focusY. Each slider change triggers a re-render, and React recomputes every tile's position in JSX. At the speed a human drags a slider, that works fine.

Wiring up the scroll wheel

The slider proves the model works. The next step is to wire it to real input.

Wheel events replace the slider. Each event adjusts focusY by the scroll delta. The strip moves with the wheel, continuously, at whatever speed you scroll.
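A minimal sketch of that update, kept free of the DOM so it runs anywhere. The clamp to the first and last center is my assumption; the text doesn't say how the ends of the rail behave.

```typescript
// Move the focus by one wheel delta.
// Assumption: the focus is clamped so it never leaves the list.
function applyWheel(focusY: number, deltaY: number, centers: number[]): number {
  const min = centers[0];
  const max = centers[centers.length - 1];
  return Math.min(max, Math.max(min, focusY + deltaY));
}

const railCenters = [24, 74, 124, 174];
const afterSmallScroll = applyWheel(24, 30, railCenters);   // 54, between the first two tiles
const afterBigScroll = applyWheel(54, 500, railCenters);    // 174, clamped to the last center
const afterScrollBack = applyWheel(174, -900, railCenters); // 24, clamped to the first center

// In the browser this would hang off a wheel listener, roughly:
//   rail.addEventListener("wheel", (e) => {
//     e.preventDefault();
//     focusY = applyWheel(focusY, e.deltaY, railCenters);
//   }, { passive: false });
```

The listener in the comment assumes deltaY arrives in pixels. Devices that report lines or pages (see WheelEvent.deltaMode) would need a conversion first.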

But there's a performance question. Wheel events can fire 30 to 60 times per second. Each one changes focusY, which changes every tile's position. Running that through useState and React's render cycle means fifty component diffs per event. It works, but React's diffing is overhead you don't need when the only thing changing is a CSS transform.

Swap React state for a Motion MotionValue. A focusY.on("change") subscription runs on every value change and writes to each tile's el.style.transform directly. Same math, same positions, but React never re-renders during scroll.
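The pattern looks roughly like the sketch below. To keep it runnable without the library or a browser, FocusValue is a hand-rolled stand-in that mirrors only the get / set / on("change") shape of a MotionValue, and the tiles are plain objects instead of DOM elements. The sizes are hypothetical.

```typescript
// Stand-in for Motion's MotionValue so the sketch runs without the library.
class FocusValue {
  private value: number;
  private listeners = new Set<(v: number) => void>();
  constructor(initial: number) { this.value = initial; }
  get(): number { return this.value; }
  set(v: number): void {
    this.value = v;
    this.listeners.forEach((fn) => fn(v)); // notify subscribers synchronously
  }
  on(_event: "change", fn: (v: number) => void): () => void {
    this.listeners.add(fn);
    return () => { this.listeners.delete(fn); }; // unsubscribe handle
  }
}

// Minimal shape of what gets written to; in the browser this is el.style.
type TileStyle = { transform: string };

const VIEWPORT = 600; // hypothetical viewport height in px
const THUMB = 48;     // tile size in px
const tileYs = [24, 74, 124];
const tileStyles: TileStyle[] = tileYs.map(() => ({ transform: "" }));

const focus = new FocusValue(24);
const unsubscribe = focus.on("change", (f) => {
  tileYs.forEach((center, i) => {
    const y = center - f + VIEWPORT / 2 - THUMB / 2;
    tileStyles[i].transform = `translateY(${y}px)`; // direct write, no re-render
  });
});

focus.set(74); // second tile is now centered: translateY(276px)
unsubscribe();
```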

Adding the fish-eye

With scroll wired up, the strip positions correctly but every tile looks the same. There's no sense of where you are. The fish-eye gives spatial context.

How much should a tile scale up? It depends on one thing: distance from the focus. A tile sitting right at focusY gets the maximum boost, 1.25x. A tile 64 pixels away gets none. Everything in between scales linearly.

The same distance calculation drives opacity. At the focus: full opacity. At 64 pixels away or more: 50%. One distance measurement, two visual effects. The interpolation factor is t = max(0, 1 - distance / radius), where distance is the absolute gap between the tile's center and focusY. When t is 1 (right at focus), the tile is fully scaled and fully bright. When t is 0 (at or outside the radius), it's at baseline.
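Here is a small sketch of that mapping, using the numbers from the text (64-pixel radius, 1.25x peak scale, 50% floor on opacity). The baseline scale of 1 and the function name are my assumptions.

```typescript
const RADIUS = 64;       // px, fish-eye radius
const MAX_SCALE = 1.25;  // scale at the focus
const MIN_OPACITY = 0.5; // opacity at and beyond the radius

function fishEye(tileCenter: number, focusY: number) {
  const distance = Math.abs(tileCenter - focusY);
  const t = Math.max(0, 1 - distance / RADIUS); // 1 at the focus, 0 outside the radius
  return {
    t,
    scale: 1 + (MAX_SCALE - 1) * t,
    opacity: MIN_OPACITY + (1 - MIN_OPACITY) * t,
  };
}

const atFocus = fishEye(74, 74);   // { t: 1, scale: 1.25, opacity: 1 }
const halfway = fishEye(74, 106);  // distance 32: { t: 0.5, scale: 1.125, opacity: 0.75 }
const farAway = fishEye(74, 200);  // outside the radius: { t: 0, scale: 1, opacity: 0.5 }
```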

Scroll through the demo below. Your eye follows the bright zone as it slides through the strip. Two or three tiles are "awake" at any moment, and they shift as the focus moves.

Now try stopping mid-way between two tiles. The focus lands wherever momentum left it. The fish-eye splits across neighbors, neither tile is fully scaled, and the rail looks unsettled. No image is clearly active.

Snap on idle

After the last wheel event, a 120-millisecond idle timer starts. If no new scroll input arrives, a short tween animates focusY to the nearest tile center. The rail locks onto a clean position, and one tile is unambiguously active.

If you scroll again before the tween finishes, it cancels immediately and the rail goes back to following the wheel.
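A rough sketch of the idle-then-snap logic, under a few assumptions: the controller shape and the onSnap callback are hypothetical, and the tween itself is left to the caller, which returns a function that cancels it.

```typescript
const IDLE_MS = 120; // idle delay from the text

// Index of the center closest to focusY (linear scan; fine for a short list).
// On an exact tie the lower index wins.
function nearestIndex(centers: number[], focusY: number): number {
  let best = 0;
  for (let i = 1; i < centers.length; i++) {
    if (Math.abs(centers[i] - focusY) < Math.abs(centers[best] - focusY)) best = i;
  }
  return best;
}

// Hypothetical controller. onSnap receives the target Y, starts the tween,
// and returns a function that cancels it.
function createIdleSnap(centers: number[], onSnap: (targetY: number) => () => void) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  let cancelTween: (() => void) | undefined;
  return {
    // Call on every wheel event with the updated focusY.
    onWheel(focusY: number): void {
      cancelTween?.();     // new input interrupts a running snap
      cancelTween = undefined;
      clearTimeout(timer); // restart the idle countdown
      timer = setTimeout(() => {
        cancelTween = onSnap(centers[nearestIndex(centers, focusY)]);
      }, IDLE_MS);
    },
  };
}
```

With Motion, onSnap could plausibly wrap an animate call on the focusY value and return the animation's stop method, but that wiring isn't shown here.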

Complete

Toggle X-Ray and scroll slowly. The dashed lines mark the 64-pixel fish-eye radius centered on the viewport. The numbers on each tile show their live scale values. Watch them change as the focus passes through. The readout in the footer tracks the raw focusY coordinate and the current active index.

What's left for production

This demo covers the interaction model. The production version in Batch Banana adds a few things on top.

Virtualization. Fifty tiles are fine. Five hundred aren't. The real component only mounts tiles inside the viewport plus a small overscan buffer. A visibleSlice function runs a binary search on the sorted centers array each tick to compute the range. React re-renders only when that range changes, not on every scroll frame.
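The range calculation might look something like the sketch below. Only the visibleSlice name comes from the component; the signature, the half-open range, and the default overscan of two tiles are my assumptions.

```typescript
// First index whose center is >= y (lower-bound binary search on a sorted array).
function lowerBound(centers: number[], y: number): number {
  let lo = 0;
  let hi = centers.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (centers[mid] < y) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Half-open index range [start, end) of tiles to mount.
function visibleSlice(
  centers: number[],
  focusY: number,
  viewportHeight: number,
  thumbSize: number,
  overscan = 2, // assumed buffer, in tiles, on each side
): { start: number; end: number } {
  // A tile overlaps the viewport when its center is within this distance of the focus.
  const reach = viewportHeight / 2 + thumbSize / 2;
  const start = lowerBound(centers, focusY - reach);
  const end = lowerBound(centers, focusY + reach);
  return {
    start: Math.max(0, start - overscan),
    end: Math.min(centers.length, end + overscan),
  };
}

const fiftyCenters = Array.from({ length: 50 }, (_, i) => i * 50 + 24);
const midSlice = visibleSlice(fiftyCenters, 1024, 600, 48); // { start: 12, end: 29 }
const topSlice = visibleSlice(fiftyCenters, 24, 600, 48);   // { start: 0, end: 9 }
```

Comparing the previous and next { start, end } pair is then enough to decide whether a re-render is needed.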

External sync. The lightbox has other navigation sources: clicking a sidebar item, a context menu action, keyboard shortcuts. Any of these can change the current index from outside the rail. The rail watches for external index changes and jumps focusY to match, but it tracks which changes it initiated itself (via an emittedRef) to avoid fighting its own scroll input.
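One way to sketch that bookkeeping outside React: a small helper where the emitted variable plays the role of emittedRef. The helper name and callbacks are hypothetical, and the real component may track this differently.

```typescript
// Hypothetical sync helper. `emitted` remembers the last index the rail itself reported.
function createRailSync(
  centers: number[],
  setFocusY: (y: number) => void,
  onIndexChange: (index: number) => void,
) {
  let emitted: number | null = null;
  return {
    // The rail settled on a tile through its own scroll or snap.
    emit(index: number): void {
      emitted = index;
      onIndexChange(index);
    },
    // The shared current index changed (any source, including the echo of emit).
    receive(index: number): void {
      if (index === emitted) return; // our own change coming back: ignore it
      emitted = null;
      setFocusY(centers[index]);     // external change: jump to that tile
    },
  };
}

const jumps: number[] = [];
const sync = createRailSync([24, 74, 124], (y) => { jumps.push(y); }, () => {});
sync.emit(1);    // rail snapped to tile 1
sync.receive(1); // the echo of that change: ignored
sync.receive(2); // keyboard or sidebar navigation: focus jumps to 124
// jumps is now [124]
```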

Grouped thumbnails. Images in Batch Banana are grouped by generation. The rail spaces groups apart with a larger gap and rounds the first and last tile in each group. The layout math accounts for variable gaps between groups.
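With variable gaps the centers are no longer a simple multiple of the tile pitch, so they have to be accumulated. A minimal sketch, assuming groups are described by their tile counts and using a hypothetical 12-pixel gap between groups:

```typescript
// Centers for tiles grouped by generation.
// groupSizes lists how many tiles each group has; groupGap is a made-up value.
function groupedCenters(
  groupSizes: number[],
  thumbSize = 48,
  gap = 2,
  groupGap = 12,
): number[] {
  const centers: number[] = [];
  let y = 0; // top edge of the next tile
  groupSizes.forEach((size, g) => {
    if (g > 0) y += groupGap - gap; // widen the regular gap at a group boundary
    for (let i = 0; i < size; i++) {
      centers.push(y + thumbSize / 2);
      y += thumbSize + gap;
    }
  });
  return centers;
}

const twoGroups = groupedCenters([2, 2]); // [24, 74, 134, 184]
```

The result is still a sorted array of centers, so positioning, the fish-eye, and the snap can all consume it unchanged.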

The full component is on GitHub if you want to read through it.