From streaming thumbnails to pricing widgets, adaptive bandits chase rewards in real time. Let’s test how well you know the algorithms behind the buzz.
A multi‑armed bandit balances exploration with ______ to maximise cumulative reward.
denormalisation
exploitation
pagination
compression
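For readers who want to see that trade-off in code, here is a minimal epsilon-greedy sketch. The `true_rates` values are made-up conversion rates purely for illustration: with probability epsilon the agent explores a random arm, otherwise it exploits the best arm seen so far.

```python
import random

# Minimal epsilon-greedy sketch of the explore/exploit trade-off (illustrative only).
# `true_rates` are made-up conversion rates for three hypothetical variants.
true_rates = [0.04, 0.05, 0.07]
epsilon = 0.1
counts = [0] * len(true_rates)
means = [0.0] * len(true_rates)

for t in range(10_000):
    if random.random() < epsilon:
        arm = random.randrange(len(true_rates))                     # explore: random arm
    else:
        arm = max(range(len(true_rates)), key=lambda i: means[i])   # exploit: best so far
    reward = 1 if random.random() < true_rates[arm] else 0
    counts[arm] += 1
    means[arm] += (reward - means[arm]) / counts[arm]               # running mean update

print("pulls per arm:", counts)
```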
The regret of a bandit algorithm measures the gap between the reward it actually earned and the reward it would have earned by always playing the ______ arm.
inactive
deleted
optimal
random
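As a reminder, one common way to write expected (pseudo-)regret over T rounds, where \(\mu^{*}\) is the best arm's mean reward and \(a_t\) is the arm played at round \(t\):

```latex
R_T = \sum_{t=1}^{T} \left( \mu^{*} - \mu_{a_t} \right)
```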
Thompson Sampling draws a random value from each arm’s ______ distribution to select the next arm to play.
sprite sheet
progressive JPEG
posterior
DNS cache
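For the curious, a minimal Beta-Bernoulli Thompson Sampling sketch. The `true_rates` are invented conversion rates; the point is simply that each arm's Beta distribution is sampled once per round and the highest draw wins.

```python
import random

# Minimal Beta-Bernoulli Thompson Sampling sketch (illustrative only).
# `true_rates` are made-up; in practice rewards come from live traffic.
true_rates = [0.04, 0.05, 0.07]
alpha = [1.0] * len(true_rates)  # 1 + observed successes per arm
beta = [1.0] * len(true_rates)   # 1 + observed failures per arm

for t in range(10_000):
    # Sample each arm's Beta distribution and play the arm with the highest draw.
    samples = [random.betavariate(alpha[i], beta[i]) for i in range(len(true_rates))]
    arm = max(range(len(true_rates)), key=lambda i: samples[i])
    reward = 1 if random.random() < true_rates[arm] else 0
    alpha[arm] += reward
    beta[arm] += 1 - reward

print("estimated rates:", [round(a / (a + b), 3) for a, b in zip(alpha, beta)])
```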
Upper Confidence Bound (UCB) methods add an uncertainty bonus to each arm’s mean to favour those with ______ data.
compressed
fewer DOM nodes
less
TLS
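And a minimal UCB1 sketch along the same lines, again with made-up `true_rates`. The bonus term grows with the total number of rounds and shrinks as an arm accumulates pulls, which nudges the algorithm toward under-sampled arms.

```python
import math
import random

# Minimal UCB1 sketch (illustrative only); `true_rates` are made-up.
true_rates = [0.04, 0.05, 0.07]
counts = [0] * len(true_rates)
means = [0.0] * len(true_rates)

for t in range(1, 10_001):
    if 0 in counts:
        arm = counts.index(0)  # play each arm once before applying the bonus
    else:
        # Mean plus an uncertainty bonus that shrinks as an arm gathers data.
        ucb = [means[i] + math.sqrt(2 * math.log(t) / counts[i])
               for i in range(len(true_rates))]
        arm = max(range(len(true_rates)), key=lambda i: ucb[i])
    reward = 1 if random.random() < true_rates[arm] else 0
    counts[arm] += 1
    means[arm] += (reward - means[arm]) / counts[arm]

print("pulls per arm:", counts)
```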
Business Insider’s 2025 report says HBO Max uses a bandit to optimise ______ images shown to viewers.
thumbnail
alt‑text
DNS A‑records
subtitle
Contextual bandits extend the classic model by incorporating ______ features when choosing an arm.
checksum
ISO date
voltage
user or session
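A tiny illustration of one way to fold in context: keep a separate Beta posterior per (segment, arm) pair. The segment names and arms here are hypothetical placeholders.

```python
import random

# Sketch: a contextual bandit with one Beta posterior per (user segment, arm) pair.
# Segments and arms are hypothetical; real features would come from the request context.
segments = ["mobile", "desktop"]
arms = ["variant_a", "variant_b"]
posterior = {(s, a): [1.0, 1.0] for s in segments for a in arms}  # [alpha, beta]

def choose(segment):
    # Thompson Sampling restricted to the observed context.
    draws = {a: random.betavariate(*posterior[(segment, a)]) for a in arms}
    return max(draws, key=draws.get)

def update(segment, arm, reward):
    posterior[(segment, arm)][0] += reward
    posterior[(segment, arm)][1] += 1 - reward
```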
In a rapidly changing market, sliding‑window Thompson Sampling handles ______ environments better than the standard algorithm.
lossless
deterministic
monolithic
non‑stationary
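A sketch of the sliding-window idea: only the most recent observations feed each arm's posterior, so stale evidence ages out as reward rates drift. The window size here is arbitrary.

```python
import random
from collections import deque

# Sliding-window Thompson Sampling sketch: only the last `window` observations
# shape each arm's posterior, so old evidence ages out when reward rates drift.
window = 1_000
history = deque(maxlen=window)  # recent (arm, reward) pairs
n_arms = 3

def choose():
    alpha = [1.0] * n_arms
    beta = [1.0] * n_arms
    for arm, reward in history:          # rebuild posteriors from the window only
        alpha[arm] += reward
        beta[arm] += 1 - reward
    draws = [random.betavariate(alpha[i], beta[i]) for i in range(n_arms)]
    return max(range(n_arms), key=lambda i: draws[i])

def update(arm, reward):
    history.append((arm, reward))
```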
Compared with a 50‑50 A/B split, bandits usually expose fewer users to ______ variants.
losing
XML
SSL
PNG
The explore‑then‑commit strategy runs a short exploration phase and then ______.
locks in the current best arm
restarts the test hourly
drops all arms
switches to random
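Finally, a minimal explore-then-commit sketch, once more with invented `true_rates`: a round-robin exploration phase of m pulls per arm, followed by committing to the empirically best arm for the rest of the horizon.

```python
import random

# Explore-then-commit sketch (illustrative only); `true_rates` are made-up.
true_rates = [0.04, 0.05, 0.07]
m, horizon = 200, 10_000
counts = [0] * len(true_rates)
means = [0.0] * len(true_rates)

for t in range(horizon):
    if t < m * len(true_rates):
        arm = t % len(true_rates)                                   # explore: round-robin
    else:
        arm = max(range(len(true_rates)), key=lambda i: means[i])   # commit: best so far
    reward = 1 if random.random() < true_rates[arm] else 0
    counts[arm] += 1
    means[arm] += (reward - means[arm]) / counts[arm]

print("pulls per arm:", counts)
```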
A key downside of bandits versus classic hypothesis tests is the difficulty of computing simple ______ values.
RGB
p‑
SHA‑256
TTL
Starter
You’re new to multi‑armed bandit algorithms. Revisit the fundamentals and try running a few simple tests to build confidence.
Solid
Solid grasp of multi‑armed bandit concepts; refine the details with more hands‑on practice.
Expert!
Expert level! You can design, run, and interpret advanced multi‑armed bandit experiments like a pro.