A frame-hierarchy based model to track active user interaction.
:warning: This page is no longer maintained! :warning:
This is now part of the HTML spec (see the section on tracking user activation) and has shipped in Chrome 72.
The term user activation means the state of a browsing session with respect to user actions: an “active” state typically implies either that the user is currently interacting with the page through some input mechanism (typing, clicking with a mouse, etc.), or that the user has completed some interaction since the page was loaded. (User gesture is a misleading term occasionally used to express the same idea, e.g. “allowing something only with a user gesture”, even though a swipe gesture doesn’t typically activate a page.)
Browsers control access to “abusable” APIs through user activation. The most obvious example of such an API is opening popups through `window.open()`: when rogue developers started to abuse the API to open popups arbitrarily, most (all?) browsers began to block popups when the user is not actively interacting with the page. Since then, browsers have gradually made many other APIs dependent on activation (more precisely, made them user activation gated), like making an element fullscreen, vibrating a mobile device, autoplaying media, etc. To highlight the scope, ~30 different APIs in Chrome are user activation gated.
The Web is in a terrible state today in terms of user activation behavior. Because each browser has incrementally added user activation dependence to its own set of APIs over the course of many years, we see widely divergent behavior among major browsers. For example, popup-blocking behavior is inconsistent among major browsers for all non-trivial cases of user activation.
More importantly, the current HTML spec can’t really fix the broken situation on the Web today because it lacks important details and doesn’t fully reflect any current implementation.
User Activation v2 (UAv2) introduces a new user activation model that is simple enough for cross-browser implementation, and hence calls for a new spec from scratch as a long-term fix for the Web. We prototyped the model in Chromium behind the flag `--enable-features=UserActivationV2` in M67.
The new model maintains a two-bit user activation state at every `window` object in the frame hierarchy:

- `HasSeenUserActivation`: This is a sticky bit for the APIs that only need a signal on historical user activation. The bit gets set on the first user action, and is never reset during the lifetime of the `window` object. Example APIs: `<video>` autoplay and `Navigator.vibrate()`.
- `HasConsumableUserActivation`: This is a transient bit for the APIs that need limited invocation per user interaction. The bit gets set on every user interaction, and is reset either after an expiry time defined by the browser or through a call to an activation-consuming API (e.g. `window.open()`).
- Any user interaction in a `window` object sets the activation bits in the `window` objects of all ancestor frames (including the `window` being interacted with). (See Related Links below for an API to modify this default behavior.)
- Any consumption of the transient bit resets the transient bits in the `window` objects of the whole frame tree.
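The two-bit state and the two propagation rules above can be sketched as a small simulation. This is only an illustrative model, not Chromium's implementation: the `FrameNode` class, its method names, and the example frame tree are all hypothetical, and the browser-defined expiry of the transient bit is left out for brevity.

```javascript
// Hypothetical sketch of the UAv2 state model. Names are illustrative only
// and do not mirror Chromium internals. Transient-bit expiry is omitted.
class FrameNode {
  constructor(name, parent = null) {
    this.name = name;
    this.parent = parent;
    this.children = [];
    this.hasSeenUserActivation = false;       // sticky bit
    this.hasConsumableUserActivation = false; // transient bit
    if (parent) parent.children.push(this);
  }

  // A user interaction sets both bits on this frame and on all its ancestors.
  notifyUserInteraction() {
    for (let node = this; node !== null; node = node.parent) {
      node.hasSeenUserActivation = true;
      node.hasConsumableUserActivation = true;
    }
  }

  // Consuming the transient bit clears it across the whole frame tree.
  // Returns whether this frame had a transient activation to consume.
  consumeUserActivation() {
    const hadActivation = this.hasConsumableUserActivation;
    if (hadActivation) this.root().clearTransientBits();
    return hadActivation;
  }

  root() {
    let node = this;
    while (node.parent !== null) node = node.parent;
    return node;
  }

  clearTransientBits() {
    this.hasConsumableUserActivation = false;
    for (const child of this.children) child.clearTransientBits();
  }
}

// Example tree: mainFrame -> childFrame -> grandchildFrame, plus siblingFrame.
const mainFrame = new FrameNode("main");
const childFrame = new FrameNode("child", mainFrame);
const grandchildFrame = new FrameNode("grandchild", childFrame);
const siblingFrame = new FrameNode("sibling", mainFrame);

// The user clicks inside childFrame: childFrame and mainFrame become active,
// while grandchildFrame and siblingFrame are left untouched.
childFrame.notifyUserInteraction();
```

In this sketch, a later `childFrame.consumeUserActivation()` clears the transient bit on every frame in the tree, while the sticky bits stay set.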
In Chromium, the main change introduced by this model is replacing stack-allocated per-process gesture tokens with per-frame states as described above. This effectively:
- removes the need for token storing/passing/syncing for every user API,
- changes activation visibility from stack-scoped to frame-scoped, and
- fuses multiple user interactions within the expiry time interval into a single activation.
For further details on the model and Chromium implementation, see:
Modern browsers already show different levels of activation-dependence for activation-aware APIs, and the Web needs a spec for this behavior. The UAv2 model induces a classification of user APIs into three distinct levels, making it easy for any user API to spec its activation-dependence in a concise yet precise manner. The levels are as follows, sorted by their “strength of dependence” on user activation (from strongest to weakest):
- Transient activation consuming APIs: These APIs require the transient bit, and they consume the bit in each call to prevent multiple calls per user activation. E.g. `window.open()` in most (all?) browsers today.
- Transient activation gated APIs: These APIs require the transient bit but don’t consume it, so multiple calls are allowed per user activation until the transient bit expires. E.g. `Element.requestFullscreen()` in Chromium and many other browsers.
- Sticky activation gated APIs: These APIs require the sticky activation bit, so they are blocked until the very first user activation. E.g. `<video>` autoplay and `Navigator.vibrate()` in Chromium.
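As a rough illustration of how the three levels differ, the sketch below gates a call on a single frame's state. The `ActivationLevel` constants, the `tryInvoke` helper, and the `{ sticky, transient }` state object are hypothetical names chosen for this example; frame-tree propagation and expiry are ignored here.

```javascript
// Hypothetical sketch: how each level checks (and possibly consumes) the
// activation state of one frame. Not an actual browser API.
const ActivationLevel = Object.freeze({
  TRANSIENT_CONSUMING: "transient-consuming", // e.g. window.open()
  TRANSIENT_GATED: "transient-gated",         // e.g. Element.requestFullscreen()
  STICKY_GATED: "sticky-gated",               // e.g. Navigator.vibrate()
});

// Returns true if a call at the given level would be allowed.
function tryInvoke(level, state) {
  switch (level) {
    case ActivationLevel.TRANSIENT_CONSUMING:
      if (!state.transient) return false;
      state.transient = false; // the call consumes the transient bit
      return true;
    case ActivationLevel.TRANSIENT_GATED:
      return state.transient;  // requires the bit but leaves it set
    case ActivationLevel.STICKY_GATED:
      return state.sticky;     // only needs a historical activation
    default:
      throw new Error(`Unknown activation level: ${level}`);
  }
}

const state = { sticky: false, transient: false };

// Before any interaction, even the weakest level is blocked.
const allowedBeforeInteraction = tryInvoke(ActivationLevel.STICKY_GATED, state);

// Simulate one user interaction.
state.sticky = true;
state.transient = true;

const gatedFirst = tryInvoke(ActivationLevel.TRANSIENT_GATED, state);
const gatedSecond = tryInvoke(ActivationLevel.TRANSIENT_GATED, state);
const consumingFirst = tryInvoke(ActivationLevel.TRANSIENT_CONSUMING, state);
const consumingSecond = tryInvoke(ActivationLevel.TRANSIENT_CONSUMING, state);
const gatedAfterConsume = tryInvoke(ActivationLevel.TRANSIENT_GATED, state);
const stickyAfterConsume = tryInvoke(ActivationLevel.STICKY_GATED, state);
```

After one interaction, the gated calls succeed repeatedly, the consuming call succeeds only once, and the sticky-gated call keeps succeeding even after the transient bit is gone.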
Our prototype implementation preserved all the APIs’ past behavior in Chromium after a few (mostly minor) changes.
Compare these demos in Chrome 72+ vs. in all other browsers to see why UAv2 makes sense.
- User activation propagation in the frame tree:
  - Live UAv2 states in the frame tree: Shows live UAv2 state changes across the frame tree. Sorry, works only in Chrome because of this feature.
  - Test activation propagation with popups: Shows how UAv2 states change across the frame tree through user interaction and subsequent consumption through popups.
- Consistent availability of user activation state through API chaining:
  - UAv2 with setTimeouts: Shows consistency through chaining of `setTimeout()` calls.
  - UAv2 with postMessages to parent: Shows consistency through multiple child-to-parent `postMessage()` calls.
  - UAv2 with postMessages to child: Shows consistency through multiple parent-to-child `postMessage()` calls.
- Activation Transfer through postMessages: An API to allow developers to transfer user activation state to any target `window` in the frame tree.
- JS API for querying User Activation states: An API to determine the state of user activation. This is independent of, but somewhat related to, UAv2.
- Determining activation-defining events: Used for cataloging differences among major browsers in the set of events that define activation.
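As a small illustration of the querying API listed above: the HTML spec exposes the two states as `navigator.userActivation.hasBeenActive` (sticky) and `navigator.userActivation.isActive` (transient). The sketch below wraps them with feature detection; the `readUserActivation` helper and the shape of its return value are assumptions made for this example, not part of any spec.

```javascript
// Hypothetical helper: reads the activation state from a navigator-like
// object, returning null when the userActivation property is unavailable.
function readUserActivation(nav) {
  const userActivation = nav && nav.userActivation;
  if (!userActivation) return null;
  return {
    sticky: userActivation.hasBeenActive, // has the user ever interacted?
    transient: userActivation.isActive,   // is there a live, unexpired activation?
  };
}

// In a page this would read the real state; outside a browser (or in a
// browser without the API) it logs null.
const currentActivation = readUserActivation(
  typeof navigator !== "undefined" ? navigator : undefined
);
console.log(currentActivation);
```

Passing the navigator object in as a parameter keeps the helper usable in environments where `navigator.userActivation` does not exist.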