user-activation-v2

User Activation v2 (UAv2)

A frame-hierarchy based model to track active user interaction.

Introduction

What’s a user activation?

The term user activation means the state of a browsing session with respect to user actions: an “active” state typically implies that either the user is currently interacting with the page through some input mechanism (typing, clicking with a mouse, etc.), or the user has completed some interaction since the page loaded. (User gesture is a misleading term occasionally used to express the same idea, e.g. “allowing something only with a user gesture”, even though a swipe gesture doesn’t typically activate a page.)

Browsers control access to “abusable” APIs through user activation. The most obvious example of such an API is opening popups through window.open(): when rogue developers started to abuse the API to open popups arbitrarily, most (all?) browsers began to block popups when the user is not actively interacting with the page. Since then, browsers have gradually made many other APIs dependent on activation (more precisely, made them user activation gated), such as making an element fullscreen, vibrating a mobile device, and autoplaying media. To highlight the scope, ~30 different APIs in Chrome are user activation gated.
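To make the gating concrete, here is a minimal sketch of how page code can cope with a blocked popup. It relies on window.open() typically returning null when the browser blocks the call. The helper name openOrFallback and the PopupOpener type are hypothetical, and the opener is injected so the sketch runs outside a browser.

```typescript
// Hypothetical helper, not part of any browser API.
// The opener is injected so the sketch does not depend on a real window object.
type PopupOpener = (url: string) => object | null;

function openOrFallback(
  open: PopupOpener,
  url: string,
  onBlocked: (url: string) => void
): boolean {
  const handle = open(url);
  if (handle === null) {
    // Assumption: a null result means the popup blocker rejected the call,
    // for example because there was no user activation.
    onBlocked(url);
    return false;
  }
  return true;
}

// Stand-ins for a browser that blocks and one that allows the popup.
const blockingOpener: PopupOpener = () => null;
const allowingOpener: PopupOpener = () => ({});

const blockedUrls: string[] = [];
const blockedResult = openOrFallback(
  blockingOpener,
  "https://example.com/",
  (u) => blockedUrls.push(u)
);
const allowedResult = openOrFallback(
  allowingOpener,
  "https://example.com/",
  (u) => blockedUrls.push(u)
);
```

In a page, the opener would be something like `(u) => window.open(u)`. Calling the helper from inside a click handler is the case browsers are most likely to allow.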

What’s the problem today?

The Web is in a terrible state today in terms of user activation behavior. Because each browser has incrementally added user activation dependence to its own set of APIs over the course of many years, we see widely divergent behavior among major browsers. For example, popup-blocking behavior is inconsistent among major browsers for all non-trivial cases of user activation.

More importantly, the current HTML spec can’t really fix the broken situation in the Web today because it lacks important details and doesn’t fully reflect any current implementation.

How are we proposing to solve the problem?

User Activation v2 (UAv2) introduces a new user activation model that is simple enough for cross-browser implementation, and hence calls for a new spec from scratch as a long-term fix for the Web. We prototyped the model in Chromium behind the flag --enable-features=UserActivationV2 in M67.

Details of the new model

Two-bit state per frame

The new model maintains a two-bit user activation state at every window object in the frame hierarchy:

State propagation across frames

Major functional changes

In Chromium, the main change introduced by this model is replacing stack-allocated per-process gesture tokens with per-frame states as described above. This effectively:

  1. removes the need for token storing/passing/syncing for every user API,

  2. changes activation visibility from stack-scoped to frame-scoped, and

  3. fuses multiple user interactions within the expiry time interval into a single activation.
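The frame-scoped visibility and the fusing of interactions can be illustrated with a small single-frame model. This is only a sketch under stated assumptions, not Chromium's implementation: the class and method names are hypothetical, the two bits are assumed to be a "sticky" bit and a "transient" bit (along the lines of hasBeenActive and isActive on navigator.userActivation), and the 1000 ms expiry interval is chosen only for the example.

```typescript
// Illustrative model only; all names are hypothetical, not Chromium's classes.
const EXPIRY_MS = 1000; // assumed expiry interval, chosen for this example

class FrameActivationState {
  private sticky = false;            // assumed "sticky" bit: set once, never reset
  private activeUntilMs = -Infinity; // assumed "transient" bit, stored as a deadline

  // Called when the user interacts with the frame at time nowMs.
  notifyActivation(nowMs: number): void {
    this.sticky = true;
    // A later interaction only moves the deadline; no extra token is created.
    this.activeUntilMs = nowMs + EXPIRY_MS;
  }

  // Frame-scoped check: any code in the frame can ask, not only the
  // event handler that is on the stack during the interaction.
  isActive(nowMs: number): boolean {
    return nowMs < this.activeUntilMs;
  }

  hasSeenActivation(): boolean {
    return this.sticky;
  }

  // Clears the transient bit; returns whether there was anything to consume.
  consume(nowMs: number): boolean {
    if (!this.isActive(nowMs)) {
      return false;
    }
    this.activeUntilMs = -Infinity;
    return true;
  }
}

const frame = new FrameActivationState();
frame.notifyActivation(0);   // first click
frame.notifyActivation(200); // second click, inside the expiry interval

const firstConsume = frame.consume(300);  // true: the activation is live
const secondConsume = frame.consume(400); // false: both clicks fused into one
```

Under a token-per-gesture scheme the two clicks would have produced two separately consumable tokens. In this model the second call to consume() fails, which is the fusing described in item 3.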

Design docs

For further details on the model and Chromium implementation, see:

Classifying user activation gated APIs

Modern browsers already show different levels of activation-dependence for activation-aware APIs, and the Web needs a spec for this behavior. The UAv2 model induces a classification of user APIs into three distinct levels, making it easy for any user API to spec its activation-dependence in a concise yet precise manner. The levels are as follows, sorted by their “strength of dependence” on user activation (from strongest to weakest):

Our prototype implementation preserved all the APIs’ past behavior in Chromium after a few (mostly minor) changes.

Demo

Compare these demos in Chrome 72+ vs. in all other browsers to see why UAv2 makes sense.