Picture-in-Picture

Editor’s Draft

This version:
https://wicg.github.io/picture-in-picture
Issue Tracking:
GitHub
Editors:
( Google LLC )
( Google LLC )
Web Platform Tests:
feature-policy/
picture-in-picture/
Not Ready For Implementation

This spec is not yet ready for implementation. It exists in this repository to record the ideas and promote discussion.

Before attempting to implement this spec, please contact the editors.


Abstract

This specification intends to provide APIs that allow websites to create a floating video window that stays on top of other windows, so that users may continue consuming media while they interact with other content, sites, or applications on their device.

Status of this document

This specification was published by the Web Platform Incubator Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.

1. Introduction

This section is non-normative.

Many users want to continue consuming media while they interact with other content, sites, or applications on their device. A common UI affordance for this type of activity is Picture-in-Picture (PiP), where the video is contained in a separate miniature window that is always on top of other windows. Picture-in-Picture is a common platform-level feature among desktop and mobile OSs.

This specification aims to allow websites to initiate and control this behavior by letting them request Picture-in-Picture for a video element, exit Picture-in-Picture, check whether Picture-in-Picture is available, and be notified when a video enters or leaves Picture-in-Picture.

The proposed Picture-in-Picture API is very similar to [Fullscreen], as the two features have similar properties. The API currently applies only to HTMLVideoElement but is meant to be extensible.

2. Examples

2.1. Add a custom Picture-in-Picture button

<video id="video" src="https://example.com/file.mp4"></video>
<button id="pipButton"></button>
<script>
  // Hide button if Picture-in-Picture is not supported or disabled.
  pipButton.hidden = !document.pictureInPictureEnabled || video.disablePictureInPicture;
  pipButton.addEventListener('click', function() {
    // If there is no element in Picture-in-Picture yet, let’s request
    // Picture-in-Picture for the video, otherwise leave it.
    if (!document.pictureInPictureElement) {
      video.requestPictureInPicture()
      .catch(error => {
        // Video failed to enter Picture-in-Picture mode.
      });
    } else {
      document.exitPictureInPicture()
      .catch(error => {
        // Video failed to leave Picture-in-Picture mode.
      });
    }
  });
</script>

2.2. Monitor video Picture-in-Picture changes

<video id="video" src="https://example.com/file.mp4"></video>
<script>
  video.addEventListener('enterpictureinpicture', function() {
    // Video entered Picture-in-Picture mode.
  });
  video.addEventListener('leavepictureinpicture', function() {
    // Video left Picture-in-Picture mode.
  });
</script>

2.3. Update video size based on Picture-in-Picture window size changes

<video id="video" src="https://example.com/file.mp4"></video>
<button id="pipButton"></button>
<script>
  pipButton.addEventListener('click', function() {
    video.requestPictureInPicture()
    .then(pipWindow => {
      updateVideoSize(pipWindow.width, pipWindow.height);
      pipWindow.addEventListener('resize', function(event) {
        updateVideoSize(pipWindow.width, pipWindow.height);
      });
    })
    .catch(error => {
      // Video failed to enter Picture-in-Picture mode.
    });
  });
  function updateVideoSize(width, height) {
    // TODO: Update video size based on pip window width and height.
  }
</script>

3. Concepts

3.1. Request Picture-in-Picture

When the request Picture-in-Picture algorithm with video is invoked, the user agent MUST run the following steps:

  1. If Picture-in-Picture support is false, throw a NotSupportedError and abort these steps.

  2. If document is not allowed to use the policy-controlled feature named "picture-in-picture", throw a SecurityError and abort these steps.

  3. If video’s readyState attribute is HAVE_NOTHING, throw an InvalidStateError and abort these steps.

  4. If video has no video track, throw an InvalidStateError and abort these steps.

  5. OPTIONALLY, if the disablePictureInPicture attribute is present on video, throw an InvalidStateError and abort these steps.

  6. If the algorithm is not triggered by user activation, throw a NotAllowedError and abort these steps.

  7. If video is pictureInPictureElement, abort these steps.

  8. Set pictureInPictureElement to video.

  9. Let Picture-in-Picture window be a new instance of PictureInPictureWindow associated with pictureInPictureElement.

  10. Queue a task to fire an event with the name enterpictureinpicture at the video with its bubbles attribute initialized to true.

It is RECOMMENDED that the video frames are not rendered in the page and in the Picture-in-Picture window at the same time, but if they are, they MUST be kept in sync.

When a video is played in Picture-in-Picture, the states SHOULD transition as if it were played inline. That means that the events SHOULD fire at the same time, calling methods SHOULD have the same behaviour, etc. However, the user agent MAY transition out of Picture-in-Picture when the video element enters a state that is considered not compatible with Picture-in-Picture.
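This example is non-normative. The order of the checks in steps 1 to 6 above determines which exception a page observes when several conditions fail at once. The following is a minimal sketch of that ordering as a plain function. It is not an implementation of the algorithm: the function name and every field of its argument are hypothetical stand-ins for user agent state.

```javascript
// readyState value of HAVE_NOTHING on a media element.
const HAVE_NOTHING = 0;

// Returns the name of the exception that steps 1-6 would throw, or null
// if the request may proceed to step 7.
// All field names on `state` are hypothetical, for illustration only.
function requestPreconditionError(state) {
  if (!state.pictureInPictureSupport) return 'NotSupportedError';    // step 1
  if (!state.allowedByPolicy) return 'SecurityError';                // step 2
  if (state.readyState === HAVE_NOTHING) return 'InvalidStateError'; // step 3
  if (!state.hasVideoTrack) return 'InvalidStateError';              // step 4
  // Step 5 is optional: a user agent may choose not to perform this check.
  if (state.disablePictureInPicture) return 'InvalidStateError';     // step 5
  if (!state.triggeredByUserActivation) return 'NotAllowedError';    // step 6
  return null;
}

// A state in which every precondition holds.
const ready = {
  pictureInPictureSupport: true,
  allowedByPolicy: true,
  readyState: 1, // HAVE_METADATA
  hasVideoTrack: true,
  disablePictureInPicture: false,
  triggeredByUserActivation: true,
};

console.log(requestPreconditionError(ready));
console.log(requestPreconditionError({ ...ready, triggeredByUserActivation: false }));
```

Because the policy check precedes the user activation check, a page embedded without the "picture-in-picture" feature would see a SecurityError even when the request also lacks user activation.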

3.2. Exit Picture-in-Picture

When the exit Picture-in-Picture algorithm is invoked, the user agent MUST run the following steps:

  1. If pictureInPictureElement is null, throw an InvalidStateError and abort these steps.

  2. Run the close window algorithm with the Picture-in-Picture window associated with pictureInPictureElement.

  3. Unset pictureInPictureElement.

  4. Queue a task to fire an event with the name leavepictureinpicture at the video with its bubbles attribute initialized to true.

It is NOT RECOMMENDED that video playback state changes when the exit Picture-in-Picture algorithm is invoked. The website SHOULD be in control of the experience if it is website initiated. However, the user agent MAY expose Picture-in-Picture window controls that change video playback state (e.g. pause).
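This example is non-normative. Since step 1 of the exit algorithm throws when pictureInPictureElement is null, a page may want to check that value before asking to exit. The following is a minimal sketch of such a guard. The helper name is hypothetical, and the document is passed as a parameter only so that the helper can be exercised with a stand-in object.

```javascript
// Leaves Picture-in-Picture only if an element is currently in it, so the
// InvalidStateError from step 1 of the exit algorithm is not triggered.
// Resolves with true if an exit was requested and succeeded, and with
// false if there was nothing to exit.
function leavePictureInPictureIfActive(doc) {
  if (doc.pictureInPictureElement == null) {
    return Promise.resolve(false);
  }
  return doc.exitPictureInPicture().then(() => true);
}

// In a page, this would typically be called as:
//   leavePictureInPictureIfActive(document).catch(error => { /* ... */ });
```

The returned promise can still reject, for example if the element leaves Picture-in-Picture between the check and the call, so callers are expected to keep a rejection handler.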

3.3. Disable Picture-in-Picture

Some pages may want to disable Picture-in-Picture for a video element. To support this, a new disablePictureInPicture attribute is added to the list of content attributes for video elements.

A corresponding disablePictureInPicture IDL attribute is added to the HTMLVideoElement interface. The disablePictureInPicture IDL attribute MUST reflect the content attribute of the same name.

If the disablePictureInPicture attribute is present on the video element, the user agent SHOULD NOT play the video element in Picture-in-Picture or present any UI to do so.

When the disablePictureInPicture attribute is added to a video element, the user agent SHOULD run these steps:

  1. Reject any pending promises returned by the requestPictureInPicture() method with InvalidStateError.

  2. If video is pictureInPictureElement, run the exit Picture-in-Picture algorithm.
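This example is non-normative. Because the IDL attribute reflects the content attribute, a script can opt a video in or out by assigning to disablePictureInPicture. The following is a minimal sketch, assuming the page has a custom button like the one in the examples section. The helper name and the pipButton parameter are hypothetical.

```javascript
// Enables or disables Picture-in-Picture for `video` and keeps a custom
// button consistent with that choice.
function setPictureInPictureAllowed(video, pipButton, allowed) {
  // The IDL attribute reflects the content attribute, so this assignment
  // adds or removes the disablePictureInPicture content attribute.
  video.disablePictureInPicture = !allowed;
  // Hide the page's own control when the feature is turned off.
  pipButton.hidden = !allowed;
}
```

If the video is in Picture-in-Picture when the attribute is added, the steps above suggest the user agent would exit Picture-in-Picture, so the page should be prepared for a leavepictureinpicture event.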

3.4. Interaction with Remote Playback

The [Remote-Playback] specification defines a local playback device and a local playback state. For the purpose of Picture-in-Picture, the playback is local regardless of whether the video is played in page or in Picture-in-Picture.

3.5. Interaction with Media Session

Websites will have to use this API together with the [MediaSession] API to customize the available controls on the Picture-in-Picture window.

3.6. Interaction with Page Visibility

The [Page-Visibility] specification defines a visibilityState attribute used to determine the visibility state of a top-level browsing context. For the purpose of Picture-in-Picture, the visibilityState attribute is always "visible" when pictureInPictureElement is set and the operating system lock screen is not shown.

3.7. One Picture-in-Picture window

Operating systems with a Picture-in-Picture API usually restrict Picture-in-Picture to only one window. Whether only one window is allowed in Picture-in-Picture is left to the implementation and the platform. However, because of the one Picture-in-Picture window limitation, the specification assumes that a given Document can only have one Picture-in-Picture window.

What happens when there is a Picture-in-Picture request while a window is already in Picture-in-Picture is left as an implementation detail: the current Picture-in-Picture window could be closed, the Picture-in-Picture request could be rejected, or even two Picture-in-Picture windows could be created. Regardless, the user agent will have to fire the appropriate events in order to notify the website of the Picture-in-Picture status changes.
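This example is non-normative. Since a user agent may close an existing Picture-in-Picture window when another request arrives, a page cannot assume that a successful request stays in effect. The following is a minimal sketch that derives the status from the events instead. The helper name and the onChange callback are hypothetical.

```javascript
// Tracks whether `video` is in Picture-in-Picture by listening to the
// events the user agent fires, rather than assuming that a successful
// request remains in effect.
// Returns a function that reports the last observed status.
function trackPictureInPicture(video, onChange) {
  let active = false;
  video.addEventListener('enterpictureinpicture', () => {
    active = true;
    onChange(active);
  });
  video.addEventListener('leavepictureinpicture', () => {
    active = false;
    onChange(active);
  });
  return () => active;
}
```

A page could use the callback to update its own controls, for example to switch a button label between entering and leaving Picture-in-Picture.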

4. API

4.1. Extensions to HTMLVideoElement

partial interface HTMLVideoElement {
  [NewObject] Promise<PictureInPictureWindow> requestPictureInPicture();
  attribute EventHandler onenterpictureinpicture;
  attribute EventHandler onleavepictureinpicture;
  [CEReactions] attribute boolean disablePictureInPicture;
};

The requestPictureInPicture() method, when invoked, MUST return a new promise promise and run the following steps in parallel:

  1. Let video be the requested video.

  2. Run the request Picture-in-Picture algorithm with video.

  3. If the previous step threw an exception, reject promise with that exception and abort these steps.

  4. Resolve promise with the Picture-in-Picture window associated with pictureInPictureElement.

4.2. Extensions to Document

partial interface Document {
  readonly attribute boolean pictureInPictureEnabled;
  [NewObject] Promise<void> exitPictureInPicture();
};

The pictureInPictureEnabled attribute’s getter must return true if Picture-in-Picture support is true and the context object is allowed to use the feature indicated by attribute name picture-in-picture, and false otherwise.

Picture-in-Picture support is true if there is no previously established user preference, restriction, or platform limitation, and false otherwise.

The exitPictureInPicture() method, when invoked, MUST return a new promise promise and run the following steps in parallel:

  1. Run the exit Picture-in-Picture algorithm.

  2. If the previous step threw an exception, reject promise with that exception and abort these steps.

  3. Resolve promise.

4.3. Extension to DocumentOrShadowRoot

partial interface DocumentOrShadowRoot {
  readonly attribute Element? pictureInPictureElement;
};

The pictureInPictureElement attribute’s getter must run these steps:

  1. If the context object is not connected, return null and abort these steps.

  2. Let candidate be the result of retargeting Picture-in-Picture element against the context object.

  3. If candidate and the context object are in the same tree, return candidate and abort these steps.

  4. Return null.
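This example is non-normative. Because of the retargeting step, a video inside a shadow tree would presumably be reported as its shadow host when queried on the document, much like fullscreenElement in [Fullscreen]. A page that wants to know whether a specific video is in Picture-in-Picture can therefore query the root of the tree that contains it. The following is a minimal sketch under that reading of the steps. The helper name is hypothetical.

```javascript
// Returns true if `video` is the Picture-in-Picture element of the tree
// that contains it.
// getRootNode() returns the Document or ShadowRoot holding the video, and
// both expose pictureInPictureElement through DocumentOrShadowRoot.
// For a video that is not in a document or shadow tree, the root has no
// such attribute and the comparison is false.
function isInPictureInPicture(video) {
  const root = video.getRootNode();
  return root.pictureInPictureElement === video;
}
```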

4.4. Interface PictureInPictureWindow

interface PictureInPictureWindow : EventTarget {
  readonly attribute long width;
  readonly attribute long height;
  attribute EventHandler onresize;
};

A PictureInPictureWindow instance represents a Picture-in-Picture window associated with an HTMLVideoElement. When instantiated, an instance of PictureInPictureWindow has its state set to opened.

When the close window algorithm is invoked with an instance of PictureInPictureWindow, its state is set to closed.

The width attribute MUST return the width in CSS pixels of the Picture-in-Picture window associated with pictureInPictureElement if the state is opened. Otherwise, it MUST return 0.

The height attribute MUST return the height in CSS pixels of the Picture-in-Picture window associated with pictureInPictureElement if the state is opened. Otherwise, it MUST return 0.

When the size of the Picture-in-Picture window associated with pictureInPictureElement changes, the user agent MUST queue a task to fire an event with the name resize at the PictureInPictureWindow instance representing that window.
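This example is non-normative. A page might use width and height to choose a video quality suited to the window, for instance from a resize handler. The following is a minimal sketch, assuming a non-empty list of renditions sorted from smallest to largest. The helper name and the shape of the renditions list are hypothetical, and the sketch ignores the device pixel ratio for simplicity.

```javascript
// Picks the smallest rendition that covers the Picture-in-Picture window,
// falling back to the largest rendition if none is big enough.
// `renditions` is a hypothetical non-empty list of { width, height, url }
// objects sorted from smallest to largest.
function pickRendition(pipWindow, renditions) {
  // A closed window reports 0 for both dimensions, so there is nothing
  // to adapt to; return null and let the caller keep its current choice.
  if (pipWindow.width === 0 || pipWindow.height === 0) {
    return null;
  }
  const fit = renditions.find(
    r => r.width >= pipWindow.width && r.height >= pipWindow.height
  );
  return fit || renditions[renditions.length - 1];
}
```

The null return for a closed window relies on the requirement above that width and height return 0 once the state is closed.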

4.5. Event types

enterpictureinpicture

Fired on an HTMLVideoElement when it enters Picture-in-Picture.

leavepictureinpicture

Fired on an HTMLVideoElement when it leaves Picture-in-Picture.

resize

Fired on a PictureInPictureWindow when it changes size.

4.6. Task source

The task source for all the tasks queued in this specification is the media element event task source of the video element in question.

5. Security considerations

This section is non-normative.

The API applies only to HTMLVideoElement in order to start with a minimal viable product that has limited security issues. Later versions of this specification may allow arbitrary HTML content in Picture-in-Picture.

5.1. Feature Policy

This specification defines a policy-controlled feature that controls whether the request Picture-in-Picture algorithm may return a SecurityError and whether pictureInPictureEnabled is true or false.

The feature name for this feature is "picture-in-picture".

The default allowlist for this feature is *.
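With a default allowlist of *, nested browsing contexts can use the feature unless the embedder withholds it. The following is a minimal sketch of an embedder doing so through the iframe allow attribute, assuming the attribute is set before the frame's content is loaded. The helper name is hypothetical, and the sketch overwrites any other features already listed in the attribute.

```javascript
// Grants or withholds the "picture-in-picture" feature for the document
// loaded in `iframe` by writing the iframe's allow attribute.
// Note: this replaces the whole attribute value, including any other
// features that were listed there.
function setFramePictureInPicture(iframe, granted) {
  const policy = granted ? 'picture-in-picture' : "picture-in-picture 'none'";
  iframe.setAttribute('allow', policy);
}
```

In a frame where the feature is withheld, pictureInPictureEnabled would be expected to return false and a request to fail with a SecurityError.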

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[CSS-VALUES-4]
Tab Atkins Jr.; Elika Etemad. CSS Values and Units Module Level 4. 14 August 2018. WD. URL: https://www.w3.org/TR/css-values-4/
[DOM]
Anne van Kesteren. DOM Standard. Living Standard. URL: https://dom.spec.whatwg.org/
[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[Page-Visibility]
Jatinder Mann; Arvind Jain. Page Visibility (Second Edition). 29 October 2013. REC. URL: https://www.w3.org/TR/page-visibility/
[PROMISES-GUIDE]
Domenic Denicola. Writing Promise-Using Specifications. 16 February 2016. Finding of the W3C TAG. URL: https://www.w3.org/2001/tag/doc/promises-guide
[Remote-Playback]
Mounir Lamouri; Anton Vayvod. Remote Playback API. 19 October 2017. CR. URL: https://www.w3.org/TR/remote-playback/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[WebIDL]
Cameron McCormack; Boris Zbarsky; Tobie Langel. Web IDL. 15 December 2016. ED. URL: https://heycam.github.io/webidl/

Informative References

[Fullscreen]
Philip Jägenstedt. Fullscreen API Standard. Living Standard. URL: https://fullscreen.spec.whatwg.org/
[MediaSession]
Media Session. Living Standard. URL: https://wicg.github.io/mediasession/

IDL Index

partial interface HTMLVideoElement {
  [NewObject] Promise<PictureInPictureWindow> requestPictureInPicture();
  attribute EventHandler onenterpictureinpicture;
  attribute EventHandler onleavepictureinpicture;
  [CEReactions] attribute boolean disablePictureInPicture;
};
partial interface Document {
  readonly attribute boolean pictureInPictureEnabled;
  [NewObject] Promise<void> exitPictureInPicture();
};
partial interface DocumentOrShadowRoot {
  readonly attribute Element? pictureInPictureElement;
};
interface PictureInPictureWindow : EventTarget {
  readonly attribute long width;
  readonly attribute long height;
  attribute EventHandler onresize;
};