The <model> element

The HTML <model> element allows a website to embed interactive 3D models as conveniently as any other visual media. Models are served as a standalone resource. The <model> element also has support for interaction and animation playback while presented within the page, and supports more spatial experiences, such as the stereoscopic display of 3D content where available.

As <model> is rendered directly by the user agent, it has the ability to utilize privileged information, such as the user's head and eye position or the lighting of the user's environment, without exposing that information to JavaScript. Additionally, the use of accessibility information and controls can be engaged in a privacy-preserving manner.

As <model> is embedded content, the user agent can present 3D content in a privacy-preserving, interactive, and accessible manner.

There are a number of specified non-goals for this initial incubation; some are beyond the long-term scope of this proposal, and others are deferred for the benefit of reaching consensus on an initial specification, which can hopefully serve the needs of users and authors sooner.

While the use cases for inspecting and manipulating the contents of a given model asset are clear, the initial scope of this proposal is intentionally limited to reduce complexity, in the hope that the initial specification can reach broad consensus amongst the web community.

While many popular model formats include the ability to encode, play and mix multiple animations in an animation mixing system, this initial specification considers only a single animation timeline, both for simplicity and to provide a minimal viable specification on which the web community can build.

Many popular model formats support tapping on a 3D button inside their contents to show or hide additional contents like a title, play an animation or undertake some other stateful action. Because there is no current intention to provide a JavaScript-facing mechanism to be aware of such actions, stateful interactions should be considered out of scope.

While many popular model formats support the use of external asset requests for better re-use and consistency, this initial specification will only allow assets that are self-contained.

Example 1

The following example shows the simplest way to add a 3D model to a document:

<model src="3d-assets/car"></model>

By using content negotiation, the user agent relies on the server to send back a 3D model in a format the user agent supports.

Example 2

The following example shows the syntax for enabling the standard "orbit" mode:

<model stagemode="orbit"> 
  <source src="3d-assets/teapot.usdz" type="model/vnd.pixar.usd">
  <source src="3d-assets/teapot.glb" type="model/gltf-binary">
</model>

Example 3

The following example shows a list of potential source formats that can be selected from:

<model> 
   <source src="3d-assets/teapot.usdz" type="model/vnd.pixar.usd">
   <source src="3d-assets/teapot.glb" type="model/gltf-binary">
</model>

<model> source selection follows <video> selection, where the top-level src attribute takes precedence if present, followed by the first compatible <source> element encountered.
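
The selection order described above can be sketched as follows. This is an illustration only, not the normative algorithm: the function name `selectModelSource`, the plain-object shape standing in for a parsed element, and the `supportedTypes` set are all hypothetical, and the sketch assumes that `<source>` elements without a `type` are skipped.

```javascript
// Hypothetical sketch of <model> source selection.
// `element` is a plain object standing in for a parsed <model> element:
//   { src?: string, sources?: [{ src: string, type?: string }] }
// `supportedTypes` is an assumed Set of MIME types the user agent can render.
function selectModelSource(element, supportedTypes) {
  // A top-level src attribute takes precedence when present.
  if (element.src) {
    return element.src;
  }
  // Otherwise pick the first <source> whose declared type is supported.
  for (const source of element.sources ?? []) {
    if (source.type && supportedTypes.has(source.type)) {
      return source.src;
    }
  }
  // No compatible source was found.
  return null;
}
```

With the two `<source>` elements from Example 3 and a user agent that only supports `model/gltf-binary`, this sketch would return the `.glb` URL.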

Example 4

The following example supplies fallback content for user agents that do not yet support the element:

<model> 
    <source src="3d-assets/teapot.usdz" type="model/vnd.pixar.usd">
    <source src="3d-assets/teapot.glb" type="model/gltf-binary">
    <img src="image-assets/teapot.png">
</model>

Example 5

The following example supplies alt text information for both the model content and fallback content, in the event that model is not supported on the target user agent.

<model> 
    <source src="3d-assets/teapot.usdz" type="model/vnd.usdz+zip">
    <source src="3d-assets/teapot.glb" type="model/gltf-binary">
    <img src="image-assets/teapot.png" alt="A teapot">
</model>

Categories:
Flow content.
Phrasing content.
Embedded content.
If the element's stagemode attribute is set to anything other than none: interactive content.
Palpable content.

Contexts in which this element can be used:
Where embedded content is expected.

Content model:
If the element has a src attribute: transparent, a picture or img, or a media element descendant.
If the element has no src attribute: zero or more source elements, then transparent, optionally intermixed with script-supporting elements.

Tag omission in text/html:
Neither tag is omissible.

Content attributes:
Global attributes
autoplay: Hint that the resource can be started automatically when the page is loaded
stagemode: Allows the user to interact with the model in a specified mode
crossorigin: How the element handles cross-origin requests
height: Vertical dimension
loading: Used when determining loading deferral
loop: Whether to loop the media resource
poster: Poster frame to show while the resource is loading
src: Address of the resource
width: Horizontal dimension

DOM interface:
The HTMLModelElement interface provides a means to interface with the embedded resource.

The model element is used for embedding 3D models into a document.

Content may be provided inside the model element. User agents that support model should not show this content to the user; it is intended as fallback content for web browsers that do not support model.

Issue 14: Update poster algorithm for model?

HTML defines an algorithm to determine the poster frame, but it's <video> specific. Should we adapt it to support <model> or specify our own?

We also need to consider what happens if the animation resets and the model is paused: does the poster show again? (i.e., do we follow video's behavior?)

The autoplay attribute is a hint that the model can start its animation automatically, if present, when the asset is loaded.

Issue 44: Lazy loading

Some <model> resources can be significant in size. As such, it might be good to support the loading attribute to allow these resources to be lazy-loaded.

The poster attribute gives the URL of an image file that the user agent can show while 3D content is unavailable. The attribute, if present, must contain a valid non-empty URL potentially surrounded by spaces.

WebIDL
[Exposed=Window]
interface HTMLModelElement : HTMLElement {

  readonly attribute Promise<HTMLModelElement> ready;
  readonly attribute DOMPointReadOnly boundingBoxCenter;
  readonly attribute DOMPointReadOnly boundingBoxExtents;

  attribute DOMMatrixReadOnly entityTransform;

  attribute USVString environmentMap;
  readonly attribute Promise<undefined> environmentMapReady;

  [Reflect=stagemode] attribute DOMString stageMode;
};
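
Because the interface above is only exposed by user agents that implement the element, a script may want to check for it before relying on any of its attributes. The following is a minimal sketch; the helper name `supportsModelElement` is hypothetical, and it assumes the interface object is exposed on the global object under the name given in the IDL.

```javascript
// Returns true when the given global object exposes the HTMLModelElement
// interface. The global is passed in explicitly (defaulting to globalThis)
// so the check can also be exercised outside a browser.
function supportsModelElement(globalObject = globalThis) {
  return typeof globalObject.HTMLModelElement === "function";
}
```

A page could call `supportsModelElement()` and fall back to a static image or another viewer when it returns `false`.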

Issue 24: Specify the behavior for when a <source> element's parent is a <model> element

The <source> element behaves differently depending on its parent. For instance, when the parent is <picture>, the srcset attribute comes into play. We need to look at the attributes of <source> and figure out what they mean when used in the context of <model>.

In addition to emitting standard input events, model may interpret input events according to specific stage modes, including none and orbit.

none is the default value for stageMode. In this mode, input events do not have any direct effect on the behavior of the element.

In orbit mode, input events in a horizontal direction result in a rotation of the model about the Y axis, and events in a vertical direction result in a rotation about the horizontal axis. This is reflected in the model's entityTransform value.

In orbit mode, the entityTransform value is read-only.

Setting the mode to orbit results in a change of the position and scale to the orbit fit mode.
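
To illustrate how orbit input could map to a rotation, the sketch below accumulates pointer movement into two angles and builds the resulting rotation matrix. It is not part of the specification: the names `applyOrbitDelta` and `orbitMatrix`, the sensitivity constant, the sign of each rotation, and the composition order (rotation about X applied after rotation about Y) are all assumptions made for the example.

```javascript
// Assumed sensitivity: one degree of rotation per pixel of pointer movement.
const RADIANS_PER_PIXEL = Math.PI / 180;

// Accumulates a pointer delta (in pixels) into yaw and pitch angles.
function applyOrbitDelta(state, deltaX, deltaY) {
  return {
    yaw: state.yaw + deltaX * RADIANS_PER_PIXEL, // rotation about the Y axis
    pitch: state.pitch + deltaY * RADIANS_PER_PIXEL, // rotation about the X axis
  };
}

// Builds the rotation Rx(pitch) * Ry(yaw) as 16 numbers in column-major
// order, the same layout a DOMMatrix accepts in its constructor.
function orbitMatrix({ yaw, pitch }) {
  const cy = Math.cos(yaw);
  const sy = Math.sin(yaw);
  const cp = Math.cos(pitch);
  const sp = Math.sin(pitch);
  return [
    cy, sp * sy, -cp * sy, 0, // first column
    0, cp, sp, 0, // second column
    sy, -sp * cy, cp * cy, 0, // third column
    0, 0, 0, 1, // fourth column
  ];
}
```

Since entityTransform is read-only to script in orbit mode, a computation like this would happen inside the user agent; the page would only observe the resulting transform.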

Issue 2: Dealing with format specific animations and Interaction

glTF is not a run-time format. It does not define what an application should do with a model once it is loaded and rendered. It does provide some capabilities that a run-time engine may use to enhance the user experience. glTF currently does not store any interactivity information. Currently that is solely a run-time determination. The run-time determines what parts (if any) of the model may be active and the behavior based on any trigger.

Like interactivity, animation is not built into glTF. glTF files may contain animation parameters that specify the type of animation (e.g., morph, skin & bones, etc.) and the associated parameters needed to perform the animation. There is nothing in the glTF specification that defines how one animation interacts with another. For example, a human model may include walk, jump, and drop animations; but it is unlikely that they should all be played at the same time.

Any HTML element that wishes to handle animation as stored in a glTF file needs to understand how the content creator intended the animation to play.

Issue 43: Adding "controls"

As with other media elements (again #13), having "controls" for media specific things can be extremely helpful for accessibility (and just generally helpful for developers not needing to deal with things like the fullscreen API).

It would be nice to consider adding support for controls and then leaving it mostly to the UA as to what those controls are... we could figure out a standard set of things, like <video> provides.

The model SHOULD be rendered according to a realtime, physically-based rendering (PBR) shading model, and lit by an image-based light.

If provided, an environment map MUST be interpreted as an equirectangular environment map. If an environmentMap is not specified, the User Agent MUST provide an appropriate map.
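
For readers unfamiliar with equirectangular maps, the sketch below shows one way a lighting direction can be turned into a lookup position in such an image. The specification does not define an orientation convention, so the one used here is an assumption: +Y is up, v = 0 is the top row of the image, and u = 0.5 faces the +X axis. The function name `directionToEquirectUV` is hypothetical.

```javascript
// Maps a non-zero direction vector to (u, v) coordinates in [0, 1] on an
// equirectangular image, under the assumed convention described above.
function directionToEquirectUV(x, y, z) {
  const length = Math.hypot(x, y, z);
  const azimuth = Math.atan2(z, x); // angle around the Y axis, in [-PI, PI]
  const polar = Math.acos(y / length); // 0 at +Y (up), PI at -Y (down)
  return {
    u: azimuth / (2 * Math.PI) + 0.5,
    v: polar / Math.PI,
  };
}
```

Under this convention, a direction pointing straight up samples the top edge of the image and a direction along the horizon samples its middle row.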

Issue 1: Consistent presentation/rendering of model resources

I agree that it is very good to make it easy for people to display 3D content in a web page. I completely disagree with the methods and processes described in this proposal to make it an HTML element. HTML elements need to be fully defined so that they can be similarly implemented across browsers and reflect what people would see in applications outside of browsers. The process of rendering a high-quality model requires proper handling and rendering of the model's geometry, appearance, animation, and interaction.

My knowledge is in glTF (and glTF binary) so these comments may or may not reflect on the capabilities of USDZ. I will address the topics as separate issues: Appearance and Animation / interactivity; with respect to 3D models in glTF format. Static geometry is pretty straight-forward and not subject to much interpretation.

The really difficult part is appearance. The document states that "it is impractical to define a pixel accurate rendering..." for models. However, this is really important. Khronos has done extensive work in the 3D Commerce Working Group towards pixel accurate rendering across multiple 3D viewers (https://www.khronos.org/3dcommerce/certification/). The accuracy was demanded by retailers so their products would appear visually identical across different web sites. There were so many factors that mattered in producing acceptable renderings that include lighting, rendering calculations (including equation approximations), conversion from GPU to display, and tone mapping.

The component that caused the most issues and difficulties is lighting. A model built for physically-based rendering looks best in a complex lighting environment. This is usually done with image based lighting, but punctual plus area lights will also work. The statement that "A future version ... will describe the lighting model and environment .... Both items will require community collaboration and some consensus." makes the process sound much easier than Khronos found it to be.

Some issues that came from the Certification work. Note that the Certification program did not solve all of these in the initial release.

Is lighting done as an 8-bit RGB or 16-bit HDR image?
Is lighting done with many point and area lights?
How does the content creator provide for different lighting?
How does the user adjust the lighting to match a particular environment?
What background is used for the model display?
How is the (floating-point) rendering converted to an 8-bit RGB display?
How is the rendering adjusted depending on user environment?
How is the rendering adjusted depending on device (hardware, OS, browser)?
How is time-dependent display degradation handled?

It may be possible to construct an initial release without resolving all of these items.

Issue 5: How would the model display on curved screens?

The Oculus browser is displayed on a curved surface.
How do we envision the display of multiple models? Would we allow them to bump into each other or would there be clipping?

A displayed model MUST present its contents with the position, rotation, and scale given by its entityTransform property, a DOMMatrixReadOnly that can be composed using that object's existing API.

Updates to the entityTransform SHOULD be reflected on the next rendered frame.

On the initial load for a model, the entityTransform MUST be set so that the object is fully in view within the model element's width and height on the page.

The bounding box calculation algorithm consists of the following steps.

With the model scene loaded, set the animation to the first frame if present.
Let max be a new DOMPoint with values of -Infinity.
Let min be a new DOMPoint with values of Infinity.
Let queue be an empty list of elements.
Add the model's root object to queue.
While queue is not empty:

Set element to the last element in queue and remove it from queue.
If element contains any child elements, add them to queue.
If element contains renderable geometry, find the minimum and maximum values for the X, Y and Z locations of that geometry.
Apply the world matrix of element to the bounding box of its geometry.
Set each value of min to the minimum of its current value and the minimum for this element's bounding box.
Set each value of max to the maximum of its current value and the maximum for this element's bounding box.

Set the values of boundingBoxCenter to be the mean of the corresponding values of min and max.
Set the values of boundingBoxExtents to be each value of min subtracted from the corresponding value of max.
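
The steps above can be sketched in script as follows. This is a sketch under stated assumptions rather than a reference implementation: the scene is represented by hypothetical plain objects of the form `{ worldMatrix, positions, children }`, world matrices are 16 numbers in column-major order, the world matrix is applied to the eight corners of each element's local box, and a scene without any geometry is not handled.

```javascript
const IDENTITY = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1];

// Transforms the point [x, y, z] by a column-major 4x4 matrix (w taken as 1).
function transformPoint(m, [x, y, z]) {
  return [
    m[0] * x + m[4] * y + m[8] * z + m[12],
    m[1] * x + m[5] * y + m[9] * z + m[13],
    m[2] * x + m[6] * y + m[10] * z + m[14],
  ];
}

function computeBoundingBox(root) {
  const min = [Infinity, Infinity, Infinity];
  const max = [-Infinity, -Infinity, -Infinity];
  const queue = [root];
  while (queue.length > 0) {
    const element = queue.pop(); // take the last element and remove it
    queue.push(...(element.children ?? []));
    const positions = element.positions ?? [];
    if (positions.length === 0) continue; // no renderable geometry
    // Minimum and maximum of this element's geometry in local space.
    const lo = [Infinity, Infinity, Infinity];
    const hi = [-Infinity, -Infinity, -Infinity];
    for (const point of positions) {
      for (let i = 0; i < 3; i++) {
        lo[i] = Math.min(lo[i], point[i]);
        hi[i] = Math.max(hi[i], point[i]);
      }
    }
    // Apply the world matrix to each corner of the local box, then merge.
    const matrix = element.worldMatrix ?? IDENTITY;
    for (let corner = 0; corner < 8; corner++) {
      const local = [
        corner & 1 ? hi[0] : lo[0],
        corner & 2 ? hi[1] : lo[1],
        corner & 4 ? hi[2] : lo[2],
      ];
      const world = transformPoint(matrix, local);
      for (let i = 0; i < 3; i++) {
        min[i] = Math.min(min[i], world[i]);
        max[i] = Math.max(max[i], world[i]);
      }
    }
  }
  return {
    center: min.map((value, i) => (value + max[i]) / 2),
    extents: max.map((value, i) => value - min[i]),
  };
}
```

Transforming all eight corners keeps the result axis-aligned in world space even when an element's world matrix includes a rotation.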

The initial model fit algorithm consists of the following steps.

Retrieve the bounds of the smallest axis-aligned box that contains the geometry of the object using the bounding box calculation algorithm.
Let extents be the boundingBoxExtents of the resource.
Let center be the boundingBoxCenter of the resource.
Divide extents.x by the model's width in the viewport. This is the X-scale.
Divide extents.y by the model's height in the viewport. This is the Y-scale.
Scale the entityTransform by the minimum of the X-scale and Y-scale.
Set the entityTransform to be centered on center.x, center.y and set back from center.z by extents.z / 2, so that the full extents are visible and set directly behind the viewport.

Note

The orbit fit algorithm is triggered when the model's stagemode is set to orbit. The orbit fit algorithm consists of the following steps.

Retrieve the bounds of the smallest axis-aligned box that contains the geometry of the object.
Let extents be the boundingBoxExtents of the resource.
Let center be the boundingBoxCenter of the resource.
Let length be the length (magnitude) of the extents vector.
Divide length by the model's width in the viewport. This is the X-scale.
Divide length by the model's height in the viewport. This is the Y-scale.
Scale the entityTransform by the minimum of the X-scale and Y-scale.
Set the entityTransform to be centered on center.x, center.y and set back from center.z by extents.z / 2, so that the full extents are visible and set directly behind the viewport, and will remain in view at any orientation.
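
The arithmetic of the two fit algorithms differs only in which size is divided by the viewport dimensions: the initial fit uses the X and Y extents, while the orbit fit uses the length of the extents vector for both axes. The sketch below transcribes the steps literally and returns plain numbers instead of building a matrix; the function names and the returned object shape are hypothetical.

```javascript
// Shared step: divide each size by the matching viewport dimension and keep
// the smaller of the two results.
function fitScale(sizeX, sizeY, width, height) {
  const xScale = sizeX / width;
  const yScale = sizeY / height;
  return Math.min(xScale, yScale);
}

// Initial model fit: uses the X and Y extents of the bounding box.
function initialFit(extents, width, height) {
  return {
    scale: fitScale(extents.x, extents.y, width, height),
    setBack: extents.z / 2, // distance to set the model back from center.z
  };
}

// Orbit fit: uses the length of the extents vector on both axes, so the
// result does not depend on the model's orientation.
function orbitFit(extents, width, height) {
  const length = Math.hypot(extents.x, extents.y, extents.z);
  return {
    scale: fitScale(length, length, width, height),
    setBack: extents.z / 2,
  };
}
```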

Issue 27: What's the default style?

What's the default CSS style for a model element? Should it have a border around it? What about a background color? Etc.

Whether a model element is exposing a user interface is not expected to affect the size of the rendering; controls are expected to be overlaid above the page content without causing any layout changes, and may disappear when the user does not need them.

When a model element represents a poster frame, the poster frame is expected to be rendered at the largest size that maintains the aspect ratio of that poster frame without being taller or wider than the model element itself, and is expected to be centered in the model element.
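
The poster sizing rule above amounts to a "contain" fit. The following sketch shows the arithmetic; the function name `fitPoster` and the returned object shape are hypothetical, and all dimensions are assumed to be positive numbers in the same unit.

```javascript
// Computes the largest size at which a poster keeps its aspect ratio without
// exceeding the element's box, plus the offsets that center it in that box.
function fitPoster(posterWidth, posterHeight, boxWidth, boxHeight) {
  const scale = Math.min(boxWidth / posterWidth, boxHeight / posterHeight);
  const width = posterWidth * scale;
  const height = posterHeight * scale;
  return {
    width,
    height,
    x: (boxWidth - width) / 2, // horizontal offset from the box's left edge
    y: (boxHeight - height) / 2, // vertical offset from the box's top edge
  };
}
```

For example, a 400 by 200 poster in a 300 by 300 element would be drawn at 300 by 150, offset 75 units from the top.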

The environmentMapReady Promise resolves when an environment map resource has been loaded, or is rejected if the resource is unable to be loaded.

The ready Promise resolves when the model is processed and ready to display. The Promise is rejected if the model source cannot be loaded.
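
A page might observe both promises together, for instance to decide whether to show an error state. The sketch below is one way to do that; the helper name `loadStatus` and its return shape are hypothetical, and it accepts any object that exposes the `ready` and `environmentMapReady` promises described in this specification.

```javascript
// Waits for both promises to settle and reports which of them fulfilled.
// Promise.allSettled never rejects, so a failed environment map does not
// hide a successfully loaded model (and vice versa).
async function loadStatus(model) {
  const [ready, environmentMap] = await Promise.allSettled([
    model.ready,
    model.environmentMapReady,
  ]);
  return {
    modelLoaded: ready.status === "fulfilled",
    environmentMapLoaded: environmentMap.status === "fulfilled",
  };
}
```

In a supporting browser, this could be called with a model element, for example `loadStatus(document.querySelector("model"))`.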

Issue 18: Which formats?

Need to investigate what formats are suitable for model. We might need some kind of evaluation matrix. Model can support multiple formats out of the box, but it might be good to evaluate what is best for users and developers and why.

Issue 13: Is model a "media element"?

The <model> element shares a lot of similarities with the <audio> and <video> elements, yet it's distinct in some ways (we need to tease these out). It's similar in being potentially temporal multimedia content (i.e., it has audio, it potentially animates over time). We need to figure out if model is sufficiently different to warrant being its own element class, or if it can reuse much of "media element"'s infrastructure.

Additional integrations into HTML:

Issue 28: HTML parser integration

Need same behavior as audio and video when including into a p element with no end tag.

Issue 29: model is an appropriate child of figure

Need to specify that model is an appropriate child of <figure>.

Issue 30: Integration with `preload`

Need to specify integration with preload link relationships.

Issue 31: CSS integrations: Media Playback States

If #13 holds, how does the <model> work with Media Playback States :playing, :paused?

Issue 35: Fetch integration: new destination?

The formats that model supports can fetch a lot of other resources. We probably need a new fetch destination ("model").

Issue 36: Formats: privacy considerations

We need to investigate what the privacy implications are of each model format we will recommend. The model formats themselves can fetch resources, so we need to put a privacy and security framework around what schemes they can fetch (https only, for instance). We also need to say what all the fetch policies are. Need to investigate if the formats provide any guidance here, or if they leave it up to the implementation. If they do, we need to specify it (i.e., don't send cookies, don't leak the referrer, etc.).

Issue 15: Formats and CSP

Need to clarify that 3D resources can fetch resources, and as such need to be subject to the document's CSP (probably "media-src"). However, we need to clarify what this means in relation to, say, "img-src", for example... as models can load png/jpg textures.

Issue 16: Reuse media-src?

Given the close relationship to media elements, and given the reliance on <source> elements, we could just say that media-src applies to <model> too.

Issue 17: Format security concerns

Need to describe that each format will come with its own security considerations (and link to the appropriate security considerations in their respective specs).

Issue 50: Accessibility of model

We need to figure out how to make <model> accessible on a number of different fronts:

Visual: describe what is being presented over time.
Interaction: describe what can be interacted with (regions, buttons, etc.).
Auditory: describe audio and possibly spoken sounds, potentially over time.

Usually, this would be provided by the embedded format... however, it appears that both glTF and USDZ are quite limited when it comes to accessibility.

As such, it may be that we need to leverage what we can from HTML + ARIA to overcome the shortcomings of these formats. We have quite a bit of precedent (e.g., from the humble, yet limited, alt attribute, to how <canvas> can be made accessible, to the potential inclusion of <track> elements, and so on).

Issue 39: Accessibility: ARIA integration and HTML Accessibility API Mappings

We need to define what the ARIA semantics are and what is exposed (application probably). We need to coordinate with the accessibility folks + get this added to the HTML Accessibility API Mappings.

Note

Rules of ARIA attribute usage by HTML element ([HTML-ARIA])
HTML element	Implicit ARIA semantics	ARIA roles, states and properties which MAY be used
`model`	TBD	TBD

Note

`model`
[wai-aria-1.2]	No corresponding role
MSAA + IAccessible2	Not mapped
UIA	Not mapped
ATK	Not mapped
AX	Not mapped
Comments

Issue 37: MIME integration

We need to check if there are any relevant MIME parameters for model/* content.

Issue 38: Enforce MIME types

Need to describe how we sniff for MIME types in [MIMESNIFF]. See also IANA "model" types. We might need specific rules for sniffing.

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MAY, MUST, and SHOULD in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

Issue 14: Update poster algorithm for model?
Issue 44: Lazy loading
Issue 24: Specify the behavior for when a <source> element's parent is a <model> element
Issue 2: Dealing with format specific animations and Interaction
Issue 43: Adding "controls"
Issue 1: Consistent presentation/rendering of model resources
Issue 5: How would the model display on curved screens?
Issue 27: What's the default style?
Issue 18: Which formats?
Issue 13: Is model a "media element"?
Issue 28: HTML parser integration
Issue 29: model is an appropriate child of figure
Issue 30: Integration with `preload`
Issue 31: CSS integrations: Media Playback States
Issue 35: Fetch integration: new destination?
Issue 36: Formats: privacy considerations
Issue 15: Formats and CSP
Issue 16: Reuse media-src?
Issue 17: Format security concerns
Issue 50: Accessibility of model
Issue 39: Accessibility: ARIA integration and HTML Accessibility API Mappings
Issue 37: MIME integration
Issue 38: Enforce MIME types

autoplay §4.1
bounding box calculation algorithm §9.2
boundingBoxCenter attribute for HTMLModelElement §5.
boundingBoxExtents attribute for HTMLModelElement §5.
controls §4.3
crossorigin §4.4
entityTransform attribute for HTMLModelElement §5.
environmentMap attribute for HTMLModelElement §5.
environmentMapReady attribute for HTMLModelElement §5.
height §4.5
HTMLModelElement interface §5.
loading §4.6
loop §4.7
model §4.
model fit algorithm §9.2
orbit fit §7.2
orbit fit algorithm §9.2.1
poster §4.8
ready attribute for HTMLModelElement §5.
src §4.9
stagemode §4.2
stageMode attribute for HTMLModelElement §5.
width §4.10

[GEOMETRY] defines the following:
- DOMMatrixReadOnly interface
- DOMPointReadOnly interface
[HTML] defines the following:
- embedded content
- Flow content
- HTMLElement interface
- img element
- interactive content
- Palpable content
- Phrasing content
- picture element
- [Reflect] extended attribute
[WEBIDL] defines the following:
- DOMString interface
- [Exposed] extended attribute
- Promise interface
- undefined type
- USVString interface

[GEOMETRY]: Geometry Interfaces Module Level 1. Sebastian Zartner; Yehonatan Daniv. W3C. 4 December 2025. CRD. URL: https://www.w3.org/TR/geometry-1/
[HTML]: HTML Standard. Anne van Kesteren; Domenic Denicola; Dominic Farolino; Ian Hickson; Philip Jägenstedt; Simon Pieters. WHATWG. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[HTML-ARIA]: ARIA in HTML. Scott O'Hara; Patrick Lauke. W3C. 5 August 2025. W3C Recommendation. URL: https://www.w3.org/TR/html-aria/
[RFC2119]: Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. IETF. March 1997. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc2119
[RFC8174]: Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words. B. Leiba. IETF. May 2017. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc8174
[wai-aria-1.2]: Accessible Rich Internet Applications (WAI-ARIA) 1.2. Joanmarie Diggs; James Nurthen; Michael Cooper; Carolyn MacLeod. W3C. 6 June 2023. W3C Recommendation. URL: https://www.w3.org/TR/wai-aria-1.2/
[WEBIDL]: Web IDL Standard. Edgar Chen; Timothy Gu. WHATWG. Living Standard. URL: https://webidl.spec.whatwg.org/

[HTML-AAM-1.0]: HTML Accessibility API Mappings 1.0. Scott O'Hara; Rahim Abdi. W3C. 11 March 2026. W3C Working Draft. URL: https://www.w3.org/TR/html-aam-1.0/

The <model> element

Abstract

Status of This Document

1. Introduction

2. Out of scope

2.1 Scene graph inspection and manipulation

2.2 Animation mixing and blending

2.3 Stateful interaction with model contents

2.4 Subresource loading

3. Examples

3.1 Adding a model to a document

3.2 Enabling orbit stage mode

3.3 Supporting multiple formats

3.4 Providing fallback content for legacy user agents

3.5 Making model accessible

4. The model element

4.1 autoplay attribute

4.2 stagemode attribute

4.3 controls attribute

4.4 crossorigin attribute

4.5 height attribute

4.6 loading attribute

4.7 loop attribute

4.8 poster attribute

4.9 src attribute

4.10 width attribute

5. The HTMLModelElement interface

6. source element integration

6.1 The source element's parent is a model element

7. Interaction

7.1 none

7.2 orbit

8. Controls

9. Rendering

9.1 Environment map

9.2 Model pose

9.2.1 Orbit fit

9.3 Embedded content

10. Events

11. Formats

12. Overlap with "media elements"

13. HTML Integrations

13.1 Preload link relationship

13.2 Integrations with source element

14. CSS integrations

14.1 Media Playback States (:playing, :paused, :seeking)

14.2 Fullscreen Presentation State: the :fullscreen pseudo-class

15. Fullscreen integration

16. Fetch integration

16.1 "model" destination

17. Privacy considerations

18. Security considerations

18.1 CORS

18.2 Formats and CORS

18.3 Content Security Policy

18.4 Format concerns

19. Accessibility

19.1 Requirements for providing text to act as an alternative for 3D content

19.2 For authors

19.3 For implementers

20. MIME

20.1 MIME Sniffing

21. Conformance

22. Issue summary

23. Change log

A. Index

A.1 Terms defined by this specification

A.2 Terms defined by reference

B. References

B.1 Normative references

B.2 Informative references
