1. Introduction
Photos and images constitute the largest chunk of the Web, and many include recognisable features, such as human faces or barcodes/QR codes. Detecting these features is computationally expensive, but would enable interesting use cases, e.g. face tagging or web URL redirection. While hardware manufacturers have supported these features for a long time, Web Apps do not yet have access to these hardware capabilities, which makes the use of computationally demanding libraries necessary.
Text Detection, despite being an interesting field, is not considered stable enough across either computing platforms or character sets to be standardized in the context of this document. For reference, a sister informative specification is kept in [TEXT-DETECTION-API].
1.1. Shape detection use cases
Please see the Readme/Explainer in the repository.
2. Shape Detection API
Individual browsers MAY provide Detectors indicating the availability of hardware providing accelerated operation.
Detecting features in an image occurs asynchronously, potentially communicating with acceleration hardware independent of the browser. Completion events use the shape detection task source .
2.1. Image sources for detection
This section is inspired by HTML Canvas 2D Context § image-sources-for-2d-rendering-contexts .
ImageBitmapSource allows objects implementing any of a number of interfaces to be used as image sources for the detection process.
- When an ImageBitmapSource object represents an HTMLImageElement, the element’s image must be used as the source image. Specifically, when an ImageBitmapSource object represents an animated image in an HTMLImageElement, the user agent must use the default image of the animation (the one that the format defines is to be used when animation is not supported or is disabled), or, if there is no such image, the first frame of the animation.
- When an ImageBitmapSource object represents an HTMLVideoElement, then the frame at the current playback position when the method with the argument is invoked must be used as the source image when processing the image, and the source image’s dimensions must be the intrinsic dimensions of the media resource (i.e. after any aspect-ratio correction has been applied).
- When an ImageBitmapSource object represents an HTMLCanvasElement, the element’s bitmap must be used as the source image.
When the UA is required to use a given type of ImageBitmapSource as input argument for the detect() method of whichever detector, it MUST run these steps:
- If the ImageBitmapSource has an effective script origin (origin) which is not the same as the Document’s effective script origin, then reject the Promise with a new DOMException whose name is SecurityError.
- If the ImageBitmapSource is an HTMLImageElement object that is in the Broken (HTML Standard §img-error) state, then reject the Promise with a new DOMException whose name is InvalidStateError, and abort any further steps.
- If the ImageBitmapSource is an HTMLImageElement object that is not fully decodable, then reject the Promise with a new DOMException whose name is InvalidStateError, and abort any further steps.
- If the ImageBitmapSource is an HTMLVideoElement object whose readyState attribute is either HAVE_NOTHING or HAVE_METADATA, then reject the Promise with a new DOMException whose name is InvalidStateError, and abort any further steps.
- If the ImageBitmapSource argument is an HTMLCanvasElement whose bitmap’s origin-clean (HTML Standard §concept-canvas-origin-clean) flag is false, then reject the Promise with a new DOMException whose name is SecurityError, and abort any further steps.
Note that if the ImageBitmapSource is an object with either a horizontal dimension or a vertical dimension equal to zero, then the Promise will simply be resolved with an empty sequence of detected objects.
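Some of the conditions above can be anticipated from script before calling detect(). The following is a minimal sketch and not part of this API: whyNotDetectable is a hypothetical helper, it recognises elements by tagName so that it can also be exercised with plain objects, and its treatment of images (not yet complete, or complete with no intrinsic size) is only an approximation of the Broken and not-fully-decodable states. Cross-origin and origin-clean state cannot be inspected from script, so those cases still surface only as a rejected Promise.

```javascript
// Hypothetical helper (not defined by this specification): returns a short
// reason when detect() is likely to reject or to find nothing for |source|,
// or null when the script-visible pre-checks pass.
function whyNotDetectable(source) {
  const tag = String(source.tagName || '').toUpperCase();

  if (tag === 'VIDEO') {
    // HAVE_NOTHING is 0 and HAVE_METADATA is 1; both lead to InvalidStateError.
    if (source.readyState < 2) return 'video has no current frame yet';
    if (source.videoWidth === 0 || source.videoHeight === 0) return 'zero-sized source';
  }
  if (tag === 'IMG') {
    // Assumption: a still-loading image, or one that completed without an
    // intrinsic size, is treated as broken, undecodable or zero-sized.
    if (!source.complete) return 'image is still loading';
    if (source.naturalWidth === 0 || source.naturalHeight === 0) {
      return 'image is broken or zero-sized';
    }
  }
  if (tag === 'CANVAS') {
    // A zero-sized source resolves with an empty sequence rather than rejecting.
    if (source.width === 0 || source.height === 0) return 'zero-sized source';
  }
  return null;
}
```

A caller might skip the detect() call when the helper returns a reason, and still attach a catch handler for the SecurityError cases that it cannot predict.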
2.2. Face Detection API
FaceDetector represents an underlying accelerated platform’s component for detection of human faces in images. It can be created with an optional Dictionary of FaceDetectorOptions. It provides a single detect() operation on an ImageBitmapSource whose result is a Promise. This method MUST reject this promise in the cases detailed in § 2.1 Image sources for detection; otherwise it MAY queue a task that utilizes the OS/Platform resources to resolve the Promise with a Sequence of DetectedFaces, each one essentially consisting of and delimited by a boundingBox.
[Exposed=(Window,Worker), SecureContext]
interface FaceDetector {
  constructor(optional FaceDetectorOptions faceDetectorOptions = {});
  Promise<sequence<DetectedFace>> detect(ImageBitmapSource image);
};
- FaceDetector(optional FaceDetectorOptions faceDetectorOptions) - Constructs a new FaceDetector with the optional faceDetectorOptions.
  Detectors may potentially allocate and hold significant resources. Where possible, reuse the same FaceDetector for several detections.
- detect(ImageBitmapSource image) - Tries to detect human faces in the ImageBitmapSource image. The detected faces, if any, are returned as a sequence of DetectedFaces.
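The advice to reuse a detector can be illustrated with a small sketch. The function name detectFacesInAll and the shape of its result are assumptions made for illustration; the only API it relies on is a detect(image) method returning a Promise, so it works with a FaceDetector or with a stand-in object.

```javascript
// Runs detection over several image sources with a single, reused detector.
// |detector| is anything exposing detect(image) that returns a Promise, such
// as a FaceDetector. A failure for one image does not abort the others.
async function detectFacesInAll(detector, images) {
  const results = [];
  for (const image of images) {
    try {
      results.push({ image, faces: await detector.detect(image) });
    } catch (error) {
      // detect() rejects with e.g. a SecurityError or an InvalidStateError.
      results.push({ image, faces: [], error });
    }
  }
  return results;
}
```

In a page this could be called as detectFacesInAll(new FaceDetector(), document.images), constructing the detector once instead of once per image.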
2.2.1. FaceDetectorOptions
dictionary FaceDetectorOptions {
  unsigned short maxDetectedFaces;
  boolean fastMode;
};
- maxDetectedFaces, of type unsigned short - Hint to the UA to try and limit the amount of detected faces on the scene to this maximum number.
- fastMode, of type boolean - Hint to the UA to try and prioritise speed over accuracy by e.g. operating on a reduced scale or looking for large features.
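Because maxDetectedFaces is an unsigned short, a value outside the 0 to 65535 range is wrapped by the Web IDL conversion rather than clamped. The sketch below normalises application input before it reaches the constructor; buildFaceDetectorOptions and its parameter names (maxFaces, preferSpeed) are hypothetical, and the lower bound of 1 is an assumption about what a useful hint looks like.

```javascript
// Hypothetical helper: builds a FaceDetectorOptions dictionary from loosely
// typed application input. maxDetectedFaces is only set when a finite number
// is supplied, and is clamped to what an unsigned short can represent.
function buildFaceDetectorOptions({ maxFaces, preferSpeed = false } = {}) {
  const options = { fastMode: Boolean(preferSpeed) };
  if (Number.isFinite(maxFaces)) {
    // Assumption: a hint below 1 is not useful, so 1 is used as the minimum.
    options.maxDetectedFaces = Math.min(65535, Math.max(1, Math.trunc(maxFaces)));
  }
  return options;
}
```

The result can be passed directly, as in new FaceDetector(buildFaceDetectorOptions({ maxFaces: 5, preferSpeed: true })). Both members remain hints, so the UA may still return more faces or ignore fastMode.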
2.2.2. DetectedFace
dictionary DetectedFace {
  required DOMRectReadOnly boundingBox;
  required sequence<Landmark>? landmarks;
};
- boundingBox, of type DOMRectReadOnly - A rectangle indicating the position and extent of a detected feature aligned to the image axes.
- landmarks, of type sequence<Landmark>, nullable - A series of features of interest related to the detected feature.
dictionary Landmark {
  required sequence<Point2D> locations;
  LandmarkType type;
};
- locations, of type sequence<Point2D> - A point in the center of the detected landmark, or a sequence of points defining the vertices of a simple polygon surrounding the landmark in either a clockwise or counter-clockwise direction.
- type, of type LandmarkType - Type of the landmark, if known.
enum LandmarkType {
  "mouth",
  "eye",
  "nose"
};
- mouth - The landmark is identified as a human mouth.
- eye - The landmark is identified as a human eye.
- nose - The landmark is identified as a human nose.
Consider adding attributes such as [SameObject] readonly attribute unsigned long id; to DetectedFace.
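Since a Landmark's locations may hold either a single centre point or the vertices of a surrounding polygon, consumers usually need to handle both shapes. The following sketch reduces each landmark to one representative point by averaging its vertices (an approximation, not the exact polygon centroid) and groups the results by type. The function names and the 'untyped' key are hypothetical.

```javascript
// Returns one representative point for a landmark: the mean of its
// |locations|. A single-point landmark yields that point unchanged.
function landmarkCentre(landmark) {
  const points = landmark.locations;
  const sum = points.reduce(
    (acc, p) => ({ x: acc.x + p.x, y: acc.y + p.y }), { x: 0, y: 0 });
  return { x: sum.x / points.length, y: sum.y / points.length };
}

// Groups the landmark centres of one DetectedFace by landmark type.
// Landmarks without a known type are collected under the key 'untyped';
// a null |landmarks| member yields an empty object.
function centresByType(face) {
  const grouped = {};
  for (const landmark of face.landmarks || []) {
    if (landmark.locations.length === 0) continue;
    const key = landmark.type || 'untyped';
    (grouped[key] = grouped[key] || []).push(landmarkCentre(landmark));
  }
  return grouped;
}
```

For a face with two eye landmarks and one mouth polygon, centresByType returns an object with an eye array of two points and a mouth array of one point.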
2.3. Barcode Detection API
BarcodeDetector represents an underlying accelerated platform’s component for detection of linear or two-dimensional barcodes in images. It provides a single detect() operation on an ImageBitmapSource whose result is a Promise. This method MUST reject this Promise in the cases detailed in § 2.1 Image sources for detection; otherwise it MAY queue a task using the OS/Platform resources to resolve the Promise with a sequence of DetectedBarcodes, each one essentially consisting of and delimited by a boundingBox and a series of Point2Ds, and possibly a decoded rawValue DOMString.
[Exposed=(Window,Worker), SecureContext]
interface BarcodeDetector {
  constructor(optional BarcodeDetectorOptions barcodeDetectorOptions = {});
  static Promise<sequence<BarcodeFormat>> getSupportedFormats();
  Promise<sequence<DetectedBarcode>> detect(ImageBitmapSource image);
};
- BarcodeDetector(optional BarcodeDetectorOptions barcodeDetectorOptions) - Constructs a new BarcodeDetector with barcodeDetectorOptions.
  - If barcodeDetectorOptions.formats is present and empty, then throw a new TypeError.
  - If barcodeDetectorOptions.formats is present and contains unknown, then throw a new TypeError.
  Detectors may potentially allocate and hold significant resources. Where possible, reuse the same BarcodeDetector for several detections.
- getSupportedFormats() - This method, when invoked, MUST return a new Promise promise and run the following steps in parallel:
  - Let supportedFormats be a new Array.
  - If the UA does not support barcode detection, queue a global task on the relevant global object of this using the shape detection task source to resolve promise with supportedFormats and abort these steps.
  - Enumerate the BarcodeFormats that the UA understands as potentially detectable in images. Add these to supportedFormats.
    The UA cannot give a definitive answer as to whether a given barcode format will always be recognized on an image due to e.g. positioning of the symbols or encoding errors. If a given barcode symbology is not in the supportedFormats array, however, it should not be detectable whatsoever.
  - Queue a global task on the relevant global object of this using the shape detection task source to resolve promise with supportedFormats.
  The list of supported BarcodeFormats is platform dependent; some examples are the ones supported by Google Play Services and Apple’s CIQRCodeFeature.
- detect(ImageBitmapSource image) - Tries to detect barcodes in the ImageBitmapSource image.
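The constructor rules and getSupportedFormats() combine naturally: a page can intersect the formats it cares about with the ones the UA reports, and only construct a detector when that intersection is non-empty. The sketch below is one way to do this under stated assumptions; createBarcodeDetector is a hypothetical name, and the Detector parameter exists only so that the function can be exercised with a stand-in class where BarcodeDetector is not available.

```javascript
// Creates a barcode detector restricted to the formats that are both wanted
// and reported as supported. |Detector| defaults to the global
// BarcodeDetector. Resolves with null when detection is unavailable or when
// none of the wanted formats is supported.
async function createBarcodeDetector(wanted, Detector = globalThis.BarcodeDetector) {
  if (typeof Detector !== 'function') return null;
  const supported = await Detector.getSupportedFormats();
  // An empty list, or one containing "unknown", would make the constructor
  // throw a TypeError, so both are filtered out up front.
  const formats = wanted.filter(f => f !== 'unknown' && supported.includes(f));
  return formats.length > 0 ? new Detector({ formats }) : null;
}
```

For example, createBarcodeDetector(['qr_code', 'ean_13']) resolves with a detector limited to whichever of those two formats the platform reports, or with null if neither is reported.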
2.3.1. BarcodeDetectorOptions
dictionary BarcodeDetectorOptions {
  sequence<BarcodeFormat> formats;
};
- formats, of type sequence<BarcodeFormat> - A series of BarcodeFormats to search for in the subsequent detect() calls. If not present then the UA SHOULD search for all supported formats.
  Limiting the search to a particular subset of supported formats is likely to provide better performance.
2.3.2. DetectedBarcode
dictionary DetectedBarcode {
  required DOMRectReadOnly boundingBox;
  required DOMString rawValue;
  required BarcodeFormat format;
  required sequence<Point2D> cornerPoints;
};
- boundingBox, of type DOMRectReadOnly - A rectangle indicating the position and extent of a detected feature aligned to the image axes.
- rawValue, of type DOMString - String decoded from the barcode. This value might be multiline.
- format, of type BarcodeFormat - Detected BarcodeFormat.
- cornerPoints, of type sequence<Point2D> - A sequence of corner points of the detected barcode, in clockwise direction and starting with top-left. This is not necessarily a square due to possible perspective distortions.
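The difference between boundingBox and cornerPoints can be made concrete by comparing their areas. In the sketch below, cornerArea applies the shoelace formula to the corner polygon and fillRatio divides that by the bounding box area; both names are hypothetical. A ratio close to 1 suggests an axis-aligned, undistorted barcode, while a lower ratio suggests rotation or perspective distortion. How to threshold the ratio is left to the application.

```javascript
// Area of the polygon described by |cornerPoints| (shoelace formula).
function cornerArea(cornerPoints) {
  let twiceArea = 0;
  for (let i = 0; i < cornerPoints.length; i++) {
    const a = cornerPoints[i];
    const b = cornerPoints[(i + 1) % cornerPoints.length];
    twiceArea += a.x * b.y - b.x * a.y;
  }
  return Math.abs(twiceArea) / 2;
}

// Ratio between the corner polygon's area and the bounding box's area.
// Returns 0 for a degenerate bounding box.
function fillRatio(barcode) {
  const boxArea = barcode.boundingBox.width * barcode.boundingBox.height;
  return boxArea > 0 ? cornerArea(barcode.cornerPoints) / boxArea : 0;
}
```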
2.3.3. BarcodeFormat
enum BarcodeFormat {
  "aztec",
  "code_128",
  "code_39",
  "code_93",
  "codabar",
  "data_matrix",
  "ean_13",
  "ean_8",
  "itf",
  "pdf417",
  "qr_code",
  "unknown",
  "upc_a",
  "upc_e"
};
- aztec - This entry represents a square two-dimensional matrix following [iso24778] with a square bullseye pattern at its centre, thus resembling an Aztec pyramid. Does not require a surrounding blank zone.
- code_128 - Code 128 is a linear (one-dimensional), bidirectionally-decodable, self-checking barcode following [iso15417] and able to encode all 128 characters of ASCII (hence the naming).
- code_39 - Code 39 is a discrete, variable-length barcode type following [iso16388].
- code_93 - Code 93 is a linear, continuous symbology with a variable length following [bc5]. It offers a larger information density than Code 128 and the visually similar Code 39. Code 93 is used primarily by Canada Post to encode supplementary delivery information.
- codabar - Codabar is a linear barcode symbology developed in 1972 by Pitney Bowes Corp.
- data_matrix - Data Matrix is an orientation-independent two-dimensional barcode composed of black and white modules arranged in either a square or rectangular pattern following [iso16022].
- ean_13 - EAN-13 is a linear barcode based on the UPC-A standard and defined in [iso15420]. It was originally developed by the International Article Numbering Association (EAN) in Europe as a superset of the original 12-digit Universal Product Code (UPC) system developed in the United States (UPC-A codes are represented in EAN-13 with the first character set to 0).
- ean_8 - EAN-8 is a linear barcode defined in [iso15420] and derived from EAN-13.
- itf - ITF14 barcode is the GS1 implementation of an Interleaved 2 of 5 bar code to encode a Global Trade Item Number. It is continuous, self-checking, bidirectionally decodable and always encodes 14 digits. It was once used in the package delivery industry but has been replaced by Code 128. [bc2]
- pdf417 - PDF417 refers to a continuous two-dimensional barcode symbology format with multiple rows and columns, bi-directionally decodable and according to the Standard [iso15438].
- qr_code - QR Code is a two-dimensional barcode respecting the Standard [iso18004]. The information encoded can be text, URL or other data.
- unknown - This value is used by the platform to signify that it does not know or specify which barcode format is being detected or supported.
- upc_a - UPC-A is one of the most common linear barcode types and is widely applied to retail in the United States. Defined in [iso15420], it represents digits by strips of bars and spaces, each digit being associated to a unique pattern of 2 bars and 2 spaces, both of variable width. UPC-A can encode 12 digits that are uniquely assigned to each trade item, and it is technically a subset of EAN-13 (UPC-A codes are represented in EAN-13 with the first character set to 0).
- upc_e - UPC-E Barcode is a variation of UPC-A defined in [iso15420], compressing out unnecessary zeros for a more compact barcode.
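The relationship between UPC-A and EAN-13 noted above (a UPC-A code is the EAN-13 code with a leading 0) matters in practice, because the same printed symbol may be reported under either format depending on the platform. The sketch below normalises both to 13 digits so that readings can be compared; toEan13 is a hypothetical helper and it does not verify check digits.

```javascript
// Returns the 13-digit EAN-13 form of a detected value so that UPC-A and
// EAN-13 readings of the same item compare equal. Values in any other
// format, or with an unexpected length, yield null.
function toEan13(rawValue, format) {
  if (format === 'ean_13' && /^\d{13}$/.test(rawValue)) return rawValue;
  if (format === 'upc_a' && /^\d{12}$/.test(rawValue)) return '0' + rawValue;
  return null;
}
```

With a DetectedBarcode named barcode, this would be called as toEan13(barcode.rawValue, barcode.format).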
3. Security and Privacy Considerations
This section is non-normative.
This interface reveals information about the contents of an image source. It is critical for implementations to ensure that it cannot be used to bypass protections that would otherwise protect an image source from inspection. § 2.1 Image sources for detection describes the algorithm to accomplish this.
By providing high-performance shape detection capabilities, this interface allows developers to run image analysis tasks on the local device. This offers a privacy advantage over offloading computation to a remote system. Developers should consider the results returned by this interface to be as privacy sensitive as the original image from which they were derived.
4. Examples
This section is non-normative.
Slightly modified/extended versions of these examples (and more) can be found in e.g. this codepen collection .
4.1. Platform support for a given detector
if (window.FaceDetector == undefined) {
  console.error('Face Detection not supported on this platform');
}
if (window.BarcodeDetector == undefined) {
  console.error('Barcode Detection not supported on this platform');
}
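Both interfaces are exposed to Window and Worker, and window is not defined inside a worker. A variant of the check that works in either context can look at the global object instead. The following is a small sketch; shapeDetectionSupport is a hypothetical name and the scope parameter exists only to make the function easy to exercise.

```javascript
// Reports which detectors are exposed on |scope|, which defaults to the
// current global object (a Window or a worker global scope).
function shapeDetectionSupport(scope = globalThis) {
  return {
    face: typeof scope.FaceDetector === 'function',
    barcode: typeof scope.BarcodeDetector === 'function',
  };
}
```

An exposed interface does not guarantee that detection works on the device: getSupportedFormats() can still resolve with an empty list when the UA does not support barcode detection.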
4.2. Face Detection
let faceDetector = new FaceDetector({ fastMode: true, maxDetectedFaces: 1 });
// Assuming |theImage| is e.g. a <img> content, or a Blob.
faceDetector.detect(theImage)
  .then(detectedFaces => {
    for (const face of detectedFaces) {
      // Template literals need backticks for ${...} to be interpolated.
      console.log(
        ` Face @ (${face.boundingBox.x}, ${face.boundingBox.y}),` +
        ` size ${face.boundingBox.width}x${face.boundingBox.height}`);
    }
  })
  .catch(() => {
    console.error("Face Detection failed, boo.");
  });
4.3. Barcode Detection
let barcodeDetector = new BarcodeDetector();
// Assuming |theImage| is e.g. a <img> content, or a Blob.
barcodeDetector.detect(theImage)
  .then(detectedCodes => {
    for (const barcode of detectedCodes) {
      // Template literals need backticks for ${...} to be interpolated.
      console.log(
        ` Barcode ${barcode.rawValue}` +
        ` @ (${barcode.boundingBox.x}, ${barcode.boundingBox.y}) with size` +
        ` ${barcode.boundingBox.width}x${barcode.boundingBox.height}`);
    }
  })
  .catch(() => {
    console.error("Barcode Detection failed, boo.");
  });