August 23, 2017

Release Notes for Safari Technology Preview 38

Surfin’ Safari

Safari Technology Preview Release 38 is now available for download for macOS Sierra and betas of macOS High Sierra. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. This release covers WebKit revisions 220128-220795.

Beacon API

  • Enabled the Beacon API by default as an experimental feature (r220553)
  • Added support for CORS-preflighting (r220442)
  • Added support for CORS-preflighting on redirects (r220497)
  • Added support for connect-src CSP checks on redirects (r220549)
  • Updated sendBeacon() to rely on FetchBody instead of the whole FetchRequest (r220366)
  • Changed to use “application/octet-stream” content-type for payloads of ArrayBuffer and ArrayBufferView types (r220779)

Fetch API

  • Added support for Request keepalive getter (r220244)
  • Changed Response to keep all ResourceResponse information (r220320)
  • Implemented quota limitation for keepalive Fetch requests (r220751)

Web Payments

  • Enabled Payment Requests as an experimental feature (r220787)

CSS

  • Added support for parsing of the font-display property (r220725)
  • Implement caret-color support (r220706, r220714)
  • Added a fast path for rotate() and rotateZ() transform parsing (r220382)
  • Fixed parsing of <meta http-equiv=refresh> to allow time starting with a ‘.’ without a leading 0 (r220252)
  • Fixed a hit testing issue occurring when an SVG rect element with a non-default stroke style applied (r220717)

Web API

  • Added support for considering file extensions in the accept attribute of HTML file input elements (r220135)
  • Improved support for referrer policies (r220208)
  • Fixed the Promise resolve and reject function to have a length of 1 (r220324)
  • Fixed an early error on an ANY operator before new.target (r220481)
  • Fixed removing an empty <li> element when inside a table cell (r220398)
  • Fixed XHR to only fire an abort event if the cancellation was requested by the client (r220731)

Media

  • Fixed deleting the old subtitle track on src attribute change event (r220472)

Apple Pay

  • Added support for phonetic contact names (r220718)

Web Inspector

  • Added Canvas path preview when viewing a recording (r220370)
  • Changed clicking on an autocomplete suggestion to apply it, not dismiss it (r220614)
  • Removed text-shadow and gradient backgrounds (r220710)
  • Updated the filter icon in the styles sidebar (r220609)

WebDriver

  • Added support for interacting with <option> and <select> elements using the element click command (r220740)
  • Added support for uploading files via <input type=file> (r219874, r220115, r220135, r220147, r220222)
  • Fixed a timeout when a JavaScript alert is shown in onload handler (r220741)
  • Implemented “normal” and “eager” page load strategies from the W3C specification (r220317)
  • Updated code to use in-view center points for clicks and other automation logic (r220314)

By Jon Davis at August 23, 2017 06:00 PM

August 09, 2017

Michael Catanzaro: On Firefox Sync

Igalia WebKit

Epiphany 3.26 is, unfortunately, not going to be packed with cool new features like 3.24 was. We’ve just been too busy working on improving WebKit this cycle. But there is one cool new thing: Firefox Sync support. You can sync bookmarks, history, passwords, and open tabs with other Epiphany instances and as well as both desktop and mobile Firefox. This is already enabled in 3.25.90. Just go to the Sync tab in Preferences and sign in or create your Firefox account there. Please test it out and report bugs now, so we can quash problems you find before 3.26.0 rather than after.

Some thank yous are in order:

  • Thanks to Gabriel Ivascu, for writing all the code.
  • Thanks to Google and Igalia for sponsoring Gabriel’s work.
  • Thanks to Mozilla. This project would never have been possible if Mozilla had not carefully written its terms of service to allow such use.

Go forth and sync!

By Michael Catanzaro at August 09, 2017 07:57 PM

Release Notes for Safari Technology Preview Release 37

Surfin’ Safari

Safari Technology Preview Release 37 is now available for download for macOS Sierra and betas of macOS High Sierra. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. This release covers WebKit revisions 219567-220128.

Web API

  • Added initial support for navigator.sendBeacon behind an experimental feature flag (r220121)
  • Implemented document.elementsFromPoint (r219961)
  • Made cross-origin properties enumerable (r219659)
  • Fixed dispatching the click event to the parent when the child target stops hit testing after mouseDown (r219568)
  • Moved DOMException properties to the prototype and changed to use Error.prototype.toString() (r219607, r219663)

JavaScript

  • Added finally method support to Promise (r219989)
  • Added support for optional catch binding (r220068)

WebAssembly

  • Reduced the size of generated binaries (r219899)

Apple Pay

  • Added "carteBancaire" as a supported payment network (r219896)

CSS

  • Aligned quirky number parsing with other browsers (r219642)

Web Inspector

  • Added a context menu item to the Elements tab for taking a screenshot of a node (r219870)
  • The debugger now captures async stack traces when web content calls addEventListener (r220036)
  • Prevented outputting “No message” for multi-value logs like console.log(x, y) (r219900)
  • Fixed warnings about console.assert lines without semicolons (r219894)
  • Inlined multiple console log values if they are simple (r219893)
  • Fixed inspect(aFunction) to jump to the function definition (r219749)
  • Fixed the page overlay highlight to fade out when a page is constantly updating (r219596)
  • Fixed some controls overlaying the header in the Settings tab (r219851)

WebDriver

  • Fixed an issue where implicit navigations didn’t cause a browsing context switch (r219723)
  • Fixed link and partial link queries if the text link contains trailing or leading whitespaces (r219604)
  • Fixed an issue that caused some script evaluations to be attributed to the wrong frame (r219649)

Rendering

  • Changed to disable async image decoding for large images after the first time a tile is painted (r219876)
  • Fixed the minimum font size preference to affect absolute line-height values and prevent text lines from overlapping (r219665)
  • Fixed getting round-trip stroke-width styles causing text to gain a stroke (r219755)
  • Fixed Reeder’s default font to correctly use San Francisco (r220009)

Accessibility

  • Fixed zoom to follow the keyboard insertion point (r219987)
  • Added a background color for the focus state of the icon buttons in the media controls (r220041)
  • Fixed the incorrect range from index and length on <p> tags with contenteditable (r219949)
  • Changed to dispatch accessibilityPerformPressAction asynchronously on macOS (r219906)
  • Fixed silent VoiceOver or skipping over time values on the media player (r219983)
  • Fixed the web page getting reloaded when a node is labelling multiple child nodes (r219661)

Media

  • Fixed media controls missing content in fullscreen when the document has a scroll offset (r219645)
  • Fixed the mouse pointer not hiding during fullscreen playback (r219625)
  • Fixed pressing the Escape key to not be a valid user gesture to enter fullscreen (r219950)

By Jon Davis at August 09, 2017 05:00 PM

August 06, 2017

Michael Catanzaro: Endgame for WebKit Woes

Igalia WebKit

In my original blog post On WebKit Security Updates, I identified three separate problems affecting WebKit users on Linux:

  • Distributions were not providing updates for WebKitGTK+. This was the main focus of that post.
  • Distributions were shipping a insecure compatibility package for old, unmaintained WebKitGTK+ 2.4 (“WebKit1”).
  • Distributions were shipping QtWebKit, which was also unmaintained and insecure.

Let’s review these problems one at a time.

Distributions Are Updating WebKitGTK+

Nowadays, most major community distributions are providing regular WebKitGTK+ updates, so this is no longer a problem for the vast majority of Linux users. If you’re using a supported version of Ubuntu (except Ubuntu 14.04), Fedora, or most other mainstream distributions, then you are good to go.

My main concern here is still Debian, but there are reasons to be optimistic. It’s too soon to say what Debian’s policy will be going forward, but I am encouraged that it broke freeze just before the Stretch release to update from WebKitGTK+ 2.14 to 2.16.3. Debian is slow and conservative and so has not yet updated to 2.16.6, which is sad because 2.16.3 is affected by a bug that causes crashes on a huge number of websites, but my understanding is it is likely to be updated in the near future. I’m not sure if Debian will update to 2.18 or not. We’ll have to wait and see.

openSUSE is another holdout. The latest stable version of openSUSE Leap, 42.3, is currently shipping WebKitGTK+ 2.12.5. That is disappointing.

Most other major distributions seem to be current.

Distributions Are Removing WebKitGTK+ 2.4

WebKitGTK+ 2.4 (often informally referred to as “WebKit1”) was the next problem. Tons of desktop applications depended on this old, insecure version of WebKitGTK+, and due to large API changes, upgrading applications was not going to be easy. But this transition is going much smoother and much faster than I expected. Several distributions, including Debian, Fedora, and Arch, have recently removed their compatibility packages. There will be no WebKitGTK+ 2.4 in Debian 10 (Buster) or Fedora 27 (scheduled for release this October). Most noteworthy applications have either ported to modern WebKitGTK+, or have configure flags to disable use of WebKitGTK+. In some cases, such as GnuCash in Fedora, WebKitGTK+ 2.4 is being bundled as part of the application build process. But more often, applications that have not yet ported simply no longer work or have been removed from these distributions.

Soon, users will no longer need to worry that a huge amount of WebKitGTK+ applications are not receiving security updates. That leaves one more problem….

QtWebKit is Back

Upstream QtWebKit has not been receiving security updates for the past four years or thereabouts, since it was abandoned by the Qt project. That is still the status quo for most distributions, but Arch and Fedora have recently switched to Konstantin Tokarev’s fork of QtWebKit, which is based on WebKitGTK+ 2.12. (Thank you Konstantin!) If you are using any supported version of Fedora, you should already have been switched to this fork. I am hopeful that the fork will be rebased on WebKitGTK+ 2.16 or 2.18 in the near future, to bring it current on security updates, but in the meantime, being a year and a half behind is an awful lot better than being four years behind. Now that Arch and Fedora have led the way, other distributions should find little trouble in making the switch to Konstantin’s QtWebKit. It would be a disservice to users to continue shipping the upstream version.

So That’s Cool

Things are better. Some distributions, notably Arch and Fedora, have resolved all of the above problems (or will in the very near future). Yay!

By Michael Catanzaro at August 06, 2017 09:47 PM

July 26, 2017

Release Notes for Safari Technology Preview 36

Surfin’ Safari

Safari Technology Preview Release 36 is now available for download for macOS Sierra and betas of macOS High Sierra. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. This release covers WebKit revisions 219131-219567.

JavaScript

  • Implemented Object Spread (r219443)

CSS

  • Changed to not resolve auto values of align-self and justify-self matching specification changes (r219315)
  • Fixed line-height:<number> to not be visually applied twice when text autosizing is in effect (r219543)

WebRTC

  • Fixed incorrect sdpMLineIndex for video to fix interoperability with Firefox (r219393)
  • Fixed sending silence data for a disabled audio track (r219524)
  • Increased the render audio buffer sizes for WebRTC (r219517)

Web Driver

  • Fixed link and partial link queries when the link contains formatting tags (r219555)

Rendering

  • Disabled asynchronous image decoding for large images by default (r219438)
  • Enabled the Viewport Fit experimental feature by default (r219369)
  • Fixed an element not getting repainted when its background image finishes decoding (r219364)
  • Fixed an issue to avoid unnecessary copying of the frame buffer into a WebGL Layer (r219472)
  • Corrected the radix used in Unicode Escape in invalid character error message (r219396)
  • Fixed the low memory notification to prevent causing style recalculation (r219145)
  • Fixed GIFs with infinite animation often only playing once (r219389)

Media

  • Fixed removing user gesture restrictions when adding the autoplay attribute to a media element during a user gesture (r219509)
  • Fixed reflecting "video" and "audio" when they are not a supported as attribute value (r219234)
  • Fixed bad behavior caused by removing samples when the presentation order does not match decode order (r219519)
  • Fixed media controls drawing behind the captions (r219558)
  • Fixed clicking the edges of media control buttons to execute the action, not just change the visual state of the button (r219549)

Web Inspector

  • Changed to group Inspector Style Sheets as part of the Stylesheets folder (r219185)
  • Improved wording for Potential Custom Element that lacks a Custom definition (r219333)
  • Fixed resources that are sometimes missing from the tree outline right before grouping into folders (r219270)
  • Fixed WebSocket resource tree elements to show the connection status (r219269)

WebAssembly

  • Implemented name section’s module name, and skipped unknown sections (r219134)

Bug Fixes

  • Fixed the “Show My Relationship” link in familysearch.org (r219151)
  • Fixed an issue where font loads can cause Chinese characters to draw as .notdef (r219221)
  • Changed AppCache to ignore fallback entries whose namespace is not prefixed with the manifest path (r219272)

By Jon Davis at July 26, 2017 05:00 PM

July 25, 2017

Adobe Announces Flash Distribution and Updates to End

Surfin’ Safari

Adobe has announced it will stop distributing and updating Flash Player at the end of 2020 and is encouraging web developers to migrate any existing Flash content to open standards. Apple is working with Adobe, industry partners, and developers to complete this transition.

Apple users have been experiencing the web without Flash for some time. iPhone, iPad, and iPod touch never supported Flash. For the Mac, the transition from Flash began in 2010 when Flash was no longer pre-installed. Today, if users install Flash, it remains off by default. Safari requires explicit approval on each website before running the Flash plugin.

To display rich interactive content in the browser, WebKit—the engine that powers Safari—supports the latest standards, including the following:

  • HTML Video and Media Source Extensions support a wide range of video experiences, including short clips, longer content, and live streaming.
  • HTML Canvas and WebGL provide fast, dynamic graphics for games and interactive experiences.
  • CSS Transitions and Animations add polished animations to web interfaces.
  • WebRTC enables real-time peer-to-peer video.
  • WebAssembly allows games and other compute-intensive applications to run faster.

The WebKit Project is excited about the future of the open web. We invite you to follow this blog to learn about new technologies as they’re implemented in WebKit.

By Apple's WebKit Team at July 25, 2017 04:00 PM

July 21, 2017

Update on Web Cryptography

Surfin’ Safari

Cryptography is the cornerstone of information security, including various aspects such as data confidentiality, data integrity, authentication, and non-repudiation. These provide support for the fundamental technologies of today’s Internet like HTTPS, DNSSEC, and VPN. The WebCrypto API was created to bring these important high-level cryptography capabilities to the web. This API provides a set of JavaScript functions for manipulating low-level cryptographic operations, such as hashing, signature generation and verification, encryption and decryption, and shared secret derivation. In addition, it supports generation and management of corresponding key materials. Combining the complete support of various cryptographic operations with a wide range of algorithms, the WebCrypto API is able to assist web authors in tackling diverse security requirements.

This blog post first talks about the advantages of implementing web cryptography through native APIs, and then introduces an overview of the WebCrypto API itself. Next, it presents some differences between the updated SubtleCrypto interface and the older webkit- prefixed interface. Some newly-added algorithms are discussed, and finally we demonstrate how to smoothly transition from the webkit- prefixed API to the new, standards-compliant API.

Native or Not Native?

Long before the WebCrypto API was standardized, several JavaScript cryptography libraries were created and have successfully served the open web since. So why bother implementing a web-facing cryptography library built on native APIs? There are several reasons, one of the more important being performance. Numbers tell the truth. We conducted several performance tests to compare our updated WebCrypto API and some famous pure JavaScript implementations.

The latest SJCL (1.0.7), asmcrypto.js, and CryptoJS (3.1) were selected for the comparison. The test suite contains:

  1. AES-GCM: Test encryption/decryption against a 4MB file, repeat certain times and record down the average speed. It uses a 256-bit AES key.
  2. SHA-2: Hash a 512KB file by SHA-512, repeat certain times and record down the average speed.
  3. RSA: Test RSA-PSS signature and verification against a 512KB file, repeat certain times and record down the average speed. It uses a 2048-bit key pair and SHA-512 for hashing.

The content under test was carefully selected to reflect the most frequently used day-to-day cryptography operations and paired with appropriate algorithms. The test platform was a MacBook Pro (MacBookPro11,5) with a 2.8 GHz Intel Core i7 running MacOS 10.13 Beta (17A306f) and Safari Technology Preview 35. Some of the pure JavaScript implementations do not support all of the test content, therefore corresponding results were omitted from those results.

Here are the test results.

AES-GCM Encryption/Decryption SHA-2 RSA

As you can see, the difference in performance is staggering. This was a surprising result, since most modern JavaScript engines are very efficient. Working with our JavaScriptCore team, we learned that the causes of these pure JavaScript implementations not performing well is that most of them are not actively maintained. Few of them take full advantage of our fast JavaScriptCore engine or modern JavaScript coding practices. Otherwise, the gaps may not be that huge.

Besides superior performance, WebCrypto API also benefits better security models. For example, when developing with pure JavaScript crypto libraries, secret or private keys are often stored in the global JavaScript execution context. It is extremely vulnerable as keys are exposed to any JavaScript resources being loaded and therefore allows XSS attackers be able to steal the keys. WebCrypto API instead protects the secret or private keys by storing them completely outside of the JavaScript execution context. This limits the risk of the private key being exfiltrated and reduces the window of compromise if an attacker gets to execute JavaScript in the victim’s browser. What’s more, our WebCrypto implementation on macOS/iOS is based on the CommonCrypto routines, which are highly tuned for our hardware platforms, and are regularly audited and reviewed for security and correctness. WebCrypto API is therefore the best way to ensure users enjoy the highest security protection.

Overview of WebCrypto API

The WebCrypto API starts with crypto global object:

Crypto
{
    subtle: SubtleCrypto,
    ArrayBufferView getRandomValues(ArrayBufferView array)
}

Inside, it owns a subtle object that is a singleton of the SubtleCrypto interface. The interface is named subtle because it warns developers that many of the crypto algorithms have sophisticated usage requirements that must be strictly followed to get the expected algorithmic security guarantees. The subtle object is the main entry point for interacting with underlying crypto primitives. The crypto global object also has the function getRandomValues, which provides a cryptographically strong random number generator (RNG). The WebKit RNG (macOS/iOS) is based on AES-CTR.

The subtle object is composed of multiple methods to serve the needs of low-level cryptographic operations:

SubtleCrypto
{
    Promise<ArrayBuffer> encrypt(AlgorithmIdentifier algorithm, CryptoKey key, BufferSource data);
    Promise<ArrayBuffer> decrypt(AlgorithmIdentifier algorithm, CryptoKey key, BufferSource data);
    Promise<ArrayBuffer> sign(AlgorithmIdentifier algorithm, CryptoKey key, BufferSource data);
    Promise<boolean> verify(AlgorithmIdentifier algorithm, CryptoKey key, BufferSource signature, BufferSource data);
    Promise<ArrayBuffer> digest(AlgorithmIdentifier algorithm, BufferSource data);
    Promise<CryptoKey or CryptoKeyPair> generateKey(AlgorithmIdentifier algorithm, boolean extractable, sequence<KeyUsage> keyUsages );
    Promise<CryptoKey> deriveKey(AlgorithmIdentifier algorithm, CryptoKey baseKey, AlgorithmIdentifier derivedKeyType, boolean extractable, sequence<KeyUsage> keyUsages );
    Promise<ArrayBuffer> deriveBits(AlgorithmIdentifier algorithm, CryptoKey baseKey, unsigned long length);
    Promise<CryptoKey> importKey(KeyFormat format, (BufferSource or JsonWebKey) keyData, AlgorithmIdentifier algorithm, boolean extractable, sequence<KeyUsage> keyUsages );
    Promise<ArrayBuffer> exportKey(KeyFormat format, CryptoKey key);
    Promise<ArrayBuffer> wrapKey(KeyFormat format, CryptoKey key, CryptoKey wrappingKey, AlgorithmIdentifier wrapAlgorithm);
    Promise<CryptoKey> unwrapKey(KeyFormat format, BufferSource wrappedKey, CryptoKey unwrappingKey, AlgorithmIdentifier unwrapAlgorithm, AlgorithmIdentifier unwrappedKeyAlgorithm, boolean extractable, sequence<KeyUsage> keyUsages );
}

As the names of these methods imply, the WebCrypto API supports hashing, signature generation and verification, encryption and decryption, shared secret derivation, and corresponding key materials management. Let’s look closer at one of those methods:

Promise<ArrayBuffer> encrypt(AlgorithmIdentifier algorithm,
                             CryptoKey key,
                             BufferSource data)

All of the functions return a Promise, and most of them accept an AlgorithmIdentifier parameter. AlgorithmIdentifier can be either a string that specifies an algorithm, or a dictionary that contains all the inputs to a specific operation. For example, in order to do an AES-CBC encryption, one has to supply the above encrypt method with:

var aesCbcParams = {name: "aes-cbc", iv: asciiToUint8Array("jnOw99oOZFLIEPMr")}

CryptoKey is an abstraction of keying materials in WebCrypto API. Here is an illustration:

CryptoKey
{
    type: "secret",
    extractable: true,
    algorithm: { name: "AES-CBC", length: 128 },
    usages: ["decrypt", "encrypt"]
}

This code tells us that this key is an extractable (to JavaScript execution context) AES-CBC “secret” (symmetric) key with a length of 128 bits that can be used for both encryption and decryption. The algorithm object is a dictionary that characterizes different keying materials, while all the other slots are generic. Bear in mind that CryptoKey does not expose the underlying key data directly to web pages. This design of WebCrypto keeps the secret and private key data safely within the browser agent, while allowing web authors to still enjoy the flexibility of working with concrete keys.

Changes to WebKitSubtleCrypto

Those of you that have never heard of WebKitSubtleCrypto may skip this section and use SubtleCrypto exclusively. This section is aimed at providing compelling reasons for current WebKitSubtleCrypto users to switch to our new standards-compliant SubtleCrypto.

1. Standards-compliant implementation

SubtleCrypto is a standards-compliant implementation of the current specification, and is completely independent from WebKitSubtleCrypto. Here is an example code snippet that demonstrates the differences between the two APIs for importing a JsonWebKey (JWK) format key:

var jwkKey = {
    "kty": "oct",  
    "alg": "A128CBC",
    "use": "enc",
    "ext": true,
    "k": "YWJjZGVmZ2gxMjM0NTY3OA"
};

// WebKitSubtleCrypto:
// asciiToUint8Array() takes a string and converts it to an Uint8Array object.
var jwkKeyAsArrayBuffer = asciiToUint8Array(JSON.stringify(jwkKey));
crypto.webkitSubtle.importKey("jwk", jwkKeyAsArrayBuffer, null, false, ["encrypt"]).then(function(key) {
    console.log("An AES-CBC key is imported via JWK format.");
});

// SubtleCrypto:
crypto.subtle.importKey("jwk", jwkKey, "aes-cbc", false, ["encrypt"]).then(function(key) {
    console.log("An AES-CBC key is imported via JWK format.");
});

With the new interface, one no longer has to convert the JSON key to UInt8Array. The SubtleCrypto interface is indeed significantly more standards-compliant than our old WebKitSubtleCrypto implementation. Here are the results of running W3C WebCrypto API tests:

W3C WebCrypto TestSuite Result Chart
This test suite is an improved one based on the most updated web-platform-tests GitHub repository. Pull requests are made for all improvements: #6100, #6101, and #6102.

The new implementation’s coverage is around 95% which is 48X higher than our webkit- prefixed one! The concrete numbers for all selected parties are: 999 for prefixed WebKit, 46653 for Safari 11, 45709 for Chrome 59, and 18636 for FireFox 54.

2. DER encoding support for importing and exporting asymmetric keys

The WebCrypto API specification supports DER encoding of public keys as SPKI, and of private key as PKCS8. Prior to this, WebKitSubtleCrypto only supported the JSON-based JWK format for RSA keys. This is convenient when keys are used on the web because of its structure and human readability. However, when public keys are often exchanged between servers and web browsers, they are usually embedded in certificates in a binary format. Even though some JavaScript frameworks have been written to read the binary format of a certificate and to extract its public key, few of them convert a binary public key into its JWK equivalent. This is why support for SPKI and PKCS8 is useful. Here are code snippets that demonstrate what can be done with the SubtleCrypto API:

// Import:
// Generated from OpenSSL
// Base64URL.parse() takes a Base64 encoded string and converts it to an Uint8Array object.
var spkiKey = Base64URL.parse("MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAwCjRCtFwvSNYMZ07u5SxARxglJl75T7bUZXFsDVxHkMhpNC2RaN4jWE5bwYUDMeD2fVmxhpaUQn/6AbFLh6gHxtwrCfc7rIo/SfDdGd3GkRlXK5xXwGuM6MvP9nuZHaarIyArRFh2U2UZxFlVsKI0pSHo6n58W1fPZ1syOoVEZ/WYE6gLhMMwfpeAm97mro7mekRdMULOV/mR5Ul3CHm9Zt93Dc8GpnPA8bhLiB0VNyGTEMa06nJul4gj1sjxLDoUvZY2EWq7oUUnfLBUYMfiqK0kQcW94wvBrIq2DQUApLyTTbaAOY46TLwX6c8LtubJriYKTC5a9Bb0/7ovTWB0wIDAQAB");
crypto.subtle.importKey("spki", spkiKey, {name: "RSA-OAEP", hash: "sha-256"}, true, ["encrypt"]).then(function(key) {
    console.log("A RSA-OAEP key is imported via SPKI format.");
});

// Export:
var rsaKeyGenParams = {
    name: "RSA-OAEP",
    modulusLength: 2048,
    publicExponent: new Uint8Array([0x01, 0x00, 0x01]),  // Equivalent to 65537
    hash: "sha-256"
};
crypto.subtle.generateKey(rsaKeyGenParams, true, ["decrypt", "encrypt"]).then(function(keyPair) {
    crypto.subtle.exportKey("spki", keyPair.publicKey).then(function(binary) {
        console.log("A RSA-OAEP key is exported via SPKI format.");
    });
});

A live example from a third party to generate public key certificates can be found here.

3. Asynchronously execute time-consuming SubtleCrypto methods

In the previous WebKitSubtleCrypto implementation, only generateKey for RSA executes asynchronously, while all the other operations are synchronous. Even though synchronous operation works well for methods that finish quickly, most crypto methods are time-consuming. Consequently, all time-consuming methods in the new SubtleCrypto implementation execute asynchronously:

Method encrypt decrypt sign verify digest generateKey* deriveKey deriveBits importKey exportKey wrapKey* unwrapKey*
Asynchronous

Note that only RSA key pair generation is asynchronous while EC key pair and symmetric key generation are synchronous. Also notice that AES-KW is the only exception where synchronous operations are still done for wrapKey/unwrapKey. Normally key size is a few hundred bytes, and therefore it is less time-consuming to encrypt/decrypt such small amount of data. AES-KW is the only algorithm that directly supports wrapKey/unwrapKey operations while others are bridged to encrypt/decrypt operations. Hence, it becomes the only algorithm that executes wrapKey/unwrapKey synchronously. Web developers may treat every SubtleCrypto function the same as any other function that returns a promise.

4. Web worker support

Besides making most of the APIs asynchronous, we also support web workers to allow another model of asynchronous execution. Developers can choose which one best suit their needs. Combining these two models, developers now could integrate cryptographic primitives inside their websites without blocking any UI activities. The SubtleCrypto object in web workers uses the same semantics as the one in the Window object. Here is some example code that uses a web worker to encrypt text:

// In Window. 
var rawKey = asciiToUint8Array("16 bytes of key!");
crypto.subtle.importKey("raw", rawKey, {name: "aes-cbc", length: 128}, true, ["encrypt", "decrypt"]).then(function(localKey) {
    var worker = new Worker("crypto-worker.js");
    worker.onmessage = function(evt) {
        console.log("Received encrypted data.");
    }
    worker.postMessage(localKey);
});

// In crypto-worker.js.
var plainText = asciiToUint8Array("Hello, World!");
var aesCbcParams = {
    name: "aes-cbc",
    iv: asciiToUint8Array("jnOw99oOZFLIEPMr"),
}
onmessage = function(key)
{
    crypto.subtle.encrypt(aesCbcParams, key, plainText).then(function(cipherText) {
        postMessage(cipherText);
    });
}

A live example is here to demonstrate how asynchronous execution could help to make a more responsive website.

In addition to the four major areas of improvement above, some minor changes that are worth mentioning include:

  • CryptoKey interface enhancement includes renaming from Key to CryptoKey, making algorithm and usages slots cacheable, and exposing it to web workers.
  • HmacKeyParams.length is now bits instead of bytes.
  • RSA-OAEP can now import and export keys with SHA-256.
  • CryptoKeyPair is now a dictionary type.

Newly added cryptographic algorithms

Together with the new SubtleCrypto interface, this update also adds support for a number of cryptographic algorithms:

  1. AES-CFB: CFB stands for cipher feedback. Unlike CBC, CFB does not require the plain text be padded to the block size of the cipher.
  2. AES-CTR: CTR stands for counter mode. CTR is best known for its parallelizability on both encryption and decryption.
  3. AES-GCM: GCM stands for Galois/Counter Mode. GCM is an authenticated encryption algorithm designed to provide both data authenticity (integrity) and confidentiality.
  4. ECDH: ECDH stands for Elliptic Curve Diffie–Hellman. Elliptic curve cryptography (ECC) is an approach to public-key cryptography based on the algebraic structure of elliptic curves over finite fields. ECC requires smaller keys compared to RSA to provide equivalent security. ECDH is one among many ECC schemes. It allows two parties each of whom owns an ECC key pair to establish a shared secret over an insecure channel.
  5. ECDSA: ECDSA stands for Elliptic Curve Digital Signature Algorithm. It is another ECC scheme.
  6. HKDF: HKDF stands for HMAC-based Key Derivation Function. It transforms secrets into key, allowing to combine additional non-secret inputs when needed.
  7. PBKDF2:PBKDF2 stands for Password-Based Key Derivation Function 2. It takes a password or a passphrase along with a salt value to derive a cryptographic symmetric key.
  8. RSA-PSS: PSS stands for Probabilistic Signature Scheme. It is an improved digital signature algorithm for RSA.

This set of new algorithms not only adds new functionality, e.g. key derivation functions, but also benefits developers from higher efficiency and better security by replacing existing ones having the same functionalities. To demonstrate the benefits, sample code snippets written with selected new algorithms are presented in the following. Implementations under these examples are not written with the best practices and therefore are for demonstration only.

Example 1: AES-GCM

Prior, AES-CBC is the only available block cipher for encryption/decryption. Even though it does a great job for protecting data confidentiality, yet it doesn’t protect the authenticity (integrity) of the produced ciphers. Hence, it often bundles with HMAC-SHA256 to prevent silent corruptions of the ciphers. Here is the corresponding code snippet:

// Assume aesKey and hmacKey are imported before with the same raw key.
var plainText = asciiToUint8Array("Hello, World!");
var aesCbcParams = {
    name: "aes-cbc",
    iv: asciiToUint8Array("jnOw99oOZFLIEPMr"),
}

// Encryption:
// First encrypt the plain text with AES-CBC.
crypto.subtle.encrypt(aesCbcParams, aesKey, plainText).then(function(result) {
    console.log("Plain text is encrypted.");
    cipherText = result;

    // Then sign the cipher text with HMAC.
    return crypto.subtle.sign("hmac", hmacKey, cipherText);
}).then(function(result) {
    console.log("Cipher text is signed.");
    signature = result;

    // Finally produce the final result by concatenating cipher text and signature.       
    finalResult = new Uint8Array(cipherText.byteLength + signature.byteLength);
    finalResult.set(new Uint8Array(cipherText));
    finalResult.set(new Uint8Array(signature), cipherText.byteLength);
    console.log("Final result is produced.");
});

// Decryption:
// First decode the final result from the encryption step.
var position = finalResult.length - 32; // SHA-256 length
signature = finalResult.slice(position);
cipherText = finalResult.slice(0, position);

// Then verify the cipher text.
crypto.subtle.verify("hmac", hmacKey, signature, cipherText).then(function(result) {
    if (result) {
        console.log("Cipher text is verified.");

        // Finally decrypt the cipher text.
        return crypto.subtle.decrypt(aesCbcParams, aesKey, cipherText);
    } else
        return Promise.reject();
}).then(function(result) {
    console.log("Cipher text is decrypted.");
    decryptedText = result;
}, function() {
    // Error handling codes ...
});

So far, the codes are a bit complex with AES-CBC because the extra overhead of HMAC. However, it is much simpler to achieve the same authenticated encryption effect by using AES-GCM as it bundles authentication and encryption together within one single step. Here is the corresponding code snippet:

// Assume aesKey are imported/generated before, and the same plain text is used.
var aesGcmParams = {
    name: "aes-gcm",
    iv: asciiToUint8Array("jnOw99oOZFLIEPMr"),
}

// Encryption:
crypto.subtle.encrypt(aesGcmParams, key, plainText).then(function(result) {
    console.log("Plain text is encrypted.");
    cipherText = result; // It contains both the cipherText and the authentication data.
});

// Decryption:
crypto.subtle.decrypt(aesGcmParams, key, cipherText).then(function(result) {
    console.log("Cipher text is decrypted.");
    decryptedText = result;
}, function(error) {
    // If any violation of the cipher text is detected, the operation will be rejected.
    // Error handling codes ...
});

It is just that simple to use AES-GCM. This simplicity will definitely improve developers’ efficiency. A live example can also be found here to demonstrate how AES-GCM can prevent silent corruption during decrypting corrupted ciphers.

Example 2: ECDH(E)

Block ciphers alone are not sufficient to protect data confidentiality because secret (symmetric) keys need to be shared securely as well. Before this change, only RSA encryption was available for tackling this task. That is to encrypt the shared secret keys and then exchange the ciphers to prevent MITM attacks. This method is not entirely secure as perfect forward secrecy (PFS) is difficult to guarantee. PFS requires session keys, the RSA key pair in this case, to be destroyed once a session is completed, i.e. after a secret key is successfully shared. So the shared secret key can never be recovered even if the MITM attackers are able to record down the exchanged cipher and access the recipient in the future. RSA key pairs are very hard to generate, and therefore maintaining PFS is really a challenge for RSA secret key exchange.

However, maintaining PFS is a piece of cake for ECDH simply because EC key pairs are easy to generate. In average, it takes about 170 ms to generate a RSA-2048 key pair on the same test environment shown in the first section. On the contrary, it only takes about 2 ms to generate a P-256 EC key pair which can provide comparable security to a RSA-3072 alternative. ECDH works in the way that the involved two parties exchange their public keys first and then compute a point multiplication by using the acquired public keys and their own private keys, of which the result is the shared secret. ECDH with PFS is referred as Ephemeral ECDH (ECDHE). Ephemeral merely means that session keys are transient in this protocol. Since the EC key pairs involved with ECDH are transient, they cannot be used to confirm the identities of the involved two parties. Hence, other permanent asymmetric key pairs are needed for authentication. In general, RSA is used as it is widely supported by common public key infrastructures (PKI). To demonstrate how ECDHE works, the following code snippet is shared:

// Assuming Bob and Alice are the two parties. Here we only show codes for Bob's.
// Alice's should be similar.
// Also assumes that permanent RSA keys are obtained before, i.e. bobRsaPrivateKey and aliceRsaPublicKey.
// Prepare to send the hello message which includes Bob's public EC key and its signature to Alice:
// Step 1: Generate a transient EC key pair.
crypto.subtle.generateKey({ name: "ECDH", namedCurve: "P-256" }, extractable, ["deriveKey"]).then(function(result) {
    console.log("EC key pair is generated.");
    bobEcKeyPair = result;

    // Step 2: Sign the EC public key for authentication.
    return crypto.subtle.exportKey("raw", bobEcKeyPair.publicKey);
}).then(function(result) {
    console.log("EC public key is exported.");
    rawEcPublicKey = result;

    return crypto.subtle.sign({ name: "RSA-PSS", saltLength: 16 }, bobRsaPrivateKey, rawEcPublicKey);
}).then(function(result) {
    console.log("Raw EC public key is signed.");
    signature = result;

    // Step 3: Exchange the EC public key together with the signature. We simplify the final result as
    // a concatenation of the raw format EC public key and its signature.
    finalResult = new Uint8Array(rawEcPublicKey.byteLength + signature.byteLength);
    finalResult.set(new Uint8Array(rawEcPublicKey));
    finalResult.set(new Uint8Array(signature), rawEcPublicKey.byteLength);
    console.log("Final result is produced.");

    // Send the message to Alice.
    // ...
});

// After receiving Alice's hello message:
// Step 1: Decode the counterpart from Alice.
var position = finalResult.length - 256; // RSA-2048
signature = finalResult.slice(position);
rawEcPublicKey = finalResult.slice(0, position);

// Step 2: Verify Alice's signature and her EC public key.
crypto.subtle.verify({ name: "RSA-PSS", saltLength: 16 }, aliceRsaPublicKey, signature, rawEcPublicKey).then(function(result) {
    if (result) {
        console.log("Alice's public key is verified.");

        return crypto.subtle.importKey("raw", rawEcPublicKey, { name: "ECDH", namedCurve: "P-256" }, extractable, [ ]);
    } else
        return Promise.reject();
}).then(function(result) {
    console.log("Alice's public key is imported.");
    aliceEcPublicKey = result;

    // Step 3: Compute the shared AES-GCM secret key.
    return crypto.subtle.deriveKey({ name: "ECDH", public: aliceEcPublicKey }, bobEcKeyPair.privateKey, { name: "aes-gcm", length: 128 }, extractable, ['decrypt', 'encrypt']);
}).then(function(result) {
    console.log("Shared AES secret key is computed.");
    aesKey = result;

    console.log(aesKey);

    // Step 4: Delete the transient EC key pair.
    bobEcKeyPair = null;
    console.log("EC key pair is deleted.");
});

In the above example, we omit the way how information, i.e. public keys and their corresponding parameters, is exchanged to focus on parts that WebCrypto API is involved. The ease to implement ECDHE will definitely improve the security level of secret key exchanges. Also, a live example to tell the differences between RSA secret key exchange and ECDH is included here.

Example 3: PBKDF2

The ability to derive a cryptographically secret key from existing secrets such as password is new. PBKDF2 is one of the newly added algorithms that can serve this purpose. The derived secret key from PBKDF2 not only can be used in the subsequent cryptographically operations, but also itself is a strong password hash given it is salted. The following code snippet demonstrates how to derive a strong password hash from a simple password:

var password = asciiToUint8Array("123456789");
var salt = asciiToUint8Array("jnOw99oOZFLIEPMr");

crypto.subtle.importKey("raw", password, "PBKDF2", false, ["deriveBits"]).then(function(baseKey) {
    return crypto.subtle.deriveBits({name: "PBKDF2", salt: salt, iterations: 100000, hash: "sha-256"}, baseKey, 128);
}).then(function(result) {
    console.log("Hash is derived!")
    derivedHash = result;
});

A live example can be found here.

The above examples are just a tip of capabilities of WebCrypto API. Here is a table listing all algorithms that WebKit currently supports, and corresponding permitted operations of each algorithm.

Algorithm name encrypt decrypt sign verify digest generateKey deriveKey deriveBits importKey** exportKey** wrapKey unwrapKey
RSAES-PKCS1-v1_5***
RSASSA-PKCS1-v1_5
RSA-PSS
RSA-OAEP
ECDSA*
ECDH*
AES-CFB
AES-CTR
AES-CBC
AES-GCM
AES-KW
HMAC
SHA-1***
SHA-224
SHA-256
SHA-384
SHA-512
HKDF
PBKDF2
* WebKit doesn’t support P-521 yet, see bug 169231.
** WebKit doesn’t check or produce any hash information from or to DER key data, see bug 165436, and bug 165437.
*** RSAES-PKCS1-v1_5 and SHA-1 should be avoided for security reasons.

Transition to the New SubtleCrypto Interface

This section covers some common mistakes that web developers have made when they have tried to maintain compatibility to both WebKitSubtleCrypto and SubtleCrypto, and then we present recommended fixes to those mistakes. Finally, we summarize those fixes into a de facto rule to maintain compatibility.

Example 1:

// Bad code:
var subtleObject = null;
if ("subtle" in self.crypto)
    subtleObject = self.crypto.subtle;
if ("webkitSubtle" in self.crypto)
    subtleObject = self.crypto.webkitSubtle;

This example wrongly prioritizes window.crypto.webkitSubtle over window.crypto.subtle. Therefore, it will overwrite the subtleObject even that the subtle object actually exists. A quick fix for it is to prioritizes window.crypto.subtle over window.crypto.webkitSubtle.

// Fix:
var subtleObject = null;
if ("webkitSubtle" in self.crypto)
    subtleObject = self.crypto.webkitSubtle;
if ("subtle" in self.crypto)
    subtleObject = self.crypto.subtle;

Example 2:

// Bad code:
(window.agcrypto = window.crypto) && !window.crypto.subtle && window.crypto.webkitSubtle && (console.log("Using crypto.webkitSubtle"), window.agcrypto.subtle = window.crypto.webkitSubtle);
var h = window.crypto.webkitSubtle ? a.utils.json2ab(c.jwkKey) : c.jwkKey;
agcrypto.subtle.importKey("jwk", h, g, !0, ["encrypt"]).then(function(a) {
    ...
});

This example incorrectly pairs window.agcrypto and the latter jwkKey. The first line prioritizes window.crypto.subtle over window.crypto.webkitSubtle, which is correct. However, the second line prioritizes window.crypto.webkitSubtle over window.crypto.subtle again.

// Fix:
(window.agcrypto = window.crypto) && !window.crypto.subtle && window.crypto.webkitSubtle && (console.log("Using crypto.webkitSubtle"), window.agcrypto.subtle = window.crypto.webkitSubtle);
var h = window.crypto.subtle ? c.jwkKey : a.utils.json2ab(c.jwkKey);
agcrypto.subtle.importKey("jwk", h, g, !0, ["encrypt"]).then(function(a) {
    ...
});

A deeper analysis of these examples reveals they both assume window.crypto.subtle and window.crypto.webkitSubtle cannot coexist and therefore wrongly prioritize one over the other. In summary, developers should be aware of the coexistence of these two interfaces and should always prioritize window.crypto.subtle over window.crypto.webkitSubtle.

Feedback

In this blog post, we reviewed WebKit’s update to the WebCrypto API implementation which is available on macOS, iOS, and GTK+. We hope you enjoy it. You can try out all of these improvements in the latest Safari Technology Preview. Let us know how they work for you by sending feedback on Twitter (@webkit, @alanwaketan, @jonathandavis) or by filing a bug.

By Jiewen Tan at July 21, 2017 06:30 PM

July 12, 2017

Release Notes for Safari Technology Preview 35

Surfin’ Safari

Safari Technology Preview Release 35 is now available for download for macOS Sierra and betas of macOS High Sierra. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. This release covers WebKit revisions 218629-219131.

Performance

  • Fixed a 50% regression on MotionMark Suites when extended color support was added (r218717)
  • Fixed the bug that Speedometer’s score worsens by 40% when accessibility features are enabled (r218910)

Media

  • Changed image decoding so that when an image appears more than once on a page, decoding to paint one instance repaints them all (r219045)
  • Fixed a frame rate issue that caused getUserMedia to fail on some machines (r218852)
  • Fixed allowing media element to update its state when Mission Control closes the fullscreen window (r218813)
  • Made a change to hide volume controls when AirPlay is active (r218891)
  • Prevented capturing at unconventional resolutions when using the software encoder (r218699)
  • Prevented clearing capture mute from clearing audio mute (r218632)

Web Inspector

  • Added a small delay before showing the progress spinner when loading resources (r219017)
  • Added a toggle button to the left side of the split console navigation bar (r218839)
  • Fixed initial search sometimes being performed twice, producing duplicate results (r219021)
  • Fixed slowness when pausing with a deep call stack by avoiding eagerly generating object previews (r218718)
  • Flipped all go-to-arrow instances in right-to-left mode (r218777)
  • Fixed script timeline bubbles that sometimes appear to miss large events (r218781)
  • Fixed a hang when using “break on all exceptions” throws a stack overflow (r218652)
  • Improved type token background color in the debugger (r219041)

JavaScript

  • Cleaned up Object.entries implementation (r218790)
  • Implemented Object Rest Destructuring (r218861)
  • Made Object.values faster by writing it in C++ (r218697)
  • Removed Reflect.enumerate (r218784)

Accessibility

  • Fixed calling setValue() on contenteditable for ARIA text controls (r218986)
  • Fixed role="none" or role="presentation" on an <iframe> (r219075)

WebAssembly

  • Disabled some WebAssembly APIs under CSP (r218951)

WebCrypto

  • Fixed a backward compatibility issue with CryptoKey objects stored in the IndexedDB (r218666)

Web APIs

  • Fixed a TypeError in the Fetch API when called with body === {} (r218677)
  • Fixed an issue causing Safari to leave a popup window opened during the beforeunload event (r219039)
  • Included audio/vnd.wave as a valid mime-type for wav files (r218634)

CSS

  • Added support for structured serialization of CSS Geometry types (r218644)
  • Fixed @font-face rules with invalid primary fonts never downloading their secondary fonts (r218733)
  • Fixed applying font features only for the particular type of font they are being applied to (r218919)
  • Fixed CSS text properties affecting <video> shadow root (r218655)

By Jon Davis at July 12, 2017 05:00 PM

July 04, 2017

Charlie Turner: Qt Creator Tips for WebKit GTK

Igalia WebKit

Introduction

Using an IDE on GNU/Linux for C/C++ development is slightly contentious in many circles. People either seem to not find IDE’s value-add worthwhile compared to their cult text editor + UNIX tools, or have tried them in the past and not had good experiences, so soilder on with the cult text editor approach. I’ve tended to be in the latter camp of people, knowing that in a perfect world an IDE would help me, but they don’t seem to be up-to-snuff in this environment yet.

I’d limped along with taggers like GNU global and etags alongside Emacs. Cortored find+grep commands wired into Emacs’ helm package were my “Find all references” and “Jump to definition”. It worked to an extent, but it does feel a little primitive, and GUD always frustrated me. New semantic taggers such as SourceWeb and rtags looked interesting and hope they continue to mature, but I was struggling getting WebKit through them. The Clang tooling is also rather slow at processing the source files, upon which both these tools are based.

You can make Qt Creator build and install WebKit into a jhbuild for, say, Epiphany. I describe those steps if you’re inclined have that full IDE experience. The instructions below are annotated with [Building?] for steps that are applicable only to that configuration. I don’t personally do this because I prefer to run the build/install commands outside the IDE. With those introductions out of the way, what I’ve ended up with is a decent code navigator (alternative to Eclipse!) and a good debugger frontend. I’m happier with the combination of cult editor + Qt Creator for working on C++ projects. It’s not perfect, but I hope you might find it useful as well.

First get Qt Creator installed. If using your distributions package manager, just check the version is fairly recent. If it isn’t, download if from the Qt Creator site, but during the installation process, I recommend not installing the Qt libraries.

Load the project into Qt Creator

  • Always run Qt Creator from the WebKit jhbuild environment. E.g., ./Tools/jhbuild/jhbuild-wrapper --gtk run /usr/bin/qtcreator. If you don’t, CMake will find all kinds of random junk it calls dependencies on your system, if you’re lucky.
  • Go to File > Open File or Project.
  • Navigate to the top-level CMakeLists.txt in the WebKit checkout.
  • In the Configure Project screen, change the build directory output to taste. If you’re not planning on building from IDE, this doesn’t matter.
  • [Building?] For build directories, you could put it in $WEBKIT_ROOT/WebKitBuild/{Release,Debug} to match WebKit conventions. Don’t bother with the other configurations, especially not RelWithDebInfo, there are problems in WebKit with this configuration. Now click on Manage in the Desktop kit page (it’s one of those buttons that magically appears on the screen when you hover over it…), scroll down to CMake Configuration and click Change. Remove the Qt install prefix definitions, and add CMAKE_INSTALL_PREFIX:INTERNAL=/path/to/jhbuild/install/ and CMAKE_INSTALL_LIBDIR:INTERNAL=lib. Note carefully that shell variables are not expanded here, so don’t use something like $HOME. You can also change the compiler and debugger used by the kit as well. Also make sure Qt version is None.
  • Once you get into the IDE after these steps, CMake will fail because we haven’t specified a port.
  • From the mode selector, click Projects (thing with the wrench icon) and go to the build settings. Set PORT=GTK and also add a boolean property for DEVELOPER_MODE=ON and ENABLE_WAYLAND_TARGET=OFF if you’re working on X11.
  • [Building?] Click Advanced if you’d like to change the default compiler switches for Debug/Release configurations. With GCC I like to use -ggdb -Og in Debug.
  • Configuring should now suceed
  • [Building?] Click Build and it will likely fail. I’ve found Qt Creator needs to be restarted at this point. Restart and the build should now work.
  • [Building?] As a finishing touch, you can configure the run configuration for launching Epiphany. In the Run Settings window under the projects mode, create a custom deploy step to run command jhbuild with arguments run cmake -P cmake_install.cmake. This will install WebKit in the jhbuild environment. Now add a custom executable and specify the executable to be jhbuild and the arguments to be run epiphany. The Run button will now install WebKit for use by Epiphany and launch the browser ready for attachment (see next section).

Debugging

  • Due to the multi-process nature of WebKit, you can’t just click on “Start Debugging”, since there’s several processes you might want to attach to. Launch WebKit and once it’s running, go to Debug > Start Debugging > Attach to Running Application and select the PID of the process you’d like to attach to.
  • It’s likely Qt Creator will time out the GDB launch and ask if you’d like to give it more time. Say yes, and go to Tools > Options > Debugger > GDB and bump the timeout up to 60 seconds.
  • If you’re getting assembly instructions when you hit a breakpoint, it’s likely your source isn’t getting found with the debugger. This shouldn’t happen to you, but if it does you’ll want to add ../../Source -> $WEBKIT_CHECKOUT/Source source mapping. This can be done in Tools > Options > Debugger > General. The build system doesn’t force
    the compiler to emit absolute paths in debugging info (there are ways around that, but this is easier)
  • GDB commands can be issued by bringing up the poorly named “Debugger Log” in the debugger views menu. Some helpful commands I’ve used on WebKit are handle SIGUSR1 noprint to stop being interrupted by IPC, and set scheduler-locking on to single-step through the current thread (you really don’t want to enable that from the start though 😉 just use it in the middle of debug session when you want to step a thread).
  • Everything else I’ve found convenient to do via the IDE.

Issues

  • Header files don’t have their #if parsed properly, I think because the config.h is indirectly available to header files, which is really unfriendly to static analysis tools used by IDEs. This is with the default code model, I’m sure it would be better if you try the Clang code model, but the current support for that in Qt Creator is limited, and the tradeoff is much, much slower indexing. This isn’t really an issue with the IDE but rather the coding style guidelines of WebKit.
  • Switching kits often requires restarting the IDE, otherwise you get build step errors. I’m guessing this has something to do with the CMake caching the IDE uses. When in doubt, restart the IDE.
  • When you do an expensive interaction with the code model, it blocks the UI thread rendering the whole IDE unresponsive. This is much worse with the Clang code model because it’s so much slower than the default. Can be a problem with the Qt Code Model if you ask for things like the type hierachy.

By cturner at July 04, 2017 12:58 PM

July 03, 2017

A Closer Look Into WebRTC

Surfin’ Safari

We recently announced WebRTC support in Safari 11 on High Sierra and iOS 11 in our last WebKit blog post. Today, we would like to dive into more details of our implementation, and provide some tips on bringing WebRTC support to your website.

A website employing WebRTC and media capture can obtain and broadcast very personal information. Users must explicitly grant their trust to the website, and assume that their pictures and voices are used appropriately. WebKit requires websites to meet certain conditions in order to use the technologies and protect the privacy of its users. Additionally, Safari shows users when their capture devices are used, and gives users ways they can control a website’s access to their capture devices. For developers that use WebKit in their apps, RTCPeerConnection and RTCDataChannel are available in any web view, but access to the camera and microphone is currently limited to Safari.

Develop Menu

Safari Technology Preview 34 exposes various flags to make it easier for you to test your WebRTC website or integrate Safari in your continuous integration systems through the Develop > WebRTC sub-menu:

WebRTC Menu

We’ll go through each of these flags and explain how they can help you in your development below.

In addition, WebKit logs WebRTC state to the system log, which includes SDP offers and answers, ICE candidates, WebRTC statistics, and incoming and outgoing video frame counters.

Security Origin Policy for Media Capture

Websites that wish to access capture devices need to meet two constraints.

First, the document requesting the camera and microphone needs to come from a HTTPS domain. Since that can be burdensome when you’re developing and testing locally, you can bypass the HTTPS restriction by checking “Allow Media Capture on Insecure Sites” in the Develop > WebRTC menu.

Second, when a sub-frame requests a media capture device, the chain of frames leading to the main frame needs to come from the same secure origin. The user may not recognize the sub-frame’s third-party origin in relation to the main frame, so this constraint avoids confusing the user about whom the user is granting access to.

Mock Capture Devices

In the Develop > WebRTC menu, you can select “Use Mock Capture Devices” to replace the use of real capture devices with a mock one. The mock loops a bip-bop AV stream, as displayed below. When used as an incoming stream, the mock’s predictable data makes it easy to evaluate aspects of streaming media playback including synchronization, latency, and selection of input device.

Bip-Bop AV Mock Loop

The mock can also be useful for running automated tests in a continuous integration system. If you are using one and want to avoid prompts from getUserMedia, set the camera and microphone policy for the website to Allow through the Safari Preferences… > Websites panel.

ICE Candidate Restrictions

ICE candidates are exchanged at an early stage of a WebRTC connection to identify all possible network paths between two peers. To do this, WebKit must expose ICE candidates of each peer to websites so that they can be shared. ICE candidates expose IP addresses, and notably those that are host IP addresses can be used for tracking.

In many network topologies, however, host ICE candidates are not necessary to make the connection. Server Reflexive and TURN ICE candidates usually suffice for ensuring the connection, regardless of whether it is used for exchanging video or arbitrary data. Without access to capture devices, WebKit only exposes Server Reflexive and TURN ICE candidates, which expose IPs that could already be gathered by websites. When access is granted, WebKit will expose host ICE candidates, which maximizes the chance the connection succeeds and is efficient. We make this exception since we believe that the user is expressing a high level of trust to the website by granting access to his or her capture streams.

Some test pages may assume the availability of host ICE candidates. To test this, turn on “Disable ICE Candidate Restrictions” from the Develop > WebRTC menu, and reload the page.

Legacy WebRTC and Media Streams API

Through the WebRTC standardization process, the RTCPeerConnection API progressively improved in various ways. Initially callback-based, the API changed to being fully promise-based. API initially focused on MediaStream moved to MediaStreamTrack. Thanks to the upstream effort by the WebRTC in WebKit team, the RTCPeerConnection API was aligned with these two major changes.

We have turned the legacy WebRTC APIs off by default on Safari Technology Preview 34, and plan to ship Safari 11 on macOS High Sierra and iOS 11 without these APIs. Keeping the legacy API around limits our ability to move forward faster on WebRTC. Any website looking to bring support to Safari may need to make other adjustments, so this is as good a time as ever to move away from these legacy APIs. Existing websites may still rely on these legacy APIs, which you can check by turning on “Enable Legacy WebRTC API” in the Develop > WebRTC menu.

More precisely, the following APIs are only available with the legacy API switch turned on, with suggestions for how to update:

partial interface Navigator {
    // Switch to navigator.mediaDevices.getUserMedia
    void getUserMedia(MediaStreamConstraints constraints, NavigatorUserMediaSuccessCallback successCallback, NavigatorUserMediaErrorCallback errorCallback);
};

partial interface RTCPeerConnection {
    // Switch to getSenders, and look at RTCRtpSender.track
    sequence<MediaStream> getLocalStreams();
    // Switch to getReceivers, and look at RTCRtpReceiver.track
    sequence<MediaStream> getRemoteStreams();

    // Switch to getSenders/getReceivers
    MediaStream getStreamById(DOMString streamId);
    // Switch to addTrack
    void addStream(MediaStream stream);
    // Switch to removeTrack
    void removeStream(MediaStream stream);

    // Listen to ontrack event
    attribute EventHandler onaddstream;

    // Update to promise-only version of createOffer
    Promise<void> createOffer(RTCSessionDescriptionCallback successCallback, RTCPeerConnectionErrorCallback failureCallback, optional RTCOfferOptions options);
    // Update to promise-only version of setLocalDescription
    Promise<void> setLocalDescription(RTCSessionDescriptionInit description, VoidFunction successCallback, RTCPeerConnectionErrorCallback failureCallback);
    // Update to promise-only version of createAnswer
    Promise<void> createAnswer(RTCSessionDescriptionCallback successCallback, RTCPeerConnectionErrorCallback failureCallback);
    // Update to promise-only version of setRemoteDescription
    Promise<void> setRemoteDescription(RTCSessionDescriptionInit description, VoidFunction successCallback, RTCPeerConnectionErrorCallback failureCallback);
    // Update to promise-only version of addIceCandidate
    Promise<void> addIceCandidate((RTCIceCandidateInit or RTCIceCandidate) candidate, VoidFunction successCallback, RTCPeerConnectionErrorCallback failureCallback);
};

Many sites polyfill API support through the open source adapter.js project. Updating to the latest release is one way to cover for the API gap, but we recommend switching to the APIs listed in the specification.

Here are a couple examples of how to use the latest APIs. A typical receive-only/webinar-like WebRTC call could be done like this:

var pc = new RTCPeerConnection();
pc.addTransceiver('audio');
pc.addTransceiver('video');
var offer = await pc.createOffer();
await pc.setLocalDescription(offer);
// send offer to the other party
...

And a typical audio-video WebRTC call could be done like this:

var stream = await navigator.mediaDevices.getUserMedia({audio: true, video: true});
var pc = new RTCPeerConnection();
var audioSender = pc.addTrack(stream.getAudioTracks()[0], stream);
var videoSender = pc.addTrack(stream.getVideoTracks()[0], stream);
var offer = await pc.createOffer();
await pc.setLocalDescription(offer);
// send offer to the other party
...

MediaStreamTrack-based APIs make sense as most of the handling is done at this level. Let’s say the 640×480 default resolution of the capture video track is not good enough. Continuing with the previous example, changing it dynamically can be done as follows:

videoSender.track.applyConstraints({width: 1280, height: 720});

Or we might want to mute the video but keep the audio flowing:

videoSender.track.enabled = false;

Oh but wait, let’s say that we actually want to apply some cool filter effects to the current video track, like in this example. All that is needed is a few function calls that will not require any SDP renegotiation:

videoSender.track.enabled = true;
renderWithEffects(video, canvas);
videoSender.replaceTrack(canvas.captureStream().getVideoTracks()[0]);

Access to Capture Streams

Safari allows users to have complete control over a website’s access to their capture devices.

First, the user is prompted to grant website access to capture devices when getUserMedia is first called. Unlike other browsers, however, Safari does not require the user to choose specific devices; instead the prompt requests access for all devices of a specific type, like all cameras or microphones. This reduces fatigue from being prompted multiple times, and potentially avoids training the user to always tap “Allow”. One common case where this could happen is switching between the front and back cameras of iOS devices. The resolved promise in getUserMedia returns a device that fulfills the constraints, and subsequent calls to getUserMedia for the same device type will avoid presenting additional prompts to the user. If you want to allow the user to switch to a different device, be sure to provide UI to do so.

Second, the user can decide to always allow or deny access to the camera and microphone through Safari preferences. The user may do this on a per-origin basis and can even set a general policy for all websites.

Third, once a website creates a MediaStream for a device, icons appear in the Safari UI and the system menu bar indicating that capture devices are being used. The user may click or tap that icon to pause the camera and microphone mid-stream. Here WebKit will send silent audio and black video frames, and your website can present appropriate UI by listening for the mute and unmute events on MediaStreamTrack.

Active Capture Devices Icons

Finally, to avoid unexpected capture, WebKit only allows one tab to capture video or audio at a time. Tabs already using capture devices will see their MediaStreamTracks silenced and receive the mute event when a new tab gains access.

Fingerprinting

navigator.mediaDevices.enumerateDevices exposes the list of capture devices available, and can be queried by websites even when access to those devices is not granted. For users that have custom camera and microphone setups, this can add to a user’s fingerprinting surface. When access hasn’t yet been requested or is explicitly denied, WebKit avoids exposing this additional information by returning a default list of devices that may not correspond to the actual set of devices available. In addition, the devices have missing labels as per the specification. Once access is granted, the full list of devices and their labels are available.

Media Capture and Autoplay Video

In previous posts we’ve discussed changes in autoplay policies for video on macOS and iOS. We’ve adjusted the policies on both platforms to accommodate WebRTC applications, where it is common to want to autoplay incoming media streams that include audio. To address these scenarios while retaining the benefits of the current autoplay rules, the following changes have been made:

  • MediaStream-backed media will autoplay if the web page is already capturing.
  • MediaStream-backed media will autoplay if the web page is already playing audio. A user gesture will still be required to initiate audio playback.

Performance

WebRTC is a very powerful feature that can have numerous applications. We all know that with great power comes great responsibility. Designing WebRTC applications require to have efficiency in mind from the start. CPU, memory and network have limits that can seriously affect user experience. This problem should be tackled by both the web engine and the web application. On the web application side, various mechanisms are already available: choosing the right video resolution and frame rate, selecting the right video codec profile, using CVO, muting tracks at the source, and performing client-side monitoring of WebRTC statistics.

Feedback

That concludes our deep dive into WebRTC and media capture. Our door is always open to hear feedback from you. File a bug, email web-evangelist@apple.com, or tweet to @webkit.

By Youenn Fablet, Jon Lee at July 03, 2017 05:00 PM

June 28, 2017

Release Notes for Safari Technology Preview 34

Surfin’ Safari

Safari Technology Preview Release 34 is now available for download for macOS Sierra and betas of macOS High Sierra. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. This release covers WebKit revisions 217978-218629.

WebRTC

  • Added WebRTC options to the Developer menu
  • Disabled Legacy WebRTC API in the Experimental Features menu by default (r218169)
  • Changed behavior to close NetworkProcess WebRTC sockets as soon as the Web Process no longer needs them (r218432)
  • Added support for receive-only SDP offers through addTransceiver (r218431)
  • Changed handling capture status based on MediaStreamTrack (r218399)
  • Changed RTCPeerConnection to return RTCSessionDescriptionInit instead of RTCSessionDescription (r218335)
  • Fixed a cloned MediaStreamTrack to not mute the other tracks using the same source (r218497)
  • Fixed RTCPeerConnection getReceivers() to return transceivers that have an active receiver but no active sender (r218182)
  • Fixed the screen going into sleep mode during WebRTC video (r218151)

Media

  • Fixed high CPU usage when entering fullscreen or seeking during MSE video playback (r218463)
  • Fixed seeking during MSE video playback where audio would begin playing long before rendering the video (r218150)
  • Fixed video flashing black when switching back to a tab (r218291)
  • Improved media controls rendering for long-loading media files (r218600)
  • Prevented media elements continuing to load media data after navigation (r218016)

JavaScript

  • Made Object.assign faster by rewriting it in C++ (r218348)
  • Reduced Structure size (r218070)
  • Updated RegExp.prototype.[@@search]] implementation according to the latest specifications (r218051)
  • Fixed PreTypedArray constructor with a string to not throw an exception (r218082)

Security

  • Applied img-src CSP directive to favicon loads (r218015, r218026)
  • Implemented W3C Secure Contexts Draft Specification (r218027, r218028, r218155, r218196)
  • Restricted filtered painting across cross-origin boundaries with transforms (r218300)
  • Added allow-popups-to-escape-sandbox attribute support for <iframe> elements (r218000)
  • Added Subresource Integrity as an experimental feature (r217996)

Web Inspector

  • Added grid to image previews to clarify transparency and image size (r218159)
  • Fixed console message icons that overlap the source location (r218243)
  • Fixed pretty print, type info, and code coverage buttons disappearing after switching tabs (r218305)
  • Fixed SVG files and favicon files that don’t display properly (r218298)
  • Fixed the search highlight not showing up in resources when paused (r218359)
  • Fixed showing non-shadow children of an element with a shadow root (e.g. <video>) (r218020)

Web API

  • Fixed the meter element not respecting the writing direction (r218468)
  • Fixed WebGPU contexts to have a back reference to the canvas element (r218624)
  • Fixed CSS transitions added while page is not visible so they start animating when the page becomes visible (r217997)
  • Fixed IndexedDB.getAll() use inside a Web Worker (r218041)

WebCrypto

  • Moved SubtleCrypto from the experiemental features menu (r218129)
  • Removed unsupported AES_CMAC, DH, and CONCAT (r218030)

WebAssembly

  • Fixed several miscellaneous bugs found from web platform tests (r218216)

Rendering

  • Added an experimental feature setting for asynchronous frame scrolling (r218534)

Accessibility

  • Exposed the inline property as an accessibility attribute (r218226)

Bug Fixes

  • Fixed mint.com header rendering incorrectly when initially loaded (r218257)
  • Fixed scrubbing backward on a YouTube video

By Jon Davis at June 28, 2017 05:00 PM

June 15, 2017

Michael Catanzaro: Debian Stretch ships latest WebKitGTK+

Igalia WebKit

I’ll keep this update short. Debian has decided to ship the latest version of WebKitGTK+, 2.16.3, in its upcoming Stretch release. Since Debian was the last major distribution holding out on providing WebKit security updates, this is a big deal. Huge thanks to Jeremy Bicha for making this possible.

The bad news is that Debian is still considering whether or not to provide periodic security updates after the release, so there might not be any. But maybe there will be. We will have to wait and see. At least releasing with the latest version is a big step in the right direction.

By Michael Catanzaro at June 15, 2017 04:34 PM

June 14, 2017

Release Notes for Safari Technology Preview 33

Surfin’ Safari

Safari Technology Preview Release 33 is now available for download for macOS Sierra and betas of macOS High Sierra. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. This release covers WebKit revisions 217407-217978.

Performance

  • Fixed an issue that could lead to large memory use in the Safari Technology Preview and web content processes when certain Safari Extensions are installed

JavaScript

  • Fixed for-in optimization static analysis in the bytecode generator (r217438)
  • Improved performance of String.prototype.concat (r217648)
  • Improved the bytecode and type information provided for toLength (r217530)
  • Optimized Map and Set constructors (r217525)

WebRTC

Media Streams and Capture

  • Fixed getUserMedia prompting too often (r217910)
  • Prevented getUserMedia requests from background tabs unless the tab is already capturing (r217930)
  • Prevented getUserMedia from prompting again if the user denied access (r217945)

Media

  • Enabled clients to specify a list of codecs that should require hardware decode support (r217799)
  • Exempted exclusively wall-powered devices from client-required hardware codec support (r217906)
  • Aligned Web Audio implementation to specifications when clients pass a value of 0 for bufferSize to the createScriptProcessor() method (r217919)

CSS Grid

  • Added support for orthogonally positioned grid items (r217486)
  • Fixed the behavior of positioned items without specific dimensions (r217411)
  • Fixed the logical margin applied in the tracks sizing algorithm of auto tracks (r217709)
  • Fixed margin applied when stretching an orthogonal item in a fixed size track (r217705)

Web API

  • Aligned <col span> and <colgroup span> limits with the latest HTML specification (r217907)
  • Fixed null content-type and content-length when fetching Blob URLs with XHR (r217901)
  • Fixed getComputedStyle() to return pixel values for left, right, top, and bottom, matching the specifications (r217522)
  • Implemented fromFloat32Array, fromFloat64Array, toFloat32Array, and toFloat64Array for DOMMatrix (r217764)
  • Implemented DOMPointReadOnly.matrixTransform() (r217763)
  • Enabled script modules to be imported via data URLs (r217760)
  • Updated to slightly stricter rules for custom element names from the recent draft standards (r217864)
  • Used the parent box style to adjust RenderStyle for alignment (r217536)
  • Added conditional support for media preloading and aligned media as values (r217863)
  • Aligned preload implementation to specifications with mandatory as value and other alignments (r217962)

Rendering

  • Changed behavior to remove backing store for layers that are outside the viewport (r217696)
  • Fixed an issue where a frame’s composited content is visible when the frame has visibility:hidden (r217472)
  • Changed behavior to destroy the associated renderer subtree when display:contents node is deleted (r217794)

Web Inspector

  • Added contextual menu item to log a WebSocket object to the console (r217912)
  • Added Debug view to the Settings tab for debug settings and experimental features (r217625)
  • Added the ability for the user to choose stylesheet when creating new rules (r217911)
  • Changed the Node Details Sidebar to allow editable key and values in the Attributes table (r217744)
  • Prevented unnecessary layout triggered when a Sidebar is resized while collapsed (r217452)
  • Fixed performing search on reload for an existing query in the Search tab (r217733)
  • Fixed images dragged from Web Inspector to Desktop being named “Unknown.png” (r217584)
  • Fixed an issue where reloading the page after switching from the Resource tab switches back (r217505)
  • Fixed the CodeMirror instance in the ConsolePrompt getting refreshed each time it is shown (r217746)
  • Fixed showing active Web Sockets when opening Web Inspector (r217721)
  • Fixed an issue preventing the go-to arrow to inspect Web Sockets that receive >50 messages per second (r217690)
  • Improved the reliability of automatically pausing in auto-attach when inspecting a JSContext (r217509)

Bug Fixes

  • Enabled AirPods to work with Netflix (r217858)
  • Fixed stuttering audio in YouTube when the page changes visibility (r217936)

By Jon Davis at June 14, 2017 08:05 PM

June 08, 2017

Auto-Play Policy Changes for macOS

Surfin’ Safari

This is a guest post from the Safari team about changes to how Safari handles HTML5 video and audio auto-play. These changes may affect compatibility with your websites.

As follow up to Safari’s updated video policies for iOS last year, this week we announced we are making important changes to auto-play policies on the Mac. These changes provide users the ability to browse the web with fewer distractions, particularly in the form of relief from websites that auto-play with sound.

New Behaviors

The WebKit engine provides rich flexibility for user agents to configure and manage media auto-play policies. Safari in macOS High Sierra uses an automatic inference engine to block media elements with sound from auto-playing by default on most websites. Safari 11 also gives users control over which websites are allowed to auto-play video and audio by opening Safari’s new “Websites” preferences pane, or through the “Settings for This Website…” option in the Safari menu. Further, a new power-saving feature prevents silent videos from auto-playing when either hidden in a background tab or otherwise off-screen.

Best Practices for Web Developers

Websites should assume any use of <video> or <audio> requires a user gesture click to play. It’s important to detect if auto-play was denied, since users now have the ability to turn off all forms of auto-play, including silent videos. The code snippet below shows just how easy this is to check. It’s worth pointing out that the new auto-play policies apply both to video used as a tool for making something visual happen (like background video or video-as-animated-gif) and also to video that serves as consumable content. We recommend web developers test their site with these new behaviors, ensuring any custom media controls behave correctly when auto-play is disabled, either automatically by Safari or by users.

  • Websites using WebKit’s built-in media controls automatically work great with the new policies.
  • Don’t assume a media element will play, and don’t show the pause button from the start. Look at the Promise returned by the play function on HTMLMediaElement to see if it was rejected:
var promise = document.querySelector('video').play();

if (promise !== undefined) {
    promise.catch(error => {
        // Auto-play was prevented
        // Show a UI element to let the user manually start playback
    }).then(() => {
        // Auto-play started
    });
}
  • Auto-play restrictions are granted on a per-element basis. Change the source of the media element instead of creating multiple media elements if you want to play multiple videos back to back (or play a pre-roll ad with sound, followed by the main video).
  • Don’t play ads without showing media controls because they may not auto-play and users will have no way of starting playback.
  • Remember that audio tracks that render silence are still audio tracks, and their existence affects whether a video will auto-play at all. In these cases, a video with silent audio tracks won’t play. The audio track should be removed or, alternatively, the muted attribute can be set on the media element.

Feedback

We’d love to hear from developers on these new auto-play policies. Please reach out to our Web Technologies Evangelist, Jon Davis on Twitter to share your thoughts with our team.

By Kevin Decker at June 08, 2017 05:00 PM

June 07, 2017

Announcing WebRTC and Media Capture

Surfin’ Safari

Today we are thrilled to announce WebKit support for WebRTC, available on Safari on macOS High Sierra, iOS 11, and Safari Technology Preview 32. In this post, we will go through an overview of our implementation. We will have future posts that cover more best practices for developers.

When talking about WebRTC, we immediately think about making a video conference call. There are two steps to start a call. First, WebKit needs access to the user’s camera and microphone. HTTPS websites use the Media Capture and Streams API on Safari for that purpose. Once the user grants permission through a prompt, capture streams start to flow. These streams can be tailored to websites’ needs through the use of constraints.

Second, the user joins the conversation, once he checks to make sure his hair looks good on-screen. Websites send these capture streams to the other participants. The user obviously wants to see the other participants’ coifs as well, so this is where the WebRTC API kicks in. WebKit finds and creates the optimal network channel to connect those streams.

To handle the networking layer, WebKit chose the LibWebRTC open source framework. LibWebRTC provides both a high level of interoperability and a rich set of streaming features for efficient video conferencing. Safari supports modern audio codecs such as Opus, and with the H.264 video codec takes full advantage of power-efficient hardware.

Currently, Safari supports legacy WebRTC APIs. Web developers can check whether their websites conform to the latest specifications by toggling the STP Experimental Features menu item “Remove Legacy WebRTC API”. Legacy WebRTC APIs will be disabled by default on future releases. Websites that need to accommodate older implementations of the WebRTC and Media Capture specifications can take advantage of polyfill libraries like adapter.js.

There are a lot of ingredients that go into a good WebRTC recipe. To make sure we had what we needed to make video conferencing possible, a few partners joined the party to bring Safari support to their products. Kudos to TokBox and BlueJeans for getting betas available today.

The next generation of communication technologies is here, and we’re really excited to see them in WebKit and on Apple platforms. We want to hear your feedback! File a bug, email web-evangelist@apple.com, or tweet to @webkit.

By Youenn Fablet at June 07, 2017 05:00 PM

Safari Technology Preview 32, with WebRTC, is Now Available

Surfin’ Safari

Safari Technology Preview Release 32 is now available for download for macOS Sierra. With this release, Safari Technology Preview is now available for betas of macOS High Sierra. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab.

This release covers the same revisions of WebKit from Safari Technology Preview 31 , but includes new Safari and WebKit features that will be present in Safari 11. The following web-facing Safari 11 features are new to Safari Technology Preview 32:

WebRTC and Media Capture
Safari now supports peer-to-peer video conferencing through webRTC. For more information, take a look at “Announcing WebRTC and Media Capture”.

WebAssembly
WebAssembly is a low-level binary format for executable code on the web. It is a standards-based supplement to JavaScript. It is secure, portable, and designed to be fast and efficient. To learn more about what WebAssembly is, and what it isn’t, see “Assembling WebAssembly”.

Auto-Play Improvements
Safari prevents auto-play of media with sound on most websites. Users can control which websites are allowed to auto-play video and audio by opening the new “Websites” preferences pane, or through the “Settings for This Website…” option in the Safari Technology Preview menu. We recommended that web developers test their websites with these new behaviors, ensuring any custom media controls behave correctly when auto-play is disabled, either automatically or by the user.

Many more WebKit features in Safari 11 are present in this release of Safari Technology Preview and have been in past releases. You can learn more about what’s new in Safari 11 from the Apple Developer website.

By Ricky Mondello at June 07, 2017 05:00 PM

June 06, 2017

Assembling WebAssembly

Surfin’ Safari

We’re pleased to announce that WebKit has a full WebAssembly implementation.

Dynamic Duo

WebAssembly is a no-nonsense sidekick to JavaScript. It isn’t meant to be hand-written; rather, it’s a low-level binary format designed to be a suitable compilation target for existing languages such as C++. The WebAssembly code that the browser sees will already have undergone high-level, language-specific optimizations. This is great because it means implementations don’t have to know about how C++ or other languages are optimized. Running expensive language-specific optimization on the developers’ machines allows WebKit to focus on target-specific optimizations. It also means that we can focus WebAssembly compiler optimizations on fast code delivery and predictable performance.

WebAssembly cannot do everything that JavaScript does. For example, WebAssembly cannot access the DOM except by calling into JavaScript. WebAssembly is meant to be used in conjunction with JavaScript. The JavaScript / WebAssembly dynamic duo work in partnership, allowing JavaScript to stay focused on taking over the world, while WebAssembly accelerates computation-intensive tasks.


JavaScript and its sidekick, cosplaying as 1966 Batman

The web wouldn’t be itself without security and portability. WebAssembly delivers on both fronts. It executes C++ code at near native speed with the same joker-less security guarantees that JavaScript offers. Our implementation supports WebAssembly on x86-64 as well as ARM64 platforms. Portable performance is really where WebAssembly shines: the WebAssembly virtual Instruction Set Architecture was designed to be an ideal compilation target for today’s modern processors.

Sidekick

WebAssembly fits in with the rest of the web platform, as a good sidekick would. It re-uses existing web APIs and exposes a new JavaScript API. WebAssembly has four core concepts: WebAssembly.Module, WebAssembly.Instance, WebAssembly.Memory, and WebAssembly.Table. A module represents code and is similar to a program on disk, whereas an instance is an execution of that program. There can be multiple concurrently executing instances of the same module. Finally, a memory represents an instance’s heap. It is contiguous, bounds-checked, and can be exposed to JavaScript as an ArrayBuffer. All WebAssembly memory operations operate on the instance’s memory. Finally, a table holds handles to WebAssembly functions, allowing indirect calls within an instance to target different functions with the same signature (in C++-speak: virtual functions and function pointers). Interestingly, instances can share the same memory, and tables can call directly across instances which enables dynamic linking.

Because WebAssembly exposes itself as a regular JavaScript object, we were able to reuse some machinery that already existed within WebKit. An interesting example is our reuse of our ECMAScript Module implementation to implement WebAssembly.Instance‘s API. For now it’s an implementation detail—invisible to web developers—but integration with ECMAScript Modules is being discussed. For developers who use modules, interacting between JavaScript and WebAssembly would then be totally seamless. Behind the mask of modules, our heroes’ secret identities wouldn’t be revealed.

To allow sharing of modules between Web Workers and to prepare ourselves for future features like threads, we’ve made our internal representation of WebAssembly code thread-safe. This lets developers postMessage a compiled WebAssembly.Module across workers without requiring re-compilation, copying, or any other redundant work. Our implementation of postMessage for modules is simpler than a riddle: sharing a module between workers involves passing a reference to our internal module representation to the other worker. That worker will run the same machine code as the agent that originally produced the module.

Utility Belt

WebAssembly directly exposes 32- and 64-bit integers as well as 32- and 64-bit floating point numbers. Its instruction set is equally simple:

i32.add i64.add f32.add f64.add i32.wrap/i64 i32.load8_s i32.store8
i32.sub i64.sub f32.sub f64.sub i32.trunc_s/f32 i32.load8_u i32.store16
i32.mul i64.mul f32.mul f64.mul i32.trunc_s/f64 i32.load16_s i32.store
i32.div_s i64.div_s f32.div f64.div i32.trunc_u/f32 i32.load16_u i64.store8
i32.div_u i64.div_u f32.abs f64.abs i32.trunc_u/f64 i32.load i64.store16
i32.rem_s i64.rem_s f32.neg f64.neg i32.reinterpret/f32 i64.load8_s i64.store32
i32.rem_u i64.rem_u f32.copysign f64.copysign i64.extend_s/i32 i64.load8_u i64.store
i32.and i64.and f32.ceil f64.ceil i64.extend_u/i32 i64.load16_s f32.store
i32.or i64.or f32.floor f64.floor i64.trunc_s/f32 i64.load16_u f64.store
i32.xor i64.xor f32.trunc f64.trunc i64.trunc_s/f64 i64.load32_s
i32.shl i64.shl f32.nearest f64.nearest i64.trunc_u/f32 i64.load32_u
i32.shr_u i64.shr_u i64.trunc_u/f64 i64.load call
i32.shr_s i64.shr_s f32.sqrt f64.sqrt i64.reinterpret/f64 f32.load call_indirect
i32.rotl i64.rotl f32.min f64.min f64.load
i32.rotr i64.rotr f32.max f64.max
i32.clz i64.clz nop grow_memory
i32.ctz i64.ctz block current_memory
i32.popcnt i64.popcnt f32.demote/f64 loop
i32.eqz i64.eqz f32.convert_s/i32 if get_local
i32.eq i64.eq f32.convert_s/i64 else set_local
i32.ne i64.ne f32.convert_u/i32 br tee_local
i32.lt_s i64.lt_s f32.convert_u/i64 br_if
i32.le_s i64.le_s f32.reinterpret/i32 br_table get_global
i32.lt_u i64.lt_u f32.eq f64.eq f64.promote/f32 return set_global
i32.le_u i64.le_u f32.ne f64.ne f64.convert_s/i32 end
i32.gt_s i64.gt_s f32.lt f64.lt f64.convert_s/i64 i32.const
i32.ge_s i64.ge_s f32.le f64.le f64.convert_u/i32 drop i64.const
i32.gt_u i64.gt_u f32.gt f64.gt f64.convert_u/i64 select f32.const
i32.ge_u i64.ge_u f32.ge f64.ge f64.reinterpret/i64 unreachable f64.const

The instructions are low-level by design, and it is this low-level quality that gives WebAssembly its power. WebAssembly was born as a compilation target, molded by compiler engineers.

OMG! BBQ!

WebKit’s WebAssembly implementation, like our JavaScript implementation, uses a tiering system to balance startup costs with throughput. Currently, there are two tiers to the engine: the Build Bytecode Quickly (BBQ) tier and the Optimized Machine-code Generator (OMG) tier. Both rely on the B3 JIT as their low-level optimizer.

WebAssembly modules can easily contain tons of code, some of which isn’t executed more than once or very frequently. This is why we opted to use two tiers: one that generates decent code quickly, and one that generates optimized code only when the engine thinks the code is hot enough to warrant it. BBQ compiles code about 4× as fast as OMG, but produces code that executes roughly 2× as slow as OMG. We use a background thread when compiling functions using the OMG. When OMG compilation finishes, we pause the executing WebAssembly threads and hot-patch the OMG compilation into the module.

WebAssembly being a low-level format—compared to JavaScript being a dynamic language—means that things aren’t as fickle when compiling WebAssembly. For example, WebAssembly statically tells us the types of all values within a function and gives us the signatures of all functions ahead of time.

BBQ 🔥

In order to produce executable code as soon as possible, the BBQ tier omits many of the optimizations possible in the B3 compiler. Additionally, the BBQ tier also uses a linear scan combined register / stack allocation algorithm. This allocates registers about 11× faster than the graph coloring algorithm that B3 usually uses. Avoiding expensive optimization allows WebKit to produce BBQ fast, so that BBQ may be consumed as soon as possible.

OMG 😲

When a function has been executed enough times, our WebAssembly runtime decides to optimize that function. Since WebAssembly does not require any type speculations, we only use tiering to conserve compile time. BBQ code contains only the profiling needed to detect when code has executed many times.

We share modules, along with all of their runtime state (like tiering-up from BBQ to OMG) between workers. Since we hot-patch the BBQ code, which might be executing on any number of threads, we need to be sure that each call-site can be concurrently updated to point to the OMG code. In order to avoid checking for new code on each call to a function, like JavaScript does, we track each call-site to every function, both in assembly and indirectly. Whenever the OMG version of a function is finished compiling, it replaces each call-site with a pointer to the code.

Mem-SIGNAL 🦇

One of the most important optimizations in WebAssembly is reducing the overhead on memory accesses while preserving security guarantees. WebAssembly specifies that all memory accesses are performed on a single 32-bit linear memory. We do this by reserving slightly more than 4GiB of virtual address space. This virtual reservation doesn’t occupy physical memory unless accessed. We use the hardware’s page protection mechanism to mark all but the lower pages as non-readable and non-writable. All loads and stores from this 32-bit address space can be performed normally, without explicit bounds checking instructions. If a load or store would be out of bounds, it will trigger a hardware fault that will ultimately result in a POSIX SIGSEGV or a Mach EXC_BAD_ACCESS exception. Our custom signal / Mach exception handlers then ensure the fault originated from a WebAssembly memory operation. If so, we set the instruction pointer to a special code stub that performs a WebAssembly trap and materializes a WebAssembly.RuntimeError in JavaScript. This optimization means that a WebAssembly memory operation typically results in a single load or store instruction. Overall, we measured a 15-20% speedup from this optimization on various WebAssembly benchmarks.


WebAssembly memory signal

💥 KAPOW! 💥

Get in touch with us by filing a bug, or contacting JF, Keith, and Saam on Twitter.

By JF Bastien, Keith Miller, Saam Barati at June 06, 2017 07:30 PM

June 05, 2017

Intelligent Tracking Prevention

Surfin’ Safari

The success of the web as a platform relies on user trust. Many users feel that trust is broken when they are being tracked and privacy-sensitive data about their web activity is acquired for purposes that they never agreed to.

WebKit has long included features to reduce tracking. From the very beginning, we’ve defaulted to blocking third-party cookies. Now, we’re building on that. Intelligent Tracking Prevention is a new WebKit feature that reduces cross-site tracking by further limiting cookies and other website data.

What Is Cross-Site Tracking and Third-Party Cookies?

Websites can fetch resources such as images and scripts from domains other than their own. This is referred to as cross-origin or cross-site loading, and is a powerful feature of the web. However, such loading also enables cross-site tracking of users.

Imagine a user who first browses example-products.com for a new gadget and later browses example-recipies.com for dinner ideas. If both these sites load resources from example-tracker.com and example-tracker.com has a cookie stored in the user’s browser, the owner of example-tracker.com has the ability to know that the user visited both the product website and the recipe website, what they did on those sites, what kind of web browser was used, et cetera. This is what’s called cross-site tracking and the cookie used by example-tracker.com is called a third-party cookie. In our testing we found popular websites with over 70 such trackers, all silently collecting data on users.

How Does Intelligent Tracking Prevention Work?

Intelligent Tracking Prevention collects statistics on resource loads as well as user interactions such as taps, clicks, and text entries. The statistics are put into buckets per top privately-controlled domain or TLD+1.

Machine Learning Classifier

A machine learning model is used to classify which top privately-controlled domains have the ability to track the user cross-site, based on the collected statistics. Out of the various statistics collected, three vectors turned out to have strong signal for classification based on current tracking practices: subresource under number of unique domains, sub frame under number of unique domains, and number of unique domains redirected to. All data collection and classification happens on-device.

Actions Taken After Classification

Let’s say Intelligent Tracking Prevention classifies example.com as having the ability to track the user cross-site. What happens from that point?

If the user has not interacted with example.com in the last 30 days, example.com website data and cookies are immediately purged and continue to be purged if new data is added.

However, if the user interacts with example.com as the top domain, often referred to as a first-party domain, Intelligent Tracking Prevention considers it a signal that the user is interested in the website and temporarily adjusts its behavior as depicted in this timeline:

Intelligent Tracking Prevention Timeline

If the user interacted with example.com the last 24 hours, its cookies will be available when example.com is a third-party. This allows for “Sign in with my X account on Y” login scenarios.

This means users only have long-term persistent cookies and website data from the sites they actually interact with and tracking data is removed proactively as they browse the web.

Partitioned Cookies

If the user interacted with example.com the last 30 days but not the last 24 hours, example.com gets to keep its cookies but they will be partitioned. Partitioned means third-parties get unique, isolated storage per top privately-controlled domain or TLD+1, e.g. account.example.com and www.example.com share the partition example.com.

This makes sure users stay logged in even if they only visit a site occasionally while restricting the use of cookies for cross-site tracking. Note that WebKit already partitions caches and HTML5 storage for all third-party domains.

What Does This Mean For Web Developers?

With Intelligent Tracking Prevention, WebKit strikes a balance between user privacy and websites’ need for on-device storage. That said, we are aware that this feature may create challenges for legitimate website storage, i.e. storage not intended for cross-site tracking. Please let us know of such cases and we will try to help (contact info at the end of this blog post).

To get you started, here are some some guidelines.

Storage Requires User Interaction

Check to make sure that you aren’t relying on cookies and other storage to persist if the user does not interact directly with your website on a regular basis. Requiring user interaction covers most legitimate uses of client-side storage. It also provides better transparency and gives users more control over who gets to store data on their devices.

Web Analytics

Make sure to configure your web analytics to not rely on third-party cookies from domains that don’t get user interaction. A popular way to do cross-site analytics for a family of sites is to use link decoration, i.e. pad links with information that needs to be carried across origins and navigations.

Ad Attribution

We recommend server-side storage for attribution of ad impressions on your website. Link decoration can be used to pass on attribution information in navigations.

Managing Single Sign-On

If you run a single sign-on system with a centralized session, the user needs to interact with the domain that controls the session. Otherwise you run the risk of Intelligent Tracking Prevention treating your session controller domain as a tracker.

Intelligent Tracking Prevention and Single Sign-On

Imagine a scenario as depicted above; a central session at account.com used for the three sites SiteA.com, SiteB.com, and SiteC.com. Session information can be propagated from account.com to the dependents during account.com’s 24-hour exemption from cookie partitioning. From that point on the sites must maintain sessions without account.com cookies, or they must re-authenticate daily with a brief stop at account.com to acquire new user interaction. You can grant the sites the ability to propagate session information between themselves through navigations and new cookies set in HTTP responses. Single sign-out needs to invalidate the account.com session on the server.

Feedback and Bug Reports

Please report bugs through bugs.webkit.org and send feedback to our evangelist Jon Davis if you are a web developer or a web user and Intelligent Tracking Prevention isn’t working as intended for your website. If you have technical questions on how the feature works you can find me on Twitter @johnwilander.

By John Wilander at June 05, 2017 09:00 PM

JSC 💕 ES6

Surfin’ Safari

Update: The previous version of this post showed performance data from ARES-6 version 1.0. Version 1.0 had a bug where the score for the Worst 4 Iterations was actually using the first 4 iterations. We have fixed the bug and updated the benchmark to version 1.0.1. The performance results shown in this post now use ARES-6 version 1.0.1, which contains the fix.

ES2015 (also known as ES6), the version of the JavaScript specification ratified in 2015, is a huge improvement to the language’s expressive power thanks to features like classes, for-of, destructuring, spread, tail calls, and much more. But powerful language features come at a high cost for language implementers. Prior to ES6, WebKit had spent years optimizing the idioms that arise in ES3 and ES5 code. ES6 requires completely new compiler and runtime features in order to get the same level of performance that we have with ES5.

Overall Performance Results

WebKit’s JavaScript implementation, called JSC (JavaScriptCore), implements all of ES6. ES6 features were implemented to be fast from the start, though we only had microbenchmarks to measure the performance of those features at the time. While it was initially helpful to optimize for our own microbenchmarks and microbenchmark suites like six-speed, to truly tune our engine for ES6, we needed a more comprehensive benchmark suite. It’s hard to optimize new language features without knowing how those features will be used. Since we love programming, and ES6 has many fun new language features to program with, we developed our own ES6 benchmark suite. This post describes the development of our first ES6 benchmark, which we call ARES-6. We used ARES–6 to drive significant optimization efforts in JSC, and this post describes three of them: high throughput generators, a new Map/Set implementation, and phantom spread. The post concludes with an analysis of performance data to show how JSC’s performance compares to other ES6 implementations.

ARES-6 Benchmark

The suite consists of two subtests we wrote in ES6 ourselves, and two subtests that we imported that use ES6. The first subtest, named Air, is an ES6 port of the WebKit B3 JIT‘s Air::allocateStack phase. The second subtest, named Basic, is an ES6 implementation of the ECMA-55 BASIC standard using generators. The third subtest, named Babylon, is an import of Babel’s JavaScript parser. The fourth subtest, named ML, is an import of the feedforward neural network library from the mljs machine learning framework. ARES-6 runs the Air, Basic, Babylon, and ML benchmarks multiple times to gather lots of data about how the browser starts up, warms up, and janks up. This section describes the four benchmarks and ARES-6’s benchmark running methodology.

Air

Air is an ES6 reimplementation of JSC’s allocateStack compiler phase, along with all of Assembly Intermediate Representation needed to run that phase. Air tries to faithfully use new features like arrow functions, classes, for-of, and Map/Set, among others. Air doesn’t avoid any features out of fear that they might be slow, in the hope that we might learn how to make those features fast by looking at how Air and other benchmarks use them.

The next section documents the motivation and design of Air. You can browse the source here.

Motivation

At the time that Air was written, most JavaScript benchmarks used ES5 or older versions of the language. ES6 testing mostly relied on microbenchmarks or conversions of existing tests to ES6. We use larger benchmarks to avoid over-optimizing for small pieces of code. We also avoid changing existing benchmarks because that approach has no limiting principle: if it’s OK to change a benchmark to use a feature, does that mean we can also change it to remove the use of a feature we don’t like? We feel that one of the best ways to avoid falling into the trap of creating benchmarks that only reinforce what some JS engine is already good at is to create a new benchmark from first principles.

We only recently completed our new JavaScript compiler, named B3. B3’s backend, named Air, is very CPU-intensive and uses a combination of object-oriented and functional idioms in C++. Additionally, it relies heavily on high speed maps and sets. It goes so far as to use customized map/set implementations – even more so than the rest of WebKit. This makes Air a great candidate for ES6 benchmarking. The Air benchmark in ARES-6 is a faithful ES6 implementation of JSC’s Air. It pulls no punches: just as the original C++ Air was written with expressiveness as a top priority, ES6 Air is liberal in its use of modern ES6 idioms whenever this helps make the code more readable. Unlike the original C++ Air, ES6 Air doesn’t exploit a deep understanding of compilers to make the code easy to compile.

Design

Air runs one of the more computationally expensive C++ Air phases, Air::allocateStack(). It turns abstract stack references into concrete stack references, by selecting how to lay out stack slots in the stack frame. This requires liveness analysis and an interference graph.

Air relies on three major ES6 features more so than most of the others:

  • Arrow functions. Like the C++ version with lambdas, Air uses a functional style of iterating most non-trivial data-structures. This is because the functional style allows the callbacks to mutate the data being iterated: if the callback returns a non-null value, forEachArg() will replace the argument with that value. This would not have been possible with for-of. Air’s use of forEachArg() and arrow functions usually looks like this:
   inst.forEachArg((arg, role, type, width) => ...)
  • For-of. Many Air data structures are amenable to for-of iteration. While the innermost loops tend to use functional iteration, pretty much all of the outer logic uses for-of heavily. For example:
   for (let block of code) // Iterate over the basic blocks in a program
       for (let inst of block) // Iterate over the instructions in a block
           ...
  • Map/Set. The liveness analysis in Air::allocateStack() relies on maps and sets. For example, we use a liveAtHead map that is keyed by basic block. Its values are sets of live stack slots. This is a relatively crude way of doing liveness, but it is exactly how the original Air::LivenessAnalysis worked. So we view it as being quite faithful to how a sensible programmer might use Map and Set.

Air also uses some other ES6 features. For example, it makes light use of a Proxy. It makes extensive use of classes, let/const, and Symbols. Symbols are used as enumeration elements, and so, they frequently show up as cases in switch statements.

The workflow of an Air run is pretty simple: we do 200 runs of allocateStack on four IR payloads.

Each IR payload is a large piece of ES6 code that constructs an Air Code object, complete with basic blocks, temporaries, stack slots, and instructions. These payloads are generated by running the Air::dumpAsJS() phase. This is a phase we wrote in the C++ version of Air that will generate JS code that builds an IR payload. Just prior to running the C++ allocateStack phase, we perform Air::dumpAsJS() to capture the payload. The four payloads we generate are for the hottest functions in four other major JS benchmarks:

  • Octane/GBEmu, the executeIteration function.
  • Kraken/imaging-gaussian-blur, the gaussianBlur function.
  • Octane/Typescript, the scanIdentifier function.
  • Air (yes, it’s self referential), an anonymous closure identified by our profiler as ACLj8C.

These payloads allow Air to precisely replay allocateStack on those actual functions.

Air validates its results. We added a Code hashing capability to both the C++ Air and ES6 Air, and we assert each payload looks identical after allocateStack to what it would have looked like after the original C++ allocateStack. We also validate that the payloads hash properly before allcoateStack, to help catch bugs during payload initialization. We have not measured how long hashing takes, but it’s an O(N) operation, while allocateStack is closer to O(N^2). We suspect that barring some engine pathologies, hashing should be much faster than allocateStack, and allocateStack should be where the bulk of time is spent.

Summary

At the time that Air was written, we weren’t happy with the ES6 benchmarks that were available to us. Air makes extensive use of ES6 features in the hope that we can learn about possible optimization strategies by looking at this and other benchmarks.

Air does not use generators at all. We almost used them for forEachArg iteration, but it would have been a very unusual use of generators because Air’s forEach functions are more like map than for-of. The whole point is to allow the caller to mutate entries. We thought that functional iteration was more natural for this case than for-of and generators. But this led us to wonder: how would we use generators?

Basic

Web programming requires reasoning about asynchrony. But back when some of us first learned to program with the BASIC programming language, we didn’t have to worry about that: if you wanted input from the user, you said INPUT and the program would block until the user typed things. In a sense, this is what generators are about: when you say yield, your code blocks until some other stuff happens.

Basic is an ES6 implementation of ECMA-55 BASIC. It implements the interpreter as a recursive abstract syntax tree (AST) walk, where the recursive functions are generators so that they can yield when they encounter BASIC’s INPUT command. The lexer is also written as a generator. Basic tests multiple styles of generators, including functions with multiple yield points and recursive yield* calls.

The next section describes how Basic uses generators in the lexer and in the AST walk. You can browse the source here.

Lexer

Lexers are usually written as a lex function that maintains some state and returns the next token whenever you call it. Basic uses a generator to implement lex, so that it can use local variables for all of its state.

Basic’s lexer is a generator function with multiple nested functions inside it. It contains eight uses of yield, none of which are recursive.

AST Walk

Walking the AST is an easy way to implement an interpreter and this was a natural choice for Basic. In such a scheme, the program is represented as a tree, and each node has code associated with it that may recursively invoke sub-nodes in the tree. Basic uses plain object literals to create nodes. For example:

    {evaluate: Basic.NumberPow, left: primary, right: parsePrimary()}

This code in the parser creates an AST node whose evaluation function is Basic.NumberPow. Were it not for INPUT, Basic could use ordinary functions for all of the AST. But INPUT requires blocking; so, we implement that with yield. This means that all of the statement-level AST nodes use generators. Those generators may call each other recursively using yield*.

The AST walk interpreter contains 18 generator functions with two calls to yield (in INPUT and PRINT) and one call to yield* (in Basic.Program).

Workloads

Each Basic benchmark iteration runs five simple Basic programs:

  • Hello world!
  • Print numbers 1..10.
  • Print a random number.
  • Print random numbers until 100 is printed.
  • Find all prime numbers from 2 to 1999.

The interpreter uses a fixed seed, so the random parts always behave the same way. The Basic benchmark runs 200 iterations.

Summary

We wrote Basic because we wanted to see what an aggressive attempt at using generators looks like. Prior to Basic, all we had were microbenchmarks, which we feared would focus our optimization efforts only on the easy cases. Basic uses generators in some non-obvious ways, like having multiple generator functions, many yield points, and recursive generator calls. Basic also uses for-of, classes, Map, and WeakMap.

Babylon and ML

It is important to us that ARES-6 consists both of tests we wrote, and tests we imported. Babylon and ML are two tests we imported with minor modifications to make them run in the browser without Node.js style modules or ES6 modules. (The reason for not including tests with ES6 modules is that at the time of writing this, <script type="module"> is not on by default in all major browsers.) Writing new tests for ARES-6 is important because it allows us to use ES6 in interesting and sophisticated ways. Importing programs we didn’t write is also imperative for performance analysis because it ensures that we measure the performance of interesting programs already using ES6 out in the wild.

Babylon is an interesting test for ARES-6 because it makes light use of classes. We think that many people will dip their toes into ES6 by adopting classes since many ES5 programs are written in ways that lead to an easy translation to classes. We’ve seen this firsthand in the WebKit project when Web Inspector moved all their ES5 pseudo-classes to use ES6 class syntax. You can browse the Babylon source in ARES-6 here. The Babylon benchmark runs 200 iterations where each iteration consists of parsing four JS programs.

The feedforward neural network library, that the ML benchmark is imported from, depends heavily on ES6 classes. It uses the ml-matrix library which implements matrix manipulations in a decidedly object oriented style using ES6 classes almost exclusively. You can browse the ML source in ARES-6 here. The ML benchmark runs 60 iterations where each iteration consists of performing analysis over different sample data sets using various activation functions.

Benchmarking Methodology

Air, Basic and Babylon each run for 200 iterations, and ML runs for 60 iterations. Each iteration does the same kind of work. We simulate page reload before the first of those iterations, to minimize the chances that the code will benefit from JIT optimizations at the beginning of the first iteration. ARES-6 analyzes these four benchmarks in three different ways:

  • Start-up performance. We want ES6 code to run fast right from the start, even if our JITs haven’t had a chance to perform optimizations yet. ARES-6 reports a First Iteration score that is just the execution time of the first of the 60 or 200 iterations.
  • Worst-case performance. JavaScript engines have a lot of different kinds of jank. The GC, JIT, and various adaptive optimizations all run the risk of causing programs to sometimes run for much longer than the norm. We don’t want our ES6 optimizations to introduce new kinds of jank. ARES-6 reports a Worst 4 Iterations score that is the average of the execution times of the worst 4 of the 60 or 200 iterations, excluding the first iteration which was used for measuring start-up performance.
  • Throughput. If you write some ES6 code and it runs for long enough, then it should eventually benefit from all of our optimizations. ARES-6 reports an Average score that is the average execution time of all iterations excluding the first.

Each repetition of ARES-6 yields 3 scores (measured in milliseconds): First Iteration, Worst 4 Iterations, and Average, for each of the 4 subtests: Air, Basic, Babylon, and ML, for a total of 12 scores. The geometric mean of all 12 scores is the Overall score for the repetition. ARES-6 runs 6 repetitions, and reports the averages of each of these 13 scores along with their 95% confidence intervals. Since these scores are measures of execution time, lower scores are better because they mean that the benchmark completed in less time.

Optimizations

We built ARES-6 so that we could have benchmarks with which to tune our ES6 implementation. ARES-6 was built specifically to allow us to study the impact of new language features on our engine. This section describes three areas where we made significant changes to JavaScriptCore in order to improve our score on ARES-6. We revamped our generator implementation to give generator functions full access to our optimizing JITs. We rewrote our Map/Set implementation so that our JIT compiler can inline Map/Set lookups. Finally, we added significant new escape analysis functionality to eliminate the object allocation of the rest parameter when used with the spread operator.

High-Throughput Generators

The performance of ES6 generators is critical to the overall performance of JavaScript engines. We expect generators to be frequently used as ES6 adoption increases. Generators are a language-supported way to suspend and resume an execution context. They can be used to implement value streams and custom iterators, and are the basis of ES2017’s async and await, which streamlines the notoriously tricky asynchronous Promise programming model into the direct style programming model. To be performant, it was an a priori goal that our redesigned generator implementation had a sound and simple compilation strategy in all of our JIT tiers.

A generator must suspend and resume the execution context at the point of a yield expression. To do this, JSC needs to save and restore its execution state: the instruction pointer and the current stack frame. While we need to save the logical JavaScript instruction pointer to resume execution at the appropriate point in the program, the machine’s instruction pointer may change. If the function is compiled in upper JIT tiers (like baseline, DFG, and FTL), we must select the code compiled in the best tier when resuming. In addition, the instruction pointer may be invalidated if the compiled code is deoptimized.

JSC’s bytecode is a 3AC-style (three-address code) IR (intermediate representation) that operates over virtual registers. The bytecode is allowed an effectively “infinite” number of registers, and the stack frame comprises slots to store each register. In practice, the number of virtual registers used is small. The key insight into our generator implementation is that it is a transformation over JSC’s bytecode. This completely hides the complexity of generators from the rest of the engine. The phases prior to bytecode generatorification (parsing, bytecode generation from the AST) are allowed to view yield as if it were a function call — about the easiest kind of thing for those phases to do. Best of all, the phases after generatorification do not have to know anything about generators. That’s a fantastic outcome, since we already have a massive interpreter and JIT compiler infrastructure that consumes this bytecode.

When generating the original bytecode, JSC’s bytecode compiler treats a generator mostly as if it were a program with normal control flow, where each “yield” point is simply an expression that takes arguments, and results in a value. However, this is not how yield is implemented at a machine level. To transform this initial form of bytecode into a version that does proper state restoration, we rewrite the generator’s bytecode to turn the function into a state machine.

To properly restore the instruction pointer at yield points, our bytecode transformation inserts a switch statement at the top of each generator function. It performs a switch over an integer that represents where in the program the generator should resume when called. Secondly, we must have a way to restore each virtual register at each yield point. It turns out that JSC already has a mechanism for doing exactly this. We turned this problem into a problem of converting each bytecode virtual register in the original bytecode into a closure variable in JSC’s environment record data structure. Each transformed generator allocates a closure to store all generator state. Every yield point saves each live virtual register into the closure, and every resume point restores each live virtual register from the closure.

Because this bytecode transformation changes the program only by adding closures and a switch statement, this approach trivially meets our goal of being compilable in our optimizing compiler tiers. JSC’s optimizing compilers already do a good job at optimizing switch statements and closure variables. The generator’s dispatch gets all of JSC’s switch lowering optimizations just by virtue of using the switch bytecode. This includes our B3 JIT’s excellent switch lowering optimization, and less aggressive versions of that same optimization in lower JIT tiers and in the interpreter. The DFG also has numerous optimizations for closures. This includes inferring types of closure variables, optimizing loads and stores to closures down to a single instruction in most cases, and properly modeling the aliasing effects of closure accesses. We get these benefits without introducing new constructs into the optimizing compiler.

Example

Let’s walk through an example of the bytecode transformation for an example JavaScript generator function:

function *generator()
{
    var temp = 20;
    var result = yield 42;
    return result + temp;
}

JSC generates the bytecode sequence shown in the following figure, which represents the control flow graph.

Pre-Generatorification bytecode

When the AST-to-bytecode compiler sees a yield expression, it just emits the special op_yield bytecode operation. This opcode is a macro. It’s not recognized by any of our execution engines. But it has well-understood semantics, which allows us to treat it as bytecode syntactic sugar for closures and switches.

Generatorification

Desugaring is performed by the generatorification bytecode phase. The figure above explains how generatorification works. Generatorification rewrites the bytecode so that there are no more op_yield statements and the resulting function selects which code to execute using a switch statement. Closure-style property access bytecodes are used to save/restore state around where the op_yield statements used to be.

Previously, JSC bytecode was immutable. It was used to carry information from the bytecode generator, which wrote bytecode as a byproduct of walking the AST, to the compilers, which would consume it to create some other byproduct (either machine code or some other compiler IR). Generatorification changes that. To facilitate this phase, we created a general-purpose bytecode rewriting facility for JSC. It allows us to insert and remove any kind of bytecode instruction, including jumps. It permits us to perform sophisticated control-flow edits on the bytecode stream. We use this to make generatorification easier, and we look forward to using it to implement other crazy future language features.

To allow suspension and resumption of the execution context at op_yield, the rewriter inserts the op_switch_imm opcode just after the op_enter opcode. At the point of op_yield, the transformed function saves the integer that represents the virtual instruction pointer and then returns from the function with op_ret. Then, when resuming, we use the inserted op_switch_imm with the saved integer, jumping to the point just after the op_yield which suspended the context.

To save and restore live registers, this pass performs liveness analysis to find live registers at op_yield and then inserts op_put_to_scope and op_get_from_scope operations to save their state and restore it (respectively). These opcodes are part of our closure object model, which happens to be appropriate here because JSC’s closures are just objects that statically know what their fields are and how they will be laid out. This allows fast allocation and fast access, which we already spent a great deal of time optimizing for actual JS closures. In this figure, generatorification performs liveness analysis and finds that the variable temp is live over the op_yield point. Because of this, the rewriter emits op_put_to_scope and op_get_from_scope to save and restore temp. This does not disrupt our optimizing compiler’s ability to reason about temp‘s value, since we had already previously solved that same problem for variables saved to closures.

Summary

The generatorification rewrite used bytecode desugaring to encapsulate the whole idea of generators into a bytecode phase that emits idiomatic JSC bytecode. This allows our entire optimization infrastructure to join the generator party. Programs that use generators can now run in all of our JIT tiers. At the time this change landed, it sped up Basic’s Average score by 41%. It also improved Basic overall: even the First Iteration score got 3% faster, since our low-latency DFG optimizing JIT is designed to provide a boost even for start-up code.

Generator: Safari-to-WebKit Nightly comparison

The data in the above graph was taken on a Mid 2014, 2.8 GHz Core i7, 16GB RAM, 15″ MacBook Pro, running macOS Sierra version 10.12.0. Safari 10.0.2 is the first version of Safari to ship with JavaScriptCore’s complete ES6 implementation. Safari 10.0.2 does not include any of the optimizations described in this post. The above graph shows that since implementing ES6, we’ve made significant improvements to JSC’s performance running Basic. We’ve made a 1.2x improvement in the First Iteration score, a 1.4x improvement in the Worst 4 Iterations score, and a 2.8x improvement in the Average score.

Desugaring is a classic technique. Crazy compiler algorithms are easiest to think about when they are confined to their own phase, so we turned our bytecode into a full-fledged compiler IR and implemented generatorification as a compiler phase over that IR. Our new bytecode rewriter is powerful enough to support many kinds of phases. It can insert and remove any kind of bytecode instruction including jumps, allowing for complex control flow edits. While it’s only used for generators presently, this rewriting facility can be used to implement complex ECMAScript features in the future.

Fast Map and Set

Map and Set are two new APIs in ES6 that make it a more expressive language to program in. Prior to ES6, there was no official hashtable API for JavaScript. It’s always been possible to use a JavaScript object as a hashtable if only Strings and Symbols are used as keys, but this isn’t nearly as useful as a hashtable that can take any value as a key. ES6 fixes this ancient language deficiency by adding new Map and Set types that accept any JavaScript value as a key.

Both Air and Basic use Map and Set. Profiling showed that iteration and lookup were by far the most common operations, which is fairly consistent with our experience elsewhere in the language. For example, we have always found it more important to optimize property loads than property stores, because loads are so much more common. In this section we present our new Map/Set implementation, which we optimized for the iteration and lookup patterns of ARES-6.

Fast Lookup

To make lookup fast, we needed to teach our JIT compilers about the internal workings of our hashtable. The main performance benefit of doing this comes from inlining the hashtable lookup into the IR for a function. This obviates the need to call out to C++ code. It also allows us to implement the lookup more efficiently by leveraging the compiler’s smarts. To understand why, let’s analyze a hashtable lookup in more detail. JSC’s hashtable implementation is based off the linear probing algorithm. When you perform map.has(key) inside a JS program, we’re really performing a few abstract operations under the hood. Let’s break those operations down into pseudo code:

let h = hash(key);
let bucket = map.findBucket(h);
let has = !isEmptyBucket(bucket);
return has;

You can think of the findBucket(h) operation as:

let bucket = startBucket(h);
while (true) {
    if (isEmptyBucket(bucket))
        return emptyBucketSentinel;

    // Note that the actual key comparison for ES6 Map and Set is not triple equals.
    // But it's close. We can assume it's triple equals for the sake of this explanation.
    if (bucket.key === key)
        return bucket;

    h = nextIndex(h);
}

There are many things that can be optimized here based on information we have about key. The compiler will often know the type of key, allowing it to emit a more efficient hash function, and a more efficient triple equals comparison inside the findBucket loop. The C code must handle JavaScript key values of all possible types. However, inside the compiler, if we know the type of key, we may be able to emit a triple equals comparison that is only a single machine compare instruction. The hash function over key also benefits from knowing the type of key. The C++ implementation of the hash function must handle keys of all types. This means that it’ll have to do a series of type checks on key to determine how it should be hashed. The reason for this is we hash numbers differently than strings, and differently than objects. The JIT compiler will often be able to prove the type of key, thereby allowing us to avoid multiple branches to learn key‘s type.

These changes alone already make Map and Set much faster. However, because the compiler now knows about the inner workings of a hashtable, it can further optimize code in a few neat ways. Let’s consider a common use of the Map API:

if (map.has(key))
    return map.get(key);
...

To understand how our compiler optimizations come into play, let’s look at how our DFG (Data Flow Graph) IR represents this program. DFG IR is used by JSC for performing high-level JavaScript-specific optimizations, including optimizations based on speculation and self-modifying code. This hashtable program will be transformed into roughly the following DFG IR (actual DFG IR dumps have a lot more information):

    BasicBlock #0:
        k: GetArgument("key")
        m: GetArgument("map")
        h: Hash(@k)
        b: GetBucket(@m, @h, @k)
        c: IsNonEmptyBucket(@b)
        Branch(@c, True:#1, False:#2)

    BasicBlock #1:
        h2: Hash(@k)
        b2: GetBucket(@m, @h2, @k)
        r: LoadFromBucket(@b2)
        Return(@r)

    BasicBlock #2:
        ...

DFG IR allows for precise modeling of effects of each operation as part of the clobberize analysis. We taught clobberize how to handle all of our new hashtable operations, like GetBucket, IsNonEmptyBucket, Hash, and LoadFromBucket. The DFG CSE (common subexpression elimination) phase uses this data and runs with it: it can see that the same Hash and GetBucket operations are performed twice without any operations that could change the state of the map in between them. Lots of other DFG phases use clobberize to understand the aliasing and effects of operations. This unlocks loop-invariant code motion and lots of other optimizations.

In this example, the optimized program will look like:

    BasicBlock #0:
        k: GetArgument("key")
        m: GetArgument("map")
        h: Hash(@k)
        b: GetBucket(@m, @h, @k)
        c: IsNonEmptyBucket(@b)
        Branch(@c, True:#1, False:#2)

    BasicBlock #1:
        r: LoadFromBucket(@b)
        Return(@r)

    BasicBlock #2:
        ...

Our compiler was able to optimize the redundant hashtable lookups in has and get down to a single lookup. In C++, hashtable APIs go to great lengths to provide ways of avoiding redundant lookups like the one here. Since ES6 provides only very simple Map/Set APIs, it’s difficult to avoid redundant lookups. Luckily, our compiler will often get rid of them for you.

Fast Iteration

As we wrote Basic and Air, we often found ourselves iterating over Maps and Sets because writing such code is natural and convenient. Because of this, we wanted to make Map and Set iteration fast. However, it’s not immediately obvious how to do this because ES6’s Map and Set iterate over keys in insertion order. Also, if a Map or Set is modified during iteration, the iterator must reflect the modifications.

We needed to use a data structure that is fast to iterate over. An obvious choice is to use a linked list. Iterating a linked list is fast. However, testing for the existence of something inside a linked list is O(n), where n is the number of elements in the list. To accommodate the fast lookup of a hashtable, and the fast iteration of a linked list, we chose a hybrid approach of a combo linked-list hashtable. Every time something is added to a Map or Set, it goes onto the end of the linked list. Every element in the linked list is a bucket. Each entry in the hashtable’s internal lookup buffer points to a bucket inside the linked list.

This is best understood through a picture. Consider the following program:

let map = new Map;
map.set(1, "one");
map.set(2, "two");
map.set(3, "three");

A traditional linear probing hashtable would look like this:

Traditional linear probing hashtable

However, in our scheme, we need to know the order in which we need to iterate the hashtable. So, our combo linked-list hashtable will look like this:

Combo linked-list hashtable

As mentioned earlier, when iterating over a Map or Set while changing it, the iterator must iterate over newly added values, and must not iterate over deleted values. Using a linked list data structure for iteration allows for a natural implementation of this requirement. Inside JSC, Map and Set iterators are just wrappers over hashtable buckets. Buckets are garbage collected, so an iterator can hang onto a bucket even after it has been removed from the hashtable. As an item is deleted from the hashtable, the bucket is removed from the linked list by updating the deleted bucket’s neighbors to now point to each other instead of the deleted bucket. Crucially, the deleted bucket will still point to its neighbors. This allows iterator objects to still point to the deleted bucket and then find their way back to non-deleted buckets. Asking such an iterator for its value will lead the iterator to traverse its neighbor pointer until it finds the next non-deleted bucket. The key insight here is that deleted buckets can always find the next non-deleted bucket by doing a succession of pointer chasing through their neighbor pointer. Note that this pointer chasing might lead the iterator to the end of the list, in which case, the iterator is closed.

For example, consider the previous program, now with the key 2 deleted:

let map = new Map;
map.set(1, "one");
map.set(2, "two");
map.set(3, "three");
map.delete(2);

Now, the hashtable will look like this:

Combo linear probing with entry 2 deleted

Let’s consider what happens when there is an iterator that’s pointing to the bucket for 2. When next() is called on that iterator, it’ll first check if the bucket it’s pointing to is deleted, and if so, it will traverse the linked list until it finds the first non-deleted entry. In this example, the iterator will notice the bucket for 2 is deleted, and it will traverse to the bucket for 3. It will see that 3 is not deleted, and it will yield its value.

Summary

We found ourselves using Map/Set a lot in the ES6 code that we wrote, so we decided to optimize these data structures to further encourage their use. Our Map/Set rewrite represents the hashtable’s underlying data structures in a way that is most natural for our garbage collector and DFG compiler to reason about. At the time this rewrite was committed, it contributed to an 8% overall performance improvement on ARES-6.

Phantom Spread

Another ES6 feature we found ourselves using is the combination of the rest parameter and the spread operator. It’s an intuitive programming pattern to gather some arguments into an array using the rest parameter, and then forward them to another function call using spread. For example, here is one way we use this in the Air benchmark:

visitArg(index, func, ...args)
{
    let replacement = func(this._args[index], ...args);
    if (replacement)
        this._args[index] = replacement;
}

A naive implementation of spread over the rest parameter will cause this code to run slower than needed because a spread operator requires performing the iterator protocol on its argument. Because the function visitArg is creating the args array, the DFG compiler knows exactly what data will be stored into it. Specifically, it’ll be filled with all but the first two arguments to visitArg. We chose to implement an optimization for this programming pattern because we think that the spread of a rest parameter will become a common ES6 programming idiom. In JSC, we also implement an optimization for the ES5 variant of this programming idiom:

foo()
{
    bar.apply(null, arguments);
}

The ideal code the DFG could emit to set up the stack frame for the call to func inside visitArg is a memcpy from the arguments on its own stack frame to that of func‘s stack frame. To be able to emit such code, the DFG compiler must prove that such an optimization is sound. To do this, the DFG must prove a few things. It must first ensure that the array iterator protocol is not observable. It proves it is not observable by ensuring that:

By performing such a proof, the DFG knows exactly what the protocol will do and it can model its behavior internally. Secondly, the DFG must prove that the args array hasn’t changed since being populated with the values from the stack. It does this by performing an escape analysis. The escape analysis will give a conservative answer to the question: has the args array changed since its creation on entry to the function? If the answer to the question is “no”, then the DFG does not need to allocate the args array. It can simply perform a memcpy when setting up func‘s stack frame. This is a huge speed improvement for a few reasons. The primary speed gain is from avoiding performing the high-overhead iterator protocol. The secondary improvement is avoiding a heap object allocation for the args array. When the DFG succeeds at performing this optimization, we say it has converted a Spread into a PhantomSpread. This optimization leverages the DFG’s ability to not only optimize away object allocations, but to rematerialize them if speculative execution fails and we fall back to executing the bytecode without optimizations.

Air: Safari-to-WebKit Nightly comparison

The data in the above graph was taken on a Mid 2014, 2.8 GHz Core i7, 16GB RAM, 15″ MacBook Pro, running macOS Sierra version 10.12.0. PhantomSpread was a significant new addition to our DFG JIT compiler. The Phantom Spread optimization, the rewrite of Map/Set, along with other overall JSC improvements, shows that we’ve sped up Air’s Average score by nearly 2x since Safari 10.0.2. Crucially, we did not introduce such a speed up at the expense of the First Iteration and Worst 4 Iterations scores. Both of those scores have also progressed.

Performance Results

We believe that a great way to keep ourselves honest about JavaScriptCore’s performance is to compare to the best performance baselines available. Therefore, we routinely compare our performance both to past versions of JavaScriptCore and to other JavaScript engines. This section shows how JavaScriptCore compares to Firefox and Chrome’s JavaScript engines.

The graphs in this section use data taken on a Late 2016, 2.9 GHz Core i5, 8GB RAM, 13″ MacBook Pro, running macOS High Sierra Beta 1.

Overall Performance Results

The figure above shows the overall ARES-6 scores in three different browsers: Safari 11 (13604.1.21.0.1), Firefox 53.0.3, and Chrome 58.0.3029.110. Our data shows that we’re nearly to 1.8x faster than Chrome, and close to 5x faster than Firefox.

Detailed Performance Results

The graph above shows detailed results for all four benchmarks and their three scores. It shows that JavaScriptCore is the fastest engine at running ARES-6 not only in aggregate score, but also in all three scores for each individual subtest.

Conclusion

ES6 is a major improvement to the JavaScript language, and we had fun giving it a test drive when creating ARES-6. Writing Air and Basic, and importing Babylon and ML, gave us a better grasp of how ES6 features are likely to be used. Adding new benchmarks to our repertoire always teaches us new lessons. Long term, this only works to our benefit if we keep adding new benchmarks and don’t over optimize for the same set of stale benchmarks. Going forward, we plan to add more subtests to ARES-6. We think adding more tests will keep us honest in not over-tuning for specific ES6 code patterns. If you have exciting ES6 (or ES7 or beyond) code that you think is worth including in ARES-6, we’d love to hear about it. You can get in touch either by filing a bug or contacting Filip, Saam, or Yusuke, on Twitter.

By Saam Barati, Yusuke Suzuki & Filip Pizlo at June 05, 2017 05:45 PM

May 31, 2017

Release Notes for Safari Technology Preview 31

Surfin’ Safari

Safari Technology Preview Release 31 is now available for download for macOS Sierra. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. This release covers WebKit revisions 216643-217407.

Web API

  • Added media and type attribute support for <link rel="preload"> (r217247)
  • Added support for DOMMatrix and DOMMatrixReadOnly (r216959)
  • Fixed getElementById to return the correct element when a matching element is removed during beforeload event (r216978)
  • Fixed skipping <slot> children when collecting content for innerText (r216966)

JavaScript

  • Fixed a syntax error thrown when declaring a top level for-loop iteration variable the same as a function parameter (r217200)

Layout & Rendering

  • Added support for transform-box to switch sizing box in SVG (r217236)
  • Fixed clientX and clientY on mouse events to be relative to the layout viewport (r216824)
  • Fixed large animated images getting decoded as a large static image before receiving all of the data (r217262)
  • Fixed screen flickering caused by asynchronous image decoding for large images when interacting with the page (r216901)
  • Fixed element position when dragging jQuery Draggable elements with position:fixed after pinch zoom (r216803)
  • Fixed a timing issue causing a hardware-accelerated transform animation to misplace an element 50% of the time (r217075)

CSS GRID

  • Implemented the place-self shorthand (r216829)
  • Fixed static position of positioned grid items (r216916)
  • Ignored collapsed tracks on content-distribution alignment (r217345)

Font Variations

  • Added support for calc() in font-variation-settings and font-feature-settings (r217267)
  • Enabled the woff2-variations format identifier for @font-face (r217241)
  • Updated the font-style implementation in the font selection algorithm (r217272)

Web Inspector

  • Added a new icon for Web Socket resources (r217067)
  • Changed adding new CSS rules so that the added rules go into a new Inspector Style Sheet resource that can be viewed, edited, and saved (r217258)
  • Fixed an issue where changes are not applied in Styles sidebar when switching tabs without blurring editor (r217266)
  • Fixed content views not getting restored on reload if its tree element is filtered out (r217317)
  • Fixed an error when trying to delete DOM breakpoints from the Debugger tab (r216681)
  • Fixed deleting a disabled XHR breakpoint (r217306)
  • Fixed miscellaneous RTL and localization issues (r216692, r217232, r217229)
  • Prevented loading the active recording until a Timeline view needs to be shown (r217379)

Media

  • Added support for painting MSE video-element to canvas (r217185)
  • Fixed captions and subtitles not showing up in picture-in-picture for MSE content (r216951)
  • Fixed media element reporting hidden when in picture-in-picture mode and tab is backgrounded (r217223)

Web Driver

  • Fixed characters produced with the shift modifier on a QWERTY keyboard to be delivered as shift-down, char-down, char-up, and shift-up events (r217244)
  • Fixed navigator.webdriver to return false if the page is not controlled by automation (r217391)

WebCrypto

  • Replaced CryptoOperationData with BufferSource (r216992)

Security

  • Improved error message for Access-Control-Allow-Origin violations due to a misconfigured server (r217069)

By Jon Davis at May 31, 2017 05:00 PM

May 17, 2017

Release Notes for Safari Technology Preview 30

Surfin’ Safari

Safari Technology Preview Release 30 is now available for download for macOS Sierra. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. This release covers WebKit revisions 215859-216643.

Web API

  • Implemented Subresource Integrity (SRI) (r216347)
  • Implemented X-Content-Type-Options:nosniff (r215753, r216195)
  • Added support for Unhandled Promise Rejection events (r215916)
  • Updated document.cookie to only return cookies if the document URL has a network scheme or is a file URL (r216341)
  • Removed the non-standard document.implementation.createCSSStyleSheet() API (r216458)
  • Removed the non-standard Element.scrollByLines() and scrollByPages() (r216456)
  • Changed to allow a null comparator in Array.prototype.sort (r216169)
  • Changed to set the Response.blob() type based on the content-type header value (r216353)
  • Changed Element.slot to be marked as [Unscopable] (r216228)
  • Implemented HTMLPreloadScanner support for <link preload> (r216143)
  • Fixed setting Response.blob() type correctly when the body is a ReadableStream (r216073)
  • Moved offsetParent, offsetLeft, offsetTop, offsetWidth, offsetHeight properties from Element to HTMLElement (r216466)
  • Moved the style property from Element to HTMLElement and SVGElement, then made it settable (r216426)

JavaScript

  • Fixed Arrow function access to this after eval('super()') within a constructor (r216329)
  • Added support for dashed values in unicode locale extensions (r216122)
  • Fixed the behaviour of the .sort(callback) method to match Firefox and Chrome (r216137)

CSS

  • Fixed space-evenly behavior with Flexbox (r216536)
  • Fixed font-stretch:normal to select condensed fonts (r216517)
  • Fixed custom properties used in rgb() with calc() (r216188)

Accessibility

  • Fixed the behavior of aria-orientation="horizontal" on a list (r216452)
  • Prevented exposing empty roledescription (r216457)
  • Propagated aria-readonly to grid descendants (r216425)
  • Changed to ignore aria-rowspan value if a rowspan value is provided for a <td> or <th> (r216167)
  • Fixed an issue causing VoiceOver to skip cells after a cell with aria-colspan (r216134)
  • Changed to treat cells with ARIA table cell properties as cells (r216123)
  • Updated implementation of aria-orientation to match specifications (r216089)

Web Inspector

  • Added resource load error reason text in the details sidebar (r216564)
  • Fixed toggling the Request and Response resource views in certain cases (r216461)
  • Fixed miscellaneous RTL and localization issues (r216465, r216629)
  • Fixed Option-Click on URL behavior in Styles sidebar (r216166)
  • Changed 404 Image Loads to appear as a failures in Web Inspector (r216138)

WebDriver

  • Fixed several issues that prevented cookie-related endpoints from working correctly (r216258, r216261, r216292)

Media

  • Removed black background from video layer while in fullscreen (r216472)

Rendering

  • Fixed problem with the CSS Font Loading API’s load() function erroneously resolving promises when used with preinstalled fonts (r216079)
  • Fixed flickering on asynchronous image decoding and ensured the image is incrementally displayed as new data is received (r216471)

By Jon Davis at May 17, 2017 05:00 PM

May 15, 2017

Responsive Design for Motion

Surfin’ Safari

WebKit now supports the prefers-reduced-motion media feature, part of CSS Media Queries Level 5, User Preferences. The feature can be used in a CSS @media block or through the window.matchMedia() interface in JavaScript. Web designers and developers can use this feature to serve alternate animations that avoid motion sickness triggers experienced by some site visitors.

To explain who this media feature is for, and how it’s intended to work, we’ll cover some background. Skip directly to the code samples or prefers-reduced-motion demo if you wish.

Motion as a Usability Tool

CSS transforms and animations were proposed by WebKit engineers nearly a decade ago as an abstraction of Core Animation concepts wrapped in a familiar CSS syntax. The standardization of CSS Transforms/CSS Animations and adoption by other browsers helped pave the way for web developers of all skill levels. Richly creative animations were finally within reach, without incurring the security risk and battery cost associated with plug-ins.

The perceptual utility of appropriate, functional motion can increase the understandability and —yes— accessibility of a user interface. There are numerous articles on the benefits of animation to increase user engagement:

In 2013, Apple released iOS 7, which included heavy use of animated parallax, dimensionality, motion offsets, and zoom effects. Animation was used as tool to minimize visual user interface elements while reinforcing a user’s understanding of their immediate and responsive interactions with the device. New capabilities in web and native platforms like iOS acted as a catalyst, leading the larger design community to a greater awareness of the benefits of user interface animation.

Since 2013, use of animation in web and native apps has increased by an order of magnitude.

Motion is Wonderful, Except When it’s Not

Included in the iOS accessibility settings is a switch titled “Reduce Motion.” It was added in iOS 7 to allow users the ability to disable parallax and app launching animations. In 2014, iOS included public API for native app developers to detect Reduce Motion (iOS, tvOS) and be notified when the iOS setting changed. In 2016, macOS added a similar user feature and API so developers could both detect Reduce Motion (macOS) and be notified when the macOS pref changed. The prefers-reduced-motion media feature was first proposed to the CSS Working Group in 2014, alongside the release of the iOS API.

Wait a minute! If we’ve established that animation can be a useful tool for increasing usability and attention, why should it ever be disabled or reduced?

The simplest answer is, “We’re not all the same.” Preference is subjective, and many power users like to reduce UI overhead even further once they’ve learned how the interface works.

The more important, objective answer is, “It’s a medical necessity for some.” In particular, this change is required for a portion of the population with conditions commonly referred to as vestibular disorders.

Vestibular Spectrum Disorder

Vestibular disorders are caused by problems affecting the inner ear and parts of the brain that control balance and spatial orientation. Symptoms can include loss of balance, nausea, and other physical discomfort. Vestibular disorders are more common than you might guess: affecting as many as 69 million people in the United States alone.

Most people experience motion sickness at some point in their lives, usually while traveling in a vehicle. Consider the last time you were car-sick, sea-sick, or air-sick. Nausea can be a symptom of situations where balance input from your inner ear seems to conflict with the visual orientation from your eyes. If your senses are sending conflicting signals to your brain, it doesn’t know which one to trust. Conflicting sensory input can also be caused by neurotoxins in spoiled food, hallucinogens, or other ingested poisons, so a common hypothesis is that these conflicting sensory inputs due to motion or vestibular responses lead your brain to infer its being poisoned, and seek to expel the poison through vomiting.

Whatever the underlying cause, people with vestibular disorders have an acute sensitivity to certain motion triggers. In extreme cases, the symptoms can be debilitating.

Vestibular Triggers

The following sections include examples of common vestibular motion triggers, and variants. If your site or web application includes similar animations, consider disabling or using variants when the prefers-reduced-motion media feature matches.

Trigger: Scaling and Zooming

Visual scaling or zooming animations give the illusion that the viewer is moving forward or backward in physical space. Some animated blurring effects give a similar illusion.

Note: It’s okay to keep many real-time, user-controlled direct manipulation effects such as pinch-to-zoom. As long as the interaction is predictable and understandable, a user can choose to manipulate the interface in a style or speed that works for their needs.

Example 1: Mouse-Triggered Scaling

How to Shoot on iPhone incorporates a number of video and motion effects, including a slowly scaling poster when the user’s mouse hovers over the video playback buttons.

The Apple.com team implemented prefers-reduced-motion to disable the scaling effect and background video motion.

Example 2: 3D Zoom + Blur

The macOS web site simulates flying away from Lone Pine Peak in the Sierra Nevada mountain range. A three-dimensional dolly zoom and animated blur give the viewer a sense that physical position and focal depth-of-field is changing.

In mobile devices, or in browsers that can’t support the more complicated animation, the effect is reduced to a simpler scroll view. By incorporating similar visual treatment, the simpler variant retains the original design intention while removing the effect. The same variant could be used with prefers-reduced-motion to avoid vestibular triggers.

Update May 25, 2017: The macOS web site has been modified to support reduced motion when preferred.

Trigger: Spinning and Vortex Effects

Effects that use spiraling or spinning movements can cause some people with vestibular disorders to lose their balance or vertical orientation.

Example 3: Spinning Parallax Starfield

Viljami Salminen Design features a spinning, background star field by default.

It has incorporated prefers-reduced-motion to stop the spinning effect for users with vestibular needs. (Note: The following video is entirely motionless.)

Trigger: Multi-Speed or Multi-Directional Movement

Parallax effects are widely known, but other variants of multi-speed or multi-directional movement can also trigger vestibular responses.

Example 4: iOS 10 Site Scrolling

The iOS 10 site features images moving vertically at varying speeds.

A similar variant without the scroll-triggered image offsets could be used with prefers-reduced-motion to avoid vestibular triggers.

Update May 25, 2017: The iOS 10 site has been modified to support reduced motion when preferred.

Trigger: Dimensionality or Plane Shifting

These animations give the illusion of moving two-dimensional (2D) planes in three-dimensional (3D) space. The technique is sometimes referred to as two-and-a-half-dimensional (2.5D).

Example 5: Plane-Shifted Scrolling

Apple’s Environment site features a animated solar array that tilts as the page scrolls.

The site supports a reduced motion variant where the 2.5D effect remains a still image.

Trigger: Peripheral Motion

Horizontal movement in the peripheral field of vision can cause disorientation or queasiness. Think back to the last time you read a book while in a moving vehicle. The center of your vision was focused on the text, but there was constant movement in the periphery. This type of motion is fine for some, and too much to stomach for others.

Example 6: Subtle, Constant Animation Near a Block of Text

After scrolling to the second section on Apple’s Environment site, a group of 10-12 leaves slowly floats near a paragraph about renewable energy.

In the reduced motion variant, these leaves are stationary to prevent peripheral movement while the viewer focuses on the nearby text content.

Take note that only the animations known to be problematic have be modified or removed from the site. More on that later.

Using Reduce Motion on the Web

Now that we’ve covered the types of animation that can trigger adverse vestibular symptoms, let’s cover how to implement the new media feature into your projects.

CSS @Media Block

An @media block is the easiest way to incorporate motion reductions into your site. Use it to disable or change animation and transition values, or serve a different background-image.

@media (prefers-reduced-motion) {
  /* adjust motion of 'transition' or 'animation' properties */
}

Review the prefers-reduced-motion demo source for example uses.

MediaQueryList Interface

Animations and DOM changes are sometimes controlled with JavaScript, so you can leverage the prefers-reduced-motion media feature with window.matchMedia and register for an event listener whenever the user setting changes.

var motionQuery = matchMedia('(prefers-reduced-motion)');
function handleReduceMotionChanged() {
  if (motionQuery.matches) {
    /* adjust motion of 'transition' or 'animation' properties */
  } else { 
    /* standard motion */
  }
}
motionQuery.addListener(handleReduceMotionChanged);
handleReduceMotionChanged(); // trigger once on load if needed

Review the prefers-reduced-motion demo source for example uses.

Using the Accessibility Inspector

When refining your animations, you could toggle the iOS Setting or macOS Preference before returning to your app to view the result, but this indirect feedback loop is slow and tedious. Fortunately, there’s a better way.

The Xcode Accessibility Inspector makes it easier to debug your animations by quickly changing any visual accessibility setting on the host Mac or a tethered device such as an iPhone.

  1. Attach your iOS device via USB.
  2. Select the iOS device in Accessibility Inspector.
  3. Select the Settings Tab.

Alternate closed-captioned version of the Accessibility Inspector demo below.

Don’t Reduce Too Much

In some cases, usability can suffer when reducing motion. If your site uses a vestibular trigger animation to convey some essential meaning to the user, removing the animation entirely may make the interface confusing or unusable.

Even if your site uses motion in a purely decorative sense, only remove the animations you know to be vestibular triggers. Unless a specific animation is likely to cause a problem, removing it prematurely only succeeds in making your site unnecessarily boring.

Consider each animation in its context. If you determine a specific animation is likely to be a vestibular trigger, consider serving an alternate, simpler animation, or display another visual indicator to convey the intended meaning.

Takeaways

  1. Motion can be a great tool for increasing usability and engagement, but certain visual effects trigger physical discomfort in some viewers.
  2. Avoid vestibular trigger animations where possible, and use alternate animations when a user enables the “Reduce Motion” setting. Try out these settings, and use the new media feature when necessary. Review the prefers-reduced-motion demo source for example uses.
  3. Remember that the Web belongs to the user, not the author. Always adapt your site to fit their needs.

More Information

By James Craig at May 15, 2017 07:00 PM

May 03, 2017

Javier Fernández: Can I use CSS Box Alignment ?

Igalia WebKit

As a member of the Igalia’s team implementing the CSS Grid Layout feature for Blink and WebKit rendering engines, I’m very proud of what we’ve achieved from our collaboration with Bloomberg. I think Grid is a very interesting feature for the Web Platform and we still can’t see all its potential.

One of my main assignments on this project is to implement the CSS Box Alignment spec for Grid. It’s obvious that alignment is an important feature for many cases in web development, but I consider it a key for a layout model like the one Grid provides.

We recently announced that the patch implementing the self-baseline alignment landed in Blink. This was the last alignment functionality pending to implement, so now we can consider that the spec is complete for Grid. However, implementing a feature like CSS Box Alignment has an additional complexity in the form of interoperability issues.

Interoperability is always a challenge when implementing any new specification, but I think it’s specially problematic for a feature like this for several reasons:

  • The feature applies to several layout models.
  • The CSS Flexible Box specification already defined some of the CSS properties and values.
  • Once a new layout model implements the new specification, Flexbox is forced to follow it as well.

I admit that the editors of this new specification document made a huge effort to keep backward compatibility with the Flexbox spec (which caused not so few implementation challenges). However, the current Flexbox implementation of the CSS properties and values that both specs have in common would become a Partial Implementation regarding the new spec.

Recently Florian Rivoal found out that this partial implementation of the CSS Box Alignment feature prevents the use of cascade or @support for providing customized fallbacks for the unimplemented Alignment properties.

What does Partial Implementation actually mean ?

As anybody can imagine, implementing a fancy web feature takes a considerable amount of time. During this period, the feature passes through several phases with different exposure to the end users. It’s precisely due to the importance of end user’s feedback that these new web features are shipped under experimental flags. This workflow is specially useful no only for browser devs but for the spec editors as well.

For this reason, the W3C CSS Working Group defines a general policy to manage Partial Implementations, which can be summarized as follows:

So that authors can exploit the forward-compatible parsing rules to assign fallback values, CSS renderers must treat as invalid (and ignore as appropriate) any at-rules, properties, property values, keywords, and other syntactic constructs for which they have no usable level of support. In particular, user agents must not selectively ignore unsupported property values and honor supported values in a single multi-value property declaration: if any value is considered invalid (as unsupported values must be), CSS requires that the entire declaration be ignored.

This policy is added to every spec as part of its Conformance appendix, so it is in the case of the CSS Box Alignment specification document. However, the interpretation of the Partial Implementation policy is far from trivial, specially for a feature like CSS Box Alignment. The most restrictive interpretation would imply the following facts:

  • Any new CSS property of the new spec should be declared invalid until is supported by all the layout models it applies to.
  • Any of the already existent CSS properties with new values defined in the new spec should be declared invalid until all these new values are implemented in all the layout models such property applies to.
  • Browsers shouldn’t ship (without experimental flags) any CSS property or value until it’s implemented in all the layout model it applies to.

When we discussed about this at Igalia we applied a less restrictive interpretation, based on the assumption that the spec actually defined several features which could be implemented and shipped independently, obviously avoiding any browsers interoperability issues. As it’s been always in the nature of the specification, keeping backward compatibility with Flexbox implementations has been a must, since its spec already defines some of the CSS properties now present in the new spec.

The issue filed by Florian was discussed during the Tokyo F2F Apr 19-21 2017 meeting, where it was agreed to add a new section in the CSS Box Alignment spec to clarify how implementors of this feature should manage Partial Implementations:

Since it is expected that support for the features in this module will be deployed in stages corresponding to the various layout models affected, it is hereby clarified that the rules for partial implementations that require treating as invalid any unsupported feature apply to any alignment keyword which is not supported across all layout modules to which it applies for layout models in which the implementation supports the property in general.

The new text added makes the Partial Implementation policy less restrictive and, even it contradicts our interpretation of independent alignment features per layout model, it affects only to models which already implement any of the CSS properties defined in the new spec. In this case, only Flexbox has to be updated to implement the new values defined for its alignment related CSS properties: align-content, justify-content and align-self.

Analysis of the implementation and shipment status

Before thinking on how to address the Partial Implementation issues, I decided to analyze what’s the status of the CSS Box Alignment feature in the different browsers. If you are interested in the full analysis, it’s available here. The following table shows the implementation status of the new spec in the Safary, Chrome and Firefox browsers, using a color code like unimplemented, only grid or both (flex and grid):

If you can try out some examples of these Partial Implementation issues, just try flexbox vs grid cases with some of these alignment values: align-items: center, align-self: left; align-content: start or justify-content: end.

The 3 major browsers analyzed have shipped most, if not all, the CSS Box Alignment spec implemented for CSS Grid Layout (since Chrome 57, Safari 10.1, Firefox 52). Firefox is the browser which implemented and shipped a wider support for CSS Flexible Box.

We can extract the following conclusions:

  • The 3 browsers analyzed have shipped Partial Implementations of the CSS Box Alignment specification, although Firefox is almost complete.
  • The 3 browsers have shipped a Grid feature that supports completely the new CSS Box Alignment spec, although Safari still misses the self-baseline values.
  • The 3 implementations of the new CSS Box Alignment specification are backward compatible with the CSS Flexible Box specification, even though it implements for some properties a lower level of the spec (e.g. self-baseline keywords)

Work in progress

Although we are still evaluating the problem together with the Blink and WebKit communities, at Igalia we are already working on improving the situation. We all agree on the damage to the Web Platform that these Partial Implementation issues are causing, as Florian pointed out initially, so that’s a good starting point. There are bug reports on both WebKit and Blink and we are already providing patches for some of them.

We are still discussing about the best approach, but our bet would be to request an intent-to-implement-and-ship for a CSS Box Alignment (for flexbox layout) feature. This approach fits naturally in our initial plans of implementing several independent features from the alignment specification. It seems that it’s what Firefox is doing, which already announced the implementation of CSS Box Alignment (for block layout)

Thanks to Bloomberg for sponsoring this work, as part of the efforts that Igalia has been doing all these years pursuing a better and more open web.

Igalia & Bloomberg logos

By jfernandez at May 03, 2017 08:19 PM

Release Notes for Safari Technology Preview 29

Surfin’ Safari

Safari Technology Preview Release 29 is now available for download for macOS Sierra. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. This release covers WebKit revisions 215271-215859.

JavaScript

  • Implemented Intl.DateTimeFormat.prototype.formatToParts (r215616)
  • Improved Date.parse to accept wider range of date strings (r215359)
  • Implemented Object.isFrozen() and Object.isSealed() according to ECMA specifications (r215272)

CSS

  • Added support for percentage gaps for CSS Grid (r215463)
  • Changed :focus-within behavior to match specifications (r215719)

Rendering

  • Avoided repaints for invisible animations on tumblr.com (r215507)
  • Fixed rendering flexbox children across columns (r215320)
  • Fixed text-align:start and text-align:end behavior in table cells (r215375)
  • Fixed animations with large negative animation-delays that fail depending on machine uptime (r215352)
  • Reduced redundant text measuring during mid-word breaking (r215666)
  • Changed memory handling to keep all of the decoded frames for an animated image if the total memory size of the frames is under 30MB (up from 5MB) (r215557)
  • Fixed <li> content inside <ul> to wrap mid-word when word-break:break-word is set (r215660)
  • Fixed the location of the “recent searches” popover of <input type="search"> in RTL mode (r215830)

Web Inspector

  • Added regular expression support to XHR breakpoints (r215584)
  • Added a pause reason for “All Requests” XHR breakpoint (r215427)
  • Fixed the enabled state of “All Requests” XHR breakpoint to be correctly restored (r215435)
  • Fixed a bug where XHR breakpoints would disappear when the inspected page is reloaded (r215473)
  • Fixed XHR breakpoints restored from settings but not appearing in the sidebar (r215422)
  • Fixed Network datagrid columns to correctly restore their shown or hidden state (r215449)
  • Added tooltips to Network grid items for easier reading when text overflows (r215631)
  • Fixed sorting by Priority column in Network datagrids (r215793)
  • Fixed the display of Web Socket messages with non-latin letters (r215388)
  • Prevented showing the Search tab for location links, prefer the Resources tab (r215630)
  • Changed to treat Uint8ClampedArray as an array, not an object (r215855)
  • Fixed Command-G (⌘G) shortcut to allow Find next to work in the console (r215795)
  • Implemented autocompletion for CSS variables (r215358)
  • Updated the icon for the Ignore resource cache button in the Network Tab (r215440)

WebCrypto

  • Added support for ECDSA (r215423)
  • Improved converting an ECDSA signature binary into DER format (r215791)

Accessibility

  • Changed the role description of <hr> from “separator” to “rule” (r215532)

Media

  • Restricted WebKit image formats to a known whitelist (r215706, r215829). WebKit now only loads images of the following formats:
    • PNG (.png)
    • GIF (.gif)
    • JPEG (.jpg), (.jpeg), (.jpe), (.jif), (.jfif), (.jfi)
    • JPEG 2000 (.jp2), (.j2k), (.jpf), (.jpx), (.jpm), (.mj2)
    • TIFF (.tiff), (.tif)
    • MPO (.mpo)
    • Microsoft Bitmap (.bmp), (.dib)
    • Microsoft Cursor (.cur)
    • Microsoft Icon (.ico)

Bug Fixes

  • Fixed an issue where the status bar would not display modifier key information (e.g. “Open * in new tab” when holding the Command key) (r215790)
  • Improved performance of typing on pages with many <input> elements
  • Fixed an issue where a hardware “enter” key would not dismiss JavaScript alert, confirm, or prompt; previously, only the “return” key would dismiss a dialog
  • Fixed QuotaExceededError when saving to localStorage in private browsing mode or WebDriver sessions (r215315)
  • Fixed an issue where the Content-Disposition header filename was ignored when the download attribute is specified (r215736)
  • Fixed escaping ‘<‘ and ‘>’ in attribute values when using XMLSerializer.serializeToString() API (r215648)
  • Fixed issues causing beforeunload dialog to be shown even though the user did not interact with the page (r215404)
  • Changed all CORS requests and cross origin access from file:// to be blocked unless Disable Local File Restrictions is selected from the Develop menu

By Jon Davis at May 03, 2017 05:00 PM

Carlos García Campos: WebKitGTK+ remote debugging in 2.18

Igalia WebKit

WebKitGTK+ has supported remote debugging for a long time. The current implementation uses WebSockets for the communication between the local browser (the debugger) and the remote browser (the debug target or debuggable). This implementation was very simple and, in theory, you could use any web browser as the debugger because all inspector code was served by the WebSockets. I said in theory because in the practice this was not always so easy, since the inspector code uses newer JavaScript features that are not implemented in other browsers yet. The other major issue of this approach was that the communication between debugger and target was not bi-directional, so the target browser couldn’t notify the debugger about changes (like a new tab open, navigation or that is going to be closed).

Apple abandoned the WebSockets approach a long time ago and implemented its own remote inspector, using XPC for the communication between debugger and target. They also moved the remote inspector handling to JavaScriptCore making it available to debug JavaScript applications without a WebView too. In addition, the remote inspector is also used by Apple to implement WebDriver. We think that this approach has a lot more advantages than disadvantages compared to the WebSockets solution, so we have been working on making it possible to use this new remote inspector in the GTK+ port too. After some refactorings to the code to separate the cross-platform implementation from the Apple one, we could add our implementation on top of that. This implementation is already available in WebKitGTK+ 2.17.1, the first unstable release of this cycle.

From the user point of view there aren’t many differences, with the WebSockets we launched the target browser this way:

$ WEBKIT_INSPECTOR_SERVER=127.0.0.1:1234 browser

This hasn’t changed with the new remote inspector. To start debugging we opened any browser and loaded

http://127.0.0.1:1234

With the new remote inspector we have to use any WebKitGTK+ based browser and load

inspector://127.0.0.1:1234

As you have already noticed, it’s no longer possible to use any web browser, you need to use a recent enough WebKitGTK+ based browser as the debugger. This is because of the way the new remote inspector works. It requires a frontend implementation that knows how to communicate with the targets. In the case of Apple that frontend implementation is Safari itself, which has a menu with the list of remote debuggable targets. In WebKitGTK+ we didn’t want to force using a particular web browser as debugger, so the frontend is implemented as a builtin custom protocol of WebKitGTK+. So, loading inspector:// URLs in any WebKitGTK+ WebView will show the remote inspector page with the list of debuggable targets.

It looks quite similar to what we had, just a list of debuggable targets, but there are a few differences:

  • A new debugger window is opened when inspector button is clicked instead of reusing the same web view. Clicking on inspect again just brings the window to the front.
  • The debugger window loads faster, because the inspector code is not served by HTTP, but locally loaded like the normal local inspector.
  • The target list page is updated automatically, without having to manually reload it when a target is added, removed or modified.
  • The debugger window is automatically closed when the target web view is closed or crashed.

How does the new remote inspector work?

The web browser checks the presence of WEBKIT_INSPECTOR_SERVER environment variable at start up, the same way it was done with the WebSockets. If present, the RemoteInspectorServer is started in the UI process running a DBus service listening in the IP and port provided. The environment variable is propagated to the child web processes, that create a RemoteInspector object and connect to the RemoteInspectorServer. There’s one RemoteInspector per web process, and one debuggable target per WebView. Every RemoteInspector maintains a list of debuggable targets that is sent to the RemoteInspector server when a new target is added, removed or modified, or when explicitly requested by the RemoteInspectorServer.
When the debugger browser loads an inspector:// URL, a RemoteInspectorClient is created. The RemoteInspectorClient connects to the RemoteInspectorServer using the IP and port of the inspector:// URL and asks for the list of targets that is used by the custom protocol handler to create the web page. The RemoteInspectorServer works as a router, forwarding messages between RemoteInspector and RemoteInspectorClient objects.

By carlos garcia campos at May 03, 2017 03:43 PM

April 20, 2017

A Few Words on Fetching Bytes

Surfin’ Safari

Like all good puzzles, a web browser is composed of many different pieces. Some are all shiny, like your favorite web API. Some are less visible, like HTML parsing and web resource loading.

Even dull pieces require lots of work to standardize their behavior across browsers. For example, HTML parsing originally provided only: Give me HTML and I’ll give you a document. Now, it is much more reliable across browsers because it has been standardized in detail. Similarly, the loading of web resources was somehow consistent up to: give me an HTTP request and I’ll get you a HTTP response. But loading a web resource encompasses much more than that. The Fetch specification thoroughly standardizes those details. As well as specifying how the browser loads resources, the Fetch specification also defines a JavaScript API for loading resources. This API, the Fetch API, is a replacement to XMLHttpRequest, providing the lowest-level set of options possible in the context of a web page. Let’s see how shiny Fetch API might be.

The Fetch API

The Fetch API consists of a single Promise-returning method called fetch. The returned value is a Response object which contains the response headers and body information. Let’s use the Fetch API to retrieve the list of WebKit features:

async function isFetchAPIFeelingGood() {
    let webkitFeaturesURL = "https://svn.webkit.org/repository/webkit/trunk/Source/WebCore/features.json";
    let response = await fetch(webkitFeaturesURL);
    let features = await response.json();
    return features.specification.find((feature) =>
        feature.name == "Fetch API");
}
isFetchAPIFeelingGood().then((value) => alert(!!value ? "Oh yes!" : "not really!"))

You might notice two await uses in the example above. fetch is returning a promise that gets resolved when the response headers are received. The data being requested is JSON. The second promise resolves when the entire response body is available.

fetch can take either a URL or a Request object. The Request object allows access to a whole new set of options compared to XMLHttpRequest. Let’s try again to check whether fetch API is supported in WebKit, but this time, let’s make sure our cache does not serve us some out-of-date information.

async function isFetchAPIFeelingGoodForReal() {
    let webkitFeaturesURL = "https://svn.webkit.org/repository/webkit/trunk/Source/WebCore/features.json";
    let response = await fetch(new Request(webkitFeaturesURL,
        { cache: "no-cache" }
    ));
    let latestFeatures = await response.json();
    return latestFeatures.specification.find((feature) =>
        feature.name == "Fetch API");
}

fetch also provides more flexible access to the response body. In addition to getting it in various flavors (JSON, arrayBuffer, blob, text…), the response provides a ReadableStream body attribute. This makes it possible to process chunks of bytes progressively as they arrive without buffering the whole data, and even aborting the resource load:

async function featureListAsAReader() {
    let webkitFeaturesURL = "https://svn.webkit.org/repository/webkit/trunk/Source/WebCore/features.json";
    let response = await fetch(new Request(webkitFeaturesURL));
    return response.body.getReader();
}

function checkChunk(searched, buffer, count)
{
    var i = 0;
    while (i < buffer.length) {
        if (buffer[i++] == searched.charCodeAt(count)) {
            if (++count == searched.length)
               return count;
        } else if (count) {
            --i;
            count = 0;
        }
    }
    return count;
}

async function isFetchAPIFeelingGoodWhileChunky(reader, count)
{
    reader = reader ? reader : await featureListAsAReader();
    count = count ? count : 0;

    let chunk = await reader.read();
    if (chunk.done)
        return false;

    let searched = "Fetch API";
    count = checkChunk(searched, chunk.value, count);
    if (count == searched.length)
        return true;
    return isFetchAPISupported(reader, count);
}

Fetching The Future

The Fetch API journey is not finished. New proposals might cover important features of XMLHttpRequest that Fetch currently lacks, like early cancellation and timeout. New proposals might also cover HTTP/2 push and priority, as well as wider use of the Response object in web APIs: media elements, Web Assembly… The Fetch algorithm is also being constantly refined to reach full interoperability of web resource loading. A first iteration of WebKit Fetch API implementation shipped in Safari. The WebKit community is eager to hear about your feedback on this feature. Comments, suggestions, priorities, use cases, tests, bug reports and candies are all very welcome through the usual WebKit channels. That would be so fetch indeed!

View post on imgur.com

By Youenn Fablet at April 20, 2017 06:00 PM

April 19, 2017

Release Notes for Safari Technology Preview 28

Surfin’ Safari

Safari Technology Preview Release 28 is now available for download for macOS Sierra. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. This release covers WebKit revisions 214535-215271.

Power and Performance

  • Changed to pause silent WebAudio rendering in background tabs (r214721)
  • Changed to pause animated SVG images on pages loaded in the background (r214561)
  • Changed to make inaudible background tabs become eligible for memory kill after 8 minutes (r215077)
  • Changed to kill any WebContent process using over 16 GB of memory (r215055)
  • Throttled DOM Timers to 30fps in cross-origin iframes that the user did not interact with (r215116)
  • Throttled requestAnimationFrame callbacks to 30fps in cross-origin iframes the user did not interact with (r215070, r215153)

CSS

  • Adapted content-alignment properties to the new baseline syntax (r214624)
  • Adapted place-content alignment shorthand to the new baseline syntax (r214852)
  • Adapted self-alignment properties to the new baseline syntax (r214564)
  • Fixed scroll offset jumps after a programmatic scroll in an overflow container with scroll snapping (r215075)
  • Implemented the place-items shorthand (r214966)
  • Implemented stroke-color CSS property (r215261)
  • Implemented stroke-miterlimit CSS property (r214787)
  • Unprefixed CSS cursor values grab and grabbing (r215146)

JavaScript

  • Fixed objects with gaps between numerical keys getting filled by NaN values (r214714)
  • Fixed Object.seal() and Object.freeze() on global this (r215072)
  • Fixed String.prototype.replace to correctly apply special replacement parameters when passed a function (r214662)

Web API

  • Changed _blank, _self, _parent, and _top browsing context names to be case-insensitive (r214944)
  • Cleaned up touch event handler registration when moving nodes between documents (r214819)
  • Fixed <input type="range"> to prevent breaking all mouse events when changing to disabled while active (r214955)
  • Prevented double downloads of preloaded content from when the content is in MemoryCache (r215229)
  • Fixed WebSocket.send (r215102)

Web Inspector

  • Added a preference for Auto Showing Scope Chain sidebar on pause (r214847)
  • Changed the order of Debugger tab sidebar panels: Scope Chain, Resource, Probes (r215047)
  • Changed XHR breakpoints to be global (r214956)
  • Changed hierarchical path component labels to guess directionality based on content for RTL layout (r214862)
  • Fixed RTL alignment of close button shown while docked (r214902)
  • Fixed RTL layout issues in call frame tree elements and async call stacks (r214846)
  • Fixed RTL layout issues in the debugger dashboard putting arrows on the wrong side (r214899)
  • Fixed RTL layout issues in Type Profiler popovers (r214906)
  • Fixed misplaced highlights in Search results of the Search navigation sidebar for RTL layout (r214864)
  • Fixed disappearing section when clicking on the body of a CSS rule after editing (r214863)
  • Fixed showing indicators for hidden DOM element breakpoints in the Elements tab (r214844)
  • Fixed blank Network tab content view after reload (r214551)
  • Made “Enter Class Name” text field wider so the placeholder text doesn’t clip (r215192)
  • Fixed probe values not showing in the Debugger tab sidebar (r214967)
  • Fixed focusing the Find banner immediately after showing it (r214856)
  • Fixed showing Source Map Resources in the Debugger Sources list (r215082)
  • Fixed Styles sidebar warning icon appearing inside property value text (r214617)
  • Fixed broken tabbing in Styles sidebar when additional “:” and “;” are in the property value (r215170)
  • Fixed clipped data in WebSockets data grid (r215206)
  • Fixed staying scrolled to the bottom as new WebSocket log messages get added (r214587)
  • Included additional pause reason details for DOM “subtree modified” breakpoint (r214861)
  • Included more Network information in Resource Details Sidebar (r214903)
  • Included all headers in the Request Headers section of the Resource details sidebar (r215062)

WebDriver

  • Fixed an issue that prevented non-popup windows from being maximized or resized
  • Fixed an issue that caused previously opened tabs to reopen when Safari was launched in order to run a WebDriver test

Accessibility

  • Exposed a new AXSubrole for the explicit ARIA “group” role (r214623)
  • Fixed VoiceOver web article navigation with an article rotor for sites like Facebook and Twitter (r215236)

Media

  • Fixed seeks to currentTime=0 if currentTime is already 0 (r214959)

Rendering

  • Fixed clipping across page breaks when including <caption>, <thead> or <tbody> in a <table> (r214712)
  • Fixed Japanese fonts in vertical text to support synthesized italics (r214848)
  • Fixed long Arabic text in ContentEditable with CSS white-space=pre to prevent hangs (r214726)
  • Fixed overly heavy fonts on facebook.com by attempting to normalize variation ranges (r214585, r214572)

WebCrypto

  • Added support for AES-CTR (r215051)

Security

  • Changed private browsing sessions to not look in keychain for client certificates (r215125)

AppleScript

  • Fixed an issue where Safari would throw an exception when evaluating JavaScript ending with an implied return value, where the final statement doesn’t include the return keyword

By Jon Davis at April 19, 2017 05:00 PM

April 05, 2017

WebGPU Prototype and Demos

Surfin’ Safari

A few weeks ago, we announced the creation of the W3C GPU for the Web Community Group and our proposal, WebGPU. As of Safari Technology Preview Release 26, and WebKit Nightly Build, a WebGPU prototype is available for you to experiment with on macOS.

To enable WebGPU, first make sure the Develop menu is visible using SafariPreferencesAdvancedShow Develop menu in menu bar. Then, in the Develop menu, make sure Experimental FeaturesWebGPU is checked.

We also have written some simple demos to give you an idea of how the API works. Note that our implementation and documented proposal are not quite aligned, so the code in these demos will change over time. We’ve tried to indicate the places in the code where the prototype is behind the proposal. And, to reiterate, this is a proposal – the final API will almost certainly be quite different.

You should be aware that it is not recommended to browse the Web with this feature enabled. Not only is it experimental and non-standard, we haven’t implemented any validation of content, so an error in WebGPU might cause a crash, or worse.

By Dean Jackson at April 05, 2017 10:40 PM

Release Notes for Safari Technology Preview 27

Surfin’ Safari

Safari Technology Preview Release 27 is now available for download for macOS Sierra. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. This release covers WebKit revisions 213822-214535.

Browser Changes

  • Added a “Reload Page From Origin” alternate menu item to the View menu. This action reloads a page without using cached resources.
  • Removed the Option-Command-R (⌥⌘R) keyboard shortcut from “Enter/Exit Responsive Design Mode” and mapped it to “Reload Page From Origin” instead.
  • Removed the Disable Caches menu item in the Develop menu. The equivalent functionality is now available through Web Inspector’s Network tab.

JavaScript

  • Implemented ESNext Object Spread proposal (r214038)
  • Changed to allow labels named let when parsing a statement in non-strict mode (r213850)
  • Fixed const location = "foo" in a worker to not throw a SyntaxError (r214145)

Web API

  • Aligned initEvent, initCustomEvent, initMessageEvent with the latest specification (r213825)
  • Aligned Document.elementFromPoint() with the CSSOM specification (r213836)
  • Changed XMLHttpRequest getAllResponseHeaders() to transform header names to lowercase before sorting (r214252)
  • Fixed sending an empty "Access-Control-Request-Headers" in preflight requests (r214254)
  • Implemented self.origin (r214147)
  • Implemented the “noopener” feature for window.open() (r214251)
  • Improved index validation when using uint index values in WebGL (r214086)
  • Prevented beforeunload alerts when the user hasn’t interacted with the web page (r214277)
  • Prevented innerText setter from inserting an empty text node if the value starts with a newline (r214136)
  • Prevented new navigations during document unload (r214365)
  • Prevented WebSQL databases from being openable in private browsing (r214309)
  • Changed serialization of custom properties in longhand to be "", not the value of the shorthand property (r214383)
  • Changed to tear down descendant renderers when a <slot> display value is set to "contents" (r214232)

Rendering

  • Fixed pausing animated SVG images when they are outside the viewport, or removed from the document (r214503, r214327)
  • Changed asynchronous image decoding to consider when the drawing size is smaller than the size of the image (r213830)
  • Prevented large images from being decoded asynchronously when they are drawn on a canvas (r214450)
  • Fixed the flow state for positioned inline descendants (r214119)
  • Fixed initial letter rendering that follows pagination (r214110)
  • Fixed clipping columns horizontally in multi-column layout (r213832)
  • Fixed animated GIFs that fail to play in multi-column layout (r213826)

CSS

  • Fixed an issue where a dynamically applied :empty pseudo class with display:none does not get unapplied (r214290)
  • Unprefixed -webkit-min-content, -webkit-max-content and -webkit-fit-content (r213831)

Media

  • Fixed loading media files in a <video> tag that are served without a filename extension (r214269)
  • Suspended silent videos playback in background tabs to save CPU (r214195)

Web Inspector

  • Added “Disable Caches” toggle in the Network tab that only applies to the inspected page while Web Inspector is open. (r214494)
  • Added “Save Selected” context menu item to Console (r214077)
  • Added RTL support to the Timeline tab (r213925, r213942, r213928, r213997, r213924, r214009, r214062, r214076)
  • Added RTL support for the Find banner (r214048)
  • Added more accurate Resource Timing data in Web Inspector (r213917)
  • Added context menu item to log content of WebSocket frame (r214371)
  • Added icons for SVG Image cluster path components (r214011)
  • Added keyboard shortcut to clear timeline records (r214140)
  • Added a connection indicator for when a WebSocket connection is open or close (r214354)
  • Changed Option-clicking the close tab button to close all other tabs (r214464)
  • Changed to allow the user to copy locked CSS selectors in Style Rules sidebar (r213887)
  • Changed to allow users to click links in inline and user-agent styles (r214366)
  • Changed SVG image content view to allow toggling between the image and source (r213999)
  • Changed Event Listeners detail section to show listeners by element rather than by event (r213874)
  • Changed Event Listeners to add missing ‘once’ and ‘passive’ event listener flags (r213873)
  • Fixed an issue where adding a WebSocket message could change the currently selected resource (r214387)
  • Fixed clicking DOM breakpoint marker to enable and disable breakpoints (r214256)
  • Fixed an exception when clicking on Clear Network Items icon with the timing popover visible (r214199)
  • Fixed local storage keys and values starting with truncated strings (r214308)
  • Fixed empty attributes added to a DOM tree outline element adding whitespace within the tag (r214141)
  • Fixed an exception when fetching computed styles that can break future updates of section (r213961)
  • Fixed syntax highlighting and formatting when inspecting a main resource that is JavaScript or JSON (r214492)
  • Fixed pseudo-class markers overlapping DOM breakpoints and disclosure triangles (r214196)
  • Fixed an issue causing the Resource details sidebar to display previous image metrics when viewing resource where content load failed (r214436)
  • Fixed text selection in the Console to select only message text (r214024)
  • Fixed formatting JSON request data (r214487)
  • Fixed the filename used when saving a resource from the resource image content view (r214133)
  • The file save dialog no longer suggests the top level directory as the default location (r214442)

Accessibility

  • Fixed VoiceOver for editable text on the web (r214112)

WebCrypto

  • Added support for SPKI/PKCS8 Elliptic Curve cryptography (r214074)

By Jon Davis at April 05, 2017 05:00 PM

April 04, 2017

Manuel Rego: Announcing a New Edition of the Web Engines Hackfest

Igalia WebKit

Another year, another Web Engines Hackfest. Following the tradition that started back in 2009, Igalia is arranging a new edition of the Web Engines Hackfest that will happen in A Coruña from Monday, 2nd October, to Wednesday, 4th October.

The hackfest is a gathering of participants from the different parts of the open web platform community, working on projects like Chromium/Blink, WebKit, Gecko, Servo, V8, JSC, SpiderMonkey, Chakra, etc. The main focus of the event is to increase collaboration between the different browsers implementors by working together for a few days. On top of that, we arrange a few talks about some interesting topics which the hackfest attendees are working on, and also arrange breakout sessions for in-depth discussions.

Web Engines Hackfest 2016 Main Room Web Engines Hackfest 2016 Main Room

Last year almost 40 hackers joined the event, the biggest number of attendees ever. Previous attendees might have already received an invitation, but if not, just send us a request if you want to attend this year.

If you don’t want to miss any update, remember to follow @webhackfest on Twitter. See you in October!

April 04, 2017 10:00 PM

March 29, 2017

New Web Features in Safari 10.1

Surfin’ Safari

A new version of Safari shipped with the release of iOS 10.3 and macOS Sierra 10.12.4. Safari on iOS 10.3 and Safari 10.1 on macOS adds many important web features and improvements from WebKit that we are incredibly excited about.

While this release makes the web platform more capable and powerful, it also makes web development easier, simplifying the ongoing maintenance of your code. We’re excited to see how web developers will translate these improvements into better experiences for users.

Read on for quick look at the features included in this release.

Fetch

Fetch is a modern replacement for XMLHttpRequest. It provides a simpler approach to request resources asynchronously over the network. It also makes use of Promises from ECMAScript 2015 (ES6) for convenient, chain-able response handling. Compared to XMLHttpRequest, the Fetch API allows for cleaner, more readable code that is easier to maintain.

let jsonURLEndpoint = "https://svn.webkit.org/repository/webkit/trunk/Source/WebCore/features.json";
fetch(jsonURLEndpoint, {
    method: "get"
}).then(function(response) {
    response.json().then(function(json) {
        console.log(json);
    });
}).catch(function(error) {
    console.error(error);
});

Find out more in the blog post, A Few Words On Fetching Bytes.

CSS Grid Layout

CSS Grid Layout gives web authors a powerful new layout system based on a grid of columns and rows in a container. It is a significant step forward in providing manageable page layout tools in CSS that enable complex graphic designs that respond to viewport changes. Authors can use CSS Grid Layout to more easily achieve designs normally seen in print, that before required the use of layout quirks in existing CSS tools like floats and Flexbox.

Read more in the blog post, CSS Grid Layout: A New Layout Module for the Web.

ECMAScript 2016 & ECMAScript 2017

WebKit added support in Safari 10.1 for both ECMAScript 2016 and ECMAScript 2017, the latest standards revisions for the JavaScript language. ECMAScript 2016 adds small incremental improvements, but the 2017 standard brings several substantial improvements to JavaScript.

ECMAScript 2016 includes the exponentiation operator (x ** y instead of Math.pow(x, y)) and Array.prototype.includes. Array.prototype.includes is similar to Array.prototype.indexOf, except it can find values including NaN.

ECMAScript 2017 brings async and await syntax, shared memory objects including Atomics and Shared Array Buffers, String.prototype.padStart, String.prototype.padEnd, Object.prototype.values, Object.prototype.entries, and allows trailing commas in function parameter lists and calls.

IndexedDB 2.0

WebKit’s IndexedDB implementation has significant improvements in this release. It’s now faster, standards compliant, and supports new IndexedDB 2.0 features. IndexedDB 2.0 adds support for binary data types as index keys, so you’ll no longer need to serialize them into strings or array objects. It also brings object store and index renaming, getKey() on IDBObjectStore, and getPrimaryKey() on IDBIndex.

Find out more in the Indexed Database API 2.0 specification.

Custom Elements

Custom Elements enables web authors to create reusable components defined by their own HTML elements without the dependency of a JavaScript framework. Like built-in elements, Custom Elements can communicate and receive new values in their attributes, and respond to changes in attribute values using reaction callbacks.

For more information, read the Introducing Custom Elements blog post.

Gamepad

The Gamepad API makes it possible to use game controllers in your web apps. Any gamepad that works on macOS without additional drivers will work on a Mac. All MFi gamepads are supported on iOS.

Read more about the API in the Gamepad specifications.

Pointer Lock

In Safari on macOS, requesting Pointer Lock on an element gives developers the ability to hide the mouse pointer and access the raw mouse movement data. This is particularly helpful for authors creating games on the web. It extends the MouseEvents interface with movementX and movementY properties to provide a stream of information even when the movements are beyond the boundaries of the visible range. In Safari, when the pointer is locked on an element, a banner is displayed notifying the user that the mouse cursor is hidden. Pressing the Escape key once dismisses the banner, and pressing the Escape key again will release the pointer lock on the element.

You can get more information from the Pointer Lock specifications.

Keyboard Input in Fullscreen

WebKit used to restrict keyboard input in HTML5 fullscreen mode. With Safari 10.1 on macOS, when using HTML5 fullscreen mode, WebKit removes the keyboard input restrictions.

Interactive Form Validation

With support for HTML Interactive Form Validation, authors can create forms with data validation contraints that are checked automatically by the browser when the form is submitted, all without the need for JavaScript. It greatly simplifies the complexity of ensuring good data entry from users on the client-side and minimizes the need for complex JavaScript.

Read more about HTML Interactive Form Validation in WebKit.

Input Events

Input Events simplifies implementing rich text editing experiences on the web in contenteditable regions. The Input Events API adds a new beforeinput event to monitor and intercept default editing behaviors and enhances the input event with new attributes.

You can read more about Enhanced Editing with Input Events.

HTML5 Download Attribute

The download attribute for anchor elements is now available in Safari 10.1 on macOS. It indicates the link target is a download link that should download a file instead of navigating to the linked resource. It also enables developers to create a link that downloads blob data as files entirely from JavaScript. Clicking a link with a download attribute causes the target resource to be downloaded as a file. The optional value of the download attribute can be used to provide a suggested name for the file.

<a href="https://webkit.org/favicon.ico" download="webkit-favicon.ico">Download Favicon</a>

Find out more from the Downloading resources section in the HTML specification.

HTML Media Capture

In Safari on iOS, HTML Media Capture extends file input controls in forms to allow users to use the camera or microphone on the device to capture data.

File inputs can be used to capture an image, video, or audio:

<input name="imageCapture" type="file" accept="image/*" capture>
<input name="videoCapture" type="file" accept="video/*" capture>
<input name="audioCapture" type="file" accept="audio/*" capture>

More details are available in the HTML Media Capture specification.

Improved Fixed and Sticky Element Positioning

When using pinch-to-zoom, fixed and sticky element positioning has improved behavior using a “visual viewports” approach. Using the visual viewports model, focusing an input field that triggers the on-screen keyboard no longer disables fixed and sticky positioning in Safari on iOS.

Improved Web Inspector Debugging

The WebKit team added support for debugging Web Worker JavaScript threads in Web Inspector’s Debugger tab. There are also improvements to debugger stepping with highlights for the currently-executing and about-to-execute statements. The highlights make it much clearer what code is going to execute during debugging, especially for JavaScript with complex control flow or many expressions on a single line.

Learn more about JavaScript Debugging Improvements in Web Inspector.

CSS Wide-Gamut Colors

Modern devices support a broader range of colors. Now, web authors can use CSS colors in wide-gamut color spaces, including the Display P3 color space. A new color-gamut media query can be used to test if the display is capable of displaying a given color space. Then, using the new CSS color() function, developers can define a color in a specific color space.

@media (color-gamut:p3) {
    .brightred {
        color: color(display-p3 1.0 0 0);
    }
}

For more information, see the CSS Color Module Level 4 standards specification.

Reduced Motion Media Query

The new prefers-reduced-motion media query allows developers using animation to make accommodations for users with conditions where large areas of motion or drastic movements can trigger physical discomfort. With prefers-reduced-motion, authors can create styles that avoid motion for users that set the reduced motion preference in system settings.

@keyframes decorativeMotion {
    /* Keyframes for a decorative animation */
}

.background {
    animation: decorativeMotion 10s infinite alternate;
}

@media (prefers-reduced-motion) {
    .background {
        animation: none;
    }
}

Read more about Responsive Design for Motion.

Feedback

We’re looking forward to what developers will do with these features to make better experiences for users. These improvements are available to users running iOS 10.3 and macOS Sierra 10.12.4, as well as Safari 10.1 for OS X Yosemite and OS X El Capitan.

Most of these features were also previewed in Safari Technology Preview over the last few months. The changes included in this release of Safari span Safari Technology Preview releases 14, 15, 16, 17, 18, 19, and 20. You can download the latest Safari Technology Preview release to stay on the forefront of future web features.

Finally, we’d love to hear from you! Send a tweet to @webkit or @jonathandavis and let us know which of these features will have the most impact on your design or development work on the web.

By Jon Davis at March 29, 2017 10:00 PM

March 24, 2017

Michael Catanzaro: A Web Browser for Awesome People (Epiphany 3.24)

Igalia WebKit

Are you using a sad web browser that integrates poorly with GNOME or elementary OS? Was your sad browser’s GNOME integration theme broken for most of the past year? Does that make you feel sad? Do you wish you were using an awesome web browser that feels right at home in your chosen desktop instead? If so, Epiphany 3.24 might be right for you. It will make you awesome. (Ask your doctor before switching to a new web browser. Results not guaranteed. May cause severe Internet addiction. Some content unsuitable for minors.)

Epiphany was already awesome before, but it just keeps getting better. Let’s look at some of the most-noticeable new features in Epiphany 3.24.

You Can Load Webpages!

Yeah that’s a great start, right? But seriously: some people had trouble with this before, because it was not at all clear how to get to Epiphany’s address bar. If you were in the know, you knew all you had to do was click on the title box, then the address bar would appear. But if you weren’t in the know, you could be stuck. I made the executive decision that the title box would have to go unless we could find a way to solve the discoverability problem, and wound up following through on removing it. Now the address bar is always there at the top of the screen, just like in all those sad browsers. This is without a doubt our biggest user interface change:

Screenshot showing address bar visibleDiscover GNOME 3! Discover the address bar!

You Can Set a Homepage!

A very small subset of users have complained that Epiphany did not allow setting a homepage, something we removed several years back since it felt pretty outdated. While I’m confident that not many people want this, there’s not really any good reason not to allow it — it’s not like it’s a huge amount of code to maintain or anything — so you can now set a homepage in the preferences dialog, thanks to some work by Carlos García Campos and myself. Retro! Carlos has even added a home icon to the header bar, which appears when you have a homepage set. I honestly still don’t understand why having a homepage is useful, but I hope this allows a wider audience to enjoy Epiphany.

New Bookmarks Interface

There is now a new star icon in the address bar for bookmarking pages, and another new icon for viewing bookmarks. Iulian Radu gutted our old bookmarks system as part of his Google Summer of Code project last year, replacing our old and seriously-broken bookmarks dialog with something much, much nicer. (He also successfully completed a major refactoring of non-bookmarks code as part of his project. Thanks Iulian!) Take a look:

Manage Tons of Tabs

One of our biggest complaints was that it’s hard to manage a large number of tabs. I spent a few hours throwing together the cheapest-possible solution, and the result is actually pretty decent:

Firefox has an equivalent feature, but Chrome does not. Ours is not perfect, since unfortunately the menu is not scrollable, so it still fails if there is a sufficiently-huge number of tabs. (This is actually surprisingly-difficult to fix while keeping the menu a popover, so I’m considering switching it to a traditional non-popover menu as a workaround. Help welcome.) But it works great up until the point where the popover is too big to fit on your monitor.

Note that the New Tab button has been moved to the right side of the header bar when there is only one tab open, so it has less distance to travel to appear in the tab bar when there are multiple open tabs.

Improved Tracking Protection

I modified our adblocker — which has been enabled by default for years — to subscribe to the EasyPrivacy filters provided by EasyList. You can disable it in preferences if you need to, but I haven’t noticed any problems caused by it, so it’s enabled by default, not just in incognito mode. The goal is to compete with Firefox’s Disconnect feature. How well does it work compared to Disconnect? I have no clue! But EasyPrivacy felt like the natural solution, since we already have an adblocker that supports EasyList filters.

Disclaimer: tracking protection on the Web is probably a losing battle, and you absolutely must use the Tor Browser Bundle if you really need anonymity. (And no, configuring Epiphany to use Tor is not clever, it’s very dumb.) But EasyPrivacy will at least make life harder for trackers.

Insecure Password Form Warning

Recently, Firefox and Chrome have started displaying security warnings  on webpages that contain password forms but do not use HTTPS. Now, we do too:

I had a hard time selecting the text to use for the warning. I wanted to convey the near-certainty that the insecure communication is being intercepted, but I wound up using the word “cybercriminal” when it’s probably more likely that your password is being gobbled up by various  governments. Feel free to suggest changes for 3.26 in the comments.

New Search Engine Manager

Cedric Le Moigne spent a huge amount of time gutting our smart bookmarks code — which allowed adding custom search engines to the address bar dropdown in a convoluted manner that involved creating a bookmark and manually adding %s into its URL — and replacing it with an actual real search engine manager that’s much nicer than trying to add a search engine via bookmarks. Even better, you no longer have to drop down to the command line in order to change the default search engine to something other than DuckDuckGo, Google, or Bing. Yay!

New Icon

Jakub Steiner and Lapo Calamandrei created a great new high-resolution app icon for Epiphany, which makes its debut in 3.24. Take a look.

WebKitGTK+ 2.16

WebKitGTK+ 2.16 improvements are not really an Epiphany 3.24 feature, since users of older versions of Epiphany can and must upgrade to WebKitGTK+ 2.16 as well, but it contains some big improvements that affect Epiphany. (For example, Žan Doberšek landed an important fix for JavaScript garbage collection that has resulted in massive memory reductions in long-running web processes.) But sometimes WebKit improvements are necessary for implementing new Epiphany features. That was true this cycle more than ever. For example:

  • Carlos García added a new ephemeral mode API to WebKitGTK+, and modified Epiphany to use it in order to make incognito mode much more stable and robust, avoiding corner cases where your browsing data could be leaked on disk.
  • Carlos García also added a new website data API to WebKitGTK+, and modified Epiphany to use it in the clear data dialog and cookies dialog. There are no user-visible changes in the cookies dialog, but the clear data dialog now exposes HTTP disk cache, HTML local storage, WebSQL, IndexedDB, and offline web application cache. In particular, local storage and the two databases can be thought of as “supercookies”: methods of storing arbitrary data on your computer for tracking purposes, which persist even when you clear your cookies. Unfortunately it’s still not possible to protect against this tracking, but at least you can view and delete it all now, which is not possible in Chrome or Firefox.
  • Sergio Villar Senin added new API to WebKitGTK+ to improve form detection, and modified Epiphany to use it so that it can now remember passwords on more websites. There’s still room for improvement here, but it’s a big step forward.
  • I added new API to WebKitGTK+ to improve how we handle giving websites permission to display notifications, and hooked it up in Epiphany. This fixes notification requests appearing inappropriately on websites like the https://riot.im/app/.

Notice the pattern? When there’s something we need to do in Epiphany that requires changes in WebKit, we make it happen. This is a lot more work, but it’s better for both Epiphany and WebKit in the long run. Read more about WebKitGTK+ 2.16 on Carlos García’s blog.

Future Features

Unfortunately, a couple exciting Epiphany features we were working on did not make the cut for Epiphany 3.24. The first is Firefox Sync support. This was developed by Gabriel Ivașcu during his Google Summer of Code project last year, and it’s working fairly well, but there are still a few problems. First, our current Firefox Sync code is only able to sync bookmarks, but we really want it to sync much more before releasing the feature: history and open tabs at the least. Also, although it uses Mozilla’s sync server (please thank Mozilla for their quite liberal terms of service allowing this!), it’s not actually compatible with Firefox. You can sync your Epiphany bookmarks between different Epiphany browser instances using your Firefox account, which is great, but we expect users will be quite confused that they do not sync with your Firefox bookmarks, which are stored separately. Some things, like preferences, will never be possible to sync with Firefox, but we can surely share bookmarks. Gabriel is currently working to address these issues while participating in the Igalia Coding Experience program, and we’re hopeful that sync support will be ready for prime time in Epiphany 3.26.

Also missing is HTTPS Everywhere support. It’s mostly working properly, thanks to lots of hard work from Daniel Brendle (grindhold) who created the libhttpseverywhere library we use, but it breaks a few websites and is not really robust yet, so we need more time to get this properly integrated into Epiphany. The goal is to make sure outdated HTTPS Everywhere rulesets do not break websites by falling back automatically to use of plain, insecure HTTP when a load fails. This will be much less secure than upstream HTTPS Everywhere, but websites that care about security ought to be redirecting users to HTTPS automatically (and also enabling HSTS). Our use of HTTPS Everywhere will just be to gain a quick layer of protection against passive attackers. Otherwise, we would not be able to enable it by default, since the HTTPS Everywhere rulesets are just not reliable enough. Expect HTTPS Everywhere to land for Epiphany 3.26.

Help Out

Are you a computer programmer? Found something less-than-perfect about Epiphany? We’re open for contributions, and would really appreciate it if you would try to fix that bug or add that feature instead of slinking back to using a less-awesome web browser. One frequently-requested feature is support for extensions. This is probably not going to happen anytime soon — we’d like to support WebExtensions, but that would be a huge effort — but if there’s some extension you miss from a sadder browser, ask if we’d allow building it into Epiphany as a regular feature. Replacements for popular extensions like NoScript and Greasemonkey would certainly be welcome.

Not a computer programmer? You can still help by reporting bugs on GNOME Bugzilla. If you have a crash to report, learn how to generate a good-quality stack trace so that we can try to fix it. I’ve credited many programmers for their work on Epiphany 3.24 up above, but programming work only gets us so far if we don’t know about bugs. I want to give a shout-out here to Hussam Al-Tayeb, who regularly built the latest code over the course of the 3.24 development cycle and found lots of problems for us to fix. This release would be much less awesome if not for his testing.

OK, I’m done typing stuff now. Onwards to 3.26!

By Michael Catanzaro at March 24, 2017 01:18 AM

March 22, 2017

Release Notes for Safari Technology Preview 26

Surfin’ Safari

Safari Technology Preview Release 26 is now available for download for macOS Sierra. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. This release covers WebKit revisions 213542-213822.

WebGPU

Web API

  • Added support for history.scrollRestoration (r213590)
  • Aligned Document.elementFromPoint() with the CSSOM specification (r213646)
  • Changed the parameter to input.setCustomValidity() to not be nullable (r213606)
  • Fixed transitions and animations of background-position with right-relative and bottom-relative values (r213603)
  • Fixed an issue where WebSQL directories were not removed when removing website data (r213547)
  • Made the XMLHttpRequest method setRequestHeader() use “,” (including a space) as the separator (r213766)
  • Prevented displaying the label of an <option> element in quirks mode (r213542)
  • Prevented extra downloads of preloaded CSS (r213672)
  • Dropped support for non-standard document.all.tags() (r213619)

CSS

  • Implemented stroke-width CSS property (r213634)

Rendering

  • Enabled asynchronous image decoding for large images (r213764, r213563)
  • Fixed memory estimate for layers supporting subpixel-antialised text (r213767)
  • Fixed columns getting clipped horizontally in CSS Multicolumn (r213593)

Web Inspector

  • Added DOM breakpoints for pausing on node and subtree modifications (r213626)
  • Added XHR breakpoints for pausing on requests by URL (r213691)
  • Added a “Create Breakpoint” context menu item for linked source locations (r213617)
  • Added settings for controlling Styles sidebar intelligence (r213635)
  • Added cache source information (Memory Cache or Disk Cache) in the Network tab (r213621)
  • Added protocol, remote address, priority, and connection ID in the Network tab (r213682)
  • Added individual messages to the content pane for a WebSocket (r213666)
  • Fixed an issue where the DOM tree is broken if an element has a debounce attribute (r213565)
  • Fixed an issue in the Resources tab navigation bar allowing the same file from a contextual menu item to be saved more than once (r213738)
  • Improved the layout of the compositing reasons in the Layers sidebar popover (r213739)

WebDriver

  • Fixed an issue where automation commands hang making it impossible to navigate back or forward (r213790)

WebCrypto

  • Implemented ECDH ImportKey and ExportKey operations (r213560)

By Jon Davis at March 22, 2017 05:00 PM

March 20, 2017

Carlos García Campos: WebKitGTK+ 2.16

Igalia WebKit

The Igalia WebKit team is happy to announce WebKitGTK+ 2.16. This new release drastically improves the memory consumption, adds new API as required by applications, includes new debugging tools, and of course fixes a lot of bugs.

Memory consumption

After WebKitGTK+ 2.14 was released, several Epiphany users started to complain about high memory usage of WebKitGTK+ when Epiphany had a lot of tabs open. As we already explained in a previous post, this was because of the switch to the threaded compositor, that made hardware acceleration always enabled. To fix this, we decided to make hardware acceleration optional again, enabled only when websites require it, but still using the threaded compositor. This is by far the major improvement in the memory consumption, but not the only one. Even when in accelerated compositing mode, we managed to reduce the memory required by GL contexts when using GLX, by using OpenGL version 3.2 (core profile) if available. In mesa based drivers that means that software rasterizer fallback is never required, so the context doesn’t need to create the software rasterization part. And finally, an important bug was fixed in the JavaScript garbage collector timers that prevented the garbage collection to happen in some cases.

CSS Grid Layout

Yes, the future here and now available by default in all WebKitGTK+ based browsers and web applications. This is the result of several years of great work by the Igalia web platform team in collaboration with bloomberg. If you are interested, you have all the details in Manuel’s blog.

New API

The WebKitGTK+ API is quite complete now, but there’s always new things required by our users.

Hardware acceleration policy

Hardware acceleration is now enabled on demand again, when a website requires to use accelerated compositing, the hardware acceleration is enabled automatically. WebKitGTK+ has environment variables to change this behavior, WEBKIT_DISABLE_COMPOSITING_MODE to never enable hardware acceleration and WEBKIT_FORCE_COMPOSITING_MODE to always enabled it. However, those variables were never meant to be used by applications, but only for developers to test the different code paths. The main problem of those variables is that they apply to all web views of the application. Not all of the WebKitGTK+ applications are web browsers, so it can happen that an application knows it will never need hardware acceleration for a particular web view, like for example the evolution composer, while other applications, especially in the embedded world, always want hardware acceleration enabled and don’t want to waste time and resources with the switch between modes. For those cases a new WebKitSetting hardware-acceleration-policy has been added. We encourage everybody to use this setting instead of the environment variables when upgrading to WebKitGTk+ 2.16.

Network proxy settings

Since the switch to WebKit2, where the SoupSession is no longer available from the API, it hasn’t been possible to change the network proxy settings from the API. WebKitGTK+ has always used the default proxy resolver when creating the soup context, and that just works for most of our users. But there are some corner cases in which applications that don’t run under a GNOME environment want to provide their own proxy settings instead of using the proxy environment variables. For those cases WebKitGTK+ 2.16 includes a new UI process API to configure all proxy settings available in GProxyResolver API.

Private browsing

WebKitGTK+ has always had a WebKitSetting to enable or disable the private browsing mode, but it has never worked really well. For that reason, applications like Epiphany has always implemented their own private browsing mode just by using a different profile directory in tmp to write all persistent data. This approach has several issues, for example if the UI process crashes, the profile directory is leaked in tmp with all the personal data there. WebKitGTK+ 2.16 adds a new API that allows to create ephemeral web views which never write any persistent data to disk. It’s possible to create ephemeral web views individually, or create ephemeral web contexts where all web views associated to it will be ephemeral automatically.

Website data

WebKitWebsiteDataManager was added in 2.10 to configure the default paths on which website data should be stored for a web context. In WebKitGTK+ 2.16 the API has been expanded to include methods to retrieve and remove the website data stored on the client side. Not only persistent data like HTTP disk cache, cookies or databases, but also non-persistent data like the memory cache and session cookies. This API is already used by Epiphany to implement the new personal data dialog.

Dynamically added forms

Web browsers normally implement the remember passwords functionality by searching in the DOM tree for authentication form fields when the document loaded signal is emitted. However, some websites add the authentication form fields dynamically after the document has been loaded. In those cases web browsers couldn’t find any form fields to autocomplete. In WebKitGTk+ 2.16 the web extensions API includes a new signal to notify when new forms are added to the DOM. Applications can connect to it, instead of document-loaded to start searching for authentication form fields.

Custom print settings

The GTK+ print dialog allows the user to add a new tab embedding a custom widget, so that applications can include their own print settings UI. Evolution used to do this, but the functionality was lost with the switch to WebKit2. In WebKitGTK+ 2.16 a similar API to the GTK+ one has been added to recover that functionality in evolution.

Notification improvements

Applications can now set the initial notification permissions on the web context to avoid having to ask the user everytime. It’s also possible to get the tag identifier of a WebKitNotification.

Debugging tools

Two new debugged tools are now available in WebKitGTk+ 2.16. The memory sampler and the resource usage overlay.

Memory sampler

This tool allows to monitor the memory consumption of the WebKit processes. It can be enabled by defining the environment variable WEBKIT_SMAPLE_MEMORY. When enabled, the UI process and all web process will automatically take samples of memory usage every second. For every sample a detailed report of the memory used by the process is generated and written to a file in the temp directory.

$ WEBKIT_SAMPLE_MEMORY=1 MiniBrowser 
Started memory sampler for process MiniBrowser 32499; Sampler log file stored at: /tmp/MiniBrowser7ff2246e-406e-4798-bc83-6e525987aace
Started memory sampler for process WebKitWebProces 32512; Sampler log file stored at: /tmp/WebKitWebProces93a10a0f-84bb-4e3c-b257-44528eb8f036

The files contain a list of sample reports like this one:

Timestamp                          1490004807
Total Program Bytes                1960214528
Resident Set Bytes                 84127744
Resident Shared Bytes              68661248
Text Bytes                         4096
Library Bytes                      0
Data + Stack Bytes                 87068672
Dirty Bytes                        0
Fast Malloc In Use                 86466560
Fast Malloc Committed Memory       86466560
JavaScript Heap In Use             0
JavaScript Heap Committed Memory   49152
JavaScript Stack Bytes             2472
JavaScript JIT Bytes               8192
Total Memory In Use                86477224
Total Committed Memory             86526376
System Total Bytes                 16729788416
Available Bytes                    5788946432
Shared Bytes                       1037447168
Buffer Bytes                       844214272
Total Swap Bytes                   1996484608
Available Swap Bytes               1991532544

Resource usage overlay

The resource usage overlay is only available in Linux systems when WebKitGTK+ is built with ENABLE_DEVELOPER_MODE. It allows to show an overlay with information about resources currently in use by the web process like CPU usage, total memory consumption, JavaScript memory and JavaScript garbage collector timers information. The overlay can be shown/hidden by pressing CTRL+Shit+G.

We plan to add more information to the overlay in the future like memory cache status.

By carlos garcia campos at March 20, 2017 03:19 PM

Enrique Ocaña: Media Source Extensions upstreaming, from WPE to WebKitGTK+

Igalia WebKit

A lot of good things have happened to the Media Source Extensions support since my last post, almost a year ago.

The most important piece of news is that the code upstreaming has kept going forward at a slow, but steady pace. The amount of code Igalia had to port was pretty big. Calvaris (my favourite reviewer) and I considered that the regular review tools in WebKit bugzilla were not going to be enough for a good exhaustive review. Instead, we did a pre-review in GitHub using a pull request on my own repository. It was an interesting experience, because the change set was so large that it had to be (artificially) divided in smaller commits just to avoid reaching GitHub diff display limits.

394 GitHub comments later, the patches were mature enough to be submitted to bugzilla as child bugs of Bug 157314 – [GStreamer][MSE] Complete backend rework. After some comments more in bugzilla, they were finally committed during Web Engines Hackfest 2016:

Some unforeseen regressions in the layout tests appeared, but after a couple of commits more, all the mediasource WebKit tests were passing. There are also some other tests imported from W3C, but I kept them still skipped because webm support was needed for many of them. I’ll focus again on that set of tests at its due time.

Igalia is proud of having brought the MSE support up to date to WebKitGTK+. Eventually, this will improve the browser video experience for a lot of users using Epiphany and other web browsers based on that library. Here’s how it enables the usage of YouTube TV at 1080p@30fps on desktop Linux:

Our future roadmap includes bugfixing and webm/vp9+opus support. This support is important for users from countries enforcing patents on H.264. The current implementation can’t be included in distros such as Fedora for that reason.

As mentioned before, part of this upstreaming work happened during Web Engines Hackfest 2016. I’d like to thank our sponsors for having made this hackfest possible, as well as Metrological for giving upstreaming the importance it deserves.

Thank you for reading.

 

By eocanha at March 20, 2017 11:55 AM

March 15, 2017

Manuel Rego: CSS Grid Layout is Here to Stay

Igalia WebKit

It’s been a long journey but finally CSS Grid Layout is here! 🚀 In the past week, Chrome 57 and Firefox 52 were released, becoming the first browsers to ship CSS Grid Layout unprefixed (Explorer/Edge has been shipping an older, prefixed version of the spec since 2012). Not only that, but Safari will hopefully be shipping it very soon too.

I’m probably biased after having worked on it for a few years, but I believe CSS Grid Layout is going to be a big step in the history of the Web. Web authors have been waiting for a solution like this since the early days of the Web, and now they can use a very powerful and flexible layout module supported natively by the browser, without the need of any external frameworks.

Igalia has been playing a major role in the implementation of CSS Grid Layout in Chromium/Blink and Safari/WebKit since 2013 sponsored by Bloomberg. This is a blog post about that successful collaboration.

A blast from the past

Grids are not something new at all, since we can even find references to them in some of the initial discussions of the CSS creators. Next is an excerpt from a mail by Håkon Wium Lie in June 1995 to www-style:

Grids! Let the style sheet carve up the canvas into golden rectangles, and use an expert system to lay out the elements!! Ok, drop the expert system and define a set of simple rules that we hardcode.. whoops! But grids do look nice!

-h&kon

Since that time the Web hasn’t stopped moving and there have been different solutions and approaches to try to solve the problem of having grid-based designs in HTML/CSS.

At the beginning of the decade Microsoft started to work on what eventually become the CSS Grid Layout initial specification. This spec was based on the Internet Explorer 10 implementation and the experience gathered by Microsoft during its development. IE10 was released in 2012, shipping a prefixed version of that initial spec.

Then Google started to add support to WebKit at the end of 2011. At that time, WebKit was the engine used by both Chromium and Safari; later in 2012 it would be forked to create Blink.

Meanwhile, Mozilla had not started the Grid implementation in Firefox as they had some conflicts with their XUL grid layout type.

Igalia and Bloomberg collaboration

Bloomberg uses Chromium and they were looking forward to having a proper solution for their layout requirements. They detected performance issues due to the limitations of the current layout modules available on the Web. They see CSS Grid Layout as the right way to fix those problems and cover their needs.

Bloomberg decided to push CSS Grid Layout implementation as part of the collaboration with Igalia. My colleagues, Sergio Villar and Xan López, started to work on CSS Grid Layout around the summer of 2013. In 2014, Javi Fernández and I replaced Xan, joining the effort as well. We’ve been working on this for more than 3 years and counting.

At the beginning, we were working together with some Google folks but later Igalia took the lead role in the development of the specification. The spec has evolved and changed quite a lot since 2013, so we’ve had to deal with all these changes always trying to keep our implementations up to date, and at the same time continue to add new features. As the codebase in Blink and WebKit was still sharing quite a lot of things after the fork, we were working on both implementations at the same time.

Igalia and Bloomberg working together to build a better web Igalia and Bloomberg working together to build a better web

The results of this collaboration have been really satisfactory, as now CSS Grid Layout has shipped in Chromium and enabled by default in WebKit too (which will hopefully mean that it’ll be shipped in the upcoming Safari 10.1 release too).

Thanks @jensimmons for the feedback regarding Safari 10.1.

And now what?

Update your browsers, be sure you grab a version with Grid Layout support and start to use CSS Grid Layout, play with it, experiment and so on. We’d love to get bug reports and feedback about it. It’s too late to change the current version of the spec, but ideas for a future version are already being recorded in the CSS Working Group GitHub repository.

If you want to start with Grid Layout, there are plenty of resources available on the Internet:

It’s possible to think that now that CSS Grid Layout has shipped, it’s all over. Nothing is further from the truth as there is still a lot of work to do:

  • An important step would be to complete the W3C Test Suite. Igalia has been contributing to it and it’s currently imported into Blink and WebKit, but it doesn’t cover the whole spec yet.
  • There are some missing features in the current implementations. For example, nobody supports subgrids yet, web authors tell us that they would love to have them available. Another example, in Blink and WebKit is that we are still finishing the support for baseline alignment.
  • When bugs and issues appear they will need to be fixed and some might even imply some minor modifications to the spec.
  • Performance optimizations should be done. CSS Grid Layout is a huge spec so the biggest part effort so far has been done in the implementation. Now it’s time to improve performance of different use cases.
  • And as I explained earlier, people are starting to think about new features for a future version of the spec. Progress won’t stop now.

Acknowledgements

First of all, it’s important to highlight once again Bloomberg’s role in the development of CSS Grid Layout. Without their vision and support it probably would not be have shipped so soon.

But this is not an individual effort, but something much bigger. I’ll mention several people next, but I’m sure I’ll forget a lot of them, so please forgive me in advance.

So big thanks to:

  • The Microsoft folks who started the spec.
  • The current spec editors: Elika J. Etemad (fantasai), Rossen Atanassov, and Tab Atkins Jr. Especially fantasai & Tab, who have been dealing with most of the issues we have reported.
  • The whole CSS Working Group for their work on this spec.
  • Our reviewers in both Blink and WebKit: Christian Biesinger, Darin Adler, Julien Chaffraix, and many other.
  • Other implementors: Daniel Holbert, Mats Palmgren, etc.
  • People spreading the word about CSS Grid Layout: Jen Simmons, Rachel Andrew, etc.
  • The many other people I’m missing in this list who helped to make CSS Grid Layout the newest layout module for the Web.

Thanks to you all! 😻 And particularly to Bloomberg for letting Igalia be part of this amazing experience. We’re really happy to have walked this path together and we really hope to do more cool stuff in the future.

Translations

March 15, 2017 11:00 PM

March 09, 2017

CSS Grid Layout: A New Layout Module for the Web

Surfin’ Safari

People have been using grid designs in magazines, newspapers, posters, etc. for a long time before the Web appeared. At the point when web developers started to create web pages, many of them were based on a grid layout. Different solutions have been used to create grid layouts, like tables, floats, inline blocks, or flexboxes, but all of these techniques have different issues when you try to define a complex grid design.

In order to solve these problems, a new standard was defined to provide a good solution to create grid designs. This specification, called CSS Grid Layout, allows users to very easily create two-dimensional layouts on the Web. It has been designed specifically for this purpose and it brings very powerful features to divide the web page into different regions, granting great flexibility to web authors in order to define the sizing of the different sections and how the elements are positioned in each of them.

Grid Layout has been under development in WebKit for a while, and you can experiment with Grid Layout today using WebKit in Safari Technology Preview.

Basic Concepts

A grid is a structure made up of a series of intersecting lines. The main concepts of the CSS Grid Layout spec are:

  • The grid lines that define the grid: they can be horizontal or vertical,
    and they are numbered starting at 1.
  • The grid tracks, which are the rows (horizontal)
    or columns (vertical) defined in the grid.
  • The grid cells, the intersection of a row and a column.
  • A grid area, one or more adjacent grid cells that define a rectangle.
Grid Concepts

Grid Definition

To create a grid you just need to use a new value for the display property: grid or inline-grid. This is the same syntax as Flexbox, so you might be already used to it.

Then you will need to define the structure of your grid. For example, to define the size of the tracks you can do something like this:

display: grid;
grid-template-rows: 100px 100px;
grid-template-columns: 400px 200px 100px;

This will create a grid with two rows of 100px each, and three columns, with sizes 400px for first column, 200px for the second column, and 100px for the third column. This grid will have four vertical lines (1, 2, 3, 4) and three horizontal lines (1, 2, 3). You can also name the lines, so you can then reference them later easily. For example:

    grid-template-columns: [first] 400px [main] 200px [side] 100px [last];
Grid Definition

Regarding track sizing, you have a lot of flexibility:

  • You can define a fixed size track, setting the length or percentage.
  • You can define an intrinsic-sized track, where the size is based on the size of its content, using auto, min-content, max-content, or fit-content.
  • You can use a new unit fr to take advantage of the available space.

In addition, there are some functions that can be useful to set the track sizes:

  • minmax(), to set the minimum and maximum size of the track, so its size will end up depending on the available space and the size of the rest of the tracks.
  • repeat(), to define a fixed number of repetitions. It can also be used in automatic mode to cause the number of tracks to depend on the available space.

There is a special syntax that allows you to define the grid structure using ASCII art:

    grid-template-areas: "header  header"
                         "sidebar main  "
                         "sidebar footer";

In this example we create a 2×3 grid where the first row is the header, the rest of the first column is the sidebar, the main content is in the second row and second column, and the footer is in the last row and column.

Grid Areas

You can also define the gutter between tracks. For that, you just need to use the grid-row-gap and grid-column-gap properties.

Item Placement

The children of a grid container are called grid items. They can be positioned in the different parts of the grid using the placement properties grid-row-start, grid-row-end, grid-column-start, and grid-column-end. But in most cases you’ll be using the shorthands grid-row, grid-column, and grid-area.

Note that these properties refer to the grid lines, so if you want to put an element on the third row and the second column you can use something like this:

    grid-row: 3;
    grid-column: 2;

The item can also span several lines, so you can use the following syntax to take three rows and two columns:

    grid-row: 2 / 5;
    grid-column: 3 / span 2;
Grid Placement

Apart from that, you can also refer to named lines or areas with these properties, which is very convenient in some scenarios.

As you can imagine, when you’re using Grid Layout you can very easily break the relationship between the DOM order and the visual order. You have to be careful to keep the right order in the DOM to avoid making your content less accessible.

Lastly, there’s also the possibility to let the items place themselves into the grid. If you don’t set any placement property (or if you use auto), the items will be automatically placed on some empty cell of the grid, creating new rows (by default) or columns (if specified through grid-auto-flow property) when required.

Alignment

One of the big things that come for free when you use CSS Grid Layout is the alignment support. In Grid Layout you can align horizontally and vertically without any issues with just some simple CSS properties.

The alignment capabilities of Grid Layout operate over two different subjects: grid tracks, with regard to the grid container, and grid items in their respective grid areas. In addition, we can operate on both axes, horizontally and vertically.

The CSS properties justify-content and align-content apply to grid tracks to align them horizontally and vertically, respectively. These properties, which define what is known as Content Distribution behavior, can also be used to distribute the grid container’s available space among the tracks following different distributions: between, around, evenly, and stretch. For example, check the following grid:

    display: grid;
    grid-template-rows: 100px 100px;
    grid-template-columns: 150px 150px 150px;
    height: 500px;
    width: 650px;
    align-content: center;
    justify-content: space-evenly;
Grid Alignment Tracks

When it comes to aligning the grid items, the justify-self and align-self properties are used to align horizontally and vertically, respectively. These properties define the Self Alignment behavior of the grid items. It’s possible to define a default behavior for all the items of a grid container by using the Default Alignment properties, align-items and justify-items, which gives an incredible syntactic flexibility for defining the grid’s alignment behavior.

Grid Alignment Items

Responsive Design with Grid

As you can already imagine, all the different Grid Layout properties make it more comfortable to create responsive designs. You can take advantage of the powerful track sizing mechanisms like the fr unit, minmax(), or repeat(). Combine this with media queries to completely change the structure of your grid with just a few CSS lines.

For example:

display: grid;
grid-gap: 10px 20px;
grid-template-rows: 100px 1fr auto;
grid-template-columns: 1fr 200px;
grid-template-areas: "header  header"
                     "content aside "
                     "footer  aside ";

@media (max-width: 600px) {
    grid-gap: 0;
    grid-template-rows: auto 1fr auto auto;
    grid-template-columns: 1fr;
    grid-template-areas: "header "
                         "content"
                         "aside  "
                         "footer ";
}
Responsive grid using flexible sizes and media queries

And Much More

This is just an introductory blog post about Grid Layout, not a deep review of all the different features that it provides: that would require much more than a single post. The different examples explained in this blog post have been published online. You can start to play with them now!

If you want to learn more about CSS Grid Layout, there are a bunch of good resources out there:

On top of that, several people have been talking about Grid Layout at different conferences and events. You won’t have problems finding some of the talks published online.

Conclusion

CSS Grid Layout is here to stay. We’re looking forward to seeing this soon in shipping versions of Safari and other web browsers. This is very good news for the Web authors that have been waiting for a tool like this for years. We believe this is going to be a huge step forward for the Web.

The implementation of Grid Layout in WebKit has been performed by Igalia’s Web Platform team and sponsored by Bloomberg. If you have any comments or questions, don’t hesitate to contact any of the people working on it: Javi (@lajava77), Manuel (@regocas), or Sergio (@svillarsenin). If you want to keep track of the development you can follow bug #60731. New bug reports are very welcome, especially now that Grid Layout is hitting your browser and testing is easier than ever. Exciting times ahead!

By Manuel Rego at March 09, 2017 06:00 PM

March 08, 2017

Release Notes for Safari Technology Preview 25

Surfin’ Safari

Safari Technology Preview Release 25 is now available for download for macOS Sierra. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. This release covers WebKit revisions 212356-213542.

Resource Timing

  • Added Resource Timing as an experimental feature enabled by default (r212945)
  • Added Resource Timing support in Workers (r212449)
  • Improved gathering timing information with reliable responseEnd time (r212993)
  • Changed loads initiated by media elements to set the initiatorType to their element name (r212994)

User Timing

  • Enabled User Timing by default as an experimental feature (r212945)
  • Changed performance.measure in Workers to throw a SyntaxError if provided mark name is not found (r212806)

WebCrypto

  • Added support for AES-CFB (r212736)

Web API

  • Added a new webglcontextchanged event that is dispatched when the GraphicsContext3D notices that the active GPU has changed (r212637)
  • Changed the onbeforeunload event return value coercion to match specification behavior (r212625)
  • Exposed Symbol.toPrimitive and Symbol.valueOf on Location instances (r212378)
  • Fixed <input type=color> and <input type=range readonly> to prevent applying the readonly attribute to match specifications (r212617, r212610)
  • Fixed handling of <input>.labels when the input type changes from "text" to "hidden" to "checkbox" (r212522)
  • Prevented aggressive throttling of DOM timers until they’ve reached their maximum nesting level (r212845)

Web Inspector

  • Enabled import() for modules in Web Inspector console (r212438)
  • Changed the zoom level in the Settings tab to use localized formatting (r212578)
  • Changed Web Inspector to use the Resources tab when showing files instead of the Network tab (r212761)
  • Changed the split console to be allowed in the Elements, Resources, Debugger, and Storage tabs when Web Inspector is docked to the bottom (r212400)
  • Changed CSS variable uses that are unresolved to get marked with a warning icon (r213187)
  • Fixed the Zoom level user interface to match the the setting value (r212580)
  • Prevented dismissing popovers when dragging a Web Inspector window (r212427)
  • Improved copy and paste behavior for request headers (r212423)
  • Included additional detail in the display name of Timeline data elements (r212570)

CSS

  • Fixed centering text inside a button set to display:flex with justify-content:center (r213173)
  • Unprefixed -webkit-line-break (r213094)

Rendering

  • Prevented fixed elements from bouncing when scrolling beyond the bottom of the page (r212559)
  • Improved text wrapping consistency where text might wrap when its preferred logical width is used for sizing the containing block (r213008)

Media

  • Fixed local audio-only streams to trigger playback to begin (r212696)

Bug Fixes

  • Changed pending scripts to execute asynchronously after stylesheet loads are completed (r212614)
  • Fixed an issue where font-weight in @font-face can cause a font to be downloaded even when it’s not used (r212513)
  • Made special URLs without a host invalid (r212470)

By Jon Davis at March 08, 2017 06:00 PM

February 22, 2017

Release Notes for Safari Technology Preview 24

Surfin’ Safari

Safari Technology Preview Release 24 is now available for download for macOS Sierra. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. This release covers WebKit revisions 211256-212356.

User Timing

  • Added User Timing as an experimental feature (r211332)
  • Implemented PerformanceObserver for User Timing (r211406)
  • Added support for Performance API (performance.now(), UserTiming) in Workers (r211594)

Link Preload

  • Added <link preload> as an experimental feature (r211341)
  • Added support for speculative resource loading (r211480)
  • Prevented preloaded resources from being cleared after parsing is done (r211649)
  • Addressed memory issues related to clearing preloaded resources (r211673)

Web API

  • Changed Location object to throw a TypeError for Object.preventExtensions() (r211778)
  • Changed Pointer Lock to require keyboard focus (r211652)
  • Changed Pointer Lock events to be delivered directly to the target element (r211650)
  • Changed the HTML Form Validation popover to be dismissed when pressing the Escape key (r211653)
  • Changed the HTML Form Validation popover to respect the minimum font size setting (r212325)
  • Fixed an issue causing Fetch to fail when passing undefined as the headers (r212162)
  • Fixed the <details> element to work correctly when content is changed between closing and opening (r212027)
  • Implemented toJSON() for URL objects (r212193)
  • Improved URL specification compliance (r211636, r212279)
  • Prevented a redundant scroll to top-left corner of the page when navigating back to a URL with no fragment (r212197)
  • Made Symbols configurable when exposed on cross-origin Window or Location objects (r211772)

JavaScript

  • Implemented dynamic import operator (r211280)
  • Changed dynamic import through setTimeout() and setInterval() to correctly inherit SourceOrigin (r211314)
  • Changed scripts load priority to ‘high’ (r211334)
  • Fixed Apple Pay line validation to prevent validating line items that are “pending” (r211446)
  • Implemented ArrayBuffer.prototype.byteLength and SharedArrayBuffer.prototype.byteLength (r212196)
  • Implemented lifting template escape sequence restrictions in tagged templates (r211319)

CSS

  • Fixed elements with a backdrop-filter and a mask to correctly mask the backdrop (r211305)
  • Updated line-break:auto to match the latest version of Unicode (r212235)

Web Inspector

  • Enabled the console to evaluate dynamic module import() (r211777)
  • Added CSS color keyword entries for all “grey” and “gray” variations (r211452)
  • Added stroke-linecap property values to CSS autocompletion (r211640)
  • Added a horizontal slider for gradient editor angle value where applicable (r211318)
  • Added a limit on Async Call Stacks for asynchronous loops (r211385)
  • Added a setting to preserve network data on navigation for the Network tab (r211451)
  • Added the ability to show the current value of CSS variables in style rules (r212273)
  • Added a warning that webkitSubtle in WebCrypto is deprecated (r212261)
  • Changed docking Web Inspector to collapse the split console in the Timeline and Network tabs (r211976)
  • Fixed jumping from Search tab results to see the resource in other tabs (Resource, Debugger, Network) (r211608)
  • Fixed a Debugger sidebar panel issue that can cause it to have multiple tree selections (r212171)
  • Fixed DOM tree view collapsing when switching back to the Elements tab (r211829)
  • Removed Shift-Command-W (⇧⌘W) shortcut to a close tab (r211485)

Accessibility

  • Fixed text string range from index and length in text controls when there are newlines (r211491)

Rendering

  • Fixed column progression after enabling pagination on a right-to-left document (r211564)

Performance

  • Suspended SVG animations on hidden pages (r211612)
  • Avoided initially creating a layer backing store for elements outside the visible area (r211845)

Security

  • Changed CSS data URL resources to be treated as same origin loads when loaded through HTML <link> elements (r211926)

By Jon Davis at February 22, 2017 06:00 PM

February 15, 2017

JavaScript Debugging Improvements

Surfin’ Safari

Debugging JavaScript is a fundamental part of developing web applications. Having effective debugging tools makes you more productive by making it easier to investigate and diagnose issues when they arise. The ability to pause and step through JavaScript has always been a core feature of Web Inspector.

The JavaScript debugger hasn’t changed in a long time, but the Web and the JavaScript language have. We recently took at look at our debugger to see how we could improve the experience and make it even more useful for you.

Stepping

Web Inspector now includes extra highlights for the active statement or expression that is about to execute. For previous call frames, we highlight the expression that is currently executing.

Previously, Web Inspector would only highlight the line where the debugger was paused. However, knowing exactly where on the line the debugger was and therefore what was about to execute might not be obvious. By highlighting source text ranges of active expressions we eliminate any confusion and make stepping through code easier and faster.

For example, when stepping through the following code it is always immediately clear what is about to execute, even when we are executing a small part of a larger statement:

Debugger Stepping Highlights
Debugger Stepping Highlights
Debugger Stepping Highlights
Debugger Stepping Highlights
Debugger Stepping Highlights
Debugger Stepping Highlights
Debugger Stepping Highlights
Debugger Stepping Highlights

Highlighting expressions is also useful when looking at previous call frames in the Call Stack. Again, when selecting parent call frames it is always immediately clear where we are currently executing:

Parent Call Frame Expression Highlight
Parent Call Frame Expression Highlight
Parent Call Frame Expression Highlight
Parent Call Frame Expression Highlight

We also made improvements to the stepping behavior itself. We eliminated unnecessary pauses, added pause points that were previously missed, and generally made pausing locations more consistent between old and new syntaxes of the language. Stepping in and out of functions is also more intuitive. Combined with the new highlights stepping through complex code is now easier than ever.

Breakpoints

Web Inspector is now smarter and more forgiving about where it resolves breakpoints, making them more consistent and useful.

Previously, setting a breakpoint on an empty line or a line with a comment would create a breakpoint that would never get triggered. Now, Web Inspector installs the breakpoint on the next statement following the location where the breakpoint was set.

We also made it simpler to set a breakpoint for a function or method. Previously, you would have needed to find the first statment within the function and set a breakpoint on that line. Now, you can just set a breakpoint on the line with the function name or its opening brace and the breakpoint will trigger on the first statement in the function.

New Acceptable Breakpoint Locations
New Acceptable Breakpoint Locations
New Acceptable Breakpoint Locations

A new global breakpoint was added for Assertion failures triggered by console.assert. This breakpoint can be found beside the existing global breakpoints, such as pausing on uncaught exceptions.

Asynchronous Call Stacks

Asynchronous Call Stacks

JavaScript functions make it very convenient to evaluate code asynchronously. Callbacks, Events, Timers, and new language features such as Promises and async functions make it easier than ever to run asynchronous code.

Debugging these kinds of asynchronous chains can be complex. Web Inspector now makes it much easier to debug asynchronous code by displaying the call stacks across asynchronous boundary points. Now when your timer fires and you pause inside your callback, you can see the call stack from where the timer was scheduled, and so on if that call stack had been triggered asynchronously.

WebKit currently records asynchronous call stacks in just a few places and is actively bringing it to more features like Promises.

Web Workers

While JavaScript itself is single threaded, Web Workers allow web applications to run scripts in background threads. Web Inspector can now debug scripts in Workers just as easily as scripts in the Page.

Worker Resources in Resources Tab

When inspecting a page with Workers, Worker resources will show in the Resources tab sidebar. Each Worker becomes a top level resource like the Page, allowing you to quickly see the list of active Workers and their scripts and resources.

An execution context picker becomes available in the quick console, allowing you to choose to evaluate JavaScript in either the Page’s context or a Worker’s context. Workers have dramatically improved console logging support, so you will be able to interact with objects logged from a Worker just as you would expect.

Setting breakpoints behaves as expected. When any single context pauses, all other contexts are immediately paused. Selecting call frames inside of a particular thread allows you to step just that individual thread. Use the Continue debugger control to resume all scripts.

worker-debugging

When debugging a page with Workers, Web Inspector adds a thread name annotation next to the debugger highlights. If you have multiple Workers, or even Workers and the Page, all paused and stepping through the same script you will be able to see exactly where each thread is.

WebKit currently only supports debugging Workers. Profiling Worker scripts with Timelines will come in the future.

Code Coverage and Type Profiling

Web Inspector also supports Code Coverage and Type profiling.

Previously, Web Inspector had a single button to toggle both profilers. A new [C] button was added to toggle just Code Coverage. The [T] button now only toggles Type Profiler.

Feedback

You can try out all of these improvements in the latest Safari Technology Preview. Let us know how they work for you. Send feedback on Twitter (@webkit, @JosephPecoraro) or by filing a bug.






















By Joseph Pecoraro at February 15, 2017 07:00 PM

February 10, 2017

Carlos García Campos: Accelerated compositing in WebKitGTK+ 2.14.4

Igalia WebKit

WebKitGTK+ 2.14 release was very exciting for us, it finally introduced the threaded compositor to drastically improve the accelerated compositing performance. However, the threaded compositor imposed the accelerated compositing to be always enabled, even for non-accelerated contents. Unfortunately, this caused different kind of problems to several people, and proved that we are not ready to render everything with OpenGL yet. The most relevant problems reported were:

  • Memory usage increase: OpenGL contexts use a lot of memory, and we have the compositor in the web process, so we have at least one OpenGL context in every web process. The threaded compositor uses the coordinated graphics model, that also requires more memory than the simple mode we previously use. People who use a lot of tabs in epiphany quickly noticed that the amount of memory required was a lot more.
  • Startup and resize slowness: The threaded compositor makes everything smooth and performs quite well, except at startup or when the view is resized. At startup we need to create the OpenGL context, which is also quite slow by itself, but also need to create the compositing thread, so things are expected to be slower. Resizing the viewport is the only threaded compositor task that needs to be done synchronously, to ensure that everything is in sync, the web view in the UI process, the OpenGL viewport and the backing store surface. This means we need to wait until the threaded compositor has updated to the new size.
  • Rendering issues: some people reported rendering artifacts or even nothing rendered at all. In most of the cases they were not issues in WebKit itself, but in the graphic driver or library. It’s quite diffilcult for a general purpose web engine to support and deal with all possible GPUs, drivers and libraries. Chromium has a huge list of hardware exceptions to disable some OpenGL extensions or even hardware acceleration entirely.

Because of these issues people started to use different workarounds. Some people, and even applications like evolution, started to use WEBKIT_DISABLE_COMPOSITING_MODE environment variable, that was never meant for users, but for developers. Other people just started to build their own WebKitGTK+ with the threaded compositor disabled. We didn’t remove the build option because we anticipated some people using old hardware might have problems. However, it’s a code path that is not tested at all and will be removed for sure for 2.18.

All these issues are not really specific to the threaded compositor, but to the fact that it forced the accelerated compositing mode to be always enabled, using OpenGL unconditionally. It looked like a good idea, entering/leaving accelerated compositing mode was a source of bugs in the past, and all other WebKit ports have accelerated compositing mode forced too. Other ports use UI side compositing though, or target a very specific hardware, so the memory problems and the driver issues are not a problem for them. The imposition to force the accelerated compositing mode came from the switch to using coordinated graphics, because as I said other ports using coordinated graphics have accelerated compositing mode always enabled, so they didn’t care about the case of it being disabled.

There are a lot of long-term things we can to to improve all the issues, like moving the compositor to the UI (or a dedicated GPU) process to have a single GL context, implement tab suspension, etc. but we really wanted to fix or at least improve the situation for 2.14 users. Switching back to use accelerated compositing mode on demand is something that we could do in the stable branch and it would improve the things, at least comparable to what we had before 2.14, but with the threaded compositor. Making it happen was a matter of fixing a lot bugs, and the result is this 2.14.4 release. Of course, this will be the default in 2.16 too, where we have also added API to set a hardware acceleration policy.

We recommend all 2.14 users to upgrade to 2.14.4 and stop using the WEBKIT_DISABLE_COMPOSITING_MODE environment variable or building with the threaded compositor disabled. The new API in 2.16 will allow to set a policy for every web view, so if you still need to disable or force hardware acceleration, please use the API instead of WEBKIT_DISABLE_COMPOSITING_MODE and WEBKIT_FORCE_COMPOSITING_MODE.

We really hope this new release and the upcoming 2.16 will work much better for everybody.

By carlos garcia campos at February 10, 2017 05:18 PM

February 08, 2017

Release Notes for Safari Technology Preview 23

Surfin’ Safari

Safari Technology Preview Release 23 is now available for download for macOS Sierra. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. This release covers WebKit revisions 210845-211256.

Gamepads

  • Fixed Gamepad support for PS4 controllers (r211220)
  • Exposed more directional pads for other types of gamepads (r211231)

Pointer Lock

  • Fixed sending Pointer Lock events directly to the target element (r211235)
  • Fixed page requests to re-establish Pointer Lock without a user gesture after being released without a user gesture (r211249)

Media

  • Added client notification when the user plays media otherwise prevented from autoplaying (r211226)

Accessibility

  • Fixed Speak Selection for <iframe> elements (r211095)

Web Inspector

  • Added a way to trigger Garbage Collection (r211075)

Bug Fixes

  • Fixed Flash object placeholder painting when Safari reloads pages with Flash objects after Flash is installed (r211114)
  • Improved switching between GPUs for WebGL content in order to maximize battery life (r211244)

By Jon Davis at February 08, 2017 06:00 PM

Michael Catanzaro: An Update on WebKit Security Updates

Igalia WebKit

One year ago, I wrote a blog post about WebKit security updates that attracted a fair amount of attention at the time. For a full understanding of the situation, you really have to read the whole thing, but the most important point was that, while WebKitGTK+ — one of the two WebKit ports present in Linux distributions — was regularly releasing upstream security updates, most Linux distributions were ignoring the updates, leaving users vulnerable to various security bugs, mainly of the remote code execution variety. At the time of that blog post, only Arch Linux and Fedora were regularly releasing WebKitGTK+ updates, and Fedora had only very recently begun doing so comprehensively.

Progress report!

So how have things changed in the past year? The best way to see this is to look at the versions of WebKitGTK+ in currently-supported distributions. The latest version of WebKitGTK+ is 2.14.3, which fixes 13 known security issues present in 2.14.2. Do users of the most popular Linux operating systems have the fixes?

  • Fedora users are good. Both Fedora 24 and Fedora 25 have the latest version, 2.14.3.
  • If you use Arch, you know you always have the latest stuff.
  • Ubuntu users rejoice: 2.14.3 updates have been released to users of both Ubuntu 16.04 and 16.10. I’m very  pleased that Ubuntu has decided to take my advice and make an exception to its usual stable release update policy to ensure its users have a secure version of WebKit. I can’t give Ubuntu an A grade here because the updates tend to lag behind upstream by several months, but slow updates are much better than no updates, so this is undoubtedly a huge improvement. (Anyway, it’s hardly a bad idea to be cautious when releasing a big update with high regression potential, as is unfortunately the case with even stable WebKit updates.) But if you use the still-supported Ubuntu 14.04 or 12.04, be aware that these versions of Ubuntu cannot ever update WebKit, as it would require a switch to WebKit2, a major API change.
  • Debian does not update WebKit as a matter of policy. The latest release, Debian 8.7, is still shipping WebKitGTK+ 2.6.2. I count 184 known vulnerabilities affecting it, though that’s an overcount as we did not exclude some Mac-specific security issues from the 2015 security advisories. (Shipping ancient WebKit is not just a security problem, but a user experience problem too. Actually attempting to browse the web with WebKitGTK+ 2.6.2 is quite painful due to bugs that were fixed years ago, so please don’t try to pretend it’s “stable.”) Note that a secure version of WebKitGTK+ is available for those in the know via the backports repository, but this does no good for users who trust Debian to provide them with security updates by default without requiring difficult configuration. Debian testing users also currently have the latest 2.14.3, but you will need to switch to Debian unstable to get security updates for the foreseeable future, as testing is about to freeze.
  • For openSUSE users, only Tumbleweed has the latest version of WebKit. The current stable release, Leap 42.2, ships with WebKitGTK+ 2.12.5, which is coincidentally affected by exactly 42 known vulnerabilities. (I swear I am not making this up.) The previous stable release, Leap 42.1, originally released with WebKitGTK+ 2.8.5 and later updated to 2.10.7, but never past that. It is affected by 65 known vulnerabilities. (Note: I have to disclose that I told openSUSE I’d try to help out with that update, but never actually did. Sorry!) openSUSE has it a bit harder than other distros because it has decided to use SUSE Linux Enterprise as the source for its GCC package, meaning it’s stuck on GCC 4.8 for the foreseeable future, while WebKit requires GCC 4.9. Still, this is only a build-time requirement; it’s not as if it would be impossible to build with Clang instead, or a custom version of GCC. I would expect WebKit updates to be provided to both currently-supported Leap releases.
  • Gentoo has the latest version of WebKitGTK+, but only in testing. The latest version marked stable is 2.12.5, so this is a serious problem if you’re following Gentoo’s stable channel.
  • Mageia has been updating WebKit and released a couple security advisories for Mageia 5, but it seems to be stuck on 2.12.4, which is disappointing, especially since 2.12.5 is a fairly small update. The problem here does not seem to be lack of upstream release monitoring, but rather lack of manpower to prepare the updates, which is a typical problem for small distros.
  • The enterprise distros from Red Hat, Oracle, and SUSE do not provide any WebKit security updates. They suffer from the same problem as Ubuntu’s old LTS releases: the WebKit2 API change  makes updating impossible. See my previous blog post if you want to learn more about that. (SUSE actually does have WebKitGTK+ 2.12.5 as well, but… yeah, 42.)

So results are clearly mixed. Some distros are clearly doing well, and others are struggling, and Debian is Debian. Still, the situation on the whole seems to be much better than it was one year ago. Most importantly, Ubuntu’s decision to start updating WebKitGTK+ means the vast majority of Linux users are now receiving updates. Thanks Ubuntu!

To arrive at the above vulnerability totals, I just counted up the CVEs listed in WebKitGTK+ Security Advisories, so please do double-check my counting if you want. The upstream security advisories themselves are worth mentioning, as we have only been releasing these for two years now, and the first year was pretty rough when we lost our original security contact at Apple shortly after releasing the first advisory: you can see there were only two advisories in all of 2015, and the second one was huge as a result of that. But 2016 seems to have gone decently well. WebKitGTK+ has normally been releasing most security fixes even before Apple does, though the actual advisories and a few remaining fixes normally lag behind Apple by roughly a month or so. Big thanks to my colleagues at Igalia who handle this work.

Challenges ahead

There are still some pretty big problems remaining!

First of all, the distributions that still aren’t releasing regular WebKit updates should start doing so.

Next, we have to do something about QtWebKit, the other big WebKit port for Linux, which stopped receiving security updates in 2013 after the Qt developers decided to abandon the project. The good news is that Konstantin Tokarev has been working on a QtWebKit fork based on WebKitGTK+ 2.12, which is almost (but not quite yet) ready for use in distributions. I hope we are able to switch to use his project as the new upstream for QtWebKit in Fedora 26, and I’d encourage other distros to follow along. WebKitGTK+ 2.12 does still suffer from those 42 vulnerabilities, but this will be a big improvement nevertheless and an important stepping stone for a subsequent release based on the latest version of WebKitGTK+. (Yes, QtWebKit will be a downstream of WebKitGTK+. No, it will not use GTK+. It will work out fine!)

It’s also time to get rid of the old WebKitGTK+ 2.4 (“WebKit1”), which all distributions currently parallel-install alongside modern WebKitGTK+ (“WebKit2”). It’s very unfortunate that a large number of applications still depend on WebKitGTK+ 2.4 — I count 41 such packages in Fedora — but this old version of WebKit is affected by over 200 known vulnerabilities and really has to go sooner rather than later. We’ve agreed to remove WebKitGTK+ 2.4 and its dependencies from Fedora rawhide right after Fedora 26 is branched next month, so they will no longer be present in Fedora 27 (targeted for release in November). That’s bad for you if you use any of the affected applications, but fortunately most of the remaining unported applications are not very important or well-known; the most notable ones that are unlikely to be ported in time are GnuCash (which won’t make our deadline) and Empathy (which is ported in git master, but is not currently in a  releasable state; help wanted!). I encourage other distributions to follow our lead here in setting a deadline for removal. The alternative is to leave WebKitGTK+ 2.4 around until no more applications are using it. Distros that opt for this approach should be prepared to be stuck with it for the next 10 years or so, as the remaining applications are realistically not likely to be ported so long as zombie WebKitGTK+ 2.4 remains available.

These are surmountable problems, but they require action by downstream distributions. No doubt some distributions will be more successful than others, but hopefully many distributions will be able to fix these problems in 2017. We shall see!

By Michael Catanzaro at February 08, 2017 06:32 AM

Michael Catanzaro: On Epiphany Security Updates and Stable Branches

Igalia WebKit

One of the advantages of maintaining a web browser based on WebKit, like Epiphany, is that the vast majority of complexity is contained within WebKit. Epiphany itself doesn’t have any code for HTML parsing or rendering, multimedia playback, or JavaScript execution, or anything else that’s actually related to displaying web pages: all of the hard stuff is handled by WebKit. That means almost all of the security problems exist in WebKit’s code and not Epiphany’s code. While WebKit has been affected by over 200 CVEs in the past two years, and those issues do affect Epiphany, I believe nobody has reported a security issue in Epiphany’s code during that time. I’m sure a large part of that is simply because only the bad guys are looking, but the attack surface really is much, much smaller than that of WebKit. To my knowledge, the last time we fixed a security issue that affected a stable version of Epiphany was 2014.

Well that streak has unfortunately ended; you need to make sure to update to Epiphany 3.22.6, 3.20.7, or 3.18.11 as soon as possible (or Epiphany 3.23.5 if you’re testing our unstable series). If your distribution is not already preparing an update, insist that it do so. I’m not planning to discuss the embarrassing issue here — you can check the bug report if you’re interested — but rather on why I made new releases on three different branches. That’s quite unlike how we handle WebKitGTK+ updates! Distributions must always update to the very latest version of WebKitGTK+, as it is not practical to backport dozens of WebKit security fixes to older versions of WebKit. This is rarely a problem, because WebKitGTK+ has a strict policy to dictate when it’s acceptable to require new versions of runtime dependencies, designed to ensure roughly three years of WebKit updates without the need to upgrade any of its dependencies. But new major versions of Epiphany are usually incompatible with older releases of system libraries like GTK+, so it’s not practical or expected for distributions to update to new major versions.

My current working policy is to support three stable branches at once: the latest stable release (currently Epiphany 3.22), the previous stable release (currently Epiphany 3.20), and an LTS branch defined by whatever’s currently in Ubuntu LTS and elementary OS (currently Epiphany 3.18). It was nice of elementary OS to make Epiphany its default web browser, and I would hardly want to make it difficult for its users to receive updates.

Three branches can be annoying at times, and it’s a lot more than is typical for a GNOME application, but a web browser is not a typical application. For better or for worse, the majority of our users are going to be stuck on Epiphany 3.18 for a long time, and it would be a shame to leave them completely without updates. That said, the 3.18 and 3.20 branches are very stable and only getting bugfixes and occasional releases for the most serious issues. In contrast, I try to backport all significant bugfixes to the 3.22 branch and do a new release every month or thereabouts.

So that’s why I just released another update for Epiphany 3.18, which was originally released in September 2015. Compare this to the long-term support policies of Chrome (which supports only the latest version of the browser, and only for six weeks) or Firefox (which provides nine months of support for an ESR release), and I think we compare quite favorably. (A stable WebKit series like 2.14 is only supported for six months, but that’s comparable to Firefox.) Not bad?

By Michael Catanzaro at February 08, 2017 05:56 AM

February 07, 2017

Next-generation 3D Graphics on the Web

Surfin’ Safari

Apple’s WebKit team today proposed a new Community Group at the W3C to discuss the future of 3D graphics on the Web, and to develop a standard API that exposes modern GPU features including low-level graphics and general purpose computation. W3C Community Groups allow all to freely participate, and we invite browser engineers, GPU hardware vendors, software developers and the Web community to join us.

To kick off the discussion, we’re sharing an API proposal, and a prototype of that API for the WebKit Open Source project. We hope this is a useful starting point, and look forward to seeing the API evolve as discussions proceed in the Community Group.

UPDATE: There is now a prototype implementation and demos of WebGPU.

Let’s cover the details of how we got to this point, and how this new group relates to existing Web graphics APIs such as WebGL.

First, a Little History

There was a time where the standards-based technologies for the Web produced pages with static content, and the only graphics were embedded images. Before long, the Web started adding more features that developers could access via JavaScript. Eventually, there was enough demand for a fully programmable graphics API, so that scripts could create images on the fly. Thus the canvas element and its associated 2D rendering API were born inside WebKit, quickly spread to other browser engines, and standardized soon afterward.

Over time, the type of applications and content that people were developing for the Web became more ambitious, and began running into limitations of the platform. One example is gaming, where performance and visual quality are essential. There was demand for games in browsers, but most games were using APIs that provided 3D graphics using the power of Graphics Processing Units (GPUs). Mozilla and Opera showed some experiments that exposed a 3D rendering context from the canvas element, and they were so compelling that the community decided to gather to standardize something that everyone could implement.

All the browser engines collaborated to create WebGL, the standard for rendering 3D graphics on the Web. It was based on OpenGL ES, a cross-platform API for graphics targeted at embedded systems. This was the right starting place, because it made it possible to implement the same API in all browsers easily, especially since most browser engines were running on systems that had support for OpenGL. And even when the system didn’t directly support OpenGL, the API sat at a high enough level of abstraction for projects like ANGLE to emulate it on top of other technologies. As OpenGL evolved, WebGL could follow.

WebGL has unleashed the power of graphics processors to developers on an open platform, and all major browsers support WebGL 1, allowing console-quality games to be built for the Web, and communities like three.js to flourish. Since then, the standard has evolved to WebGL 2 and, again, all major browser engines, including WebKit, are committed to supporting it.

What’s Next?

Meanwhile, GPU technology has improved and new software APIs have been created to better reflect the designs of modern GPUs. These new APIs exist at a lower level of abstraction and, due to their reduced overhead, generally offer better performance than OpenGL. The major platform technologies in this space are Direct3D 12 from Microsoft, Metal from Apple, and Vulkan from the Khronos Group. While these technologies have similar design concepts, unfortunately none are available across all platforms.

So what does this mean for the Web? These new technologies are clearly the next evolutionary step for content that can benefit from the power of the GPU. The success of the web platform requires defining a common standard that allows for multiple implementations, but here we have several graphics APIs that have nuanced architectural differences. In order to expose a modern, low-level technology that can accelerate graphics and computation, we need to design an API that can be implemented on top of many systems, including those mentioned above. With a broader landscape of graphics technologies, following one specific API like OpenGL is no longer possible.

Instead we need to evaluate and design a new web standard that provides a core set of required features, an API that can be implemented on a mix of platforms with different system graphics technologies, and the security and safety required to be exposed to the Web.

We also need to consider how GPUs can be used outside of the context of graphics and how the new standard can work in concert with other web technologies. The standard should expose the general-purpose computational functionality of modern GPUs. Its design should fit with established patterns of the Web, to make it easy for developers to adopt the technology. It needs to be able to work well with other critical emerging web standards like WebAssembly and WebVR. And most importantly, the standard should be developed in the open, allowing both industry experts and the broader web community to participate.

The W3C provides the Community Group platform for exactly this situation. The “GPU for the Web” Community Group is now open for membership.

WebKit’s Initial API Proposal

We anticipated the situation of next-generation graphics APIs a few years ago and started prototyping in WebKit, to validate that we could expose a very low-level GPU API to the Web, and still get worthwhile performance improvements. Our results were very encouraging, so we are sharing the prototype with the W3C Community Group. We will also start landing code in WebKit soon, so that you can try it out for yourself. We don’t expect this to become the actual API that ends up in the standard, and maybe not even the one that the Community Group decides to start with, but we think there is a lot of value in working code. Other browser engines have made their own similar prototypes. It will be exciting to collaborate with the community and come up with a great new technology for graphics.

Let’s take a look at our experiment in detail, which we call “WebGPU”.

Getting a Rendering Context and Rendering Pipeline

The interface to WebGPU is, as expected, via the canvas element.

let canvas = document.querySelector("canvas");
let gpu = canvas.getContext("webgpu"); 

WebGPU is much more object-oriented than WebGL. In fact, that is where some of the efficiencies come from. Rather than setting up state before each draw operation, WebGPU allows you to create and store objects that represent state, along with objects that can process a set of commands. This way we can do some validation up front as the states are created, reducing the work we need to perform during a drawing operation.

A WebGPU context exposes graphics commands and parallel compute commands. Let’s just assume we want to draw something, so we’ll be using a graphics pipeline. The most important elements in the pipeline are the shaders, which are programs that run on the GPU to process the geometric data and provide a color for each drawn pixel. Shaders are typically written in a language that is specialized for graphics.

Deciding on a shading language in a Web API is interesting because there are many factors to consider. We need a language that is powerful, allows programs to be easily created, can be serialized into a format that is efficient for transfer, and can be validated by the browser to make sure the shader is safe. Parts of the industry are moving to shader representations that can be generated from many source formats, sort of like an assembly language. Meanwhile, the Web has thrived on the “View Source” approach, where human readable code is valuable. We expect the discussions around the shading language to be one of the most fun parts of the standardization process, and look forward to hearing community opinions.

For our WebGPU prototype, we decided to defer the issue and just accept an existing language for now. Since we were building on Apple platforms we picked the Metal Shading Language. How do we load our shaders into WebGPU?

let library = gpu.createLibrary( /* source code */ );

let vertexFunction = library.functionWithName("vertex_main");
let fragmentFunction = library.functionWithName("fragment_main");

We ask the gpu object to load and compile the shader from source code, producing a WebGPULibrary. The shader code itself isn’t that important—imagine a very simple vertex and fragment combination. A library can hold multiple shader functions, so we extract the functions we want to use in this pipeline by name.

Now we can create our pipeline.

// The details of the pipeline.
let pipelineDescriptor = new WebGPURenderPipelineDescriptor();
pipelineDescriptor.vertexFunction = vertexFunction;
pipelineDescriptor.fragmentFunction = fragmentFunction;
pipelineDescriptor.colorAttachments[0].pixelFormat = "BGRA8Unorm";

let pipelineState = gpu.createRenderPipelineState(pipelineDescriptor);

We get a new WebGPURenderPipelineState object from the context by passing in the description of what we need. In this case we say which vertex and fragment shaders we’ll use, as well as the type of image data we want.

Buffers

In order to draw something you need to provide data to the rendering pipeline using a buffer. WebGPUBuffer is the object that can hold such data, such as geometry coordinates, colors and normal vectors.

let vertexData = new Float32Array([ /* some data */ ]);
let vertexBuffer = gpu.createBuffer(vertexData);

In this case we have data for each vertex we want to draw in our geometry inside a Float32Array, and then create a WebGPUBuffer from that data. We’ll use this buffer later when we issue a draw operation.

Vertex data such as this rarely changes, but there are data that change nearly every time a draw happens. These are called uniforms. A common example of a uniform is the current transformation matrix representing a camera position. WebGPUBuffers are used for uniforms too, but in this case we want to write into the buffer after we’ve created it.

// Imagine "buffer" is a WebGPUBuffer that was allocated earlier.
// buffer.contents exposes an ArrayBufferView, that we then interpret
// as an array of 32-bit floating point numbers.
let uniforms = new Float32Array(buffer.contents);

// Set the uniform of interest.
uniforms[42] = Math.PI;

One of the nice things about this is that a JavaScript developer can wrap the ArrayBufferView with a class or Proxy object with custom getters and setters, so that the external interface looks like typical JavasScript objects. The wrapper object then updates the right ranges within the underlying Array that the buffer is using.

Drawing

Before we can tell the WebGPU context to draw something, we need to set up some state. This includes the destination of the rendering (a WebGPUTexture that will eventually be shown in the canvas ), and a description of how that texture is initialized and used. That state is stored in a WebGPURenderPassDescriptor.

// Ask the context for the texture it expects the next
// frame to be drawn into.
let drawable = gpu.nextDrawable();

let passDescriptor = new WebGPURenderPassDescriptor();
passDescriptor.colorAttachments[0].loadAction = "clear";
passDescriptor.colorAttachments[0].storeAction = "store";
passDescriptor.colorAttachments[0].clearColor = [0.8, 0.8, 0.8, 1.0];
passDescriptor.colorAttachments[0].texture = drawable.texture;

First we ask the WebGPU context for an object that represents the next frame that we can draw into. This is what is ultimately copied into the canvas element. After we’ve finished our drawing code, we tell WebGPU that we’re done with the drawable object so it can display the results and prepare the next frame.

The WebGPURenderPassDescriptor is initialized indicating that we won’t be reading from this texture in a draw operation (the loadAction is clear), that we will use the texture after the draw (storeAction is store), and the color it should fill the texture with.

Next, we create the objects we’ll need to hold the actual draw operations. A WebGPUCommandQueue has a set of WebGPUCommandBuffers. We push operations into a WebGPUCommandBuffer using a WebGPUCommandEncoder.

let commandQueue = gpu.createCommandQueue();
let commandBuffer = commandQueue.createCommandBuffer();

// Use the descriptor we created above.
let commandEncoder = commandBuffer.createRenderCommandEncoderWithDescriptor(
                        passDescriptor);

// Tell the encoder which state to use (i.e. shaders).
commandEncoder.setRenderPipelineState(pipelineState);

// And, lastly, the encoder needs to know which buffer
// to use for the geometry.
commandEncoder.setVertexBuffer(vertexBuffer, 0, 0);

At this point we have set up a rendering pipeline with shaders, a buffer holding the geometry, a queue that we’ll submit draw operations to, and an encoder that can submit to the queue. Now we just push the actual command to draw into the encoder.

// We know our buffer has three vertices. We want to draw them
// with filled triangles.
commandEncoder.drawPrimitives("triangle", 0, 3);
commandEncoder.endEncoding();

// All drawing commands have been submitted. Tell WebGPU to
// show/present the results in the canvas once the queue has
// been processed.
commandBuffer.presentDrawable(drawable);
commandBuffer.commit();

Like most 3D graphics sample code, it feels like a lot of work in order to draw a simple shape. But it’s not a waste. An advantage of these modern APIs is that much of that code is creating objects that can be reused to draw other things. For example, often content will only need a single WebGPUCommandQueue instance, or can create multiple WebGPURenderPipelineState objects up-front for different shaders. And again, the browser can do a lot of early validation to reduce the overhead during the drawing operations.

Hopefully this gave you a taste of the WebGPU proposal. Even though the final API produced by the W3C Community Group may be very different, we expect a lot of the general design principles to be common.

An Open Invitation

Apple’s WebKit team has proposed establishing a W3C Community Group for GPU on the Web to be the forum for this work, and today you are invited to join us in defining the next standard for GPUs. Our proposal has been received positively by our colleagues at other browser engines, GPU vendors, and framework developers. With support from the industry, we invite all with an interest or expertise in this area to join the Community Group.

By Dean Jackson at February 07, 2017 08:00 PM

February 02, 2017

New Interaction Behaviors in iOS 10

Surfin’ Safari

Last year we published a blog post about getting more responsive tapping on iOS. With the release of iOS 10, we’ve made some minor adjustments to the behavior of our fast tapping, and an important change to a very common user interaction: pinch zooming.

Fast Tapping

A common complaint on iOS 9 and earlier was that events triggered by a user tapping the screen were slightly delayed. This was because the browser was waiting to see if the gesture was a double-tap, indicating the user wanted to zoom. Once a small delay had expired without seeing a second tap, the browser would know it was a single tap and dispatch the event. This made some pages that were designed for an instant reaction to tapping feel slightly slow.

As we described in our introductory post, iOS 10 detects situations when a page can support faster taps and dispatches the events instantly, making Web sites feel much more responsive. The feedback has been very positive. However, before iOS 10 shipped we made a few tweaks to the method described in the original article. Here are the current details.

Enabling fast tapping on iOS 10 requires pages to have the following:

  1. There must be a meta tag of type viewport
  2. The viewport must be defined to have width=device-width
  3. The content must be at a scale of 1, which means both:
    a. the user has not manually zoomed off a scale of 1 (e.g. they can have zoomed, but they must have returned to the original scale)
    b. the page content wasn’t so wide that the browser was forced to shrink it to fit

Note: Explaining 3.b, WebKit often sees pages that define width=device-width but then explicitly layout content at very large widths, often greater than 1000px. These sites are usually designed for a large screen, and have added a viewport tag in the hope it makes it a mobile-friendly design. Unfortunately this is not the case. If the browser respected a misleading viewport rule such as this, the user would only see the top left corner of the content—clearly a bad experience. Instead WebKit looks for this scenario and adjusts the zoom factor appropriately. Conceptually, this behaviour is the same as the browser loading the page, then the user pinch zooming out far enough to see all the content, which means the page is no longer at a scale of 1.

Zooming Everywhere

Safari on iOS 10 allows the user to pinch zoom on every page. As a developer, you should be aware of this, and make sure your content works well when zoomed.

What changed? Prior to iOS 10, Safari allowed the content to block the user from zooming on a page by setting user-scalable=no in the viewport, or appropriate min-scale and max-scale values. This unfortunately enabled pages to pick a text size that was unreadable while giving the user no way to zoom. Also, there is now such a wide range of devices with different display dimensions, screen resolutions, pixel densities… it is very difficult to choose an appropriate text size in a design.

Now, we ignore the user-scalable, min-scale and max-scale settings. If you have content that disabled zoom, please test it on iOS 10, and understand that many users will be zooming now.

As users, we’ve all come across content that is too small to comfortably read. We know that a huge number of people appreciate this zooming improvement, even though it might mean some sites that attempt to block zooming are broken until they update.

Zooming in WKWebView Content

You might have an app that mixes a WKWebView with native content, and having the user being able to scale only that content may be inappropriate. In these cases, you can prevent the user from zooming by setting a new property on WKWebViewConfiguration:

There is new API on WKWebViewConfiguration:

var ignoresViewportScaleLimits: Bool

The default value is false, which means that WKWebView content will allow the content to block zooming. This preserves behavior with older versions of iOS.

Meanwhile, Safari and SafariViewController set the value to true. If your app uses a WKWebView in a similar manner, such as showing a large amount of text, we encourage you to change the value to true too.

For feedback, email web-evangelist@apple.com or tweet to @webkit.

By Dean Jackson at February 02, 2017 05:18 PM

January 27, 2017

Enhanced Editing with Input Events

Surfin’ Safari

Today, the easiest way to create a rich text editor on the web is to add the contenteditable attribute to an element. This allows users to insert, delete and style web content and works great for many uses of editing on the web. However, some web-based rich text editors, such as iCloud Pages or Google Docs, employ JavaScript-based implementations of rich text editing by capturing key events using a hidden contenteditable element and then using the information in these captured events to update the DOM. This gives more control over the editing experience across browsers and platforms.

However, such an approach comes with a weakness — capturing key events only covers a subset of text editing actions. Examples of this include the bold/italic/underline buttons on iOS, the context menu on macOS, and the editing controls shown in the Touch Bar in Safari. While some of these editing actions dispatch input events, these input events do not convey any notion of what the user is trying to accomplish — they only indicate that some editable content has been altered, which is not enough information for a JavaScript-based editor to respond appropriately.

Furthermore, you may need not only to know when a user has performed some editing action, but also to replace the default behavior resulting from this editing action with custom behavior. For instance, you could imagine such functionality being useful for an editable area that only inserts pasted or dropped content as plaintext rather than HTML. Existing input events do not suffice for this purpose, since they are dispatched after the editing action has been carried out, and are therefore non-preventable. Let’s see how input events can address these issues.

Revisiting Input Events

The latest Input Events specification introduces beforeinput events, which are dispatched before any change resulting from the editing action has taken place. These events are cancelable by calling preventDefault() on the event, which also prevents the subsequent input event from being dispatched. Additionally, each input and beforeinput event now contains information relevant to the editing action being performed. Here is an overview of the attributes added to input events:

  • InputEvent.inputType describes the type of editing action being performed. A full list of input types are enumerated in the official spec, linked above. The names of input type are also share prefixes — for instance, all input types that cause text to be inserted begin with the string "insert". Some examples of input types are insertReplacementText, deleteByCut, and formatBold.
  • InputEvent.data contains plaintext data to be inserted in the case of insert* input types, and style information in the case of format* input types. However, if the content being inserted contains rich text, this attribute will be null, and the dataTransfer attribute will be used instead.
  • InputEvent.dataTransfer contains both rich and plain text data to be inserted in a contenteditable area. The rich text data is retrieved as an HTML string using dataTransfer.getData("text/html"), while the plain text representation is retrieved using dataTransfer.getData("text/plain").
  • InputEvent.getTargetRanges is a method that returns a list of ranges that will be affected by editing. For example, when spellchecking or autocorrect replaces typed text with replacement text, the target ranges of the beforeinput event indicate the existing ranges of text that are about to be replaced. It is important to note that each range in this list is a type of StaticRange, as opposed to a normal Range; while a StaticRange is similar to a normal Range in that it has start and end containers and start and end offsets, it does not automatically update as the DOM is modified.

Let’s see how this all comes together in a simple example.

Formatting-only Regions Example

Suppose we’re creating a simple editable area where a user can compose a response to an email or comment. Let’s say we want to restrict editing within certain parts of the message that represent quotes from an earlier response — while we allow the user to change the style of text within a quote, we will not allow the user to edit the text content of the quote. Consider the HTML below:

HTML

<body onload="setup()">
    <div id="editor" contenteditable>
        <p>This is some regular content.</p>
        <p>This text is fully editable.</p>
        <div class="quote" style="background-color: #EFFEFE;">
            <p>This is some quoted content.</p>
            <p>You can only change the format of this text.</p>
        </div>
        <p>This is some more regular content.</p>
        <p>This text is also fully editable.</p>
    </div>
</body>

This gives us the basic ability to edit the contents of our message, which contains a quoted region highlighted in blue. Our goal is to prevent the user from performing editing actions that modify the text content of this quoted region. To accomplish this, we first attach a beforeinput event handler to our editable element. In this handler, we call event.preventDefault() if the input event is not a formatting change (i.e. its inputType does not begin with 'format') and it might modify the contents of the quoted region, which we can tell by inspecting the target ranges of the event. If any of the affected ranges starts or ends within the quoted region, we immediately prevent editing and bail from the handler.

JavaScript

function setup() {
    editor.addEventListener("beforeinput", event => {
        if (event.inputType.match(/^format/))
            return;

        for (let staticRange of event.getTargetRanges()) {
            if (nodeIsInsideQuote(staticRange.startContainer)
                || nodeIsInsideQuote(staticRange.endContainer)) {
                event.preventDefault();
                return;
            }
        }
    });

    function nodeIsInsideQuote(node) {
        let currentElement = node.nodeType == Node.ELEMENT_NODE ? node : node.parentElement;
        while (currentElement) {
            if (currentElement.classList.contains("quote"))
                return true;
            currentElement = currentElement.parentElement;
        }
        return false;
    }
}

After adding the script, attempts to insert or delete text from the quoted region no longer result in any changes, but the format of the text can still be changed. For instance, users can bold text by right clicking selected text in the quote and then choosing Font &rtrif Bold, or by tapping the Bold button in the Touch Bar in Safari. You can check out the final result in an Input Events demo.

Additional Work

Input events are crucial if you want to build a great text editor, but they don’t yet solve every problem. We believe they could be enhanced to give web developers control over more native editing behaviors on macOS and iOS. For instance, it would be useful for an editable element to specify the set of input types that it supports, so that (1) input events of an unsupported input type are not dispatched on the element, and (2) the browser will not show enabled editing UI that would dispatch only unsupported input types.

Another capability is for web pages to provide a custom handler that WebKit can use to determine the style of the current selection. This is particularly useful in the context of the bold/italic/underline controls on both the iOS keyboard and the Touch Bar — these buttons are highlighted if the current selection is already bold, italic or underlined, indicating to the user that interacting with these controls will undo bold, italic or underlined style. If a web page prevents default behavior and renders these text styles via custom means, it would need to inform WebKit of current text style to ensure that platform controls remain in sync with the content.

Input events are enabled by default as of Safari Technology Preview 18, and available in Safari 10.1 in the recent beta releases of macOS 10.12.4 and iOS 10.3. Please give our example a try and experiment with the feature! If you have any questions or comments, please contact me at wenson_hsieh@apple.com, or Jonathan Davis, Apple’s Web Technologies Evangelist, at @jonathandavis or web-evangelist@apple.com.

By Wenson Hsieh at January 27, 2017 09:00 PM

January 25, 2017

Release Notes for Safari Technology Preview 22

Surfin’ Safari

Safari Technology Preview Release 22 is now available for download for macOS Sierra. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. This release covers WebKit revisions 210274-210845.

JavaScript

  • Fixed an error when calling an async arrow function which is in a class’s member function (r210558)
  • Improved the speed of Array.prototype.slice in DFG/FTL JITs (r210695)

CSS

  • Implemented scroll-snap-type:proximity scroll snapping (r210560)
  • Fixed updating :active and :hover states across Shadow DOM slots (r210564)
  • Fixed a CSS Grid issue with very big values for grid lines (r210320)
  • Implemented baseline positioning for grid containers (r210792)
  • Made the CSS Grid sizing data persistent through layouts (r210669)
  • Fixed overflow:scroll scroll position getting restored on back navigation (r210329)

Form Validation

  • Fixed the validation message to use singular form of “character” when maxLength value is 1 (r210447)
  • Truncated lengthy validation messages with an ellipsis (r210425)
  • Aligned email validation with the latest HTML specification (r210361)

Web Inspector

  • Added “Persist Logs on Navigation” to Settings tab (r210793)
  • Added UI zoom level to the Settings tab (r210788)
  • Added Command-, (⌘,) keyboard shortcut to open Settings tab (r210772)
  • Fixed showing application cache details in the Storage tab (r210311)
  • Improved the cubic-bezier editor for invalid inputs in component fields (r210674)
  • Fixed an issue clearing pseudo classes toggled on in the Styles sidebar when Web Inspector is closed (r210316)
  • Fixed resources disappearing from the network tab when an iframe gets removed (r210759)
  • Fixed restoring Settings tab when reopening Web Inspector (r210764)
  • Improved the layout of the spring function editor with left-aligned labels and slider tracks (r210618)

Web API

  • Provided more detailed role descriptions for many new HTML5 input types (r210295)
  • Aligned the innerText setter with the HTML specification (r210767)
  • Fixed an issue changing the modified timestamp for a given gamepad when it is updated (r210827)
  • Changed pointer lock to release when the page state is reset for any reason, not just when the process exits (r210281)
  • Fixed editing of nested RTL-to-LTR content (r210831)
  • Support iterating over URLSearchParams objects (r210593)
  • Changed the first parameter of Event.initEvent() to be mandatory (r210559)

Media

  • Added support for MediaKeys.generateRequest() (r210555)
  • Added protection against the MediaPlayer being destroyed in the middle of a load() (r210747)

Rendering

  • Fixed an issue that caused the highlighting of text using the Yoon Gothic webfont to reflow (r210456)
  • Fixed reordering text inside a blockquote when un-indenting the text (r210524)

Security

  • Volume-separated file URLs: disallowed a file URL on one volume from loading a file on another volume in macOS 10.12.4 or later (r210571)
Note Safari WebDriver is broken in this release. We expect this will be fixed in release 23.

By Jon Davis at January 25, 2017 06:00 PM

January 20, 2017

Introducing Riptide: WebKit’s Retreating Wavefront Concurrent Garbage Collector

Surfin’ Safari

As of r209827, 64-bit ARM and x86 WebKit ports use a new garbage collector called Riptide. Riptide reduces worst-case pause times by allowing the app to run concurrently to the collector. This can make a big difference for responsiveness since garbage collection can easily take 10 ms or more, even on fast hardware. Riptide improves WebKit’s performance on the JetStream/splay-latency test by 5x, which leads to a 5% improvement on JetStream. Riptide also improves our Octane performance. We hope that Riptide will help to reduce the severity of GC pauses for many different kinds of applications.

This post begins with a brief background about concurrent GC (garbage collection). Then it describes the Riptide algorithm in detail, including the mature WebKit GC foundation, on which it is built. The field of incremental and concurrent GC goes back a long time and WebKit is not the first system to use it, so this post has a section about how Riptide fits into the related work. This post concludes with performance data.

Introduction

Garbage collection is expensive. In the worst case, for the collector to free a single object, it needs to scan the entire heap to ensure that no objects have any references to the one it wants to free. Traditional collectors scan the entire heap periodically, and this is roughly how WebKit’s collector has worked since the beginning.

The problem with this approach is that the GC pause can be long enough to cause rendering loops to miss frames, or in some cases it can even take so long as to manifest as a spin. This is a well-understood computer science problem. The originally proposed solution for janky GC pauses, by Guy Steele in 1975, was to have one CPU run the app and another CPU run the collector. This involves gnarly race conditions that Steele solved with a bunch of locks. Later algorithms like Baker’s were incremental: they assumed that there was one CPU, and sometimes the application would call into the collector but only for bounded increments of work. Since then, a huge variety of incremental and concurrent techniques have been explored. Incremental collectors avoid some synchronization overhead, but concurrent collectors scale better. Modern concurrent collectors like DLG (short for Doligez, Leroy, Gonthier, published in POPL ’93 and ’94) have very cheap synchronization and almost completely avoid pausing the application. Taking garbage collection off-core rather than merely shortening the pauses is the direction we want to take in WebKit, since almost all of the devices WebKit runs on have more than one core.

The goal of WebKit’s new Riptide concurrent GC is to achieve a big reduction in GC pauses by running most of the collector off the main thread. Because Riptide will be our always-on default GC, we also want it to be as efficient — in terms of speed and memory — as our previous collector.

The Riptide Algorithm

The Riptide collector combines:

  • Marking: The collector marks objects as it finds references to them. Objects not marked are deleted. Most of the collector’s time is spent visiting objects to find references to other objects.
  • Constraints: The collector allows the runtime to supply additional constraints on when objects should be marked, to support custom object lifetime rules.
  • Parallelism: Marking is parallelized on up to eight logical CPUs. (We limit to eight because we have not optimized it for more CPUs.)
  • Generations: The collector lets the mark state of objects stick if memory is plentiful, allowing the next collection to skip visiting those objects. Sticky mark bits are a common way of implementing generational collection without copying. Collection cycles that let mark bits stick are called eden collections in WebKit.
  • Concurrency: Most of the collector’s marking phase runs concurrently to the program. Because this is by far the longest part of collection, the remaining pauses tend to be 1 ms or less. Riptide’s concurrency features kick in for both eden and full collections.
  • Conservatism: The collector scans the stack and registers conservatively, that is, checking each word to see if it is in the bounds of some object and then marking it if it is. This means that all of the C++, assembly, and just-in-time (JIT) compiler-generated code in our system can store heap pointers in local variables without any hassles.
  • Efficiency: This is our always-on garbage collector. It has to be fast.

This section describes how the collector works. The first part of the algorithm description focuses on the WebKit mark-sweep algorithm on which Riptide is based. Then we dive into concurrency and how Riptide manages to walk the heap while the heap is in flux.

Efficient Mark-Sweep

Riptide retains most of the basic architecture of WebKit’s mature garbage collection code. This section gives an overview of how our mark-sweep collector works: WebKit uses a simple segregated storage heap structure. The DOM, the Objective-C API, the type inference runtime, and the compilers all introduce custom marking constraints, which the GC executes to fixpoint. Marking is done in parallel to maximize throughput. Generational collection is important, so WebKit implements it using sticky mark bits. The collector uses conservative stack scanning to ease integration with the rest of WebKit.

Simple Segregated Storage

WebKit has long used the simple segregated storage heap structure for small and medium-sized objects (up to about 8KB):

  • Small and medium-sized objects are allocated from segregated free lists. Given a desired object size, we perform a table lookup to find the appropriate free list and then pop the first object from this list. The lookup table is usually constant-folded by the compiler.
  • Memory is divided into 16KB blocks. Each block contains cells. All cells in a block have the same cell size, called the block’s size class. In WebKit jargon, an object is a cell whose JavaScript type is “object”. For example, a string is a cell but not an object. The GC literature would typically use object to refer to what our code would call a cell. Since this post is not really concerned with JavaScript types, we’ll use the term object to mean any cell in our heap.
  • At any time, the active free list for a size class contains only objects from a single block. When we run out of objects in a free list, we find the next block in that size class and sweep it to give it a free list.

Sweeping is incremental in the sense that we only sweep a block just before allocating in it. In WebKit, we optimize sweeping further with a hybrid bump-pointer/free-list allocator we call bump’n’pop (here it is in C++ and in the compilers). A per-block bit tells the sweeper if the block is completely empty. If it is, the sweeper will set up a bump-pointer arena over the whole block rather than constructing a free-list. Bump-pointer arenas can be set up in O(1) time while building a free-list is a O(n) operation. Bump’n’pop achieves a big speed-up on programs that allocate a lot because it avoids the sweep for totally-empty blocks. Bump’n’pop’s bump-allocator always bumps by the block’s cell size to make it look like the objects had been allocated from the free list. This preserves the block’s membership in its size class.

Large objects (larger than about 8KB) are allocated using malloc.

Constraint-Based Marking

Garbage collection is ordinarily a graph search problem and the heap is ordinarily just a graph: the roots are the local variables, their values are directional edges that point to objects, and those objects have fields that each create edges to some other objects. WebKit’s garbage collector also allows the DOM, compiler, and type inference system to install constraint callbacks. These constraints are allowed to query which objects are marked and they are allowed to mark objects. The WebKit GC algorithm executes these constraints to fixpoint. GC termination happens when all marked objects have been visited and none of the constraints want to mark anymore objects. In practice, the constraint-solving part of the fixpoint takes up a tiny fraction of the total time. Most of the time in GC is spent performing a depth-first search over marked objects that we call draining.

Parallel Draining

Draining takes up most of the collector’s time. One of our oldest collector optimizations is that draining is parallelized. The collector has a draining thread on each CPU. Each draining thread has its own worklist of objects to visit, and ordinarily it runs a graph search algorithm that only sees this worklist. Using a local worklist means avoiding worklist synchronization most of the time. Each draining thread will check in with a global worklist under these conditions:

  • It runs out of work. When a thread runs out of work, it will try to steal 1/Nth of the global worklist where N is the number of idle draining threads. This means acquiring the global worklist’s lock.
  • Every 100 objects visited, the draining thread will consider donating about half of its worklist to the global worklist. It will only do this if the global worklist is empty, the global worklist lock can be acquired without blocking, and the local worklist has at least two entries.

This algorithm appears to scale nicely to about eight cores, which is good enough for the kinds of systems that WebKit usually runs on.

Draining in parallel means having to synchronize marking. Our marking algorithm uses a lock-free CAS (atomic compare-and-swap instruction) loop to set mark bits.

Sticky Mark Bits

Generational garbage collection is a classic throughput optimization first introduced by Lieberman and Hewitt and Ungar. It assumes that objects that are allocated recently are unlikely to survive. Therefore, focusing the collector on objects that were allocated since the last GC is likely to free up lots of memory — almost as much as if we collected the whole heap. Generational collectors track the generation of objects: either young or old. Generational collectors have (at least) two modes: eden collection that only collects young objects and full collection that collects all objects. During an eden collection, old objects are only visited if they are suspected to contain pointers to new objects.

Generational collectors need to overcome two hurdles: how to track the generation of objects, and how to figure out which old objects have pointers to new objects.

The collector needs to know the generation of objects in order to determine which objects can be safely ignored during marking. In a traditional generational collector, eden collections move objects and then use the object’s address to determine its generation. Our collector does not move objects. Instead, it uses the mark bit to also track generation. Quite simply, we don’t clear any mark bits at the start of an eden collection. The marking algorithm will already ignore objects that have their mark bits set. This is called sticky mark bit generational garbage collection.

The collector will avoid visiting old objects during an eden collection. But it cannot avoid all of them: if an old object has pointers to new objects, then the collector needs to know to visit that old object. We use a write barrier — a small piece of instrumentation that executes after every write to an object — that tells the GC about writes to old objects. In order to cheaply know which objects are old, the object header also has a copy of the object’s state: either it is old or it is new. Objects are allocated new and labeled old when marked. When the write barrier detects a write to an old object, we tell the GC by setting the object’s state to old-but-remembered and putting it on the mark stack. We use separate mark stacks for objects marked by the write barrier, so when we visit the object, we know whether we are visiting it due to the barrier or because of normal marking (i.e. for the first time). Some accounting only needs to happen when visiting the object for the first time. The complete barrier is simply:

object->field = newValue;
if (object->cellState == Old)
    remember(object);

Generational garbage collection is an enormous improvement in performance on programs that allocate a lot, which is common in JavaScript. Many new JavaScript features, like iterators, arrow functions, spread, and for-of allocate lots of objects and these objects die almost immediately. Generational GC means that our collector does not need to visit all of the old objects just to delete the short-lived garbage.

Conservative Roots

Garbage collection begins by looking at local variables and some global state to figure out the initial set of marked objects. Introspecting the values of local variables is tricky. WebKit uses C++ local variables for pointers to the garbage collector’s heap, but C-like languages provide no facility for precisely introspecting the values of specific variables of arbitrary stack frames. WebKit solves this problem by marking objects conservatively when scanning roots. We use the simple segregated storage heap structure in part because it makes it easy to ask whether an arbitrary bit pattern could possibly be a pointer to some object.

We view this as an important optimization. Without conservative root scanning, C++ code would have to use some API to notify the collector about what objects it points to. Conservative root scanning means not having to do any of that work.

Mark-Sweep Summary

Riptide implements complex notions of reachability via arbitrary constraint callbacks and allows C++ code to manipulate objects directly. For performance, it parallelizes marking and uses generations to reduce the average amount of marking work.

Handling Concurrency

Riptide makes the draining phase of garbage collection concurrent. This works because of a combination of concurrency features:

  • Riptide is able to stop the world for certain tricky operations like stack scanning and DOM constraint solving.
  • Riptide uses a retreating wavefront write barrier to manage races between marking and object mutation. Using retreating wavefront allows us to avoid any impedance mismatch between generational and concurrent collector optimizations.
  • Retreating wavefront collectors can suffer from the risk of GC death spirals, so Riptide uses a space-time scheduler to put that in check.
  • Visiting an object while it is being reshaped is particularly hard, and WebKit reshapes objects as part of type inference. We use an obstruction-free double collect snapshot to ensure that the collector never marks garbage memory due to a visit-reshape race.
  • Lots of objects have tricky races that aren’t on the critial path, so we put a fast, adaptive, and fair lock in every JavaScript object as a handy way to manage them. It fits in two otherwise unused bits.

While we wrote Riptide for WebKit, we suspect that the underlying intuitions could be useful for anyone wanting to write a concurrent, generational, parallel, conservative, and non-copying collector. This section describes Riptide in detail.

Stopping The World and Safepoints

Riptide does draining concurrently. It is a goal to eventually make other phases of the collector concurrent as well. But so long as some phases are not safe to run concurrently, we need to be able to bring the application to a stop before performing those phases. The place where the collector stops needs to be picked so as to avoid reentrancy issues: for example stopping to run the GC in the middle of the GC’s allocator would create subtle problems. The concurrent GC avoids these problems by only stopping the application at those points where the application would trigger a GC. We call these safepoints. When the collector brings the application to a safepoint, we say that it is stopping the world.

Riptide currently stops the world for most of the constraint fixpoint, and resumes the world for draining. After draining finishes, the world is again stopped. A typical collection cycle may have many stop-resume cycles.

Retreating Wavefront

Draining concurrently means that just as we finish visiting some object, the application may store to one of its fields. We could store a pointer to an unmarked object into an object that is already visited, in which case the collector might never find that unmarked object. If we don’t do something about this, the collector would be sure to prematurely delete objects due to races with the application. Concurrent garbage collectors avoid this problem using write barriers. This section describes Riptide’s write barrier.

Write barriers ensure that the state of the collector is still valid after any race, either by marking objects or by having objects revisited (GC Handbook, chapter 15). Marking objects helps the collector make forward progress; intuitively, it is like advancing the collector’s wavefront. Having objects revisited retreats the wavefront. The literature of full of concurrent GC algorithms, like the Metronome, C4, and DLG, that all use some kind of advancing wavefront write barrier. The simplest such barrier is Dijkstra’s, which marks objects anytime a reference to them is created. I used these kinds of barriers in my past work because they make it easy to make the collector very deterministic. Adding one of those barriers to WebKit would be likely to create some performance overhead since this means adding new code to every write to the heap. But the retreating wavefront barrier, originally invented by Guy Steele in 1975, works on exactly the same principle as our existing generational barrier. This allows Riptide to achieve zero barrier overhead by reusing WebKit’s existing barrier.

It’s easiest to appreciate the similarity by looking at some barrier code. Our old generational barrier looked like this:

object->field = newValue;
if (object->cellState == Old)
    remember(object);

Steele’s retreating wavefront barrier looks like this:

object->field = newValue;
if (object->cellState == Black)
    revisit(object);

Retreating wavefront barriers operate on the same principle as generational barriers, so it’s possible to use the same barrier for both. The only difference is the terminology. The black state means that the collector has already visited the object. This barrier tells the collector to revisit the object if its cellState tells us that the collector had already visited it. This state is part of the classic tri-color abstraction: white means that the GC hasn’t marked the object, grey means that the object is marked and on the mark stack, and black means that the object is marked and has been visited (so is not on the mark stack anymore). In Riptide, the tri-color states that are relevant to concurrency (white, grey, black) perfectly overlap with the sticky mark-bit states that are relevant to generations (new, remembered, old). The Riptide cell states are as follows:

  • DefinitelyWhite: the object is new and white.
  • PossiblyGrey: the object is grey, or remembered, or new and white.
  • PossiblyBlack: the object is black and old, or grey, or remembered, or new and white.

A naive combination generational/concurrent barrier might look like this:

object->field = newValue;
if (object->cellState == PossiblyBlack)
    slowPath(object);

This turns out to need tweaking to work. The PossiblyBlack state is too ambiguous, so the slowPath needs additional logic to work out what the object’s state really was. Also, the order of execution matters: the CPU must run the object->cellState load after it runs the object->field store. That’s hard, since CPUs don’t like to obey store-before-load orderings. Finally, we need to guarantee that the barrier cannot retreat the wavefront too much.

Disambiguating Object State

The GC uses the combination of the object’s mark bit in the block header and the cellState byte in the object’s header to determine the object’s state. The GC clears mark bits at the start of full collection, and it sets the cellState during marking and barriers. It doesn’t reset objects’ cellStates back to DefinitelyWhite at the start of a full collection, because it’s possible to infer that the cellState should have been reset by looking at the mark bit. It’s important that the collector never scans the heap to clear marking state, and even mark bits are logically cleared using versioning. If an object is PossiblyBlack or PossiblyGrey and its mark bit is logically clear, then this means that the object is really white. Riptide’s barrier slowPath is almost like our old generational slow path but it has a new check: it will not do anything if the mark bit of the target object is not set, since this means that we’re in the middle of a GC and the object is actually white. Additionally, the barrier will attempt to set the object back to DefinitelyWhite so that the slowPath path does not have to see the object again (at least not until it’s marked and visited).

Store-Before-Barrier Ordering

The GC must flag the object as PossiblyBlack just before it starts to visit it and the application must store to field before loading object->cellState. Such ordering is not guaranteed on any modern architecture: both x86 and ARM will sink the store below the load in some cases. Inserting an unconditional store-load fence, such as lock; orl $0, (%rsp) on x86 or dmb ish on ARM, would degrade performance way too much. So, we make the fence itself conditional by playing a trick with the barrier’s condition:

object->field = newValue;
if (object->cellState <= blackThreshold)
    slowPath(object);

Where blackThreshold is a global variable. The PossiblyBlack state has the value 0, and when the collector is not running, blackThreshold is 0. But once the collector starts marking, it sets blackThreshold to 100 while the world is stopped. Then the barrier’s slowPath leads with a check like this:

storeLoadFence();
if (object->cellState != PossiblyBlack)
    return;

This means that the application takes a slight performance hit while Riptide is running. In typical programs, this overhead is about 5% during GC and 0% when not GCing. The only additional cost when not GCing is that blackThreshold must be loaded from memory, but we could not detect a slow-down due to this change. The 5% hit during collection is worth fixing, but to put it in perspective, the application used to take a 100% performance hit during GC because the GC would stop the application from running.

The complete Riptide write barrier is emitted as if the following writeBarrier function had been inlined just after any store to target:

ALWAYS_INLINE void writeBarrier(JSCell* target)
{
    if (LIKELY(target->cellState() > blackThreshold))
        return;
    storeLoadFence();
    if (target->cellState() != PossiblyBlack)
        return;
    writeBarrierSlow(target);
}

NEVER_INLINE void writeBarrierSlow(JSCell* target)
{
    if (!isMarked(target)) {
        // Try to label this object white so that we don't take the barrier
        // slow path again.
        if (target->compareExchangeCellState(PossiblyBlack, DefinitelyWhite)) {
            if (Heap::isMarked(target)) {
                // A race! The GC marked the object in the meantime, so
                // pessimistically label it black again.
                target->setCellState(PossiblyBlack);
            }
        }
        return;
    }

    target->setCellState(DefinitelyGrey);
    m_mutatorMarkStack->append(target);
}

The JIT compiler inlines the part of the slow path that rechecks the object’s state after doing a fence, since this helps keep the overhead low during GC. Moreover, our just-in-time compilers optimize the barrier further by removing barriers if storing values that the GC doesn’t care about, removing barriers on newly allocated objects (which must be white), clustering barriers together to amortize the cost of the fence, and removing redundant barriers if an object is stored to repeatedly.

Revisiting

When the barrier does append the object to the m_mutatorMarkStack, the object will get revisited eventually. The revisit could happen concurrently to the application. That’s important since we have seen programs retreat the wavefront enough that the total revisit pause would be too big otherwise.

Unlike advancing wavefront, retreating wavefront means forcing the collector to redo work that it has already done. Without some facilities to ensure collector progress, the collector might never finish due to repeated revisit requests from the write barrier. Riptide tackles this problem in two ways. First, we defer all revisit requests. Draining threads do not service any revisit requests until they have no other work to do. When an object is flagged for revisiting, it stays in the grey state for a while and will only be revisited towards the end of GC. This ensures that if an old object often has its fields overwritten with pointers to new objects, then the GC will usually only scan two snapshots’ worth of those fields: one snapshot whenever the GC visited the object first, and another towards the end when the GC gets around to servicing deferred revisits. Revisit deferral reduces the likelihood of runaway GC, but fully eliminating such pathologies is left to our scheduler.

Space-Time Scheduler

The bitter end of a retreating wavefront GC cycle is not pretty: just as the collector goes to visit the last object on the mark stack, some object that had already been visited gets written to, and winds up back on the mark stack. This can go on for a while, and before we had any mitigations we saw Riptide using 5x more memory than with synchronous collection. This death spiral happens because programs allocate a lot all the time and the collector cannot free any memory until it finishes marking. Riptide prevents death spirals using a scheduler that controls the application’s pace. We call it the space-time scheduler because it links the amount of time that the application gets to run for in a timeslice to the amount of space that the application has used by allocating in the collector’s headroom.

The space-time scheduler ensures that the retreating wavefront barrier cannot wreak havoc by giving the collector an unfair advantage: it will periodically stop the world for short pauses even when the collector could be running concurrently. It does this just so the collector can always outpace the application in case of a race. If this was meant as a garbage collector for servers, you could imagine providing the user with a bunch of knobs to control the schedule of these synthetic pauses. Different applications will have different ideal pause lengths. Applications that often write to old memory will retreat the collector’s wavefront a lot, and so they will need a longer pause to ensure termination. Functional-style programs tend to only write to newly allocated objects, so those could get away with a shorter pause. We don’t want web users or web developers to have to configure our collector, so the space-time scheduler adaptively selects a pause schedule.

To be correct, the scheduler must eventually pause the world for long enough to let the collector terminate. The space-time scheduler is based on a simple idea: the length of pauses increases during collection in response to how much memory the application is using.

The space-time scheduler selects the duration and spacing of synthetic pauses based on the headroom ratio, which is a measure of the amount of extra memory that the application has allocated during the concurrent collection. A concurrent collection is triggered by memory usage crossing the trigger threshold. Since the collector allows the application to keep running, the application will keep allocating. The space that the collector makes available for allocation during collection is called the headroom. Riptide is tuned for a max headroom that is 50% larger than the trigger threshold: so if the app needed to allocate 100MB to trigger a collection, its max headroom is 50MB. We want the collector to complete synchronously if we ever deplete all of our headroom: at that point it’s better for the system to pause and free memory than to run and deplete even more memory. The headroom ratio is simply the available headroom divided by the max headroom. The space-time scheduler will divide time into fixed timeslices, and the headroom ratio determines how much time the application is resumed for during that period.

The default tuning of our collector is that the collector timeslice is 2 ms, and the first C ms of it is given to the collector and the remaining M ms is given to the mutator. We always let the collector pause for at least 0.6 ms. Let H be the headroom ratio: 1 at the start of collection, and 0 if we deplete all headroom. With a 0.6 ms minimum pause and a 2 ms timeslice, we define M and C as follows:

M = 1.4 H
C = 2 – M

For example, at the start of usual collection we will give 0.6 ms to the collector and then 1.4 ms to the application, but as soon as the application starts allocating, this window shifts. Aggressive applications, which both allocate a lot and write to old objects a lot, will usually end collection with the split being closer to 1 ms for the collector followed by 1 ms for the application.

Thanks to the space-time scheduler, the worst that an adversarial program could do is cause the GC to keep revisiting some object. But it can’t cause the GC to run out of memory, since if the adversary uses up all of the headroom then M becomes 0 and the collector gets to stop the world until the end of the cycle.

Obstruction-Free Double Collect Snapshot

Concurrent garbage collection means finding exciting new ways of side-stepping expensive synchronization. In traditional concurrent mark-sweep GCs, which focused on nicely-typed languages, the worst race was the one covered by the write barrier. But since this is JavaScript, we get to have a lot more fun.

JavaScript objects may have properties added to them at any time. The WebKit JavaScript object model has three features that makes this efficient:

  • Each object has a structure ID: The first 32 bits of each object is its structure ID. Using a table lookup, this gives a pointer to the object’s structure: a kind of meta-object that describes how its object is supposed to look. The object’s layout is governed by its structure. Some objects have immutable structures, so for those we know that so long as their structure IDs stay the same, they will be laid out the same.
  • The structure may tell us that the object has inline storage. This is a slab of space in the object itself, left aside for JavaScript properties.
  • The structure may tell us about the object’s butterfly. Each object has room for a pointer that can be used to point to an overflow storage for additional properties that we call a butterfly. The butterfly is a bidirectional object that may store named properties to the left of the pointer and indexed properties to the right.

It’s imperative that the garbage collector visits the butterfly using exactly the structure that corresponds to it. If the object has a mutable structure, it’s imperative that the collector visits the butterfly using the data from the structure that corresponds to that butterfly. The collector would crash if it tried to decode the butterfly using wrong information.

To accomplish this, we use a very simple obstruction-free version of Afek et al’s double collect snapshot. To handle the immutable structure case, we just ensure that the application uses this protocol to set both the structure and butterfly:

  1. Nuke the structure ID — this sets a bit in the structure ID to indicate to the GC that the structure and butterfly are changing.
  2. Set the butterfly.
  3. Set the new (decontaminated) structure ID — decontaminating means clearing the nuke bit.

Meanwhile the collector does this to read both the structure and the butterfly:

  1. Read the structure ID.
  2. Read the butterfly.
  3. Read the structure ID again, and compare to (1).

If the collector ever reads a nuked structure ID, or if the structure ID in (1) and (3) are different, then we know that we will have a butterfly-structure mismatch. But if none of these conditions hold, then we are guaranteed that the collector will have a consistent structure and butterfly. See here for the proof.

Harder still is the case where the structure is mutable. In this case, we ensure that the protocol for setting the fields in the structure is to set them after the structure is nuked but before the new one is installed. The collector reads those fields before/after as well. This allows the collector to see a consistent snapshot of the structure, butterfly, and a bit inside the structure without using any locking. All that matters is that the stores in the application and the loads in the collector are ordered. We get this for free on x86, and on ARM we use store-store fences in the application (dmb ishst) and load-load fences in the collector (dmb ish).

This algorithm is said to be obstruction-free because it will complete in O(1) time no matter what kind of race it encounters, but if it does encounter a race then it’ll tell you to try again. Obstruction-free algorithms need some kind of contention manager to ensure that they do eventually complete. The contention manager must provably maximize the likelihood that the obstruction-free algorithm will eventually run without any race. For example, this would be a sound contention manager: exponential back-off in which the actual back-off amount is a random number between 0 and X where X increases exponentially on each try. It turns out that Riptide’s retreating wavefront revisit scheduler is already a natural contention manager. When the collector bails on visiting an object because it detected a race, it schedules revisiting of that object just as if a barrier had executed. So, the GC will visit any object that encountered such a race again anyway. The GC will visit the object much later and the timing will be somewhat pseudo-random due to OS scheduling. If an object did keep getting revisited, eventually the space-time scheduler will increase the collector’s synthetic pause to the point where the revisit will happen with the world stopped. Since there are no safepoints possible in any of the structure/butterfly atomic protocols, stopping the world ensures that the algorithm will not be obstructed.

Embedded WTF Locks

The obstruction-free object snapshot is great, but it’s not scalable — from a WebKit developer sanity standpoint — to use it everywhere. Because we have been adding more concurrency to WebKit for a while, we made this easier by already having a custom locking infrastructure in WTF (Web Template Framework). One of the goals of WTF locks was to fit locks in two bits so that we may one day stuff a lock into the header of each JavaScript object. Many of the loony corner-case race conditions in the concurrent garbage collector happen on paths where acquiring a lock is fine, particularly if that lock has a great inline fast path like WTF locks. So, all JavaScript objects in WebKit now have a fast, adaptive, and fair WTF lock embedded in two bits of what is otherwise the indexingType byte in the object header. This internal lock is used to protect mutations to all sorts of miscellaneous data structures. The collector will hold the internal lock while visiting those objects.

Locking should always be used with care since it can be a slow-down. In Riptide, we only use locking to protect uncommon operations. Additionally, we use an optimized lock implementation to reduce the cost of synchronization even further.

Algorithm Summary

Riptide is an improvement to WebKit’s collector and retains most of the things that made the old algorithm great. The changes that transformed WebKit’s collector were landed over the past six months, starting with the painful work of removing WebKit’s previous use of copying. Riptide combines Guy Steele’s classic retreating wavefront write barrier with a mature sticky-mark-sweep collector and lots of concurrency tricks to get a useful combination of high GC throughput and low GC latency.

Related Work

The paper that introduced retreating wavefront did not claim to implement the idea — it was just a thought experiment. We are aware of two other implementations of retreating wavefront. The oldest is the BDW (Boehm-Demers-Weiser) collector‘s incremental mode. That collector uses a page-granularity revisit because it relies entirely on page faults to trigger the barrier. The collector makes pages that have black objects read-only and then any write to that page triggers a fault. The fault handler makes the page read-write and logs the entire page for revisiting. Riptide uses a software barrier that precisely triggers revisiting only for the object that got stored to. The BDW collector uses page faults for a good reason: so that it can be used as a plug-in component to any kind of language environment. The compiler doesn’t have to be aware of retreating wavefronts or generations since the BDW collector will be sure to catch all of the writes that it cares about. But in WebKit we are happy to have everything tightly integrated and so Riptide relies on the rest of WebKit to use its barrier. This was not hard since the new barrier is almost identical to our old one.

Another user of retreating wavefront is ChakraCore. It appears to have both a page-fault-based barrier like BDW and a software card-marking barrier that can flag 128-byte regions of memory as needing revisit. (For a good explanation of card-marking, albeit in a different VM, see here.) Riptide uses an object-granularity barrier instead. We tried card-marking, but found that it was slower than our barrier unless we were willing to place our entire heap in a single large virtual memory reservation. We didn’t want our memory structure to be that deterministic. All retreating wavefront collectors require a stop-the-world snapshot-at-the-end increment that confirms that there is no more marking left to do. Both BDW and ChakraCore perform all revisiting during the snapshot-at-the-end. If there is a lot of revisiting work, that increment could take a while. That risk is particularly high with card-marking or fault-based barriers, in which a write to a single object usually causes the revisiting of multiple objects. Riptide can revisit objects with the application resumed. Riptide can also resume the application in between executions of custom constraints. Riptide is tuned so that the snapshot-at-the-end is only confirming that there is no more work, rather than spending an unbounded amount of time creating and chasing down new work.

Instead of retreating wavefront, most incremental, concurrent, and real-time collectors use some kind of advancing wavefront barrier. In those kinds of barriers, the application marks the objects it interacts with under certain conditions. Baker’s barrier marks every pointer you load from the heap. Dijkstra’s barrier marks every pointer you store into the heap. Yuasa’s barrier marks every pointer you overwrite. All of these barriers advance the collector’s wavefront in the sense that they reduce the amount of work that the collector will have to do — the thinking goes that the collector would have marked the object anyway so the barrier is helping. Since these collectors usually allocate objects black during collection, marking objects will not postpone when the collector can finish. This means that advancing wavefront collectors will mark all objects that were live at the very beginning of the cycle and all objects allocated during the cycle. Keeping objects allocated during the GC cycle (which may be long) is called floating garbage. Retreating wavefront collectors largely avoid floating garbage since in those collectors an object can only be marked if it is found to be referenced from another marked object.

Advancing wavefront barriers are not a great match for generational collection. The generational barrier isn’t going to overlap with an advancing wavefront barrier the way that Riptide’s, ChakraCore’s, and BDW’s do. This means double the barrier costs. Also, in an advancing wavefront generational collector, eden collections have to be careful to ensure that their floating garbage doesn’t get promoted. This requires distinguishing between an object being marked for survival versus being marked for promotion. For example, the Domani, Kolodner, Petrank collector has a “yellow” object state and special color-toggling machinery to manage this state, all so that it does not promote floating garbage. The Frampton, Bacon, Cheng, and Grove version of the Metronome collector maintains three nurseries to gracefully move objects between generations, and in their collector the eden collections and full collections can proceed concurrently to each other. While those collectors have incredible features, they are not in widespread use, probably because of increased baseline costs due to extra bookkeeping and extra barriers. To put in perspective how annoying the concurrent-generational integration is, many systems like V8 and HotSpot avoid the problem by using synchronous eden collections. We want eden collections to be concurrent because although they are usually fast, we have no bound on how long they could take in the worst case. Not having floating garbage is another reason why it’s so easy for retreating wavefront collectors to do concurrent eden collection: there’s no need to invent states for black-but-new objects.

Using retreating wavefront means we don’t get the advancing wavefront’s GC termination guarantee. We make up for it by having more aggressive scheduling. It’s common for advancing wavefront collectors to avoid all global pauses because all of collection is concurrent. In the most aggressive advancing wavefront concurrent collectors, the closest thing to a “pause” is that at some point each thread must produce a stack scan. Even if all of Riptide’s algorithms were concurrent, we would still have to artificially stop the application simply to ensure termination. That’s a trade-off that we’re happy with, since we get to control how long these synthetic pauses are.

In many ways, Riptide is a classic mark-sweep collector. Using simple segregated storage is very common, and variants of this technique can be found in Jikes RVM, the Metronome real-time garbage collector, the BDW collector, the Bartok concurrent mark-sweep collector, and probably many others. Combining mark-sweep with bump-pointer is not new; Immix is another way to do it. Our bump’n’pop allocator looks most like Hoard‘s, and the technique was also used in Vam and reaps. Our conservative scan is almost like what the BDW collector does. Sticky mark bits are also used in BDW, Jikes RVM, and ChakraCore.

Evaluation

We enabled Riptide once we were satisfied that it did not have any major remaining regressions (in stability, performance, and memory usage) and that it demonstrated an improvement on some test of GC pauses. Enabling it now enables us to expose it to a lot of testing as we continue to tune and validate this collector. This section summarizes what we know about Riptide’s performance so far.

The synchronization features that enable concurrent collection were landed in many revisions over a six month period starting in July 2016. This section focuses on the performance boost that we get once we enable Riptide. Enabling Riptide means that draining will resume the application and allow the application and collector to run alongside each other. The application will still experience pauses: both synthetic pauses from the space-time scheduler and mandatory pauses for things like DOM constraint evaluation. The goal of this evaluation is to give a glimpse of what Riptide can do for observed pauses.

The test that did the best job of demonstrating our garbage collector’s jankyness was the Octane SplayLatency test. This test is also included in JetStream. WebKit was previously not the best at either version of this test so we wanted a GC that would give us a big improvement. The Octane version of this test reports the reciprocal of the root-mean-squared, which rewards uniform performance. JetStream reports the reciprocal of the average of the worst 0.5% of samples, which rewards fast worst-case performance. We tuned Riptide on the JetStream version of this test, but we show results from both versions.

The performance data was gathered on a 15″ MacBook Pro with a 2.8 GHz Intel Core i7 and 16GB RAM. This machine has four cores, and eight logical CPUs thanks to hyperthreading. We took care to quiet down the machine before running benchmarks, by closing almost all apps, disconnecting from the network, disabling Spotlight, and disabling ReportCrash. Our GC is great at taking advantage of hyperthreaded CPUs, so it runs eight draining threads on this machine.


The figure above shows that Riptide improves the JetStream/splay-latency score by a factor of five.


The figure above shows that Riptide improves the Octane/SplayLatency score by a factor of 2.5.


The chart above shows what is happening over 10,000 iterations of the Splay benchmark: without Riptide, an occasional iteration will pause for >10 ms due to garbage collection. Enabling Riptide brings these hiccups below 3 ms.

You can run this benchmark interactively if you want to see how your browser’s GC performs. That version will plot the time per iteration in milliseconds over 2,000 iterations.

We continue to tune Riptide as we validate it on a larger variety of workloads. Our goal is to continue to reduce pause times. That means making more of the collector concurrent and improving the space-time scheduler. Continued tuning is tracked by bug 165909.

Conclusion

This post describes the new Riptide garbage collector in WebKit. Riptide does most of its work off the main thread, allowing for a significant reduction in worst-case pause times. Enabling Riptide leads to a five-fold improvement in latency as reported by the JetStream/splay-latency test. Riptide is now enabled by default in WebKit trunk and you can try it out in Safari Technology Preview 21. Please try it out and file bugs!

By Filip Pizlo at January 20, 2017 06:00 PM

December 21, 2016

Frédéric Wang: ¡Igalia is hiring!

Igalia WebKit

If you read this blog, you probably know that I joined Igalia early this year, where I have been involved in projects related to free software and web engines. You may however not be aware that Igalia has a flat & cooperative structure where all decisions (projects, events, recruitments, company agreements etc) are voted by members of an assembly. In my opinion such an organization allows to take better decisions and to avoid frustrations, compared to more traditional hierarchical organizations.

After several months as a staff, I finally applied to become an assembly member and my application was approved in November! Hence I attended my first assembly last week where I got access to all the internal information and was also able to vote… In particular, we approved the opening of two new job positions. If you are interested in state-of-the-art free software projects and if you are willing to join a company with great human values, you should definitely consider applying!

December 21, 2016 11:00 PM

December 19, 2016

Miguel A. Gómez: WPE: Web Platform for Embedded

Igalia WebKit

WPE is a new WebKit port optimized for embedded platforms that can support a variety of display protocols like Wayland, X11 or other native implementations. It is the evolution of the port formerly known as WebKitForWayland, and it was born as part of a collaboration between Metrological and Igalia as an effort to have a WebKit port running efficiently on STBs.

QtWebKit has been unmaintained upstream since they decided to switch to Blink, hence relying in a dead port for the future of STBs is a no-go. Meanwhile, WebKitGtk+ has been maintained and live upstream which was perfect as a basis for developing this new port, removing the Gtk+ dependency and trying Wayland as a replacement for X server. WebKitForWayland was born!

During a second iteration, we were able to make the Wayland dependency optional, and change the port to use platform specific libraries to implement the window drawings and management. This is very handy for those platforms were Wayland is not available. Due to this, the port was renamed to reflect that Wayland is just one of the several backends supported: welcome WPE!.

WPE has been designed with simplicity and performance in mind. Hence, we just developed a fullscreen browser with no tabs and multimedia support, as small (both in memory usage and disk space) and light as possible.

Current repositories

We are now in the process of moving from the WebKitForWayland repositories to what will be the WPE final ones. This is why this paragraph is about “current repositories”, and why the names include WebKitForWayland instead of WPE. This will change at some point, and expect a new post with the details when it happens. For now, just bear in mind that where it says WebKitForWayland it really refers to WPE.

  • Obviously, we use the main WebKit repository git://git.webkit.org/WebKit.git as our source for the WebKit implementation.
  • Then there are some repositories at github to host the specific WPE bits. This repositories include the needed dependencies to build WPE together with the modifications we did to WebKit for this new port. This is the main WPE repository, and it can be easily built for the desktop and run inside a Wayland compositor. The build and run instructions can be checked here. The mission of these repositories is to be the WPE reference repository, containing the differences needed from upstream WebKit and that are common to all the possible downstream implementations. Every release cycle, the changes in upstream WebKit are merged into this repository to keep it updated.
  • And finally we have Metrological repositories. As in the previous case, we added the dependencies we needed together with the WebKit code. This third’s repository mission is to hold the Metrological specific changes both to WPE and its dependencies, and it also updated from the main WPE repository each release cycle. This version of WPE is meant to be used inside Metrological’s buildroot configuration, which is able to build images for the several target platforms they use. These platforms include all the versions of the Raspberry Pi boards, which are the ones we use as the reference platforms, specially the Raspberry Pi 2, together with several industry specific boards from chip vendors such as Broadcom and Intel.

Building

As I mentioned before, building and running WPE from the main repository is easy and the instructions can be found here.

Building an image for a Raspberry Pi is quite easy as well, just a bit more time consuming because of the cross compiling and the extra dependencies. There are currently a couple of configs at Metrological’s buildroot that can be used and that don’t depend on Metrological’s specific packages. Here are the commands you need to run in order to test it:

    • Clone the buildroot repository:
      git clone https://github.com/Metrological/buildroot-wpe.git
    • Select the buildroot configuration you want. Currently you can use raspberrypi_wpe_defconfig to build for the RPi1 and raspberrypi2_wpe_defconfig to build for the RPi2. This example builds for the RPi2, it can be changed to the RPi1 just changing this command to use the appropriate config. The test of the commands are the same for both cases.
      make raspberrypi2_wpe_defconfig
    • Build the config
      make
    • And then go for a coffee because the build will take a while.
    • Once the build is finished you need to deploy the result to the RPi’s card (SD for the RPi1, microSD for the RPi2). This card must have 2 partitions:
      • boot: fat32 file system, with around 100MB of space.
      • root: ext4 file system with around 500MB of space.
    • Mount the SD card partitions in you system and deploy the build result (stored in output/images) to them. The deploy commands assume that the boot partition was mounted on /media/boot and the root partition was mounted on /media/rootfs:
      cp -R output/images/rpi-firmware/* /media/boot/
      cp output/images/zImage /media/boot/
      cp output/images/*.dtb /media/boot/
      tar -xvpsf output/images/rootfs.tar -C /media/rootfs/
    • Remove the card from your system and plug it to the RPi. Once booted, ssh into it and the browser can be easily launched:
      wpe http://www.igalia.com

Features

I’m planning to write a dedicated post to talk about technical details of the project where I’ll cover this more in deep but, briefly, these are some features that can be found in WPE:

    • support for the common HTML5 features: positioning, CSS, CSS3D, etc
    • hardware accelerated media playback
    • hardware accelerated canvas
    • WebGL
    • MSE
    • MathML
    • Forms
    • Web Animations
    • XMLHttpRequest
    • and many other features supported by WebKit. If you are interested in the complete list, feel freel to browse to http://html5test.com/ and check it yourself!

captura-de-pantalla-de-2016-12-19-12-09-10

Current adoption status

We are proud to see that thanks to Igalia’s effort together with Metrological, WPE has been selected to replace QtWebKit inside the RDK stack, and that it’s also been adopted by some big cable operators like Comcast (surpassing other options like Chromium, Opera, etc). Also, there are several other STB manufacturers that have shown interest in putting WPE on their boards, which will lead to new platforms supported and more people contributing to the project.

These are really very good news for WPE, and we hope to have an awesome community around the project (both companies and individuals), to collaborate making the engine even better!!

Future developments

Of course, periodically merging upstream changes and, at the same time, keep adding new functionalities and supported platforms to the engine are a very important part of what we are planning to do with WPE. Both Igalia and Metrological have a lot of ideas for future work: finish WebRTC and EME support, improvements to the graphics pipelines, add new APIs, improve security, etc.

But besides that there’s also a very important refactorization that is being performed, and it’s uploading the code to the main WebKit repository as a new port. Basically this means that the main WPE repository will be removed at some point, and its content will be integrated into WebKit. Together with this, we are setting the pieces to have a continuous build and testing system, as the rest of the WebKit ports have, to ensure that the code is always building and that the layout tests are passing properly. This will greatly improve the quality and robustness of WPE.

So, when we are done with those changes, the repository structure will be:

  • The WebKit main repository, with most of the code integrated there
  • Clients/users of WPE will have their own repository with their specific code, and they will merge main repository’s changes directly. This is the case of the Metrological repository.
  • A new third repository that will store WPE’s rendering backends. This code cannot be upstreamed to the WebKit repository as in many cases the license won’t allow it. So only a generic backend will be upstreamed to WebKit while the rest of the backends will be stored here (or in other client specific repositories).

By magomez at December 19, 2016 02:39 PM

December 15, 2016

Claudio Saavedra: Thu 2016/Dec/15

Igalia WebKit

Igalia is hiring. We're currently interested in Multimedia and Chromium developers. Check the announcements for details on the positions and our company.

December 15, 2016 05:13 PM

December 09, 2016

Frédéric Wang: STIX Two in Gecko and WebKit

Igalia WebKit

On the 1st of December, the STIX Fonts project announced the release of STIX 2. If you never heard about this project, it is described as follows:

The mission of the Scientific and Technical Information Exchange (STIX) font creation project is the preparation of a comprehensive set of fonts that serve the scientific and engineering community in the process from manuscript creation through final publication, both in electronic and print formats.

This sounds a very exciting goal but the way it has been achieved has made the STIX project infamous for its numerous delays, for its poor or confusing packaging, for delivering math fonts with too many bugs to be usable, for its lack of openness & communication, for its bad handling of third-party feedback & contribution

Because of these laborious travels towards unsatisfactory releases, some snarky people claim that the project was actually named after Styx (Στύξ) the river from Greek mythology that one has to cross to enter the Underworld. Or that the story of the project is summarized by Baudelaire’s verses from L’Irrémédiable:

Une Idée, une Forme, un Être
Parti de l’azur et tombé
Dans un Styx bourbeux et plombé
Où nul œil du Ciel ne pénètre ;

More seriously, the good news is that the STIX Consortium finally released text fonts with a beautiful design and a companion math font that is usable in math rendering engines such as Word Processors, LaTeX and Web Engines. Indeed, WebKit and Gecko have supported OpenType-based MathML layout for more than three years (with recent improvements by Igalia) and STIX Two now has correct OpenType data and metrics!

Of course, the STIX Consortium did not address all the technical or organizational issues that have made its reputation but I count on Khaled Hosny to maintain his more open XITS fork with enhancements that have been ignored for STIX Two (e.g. Arabic and RTL features) or with fixes of already reported bugs.

As Jacques Distler wrote in a recent blog post, OS vendors should ideally bundle the STIX Two fonts in their default installation. For now, users can download and install the OTF fonts themselves. Note however that the STIX Two archive contains WOFF and WOFF2 fonts that page authors can use as web fonts.

I just landed patches in Gecko and WebKit so that future releases will try and find STIX Two on your system for MathML rendering. However, you can already do the proper font configuration via the preference menu of your browser:

  • For Gecko-based applications (e.g. Firefox, Seamonkey or Thunderbird), go to the font preference and select STIX Two Math as the “font for mathematics”.
  • For WebKit-based applications (e.g. Epiphany or Safari) add the following rule to your user stylesheet: math { font-family: "STIX Two Math"; }.

Finally, here is a screenshot of MathML formulas rendered by Firefox 49 using STIX Two:

Screenshot of MathML formulas rendered by Firefox using STIX 2

And the same page rendered by Epiphany 3.22.3:

Screenshot of MathML formulas rendered by Epiphany using STIX 2

December 09, 2016 11:00 PM

November 21, 2016

A tale of cylinders and shadows

Gustavo Noronha

Like I wrote before, we at Collabora have been working on improving WebKitGTK+ performance for customer projects, such as Apertis. We took the opportunity brought by recent improvements to WebKitGTK+ and GTK+ itself to make the final leg of drawing contents to screen as efficient as possible. And then we went on investigating why so much CPU was still being used in some of our test cases.

The first weird thing we noticed is performance was actually degraded on Wayland compared to running under X11. After some investigation we found a lot of time was being spent inside GTK+, painting the window’s background.

Here’s the thing: the problem only showed under Wayland because in that case GTK+ is responsible for painting the window decorations, whereas in the X11 case the window manager does it. That means all of that expensive blurring and rendering of shadows fell on GTK+’s lap.

During the web engines hackfest, a couple of months ago, I delved deeper into the problem and noticed, with Carlos Garcia’s help, that it was even worse when HiDPI displays were thrown into the mix. The scaling made things unbearably slower.

You might also be wondering why would painting of window decorations be such a problem, anyway? They should only be repainted when a window changes size or state anyway, which should be pretty rare, right? Right, that is one of the reasons why we had to make it fast, though: the resizing experience was pretty terrible. But we’ll get back to that later.

So I dug into that, made a few tries at understanding the issue and came up with a patch showing how applying the blur was being way too expensive. After a bit of discussion with our own Pekka Paalanen and Benjamin Otte we found the root cause: a fast path was not being hit by pixman due to the difference in scale factors on the shadow mask and the target surface. We made the shadow mask scale the same as the surface’s and voilà, sane performance.

I keep talking about this being a performance problem, but how bad was it? In the following video you can see how huge the impact in performance of this problem was on my very recent laptop with a HiDPI display. The video starts with an Epiphany window running with a patched GTK+ showing a nice demo the WebKit folks cooked for CSS animations and 3D transforms.

After a few seconds I quickly alt-tab to the version running with unpatched GTK+ – I made the window the exact size and position of the other one, so that it is under the same conditions and the difference can be seen more easily. It is massive.

Yes, all of that slow down was caused by repainting window shadows! OK, so that solved the problem for HiDPI displays, made resizing saner, great! But why is GTK+ repainting the window even if only the contents are changing, anyway? Well, that turned out to be an off-by-one bug in the code that checks whether the invalidated area includes part of the window decorations.

If the area being changed spanned the whole window width, say, it would always cause the shadows to be repainted. By fixing that, we now avoid all of the shadow drawing code when we are running full-window animations such as the CSS poster circle or gtk3-demo’s pixbufs demo.

As you can see in the video below, the gtk3-demo running with the patched GTK+ (the one on the right) is using a lot less CPU and has smoother animation than the one running with the unpatched GTK+ (left).

Pretty much all of the overhead caused by window decorations is gone in the patched version. It is still using quite a bit of CPU to animate those pixbufs, though, so some work still remains. Also, the overhead added to integrate cairo and GL rendering in GTK+ is pretty significant in the WebKitGTK+ CSS animation case. Hopefully that’ll get much better from GTK+ 4 onwards.

By kov at November 21, 2016 05:04 PM

November 14, 2016

Manuel Rego: Recap of the Web Engines Hackfest 2016

Igalia WebKit

This is my personal summary of the Web Engines Hackfest 2016 that happened at Igalia headquarters (in A Coruña) during the last week of September.

The hackfest is a special event, because the target audience are implementors of the different web browser engines. The idea is to join browser hackers for a few days to discuss different topics and work together in some of them. This year we almost reached 40 participants, which is the largest number of people attending the hackfest since we started it 8 years ago. Also it was really nice to have people from the different communities around the open web platform.

One of the breakout sessions of the Web Engines Hackfest 2016 One of the breakout sessions of the Web Engines Hackfest 2016

Talks

Despite the main focus of the event is the hacking time and the breakout sessions to discuss the different topics, we took advantage of having great developers around to arrange a few short talks. I’ll do a brief review of the talks from this year, which have been already published on a YouTube playlist (thanks to Chema and Juan for the amazing recording and video edition).

CSS Grid Layout

In my case, as usual lately, my focus was CSS Grid Layout implementation on Blink and WebKit, that Igalia is doing as part of our collaboration with Bloomberg. The good thing was that Christian Biesinger from Google was present in the hackfest, he’s usually the Blink developer reviewing our patches, so it was really nice to have him around to discuss and clarify some topics.

My colleague Javi already wrote a blog post about the hackfest with lots of details about the Grid Layout work we did. Probably the most important bit is that we’re getting really close to ship the feature. The spec is now Candidate Recommendation (despite a few issues that still need to be clarified) and we’ve been focusing lately on finishing the last features and fixing interoperability issues with Gecko implementation. And we’ve just sent the Intent to ship” mail to blink-dev, if everything goes fine Grid Layout might be enabled by default on Chromium 57 that will be released around March 2017.

MathML

It’s worth to mention the work performed by Behdad, Fred and Khaled adding support on HarfBuzz for the OpenType MATH table. Again Fred wrote a detailed blog post about all this work some weeks ago.

Also we had the chance to discuss with more googlers about the possibilities to bring MathML back into Chromium. We showed the status of the experimental branch created by Fred and explained the improvements we’ve been doing on the WebKit implementation. Now Gecko and WebKit both follow the implementation note and the tests have been upstreamed into the W3C Web platform Tests repository. Let’s see how all this will evolve in the future.

Acknowledges

Last, but not least, as one of the event organizers, I’ve to say thanks to everyone attending the hackfest, without all of you the event won’t make any sense. Also I want to acknowledge the support from the hackfest sponsors Collabora, Igalia and Mozilla. And I’d also like to give kudos to Igalia for hosting and organizing the event one year more. Looking forward for 2017 edition!

Web Engines Hackfest 2016 sponsors: Collabora, Igalia and Mozilla Web Engines Hackfest 2016 sponsors: Collabora, Igalia and Mozilla

Igalia 15th Anniversary & Summit

Just to close this blog post let me talk about some extra events that happened on the last week of September just after the Web Engines Hackfest.

Igalia was celebrating its 15th anniversary with several parties during the week, one of them was in the last day of the hackfest at night with a cool live concert included. Of course hackfest attendees were invited to join us.

Igalia 15th Anniversary Party Igalia 15th Anniversary Party (picture by Chema Casanova)

Then, on the weekend, Igalia arranged a new summit. We usually do two per year and, in my humble opinion, they’re really important for Igalia as a whole. The flat structure is based on trust in our peers, so spending a few days per year all together is wonderful. It allows us to know each other better, while having a good time with our colleagues. And I’m sure they’re very useful for newcomers too, in order to help them to understand our company culture.

Igalia Fall Summit 2016 Igalia Fall Summit 2016 (picture by Alberto Garcia)

And that’s all for this post, let’s hope you didn’t get bored. Thanks for reading this far. 😊

November 14, 2016 11:00 PM

November 10, 2016

Xabier Rodríguez Calvar: Web Engines Hackfest 2016

Igalia WebKit

From September 26th to 28th we celebrated at the Igalia HQ the 2016 edition of the Web Engines Hackfest. This year we broke all records and got participants from the three main companies behind the three biggest open source web engines, say Mozilla, Google and Apple. Or course, it was not only them, we had some other companies and ourselves. I was active part of the organization and I think we not only did not get any complain but people were comfortable and happy around.

We had several talks (I included the slides and YouTube links):

We had lots and lots of interesting hacking and we also had several breakout sessions:

  • WebKitGTK+ / Epiphany
  • Servo
  • WPE / WebKit for Wayland
  • Layout Models (Grid, Flexbox)
  • WebRTC
  • JavaScript Engines
  • MathML
  • Graphics in WebKit

What I did during the hackfest was working with Enrique and Žan to advance on reviewing our downstream implementation of Media Source Extensions (MSE) in order to land it as soon as possible and I can proudly say that we did already (we didn’t finish at the hackfest but managed to do it after it). We broke the bots and pissed off Michael and Carlos but we managed to deactivate it by default and continue working on it upstream.

So summing up, from my point of view and it is not only because I was part of the organization at Igalia, based also in other people’s opinions, I think the hackfest was a success and I think we will continue as we were or maybe growing a bit (no spoilers!).

Finally I would like to thank our gold sponsors Collabora and Igalia and our silver sponsor Mozilla.

By calvaris at November 10, 2016 08:59 AM

October 31, 2016

Manuel Rego: My experience at W3C TPAC 2016 in Lisbon

Igalia WebKit

At the end of September I attended W3C TPAC 2016 in Lisbon together with my igalian fellows Joanie and Juanjo. TPAC is where all the people working on the different W3C groups meet for a week to discuss tons of topics around the Web. It was my first time on a W3C event so I had the chance to meet there a lot of amazing people, that I’ve been following on the internet for a long time.

Igalia booth

Like past year Igalia was present in the conference, and we had a nice exhibitor booth just in front the registration desk. Most of the time my colleague Juanjo was there explaining Igalia and the work we do to the people that came to the booth.

Igalia Booth at W3C TPAC 2016 Igalia Booth at W3C TPAC 2016 (picture by Juan José Sánchez)

In the booth we were showcasing some videos of the last standards implementations we’ve been doing upstream (like CSS Grid Layout, WebRTC, MathML, etc.). In addition, we were also showing a few demos of our WebKit port called WPE, which has been optimized to run on low-end devices like the Raspberry Pi.

It’s really great to have the chance to explain the work we do around browsers, and check that there are quite a lot of people interested on that. On this regard, Igalia is a particular consultancy that can help companies to develop standards upstream, contributing to both the implementations and the specification evolution and discussions inside the different standard bodies. Of course, don’t hesitate to contact us if you think we can help you to achieve similar goals on this field.

CSS Working Group

During TPAC I was attending the CSS WG meetings as an observer. It’s been really nice to check that a lot of people there appreciate the work we do around CSS Grid Layout, as part of our collaboration with Bloomberg. Not only the implementation effort we’re leading on Blink and WebKit, but also the contributions we make to the spec itself.

New ideas for CSS Grid Layout

There were not a lot of discussion about Grid Layout during the meetings, just a few issues that I’ll explain later. However, Jen Simmons did a very cool presentation with a proposal for a new regions specification.

As you probably remember one of the main concerns regarding CSS Regions spec is the need of dummy divs, that are used to flow the text into them. Jen was suggesting that we could use CSS Grids to solve that. Grids create boxes from CSS where you place elements from the DOM, the idea would be that if you can reference those boxes from CSS then the contents could flow there without the need of dummy nodes. This was linked with some other new ideas to improve CSS Grid Layout:

  • Apply backgrounds and borders to grid cells.
  • Skip cells during auto-placement.

All this is already possible nowadays using dummy div elements (of course if you use a browser with regions and grid support like Safari Technology Preview). However, the idea would be to be able to achieve the same thing without any empty item, directly referring them from CSS.

This of course needs further discussion and won’t be part of the level 1 of CSS Grid Layout spec, but it’d be really nice to get some of those things included in the future versions. I was discussing with Jen and some of them (like skipping cells for auto-placement) shouldn’t be hard to implement. Last, this reminded me about a discussion we had two years ago at the Web Engines Hackfest 2014 with Mihnea Ovidenie.

Last issues on CSS Grid Layout

After the meetings I had the chance to discuss hand by hand some issues with fantasai (one of the Grid Layout spec editors).

One of the topics was discussed during the CSS WG meeting, it was related to how to handle percentages inside Grid Layout when they need to be resolved in an intrinsic size situation. The question here was if both percentage tracks and percentage gaps should resolve the same or differently, the group agreed that they should work exactly the same. However there is something else, as Firefox computes percentage for margins different to what’s done in the other browsers. I tried to explain all this in a detailed issue for the CSS WG, I really think we only have option to do this, but it’ll take a while until we’ve a final resolution on this.

Another issue I think we’ve still open is about the minimum size of grid items. I believe the text on the spec still needs some tweaks, but, thanks to fantasai, we managed to clarify what should be the expected behavior in the common case. However, there’re still some open questions and an ongoing discussion regarding images.

Finally, it’s worth to mention that just after TPAC, the Grid Layout spec has transitioned to Candidate Recommendation (CR), which means that it’s getting stable enough to finish the implementations and release it to the wild. Hopefully this open issues will be fixed pretty soon.

Houdini

I also attended the first day of Houdini meetings, I was lucky as they started with the Layout API (the one I was more interested on). It’s clear that Google is pushing a lot for this to happen, all the new Layout NG project inside Blink seems quite related to the new Houdini APIs. It looks like a nice thing to have, but, on a first sight, it seems quite hard to get it right. Mainly due to the big complexity of the layout on the web.

On the bright side the Painting API transitioned to CR. Blink has some prototype implementations already that can be used to create cool demos. It’s really amazing to see that the Houdini effort is already starting to produce some real results.

Other

It’s always a pleasure to visit Portugal and Lisbon, both a wonderful country and city. One of the conference afternoons I had the chance to go for a walk from the congress center to the lovely Belém Tower. It was clear I couldn’t miss the chance to go there for such a big conference, you don’t always have TPAC so close from home.

Lisbon wall painting Lisbon wall painting

Overall it was a really nice experience, everyone was very kind and supportive about Igalia and our work on the web platform. Let’s keep working hard to push the web forward!

October 31, 2016 11:00 PM

October 05, 2016

Web Engines Hackfest 2016!

Gustavo Noronha

I had a great time last week and the web engines hackfest! It was the 7th web hackfest hosted by Igalia and the 7th hackfest I attended. I’m almost a local Galician already. Brazilian Portuguese being so close to Galician certainly helps! Collabora co-sponsored the event and it was great that two colleagues of mine managed to join me in attendance.

It had great talks that will eventually end up in videos uploaded to the web site. We were amazed at the progress being made to Servo, including some performance results that blew our minds. We also discussed the next steps for WebKitGTK+, WebKit for Wayland (or WPE), our own Clutter wrapper to WebKitGTK+ which is used for the Apertis project, and much more.

Zan giving his talk on WPE (former WebKitForWayland)Zan giving his talk on WPE (former WebKitForWayland)

One thing that drew my attention was how many Dell laptops there were. Many collaborans (myself included) and igalians are now using Dells, it seems. Sure, there were thinkpads and macbooks, but there was plenty of inspirons and xpses as well. It’s interesting how the brand make up shifted over the years since 2009, when the hackfest could easily be mistaken with a thinkpad shop.

Back to the actual hackfest: with the recent release of Gnome 3.22 (and Fedora 25 nearing release), my main focus was on dealing with some regressions suffered by users experienced after a change that made putting the final rendering composited by the nested Wayland compositor we have inside WebKitGTK+ to the GTK+ widget so it is shown on the screen.

One of the main problems people reported was applications that use WebKitGTK+ not showing anything where the content was supposed to appear. It turns out the problem was caused by GTK+ not being able to create a GL context. If the system was simply not able to use GL there would be no problem: WebKit would then just disable accelerated compositing and things would work, albeit slower.

The problem was WebKit being able to use an older GL version than the minimum required by GTK+. We fixed it by testing that GTK+ is able to create GL contexts before using the fast path, falling back to the slow glReadPixels codepath if not. This way we keep accelerated compositing working inside WebKit, which gives us nice 3D transforms and less repainting, but take the performance hit in the final “blit”.

Introducing Introducing “WebKitClutterGTK+”

Another issue we hit was GTK+ not properly updating its knowledge of the window’s opaque region when painting a frame with GL, which led to some really interesting issues like a shadow appearing when you tried to shrink the window. There was also an issue where the window would not use all of the screen when fullscreen which was likely related. Both were fixed.

André Magalhães also worked on a couple of patches we wrote for customer projects and are now pushing upstream. One enables the use of more than one frontend to connect to a remote web inspector server at once. This can be used to, for instance, show the regular web inspector on a browser window and also use IDE integration for setting breakpoints and so on.

The other patch was cooked by Philip Withnall and helped us deal with some performance bottlenecks we were hitting. It improves the performance of painting scroll bars. WebKitGTK+ does its own painting of scrollbars (we do not use the GTK+ widgets for various reasons). It turns out painting scrollbars can be quite a hit when the page is being scrolled fast, if not done efficiently.

Emanuele Aina had a great time learning more about meson to figure out a build issue we had when a more recent GStreamer was added to our jhbuild environment. He came out of the experience rather sane, which makes me think meson might indeed be much better than autotools.

Igalia 15 years cakeIgalia 15 years cake

It was a great hackfest, great seeing everyone face to face. We were happy to celebrate Igalia’s 15 years with them. Hope to see everyone again next year =)

By kov at October 05, 2016 12:23 PM

September 22, 2016

WebKitGTK+ 2.14 and the Web Engines Hackfest

Gustavo Noronha

Next week our friends at Igalia will be hosting this year’s Web Engines Hackfest. Collabora will be there! We are gold sponsors, and have three developers attending. It will also be an opportunity to celebrate Igalia’s 15th birthday \o/. Looking forward to meet you there! =)

Carlos Garcia has recently released WebKitGTK+ 2.14, the latest stable release. This is a great release that brings a lot of improvements and works much better on Wayland, which is becoming mature enough to be used by default. In particular, it fixes the clipboard, which was one of the main missing features, thanks to Carlos Garnacho! We have also been able to contribute a bit to this release =)

One of the biggest changes this cycle is the threaded compositor, which was implemented by Igalia’s Gwang Yoon Hwang. This work improves performance by not stalling other web engine features while compositing. Earlier this year we contributed fixes to make the threaded compositor work with the web inspector and fixed elements, helping with the goal of enabling it by default for this release.

Wayland was also lacking an accelerated compositing implementation. There was a patch to add a nested Wayland compositor to the UIProcess, with the WebProcesses connecting to it as Wayland clients to share the final rendering so that it can be shown to screen. It was not ready though and there were questions as to whether that was the way to go and alternative proposals were floating around on how to best implement it.

At last year’s hackfest we had discussions about what the best path for that would be where collaborans Emanuele Aina and Daniel Stone (proxied by Emanuele) contributed quite a bit on figuring out how to implement it in a way that was both efficient and platform agnostic.

We later picked up the old patchset, rebased on the then-current master and made it run efficiently as proof of concept for the Apertis project on an i.MX6 board. This was done using the fancy GL support that landed in GTK+ in the meantime, with some API additions and shortcuts to sidestep performance issues. The work was sponsored by Robert Bosch Car Multimedia.

Igalia managed to improve and land a very well designed patch that implements the nested compositor, though it was still not as efficient as it could be, as it was using glReadPixels to get the final rendering of the page to the GTK+ widget through cairo. I have improved that code by ensuring we do not waste memory when using HiDPI.

As part of our proof of concept investigation, we got this WebGL car visualizer running quite well on our sabrelite imx6 boards. Some of it went into the upstream patches or proposals mentioned below, but we have a bunch of potential improvements still in store that we hope to turn into upstreamable patches and advance during next week’s hackfest.

One of the improvements that already landed was an alternate code path that leverages GTK+’s recent GL super powers to render using gdk_cairo_draw_from_gl(), avoiding the expensive copying of pixels from the GPU to the CPU and making it go faster. That improvement exposed a weird bug in GTK+ that causes a black patch to appear when shrinking the window, which I have a tentative fix for.

We originally proposed to add a new gdk_cairo_draw_from_egl() to use an EGLImage instead of a GL texture or renderbuffer. On our proof of concept we noticed it is even more efficient than the texturing currently used by GTK+, and could give us even better performance for WebKitGTK+. Emanuele Bassi thinks it might be better to add EGLImage as another code branch inside from_gl() though, so we will look into that.

Another very interesting igalian addition to this release is support for the MemoryPressureHandler even on systems with no cgroups set up. The memory pressure handler is a WebKit feature which flushes caches and frees resources that are not being used when the operating system notifies it memory is scarce.

We worked with the Raspberry Pi Foundation to add support for that feature to the Raspberry Pi browser and contributed it upstream back in 2014, when Collabora was trying to squeeze as much as possible from the hardware. We had to add a cgroups setup to wrap Epiphany in, back then, so that it would actually benefit from the feature.

With this improvement, it will benefit even without the custom cgroups setups as well, by having the UIProcess monitor memory usage and notify each WebProcess when memory is tight.

Some of these improvements were achieved by developers getting together at the Web Engines Hackfest last year and laying out the ground work or ideas that ended up in the code base. I look forward to another great few days of hackfest next week! See you there o/

By kov at September 22, 2016 05:03 PM

December 15, 2014

Web Engines Hackfest 2014

Gustavo Noronha

For the 6th year in a row, Igalia has organized a hackfest focused on web engines. The 5 years before this one were actually focused on the GTK+ port of WebKit, but the number of web engines that matter to us as Free Software developers and consultancies has grown, and so has the scope of the hackfest.

It was a very productive and exciting event. It has already been covered by Manuel RegoPhilippe Normand, Sebastian Dröge and Andy Wingo! I am sure more blog posts will pop up. We had Martin Robinson telling us about the new Servo engine that Mozilla has been developing as a proof of concept for both Rust as a language for building big, complex products and for doing layout in parallel. Andy gave us a very good summary of where JS engines are in terms of performance and features. We had talks about CSS grid layouts, TyGL – a GL-powered implementation of the 2D painting backend in WebKit, the new Wayland port, announced by Zan Dobersek, and a lot more.

With help from my colleague ChangSeok OH, I presented a description of how a team at Collabora led by Marco Barisione made the combination of WebKitGTK+ and GNOME’s web browser a pretty good experience for the Raspberry Pi. It took a not so small amount of both pragmatic limitations and hacks to get to a multi-tab browser that can play youtube videos and be quite responsive, but we were very happy with how well WebKitGTK+ worked as a base for that.

One of my main goals for the hackfest was to help drive features that were lingering in the bug tracker for WebKitGTK+. I picked up a patch that had gone through a number of iterations and rewrites: the HTML5 notifications support, and with help from Carlos Garcia, managed to finish it and land it at the last day of the hackfest! It provides new signals that can be used to authorize notifications, show and close them.

To make notifications work in the best case scenario, the only thing that the API user needs to do is handle the permission request, since we provide a default implementation for the show and close signals that uses libnotify if it is available when building WebKitGTK+. Originally our intention was to use GNotification for the default implementation of those signals in WebKitGTK+, but it turned out to be a pain to use for our purposes.

GNotification is tied to GApplication. This allows for some interesting features, like notifications being persistent and able to reactivate the application, but those make no sense in our current use case, although that may change once service workers become a thing. It can also be a bit problematic given we are a library and thus have no GApplication of our own. That was easily overcome by using the default GApplication of the process for notifications, though.

The show stopper for us using GNotification was the way GNOME Shell currently deals with notifications sent using this mechanism. It will look for a .desktop file named after the application ID used to initialize the GApplication instance and reject the notification if it cannot find that. Besides making this a pain to test – our test browser would need a .desktop file to be installed, that would not work for our main API user! The application ID used for all Web instances is org.gnome.Epiphany at the moment, and that is not the same as any of the desktop files used either by the main browser or by the web apps created with it.

For the future we will probably move Epiphany towards this new era, and all users of the WebKitGTK+ API as well, but the strictness of GNOME Shell would hurt the usefulness of our default implementation right now, so we decided to stick to libnotify for the time being.

Other than that, I managed to review a bunch of patches during the hackfest, and took part in many interesting discussions regarding the next steps for GNOME Web and the GTK+ and Wayland ports of WebKit, such as the potential introduction of a threaded compositor, which is pretty exciting. We also tried to have Bastien Nocera as a guest participant for one of our sessions, but it turns out that requires more than a notebook on top of a bench hooked up to   a TV to work well. We could think of something next time ;D.

I’d like to thank Igalia for organizing and sponsoring the event, Collabora for sponsoring and sending ChangSeok and myself over to Spain from far away Brazil and South Korea, and Adobe for also sponsoring the event! Hope to see you all next year!

Web Engines Hackfest 2014 sponsors: Adobe, Collabora and Igalia

Web Engines Hackfest 2014 sponsors: Adobe, Collabora and Igalia

By kov at December 15, 2014 11:20 PM