March 23, 2023

Release Notes for Safari Technology Preview 166

Surfin’ Safari

Safari Technology Preview Release 166 is now available for download for macOS Monterey 12.3 or later and macOS Ventura. If you already have Safari Technology Preview installed, you can update it in the Software Update pane of System Preferences on macOS Monterey, or System Settings under General → Software Update on macOS Ventura.

This release includes WebKit changes between: 260849@main…261247@main.

Web Inspector

  • Added input fields for editing variation axes values in Fonts sidebar panel (261162@main)

CSS

JavaScript

  • Implemented RegExp v flag with set notation and properties of strings (261188@main)
  • Enabled new WASM baseline JIT (BBQ) for increased performance (261153@main)
  • Inlined Proxy [[Set]] trap in DFG / FTL (261058@main)
  • Made C++ to JS calls faster (260858@main)

Popover

  • Enabled the popover attribute (261193@main)
  • Implemented [popover=auto] and light dismiss behavior (261093@main)

Media

  • Fixed picture-in-picture video snapping to incorrect size (261383@main)

MSE

  • Changed to only fire durationchange after parsing the media buffer (261029@main)

Web API

  • Added support for preconnect via HTTP early hints (261079@main)
  • Added Cancel, Unknown, and Clear keycodes (261008@main)
  • Added selection API that works across shadow boundaries (261021@main)
  • Added support for largeBlob extension for the local authenticator (260958@main)
  • Adjusted text input scrollWidth and scrollHeight to include padding and any whitespace added by decorations (261121@main)
  • Fixed translation for shadow DOM content (261096@main)
  • Fixed translation to treat a floating list item as its own paragraph (261114@main)
  • Fixed Speech Recognition API terminating after one utterance or a short time (260886@main)
  • Fixed window.onload getting repeatedly re-executed when changing the URL fragment during onload (260860@main)
  • Improved support for prioritized HTTPS navigations (261022@main)
  • Stripped tab and newline from Location, URL, <a>, and <area>’s protocol setter (261017@main)

Accessibility

  • Fixed input[type=date] individual fields getting announced as “group” (261123@main)
  • Fixed the wrong role displayed for input in Web Inspector (260868@main)

March 23, 2023 07:07 PM

March 20, 2023

Enabling the Inspection of Web Content in Apps

Surfin’ Safari

Web Inspector is a powerful tool that allows you to debug the layout of web pages, step through JavaScript, read messages logged to the console, and more. In Safari on macOS, you can use Web Inspector to inspect web pages, extensions, and service workers. iOS and iPadOS allow inspection of the same content as macOS, with the addition of Home Screen web apps.

Web content and JavaScript are used for various purposes in apps, from providing UI from a webpage to enabling apps to be scriptable. Previously, Web Inspector supported inspecting developer-provisioned apps built directly from Xcode for local development, meaning developers could debug this content so long as the app was installed for development. However, released versions of apps had no way to inspect dynamic web content or scripts, so developers and users had to resort to more complicated workflows to get information that Web Inspector would otherwise provide. Now, this same functionality is available through an API on WKWebView and JSContext.

How do I enable inspection?

Across all platforms supporting WKWebView or JSContext, a new property is available called isInspectable (inspectable in Objective-C). It defaults to false, and you can set it to true to opt in to content being inspectable. This decision is made for each individual WKWebView and JSContext, so that inspection is never enabled unintentionally for a view or context you don’t intend to be inspectable. So, for example, to make a WKWebView inspectable, you would:

Swift
let webConfiguration = WKWebViewConfiguration()
let webView = WKWebView(frame: .zero, configuration: webConfiguration)
webView.isInspectable = true
Objective-C
WKWebViewConfiguration *webConfiguration = [WKWebViewConfiguration new];
WKWebView *webView = [[WKWebView alloc] initWithFrame:CGRectZero configuration:webConfiguration];
webView.inspectable = YES;

For JSContext, matching API is available, with the addition of C API for developers using JSGlobalContextRef:

Swift
let jsContext = JSContext()
jsContext?.isInspectable = true
Objective-C
JSContext *jsContext = [JSContext new];
jsContext.inspectable = YES;
C
JSGlobalContextRef jsContextRef = JSGlobalContextCreate(NULL);
JSGlobalContextSetInspectable(jsContextRef, true);

The inspectable property can be changed at any point during the lifetime of your WKWebView or JSContext. Disabling inspection while Web Inspector actively inspects the content will automatically close Web Inspector, and no further information about the content will be available.

Once you’ve enabled inspection for your app, you can inspect it from Safari’s Develop menu in the submenu for either your current computer or an attached device. For iOS and iPadOS, you must also have enabled Web Inspector in the Settings app under Safari > Advanced > Web Inspector. You do not need to enable Web Inspector for simulators; it is always enabled. Learn more about enabling Web Inspector…

Develop Menu > Patrick's iPhone > Example App

When should I consider making content inspectable?

A common situation in which you may want the content of WKWebView to be inspectable is in an in-app web browser. The browser shows ordinary web content that would be inspectable when loaded in Safari. It can be beneficial both for the app developer, as well as web authors, to be able to inspect content in these views, as the size of the view may not match that of Safari’s, or the app developer may be injecting script into the view to provide integration with their app.

Web content is often dynamic, delivered by a server—not in the app—and easily changed over time. Unfortunately, not all issues can or will get debugged by folks with access to a developer-provisioned copy of your app.

JSContext can also enable scripting in an app whereby the customer provides the scripts to augment the app. Without the ability for a release version of your app to adopt inspectability, your customers may have no way to debug the scripts they have written, which makes it harder for them to use this functionality of your app.

Provide readable names for inspectable JSContexts

Unlike WKWebView, which automatically gets a name based on the page currently loaded in the view, every JSContext with inspectable enabled will be listed as JSContext in Safari’s Develop menu. We recommend providing a unique, human-readable name for each inspectable JSContext to make it easier for you and your customers to determine what the JSContext represents. For example, if your app runs different pieces of JavaScript on behalf of the user, you should give each JSContext a name based on what runs inside the context.

API is available to set the user-visible name of a JSContext:

Swift
let jsContext = JSContext()
jsContext?.name = "Context name"
Objective-C
JSContext *jsContext = [JSContext new];
jsContext.name = @"Context name";
C
JSGlobalContextRef jsContextRef = JSGlobalContextCreate(NULL);
JSGlobalContextSetName(jsContextRef, JSStringCreateWithUTF8CString("Context name"));

Working with older versions of macOS and iOS

For apps linked against an SDK before macOS 13.3 and iOS 16.4, WKWebViews and JSContexts will continue to follow the previous behavior of always being inspectable when built for debugging from Xcode.

Apps that support older versions of macOS and iOS while linked against the most recent SDK will not get the previous behavior of all content being inspectable in debug builds. This avoids confusion about what will and will not be inspectable by customers. Apps targeting older OS versions but linking against the new SDK can use this new API conditionally on OS versions that support it. To conditionally guard usage of the API:

Swift
if #available(macOS 13.3, iOS 16.4, tvOS 16.4, *) {
    webView.isInspectable = true
}
Objective-C
if (@available(macOS 13.3, iOS 16.4, tvOS 16.4, *))
    webView.inspectable = YES;

You can learn more about guarding usage of new API on developer.apple.com.

Feedback

As you explore this new API, please help us by providing feedback if you encounter problems. For issues using this new API, please file feedback from your Mac, iPhone, or iPad. Feedback Assistant will collect the information needed to help us understand what’s happening. For any issues you may experience with Web Inspector itself once inspecting your app’s content, please file a bug on bugs.webkit.org.

Also, we love hearing from you. You can find us on Mastodon at @patrickangle@hachyderm.io, @jensimmons@front-end.social, and @jondavis@mastodon.social.

Note: Learn more about Web Inspector from the Web Inspector Reference documentation.

March 20, 2023 06:54 PM

March 14, 2023

Víctor Jáquez: Review of Igalia Multimedia activities (2022)

Igalia WebKit

We, Igalia’s multimedia team, would like to share with you our list of achievements during 2022.

WebKit Multimedia

WebRTC

Phil already wrote the first blog post of a series on this topic: WebRTC in WebKitGTK and WPE, status updates, part I. Please be sure to give it a glance; it has nice videos.

Long story short, last year we started to support Media Capture and Streams in WebKitGTK and WPE using GStreamer, for input devices (camera and microphone), desktop sharing, WebAudio, and web canvas. But this is just the first step. We are currently working on RTCPeerConnection, also using GStreamer, to share all these captured streams with other web peers. Meanwhile, we’ll wait for the second episode of Phil’s series 🙂

MediaRecorder

We worked on an initial implementation of MediaRecorder with GStreamer (1.20 or later). The specification allows a web browser to record a selected stream, for example in a voice-memo or video application that encodes and uploads a capture of your microphone or camera.

Gamepad

While WebKitGTK already has Gamepad support, WPE lacked it. We did the implementation last year, and there’s a blog post about it: Gamepad in WPEWebkit, with video showing a demo of it.

Capture encoded video streams from webcams

Some webcams only provide high-resolution frames encoded in H.264 or similar formats. To support these resolutions with those webcams, we added support for negotiating those formats and decoding them internally to handle the streams. We are, though, just at the beginning of more efficient support.

Flatpak SDK maintenance

A lot of effort went into maintaining the Flatpak SDK for WebKit. It is a set of runtimes that allows a reproducible build of WebKit, independent of the Linux distribution used. Nowadays the Flatpak SDK is used in WebKit’s EWS, and by many developers.

Among all the features added during the year, we can highlight Rust support, a full integrity check before upgrading, and a way to override dependencies as local projects.

MSE/EME enhancements

As in every year, massive work was done in WebKit ports using GStreamer for Media Source Extensions and Encrypted Media Extensions, improving the user experience with different streaming services on the Web, such as Odysee, Amazon, DAZN, etc.

In the case of encrypted media, GStreamer-based WebKit ports provide the stubs to communicate with an external Content Decryption Module (CDM). If you would like to support this on your platform, you can reach us.

We also worked on a video demo showing how MSE/EME works on a Raspberry Pi 3 using WPE.

WebAudio demo

We also spent time recording video demos, such as this one, showing WebAudio using WPE on a desktop computer.

GStreamer

We managed to merge a lot of bug fixes in GStreamer, which in many cases can be harder than implementing new features, though the latter are more interesting to talk about, such as those related to making Rust the main development language for GStreamer besides C.

Rust bindings and GStreamer elements for Vonage Video API / OpenTok

OpenTok is the legacy name of Vonage Video API, and is a PaaS (Platform As a Service) to ease the development and deployment of WebRTC services and applications.

We published on GitHub our Rust bindings for both the Client SDK for Linux and the Server SDK using the REST API, along with a GStreamer plugin to publish and subscribe to video and audio streams.

GstWebRTCSrc

In the beginning there was webrtcbin, an element that implements the majority of the W3C RTCPeerConnection API. It’s so flexible and powerful that it’s rather hard to use for the most common cases. Then webrtcsink appeared, a wrapper of webrtcbin, written in Rust, which receives GStreamer streams that are offered and streamed to web peers. Later on, we developed webrtcsrc, the webrtcsink counterpart: an element whose source pads push streams received from web peers, such as another browser, forwarding those Web streams as GStreamer ones in a pipeline. Both webrtcsink and webrtcsrc are written in Rust.

Behavior-Driven Development test framework for GStreamer

Behavior-Driven Development is gaining relevance with tools like Cucumber for Java and its domain-specific language, Gherkin, used to define software behaviors. Rustaceans have picked up these ideas and developed cucumber-rs. The logical consequence was obvious: Why not GStreamer?

Last year we tinkered with GStreamer-Cucumber, a BDD framework to define behavior tests for GStreamer pipelines.

GstValidate Rust bindings

There has been some discussion about whether BDD is the best way to test GStreamer pipelines. There is also GstValidate, and last year we added its Rust bindings.

GStreamer Editing Services

Not everything was Rust, though. We also work hard on GStreamer’s nuts and bolts.

Last year, we gathered the team to hack on GStreamer Editing Services, particularly to explore adding OpenGL and DMABuf support, such as downloading or uploading a texture before processing, and selecting a proper filter to avoid those transfers.

GstVA and GStreamer-VAAPI

We helped with the maintenance of GStreamer-VAAPI and the development of its upcoming replacement, GstVA, adding new elements such as the H.264 encoder, the compositor, and the JPEG decoder. We also took part in the debate and code review on negotiating DMABuf streams in the pipeline.

Vulkan decoder and parser library for CTS

You might have heard that Vulkan has now integrated video decoding into its API, while encoding is currently a work in progress. We devoted time to helping Khronos with the Vulkan Video Conformance Tests (CTS), particularly with a parser based on GStreamer, and to developing an H.264 decoder in GStreamer using the Vulkan Video API.

You can check the presentation we gave at the last Vulkanised.

WPE Android Experiment

In a joint adventure with Igalia’s WebKit team, we did some experiments to port WPE to Android. This is just an internal proof of concept so far, but we are looking forward to seeing how this will evolve in the future, and what new possibilities this might open up.

If you have any questions about WebKit, GStreamer, Linux video stack, compilers, etc., please contact us.

By vjaquez at March 14, 2023 12:00 PM

March 08, 2023

Release Notes for Safari Technology Preview 165

Surfin’ Safari

Safari Technology Preview Release 165 is now available for download for macOS Monterey 12.3 or later and macOS Ventura. If you already have Safari Technology Preview installed, you can update it in the Software Update pane of System Preferences on macOS Monterey, or System Settings under General → Software Update on macOS Ventura.

This release includes WebKit changes between: 260164@main…260848@main.

Web Inspector

  • Added support for color-mix CSS values in the Styles details sidebar of the Elements tab (260332@main)
  • Added setting to always show rulers when highlighting elements (260416@main)

CSS

  • Added support for text-transform: full-size-kana (260307@main)
  • Added support for margin-trim for floats in block containers that contain only block boxes (260318@main)
  • Added support for x units in calc() function (260678@main)
  • Added support to image-set() for resolution and type as optional arguments (260796@main)
  • Fixed preserve-3d not being applied to pseudo elements (260324@main)
  • Fixed opacity not applying to dialog element ::backdrop pseudo-element (260556@main)
  • Fixed the background to not propagate when contain: paint is set on the body or root element (260766@main)
  • Fixed table-layout: fixed not being applied when width is max-content (260501@main)
  • Fixed font-optical-sizing: auto having no effect (260447@main)

JavaScript

Layout

  • Fixed accounting of margins in multi-column layout before forced breaks (260510@main)
  • Fixed floats with clear to not be placed incorrectly (260674@main)

Media

  • Fixed SourceBuffer.timestampOffset not behaving correctly with webm content (260822@main)
  • Fixed HDR data to no longer be clipped in AVIF images (260512@main)

Forms

  • Fixed resetting the value of an input type=file to null to make the input invalid (260688@main)
  • Fixed minlength/maxlength attributes to rely on code units instead of grapheme clusters (260838@main)

Web Animations

  • Added support for the length property of CSSKeyframesRule (260400@main)
  • Changed animation of mask-image to be discrete (260756@main)
  • Fixed custom properties not being treated as valid in the shorthand animation property (260759@main)
  • Fixed transition-property: all not applying to custom properties (260384@main)

WebCrypto

  • Fixed Secure Curves not having a namedCurve property (260599@main)

WebGL

  • Fixed restored WebGL context not being visible until layout (260693@main)

Loading

  • Fixed lazily loaded frames to get a contentWindow/contentDocument as soon as they get inserted into the document (260713@main)
  • Fixed frames to not be lazily loaded if they have an invalid or about:blank URL (260612@main)

Web API

Accessibility

  • Fixed aria-errormessage to not be exposed when aria-invalid is false (260545@main)
  • Fixed text associated with various types of elements not being exposed (260521@main)
  • Fixed invalid summary elements to not be exposed as interactive (260546@main)
  • Fixed some inputs not being treated as invalid despite being rendered as such (260544@main)

Web Extensions

  • Fixed Content Blocker API ignoring some CSS selectors with uppercase letters (260638@main)

March 08, 2023 10:03 PM

March 07, 2023

WPE WebKit Blog: Integrating WPE: URI Scheme Handlers and Script Messages

Igalia WebKit

Most Web content is designed entirely for screen display—and there is a lot of it—so it will spend its life in the somewhat restricted sandbox implemented by a web browser. But rich user interfaces using Web technologies in all kinds of consumer devices require some degree of integration, an escape hatch to interact with the rest of their software and hardware. This is where a Web engine like WPE designed to be embeddable shines: not only does WPE provide a stable API, it is also comprehensive in supporting a number of ways to integrate with its environment further than the plethora of available Web platform APIs.

Integrating a “Web view” (the main entry point of the WPE embedding API) involves providing extension points, which allow the Web content (HTML/CSS/JavaScript) it loads to call into native code provided by the client application (typically written in C/C++) from JavaScript, and vice versa. There are a number of ways in which this can be achieved:

  • URI scheme handlers allow native code to register a custom URI scheme, which will run a user provided function to produce content that can be “fetched” regularly.
  • User script messaging can be used to send JSON messages from JavaScript running in the same context as Web pages to a user function, and vice versa.
  • The JavaScriptCore API is a powerful solution to provide new JavaScript functionality to Web content seamlessly, almost as if they were implemented inside the Web engine itself—akin to NodeJS C++ addons.

In this post we will explore the first two, as they can support many interesting use cases without introducing the additional complexity of extending the JavaScript virtual machine. Let’s dive in!

Intermission

We will be referring to the code of a tiny browser written for the occasion. Telling WebKit how to call our native code involves creating a WebKitUserContentManager, customizing it, and then associating it with web views during their creation. The only exception to this is URI scheme handlers, which are registered using webkit_web_context_register_uri_scheme(). This minimal browser includes an on_create_view function, which is the perfect place to do the configuration:

static WebKitWebView*
on_create_view(CogShell *shell, CogPlatform *platform)
{
    g_autoptr(GError) error = NULL;
    WebKitWebViewBackend *view_backend = cog_platform_get_view_backend(platform, NULL, &error);
    if (!view_backend)
        g_error("Cannot obtain view backend: %s", error->message);

    g_autoptr(WebKitUserContentManager) content_manager = create_content_manager();  /** NEW! **/
    configure_web_context(cog_shell_get_web_context(shell));                         /** NEW! **/
 
    g_autoptr(WebKitWebView) web_view =
        g_object_new(WEBKIT_TYPE_WEB_VIEW,
                     "user-content-manager", content_manager,  /** NEW! **/
                     "settings", cog_shell_get_web_settings(shell),
                     "web-context", cog_shell_get_web_context(shell),
                     "backend", view_backend,
                     NULL);
    cog_platform_init_web_view(platform, web_view);
    webkit_web_view_load_uri(web_view, s_starturl);
    return g_steal_pointer(&web_view);
}

What is g_autoptr? Does it relate to g_steal_pointer? This does not look like C!

In the shown code examples, g_autoptr(T) is a preprocessor macro provided by GLib that declares a pointer variable of the T type, and arranges for freeing resources automatically when the variable goes out of scope. For objects this results in g_object_unref() being called.

Internally the macro takes advantage of the __attribute__((cleanup(...))) compiler extension, which is supported by GCC and Clang. GLib also includes a convenience macro that can be used to define cleanups for your own types.

As for g_steal_pointer, it is useful to indicate that ownership of a pointer declared with g_autoptr is transferred out of the current scope. It returns the pointer held by the variable passed to it and resets that variable to NULL, thus preventing cleanup functions from running.
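
To make the mechanism more tangible, here is a minimal sketch that uses the compiler extension directly, without GLib. The names auto_buffer, free_buffer, and steal_buffer are hypothetical stand-ins invented for this illustration (they are not GLib API), and the sketch assumes GCC or Clang:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts cleanup invocations so the effect can be observed. */
static int cleanup_calls = 0;

/* Cleanup handler: the compiler passes a pointer TO the annotated variable. */
static void free_buffer(char **slot)
{
    if (*slot != NULL) {
        free(*slot);
        cleanup_calls++;
    }
}

/* Hypothetical stand-in for g_autoptr(), for plain char buffers only. */
#define auto_buffer __attribute__((cleanup(free_buffer))) char *

/* Hypothetical stand-in for g_steal_pointer(): hand out the pointer and
 * leave NULL behind, so the cleanup handler has nothing to free. */
static char *steal_buffer(char **slot)
{
    char *stolen = *slot;
    *slot = NULL;
    return stolen;
}

/* The temporary copy is freed automatically when the function returns. */
static size_t measure_copy(const char *text)
{
    auto_buffer copy = malloc(strlen(text) + 1);
    assert(copy != NULL);
    strcpy(copy, text);
    return strlen(copy);
}

/* Ownership moves to the caller: the cleanup handler runs, but sees NULL. */
static char *make_copy(const char *text)
{
    auto_buffer copy = malloc(strlen(text) + 1);
    assert(copy != NULL);
    strcpy(copy, text);
    return steal_buffer(&copy);
}
```

Calling measure_copy() releases its buffer on return, while the buffer returned by make_copy() stays valid and must be released by the caller with free(). This mirrors how on_create_view hands its web view over with g_steal_pointer.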

The size has been kept small thanks to reusing code from the Cog core library. As a bonus, it should run on Wayland, X11, and even on a bare display using the DRM/KMS subsystem directly. Compiling and running it, assuming you already have the dependencies installed, should be as easy as running:

cc -o minicog minicog.c $(pkg-config cogcore --libs --cflags)
./minicog wpewebkit.org

If the current session kind is not automatically detected, a second parameter can be used to manually choose among wl (Wayland), x11, drm, and so on:

./minicog wpewebkit.org x11

The full, unmodified source for this minimal browser is included right below.

Complete minicog.c source (Gist)

/*
 * SPDX-License-Identifier: MIT
 *
 * cc -o minicog minicog.c $(pkg-config wpe-webkit-1.1 cogcore --cflags --libs)
 */
 
#include <cog/cog.h>
 
static const char *s_starturl = NULL;
 
static WebKitWebView*
on_create_view(CogShell *shell, CogPlatform *platform)
{
    g_autoptr(GError) error = NULL;
    WebKitWebViewBackend *view_backend = cog_platform_get_view_backend(platform, NULL, &error);
    if (!view_backend)
        g_error("Cannot obtain view backend: %s", error->message);
 
    g_autoptr(WebKitWebView) web_view =
        g_object_new(WEBKIT_TYPE_WEB_VIEW,
                     "settings", cog_shell_get_web_settings(shell),
                     "web-context", cog_shell_get_web_context(shell),
                     "backend", view_backend,
                     NULL);
    cog_platform_init_web_view(platform, web_view);
    webkit_web_view_load_uri(web_view, s_starturl);
    return g_steal_pointer(&web_view);
}
 
int
main(int argc, char *argv[])
{
    g_set_application_name("minicog");
 
    if (argc != 2 && argc != 3) {
        g_printerr("Usage: %s [URL [platform]]\n", argv[0]);
        return EXIT_FAILURE;
    }
 
    g_autoptr(GError) error = NULL;
    if (!(s_starturl = cog_uri_guess_from_user_input(argv[1], TRUE, &error)))
        g_error("Invalid URL '%s': %s", argv[1], error->message);
 
    cog_modules_add_directory(COG_MODULEDIR);
 
    g_autoptr(GApplication) app = g_application_new(NULL, G_APPLICATION_DEFAULT_FLAGS);
    g_autoptr(CogShell) shell = cog_shell_new("minicog", FALSE);
    g_autoptr(CogPlatform) platform =
        cog_platform_new((argc == 3) ? argv[2] : g_getenv("COG_PLATFORM"), &error);
    if (!platform)
        g_error("Cannot create platform: %s", error->message);
 
    if (!cog_platform_setup(platform, shell, "", &error))
        g_error("Cannot setup platform: %s\n", error->message);
 
    g_signal_connect(shell, "create-view", G_CALLBACK(on_create_view), platform);
    g_signal_connect_swapped(app, "shutdown", G_CALLBACK(cog_shell_shutdown), shell);
    g_signal_connect_swapped(app, "startup", G_CALLBACK(cog_shell_startup), shell);
    g_signal_connect(app, "activate", G_CALLBACK(g_application_hold), NULL);
 
    return g_application_run(app, 1, argv);
}

URI Scheme Handlers

“Railroad” diagram of URI syntax (CC BY-SA 4.0, source). Notice the “scheme” component at the top left.

A URI scheme handler allows “teaching” the web engine how to handle any load (pages, subresources, the Fetch API, XMLHttpRequest, …). If you ever wondered how Firefox implements about:config or how Chromium does chrome://flags, this is it. WPE WebKit has public API for this as well. Roughly:

  1. A custom URI scheme is registered using webkit_web_context_register_uri_scheme(). This also associates a callback function to it.
  2. When WebKit detects a load for the scheme, it invokes the provided function, passing a WebKitURISchemeRequest.
  3. The function generates data to be returned as the result of the load, as a GInputStream and calls webkit_uri_scheme_request_finish(). This sends the stream to WebKit as the response, indicating the length of the response (if known), and the MIME content type of the data in the stream.
  4. WebKit will now read the data from the input stream.

Echoes

Let’s add an echo handler to our minimal browser that replies back with the requested URI. Registering the scheme is straightforward enough:

static void
configure_web_context(WebKitWebContext *context)
{
    webkit_web_context_register_uri_scheme(context,
                                           "echo",
                                           handle_echo_request,
                                           NULL /* userdata */,
                                           NULL /* destroy_notify */);
}

What are “user data” and “destroy notify”?

The userdata parameter above is a convention used in many C libraries, especially those based on GLib, when callback functions are involved. It allows the user to supply a pointer to arbitrary data, which will be passed as a parameter to the callback (handle_echo_request in the example) when it gets invoked later on.

As for the destroy_notify parameter, it allows passing a function with the signature void func(void*) (type GDestroyNotify) which is invoked with userdata as the argument once the user data is no longer needed. In the example above, this callback function would be invoked when the URI scheme is unregistered. Or, from a different perspective, this callback is used to notify that the user data can now be destroyed.
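
The convention can be sketched in plain C, independently of GLib or WebKit. Everything below (the registration struct, its functions, and the stats example) is hypothetical and written only for this illustration; it shows the general shape of the pattern, not how WebKit stores its handlers:

```c
#include <assert.h>
#include <stddef.h>

typedef void (*handler_fn)(const char *uri, void *userdata);
typedef void (*destroy_fn)(void *userdata);

/* One registered callback together with the data that travels with it. */
typedef struct {
    handler_fn handler;
    void *userdata;
    destroy_fn destroy_notify;
} registration;

static void registration_set(registration *reg, handler_fn handler,
                             void *userdata, destroy_fn destroy_notify)
{
    assert(reg != NULL && handler != NULL);
    reg->handler = handler;
    reg->userdata = userdata;
    reg->destroy_notify = destroy_notify;
}

/* The library never looks inside userdata: it only passes it along. */
static void registration_dispatch(const registration *reg, const char *uri)
{
    reg->handler(uri, reg->userdata);
}

/* On unregistration, tell the owner that userdata is no longer needed. */
static void registration_clear(registration *reg)
{
    if (reg->destroy_notify != NULL)
        reg->destroy_notify(reg->userdata);
    reg->handler = NULL;
    reg->userdata = NULL;
    reg->destroy_notify = NULL;
}

/* Example user data: counts requests and records when it was released. */
typedef struct {
    int requests;
    int destroyed;
} stats;

static void count_request(const char *uri, void *userdata)
{
    (void) uri;
    stats *s = userdata;
    s->requests++;
}

/* A real program would typically free heap memory here instead. */
static void mark_destroyed(void *userdata)
{
    stats *s = userdata;
    s->destroyed = 1;
}
```

In this sketch the destroy notify only sets a flag so that the effect is observable. With heap-allocated user data, passing free (or g_free in GLib code) as the destroy notify is the usual choice.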

One way of implementing handle_echo_request() could be to take the request URI, which is part of the WebKitURISchemeRequest parameter to the handler, stash it into a GBytes container, and create an input stream to read back its contents:

static void
handle_echo_request(WebKitURISchemeRequest *request, void *userdata)
{
    const char *request_uri = webkit_uri_scheme_request_get_uri(request);
    g_print("Request URI: %s\n", request_uri);

    g_autoptr(GBytes) data = g_bytes_new(request_uri, strlen(request_uri));
    g_autoptr(GInputStream) stream = g_memory_input_stream_new_from_bytes(data);

    webkit_uri_scheme_request_finish(request, stream, g_bytes_get_size(data), "text/plain");
}

Note how we need to tell WebKit how to finish the load request, in this case only with the data stream, but it is possible to have more control of the response or return an error.

With these changes, it is now possible to make page loads from the new custom URI scheme:

Screenshot of the minicog browser loading a custom echo:// URI. It worked!

Et Tu, CORS?

The main roadblock one may find when using custom URI schemes is that loads are affected by CORS checks. Not only that, WebKit by default does not allow sending cross-origin requests to custom URI schemes. This is by design: instead of accidentally leaking potentially sensitive data to websites, developers embedding a web view need to consciously opt in to allowing CORS requests and send back suitable Access-Control-Allow-* response headers.

In practice, the additional setup involves retrieving the WebKitSecurityManager being used by the WebKitWebContext and registering the scheme as CORS-enabled. Then, in the handler function for the custom URI scheme, create a WebKitURISchemeResponse, which allows fine-grained control of the response, including setting headers, and finish the request with webkit_uri_scheme_request_finish_with_response() instead.

Note that WebKit cuts some corners when using CORS with custom URI schemes: handlers will not receive preflight OPTIONS requests. Instead, the CORS headers from the replies are inspected, and if access needs to be denied then the data stream with the response contents is discarded.
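
As a rough mental model of that last step, the following plain C sketch decides whether a response body is delivered based on the Access-Control-Allow-Origin header of the reply. It is a deliberate simplification and not WebKit's implementation: the function names are hypothetical, and credentials, same-origin loads, and the other Access-Control-Allow-* headers are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified check: the reply must carry an Access-Control-Allow-Origin
 * header that is either "*" or exactly the origin that made the request. */
static bool cors_allows(const char *request_origin, const char *allow_origin_header)
{
    assert(request_origin != NULL);
    if (allow_origin_header == NULL)
        return false; /* No header in the reply: access is denied. */
    if (strcmp(allow_origin_header, "*") == 0)
        return true;
    return strcmp(allow_origin_header, request_origin) == 0;
}

/* The handler has already produced the body by the time the check runs.
 * When access is denied the body is dropped, modeled here as NULL. */
static const char *deliver_body(const char *request_origin,
                                const char *allow_origin_header,
                                const char *body)
{
    return cors_allows(request_origin, allow_origin_header) ? body : NULL;
}
```

The point of the model is the ordering: the scheme handler always runs and produces its data, and the decision to hand that data to the page is taken afterwards from the headers of the reply.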

In addition to providing a complete CORS-enabled custom URI scheme example, we recommend the Will It CORS? tool to help troubleshoot issues.

Further Ideas

Once we have WPE WebKit calling into our custom code, there are no limits to what a URI scheme handler can do—as long as it involves replying to requests. Here are some ideas:

  • Allow pages to access a subset of paths from the local file system in a controlled way (as CORS applies). For inspiration, see CogDirectoryFilesHandler.
  • Package all your web application assets into a single ZIP file, making loads from app:/... fetch content from it. Or, make the scheme handler load data using GResource and bundle the application inside your program.
  • Use the presence of a well-known custom URI to have a web application realize that it is running on a certain device, and make its user interface adapt accordingly.
  • Provide a REST API, which internally calls into NetworkManager to list and configure wireless network adapters. Combine it with a local web application and embedded devices can now easily get on the network.

User Script Messages

While URI scheme handlers allow streaming large chunks of data back into the Web engine, for exchanging smaller pieces of information in a more programmatic fashion it may be preferable to exchange messages without the need to trigger resource loads. The user script messages part of the WebKitUserContentManager API can be used this way:

  1. Register a user message handler with webkit_user_content_manager_register_script_message_handler(). As opposed to URI scheme handlers, this only enables receiving messages, but does not associate a callback function yet.
  2. Associate a callback to the script-message-received signal. The signal detail should be the name of the registered handler.
  3. Now, whenever JavaScript code calls window.webkit.messageHandlers.<name>.postMessage(), the signal is emitted, and the native callback function is invoked.

Haven't I seen postMessage() elsewhere?

Yes, you have. The name is the same because it provides a similar functionality (send a message), it guarantees little (the receiver should validate messages), and there are similar restrictions in the kind of values that can be passed along.

It’s All JavaScript

Let’s add a feature to our minimal browser that will allow JavaScript code to trigger rebooting or powering off the device where it is running. While this should definitely not be functionality exposed to the open Web, it is perfectly acceptable in an embedded device where we control what gets loaded with WPE, and that exclusively uses a web application as its user interface.

Image: Pepe Silvia conspiracy meme, with the text “It's all JavaScript” superimposed.
Caption: Yet most of the code shown in this post is C.

First, create a WebKitUserContentManager, register the message handler, and connect a callback to its associated signal:

static WebKitUserContentManager*
create_content_manager(void)
{
    g_autoptr(WebKitUserContentManager) content_manager = webkit_user_content_manager_new();
    webkit_user_content_manager_register_script_message_handler(content_manager, "powerControl");
    g_signal_connect(content_manager, "script-message-received::powerControl",
                     G_CALLBACK(handle_power_control_message), NULL);
    return g_steal_pointer(&content_manager);
}

The callback receives a WebKitJavascriptResult, from which we can get the JSCValue with the contents of the parameter passed to the postMessage() function. The JSCValue can now be inspected to check for malformed messages and determine the action to take, and then arrange to call reboot():

static void
handle_power_control_message(WebKitUserContentManager *content_manager,
                             WebKitJavascriptResult *js_result, void *userdata)
{
    JSCValue *value = webkit_javascript_result_get_js_value(js_result);
    if (!jsc_value_is_string(value)) {
        g_warning("Invalid powerControl message: argument is not a string");
        return;
    }

    g_autofree char *value_as_string = jsc_value_to_string(value);
    int action;
    if (strcmp(value_as_string, "poweroff") == 0) {
        action = RB_POWER_OFF;
    } else if (strcmp(value_as_string, "reboot") == 0) {
        action = RB_AUTOBOOT;
    } else {
        g_warning("Invalid powerControl message: '%s'", value_as_string);
        return;
    }

    g_message("Device will %s now!", value_as_string);
    sync(); reboot(action);
}

Note that the reboot() system call above will most likely fail because it needs administrative privileges. While the browser process could run as root to sidestep this issue—definitely not recommended!—it would be better to grant the CAP_SYS_BOOT capability to the process, and much better to ask the system manager daemon to handle the job. In machines using systemd a good option is to call the .Halt() and .Reboot() methods of its org.freedesktop.systemd1 interface.

Now we can write a small HTML document with some JavaScript sprinkled on top to arrange sending the messages:

<html>
  <head>
    <meta charset="utf-8" />
    <title>Device Power Control</title>
  </head>
  <body>
    <button id="reboot">Reboot</button>
    <button id="poweroff">Power Off</button>
    <script type="text/javascript">
      function addHandler(name) {
        document.getElementById(name).addEventListener("click", (event) => {
          window.webkit.messageHandlers.powerControl.postMessage(name);
          return false;
        });
      }
      addHandler("reboot");
      addHandler("poweroff");
    </script>
  </body>
</html>

The complete source code for this example can be found in this Gist.

Going In The Other Direction

But how can one return values from user messages back to the JavaScript code running in the context of the web page? Until recently, the only option available was exposing some known function in the page’s JavaScript code, and then using webkit_web_view_run_javascript() to call it from native code later on. To make this more idiomatic and allow waiting on a Promise, an approach like the following works:

  1. Have convenience JavaScript functions wrapping the calls to .postMessage() which add a unique identifier as part of the message, create a Promise, and store it in a Map indexed by the identifier. The Promise is itself returned from the functions.
  2. When the callbacks in native code handle messages, they need to take note of the message identifier, and then use webkit_web_view_run_javascript() to pass it back, along with the information needed to resolve the promise.
  3. The JavaScript code running in the page takes the Promise from the Map that corresponds to the identifier, and resolves it.
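
The steps above could be sketched roughly as follows. This is only an illustration under a few assumptions: createMessenger() and resolveReply() are hypothetical names, not part of any WebKit API, and the Map stores the resolve/reject functions of each Promise (a Promise cannot be resolved from the outside otherwise). The handler argument stands for something like window.webkit.messageHandlers.powerControl.

```javascript
// Hypothetical page-side helper; none of these names come from WebKit.
function createMessenger(handler) {
  const pending = new Map(); // message id -> { resolve, reject }
  let nextId = 0;

  return {
    pending,

    // Step 1: tag the message with a unique id and remember how to settle it.
    send(payload) {
      const id = ++nextId;
      return new Promise((resolve, reject) => {
        pending.set(id, { resolve, reject });
        handler.postMessage({ id, payload });
      });
    },

    // Steps 2 and 3: native code would arrange to call this through
    // webkit_web_view_run_javascript(), passing back the id and the result.
    resolveReply(id, value) {
      const entry = pending.get(id);
      if (!entry) {
        return false; // unknown or already settled id
      }
      pending.delete(id);
      entry.resolve(value);
      return true;
    },
  };
}
```

A page would create one messenger per handler and expose it under a well-known global, so that the script run from native code has something to call.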

To make this approach a bit more palatable, we could tell WebKit to inject a script along with the regular content, which would provide the helper functions needed to achieve this.

Nevertheless, the approach outlined above is cumbersome and can be tricky to get right, not to mention that the effort needs to be duplicated in each application. Therefore, we have recently added new API hooks to provide this as a built-in feature, so starting in WPE WebKit 2.40 the recommended approach involves using webkit_user_content_manager_register_script_message_handler_with_reply() to register handlers instead. This way, calling .postMessage() now returns a Promise to the JavaScript code, and the callbacks connected to the script-message-with-reply-received signal now receive a WebKitScriptMessageReply, which can be used to resolve the promise—either on the spot, or asynchronously later on.
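
From the page's point of view, the built-in mechanism reduces to awaiting the value returned by .postMessage(). The snippet below is a minimal sketch of that calling side only; requestPowerAction() is a hypothetical helper, and it assumes that the native side signals failure by rejecting the Promise.

```javascript
// Hypothetical wrapper around a handler registered "with reply".
// `handler` would be e.g. window.webkit.messageHandlers.powerControl.
async function requestPowerAction(handler, action) {
  try {
    // With a "with reply" handler, postMessage() returns a Promise.
    const reply = await handler.postMessage(action);
    return { ok: true, reply };
  } catch (error) {
    // Assumption: a rejected Promise means the native side reported an error.
    return { ok: false, error: String(error) };
  }
}
```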

Even More Ideas

User script messages are a powerful and rather flexible facility to make WPE integrate web content into a complete system. The provided example is rather simple, but as long as we do not need to pass huge amounts of data in messages the possibilities are almost endless—especially with the added convenience in WPE WebKit 2.40. Here are more ideas that can be built on top of user script messages:

  • A handler could receive requests to “monitor” some object, and return a Promise that gets resolved when it has received changes. For example, this could make the user interface of a smart thermostat react to temperature updates from a sensor.
  • A generic handler that takes the message payload and converts it into D-Bus method calls, allowing web pages to control many aspects of a typical Linux system.

Wrapping Up

WPE has been designed from the ground up to integrate with the rest of the system, instead of having a sole focus on rendering Web content inside a monolithic web browser application. Accordingly, the public API must be comprehensive enough to use it as a component of any application. This results in features that allow plugging into the web engine at different layers to provide custom behaviour.

At Igalia we have years of experience embedding WebKit into all kinds of applications, and we are always sympathetic to the needs of such systems. If you are interested in collaborating with WPE development, or searching for a solution that can tightly integrate web content in your device, feel free to contact us.

March 07, 2023 06:00 PM

February 22, 2023

Release Notes for Safari Technology Preview 164

Surfin’ Safari

Safari Technology Preview Release 164 is now available for download for macOS Monterey 12.3 or later and macOS Ventura. If you already have Safari Technology Preview installed, you can update in the Software Update pane of System Preferences on macOS Monterey, or System Settings under General → Software Update on macOS Ventura.

This release includes WebKit changes between: 259549@main…260164@main.

Web Inspector

  • Elements tab
    • Added showing grid and flex overlays when in element selection and highlighting elements (259989@main, 260061@main)
    • Prevented showing ::backdrop rules for elements without a backdrop (259894@main)
  • Sources tab
    • Added experimental feature to enable aggressive limits on the length of lines that are formatted for sources (259603@main)

CSS

  • Fixed dynamically setting the width of tables with fixed layout and auto width (260143@main)
  • Improved serialization of mask and background properties (260157@main)
  • Made -webkit-image-set() an alias of image-set() (259994@main)
  • Made margin-trim trim collapsed margins at block-start and block-end sides (259734@main)

JavaScript

ResizeObserver

  • Fixed the initial last reported size of ResizeObservation (259673@main)

Rendering

  • Fixed content truncation when text-overflow is ellipsis (259850@main)
  • Fixed table cells, rows, sections or column (groups) to support margins (259955@main)
  • Fixed the margin for summary on details for right-to-left mode (260063@main)
  • Fixed inline text boxes containing Zero Width Joiner, Zero Width Non-Joiner, or Zero Width No-Break Space characters to not use simplified text measuring (259618@main)

Web Animations

  • Fixed animating two custom property list values with mismatching types to use a discrete animation (259557@main)
  • Fixed the animation of color list custom properties when iterationComposite is incorrect (259761@main)
  • Fixed composite of implicit keyframes for CSS Animations to be replace (259739@main)
  • Fixed keyframes to be recomputed if a custom property registration changes (259737@main)
  • Fixed keyframes to be recomputed when bolder or lighter is used on a font-weight property (259740@main)
  • Fixed keyframes to be recomputed when a parent element changes value for a custom property set to inherit (259812@main)
  • Fixed keyframes to be recomputed when a parent element changes value for a non-inherited property set to inherit (259645@main)
  • Fixed keyframes to be recomputed when the currentcolor value is used on color related properties (259736@main)
  • Fixed keyframes to be recomputed when the currentcolor value is used on a custom property (259808@main)
  • Fixed line-height to not transition from the default value to a number (260028@main)
  • Fixed animations without a browsing context to be idle (260101@main)
  • Fixed an @keyframes rule using an inherit value to update the resolved value when the parent style changes (259631@main)
  • Fixed non-inherited custom property failing to inherit from parent when inherit is set (259809@main)

WebAuthn

  • Fixed conditional passkey requests not cancelling correctly after AbortController.abort() (259754@main)

Media

  • Fixed distorted audio after getUserMedia when playing with AudioWorkletNode (259964@main)
  • Fixed getDisplayMedia to not build a list of every screen and window (259969@main)

HTTP

  • Enabled Clear-Site-Data HTTP header support (259970@main)
  • Added support for Clear-Site-Data: "executionContext" (259940@main)

Editing

  • Turned on the feature to make selection return a live range from getRangeAt and throw errors as specified (259904@main)
  • Fixed incorrect text caret placement when right-to-left text starts with whitespace (259868@main)

Web API

  • Added optional submitter parameter to FormData constructor (259558@main)
  • Added canvas.drawImage support for SVGImageElement (259869@main)
  • Implemented focus fixup rule so that focused elements are blurred when they are no longer focusable due to style changes (260067@main)
  • Fixed <link> elements with media queries that do not match to not block visually first paint (259963@main)
  • Fixed a Fetch bug with empty header values in Headers objects with “request-no-cors” guard (260066@main)
  • Fixed caret move by line when padding-top is set (259906@main)
  • Fixed individually paused or playing animations not being affected by Play All Animations and Pause All Animations (259971@main)
  • Fixed find on page failing to show results in PDFs in Safari (259655@main)
  • Fixed navigation within an iframe not exiting fullscreen for a parent iframe element (260024@main)
  • Fixed scrolling away from and back to an individually playing animation causing it to be incorrectly paused (259910@main)

Safari Web Extensions

  • Fixed Cross-Origin-Resource-Policy blocking fetch from extensions (259976@main)

February 22, 2023 10:52 PM

February 16, 2023

Philippe Normand: WebRTC in WebKitGTK and WPE, status updates, part I

Igalia WebKit

Some time ago we at Igalia embarked on the journey to ship a GStreamer-powered WebRTC backend. This is a long journey, it is not over, but we made some progress …

By Philippe Normand at February 16, 2023 08:30 PM

Web Push for Web Apps on iOS and iPadOS

Surfin’ Safari

Today marks the release of iOS and iPadOS 16.4 beta 1, and with it comes support for Web Push and other features for Home Screen web apps.

iPhone Lock Screen showing a notification arriving

Today also brings the first beta of Safari 16.4. It’s a huge release, packed with over 135 features in WebKit — including RegExp lookbehind assertions, Import Maps, OffscreenCanvas, Media Queries Range Syntax, @property, font-size-adjust, Declarative Shadow DOM, and much more. We’ll write all about these new WebKit features when Safari 16.4 is released to the public. Meanwhile, you can read a comprehensive list of new features and fixes in the Safari 16.4 beta 1 release notes.

But let’s set Safari aside and talk about Home Screen web apps on iOS and iPadOS.

Since the first iPhone, users could add any website to their Home Screen — whether it’s a brochure site, a blog, a newspaper, an online store, a social media platform, a streaming video site, productivity software, an application for creating artwork, or any other type of website. For the last ten years, users of Safari on iOS and iPadOS could do this by tapping the Share button to open the Share menu, and then tapping “Add to Home Screen”. The icon for that website then appears on their Home Screen, where a quick tap gets them back to the site.

Web developers have the option to create a manifest file (with its display member set to standalone or fullscreen) and serve it along with their website. If they do, that site becomes a Home Screen web app. Then, when you tap on its icon, the web app opens like any other app on iOS or iPadOS instead of opening in a browser. You can see its app preview in the App Switcher, separate from Safari or any other browser.

Web Push for Web Apps added to the Home Screen

Now with iOS and iPadOS 16.4 beta 1, we are adding support for Web Push to Home Screen web apps. Web Push makes it possible for web developers to send push notifications to their users through the use of Push API, Notifications API, and Service Workers all working together.

A web app that has been added to the Home Screen can request permission to receive push notifications as long as that request is in response to direct user interaction — such as tapping on a ‘subscribe’ button provided by the web app. iOS or iPadOS will then prompt the user to give the web app permission to send notifications. Once allowed, the user can manage those permissions per web app in Notifications Settings — just like any other app on iPhone and iPad.

The notifications from web apps work exactly like notifications from other apps. They show on the Lock Screen, in Notification Center, and on a paired Apple Watch.

This is the same W3C standards-based Web Push that was added in Safari 16.1 for macOS Ventura last fall. If you’ve implemented standards-based Web Push for your web app with industry best practices — such as using feature detection instead of browser detection — it will automatically work on iPhone and iPad.
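
As a rough sketch of what feature detection plus a user-initiated permission request might look like: supportsWebPush() and subscribeOnClick() are hypothetical helper names, the win parameter stands for the page's window object, and applicationServerKey is assumed to be your own VAPID public key. The function is meant to be called from a click or tap handler, such as one on a ‘subscribe’ button.

```javascript
// Feature detection instead of browser detection.
function supportsWebPush(win) {
  return "Notification" in win
      && "PushManager" in win
      && "serviceWorker" in win.navigator;
}

// Call this from a click/tap handler so the permission request is made
// in response to direct user interaction.
async function subscribeOnClick(win, applicationServerKey) {
  if (!supportsWebPush(win)) {
    return null;
  }
  const permission = await win.Notification.requestPermission();
  if (permission !== "granted") {
    return null;
  }
  // Assumes a service worker has already been registered by the page.
  const registration = await win.navigator.serviceWorker.ready;
  return registration.pushManager.subscribe({
    userVisibleOnly: true,
    applicationServerKey,
  });
}
```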

Web Push on iOS and iPadOS uses the same Apple Push Notification service that powers native push on all Apple devices. You do not need to be a member of the Apple Developer Program to use it. Just be sure to allow URLs from *.push.apple.com if you are in control of your server push endpoints.

To learn more about how to set up Web Push, read the article Meet Web Push on webkit.org, or watch the WWDC22 session video Meet Web Push.

Focus support

Notifications are a powerful tool, but it’s easy for people to get into situations where they are overwhelmed by too many of them. Notifications for Home Screen web apps on iPhone and iPad integrate with Focus, allowing users to precisely configure when or where to receive them. For users who add the same web app to their Home Screen on more than one iOS or iPadOS device, Focus modes automatically apply to all of them.

Badging API

Home Screen web apps on iOS and iPadOS 16.4 beta 1 now support the Badging API. Just like any app on iOS and iPadOS, web apps are now able to set their badge count. Both setAppBadge and clearAppBadge change the count while the user has the web app open in the foreground or while the web app is handling push events in the background — even before permission to display the count has been granted.
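
A minimal sketch of keeping the badge in sync with an unread count could look like this. updateBadge() is a hypothetical helper, and nav stands for the navigator object; the feature check is there because the methods may be missing in other contexts.

```javascript
// Hypothetical helper: mirror an unread count onto the app badge.
// Returns a Promise resolving to true if a badge call was made.
function updateBadge(nav, unreadCount) {
  if (typeof nav.setAppBadge !== "function") {
    return Promise.resolve(false); // Badging API not available here
  }
  const pending = unreadCount > 0
    ? nav.setAppBadge(unreadCount)
    : nav.clearAppBadge();
  return Promise.resolve(pending).then(() => true);
}
```

In a page this would be called as updateBadge(navigator, count); a service worker handling a push event could do the same with its own navigator object.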

Permission to display the badge on the app icon is granted in exactly the same way as other apps on iOS and iPadOS. Once a user gives permission to allow notifications, the icon on the Home Screen will immediately display the current badge count. Users can then configure permissions for Badging in Notifications Settings, just like any other app on iOS or iPadOS.

Manifest ID

WebKit for iOS and iPadOS 16.4 beta 1 adds support for the id member from the Web Application Manifest standard. It’s a string (in the form of a URL) that acts as the unique identifier for the web application, intended to be used by an OS in whatever way desired. iOS and iPadOS use the Manifest ID for the purpose of syncing Focus settings across multiple devices.
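
For illustration, here is how a manifest carrying an id might be assembled and serialized, for example by a build script. All values ("/shiny", the icon path, and so on) are made up for this sketch; only the member names come from the Web Application Manifest standard.

```javascript
// Illustrative manifest contents; every value here is a placeholder.
const manifest = {
  id: "/shiny",          // unique identifier, in the form of a URL
  name: "Shiny",
  start_url: "/",
  display: "standalone", // or "fullscreen" to request web app behavior
  icons: [
    { src: "/icons/shiny-512.png", sizes: "512x512", type: "image/png" },
  ],
};

// The serialized text is what would be served as the manifest file.
const manifestJson = JSON.stringify(manifest, null, 2);
```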

iOS has supported multiple installs of the same web app since the very beginning. We believe the ability for people to install any web app more than once on their device can be useful — providing additional flexibility to support multiple accounts, separate work vs personal usage, and more.

When adding a web app to the Home Screen, users are given the opportunity to change the app’s name. iOS and iPadOS 16.4 beta 1 combine this name with the Manifest ID to uniquely identify the web app. That way, a user can install multiple copies of the web app on one device and give them different identities. For example, notifications from “Shiny (personal)” can be silenced by Focus while notifications from “Shiny (work)” can be allowed. If the user gives their favorite website the same name on multiple devices, Focus settings on one device will sync and apply to the others as well.

Third-party browser support for Add to Home Screen

In iOS and iPadOS 16.4 beta 1, third-party browsers can now offer their users the ability to add websites and web apps to the Home Screen from the Share menu.

Applications on iOS and iPadOS present the Share menu by creating a UIActivityViewController with an array of activityItems. For “Add to Home Screen” to be included in the Share menu the following must be true:

  1. The application has the com.apple.developer.web-browser managed entitlement
  2. A WKWebView is included in the array of activityItems
  3. The WKWebView is displaying a document with an HTTP or HTTPS URL
  4. If the device is an iPad, it must not be configured as a Shared iPad

As described above, after a user adds to Home Screen, any website with a Manifest file that sets the display member to standalone or fullscreen will open as a web app when a user taps its icon. This is true no matter which browser added the website to the Home Screen.

If there is no manifest file configured to request web app behavior (and no meta tag marking the site as web app capable), then that website will be saved as a Home Screen bookmark. Starting in iOS and iPadOS 16.4 beta 1, Home Screen bookmarks will now open in the user’s current default browser.

New Fallback Icon

Web developers usually provide icons to represent their website throughout the interface of a browser. If icons for the Home Screen are not provided, previously iOS and iPadOS would create an icon from a screenshot of the site. Now, iOS and iPadOS 16.4 beta 1 will create and display a monogram icon using the first letter of the site’s name along with a color from the site instead.

To provide the icon to be used for your website or web app, list the icons in the Manifest file — a capability that’s been supported since iOS and iPadOS 15.4. Or you can use the long-supported technique of listing apple-touch-icons in the HTML document head. (If you do both, apple-touch-icon will take precedence over the Manifest-declared icons.)

New Web API for Web Apps

Besides Web Push, Badging API, and Manifest ID, many of the other new features in WebKit for iOS and iPadOS 16.4 beta 1 are of particular interest to web app developers focusing on Home Screen web apps. These include:

See the release notes for Safari 16.4 beta 1 for the full list of features.

Feedback

Are you seeing a bug? That’s to be expected in a beta. Please help us squash bugs before iOS and iPadOS 16.4 are released to the public by providing feedback from your iPhone or iPad. Feedback Assistant will collect all the information needed to help us understand what’s happening.

Also, we love hearing from you. You can find us on Mastodon at @jensimmons@front-end.social, @bradeeoh@mastodon.social and @jondavis@mastodon.social. Or send a tweet to @webkit to share your thoughts on these new features.

February 16, 2023 06:30 PM

February 15, 2023

The User Activation API

Surfin’ Safari

As a web developer, you’ve probably noticed that certain APIs only work if an end-user clicks or taps on an HTML element. For example, if you try to run the following code in Safari’s Web Inspector, it will result in an error:

await navigator.share({ text: "hi" });
NotAllowedError: The request is not allowed by the user agent or 
the platform in the current context, possibly because the user denied permission.

This error happens when code is not run as a direct result of the end-user clicking or tapping on an HTML element (e.g., a <button>).

Having code that runs as a result of an end-user action is what the HTML specification refers to as “user activation”. There are a large number of APIs on the web that depend on user activation. Common ones include:

  • window.open()
  • navigator.share()
  • navigator.wakeLock.request()
  • PaymentRequest.prototype.show()
  • and there are many, many more…

So what constitutes a “user activation”?

The HTML spec defines the following events as “activation triggering user events”:

Together, this list effectively constitutes “user activation”. You’ll note the list of events above is really small. It’s restricted so that certain calls to APIs can only happen as a result of those very distinct end-user actions. This prevents end-users from being accidentally (or deliberately!) spammed with popup windows or other intrusive browser dialogs.

Now that we know about these special events, we can now write code to take into account user activation:

button.addEventListener("click", async () => {
   // This works fine...
   await navigator.share({text: "hi"});
});

So even though we are not specifically listening for a "mousedown" event, we know that the activation triggering event has occurred, so our code can run without throwing any errors.

End-user protections

Now you might be wondering, can one execute multiple commands that require user activation in single or multiple event listeners? Consider the following code sample:

button.addEventListener("click", async () => {

   // This works fine...
   await navigator.share({text: "hi"});

   // This will fail...
   window.open("https://example.com");
});

button.addEventListener("click", async () => {
   // This will now fail too...
   window.open("https://example.com");
});

But why do the calls to window.open() fail there? To understand that, we need to delve deeper into how browsers handle user activation under the hood.

Meet “transient” and “sticky” activation

When an “activation triggering user event” occurs what actually happens is that the browser starts an internal timer specifically tied to a browser tab. This timer is not directly exposed to the web page and runs for a short time (a few seconds, maybe). Each browser engine can determine how much time is allocated and it can change for a number of reasons (i.e., it’s deliberately not observable by JavaScript!). It’s designed to give your code enough time to perform some particular task (e.g., it could process some image data and then call navigator.share() to share the image with another application).

In HTML, this timer is called transient activation. And while this timer is running, that browser window “has transient activation”. HTML also defines a concept called sticky activation. This simply means that the web page has had transient activation at some point in the past.

Although rare, some APIs (e.g., Web Audio) use sticky activation to perform some actions.

Now, the above doesn’t explain why window.open() failed above. To understand why, we need to now discuss what HTML calls “activation-consuming APIs”.

APIs that “consume” the user activation

As the name suggests, “activation-consuming APIs” consume the user activation. That is, when those APIs are called, they effectively reset the transient activation timer, so the web page no longer has transient activation.

This behavior is why window.open() fails above: calling navigator.share() consumed the user activation, meaning that window.open() no longer had transient activation (so it fails).

A list of common APIs that consume transient activation in WebKit:

  • Web Notification’s requestPermission() method.
  • Payment Request: the show() method.
  • And, as we have already discussed, Web Share’s share() method.

This list is not exhaustive, and new APIs are being added to the web all the time that either rely on or consume transient activation.

As a point of interest: not all APIs consume the user activation. Some only require transient activation but won’t consume it. That allows multiple asynchronous operations dependent on user activation to take place. Otherwise, it would require the user to click or press on a button over and over again to complete a task, which would be quite annoying for them.
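
To make the interplay between the timer and consumption more concrete, here is a toy model of it. This is not how any browser engine implements user activation: ActivationModel is a made-up class, the 5 second default is an arbitrary assumption, and the clock is injected only so the behavior can be exercised deterministically.

```javascript
// Toy model of transient and sticky activation (illustration only).
class ActivationModel {
  constructor(durationMs = 5000, now = () => Date.now()) {
    this.durationMs = durationMs;    // assumed timer length
    this.now = now;                  // injectable clock
    this.lastActivation = -Infinity; // no activation yet
    this.hasBeenActive = false;      // "sticky" activation
  }

  // Called for an activation triggering user event (e.g. a click).
  trigger() {
    this.lastActivation = this.now();
    this.hasBeenActive = true;
  }

  // "Transient" activation: true only while the timer is running.
  get isActive() {
    return this.now() - this.lastActivation < this.durationMs;
  }

  // What an activation-consuming API does: succeed once, reset the timer.
  consume() {
    if (!this.isActive) {
      return false;
    }
    this.lastActivation = -Infinity;
    return true;
  }
}
```

In this model, the first consume() after a trigger() succeeds and the second one fails, which mirrors what happens to window.open() after navigator.share() in the earlier sample.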

Scope of transient activation

A really useful thing to know about transient activation is that it’s scoped to the entire window (or browser tab)! That means that, so long as all iframes on a page are same-origin, they all have transient activation. However, for security reasons, cross-origin iframes will not have transient activation.

Transient activation across all same origin iframes

For third-party iframes to have transient activation, a user must explicitly activate an HTML element inside the third-party iframe. However, once they activate an element, transient activation propagates to the parent and to any iframes that are same origin to the iframe where the activation took place:

Activation propagating to parent frame, to other iframes that match the third-party iframe

Security Note: you can (and should!) restrict what capabilities third-party iframes have access to by setting the allow= and/or sandbox= attributes, as needed.

The UserActivation API

To assist developers with dealing with user activation, the HTML standard introduces a simple API to check if a page has transient and/or sticky activation.

  • navigator.userActivation.isActive:
    Returns true when the window has transient activation.
  • navigator.userActivation.hasBeenActive:
    Returns true if the window has had transient activation in the past (i.e., “sticky activation”).

So for example, you can do something like:

if (navigator.userActivation.isActive) {
    await navigator.share({text: "hi"})
}

Limitations and ongoing standards work

There are two significant limitations with the current user activation model that standards folks are still grappling with.

Firstly, consider the following case, where a file takes too long to download and the transient activation timer runs out:

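button.onclick = async () => {
    // Slow network + really big file
    const response = await fetch("really-big-file");
    // navigator.share() expects File objects, so wrap the downloaded data
    const image = new File([await response.blob()], "really-big-file");

    // Oh no!!! transient activation expired! 😢
    navigator.share({files: [image]});
};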

There are ongoing discussions at the WHATWG and W3C about how we might address the problem above. Unfortunately, we don’t yet have a solution, but naturally we need some means to extend the transient activation so the code above doesn’t fail.
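
Until there is a standard answer, one defensive option is to check the activation state right before the call and fall back to asking for another click. The following is only a sketch of that idea, not a recommendation from the standards discussions; shareIfStillActive() is a hypothetical helper and nav stands for the navigator object.

```javascript
// Hypothetical guard: only call share() while transient activation lasts.
// Returns null when activation has expired, so the caller can show a
// "tap again to share" button instead of hitting a NotAllowedError.
function shareIfStillActive(nav, data) {
  const activation = nav.userActivation;
  if (activation && !activation.isActive) {
    return null;
  }
  // Either activation is still running, or the API to check is missing.
  return nav.share(data);
}
```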

Secondly, there are legitimate use cases for enabling transient activation in a third-party iframe from a first-party document (e.g., to allow a third-party to process a request for payment). There are ongoing discussions to see if there is some means to safely enable third-party iframes to also have transient activation in special cases.

Automation and testing

To help developers deal with tricky edge-cases that could arise from the transient activation unexpectedly expiring, WebKit has been working with other browser vendors to allow the user activation to be consumed via WebDriver.

Conclusion

Web APIs being gated on user activation helps keep the user safe from annoying intrusions, like multiple popup windows or notification spam, while allowing developers to do the right thing in response to user interaction. The UserActivation API can help you determine if it’s OK to call a function that depends on user activation.

You can try out the User Activation API in Safari Technology Preview release 160 or later.

February 15, 2023 05:24 PM

February 13, 2023

Declarative Shadow DOM

Surfin’ Safari

We’re pleased to announce that support for the declarative shadow DOM API has been added and enabled by default in Safari Technology Preview 162. To recap, shadow DOM is a part of Web Components, a set of specifications that were initially proposed by Google to enable the creation of reusable widgets and components on the web. Since then these specifications have been integrated into the DOM and HTML standards. Shadow DOM, in particular, provides a lightweight encapsulation for DOM trees by allowing a creation of a parallel tree on an element called a “shadow tree” that replaces the rendering of the element without modifying its own DOM tree.

Up until this point, creating a shadow tree on an element required calling attachShadow() on the element in JavaScript. This meant that this feature was not available when JavaScript is disabled, such as in email clients, and it required care to hide the content supposed to be in a shadow tree until relevant scripts are loaded to avoid a flash of content. In addition, many modern websites and web-based applications deploy a technique called “server-side rendering” whereby programs running on a web server generate HTML markup with the initial content for web browsers to consume, instead of fetching content over the network once scripts are loaded. This helps reduce page load time and also improves SEO because the page content is readily available for search engine crawlers to consume. Many server-side-rendering technologies try to eliminate the need for JavaScript for the initial rendering to reduce the initial paint latency and progressively enhance the content with interactivity as scripts and related metadata are loaded. This was, unfortunately, not possible when using shadow DOM because of the aforementioned requirement to use attachShadow().

Declarative shadow DOM addresses these use cases by providing a mechanism to include shadow DOM content in HTML. In particular, specifying a shadowrootmode content attribute on a template element tells web browsers that the content inside of this template element should be put into a shadow tree attached to its parent element. In the following example, the template element with shadowrootmode will attach a shadow root on the some-component element with a text node containing “hello, world.” as its sole child node.

<some-component>
    <template shadowrootmode="closed">hello, world.</template>
</some-component>
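On the server, markup like the above can be produced by plain string generation. The following is a minimal sketch, assuming two hypothetical helpers (escapeHTML and renderWithShadowRoot, neither of which is part of any library) and assuming the tag name and mode come from trusted code rather than user input:

```javascript
// Hypothetical helper: escapes text so it is safe to embed as HTML text content.
function escapeHTML(text) {
    return text
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// Hypothetical helper: serializes a host element together with a
// declarative shadow root that contains a single text node.
function renderWithShadowRoot(tagName, shadowText, mode = 'closed') {
    if (mode !== 'open' && mode !== 'closed')
        throw new TypeError('mode must be "open" or "closed"');
    return `<${tagName}>\n` +
        `    <template shadowrootmode="${mode}">${escapeHTML(shadowText)}</template>\n` +
        `</${tagName}>`;
}

// Produces the same markup as the example above.
const markup = renderWithShadowRoot('some-component', 'hello, world.');
console.log(markup);
```

Because the output is ordinary HTML, the browser can attach the shadow root while parsing, before any script has run.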

When scripts are loaded and ready to make this content interactive, the shadow root can be accessed via ElementInternals as follows:

customElements.define('some-component', class SomeComponent extends HTMLElement {
    #internals;
    constructor() {
        super();
        this.#internals = this.attachInternals();

        // This will log "hello, world."
        console.log(this.#internals.shadowRoot.textContent.trim());
    }
});

We designed this API with backwards compatibility in mind. For example, calling attachShadow() on an element with a declarative shadow DOM returns the declaratively attached shadow root with all its children removed instead of failing by throwing an exception. This means that adopting declarative shadow DOM is backwards compatible with existing JavaScript which relies on attachShadow() to create shadow roots. Note that none of the JavaScript parser APIs (such as DOMParser and innerHTML) support declarative shadow DOM by default, to avoid creating new cross-site scripting vulnerabilities in existing websites that accept arbitrary template content (since script elements in such content had been previously inert and would not run).
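Pages that must also work in browsers without declarative shadow DOM can check for the feature before deciding whether to fall back to attachShadow(). A minimal sketch, assuming support is detectable through the shadowRootMode property that reflects the shadowrootmode attribute on HTMLTemplateElement (the function name is hypothetical, and the parameter exists only so the check can also run outside a browser):

```javascript
// Returns true when the given prototype exposes shadowRootMode.
// By default it inspects HTMLTemplateElement.prototype, if that exists.
function supportsDeclarativeShadowDOM(
    templatePrototype = globalThis.HTMLTemplateElement?.prototype) {
    return templatePrototype != null && 'shadowRootMode' in templatePrototype;
}

if (!supportsDeclarativeShadowDOM()) {
    // A fallback that calls attachShadow() from script would go here.
    console.log('declarative shadow DOM is not available in this environment');
}
```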

In addition, we’re introducing the ability to clone shadow roots. Until now, ShadowRoot and its descendant nodes could not be cloned by cloneNode() or importNode(). attachShadow() now takes a cloneable flag as an option. When this flag is set to true, existing JavaScript APIs such as cloneNode() and importNode() will clone the ShadowRoot when cloning its shadow host. Declarative shadow DOM automatically sets this flag to true so that declarative shadow DOM which appears inside other template elements can be cloned along with its host. In the following example, the outer template element contains an instance of the some-component element, and its shadow tree content is serialized using declarative shadow DOM. Cloning template1.content with document.importNode(template1.content, true) will clone some-component as well as its (declaratively defined) shadow tree.

<template id="template1">
    <some-component>
        <template shadowrootmode="closed">hello, world.</template>
    </some-component>
</template>

In summary, declarative shadow DOM introduces an exciting new way of defining a shadow tree in HTML, which will be useful for server-side rendering of Web Components as well as in contexts where JavaScript is disabled, such as email clients. This has been a highly requested feature with lots of discussions among browser vendors. We’re happy to report its introduction in Safari Technology Preview 162.

February 13, 2023 06:10 PM

Cathie Chen: How does ResizeObserver get garbage collected in WebKit?

Igalia WebKit

ResizeObserver is an interface provided by browsers to detect size changes of a target. It calls the JS callback when the size changes. In the callback function, you can do anything, including deleting the target. So how are ResizeObserver-related objects managed?

ResizeObserver related objects

Let’s take a look at a simple example.
<div id="target"></div>
<script>
{
  var ro = new ResizeObserver( entries => {});
  ro.observe(document.getElementById('target'));
} // end of the scope
</script>
The objects involved are:
  • ro: a JSResizeObserver, which is the JS wrapper of ResizeObserver,
  • callback: a JSResizeObserverCallback,
  • entries: JSResizeObserverEntry objects,
  • a ResizeObservation, which observe() creates,
  • and the document and the target.

So how are these objects organized? Let’s take a look at the code.

ResizeObserver and Document

Creating a ResizeObserver requires a Document, which the observer stores in WeakPtr<Document, WeakPtrImplWithEventTargetData> m_document;.
In the other direction, observe() calls m_document->addResizeObserver(*this), and the Document stores the ResizeObserver in Vector<WeakPtr<ResizeObserver>> m_resizeObservers;.
So ResizeObserver and Document each hold the other by WeakPtr.

ResizeObserver and Element

When observe() is called, a ResizeObservation is created and stored in ResizeObserver in Vector<Ref<ResizeObservation>> m_observations;.
ResizeObservation holds the Element by WeakPtr: WeakPtr<Element, WeakPtrImplWithEventTargetData> m_target.
In the other direction, via target.ensureResizeObserverData(), the Element creates a ResizeObserverData, which holds the ResizeObserver by WeakPtr in Vector<WeakPtr<ResizeObserver>> observers;.
So the connection between ResizeObserver and Element is through WeakPtr.

Keep JSResizeObserver alive

Since both Document and Element hold ResizeObserver only by WeakPtr, how do we keep ResizeObserver alive and make sure it is released properly?
In the example, what happens once execution leaves the scope? Per [1], JS wrappers can be kept alive by mechanisms such as:
  • Visit Children – When JavaScriptCore’s garbage collection visits some JS wrapper during the marking phase, visit another JS wrapper or JS object that needs to be kept alive.
  • Reachable from Opaque Roots – Tell JavaScriptCore’s garbage collection that a JS wrapper is reachable from an opaque root which was added to the set of opaque roots during marking phase.
To keep JSResizeObserver itself alive, WebKit uses the second mechanism, “Reachable from Opaque Roots”, through a custom isReachableFromOpaqueRoots. It checks the targets in m_observations, m_activeObservationTargets, and m_targetsWaitingForFirstObservation; if containsWebCoreOpaqueRoot reports that any of these targets is an opaque root, the JSResizeObserver won’t be released. Note that it uses GCReachableRef, which means the targets won’t be released either. m_activeObservationTargets is populated from gatherObservations until deliverObservations, and m_targetsWaitingForFirstObservation is populated from observe() until the first deliverObservations. So JSResizeObserver won’t be released while any observed target is alive, while it has size-change observations that have not been delivered, or while it has a target whose first observation has not been delivered.
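The three conditions in the conclusion above can be modeled with a toy class. This is only a sketch in JavaScript, not WebKit code: the class, its methods, and the liveTargets set are hypothetical stand-ins for the C++ members and for the garbage collector’s view of which targets are alive.

```javascript
// Toy model of the keep-alive rule; field names mirror the C++ members.
class ResizeObserverModel {
    constructor() {
        this.observations = [];                      // m_observations
        this.activeObservationTargets = [];          // m_activeObservationTargets
        this.targetsWaitingForFirstObservation = []; // m_targetsWaitingForFirstObservation
    }
    observe(target) {
        this.observations.push({ target });
        this.targetsWaitingForFirstObservation.push(target);
    }
    // resizedTargets: a Set of targets whose size changed in this round.
    gatherObservations(resizedTargets) {
        this.activeObservationTargets = this.observations
            .map(observation => observation.target)
            .filter(target => resizedTargets.has(target));
    }
    deliverObservations() {
        this.activeObservationTargets = [];
        this.targetsWaitingForFirstObservation = [];
    }
    // liveTargets: a Set of targets that are otherwise still alive.
    shouldStayAlive(liveTargets) {
        return this.observations.some(o => liveTargets.has(o.target))
            || this.activeObservationTargets.length > 0
            || this.targetsWaitingForFirstObservation.length > 0;
    }
}

const target = { id: 'target' };
const model = new ResizeObserverModel();
model.observe(target);
const beforeFirstDelivery = model.shouldStayAlive(new Set());
model.deliverObservations();
const deliveredAndTargetGone = model.shouldStayAlive(new Set());
const deliveredAndTargetAlive = model.shouldStayAlive(new Set([target]));
console.log(beforeFirstDelivery, deliveredAndTargetGone, deliveredAndTargetAlive);
// prints: true false true
```

In this model the observer stays alive right after observe() even if nothing else references the target, and it becomes collectable only once everything has been delivered and no observed target is alive.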

ResizeObservation

ResizeObservation is owned by ResizeObserver, so it will be released if ResizeObserver is released.

Keep `JSCallbackDataWeak* m_data` in `JSResizeObserverCallback` alive

Though ResizeObserver holds ResizeObserverCallback by RefPtr, the callback is an IsWeakCallback.

JSCallbackDataWeak* m_data; in JSResizeObserverCallback is therefore not automatically kept alive together with JSResizeObserver.
Taking a closer look at JSCallbackDataWeak, it contains JSC::Weak<JSC::JSObject> m_callback;.

To keep the callback alive, ResizeObserver uses the first mechanism, “Visit Children”.
JSResizeObserver::visitAdditionalChildren adds m_callback to the Visitor; see:
void JSCallbackDataWeak::visitJSFunction(Visitor& visitor)
{
    visitor.append(m_callback);
}

JSResizeObserverEntry

Like JSResizeObserver and the callback, JSResizeObserverEntry makes sure the target and contentRect are not released while it is alive.
void JSResizeObserverEntry::visitAdditionalChildren(Visitor& visitor)
{
    addWebCoreOpaqueRoot(visitor, wrapped().target());
    addWebCoreOpaqueRoot(visitor, wrapped().contentRect());
}
ResizeObserverEntry is RefCounted.
class ResizeObserverEntry : public RefCounted<ResizeObserverEntry>
It is created in ResizeObserver::deliverObservations and passed to the JS callback. If the JS callback doesn’t keep it, it will be released when the function finishes.
[1] https://github.com/WebKit/WebKit/blob/main/Introduction.md#js-wrapper-lifecycle-management

By cchen at February 13, 2023 02:33 PM

February 08, 2023

Release Notes for Safari Technology Preview 163

Surfin’ Safari

Safari Technology Preview Release 163 is now available for download for macOS Monterey 12.3 or later and macOS Ventura. If you already have Safari Technology Preview installed, you can update in the Software Update pane of System Preferences on macOS Monterey, or System Settings under General → Software Update on macOS Ventura.

This release includes WebKit changes between: 258383@main…259548@main.

Web Inspector

  • General
    • Fixed Web Inspector not remembering which side of the window it was attached to (259320@main)
    • Fixed undocked Web Inspector windows being placed in a different window set from the window they are inspecting when using Stage Manager (258672@main)
  • Elements Tab
    • Fixed the ITAL variation axis slider showing NaN values in the Fonts details sidebar panel (259351@main)
    • Fixed “Inspect Element” not revealing the selected element in the DOM tree if the element was hidden behind the “Show All Nodes” button (258805@main)
  • Timelines Tab
    • Disabled the Screenshots timeline when inspecting targets that can’t support it (259326@main)

Masonry Layout

CSS Custom Properties

  • Added dependency cycle handling that involves root style (258985@main)
  • Added detection for complex custom property cycles involving multiple loops (259506@main)
  • Added handling for computational dependencies in transform functions (259353@main, 259298@main)
  • Changed rules in shadow trees to be ignored (258880@main)
  • Updated CSS custom properties containing var() to update when the referenced property is animated (258786@main)
  • Fixed registerProperty to throw when initialValue is not provided with a non-universal syntax (258909@main)
  • Fixed “<color> | <color>+” to match “yellow blue” (259166@main)
  • Ensured transition-property values fill with a custom property when other transition CSS properties are used with a longer list of items (258770@main)

:has() pseudo-class

  • Added invalidation support for :buffering and :stalled pseudo-classes (258891@main)
  • Made :has() require valid selectors for all selectors in the selector list (258712@main)

Media Queries Level 4

  • Allowed negative values in media queries (258938@main)
  • Made “layer” an invalid media type name (258957@main)

CSS

  • Added support for leading-trim (258766@main)
  • Added support for using currentcolor with color-mix() (258970@main, 259145@main)
  • Fixed box-shadow not painting correctly on inline elements (258923@main)
  • Made CSS animations participate in cascade (258514@main)
  • Fixed invalidation for class names within :nth-child() selector lists (258917@main)
  • Fixed focusing an element with scroll snap to not always result in snapping to that element (259381@main)
  • Fixed font-face src list failing early if the component fails (258749@main, 258870@main)
  • Fixed font-face src local() to invalidate CSS-wide keywords (258695@main)
  • Fixed text-decoration-thickness property to always trigger a repaint when changed (258641@main)
  • Fixed overscroll-behavior: none to prevent overscroll when the page is too small to scroll (259227@main)
  • Fixed appearance: slider-vertical to only apply to range inputs (258924@main)
  • Fixed initial whitespace breaking the query in window.matchMedia() (259357@main)
  • Fixed disconnected <fieldset> elements sometimes incorrectly matching :valid / :invalid selectors (259422@main)
  • Removed “specified hue” color interpolation method for gradients and color-mix() (259190@main)
  • Stopped requiring whitespace between of and the selector list in :nth-child and :nth-last-child (258703@main)

CSS Typed OM

  • Stopped treating grid-row-start, grid-column-start, grid-row-end, and grid-column-end as list properties (258764@main)

Forms

  • Fixed input[type=submit], input[type=reset], and input[type=button] to honor font-size/padding/height and to work with multi-line values (258754@main)

Rendering

  • Added a guard against zero or negative space shortage (258647@main)
  • Changed the default oblique angle from 20 degrees to 14 degrees to match other browsers (258722@main)
  • Fixed underlines not appearing and disappearing when expected (258914@main)
  • Fixed hairline on selection when bidi text is involved (259537@main)
  • Fixed incorrect paint of translate property animation (259173@main)
  • Fixed incorrect repaint when inline content shrinks vertically (259141@main)
  • Fixed scrolling through content hidden with clip-path not propagating below (259368@main)
  • Fixed margin: auto to be the only rule resolved against the “available width adjusted with intrusive floats” (e.g. percent values are based on container width). (259125@main)
  • Fixed lazy image loading failure when overflow: clip was set on just one axis (259007@main)
  • Fixed scrolling for a fixed header inside overflow scroll with a transformed ancestor (259175@main)
  • Fixed incorrect custom pseudo-scrollbar sizes (259389@main)
  • Improved balancing for border, padding, and empty block content (259246@main)

JavaScript

  • Fixed Atomics.waitAsync to be invocable from the main thread (258856@main)
  • Fixed module scripts to always decode using UTF-8 (259251@main, 259261@main)
  • Fixed toLocaleLowerCase and toLocaleUpperCase to not throw an exception on empty string (259242@main)
  • Optimized Number constructor calls further (259533@main)
  • Optimized WebAssemblyInstance#exports (259017@main)
  • Updated Intl.DurationFormat to align with updated standards (259317@main)
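As a small illustration of the toLocaleLowerCase and toLocaleUpperCase item above: the release note does not say which call used to throw, so this sketch only shows the standard result, assuming an explicit locale tag is passed ('tr' is an arbitrary choice):

```javascript
// Case conversion of an empty string yields an empty string and must not throw.
const lower = ''.toLocaleLowerCase('tr');
const upper = ''.toLocaleUpperCase('tr');
console.log(lower === '' && upper === '');
```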

WebAssembly

Gamepad API

  • Added support for vibrationActuator (258680@main)
  • Added experimental support for “trigger-rumble” effect type behind a feature flag (259507@main)
  • Fixed GamepadHapticActuator.playEffect() with a magnitude less than 0.1 having no effect (258874@main)
  • Fixed how magnitude values passed to GamepadHapticActuator.playEffect() are handled (258988@main)
  • Made gamepad.vibrationActuator.playEffect() throw for invalid effect parameters (258752@main)
  • Made vibrationActuator limit the duration of vibration effects (258759@main)
  • Made vibrationActuator stop vibrating when its document becomes hidden (258802@main)
  • Made Gamepad.vibrationActuator work with the GameController framework (258674@main)
  • Made Gamepad.vibrationActuator return null when the gamepad doesn’t support dual-rumble (258812@main)
  • Set Gamepad.vibrationActuator.type to “dual-rumble” (258758@main)

Web API

  • Enabled default ARIA for custom elements (258743@main)
  • Implemented StorageManager.estimate() (258610@main)
  • Added support for Ed25519 keys to Web Crypto (259277@main, 259489@main)
  • Changed queryCommandValue("stylewithcss") to always return an empty string (258777@main)
  • Disabled DOMParser’s support for declarative shadow DOM (258768@main)
  • Fixed FileSystemSyncAccessHandle’s close function throwing an exception on the second call (258736@main)
  • Fixed data written via FileSystemSyncAccessHandle disappearing after creating a new FileSystemFileHandle (258876@main)
  • Fixed negative shadow repaint issue (259497@main)
  • Fixed getting input.value for number inputs with over 39 characters returning an empty string (258614@main)
  • Fixed right-to-left tab handling (259428@main, 259460@main)
  • Fixed Clear-Site-Data HTTP header to obey origin partitioning (259466@main)
  • Fixed a new SharedWorker being dysfunctional after the old one was terminated via SharedWorkerGlobalScope.close() (259228@main)
  • Fixed HTMLSelectElement’s value setter setting incorrect values if there are grouped options (259249@main)
  • Fixed stripping of leading slashes in URL.hostname setter (259366@main)
  • Fixed formDisabledCallback() sometimes being called even when disabledness hasn’t changed (259372@main)
  • Made autofilling a form trigger an input event in addition to a change event (259434@main)
  • Moved oncopy, oncut, and onpaste to GlobalEventHandlers (258390@main)
  • Removed the precision="float" attribute on <input type="range"> (258625@main)
  • Removed HTMLHeadElement.profile (258397@main)
  • Removed HTMLPreElement.wrap (258445@main)
  • Removed SVGFEMorphologyElement.setRadius(radiusX, radiusY) (258733@main)
  • Removed HTMLFrameElement.location (259067@main)
  • Updated Content Security Policy when the header is sent as part of a 304 response (258931@main)

WebGL

  • Enabled WEBGL_provoking_vertex by default (259499@main)

SVG

  • Fixed rotate: x and transform: rotate(x) yielding different behavior with SVG (258882@main)
  • Fixed SVG textLength (258921@main)
  • Handled animation freeze when repeatDur is not a multiple of dur (259212@main)
  • Fixed SVG sometimes not repainting when resolving color changes (259082@main)
  • Fixed computing the keyTimes index correctly for discrete values animations (258939@main)

Scrolling

  • Fixed page scrolling more than one screenful when pressing Space or Fn+Down (259146@main)

Media

  • Enabled AudioSession API by default with a reduced subset (259074@main)
  • Moved Media Source API settings back into the Experimental Features (258853@main)
  • Changed to try using low latency for the WebRTC HEVC encoder, if available (259128@main)
  • Fixed HLS videos sometimes failing to reach “ended” state and not able to be restarted (259342@main)
  • Fixed MediaStreamTrack ending due to a capture failure when using Bluetooth headsets (259150@main)
  • Fixed AudioBufferSourceNode.start with a duration sometimes failing (259234@main)
  • Fixed duplicate timeupdate events in Text Track Code (259023@main)
  • Fixed a video element pausing after Bluetooth audio input is disconnected (259415@main, 259478@main)
  • Updated MediaController.currentTime to return the previously set position (259020@main)

Accessibility

  • Fixed aria-controls to be exposed as AXLinkedUIElements and not AXARIAControls (258922@main)
  • Fixed VoiceOver when selecting “Sign in with Apple” on some websites (259147@main)

WebDriver

  • Fixed the Shift modifier key not applying to typed text (259039@main)

Safari Web Extensions

  • Added support for setExtensionActionOptions() in the declarativeNetRequest API
  • Changed declarativeNetRequest rules to default to false for isUrlFilterCaseSensitive
  • Increased browser.storage.session data limit to 10MB
  • Fixed :has() selector for Safari Content Blockers (259068@main)

February 08, 2023 09:49 PM

Try out CSS Nesting today

Surfin’ Safari

Back in December, we wrote an article detailing three different options for CSS Nesting. In it, we explained the differences between Option 3, Option 4 and Option 5, demonstrating how each would work through a series of examples. Then we asked a simple question: “Which option is best for the future of CSS?”

Web developers responded to the poll with great clarity. Option 3 won in a landslide.

And so now, both Safari and Chrome have implemented Option 3. Two weeks ago, on January 25th, CSS Nesting shipped in Safari Technology Preview 162, on by default. If you have a Mac, simply download and open Safari Technology Preview, write some nested CSS, and experience how it works!

How CSS Nesting Works

Imagine you have some CSS that you’d like to write in a more compact way.

.foo {
  color: green;
}
.foo .bar {
  font-size: 1.4rem;
}

With CSS Nesting, you can write such code as:

.foo {
  color: green;
  .bar {
    font-size: 1.4rem;
  }
}

If you’ve been nesting styles in Sass, you will find this very familiar.

Unlike Sass, however, this kind of nesting will not always work. Because of limitations in browser parsing engines, you must make sure the nested selector (.bar in the above example) always starts with a symbol. If it’s a class, ID, pseudo-class, pseudo-element, attribute selector, or any selector that uses a symbol at the beginning, you’ve succeeded. For example, all of the following nested selectors will be fine, because each starts with a symbol (. # : [ * + > ~) and not a letter:

main {
 .bar { ... }
 #baz { ... }
 :has(p) { ... }
 ::backdrop { ... }
 [lang|="zh"] { ... }
 * { ... }
 + article { ... }
 > p { ... }
 ~ main { ... }
}

There is one kind of selector that starts with a letter, however — a nested element selector. This example will not work:

main {
 article { ... }
}

That code will fail, because article begins with a letter, and not a symbol. How will it fail? The same way it would fail if you’d misspelled article as atirlce. The nested CSS which depends on that particular selector is simply ignored.
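The rule as stated in this article can be written down as a tiny check. This is only a sketch of the article’s simplified wording (first character is not a letter); it is not a CSS parser, real engines reason about tokens rather than characters, and the function name is hypothetical:

```javascript
// Returns true when a nested selector starts with a symbol rather than a letter.
function startsWithSymbol(selector) {
    const first = selector.trimStart()[0];
    // An empty selector has no first character and is rejected.
    return first !== undefined && !/[A-Za-z]/.test(first);
}

for (const selector of ['.bar', '#baz', ':has(p)', '> p', 'article']) {
    console.log(selector, startsWithSymbol(selector));
}
```

Only the last selector, article, is rejected by this check, which matches the failing example above.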

You have several options for what to do about this limitation. Let’s start by looking at the solution that you’ll probably use most often. You can simply put an & before the element selector, like this:

main {
 & article { ... }
}

The & signals to the browser “this is where I want the selector from outside this nest to go”. By using an & before any element selectors, you succeed at starting the nested selector with a symbol, not a letter. Therefore, it will work.

aside {
 & p { ... }
}

is the nested equivalent to:

aside p { ... }

The & is also super handy for other use cases.

Imagine you have this unnested code:

ul {
  padding-left: 1em;
}
.component ul {
  padding-left: 0;
}

You’ll notice that the intended selector is .component ul — where the ul is second.

To write nested rules that yield such a result, you can write:

ul {
  padding-left: 1em;
  .component & {
    padding-left: 0;
  }
}

Again, the & gives you a way to say “this is where I want the nested selector to go”.

It’s also handy when you don’t want a space between your selectors. For example:

a {
  color: blue;
  &:hover {
    color: lightblue;
  }
}

Such code yields the same result as a:hover {. Without the &, you’d get a :hover { — notice the space between a and :hover — which would fail to style your hover link.

But what if you have this unnested code?

ul {
  padding-left: 1em;
}
article ul {
  padding-left: 0;
}

You do not want to write the nested version like this:

ul {
  padding-left: 1em;
  & article & {
    padding-left: 0;
  }
}

Why not? Because that will actually behave in the same way as these unnested rules:

ul {
  padding-left: 1em;
}
ul article ul {
  padding-left: 0;
}

Two unordered lists in ul article ul? No, that’s not what we want.

So, what do we do instead, since we need to start article & with a symbol?

We can write our code like this:

ul {
  padding-left: 1em;
  :is(article) & {
    padding-left: 0;
  }
}

Any selector can be wrapped by an :is() pseudo-class and maintain the same specificity and meaning (when it’s the only selector inside the parentheses). Put an element selector inside :is(), and you get a selector that starts with a symbol for the purposes of CSS Nesting.
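To see how the & placement in the examples above maps back to unnested selectors, here is a minimal sketch of the substitution. It assumes simple selectors with no commas and no & inside strings or attribute values; the specification defines & in terms of :is(), which matters for selector lists and specificity and is ignored here. The function name is hypothetical.

```javascript
// Expands one nested selector against its parent selector.
function resolveNestedSelector(parent, nested) {
    const trimmed = nested.trim();
    // Every & is replaced by the parent selector.
    if (trimmed.includes('&'))
        return trimmed.split('&').join(parent);
    // Without an &, the parent is implied at the start, followed by a space.
    return `${parent} ${trimmed}`;
}

console.log(resolveNestedSelector('.foo', '.bar'));          // .foo .bar
console.log(resolveNestedSelector('ul', '.component &'));    // .component ul
console.log(resolveNestedSelector('a', '&:hover'));          // a:hover
console.log(resolveNestedSelector('ul', '& article &'));     // ul article ul
console.log(resolveNestedSelector('ul', ':is(article) &'));  // :is(article) ul
```

The last two lines show why & article & is the wrong way to write article ul, and why :is(article) & gives the intended result.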

In summary, CSS Nesting will work just like Sass, but with one new rule: you must make sure the nested selector always starts with a symbol.

Investigations are currently underway to see if this restriction can be relaxed without making the parsing engine slower. The restriction may very well be removed — whether sometime very soon or years in the future. Everyone agrees Nesting will be much better without any such restriction. But we also all agree that web pages must appear in the browser window right away. Adding even the slightest pause before rendering begins is not an option.

What is an option? You being able to structure your nested code however you’d like. You can nest more than one layer deep — nesting CSS inside already-nested CSS — in as many levels as you desire. You can mix Nesting with Container Queries, Feature Queries, Media Queries, and/or Cascade Layers however you want. Anything can go inside of anything.

Try out CSS Nesting today and see what you think. Test your code in both Safari Technology Preview and Chrome Dev (after flipping the “Experimental Web Platform features” flag) to make sure it yields the same results. This is the best time to find bugs: before this new feature has shipped in any browser. You can report issues at bugs.webkit.org or bugs.chromium.org. Also, keep an eye out for the release notes for the next several versions of Safari Technology Preview. Each might add polish to our CSS Nesting implementation, including efforts that add support for CSSOM or other updates to match any potential spec changes made by the CSS Working Group.

A lot of people across multiple companies have been working to bring nesting to CSS for almost five years. The syntax has been hotly debated, in long conversations about the pros and cons of many different solutions. We hope you find the result immensely helpful.

February 08, 2023 06:00 PM

February 06, 2023

ElementInternals and Form-Associated Custom Elements

Surfin’ Safari

In Safari Technology Preview 162 we enabled support for ElementInternals and form-associated custom elements by default. Custom elements is a feature which lets web developers create reusable components by defining their own HTML elements without relying on a JavaScript framework. ElementInternals is a new addition to the custom elements API, which allows developers to manage a custom element’s internal states, such as the default ARIA role or ARIA label, as well as having custom elements participate in form submission and validation.

Default ARIA for Custom Elements

To use ElementInternals with a custom element, call this.attachInternals() in a custom element constructor just the same way we’d call attachShadow() as follows:

class SomeButtonElement extends HTMLElement {
    #internals;
    #shadowRoot;
    constructor()
    {
        super();
        this.#internals = this.attachInternals();
        this.#internals.role = 'button'; // ARIAMixin names this property role, not ariaRole
        this.#shadowRoot = this.attachShadow({mode: 'closed'});
        this.#shadowRoot.innerHTML = '<slot></slot>';
    }
}
customElements.define('some-button', SomeButtonElement);

Here, #internals and #shadowRoot are private member fields. The above code will define a simple custom element whose ARIA role is button by default. Achieving the same effect without using ElementInternals required sprouting an ARIA content attribute on the custom element itself, like this:

class SomeButtonElement extends HTMLElement {
    #shadowRoot;
    constructor()
    {
        super();
        this.#shadowRoot = this.attachShadow({mode: 'closed'});
        this.#shadowRoot.innerHTML = '<slot></slot>';
        this.setAttribute('role', 'button');
    }
}
customElements.define('some-button', SomeButtonElement);

This code is problematic for a few reasons. For one, it’s surprising for an element to automatically add content attributes on itself, since no built-in element does this. But more importantly, the above code prevents users of this custom element from overriding the ARIA role like this, because the constructor will override the role content attribute upon upgrades:

<some-button role="switch"></some-button>

When the default is instead set through ElementInternals (its role property, defined by ARIAMixin), this example works seamlessly. ElementInternals similarly allows specifying the default values of other ARIA features such as ARIA label.

Participating in Form Submission

ElementInternals also adds the capability for custom elements to participate in a form submission. To use this feature of custom elements, we must declare that a custom element is associated with forms as follows:

class SomeButtonElement extends HTMLElement {
    static formAssociated = true;
    static observedAttributes = ['value'];
    #internals;
    constructor()
    {
        super();
        this.#internals = this.attachInternals();
        this.#internals.role = 'button'; // ARIAMixin names this property role, not ariaRole
    }
    attributeChangedCallback(name, oldValue, newValue)
    {
        this.#internals.setFormValue(newValue);
    }
}
customElements.define('some-button', SomeButtonElement);

With the above definition of a some-button element, some-button will submit the value of the value attribute specified on the element for the name attribute specified on the same element. E.g., if we had markup like <some-button name="some-key" value="some-value"></some-button>, we would submit some-key=some-value.

Participating in Form Validation

Likewise, ElementInternals adds the capability for custom elements to participate in form validation. In the following example, some-text-field is designed to require a minimum of two characters in the input element inside its shadow tree. When there are fewer than two characters, it calls setValidity() and reportValidity() to report a validation error to the user through the browser’s native UI:

class SomeTextFieldElement extends HTMLElement {
    static formAssociated = true;
    #internals;
    #shadowRoot;
    constructor()
    {
        super();
        this.#internals = this.attachInternals();
        this.#shadowRoot = this.attachShadow({mode: 'closed', delegatesFocus: true});
        this.#shadowRoot.innerHTML = '<input autofocus>';
        const input = this.#shadowRoot.firstChild;
        input.addEventListener('change', () => {
            this.#internals.setFormValue(input.value);
            this.updateValidity(input.value);
        });
    }
    updateValidity(newValue)
    {
        if (newValue.length >= 2) {
            this.#internals.setValidity({ });
            return;
        }
        this.#internals.setValidity({tooShort: true}, 
            'value is too short', this.#shadowRoot.firstChild);
        this.#internals.reportValidity();
    }
}
customElements.define('some-text-field', SomeTextFieldElement);

With this setup, the :invalid pseudo-class will automatically apply to the element when the number of characters the user typed is less than 2.
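The length check in updateValidity() can also be factored into a pure function, which makes the rule easy to test without a browser. A minimal sketch, assuming a hypothetical helper name and the same two-character minimum and message as the example above:

```javascript
// Hypothetical helper: computes the first two arguments for
// ElementInternals.setValidity() from the current value.
function validityArgumentsFor(value, minLength = 2) {
    if (value.length >= minLength)
        return { flags: {}, message: '' };
    return { flags: { tooShort: true }, message: 'value is too short' };
}

// Inside the custom element, this could be used as:
//   const { flags, message } = validityArgumentsFor(input.value);
//   this.#internals.setValidity(flags, message, input);
console.log(validityArgumentsFor('a'));
console.log(validityArgumentsFor('ab'));
```

An empty flags object marks the element as valid, mirroring the setValidity({ }) call in the original code.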

Form-Associated Custom Element Callbacks

In addition, form-associated custom elements provide the following set of new custom element reaction callbacks:

  • formAssociatedCallback(form) – Called when the associated form element changes to form. ElementInternals.form returns the associated form element.
  • formResetCallback() – Called when the form is being reset (e.g. the user pressed an input[type=reset] button). The custom element should clear whatever value was set by the user.
  • formDisabledCallback(isDisabled) – Called when the disabled state of the element changes.
  • formStateRestoreCallback(state, reason) – Called when the browser is trying to restore the element’s state to state, in which case reason is “restore”, or when the browser is trying to fulfill autofill on behalf of the user, in which case reason is “autocomplete”. In the case of “restore”, state is a string, File, or FormData object previously set as the second argument to setFormValue.

Let’s take a look at formStateRestoreCallback as an example. In the following example, we store input.value as state whenever the value of the input element inside the shadow tree changes (the second argument to setFormValue). When the user navigates away to some other page and comes back to this page, the browser can restore this state via formStateRestoreCallback. Note that WebKit currently has a limitation that only a string can be used for the state, and “autocomplete” is not supported yet.

class SomeTextFieldElement extends HTMLElement {
    static formAssociated = true;
    #internals;
    #shadowRoot;
    constructor()
    {
        super();
        this.#internals = this.attachInternals();
        this.#shadowRoot = this.attachShadow({mode: 'closed', delegatesFocus: true});
        this.#shadowRoot.innerHTML = '<input autofocus>';
        const input = this.#shadowRoot.querySelector('input');
        input.addEventListener('change', () => {
            this.#internals.setFormValue(input.value, input.value);
        });
    }
    formStateRestoreCallback(state, reason)
    {
        this.#shadowRoot.querySelector('input').value = state;
    }
}
customElements.define('some-text-field', SomeTextFieldElement);

In summary, ElementInternals and form-associated custom elements provide an exciting new way of writing reusable components that participate in form submission and validation. ElementInternals also provides the ability to specify the default value of the ARIA role and other ARIA properties for a custom element. We’re excited to bring these features together to web developers.

February 06, 2023 08:51 PM

February 01, 2023

Interop 2023: Pushing interoperability forward

Surfin’ Safari

A year ago, Apple, Bocoup, Google, Igalia, Microsoft, and Mozilla came together to improve the interoperability of the web and to continue our commitments to web standards — actions that ensure the web will work in any browser, on any operating system, with any computer.

Throughout last year, Interop 2022 focused on fifteen key areas of most importance to web developers, selecting automated tests to evaluate how closely each browser engine matches the web standards for those areas. Browsers made remarkable progress between January and December 2022, improving the number of tests that pass in all three browsers from 49% to 83%.

Screenshot of the 2022 graph, also available at http://wpt.fyi/interop-2022. Interop 2022 was a great success. The “Interop” line, in dark green, shows the percentage of tests that passed in all three browsers.

The WebKit team channeled efforts across all focus areas, and is proud to have reached a 98.2% pass-rate by the end of 2022.

The final browser scores for Interop 2022, ending in December: Chrome 88, Firefox 92, Safari 98.

Announcing this year’s Interop 2023

Now we are pleased to announce this year’s Interop 2023 project! Once again, we are joining with Bocoup, Google, Igalia, Microsoft, and Mozilla to move the interoperability of the web forward.

The scores have reset. We retired half of the tests used for scoring Interop 2022 last year, and added many new tests — all of which are focused on the technology web developers most expressed they want improved next.

Screenshot of the new Interop 2023 dashboard. The new Interop 2023 dashboard provides more insight than ever. Click the name of each technology to see the tests used to evaluate conformance to web standards.

Each “Focus Area” collects together a set of automated tests for a particular technology, used to evaluate browser implementations. The “Investigations” are team projects run by the people behind Interop 2023 to investigate a particularly-complex issue as a group, and find ways to make progress.

Last fall, the collaborative team planning Interop 2023 received 87 proposals for what to include. Of those, 35 were accepted and combined into 18 new Focus Areas, plus 2 new Investigations. They join 5 Focus Areas carried over from 2022 and 3 Focus Areas carried over from 2021, for a total of 26 “Active Focus Areas” for 2023.

Active focus areas table. The new “Interop” column reflects the percentage of tests that pass in all three browser engines, which is the goal, to increase interoperability.

We achieved pretty great interoperability in 7 Focus Areas from 2022, and so we’re moving these to “Previous Focus Areas”, a new section on the Interop dashboard, where they no longer count towards the overall top-level score.

Previous Focus Areas table. A new “Previous Focus Areas” section lists areas of focus from past years, where we can keep an eye on them.

The 2023 Focus Areas

Let’s take a look at all the web technology included in each of the 26 Focus Areas for 2023. They now include features in JavaScript and Web APIs, as well as CSS.

Border Image

Gray square with a gradient border from purple at the top to yellow at the bottom

The ability to use an image to provide the visual styling for a box’s border has been supported in browsers for many years. It opens up a world of possibilities for how a border can look. But behavioral differences between browsers have discouraged web developers from using border-image. Things have greatly improved over time, but there are still inconsistencies. By including Border Image in Interop 2023, the hope is that a renewed attention to detail will make all the difference.

Color Spaces and Functions

Expression of color is vital for designers. Having complete tools to cover the gamut of color spaces helps creative designers make the web a more beautiful place to visit. Color Spaces and Functions is a Focus Area carried over from Interop 2022. To ensure the work is completed, this year’s area still includes the tests for three expanded color spaces (lab, lch, P3), and writing color in CSS through functional notation with color-mix().

For 2023, this area now includes styling gradients so that interpolation — the method of determining intermediate color values — can happen across different color spaces. This illustration shows the differences between the default sRGB interpolation compared to interpolation in lab and lch color spaces:

Three stripes of red to purple gradients showing interpolation differences for sRGB, LAB, and LCH color spaces

Learn more about color spaces and functions in Improving Color on the Web, Wide Gamut Color in CSS with Display-P3, and Wide Gamut 2D Graphics using HTML Canvas.

Container Queries

Container Queries started arriving in 2022, allowing developers to apply styles to a particular item based on qualities of the container it lives inside. Size queries let developers create components that adapt depending on the size of their container, and Container Query Units provide a measurement of the container, in cq* units.

Screenshot of Container Queries CSS example code with a browser window showing a demo store of apparel product card components with a large hero layout, three-column tile layout, and sidebar layout

A single card component can appear in different places in the layout at different sizes — a large hero graphic, a medium size displayed in rows, and a small version for the sidebar.

By including Size Queries and Container Query Units in Interop 2023, we can ensure the web has interoperable implementations across browsers.

Containment

Containment in CSS improves performance by limiting calculations of layout, style, paint, size (or any combination) to an isolated segment of the DOM instead of the entire page. The Focus Area includes the contain, contain-intrinsic-size, and content-visibility CSS properties. They are used to help the browser make optimization decisions. For example, content-visibility: auto is a convenient way to defer element rendering until the content becomes relevant to the user by scrolling to it, find-in-page, tab order navigation, etc.

CSS Pseudo-classes

This Focus Area covers a collection of CSS pseudo-classes: :dir(), :nth-child(), :nth-last-child(), :nth-of-type(), :nth-last-of-type(), :modal, :user-valid and :user-invalid.

The :nth-child(n of <selector>) and :nth-last-child(n of <selector>) pseudo-classes are particularly interesting. For example, :nth-child(2 of .foo) matches the 2nd element that has the class .foo among all the children. You can try this in Safari, where this feature has been supported since 2015.


If you want to count from the bottom, use :nth-last-child(n of <selector>).
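The matching rule for both pseudo-classes can be modeled in a few lines of JavaScript. This is only a sketch of the rule, not the DOM API: the children array, the isFoo predicate (standing in for the .foo selector), and the nthChildOf and nthLastChildOf helpers are all hypothetical names made up for illustration.

```javascript
// Toy model: each child is a plain object, not a real DOM element.
const children = [
  { tag: 'li', classes: ['foo'] },
  { tag: 'li', classes: [] },
  { tag: 'li', classes: ['foo'] },
  { tag: 'li', classes: ['foo'] },
  { tag: 'li', classes: ['foo'] },
];

// Stand-in for the ".foo" selector.
const isFoo = (el) => el.classes.includes('foo');

// 0-based index of the sibling matched by :nth-child(n of <selector>),
// or -1 if fewer than n siblings match the selector.
function nthChildOf(siblings, n, matches) {
  let seen = 0;
  for (let i = 0; i < siblings.length; i++) {
    if (matches(siblings[i]) && ++seen === n) return i;
  }
  return -1;
}

// Same rule counted from the end: :nth-last-child(n of <selector>).
function nthLastChildOf(siblings, n, matches) {
  let seen = 0;
  for (let i = siblings.length - 1; i >= 0; i--) {
    if (matches(siblings[i]) && ++seen === n) return i;
  }
  return -1;
}

console.log(nthChildOf(children, 2, isFoo));     // 2
console.log(nthLastChildOf(children, 2, isFoo)); // 3
```

Note that a plain :nth-child(2) would select the child at index 1, which does not have the class at all. The "of <selector>" part filters the siblings first and counts afterwards.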

It’s a particularly exciting time to add new and improve existing CSS pseudo-classes, because they can be used inside :has() — exponentially increasing their usefulness.

Custom Properties

The @property at-rule extends the capabilities of CSS variables much further by allowing developers to specify the syntax of the variable, the inheritance behavior, and the variable initial value. It allows developers to do things in CSS that were impossible before, like animating gradients or specific parts of transforms.

@property --size {
  syntax: "<length>";
  inherits: false;
  initial-value: 0px;
}

With @property support, developers can declare custom properties in a fashion that’s similar to how browser engines define CSS properties.

Flexbox

This flexible one-dimensional layout solution for arranging items in rows or columns has been around for over fifteen years. Over that time, the Flexbox specification has matured with both slight behavior changes and updated clarifications of precise details. This year’s Flexbox Focus Area expands on previous years, adding the latest WPT tests. Staying ahead of new unevenness in implementations maintains developer confidence in this widely-used layout technology.

Font Feature Detection and Palettes


Font feature detection extends Feature Queries by adding two functions for testing which font formats and technologies are supported by a browser. Developers can write statements like @supports font-format(woff) { ... } or @supports font-tech(color-COLRv0) { ... } to conditionally apply CSS only when WOFF fonts or COLRv0 are supported.

Color fonts provide a way to add richness to designs without sacrificing the benefits of using regular text. Regardless of how decorative a color font is, the underlying text is always searchable, copy/paste-able, scalable, translatable, and compatible with screen readers.

Font palettes provide a mechanism for web developers to alter the color palette used by a color font. The font-palette property provides a way for web developers to select one of several different predefined color palettes contained inside a color font — for example, to declare that a font’s dark color palette be used for the site’s dark mode design. The @font-palette-values rule provides a way for web developers to define their own custom color palette for recoloring color fonts. Learn more by reading Customizing Color Fonts on the Web.

Forms

Setting written text vertically can be commonplace in languages like Japanese and Chinese. Support for vertical text has been available in browsers for several years, but support for vertical text in form fields has been missing. Interop 2023 is a commitment by the industry to change that with vertical writing mode support in input, textarea, and select menus.

The 2023 Forms Focus Area also carries over the tests from Interop 2022. This includes tests for the appearance property, form, events on disabled form controls, input elements, form submission, and form validation.

Grid

CSS Grid is another layout tool even more powerful than Flexbox. The ability to divide the page into regions and define the relationship of content to areas of the grid offers unparalleled control within a structured layout. Similar to Flexbox, the Grid Focus Area expands on work from Interop 2021 to ensure reliable layout and adoption of this powerful technology.

:has()

The :has() pseudo-class wasn’t considered for inclusion in Interop 2022, because it still wasn’t clear that it’s possible to implement a “parent-selector” in a performant way in browsers. Then in March 2022, Safari 15.4 proved it could be done.

This simple tool gives developers the long-awaited ability to apply styles to an ancestor based on the state of its descendant or sibling elements. It’s a powerful way to reduce the need for JavaScript, and something that will be most useful once it’s implemented interoperably in all browsers. That makes it important to include in Interop 2023. Learn about how powerful :has() can be by reading Using :has() as a CSS Parent Selector and much more.

Inert

Inert subtrees were first used by the modal dialog element to prevent user interaction with content that appears behind opened dialogs. The Inert Focus Area covers the new inert attribute, which expands this capability to all HTML elements. When an element is marked as inert, it’s no longer editable, focusable or clickable; and it’s hidden from assistive technologies. Read Non-interactive elements with the inert attribute to learn more.

Masking

CSS Masking provides several mechanisms to mask out part of an image or clip off part of a box. They have both been supported in browsers for a long time, but like many things implemented long ago there are painful differences between browsers. This is exactly the kind of developer pain point that Interop 2023 addresses. The Masking Focus Area includes improving CSS clipping and masking behaviors, including their use in animations and with SVG content.

Math Functions

CSS Math Functions help developers create complex calculations to style complex layouts or control animations without the need for JavaScript. Interop 2023 includes:

  • Exponential functions: pow(), sqrt(), hypot(), log(), exp()
  • Sign-related functions: abs(), sign()
  • Stepped value functions: round(), mod(), rem()
  • Trigonometric functions: sin(), cos(), tan(), asin(), acos(), atan(), atan2()

Support for all of these CSS math functions first appeared on the web in Safari 15.4.
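Of the stepped value functions, mod() and rem() are easy to confuse. As defined in CSS Values and Units Level 4, they differ only in sign handling: rem() takes the sign of the dividend, while mod() takes the sign of the divisor. The following JavaScript sketch mirrors that rule for plain numbers. The cssRem and cssMod helpers are hypothetical names, and the sketch ignores units as well as the zero and infinity edge cases that the specification covers.

```javascript
// rem(a, b): the result takes the sign of the dividend a.
// This is the same behavior as the JavaScript % operator.
const cssRem = (a, b) => a % b;

// mod(a, b): the result takes the sign of the divisor b.
const cssMod = (a, b) => ((a % b) + b) % b;

console.log(cssRem(-18, 5)); // -3
console.log(cssMod(-18, 5)); // 2
console.log(cssRem(18, -5)); // 3
console.log(cssMod(18, -5)); // -2
```

When both arguments have the same sign, the two functions return the same result.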

Media Queries 4

As most front-end web developers know, CSS Media Queries provide the mechanism for applying CSS depending on the size of the viewport or other qualities of the device. The most recent level of the specification, Media Queries level 4, adds new syntax for combining and modifying queries, plus a simplified range syntax that makes it easier for developers to write complicated queries. This new syntax matches the options available in Container Queries.

The and, not, and or conditionals make complex queries more readable. The new range syntax offers a more straightforward pattern for declaring a viewport range. For example, @media (500px <= width < 900px) { ... } applies when the viewport width is at least 500px and less than 900px.

Modules

The Modules Focus Area includes support for Modules in Web Workers, Import Maps and Import Assertions.

JavaScript Modules allow web developers to import and export variables, functions, and more. Import Maps give web developers the ability to control the behavior of JavaScript imports. And Import Assertions add syntax for module import statements to indicate the expected module type to help protect sites from unintentionally running malicious JavaScript.

Motion Path

CSS Motion Path (also known as Offset Path) is used to describe a path for an element to follow. It’s powerful when combined with CSS transformations, and especially helpful with CSS animations — making it possible to code complex movements in CSS and avoid JavaScript performance costs. This Focus Area covers offset, offset-anchor, offset-distance, offset-path, offset-position, and offset-rotate.

Offscreen Canvas

When using Canvas, rendering, animation, and user interaction usually happen on the main execution thread of a web application. Offscreen Canvas provides a canvas that can be rendered off screen, decoupling the DOM and the Canvas API so that the <canvas> element is no longer entirely dependent on the DOM. Rendering can also be run inside a worker context, allowing developers to run tasks in a separate thread and avoid heavy work on the main thread.

The combination of DOM-independent operations and rendering off the main thread can provide a significantly better experience for users, especially on low-power devices.

This Focus Area also includes requestAnimationFrame() in web workers which can be used alongside OffscreenCanvas to perform other rendering and animation related tasks off the main thread.

Pointer & Mouse Events

Pointer events are DOM events that are fired for a pointing device. They are designed to create a single DOM event model to handle pointing input devices such as a mouse, pen/stylus or touch (one or more fingers).

Last year, the Interop 2022 team took on an Investigation into the state of pointer and mouse events. Older incomplete web standards had led to many differences between browsers, operating systems, and devices. The 2022 Investigation project took on a mission to assess what can be done to increase interoperability, and chose a set of specific tests to reflect what browsers can improve.

The 2023 Focus Area now includes 16 tests that cover pointer and mouse interaction with pages, including how they behave with hit testing and scrolling areas. Touch and stylus are not included since additional WPT test infrastructure is needed before the appropriate devices can be tested.

Scrolling

The Scrolling Focus Area is a carryover from Interop 2022. There’s more work to do to increase interoperability, so everyone agreed to include it again. The effort includes Scroll Snap, scroll-behavior, and overscroll-behavior.

Scroll Snap provides the tools for designers and developers to control how interfaces scroll and how content appears. The scroll-behavior property sets the behavior for a scrolling box when scrolling is triggered by navigation or CSSOM scrolling APIs. And the overscroll-behavior property determines what a browser does when reaching the boundary of a scrolling area.

Subgrid

Subgrid is another Focus Area from Interop 2022 which the team agreed to carry over. Getting an interoperable implementation of Subgrid across all browsers will take layout on the web to the next level, fully realizing the vision of CSS Grid.

a screenshot of the Grid Inspector in Web Inspector showing the subgrid

Subgrid provides an easy way to put grandchildren of a grid container on that grid. It makes it possible to line up items across complex layouts, without any regard for the DOM structure.

Transforms

a 3D cylinder of numbered squares

CSS Transforms is a Focus Area continued from Interop 2021. Often used in animations, CSS Transforms provide a mechanism for transforming boxes in two- or three-dimensional space. Historically, 3D transforms have been tied to how the rendering engine handles layers. Over the last two years, engineers at Firefox, Chrome and Safari have been collaborating closely to improve the interoperability of 3D transforms in particular. By continuing to keep attention on this area, Interop 2023 aims to raise the number of tests that pass in all three browser engines from its current 92.8%.

URL

URLs are a fundamental part of the web. Without them, the web would not exist. But like many things invented very early in the history of the web, they haven’t been fully interoperable. To improve this, the WHATWG wrote a specification packed with details on precisely how URLs should work. To further promote interoperability, URL is now a Focus Area for Interop 2023.
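To get a feel for the kind of detail the specification pins down, here is a small sketch using the standard URL constructor, which parses according to the WHATWG URL specification. The example URLs are made up for illustration.

```javascript
// Parsing normalizes the URL: scheme and host are lowercased, the default
// port is dropped, and "." and ".." path segments are resolved.
const url = new URL('HTTPS://Example.COM:443/a/./b/../c?q=1#frag');

console.log(url.href);     // "https://example.com/a/c?q=1#frag"
console.log(url.host);     // "example.com"
console.log(url.pathname); // "/a/c"

// Relative references are resolved against a base URL.
const relative = new URL('../x', 'https://example.com/a/b/c');
console.log(relative.href); // "https://example.com/a/x"
```

Every engine is expected to produce exactly these results, and the tests in this Focus Area check that kind of agreement.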

WebCodecs (video)

The WebCodecs API gives web developers complete control over how media is processed by providing low-level access to the individual frames of a video stream and chunks of audio. This is especially useful for applications that do video or audio editing, video conferencing, or other real-time processing of video or audio. For Interop 2023, this Focus Area includes the video processing portion of WebCodecs.

Web Compat 2023

Similar to Interop 2022, this year’s project includes a Focus Area named “Web Compat”. It’s a grab bag of various bugs known to cause website compatibility issues, and features that, if missing, are most likely to make a website work in one browser and not in another.

Last year’s effort was incredibly successful. The 2022 tests now have 96.9% interoperability across all three browser engines, with a score of 100% in two of the three browsers. Because of that success, the tests from Web Compat 2022 are now retired as a “Previous Focus Area”.

For 2023, a whole new set of tests have been selected. They include tests for Regex Lookbehind, inline editing, event dispatching on disabled controls, CSS image-set, white-space, and text-emphasis.
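Of the items in that list, Regex Lookbehind is easy to try directly in JavaScript. The sketch below shows positive and negative lookbehind assertions. The sample string and variable names are made up for illustration.

```javascript
const prices = 'Cost: $42, weight: 17kg, fee: $8';

// Positive lookbehind: match digits only when they directly follow "$".
const dollarAmounts = prices.match(/(?<=\$)\d+/g);
console.log(dollarAmounts); // ["42", "8"]

// Negative lookbehind: match digits that follow neither "$" nor another digit.
const otherNumbers = prices.match(/(?<![$\d])\d+/g);
console.log(otherNumbers); // ["17"]
```

The lookbehind itself does not become part of the match, which is why the "$" is absent from the results.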

Web Components

Web Components is a suite of technologies for creating reusable custom elements, with encapsulated functionality. This Interop 2023 Focus Area includes Constructable stylesheets, adoptedStyleSheets, ElementInternals, Form-associated Custom Elements, and the basic behavior of Shadow DOM & Custom Elements.

2023 Investigation Projects

There are two Investigation efforts planned for 2023. Both are projects to improve the testing infrastructure of WPT, so that Interop 2024 can potentially include a wider range of technology.

One effort will take a look at how to test the interoperability of browser engines on mobile operating systems. The other will take a look at how to test the interoperability of accessibility-specific web technology.

Our Ongoing Commitment to Interoperability

We continue to believe that interoperability is one of the fundamental pillars that makes the web such a successful platform. WebKit’s efforts in Interop 2022 demonstrate how deeply we care about the web. We are excited to again collaborate with our colleagues in seizing this opportunity to help websites and web apps work better for everyone.

February 01, 2023 05:00 PM

January 31, 2023

Allowing Web Share on Third-Party Sites

Surfin’ Safari

As of Safari Technology Preview 160, it is no longer possible to use the W3C’s Web Share API with third-party sites within an iframe without including an allow attribute. All browser vendors agreed to this change as part of the W3C’s standardization process, and it is being rolled out in all major browser engines (including Chrome, Edge, and Firefox, on both mobile and desktop).

The Web Share API allows web developers to enable the native sharing functionality of a device, such as sharing a link via email or social media. Prior to this change, the API could be used on any website within an iframe without restriction. However, due to concerns about privacy and security, browser vendors at the W3C have decided to limit the use of the API to only those sites that have explicitly been given permission to use it.

Web developers must now include an allow attribute in the iframe HTML element to use the Web Share API within an iframe on a third-party site. The attribute accepts a value of web-share and optionally the origin of the site that is allowed to use the API.

<iframe allow="web-share" src="https://example.com">
</iframe>

or

<iframe allow="web-share https://example.com" src="https://example.com">
</iframe>

Without the allow attribute, the API will throw an exception and will not function within the iframe. The syntax of the allow attribute is defined by the W3C’s Permissions Policy specification. You can learn more about the syntax on MDN.
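On the script side, code running inside the iframe may want to handle the blocked case gracefully. The following is a minimal sketch, assuming that a blocked call surfaces as a rejected promise whose error is named NotAllowedError, as described by the Web Share specification. The tryShare helper and its status strings are hypothetical. The navigator object is passed in as a parameter only so that the sketch can also be exercised outside a browser.

```javascript
// Attempts to share and resolves to a short status string instead of throwing.
// `nav` is normally window.navigator.
async function tryShare(nav, data) {
  if (typeof nav.share !== 'function') return 'unsupported';
  try {
    await nav.share(data);
    return 'shared';
  } catch (err) {
    // NotAllowedError: sharing was blocked, for example because the iframe
    // lacks allow="web-share".
    if (err.name === 'NotAllowedError') return 'blocked';
    // AbortError: the user dismissed the share sheet.
    if (err.name === 'AbortError') return 'cancelled';
    throw err;
  }
}

// In a page, call it from a user gesture such as a click handler:
// button.addEventListener('click', () =>
//   tryShare(navigator, { title: document.title, url: location.href }));
```

A result of 'blocked' inside a third-party iframe is a hint to check the allow attribute on the embedding page.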

This change is a necessary step in protecting user privacy and security. It helps ensure that the Web Share API is only used by sites that the developer signals are trustworthy. However, that means web developers will need to make some code changes to continue using the API within third-party iframes.

January 31, 2023 05:27 PM

January 19, 2023

WPE WebKit Blog: Status of the new SVG engine in WebKit

Igalia WebKit


In the previous posts of this series, various aspects of the WPE port architecture were covered. Besides maintaining and advancing the WPE port according to our customers’ needs, Igalia also participates in the development of the WebCore engine itself, which is shared by all WebKit ports. WebCore is the part of the browser engine that does the heavy lifting: it contains all functionality necessary to load, parse, lay out, and paint Web content.

Since late 2019, Igalia has been working on a new SVG engine, dubbed Layer-Based SVG Engine (LBSE), that will unify the HTML/SVG rendering pipelines in WebCore. This will resolve long-standing design issues of the “legacy” SVG engine and unlock a bunch of new exciting possibilities for Web developers to get creative with SVG. Hardware-accelerated compositing, driven by CSS transform animations, 3D perspective transformations for arbitrary SVG elements, CSS z-index support for all SVG graphics elements, and proper coverage rectangle computations and repaints are just a few highlights of the capabilities the future SVG engine will offer.

In this article, an overview is given of the problems that LBSE aims to solve, and the importance of a performant, well-integrated SVG engine especially for the embedded market. Finally, the current upstreaming status is summarized, including an outlook for the year 2023.

LBSE in a nutshell

Before diving into the technical topics, let’s take a few minutes to recap the motivations behind the LBSE work, and explain the importance of a well-integrated, performant SVG engine in WebKit, especially for the embedded market.

Motivation

Many of our customers build products that utilize a Linux-powered embedded device, typically using non-x86 CPUs, custom displays with built-in input capabilities (e.g., capacitive touchscreens) often without large amounts of memory or even permanent storage. The software stack for these devices usually consists of a device-specific Linux distribution, containing the proprietary network, GPU, and drivers for the embedded device - the vendor-approved “reference distribution”.

No matter what type of product is built nowadays, many of them need an active Internet connection, to e.g. update their software stack and access additional information. Besides the UI needed to control the product, a lot of additional dialogs, wizards and menus have to be provided to be able to alter the devices’ “system settings”, such as date/time information, time zones, display brightness, WiFi credentials, Bluetooth settings, and so on.

A variety of toolkits exist that assist in writing GUI applications for embedded devices, with a few open-source projects on the market, as well as commercial products providing closed-source, proprietary solutions, that specifically target the embedded market and are often optimized for specific target device families, e.g. certain ARM processors / certain GPUs.

If the need arises, not only to communicate with the Internet but also to display arbitrary Web content, WPE comes into play. As presented in the first post in this series, the flexible and modular WPE architecture makes it an ideal choice for any product in the embedded market that needs Web browsing abilities. The GLib/C-based WPE public APIs allow for customization of the browsing engine and its settings (react on page load/input events, inject custom JS objects, modify style sheets, etc.) and allow the embedder to control/monitor all relevant Web browsing-related activities.

With a full-fledged Web engine at hand, one might ponder if it is feasible to replace the whole native GUI stack with a set of Web pages/applications, and only use WPE to paint the UI in full-screen mode, thus migrating away from native GUI applications — following the trend in the desktop market. The number of organizations migrating native GUI applications into Web applications is rapidly increasing, since there are compelling reasons for Web apps: “write once, use everywhere”, avoiding vendor lock-in, easy/reliable deployment and update mechanisms, and efficient test/development cycles (local in-browser testing!).

Due to the sheer capabilities of the Web platform, it has grown to an environment in which any kind of application can be developed – ranging from video editing applications, big data processing pipelines to 3D games, all using JS/WebAssembly in a browser, presented using HTML5/CSS. And as an important bonus: in 2023, it’s much easier to find and attract talented Web developers and designers that are fluent in HTML/CSS/JS, than those that are comfortable designing UI applications in proprietary, closed-source C/C++ frameworks.

A long-term customer, successfully using WPE in their products, had very similar thoughts and carried out a study, contracting external Web designers to build a complete UI prototype using Web technology. The mock-up made extensive use of SVG2, embedded inline into HTML5 documents or via other mechanisms (CSS background-image, etc.). The UI fulfilled all expectations and worked great in Blink and WebKit-based browsers, delivering smooth animations. On the target device, however, the performance was too slow, far from usable. A thorough analysis revealed that large parts of the Web page were constantly repainted, and layout operations were repeated for every frame when animations were active. The accumulated time to display a new frame during animations was on the order of a few milliseconds on desktop machines, but took 20-25 milliseconds on the target device, making smooth 60 FPS animations impossible.
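The arithmetic behind that last statement is easy to check. The sketch below uses only the frame times quoted above. The helper names are hypothetical, and it assumes that every frame takes the same time to produce.

```javascript
// Time available per frame at a given frame rate, in milliseconds.
const frameBudgetMs = (fps) => 1000 / fps;

// Highest sustainable frame rate if each frame takes frameTimeMs to produce.
const maxFps = (frameTimeMs) => 1000 / frameTimeMs;

console.log(frameBudgetMs(60).toFixed(1)); // "16.7"
console.log(maxFps(20)); // 50
console.log(maxFps(25)); // 40
```

At 60 FPS each frame has a budget of roughly 16.7 milliseconds, so a frame time of 20-25 milliseconds caps the animation at about 40-50 FPS.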

The poor performance is not the result of shortcomings in the WPE port of WebKit: when replacing the aforementioned animated SVG document fragments with HTML/CSS “equivalents” (e.g. simulating SVG circles with CSS border-radius tricks) the performance issue vanished. Why? SVG lacks support for a key feature called accelerated compositing, which has been available for HTML/CSS since its introduction more than a decade ago. This compositing heavily relies on the Layer Tree, which is unaware of SVG. Extending the Layer Tree implementation to account for SVG is the main motivation for LBSE.

If you are unfamiliar with the concepts of Render Tree and Layer Tree, you might want to read the “Key concepts” section of an earlier LBSE design document, which provides an overview of the topic.

Prototyping

The LBSE effort began in October 2019 as a research project, to find an ideal design for the SVG Render Tree that allows SVG to re-use the existing Layer Tree implementation with minimal changes. The aim for LBSE is to share as much code as possible with the HTML/CSS implementation, removing the need for things like SVG specific clipping/masking/filter code and disjoint HTML counterparts for the same operations.

After an extensive phase of experimentation, two abandoned approaches, and a long time spent on regression fixing, the LBSE prototype was finally finished after almost two years of work. It passed all 60k+ WebKit layout tests and offered initial support for compositing, 3D transformations, z-index, and more. The intent was to prove that we can reach feature parity with the legacy SVG engine and retrieve the very same visual results, pixel-by-pixel (except for progressions of LBSE). Shortly after the finalization, the prototype was presented during the WebKit contributors meeting in 2021.

As the name “prototype” indicates, LBSE was not ready for integration into WebKit at this point. It replaced the old SVG engine with a new one, resulting in a monolithic patch exceeding 650 KB of code changes. External contributions generally demand small patches, with ChangeLogs, tests, etc. – no conscientious reviewer in any company would approve a patch replacing a core component of a browser engine in one shot. Splitting up into small pieces is also not going to work, since SVG needs to be kept intact upstream all the time. Duplicating the whole SVG engine? Not practicable either. With that problem in mind, a fruitful discussion took place with Apple during and after the WebKit contributors meeting: a realistic upstreaming strategy was defined - thanks Simon Fraser for suggesting a pragmatic approach!

The idea is simple: bootstrap LBSE in parallel to the legacy SVG engine. Upstream LBSE behind a compile-time flag and additionally a runtime setting. This way the LBSE code is compiled by the EWS bots during upstreaming (rules out bit-rot) and we gain the ability to turn LBSE on, selectively, from our layout tests – very useful during early bootstrapping. For WebKit, that strategy is the best – for LBSE another major effort is necessary: moving from a drop-in replacement approach to a dual-stack SVG engine: LBSE + legacy built into the same WebKit binaries. At least the timing was good since a split-up into small pieces was needed anyhow for upstreaming. Time to dissect the huge branch into logical, atomic pieces with proper change logs.

Before we jump to the upstreaming status, one question should be answered that came up during the WebKit contributors meeting and also during various discussions: why don’t you just fix the existing SVG engine instead of proposing a new one - isn’t that too risky for Web compatibility?

Why don’t you fix the existing SVG engine?

LBSE logo

There was no initial intention to come up with a new SVG engine. During LBSE development it became apparent how much SVG-specific code can be erased when unifying certain aspects with HTML/CSS. After carrying out the integration work, layout/painting and hit-testing work fundamentally differently than before. Since that time, LBSE has been labeled as a “new SVG engine”, even though the SVG DOM tree part remained almost identical. Web compatibility will improve with LBSE: a few long-standing, critical interop issues with other browser vendors are solved in LBSE. Therefore, there are no concerns regarding Web compatibility risks from our side.

To answer the initial question: is it possible to fix the existing SVG engine to add layer support without adding a “new” SVG engine in parallel? Short answer: no.

In the following section, it is shown that adding support for layers implies changing the class hierarchy of the SVG render tree. All SVG renderers need to inherit from RenderLayerModelObject – a change like this cannot be split up easily into small, atomic pieces. Improving the design is difficult if there’s a requirement to keep the SVG engine working all the time upstream: all patches in that direction end up being large as many renderers have to be changed at the same time. Having distinct, LBSE-only implementations of SVG renderers, independent of the legacy engine, leaves a lot of freedom to strive for an optimal design, free of legacy constraints, and avoids huge patches that are impossible to review.

Let’s close the introduction and review the upstreaming status, and discuss where we stand today.

Upstreaming progress

Planning

To unify the HTML/CSS and SVG rendering pipelines there are two possible paths to choose from: teach the Layer Tree about the SVG Render Tree and its rendering model, or vice-versa. For the latter path, the HTML/CSS-specific RenderLayer needs to split into HTML/SVG subclasses and a base class, that is constructible from non-RenderLayerModelObject-derived renderers. The layer management code currently in RenderLayerModelObject would need to move into another place, and so forth. This invasive approach can potentially break lots of things. Besides that danger, many places in the layer/compositing system would need subtle changes to account for the specific needs of SVG (e.g. different coordinate system origin/convention).

Therefore the former route was chosen, which requires transforming the SVG render tree class hierarchy, such that all renderers that need to manage layers derive from RenderLayerModelObject. Using this approach, support for SVG can be added to the layer/compositing system in a non-invasive manner, with only a minimum of SVG-specific changes. The following class hierarchy diagrams illustrate the planned changes.

Figure: Legacy design. Visualization of the legacy SVG render tree class hierarchy in WebCore.

Figure: LBSE design. Visualization of the LBSE SVG render tree class hierarchy in WebCore.

The first graph shows the class hierarchy of the render tree in the legacy SVG engine: RenderObject is the base class for all nodes in the render tree. RenderBoxModelObject is the common base class for all HTML/CSS renderers. It inherits from RenderLayerModelObject, potentially allowing HTML renderers to create layers. For the SVG part of the render tree, there is no common base class shared by all the SVG renderers, for historical reasons.

The second graph shows only the SVG renderers of the LBSE class hierarchy. In that design, all relevant SVG renderers may create/destroy/manage layers, via RenderLayerModelObject. More information regarding the challenges can be found in the earlier LBSE design document.

Report

The upstreaming work started in December 2021, with the introduction of a new layer-aware root renderer for the SVG render subtree: RenderSVGRoot. The existing RenderSVGRoot class was renamed to LegacyRenderSVGRoot (as well as any files, comments, etc.) and all call sites and build systems were adapted. Afterward, a stub implementation of a layer-aware RenderSVGRoot class was added, and we ensured that the new renderer is created for the corresponding SVG DOM element if LBSE is activated.

That process needs to be repeated for all SVG renderers that have substantially changed in LBSE and thus deserve an LBSE-specific upstream implementation. For all other cases, in-file #if ENABLE(LAYER_BASED_SVG_ENGINE) ... #endif blocks will be used to encapsulate LBSE-specific behavior. For example, RenderSVGText / RenderSVGInlineText are almost identical in LBSE downstream, compared to their legacy variants; thus, they are going to share the renderer implementation between the legacy SVG engine and LBSE.

The multi-step procedure was repeated for RenderSVGModelObject (the base class for SVG graphics primitives), RenderSVGShape, RenderSVGRect, and RenderSVGContainer. Core functionality such as laying out children of a container, previously hidden in SVGRenderSupport::layoutChildren() in the legacy SVG engine, now lives in a dedicated class: SVGContainerLayout. Computing the various SVG bounding boxes - object/stroke/decorated bounding box - is precisely specified in SVG2 and got a dedicated implementation as the SVGBoundingBoxComputation class, instead of fragmenting the algorithms all over the SVG render tree as in the legacy SVG engine.

By February 2022, enough functionality was in place to construct the LBSE render tree for basic SVG documents, utilizing nested containers and rectangles as leaves. While this doesn’t sound exciting at all, it provided an ideal environment to implement support for SVG in the RenderLayer-related code - before converting all SVG renderers to LBSE, and before implementing painting in the SVG renderers.

Both RenderLayer and RenderLayerBacking query CSS geometry information such as border box, padding box, or content box from their associated renderer, which is expected to be a RenderBox in many places. This is incorrect for SVG: RenderSVGModelObject inherits from RenderLayerModelObject, but not from RenderBox since it doesn’t adhere to the CSS box model. Various call sites cast the associated renderer to RenderBox to call e.g. borderBoxRect() to retrieve the border box rectangle. There are similar accessors in SVG to query the geometry, but there is no equivalent of a border box or other CSS concepts in SVG. Therefore, we extended RenderSVGModelObject to provide a CSS box model view of an SVG renderer, by offering methods such as borderBoxRectEquivalent() or visualOverflowRectEquivalent() that return geometry information in the same coordinate system using the same conventions as their HTML/CSS counterparts.

We also refactored RenderLayer to use a proxy method - rendererBorderBoxRect() - that provides access to the borderBoxRect() for HTML and the borderBoxRectEquivalent() for SVG renderers, and applied the same fix to RenderLayerBacking. With these fixes in place, support to position and size SVG layers and to compute overflow information could be added, both preconditions for enabling painting.
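
The proxy idea can be sketched in a few lines. This is an illustrative Python model, not WebKit's C++ code: the class names mirror the renderers mentioned above, but the Rect type, the constructors, and the choice to report the SVG rectangle relative to the renderer's own origin are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float


class RenderBox:
    """Stand-in for an HTML/CSS renderer that follows the CSS box model."""

    def __init__(self, width: float, height: float):
        self._width = width
        self._height = height

    def border_box_rect(self) -> Rect:
        # The border box is expressed relative to the box's own origin.
        return Rect(0, 0, self._width, self._height)


class RenderSVGModelObject:
    """Stand-in for an SVG renderer: no CSS box model, only SVG bounding boxes."""

    def __init__(self, object_bounding_box: Rect):
        self._object_bounding_box = object_bounding_box

    def border_box_rect_equivalent(self) -> Rect:
        # Assumption for this sketch: same convention as border_box_rect(),
        # i.e. the size of the SVG geometry placed at the renderer's origin.
        bbox = self._object_bounding_box
        return Rect(0, 0, bbox.width, bbox.height)


def renderer_border_box_rect(renderer) -> Rect:
    """Proxy used by layer code so that it never has to cast to RenderBox."""
    if isinstance(renderer, RenderBox):
        return renderer.border_box_rect()
    if isinstance(renderer, RenderSVGModelObject):
        return renderer.border_box_rect_equivalent()
    raise TypeError(f"unsupported renderer type: {type(renderer).__name__}")
```

The point of the indirection is that layer code asks one question and gets an answer in one convention, regardless of which kind of renderer sits behind the layer.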

By March 2022, LBSE was able to paint basic SVG documents - a major milestone for the bootstrapping process, demonstrating that the layer painting code was functional for SVG. It was time to move on to transformations: implementing RenderSVGTransformableContainer (e.g. <g> elements with a non-identity transform attribute or CSS transform property) and CSS/SVG transform support for all other graphics primitives, utilizing the RenderLayer-based CSS Transform implementation. As preparation, the existing code was reviewed and cleaned up: transform-origin computation was decoupled from CTM computation (CTM = current transformation matrix, see CSS Transforms Module Level 1) and transform-box computations were unified in a single place.

In April 2022, 2D transforms were enabled and became fully functional a few weeks later. Besides missing compositing support upstream, downstream work showed that enabling 3D transforms for SVG required fixing a decade-old bug that made the computed perspective transformation dependent on the choice of transform-origin. That became apparent when testing the layer code with SVG, which uses different default values for certain transform-related CSS properties than HTML does: transform-box: view-box and transform-origin: 0 0 are the relevant defaults for SVG, referring to the top-left corner of the nearest SVG viewport vs. the center of the element in HTML.
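
To see why the different defaults matter, here is a small Python sketch (not WebKit code) that applies the same scale(2) once around the HTML default origin and once around the SVG default origin. The helper name and the 100x100 box are made up for illustration; the formula is the usual "translate to the origin, transform, translate back" composition.

```python
def scale_about_origin(point, factor, origin):
    """Apply scale(factor) with a transform-origin: T(origin) * S * T(-origin)."""
    px, py = point
    ox, oy = origin
    return (ox + factor * (px - ox), oy + factor * (py - oy))


# Hypothetical 100x100 element located at the top-left corner of its viewport.
box_x, box_y, box_width, box_height = 0.0, 0.0, 100.0, 100.0

# HTML default: transform-origin 50% 50%, relative to the element's own box.
html_default_origin = (box_x + box_width / 2, box_y + box_height / 2)

# SVG default: transform-origin 0 0 with transform-box: view-box,
# i.e. the top-left corner of the nearest SVG viewport.
svg_default_origin = (0.0, 0.0)

point = (10.0, 10.0)
html_result = scale_about_origin(point, 2.0, html_default_origin)
svg_result = scale_about_origin(point, 2.0, svg_default_origin)
```

With these inputs the HTML-style origin moves the point to (-30, -30), while the SVG-style origin moves it to (20, 20). Any code path that silently assumes one of the two defaults is therefore likely to misbehave for the other.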

By May 2022, the legacy SVG text rendering code was altered to be usable for LBSE as well. At this point, it made sense to run layout tests using LBSE. Previously most tests were expected to fail, as most either utilize text, paths, or shapes, and sometimes all three together. LBSE render tree text dumps (dumping the parsed render tree structure in a text file) were added for all tests in the LayoutTests/svg subdirectory, as well as a new pixel test baseline (screenshots of the rendering as PNGs), generated using the legacy SVG engine, to verify that LBSE produces pixel-accurate results. All upcoming LBSE patches are expected to change the expected layout test result baseline, and/or the TestExpectations file, depending on the type of patch. This will ease the reviewing process a lot for future patches.

To further proceed, a test-driven approach was used to prioritize the implementation of the missing functionality. At that time, missing viewBox support for outer <svg> elements was causing many broken tests. The effect of the transformation induced by the viewBox attribute, specified on outer <svg> elements, cannot be implemented as an additional CSS transformation applied to the outermost <svg> element, as that would affect the painted dimensions of the SVG document, which are subject to the CSS width/height properties and the size negotiation logic only. The viewBox attribute is supposed to only affect the visual appearance of the descendants, by establishing a new local coordinate system for them. The legacy SVG engine manually handled the viewBox-induced transformation in various places throughout LegacyRenderSVGRoot, to only affect the painting of the descendants and not e.g. the position/dimension of the border surrounding the <svg>, if the CSS border property is specified. In LBSE, transformations are handled on RenderLayer-level and not in the renderers anymore.

By July 2022, after testing different approaches, a proper solution to add viewBox support was upstreamed. The chosen solution makes use of another CSS concept that arises in the context of generated content: “anonymous boxes”. The idea is to wrap the direct descendants of RenderSVGRoot in an anonymous RenderSVGViewportContainer (“anonymous” = no associated DOM element) and apply the viewBox transformation as a regular CSS transformation on the anonymous renderer. With that approach, LBSE is left with just a single, unified viewBox implementation, without error-prone special cases in RenderSVGRoot, unlike the legacy SVG engine which has two disjoint implementations in LegacyRenderSVGViewportContainer and LegacyRenderSVGRoot.
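
For readers unfamiliar with the viewBox-induced transformation, the following Python sketch shows how such a transformation can be derived from the viewBox and the viewport size. It is not taken from WebKit: the function names are hypothetical, and it assumes preserveAspectRatio="xMidYMid meet" (the SVG default), which the text above does not discuss.

```python
def viewbox_transform(viewbox, viewport_width, viewport_height):
    """Return (scale, translate_x, translate_y) mapping viewBox units to viewport units.

    Assumes preserveAspectRatio="xMidYMid meet": uniform scale, content centered.
    """
    min_x, min_y, vb_width, vb_height = viewbox
    if vb_width <= 0 or vb_height <= 0:
        raise ValueError("viewBox width and height must be positive")

    # "meet": pick the smaller scale so the whole viewBox fits into the viewport.
    scale = min(viewport_width / vb_width, viewport_height / vb_height)

    # "xMidYMid": center the scaled viewBox inside the viewport.
    translate_x = -min_x * scale + (viewport_width - vb_width * scale) / 2
    translate_y = -min_y * scale + (viewport_height - vb_height * scale) / 2
    return scale, translate_x, translate_y


def map_point(point, transform):
    """Map a point from viewBox (user) space into viewport space."""
    scale, translate_x, translate_y = transform
    x, y = point
    return (x * scale + translate_x, y * scale + translate_y)


# viewBox="10 10 100 50" inside a 200x200 viewport.
transform = viewbox_transform((10, 10, 100, 50), 200, 200)
top_left = map_point((10, 10), transform)
```

In the design described above, a transformation of this kind would be applied to the anonymous RenderSVGViewportContainer only, so the outer <svg> box (including any CSS border) keeps the size determined by CSS width/height.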

After the summer holidays, in August 2022, the next major milestone was reached: enabling compositing support for arbitrary SVG elements, bringing z-index support, hardware-accelerated compositing and 3D transforms to SVG. This time all lessons learned from the previous LBSE prototypes were taken into account, resulting in a complete compositing implementation, that works in various scenarios: different transform-box / transform-origin combinations, inline SVG enclosed by absolute/relative positioned CSS boxes and many more, all way more polished than in the “final” LBSE prototype.

The aforementioned patch contained a fix for a long-standing bug (“Composited elements appear pixelated when scaled up using transform”), that made composited elements look blurry when scaling up with a CSS transform animation. The so-called “backing scale factor” of the associated GraphicsLayers (see here for details about the role of GraphicsLayer in the compositing system) never changes during the animation. Therefore, the rendered image was scaled up instead of re-rendering the content at the right scale. LBSE now enforces updates of that scale factor, to avoid blurry SVGs. The fix is not activated yet for HTML as that requires more thought - see the previously-linked bug report for details.

With all the new features in place and covered by tests, it was time to finish the remaining SVG renderers: RenderSVGEllipse, RenderSVGPath and RenderSVGViewportContainer (for inner <svg> elements), RenderSVGHiddenContainer, RenderSVGImage, and RenderSVGForeignObject. A proper <foreignObject> implementation was lacking in WebKit for 15+ years, due to the fundamental problem that the layer tree was not aware of the SVG subtree. The LBSE variant of RenderSVGForeignObject looks trivial, yet offers a fully compatible <foreignObject> implementation - for the first time without issues with non-static positioned content as a direct child of <foreignObject> (at least as of a few weeks after it landed).

Returning to the test-driven approach, the next best target to fix was text rendering, which was working but not pixel-perfect. The legacy SVG engine takes into account the transformation from the text element up to the topmost renderer when computing the effective “on-screen” font size used to select a font for drawing/measuring, during layout time. LBSE needed a way to calculate the CTM for a given SVG renderer, up to a given ancestor renderer (or root), taking into account all possible transformation scenarios, including CSS transform, translate, rotate, SVG transform attribute, shifts due to transform-origin, perspective transformations, and much more. The same functionality is required to implement getCTM() / getScreenCTM().

By the end of August 2022, SVGLayerTransformComputation was added, re-using the existing mapLocalToContainer() / TransformState API to obtain the CTM. The CTM construction and ancestor chain walk - to accumulate the final transformation matrix - are performed by mapLocalToContainer() and no longer need a special, incomplete SVG approach: the existing general approach now works for SVG too.
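
The ancestor chain walk can be illustrated with a deliberately simplified Python model. It is not how mapLocalToContainer() / TransformState are implemented: the Renderer class, the 2D-affine-only matrices, and the function names are assumptions for this sketch, and the real code additionally deals with offsets, transform-origin, perspective, and more.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# 2D affine matrix in SVG notation (a, b, c, d, e, f):
#   | a c e |
#   | b d f |
#   | 0 0 1 |
Matrix = Tuple[float, float, float, float, float, float]
IDENTITY: Matrix = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)


def multiply(m1: Matrix, m2: Matrix) -> Matrix:
    """Return m1 * m2, i.e. m2 is applied to a point first, then m1."""
    a1, b1, c1, d1, e1, f1 = m1
    a2, b2, c2, d2, e2, f2 = m2
    return (
        a1 * a2 + c1 * b2,
        b1 * a2 + d1 * b2,
        a1 * c2 + c1 * d2,
        b1 * c2 + d1 * d2,
        a1 * e2 + c1 * f2 + e1,
        b1 * e2 + d1 * f2 + f1,
    )


@dataclass
class Renderer:
    """Hypothetical render tree node with a local transform and a parent link."""
    local_transform: Matrix = IDENTITY
    parent: Optional["Renderer"] = None


def compute_ctm(renderer: Renderer, ancestor: Optional[Renderer] = None) -> Matrix:
    """Accumulate local transforms from `renderer` up to, but excluding, `ancestor`.

    With ancestor=None the walk continues to the root of the tree.
    """
    ctm = IDENTITY
    current: Optional[Renderer] = renderer
    while current is not None and current is not ancestor:
        # Ancestors are applied after descendants, so they multiply from the left.
        ctm = multiply(current.local_transform, ctm)
        current = current.parent
    return ctm


def apply(matrix: Matrix, point: Tuple[float, float]) -> Tuple[float, float]:
    a, b, c, d, e, f = matrix
    x, y = point
    return (a * x + c * y + e, b * x + d * y + f)


# <svg> -> <g transform="translate(10 0)"> -> <rect transform="scale(2)">
root = Renderer()
group = Renderer(local_transform=(1.0, 0.0, 0.0, 1.0, 10.0, 0.0), parent=root)
rect = Renderer(local_transform=(2.0, 0.0, 0.0, 2.0, 0.0, 0.0), parent=group)
```

Here compute_ctm(rect) maps the local point (1, 1) to (12, 2): the point is scaled first and then translated by the group. Passing group as the ancestor stops the walk early and yields the scale alone, which is the kind of "up to a given ancestor" query described above.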

September 2022 was mostly devoted to bug fixes related to pixel-snapping. Outermost <svg> elements were not always enforcing stacking contexts and failed to align to device pixels. All other elements behaved fine with respect to pixel snapping (not applied for SVG elements) unless compositing layers were active. In that case, a RenderLayerBacking code path was used that unconditionally applied pixel-snapping; this is now avoided for SVG.

By October 2022 LBSE could properly display SVGs embedded into HTML host documents via <object> elements – the size negotiation logic failed to take into account the LBSE-specific renderers before. CSS background-image / list-image / HTML <img> / etc. were fixed as well. Zooming and panning support were implemented and improved compared to the legacy engine. Along the way an important bug was fixed, one that other browsers had already fixed back in 2014. The bug caused percentage-sized documents (e.g. width: 100%; height: 100%) that also specify a viewBox to always keep the document size, regardless of the zoom level. Thus, upon zooming, only the stroke width enlarged, but not the boundaries of the document, and thus scrollbars never appeared.

Over the following weeks, text-related issues had to be fixed, which were responsible for a bunch of the remaining test issues. Transformed text did not render, which turned out to be a simple mistake. More tests were upstreamed, related to compositing and transformations. More test coverage revealed that transform changes were not handled consistently – it took a period of investigation to land a proper fix. SVG transform / SMIL <animateMotion> / SMIL <animateTransform> / CSS transform changes are now handled consistently in LBSE, leading to proper repaints, as expected.

Transformation support can be considered complete and properly handled both during the painting and layout phases. Dynamic changes at runtime are correctly triggering invalidations. However, the Web-exposed SVG DOM API that allows querying the transformation matrices of SVG elements, such as getCTM() and getScreenCTM(), was still missing. By November 2022 a complete implementation was upstreamed, that utilized the new SVGLayerTransformComputation class to construct the desired transformation matrices. This way the same internal API is used for painting/layout/hit-testing and implementing the SVG DOM accessors.

By December 2022 LBSE was in good shape: the most important architectural changes were upstreamed and the most basic features were implemented. The year closed with a proposed patch that will avoid re-layout when an element’s transform changes. The legacy SVG engine always needs a re-layout if the transform changes, as the size of each ancestor can depend on the presence of transformations on the child elements – a bad design decision two decades ago that LBSE will resolve. In LBSE, a transform change should only trigger repainting, not layout.

Let’s move on to 2023, and recap what’s still missing in LBSE.

Next steps

Besides fixing all remaining test regressions (see LayoutTests/platform/mac-ventura-wk2-lbse-text/TestExpectations) “SVG resources” are missing in LBSE. That includes all “paint servers” and advanced painting operations: there is no support for linear/radial gradients, no support for patterns, and no support for clipping/masking and filters.

In terms of painting capabilities, LBSE is still in basic shape. However, this was intentional, since a lot of the existing code for SVG resource handling is no longer needed in LBSE. Clipping/masking and filters will be handled via RenderLayer, reusing the existing HTML/CSS implementations. Temporary ImageBuffers are no longer needed for clipping, and thus there is no need to cache the “per client” state in the resource system (e.g. re-using the cached clipping mask for repainting). This will simplify the implementation of the “SVG resources” a lot.

Therefore the first task in 2023 is to implement clipping, then masking, gradients, patterns, and as the last item, filters, since they require a substantial amount of refactoring in RenderLayerFilters. Note that these implementations are already complete in LBSE downstream and do not need to be invented from scratch. The first patches in that direction should be up for review by February 2023.

After all “SVG resources” are implemented in LBSE, feature parity is almost there and performance work will follow afterward. WebKit has a golden rule to never ship a performance regression; therefore, LBSE needs to be at least as fast in the standard performance tests, such as MotionMark, before it can replace the legacy engine. Currently, LBSE is slower than the legacy engine with respect to static rendering performance. Quoting numbers does not help at present, since the problem is well understood and will be resolved in the following months.

LBSE currently creates more RenderLayer objects than necessary: for each renderer, unconditionally. This is a great stress test of the layer system, and helpful for bootstrapping, but the associated overhead and complexity are simply not necessary for many cases, and actively hurt performance. LBSE already outperforms the legacy SVG engine whenever animated content is viewed, if it benefits from the hardware acceleration in LBSE.

2023 will be an exciting year, and will hopefully bring LBSE to the masses. Stay tuned!

Demos

“A picture is worth a thousand words”, so we’d like to share with you the videos shown during the WebKit contributors meeting in 2022 that demo the LBSE capabilities. Be sure to check them out so you can get a good picture of the state of the work. Enjoy!

  1. Accelerated 2D transforms (Tiger)

  2. Accelerated 3D transform (Tiger)

  3. Transition storm (Tiger)

  4. Vibrant example

Final thoughts

We at Igalia are doing our best to fulfill the mission and complete the LBSE upstreaming as fast as possible. In the meantime, let us know your thoughts:

  • What would you do with a performant, next-level SVG engine?
  • Any particular desktop / embedded project that would benefit from it?
  • Anything in reach now, that seemed impossible before with the given constraints in WebKit?

Thanks for your attention! Be sure to keep an eye on our “Upstreaming status” page at GitHub to follow LBSE development.

January 19, 2023 12:00 AM

January 17, 2023

Manuel Rego: 10 years ago

Igalia WebKit

Once upon a time…

10 years ago I landed my first WebKit patch 🎂:

Bug 107275 - [GTK] Implement LayoutTestController::addUserScript

That was my first patch to a web engine related project, and also my first professional C++ patch. My previous work at Igalia was around web applications development with PHP and Java. At that time, 2013, Igalia had already been involved in web rendering engines for several years, and the work around web applications was fading out inside the company, so moving to the other side of the platform looked like a good idea. 10 years have passed and lots of great things have happened.

Since that first patch many more have come, mostly in Chromium/Blink and WebKit; but also in WPT, CSS, HTML and other specs; and just a few, sadly, in Gecko (I’d love to find the opportunity to contribute there more at some point). I’ve become a committer and reviewer in both the Chromium and WebKit projects. I’m also a member of the CSS Working Group (though not very active lately) and a Blink API owner.

During these years I have had the chance to attend several events, speak at a few conferences, and give some interviews. I’ve also been helping to organize the Web Engines Hackfest since 2014, as well as a CSSWG face-to-face meeting at Igalia headquarters in A Coruña in January 2020 (I have such great memories of the last dinner).

These years have allowed me to meet lots of wonderful people from whom I’ve learnt, and still learn every day, many things. I have the pleasure to work with amazing folks on a daily basis in the open. Every new feature or project I work on is again a new learning experience, and that’s something I really like about this kind of work.

In this period I’ve seen Igalia grow a lot in the web platform community, to the point that these days we’re the top world consultancy on the web platform, with an important position in several projects and standards bodies. We’re getting fairly well known in this ecosystem, and we’re very proud of having achieved that.

Looking back, I’m so grateful for the opportunity given by Igalia. Thanks to the amazing colleagues and the community and how they have helped me to contribute work on the different projects. Also thanks to the different customers that have allowed me to work upstream and develop my career around web browsers. 🙏 These days the source code I write (together with many others) is being used daily by millions of people, that’s totally unbelievable if you stop for a minute to think about it.

Looking forward to the next decade of involvement in the web platform. See you all on the web! 🚀

January 17, 2023 11:00 PM

Amanda Falke: Tutorial: Building WPE WebKit for Raspberry Pi 3

Igalia WebKit

A lightning guide for building WPE WebKit with buildroot

This tutorial is for getting “up and running” with WPE WebKit on a Raspberry Pi 3, using a laptop/desktop with Fedora Linux installed. WPE WebKit has many benefits; you may read here about why WPE WebKit is a great choice for embedded devices. WPE WebKit has minimal dependencies and it displays high-quality animations, WebGL and videos on embedded devices.

WebPlatformForEmbedded is our focus; for this tutorial, we’ll be building WPE WebKit with buildroot to build our image, so, make sure to clone the buildroot repository.

You will need:

Raspberry pi 3 items:

  • A raspberry pi 3.
  • A microSD card for the pi. (I usually choose 32GB microSD cards).
  • An external monitor, extra mouse, and extra keyboard for our rpi3, separately from the laptop.
  • An HDMI cable to connect the rpi3 to its own external monitor.

Laptop items:

  • Linux laptop/desktop.
    • This tutorial will be based on Fedora, but you can use Debian or any distro of your choice.
  • You also need a way to interface the microSD card with your laptop. You can get an SD Adapter for laptops that have an SD Card adapter slot, or you can use an external SD Card adapter interface for your computer.
    • This tutorial will be based on having a laptop with an SD card slot, and hence an SD Card Adapter will work just fine.

Items for laptop to communicate with rpi3:

  • An ethernet cable to connect the rpi3 to your laptop.
  • You need some way to get ethernet into your laptop. This is either in the form of an ethernet port on your laptop (not likely), or an adapter of some sort (likely a USB adapter).

Steps: High level overview

This is a high level overview of the steps we will be taking.

  1. Partition the blank SD card.
  2. In the buildroot repository, make <desired_config>.
  3. Run the buildroot menuconfig with make menuconfig to set up .config file.
  4. Run make to build sdcard.img in buildroot/output dir; change .config settings as needed.
  5. Write sdcard.img file to the SD card.
  6. Connect the rpi3 to its own external monitor, and its own mouse and keyboard.
  7. Connect the rpi3 to the laptop using ethernet cable.
  8. Put the SD card into the rpi3.
  9. Set up a shared ethernet connection between the laptop and rpi to get the IP address of rpi.
  10. ssh into the rpi and start WPE WebKit.

Steps: Detailed overview/sequence

1. Partition the blank SD card using fdisk. Create a boot partition and a root partition.

  • Note: this is only needed in case the output image format is root.tar. If it’s sdcard.img, the image is written directly to the SD card, since it is already partitioned internally.
  • If you’re unfamiliar with fdisk, this is a good tutorial.

2. In the buildroot repository root directory, make the desired config.

  • Make sure to clone the buildroot repository.
  • Since things change a lot over time, it’s important to note the specific buildroot commit this tutorial was built on, and that this tutorial was built on January 12th, 2023. It is recommended to build from that commit for consistency to ensure that the tutorial works for you.
  • We are building the cog 2.28 wpe with buildroot. See the build options from the buildroot repository.
  • Run make list-defconfigs to get a list of configurations.
  • Copy raspberrypi3_wpe_2_28_cog_defconfig and run it: make raspberrypi3_wpe_2_28_cog_defconfig.
  • You will quickly get output which indicates that a .config file has been written in the root directory of the buildroot repository.

3. Run the buildroot menuconfig with make menuconfig to set up .config file.

  • Run make menuconfig. You’ll see options here for configuration. Go slowly and be careful.
  • Change these settings. Help menus are available for menuconfig, you’ll see them displayed on the screen.
Operation in menuconfig | Location | Value
ENABLE | Target packages -> Filesystem and flash utilities | dosfstools
ENABLE | Target packages -> Filesystem and flash utilities | mtools
ENABLE | Filesystem images | ext2/3/4 root filesystem
SET VALUE | Filesystem images -> ext2/3/4 root filesystem -> ext2/3/4 variant | ext4
DISABLE | Filesystem images | initial RAM filesystem linked into linux kernel

4. Run make to build sdcard.img in buildroot/output dir; change .config settings as needed.

  • Run make. Then get a coffee as the build and cross-compilation will take a while.

  • In reality, you may encounter some errors along the way, as cross-compilation can be an intricate matter. This tutorial will guide you through those potential errors.
  • When you encounter errors, you’ll follow a “loop” of sorts:

    Run make -> encounter errors -> manually edit .config file -> remove buildroot/output dir -> run make again until sdcard.img is built successfully.

  • If you encounter CMake errors, such as fatal error: stdlib.h: No such file or directory, compilation terminated, and you have a relatively new version of CMake on your system, the reason for the error may be that buildroot is using your local CMake instead of the one specified in the buildroot configuration.
    • We will fix this error by setting in .config file: BR2_FORCE_HOST_BUILD=y. Then remove buildroot/output dir, and run make again.
  • If you encounter an error such as path/to/buildroot/output/host/lib/gcc/arm-buildroot-linux-gnueabihf/9.2.0/plugin/include/builtins.h:23:10: fatal error: mpc.h: No such file or directory:
    • We can fix this error by editing the Makefile at ./output/build/linux-rpi-5.10.y/scripts/gcc-plugins/Makefile, adding -I path/to/buildroot/output/host/include to the plugin_cxxflags stanza. Then, as usual, remove buildroot/output dir, and run make again.

5. Write sdcard.img file to the SD card.

  • At this point after the make process, we should have sdcard.img file in buildroot/output/images directory.
  • Write this file to the SD card.
  • Consider using Etcher to do so.

6. Connect the rpi3 to its own external monitor, and its own mouse and keyboard.

  • We’ll have separate monitor, mouse and keyboard all connected to the raspberry pi so that we can use it independently from the laptop.

7. Connect the rpi3 to the laptop using ethernet cable.

8. Put the SD card into the rpi3.

9. Set up a shared ethernet connection between the laptop and rpi to get the IP address of rpi.

In general, one of the main problems for connecting via ssh to the Raspberry Pi is to know the IP address of the device. This is very simple with Raspbian OS; simply turn on the raspberry pi and edit configurations to enable ssh, often over wifi.

This is where the ethernet capabilities of the raspberry pi come in.

Goal: To find syslog message DHCPACK acknowledgement and assignment of the IP address after setting up shared connection between raspberry pi and the laptop.

Throughout this process, continually look at logs. Eventually we will see a message DHCPACK which will likely be preceded by several DHCP handshake related messages such as DHCP DISCOVER, REQUEST etc. The DHCPACK message will contain the IP address of the ethernet device, and we will then be able to ssh into it.

    1. Tail the syslogs of the laptop. On Debian distributions, this is often /var/log/syslog. Since we are using Fedora, we’ll be using systemd's journald with the journalctl command:
      • sudo journalctl -f
      • Keep this open in a terminal window.
      • You can also come up with a better solution like grepping logs, if you like, or piping output of stdout elsewhere.
    2. In a second terminal window, open up the NetworkManager.
      • Become familiar with existing devices prior to powering on the raspberry pi, by running nmcli.
    3. Power on the raspberry pi. Watch your system logs.
      • Syslogs will detail the raspberry pi’s name.
    4. Look for that name in NetworkManager nmcli device.
      • Using NetworkManager nmcli, set up shared connection for the ethernet device.
      • Setting up a shared connection is as simple as nmcli connection add type ethernet ifname $ETHERNET_DEVICE_NAME ipv4.method shared con-name local
      • This is a good tutorial for setting up a shared connection with NetworkManager.
    5. Once the connection is shared, syslogs will show a DHCPACK message acknowledging the ethernet device and its IP address. (You may need to power cycle the rpi to see this message, but it will happen).
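
If you would rather not scan the logs by eye, the lookup can be scripted. Below is a minimal Python sketch that pulls the IP address out of a DHCPACK log line. It assumes a dnsmasq-style message of the form DHCPACK(<interface>) <ip> <mac> <hostname>; the sample line, interface name, addresses and hostname are invented for illustration, and the messages on your system may look different.

```python
import re
from typing import Optional

# Matches "DHCPACK", an optional "(interface)" suffix, then the first IPv4 address.
DHCPACK_PATTERN = re.compile(r"DHCPACK\S*\s+(\d{1,3}(?:\.\d{1,3}){3})\b")


def extract_dhcpack_ip(log_line: str) -> Optional[str]:
    """Return the IPv4 address from a DHCPACK log line, or None if there is none."""
    match = DHCPACK_PATTERN.search(log_line)
    return match.group(1) if match else None


# Invented sample line; real log output will differ in its details.
sample_line = (
    "Jan 12 10:15:42 laptop dnsmasq-dhcp[4242]: "
    "DHCPACK(enp0s31f6) 10.42.0.57 b8:27:eb:00:00:00 buildroot"
)
rpi_ip = extract_dhcpack_ip(sample_line)
```

For the sample line this yields 10.42.0.57, which is the address you would then use for ssh in the next step.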

10. ssh into the rpi and start WPE WebKit.

  • Now that we have the IP address of the raspberry pi, we can ssh into it from the laptop: ssh root@<RPI3_IP_ADDRESS>. (The default password is ‘root’. You can also add your user public key to /root/.ssh/authorized_keys on the pi. You can simplify this process by creating an overlay/root/.ssh/authorized_keys on your computer and by specifying the path to the overlay directory in the BR2_ROOTFS_OVERLAY config variable. That will copy everything in the overlay dir to the image.)
  • After that, export these env variables WPE_BCMRPI_TOUCH=1 and WPE_BCMRPI_CURSOR=1 to enable keyboard and mouse control.
    • Why: Recall that generally WPE WebKit is for embedded devices, such as kiosks, or set-top boxes requiring control with a remote control or similar device or touch interaction. We are exporting these environment variables so that we can “test” WPE WebKit with our separate mouse and keyboard for our raspberry pi without the need for a touch screen or special hardware targets, or a Wayland compositor such as weston. If this piques your curiosity, please see the WPE WebKit FAQ on Wayland.
  • Start WPE WebKit with cog: cog "http://www.igalia.com/"
  • A browser will launch in the external monitor connected to the raspberry pi 3, and we can control the browser with the raspberry pi’s mouse and keyboard!

That’s all for now. Feel free to reach out in the support channels for WPE on Matrix.

WPE’s Frequently Asked Questions

By Amanda Falke at January 17, 2023 08:03 AM