June 27, 2016

Improved Font Loading

WebKit Blog

Web fonts are becoming increasingly popular all over the web. They allow a site author to ship specific fonts along with the rest of a website’s assets. Browsers download the fonts with the rest of the site and use them once the download finishes. Unfortunately, not all web fonts download instantaneously, which means there is some time during the load when the browser must show the webpage without the font. Many web fonts are quite large, causing a long delay until the browser can use them. WebKit has mitigated this in two main areas: improving font download policies, and making fonts download faster.

Default Font Loading Policy

The browser should show a web page in the best way possible even when a font is unusable because its download is still in progress. WebKit will always end up using the downloaded font once it is available; however, some fonts can take quite some time to finish downloading. For these fonts, WebKit shows elements with invisible text during the delay. Older versions of WebKit would continue to show this invisible text until the font finished downloading. Newer versions of WebKit instead show the invisible text for a maximum of 3 seconds, at which point they switch to a local font chosen from the element’s style, before finally switching to the downloaded font when the download completes.

This new policy minimizes a flash of fallback text for fast-downloading fonts without making long-downloading fonts illegible indefinitely. The best part is that websites don’t have to do anything to adopt this new policy; all font loads will use it by default.

CSS Font Loading API

Different web authors may want different policies for headline fonts or for body fonts. With the CSS Font Loading API, new in WebKit, web authors have complete control over their own font loading policy.

This API exposes two main pieces to JavaScript: the FontFace interface and the FontFaceSet interface. FontFace represents the same concept as a CSS @font-face rule, so it contains information about a font like its url, weight, unicode-range, family, etc. The document’s FontFaceSet contains all the fonts the webpage can use for rendering. Each document has its own FontFaceSet, accessible via document.fonts.

Using these objects to implement a particular font loading policy is straightforward. Once a FontFace has been constructed, the load() method initiates asynchronous downloading of the font data. It returns a Promise that is fulfilled when the download succeeds or rejected when the download fails. By chaining a handler to the returned promise, JavaScript can take action when the download finishes. Therefore, a website could implement a policy of using a fallback font immediately with no timeout by using the following code:

<div id="target" style="font-family: MyFallbackFont;">hamburgefonstiv</div>

let fontFace = new FontFace("MyWebFont", "url('MyWebFont.woff2') format('woff2'), url('MyWebFont.woff') format('woff')");
fontFace.load().then(function (loadedFontFace) {
    document.fonts.add(loadedFontFace);
    document.getElementById("target").style.fontFamily = "MyWebFont";
});


In this example, the element is styled declaratively to use MyFallbackFont. Then, JavaScript runs and starts requesting MyWebFont. If the font download is successful, the newly downloaded font is added to the document’s FontFaceSet and the element’s style is modified to use this new font.
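The same objects can express other policies too. As a sketch, a site could race the font load against a timeout of its own choosing, keeping the fallback font if the download takes too long. The withTimeout helper and the 500 ms budget are illustrative choices, not part of the CSS Font Loading API:

```javascript
// Resolve with the promise's value, or reject once `ms` milliseconds pass.
// This helper is plain Promise code, not part of the CSS Font Loading API.
function withTimeout(promise, ms) {
    const timeout = new Promise(function (resolve, reject) {
        setTimeout(function () { reject(new Error("timed out")); }, ms);
    });
    return Promise.race([promise, timeout]);
}

// In a browser, this applies the web font only if it loads within 500 ms:
// let fontFace = new FontFace("MyWebFont", "url('MyWebFont.woff2') format('woff2')");
// withTimeout(fontFace.load(), 500).then(function (loadedFontFace) {
//     document.fonts.add(loadedFontFace);
//     document.getElementById("target").style.fontFamily = "MyWebFont";
// }).catch(function () {
//     // Keep showing MyFallbackFont.
// });
```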

WOFF 2

No matter which font loading policy is used, the user’s experience will be best when fonts download quickly. WebKit on macOS Sierra, iOS 10, and GTK supports a new, smaller font file format named “WOFF 2.” The fonts are smaller because they use advanced Brotli compression, which means that WOFF 2 fonts can download significantly faster than the same font represented with other common formats like WOFF, OpenType, and TrueType. WebKit also recognizes the format("woff2") identifier in the src property inside the CSS @font-face rule. Because not all browsers support WOFF 2 yet, it is possible to elegantly fall back to another format if the browser doesn’t support WOFF 2, like so:

@font-face {
    font-family: MyWebFont;
    src: url("MyWebFont.woff2") format("woff2"),
         url("MyWebFont.woff") format("woff");
}


In this example, browsers will use the format CSS function to selectively download whichever fonts they support. Because browsers will prefer fonts appearing earlier in the list over fonts appearing later, browsers encountering this code will always prefer the WOFF 2 font over the WOFF font if they support it.

unicode-range

Bandwidth is expensive (especially to mobile devices), so minimizing or eliminating font downloads is incredibly valuable. WebKit will now honor the unicode-range CSS property for the purposes of font downloading. Consider the following content:

@font-face {
    font-family: MyWebFont;
    src: url("MyWebFont-extended.woff2") format("woff2"),
         url("MyWebFont-extended.woff") format("woff");
    unicode-range: U+110-24F;
}

@font-face {
    font-family: MyWebFont;
    src: url("MyWebFont.woff2") format("woff2"),
         url("MyWebFont.woff") format("woff");
    unicode-range: U+0-FF;
}

<div style="font-family: MyWebFont;">hamburgefonstiv</div>


In this content, none of the characters styled with MyWebFont fall within the unicode-range of the first CSS @font-face rule, which means that there is no reason to download it at all! It’s important to remember, however, that styling any character with a web font will always cause the last matching @font-face to be downloaded in order to correctly size the containing element according to the metrics of the font. Therefore, the most-common CSS @font-face rule should be listed last. Using unicode-range to partition fonts can eliminate font downloads entirely, thereby saving your users’ bandwidth and battery.
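The download-avoidance logic can be sketched in plain JavaScript. The data below mirrors the two @font-face rules above; the helper name is invented, and this simplification ignores the metrics-driven download of the last matching rule:

```javascript
// Each entry mirrors one @font-face rule: a font file and its unicode-range.
const faces = [
    { url: "MyWebFont-extended.woff2", range: [0x110, 0x24F] },
    { url: "MyWebFont.woff2", range: [0x00, 0xFF] },
];

// Return the files whose range covers at least one styled character;
// only those would need to be downloaded at all.
function fontsNeededFor(text) {
    const codePoints = Array.from(text, function (c) { return c.codePointAt(0); });
    return faces.filter(function (face) {
        return codePoints.some(function (cp) {
            return cp >= face.range[0] && cp <= face.range[1];
        });
    }).map(function (face) { return face.url; });
}

// "hamburgefonstiv" is entirely ASCII, so only the basic font is needed:
// fontsNeededFor("hamburgefonstiv") → ["MyWebFont.woff2"]
```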

Using these techniques can dramatically improve users’ experiences on your website when using web fonts. For more information, contact me at @Litherum, or Jonathan Davis, Apple’s Web Technologies Evangelist, at @jonathandavis or web-evangelist@apple.com.

June 22, 2016

Release Notes for Safari Technology Preview Release 7

WebKit Blog

Safari Technology Preview Release 7 is now available for download. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. Release 7 of Safari Technology Preview covers WebKit revisions 201541–202085.

JavaScript

• Implemented options argument to addEventListener (r201735, r201757)
• Updated JSON.stringify to correctly transform numeric array indices (r201674)
• Improved the performance of Encode operations (r201756)
• Addressed issues with Date setters for years outside of 1900-2100 (r201586)
• Fixed an issue where reusing a function name as a parameter name threw a syntax error (r201892)
• Added the error argument for window.onerror event handlers (r202023)
• Improved performance for accessing dictionary properties (r201562)
• Updated Proxy.ownKeys to match recent changes to the spec (r201672)
• Prevented RegExp unicode parsing from reading an extra character before failing (r201714)
• Updated SVGs to report their memory cost to the JavaScript garbage collector (r201561)
• Improved the sampling profiler to protect itself against certain forms of sampling bias that arise due to the sampling interval being in sync with some other system process (r202021)
• Fixed global lexical environment variables scope for functions created using the Function constructor (r201628)
• Fixed parsing super when the default parameter is an arrow function (r202074)
• Added support for trailing commas in function parameters and arguments (r201725)
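The options argument from the first bullet above lets an object stand in for the old boolean useCapture parameter. A minimal sketch, using a bare EventTarget rather than a DOM element:

```javascript
// { once: true } removes the listener automatically after its first call;
// { capture: ... } and { passive: ... } are the other recognized options.
const target = new EventTarget();
let calls = 0;

target.addEventListener("ping", function () { calls += 1; }, { once: true });

target.dispatchEvent(new Event("ping"));
target.dispatchEvent(new Event("ping"));
// calls is 1: the listener ran only once.
```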

CSS

• Added the unprefixed version of the pseudo element ::placeholder (r202066)
• Fixed a crash when computing the style of a grid with only absolute-positioned children (r201919)
• Fixed computing a grid container’s height by accounting for the horizontal scrollbar (r201709)
• Fixed placing positioned items on the implicit grid (r201545)
• Fixed rendering for the text-decoration-style values: dashed and dotted (r201777)
• Fixed support for using border-radius and backdrop-filter properties together (r201785)
• Fixed clipping for border-radius with different width and height (r201868)
• Fixed CSS reflections for elements with WebGL (r201639)
• Fixed CSS reflections for elements with a backdrop-filter property (r201648)
• Improved the Document’s font selection lifetime in preparation for the CSS Font Loading API (r201799)
• Improved memory management for CSS value parsing (r201608)
• Improved font face rule handling for style change calculations (r201971, r202085)
• Fixed multiple selector rule behavior for keyframe animations (r201818)
• Fixed applying CSS variables correctly for writing-mode properties (r201875)
• Added experimental support for spring() based CSS animations (r201759)
• Changed the initial value of background-color to transparent per specs (r201666)

Web APIs

• Changed CanvasRenderingContext2D.createPattern() and CanvasRenderingContext2D.putImageData() to throw the correct exception type and align with the specification (r201664)
• Fixed a number of issues with Web Workers (r201876, r201970, r201918, r201926, r201791, r201898, r201925, r201808)

Web Inspector

• Added ⌘T keyboard shortcut to open the New Tab tab (r201692, r201762)
• Added the ability to show and hide columns in data grid tables (r202009, r202081)
• Fixed an error when trying to delete nodes with children (r201843)
• Added a Top Functions view for Call Trees in the JavaScript & Events timeline (r202010, r202055)
• Added gaps to the overview and category graphs in the Memory timeline where discontinuities exist in the recording (r201686)
• Improved the performance of DOM tree views (r201840, r201833)
• Fixed filtering to apply to new records added to the data grid (r202011)
• Improved snapshot comparisons to always compare the later snapshot to the earlier snapshot no matter what order they were selected (r201949)
• Improved performance when processing many DOM.attributeModified messages (r201778)
• Fixed the 60fps guideline for the Rendering Frames timeline when switching timeline modes (r201937)
• Included the exception stack when showing internal errors in Web Inspector (r202025)
• Added ⌘P keyboard shortcut for quick open (r201891)
• Removed Text → Content subsection from the Visual Styles Sidebar when not necessary (r202073)
• Show <template> content that should not be hidden as Shadow Content (r201965)
• Fixed elements in the Elements tab losing focus when selected by the up or down key (r201890)
• Enabled combining diacritic marks in input fields in Web Inspector (r201592)

Media

• Prevented double-painting the outline of a replaced video element (r201752)
• Properly prevented video.play() for video.src="file" with audio user gesture restrictions in place (r201841)
• Prevented showing the caption menu if the video has no selectable text or audio tracks (r201883)
• Improved performance of HTMLMediaElement.prototype.canPlayType that was accounting for 250–750ms first loading theverge.com (r201831)
• Fixed inline media controls to show PiP and fullscreen buttons (r202075)

Rendering

• Fixed a repaint issue with vertical text in an out-of-flow container (r201635)
• Show text in a placeholder font while downloading the specified font (r201676)
• Fixed rendering an SVG in the correct vertical position when no vertical padding is applied, and in the correct horizontal position when no horizontal padding is applied (r201604)
• Fixed blending of inline SVG elements with transparency layers (r202022)
• Fixed display of hairline borders on 3x displays (r201907)
• Prevented flickering and rendering artifacts when resizing the web view (r202037)
• Fixed logic to trigger new layout after changing canvas height immediately after page load (r201889)

Bug Fixes

• Fixed an issue where Find on Page would show too many matches (r201701)
• Exposed static text if form label text only contains static text (r202063)
• Added Origin header for CORS requests on preloaded cross-origin resources (r201930)
• Added support for the upgrade-insecure-requests (UIR) directive of Content Security Policy (r201679, r201753)
• Added proper element focus and caret destination for keyboard users activating a fragment URL (r201832)
• Increased disk cache capacity when there is lots of free space (r201857)
• Prevented hangs during synchronous XHR requests if a network session doesn’t exist (r201593)
• Fixed the response for a POST request on a blob resource to return a “network error” instead of HTTP 500 response (r201557)
• Restricted HTTP/0.9 responses to default ports and cancelled HTTP/0.9 resource loads if the document was loaded with another HTTP protocol (r201895)
• Fixed parsing URLs containing tabs or newlines (r201740)
• Fixed cookie validation in private browsing (r201967)
• Provided memory cache support for the Vary header (r201800, r201805)

June 15, 2016

Introducing JSC’s New Sampling Profiler

WebKit Blog

JavaScriptCore (JSC) has a new sampling profiler that powers Web Inspector’s JavaScript & Events timeline. The sampling profiler replaces JSC’s old tracing profiler. It provides more accurate data about where time is spent in the executing program and is also an order of magnitude faster. Tracing profilers work by inserting instrumentation into the running program. Sampling profilers, on the other hand, work by pausing the executing program at regular intervals and collecting data about the program’s execution state.

When recording a timeline in Web Inspector today, you will experience a 30x speed improvement over the old timeline. This speedup comes from two fundamental architecture changes made by JSC and Web Inspector when recording a timeline. The primary speedup comes from replacing the tracing profiler with the sampling profiler. Further speedup comes from removing debugging instrumentation from the executing program. JSC inserts debugging instrumentation into the executing program to support the debugger when Web Inspector is open; this instrumentation allows JSC to detect when a breakpoint has been hit. However, the debugging instrumentation is not needed when recording a timeline, and by removing it, JSC executes JavaScript 2x faster.

Profiling Methods

Tracing profilers are often less accurate for doing performance analysis because the high overhead of inserting instrumentation into the program changes the distribution of where time is spent in the executing program. For example, in JSC’s tracing profiler, each call site is decorated with a bytecode instruction both before and after the call to time how long each call takes. Tracing profilers are good for other forms of analysis though. For example, they can be used to construct a lossless dynamic call graph.

Sampling profilers operate by periodically sampling the target thread and collecting interesting data about its execution state. Often, the data collected will be a stack trace of the thread while it is paused. In JSC, the sampling profiler works without inserting any instrumentation into the running program. This is crucial in keeping the overhead of the sampling profiler low. A sampling profiler’s collected data set will always be lossy because it samples the executing thread at a fixed interval, so it can never reconstruct a lossless call graph of the executing program. That said, sampling profilers are more accurate than tracing profilers at determining where time is spent in the executing program. There are two main reasons for this. First, sampling profilers are low overhead, which makes them good at observing the program in its natural state. Second, if a certain part of the program is hot, there is a high probability that the sampling profiler will take a sample while the hot portion of the program is executing. Put another way, the probability of not sampling the hot portions of the program is very low.
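That last claim is easy to quantify. Assuming independent samples (an idealization), the chance that n samples all miss a region that is live a fraction p of the time is (1 − p)^n:

```javascript
// Probability that every one of `n` independent samples misses a region
// that the program is executing a fraction `p` of the time.
function missProbability(p, n) {
    return Math.pow(1 - p, n);
}

// Even a region hot only 1% of the time is almost surely observed within
// 1000 samples: missProbability(0.01, 1000) is about 4.3e-5.
```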

It’s imperative that a sampling profiler contain as little bias as possible as to where in the program’s execution it samples. If there is bias in where samples are taken, the data set will not be representative of the natural execution state of the program. For example, some profilers are implemented using safepoints. Safepoints are specific places in the program where the executing thread yields its execution to allow other tasks to run. Often, safepoints are used for GC and provide strict guarantees about the state of the executing thread when it enters the safepoint. One way to implement a sampling profiler in JSC would be to use a safepoint-based mechanism: JSC would compile a safepoint at each function prologue and at each loop header. As Mytkowicz, Diwan, Hauswirth, and Sweeney show, safepoint-based profilers are not accurate; they introduce too much bias into the measured data set. In JSC, a safepoint-based implementation would also suffer from being biased. Crucially, it would prevent the profiler from knowing when the executing code enters the C runtime. For some programs, much of the program’s time is spent in JavaScript functions that call into the C runtime, and it’s necessary to attribute the time spent in the C runtime to the JavaScript frame that called into it. For JSC to use a safepoint-based implementation, the C runtime would need to learn how to interface with the profiler, which would be both error-prone and cumbersome. Safepoint-based profilers are continually playing a cat-and-mouse game to minimize bias: once a part of the engine that has bias is exposed, the profiler must remove that bias by teaching that part of the engine about the profiler.

JSC does not use a safepoint-based profiler because of these downfalls. The design JSC implemented is simpler and cleaner because instead of teaching different parts of the engine about the profiler, JSC taught the profiler about the different parts of the engine. This turns out to be much simpler than using safepoints because teaching the profiler about the engine requires very little work. JSC’s profiler works by using a background sampling thread that wakes up at a given frequency, pauses the JSC execution thread, and takes a conservative stack trace. This design also naturally leads to a profiler that is both more accurate and has lower overhead than a profiler implemented using safepoints.

Implementing JSC’s Sampling Profiler

JSC’s sampling profiler must take a conservative stack trace. The sampling thread doesn’t know beforehand what state the executing thread will be in when it gets paused (JSC doesn’t use safepoints, after all). The executing thread may be in a state where it’s not possible to take a stack trace. If the sampling thread encounters such a state, it will go back to sleep and try again on the next sample. This can be cause for concern: not being able to take a stack trace at certain points in the program can lead to bias in the data set if the places where a stack trace can’t be taken are where the program spends a lot of its time. However, the places where this is a problem are limited. They happen at very specific regions of machine code that JSC generates, most commonly at certain instruction sequences in JSC’s calling convention code. Because JSC is in control of the code it generates, it’s able to limit the effect of such biases. When JSC engineers find such biases, they can minimize their impact by structuring the machine code so that the regions where taking a stack trace is not possible are both fewer and smaller.

Because the sampling profiler is cleanly separated from other parts of the VM, its implementation is quite simple. The rest of this section will investigate the more interesting implementation details in JSC’s sampling profiler. To explore these details, let’s first analyze a high-level pseudocode implementation of JSC’s sampling profiler algorithm:

void SamplingProfiler::startSampling()
{
    VM& vm = getCurrentVM();
    MachineThread* jscThread = getJSCExecutionThread();

    while (true) {
        std::this_thread::sleep_for(std::chrono::microseconds(1000));

        if (vm.isIdle())
            continue;

        // Note that the sampling thread doesn't control the state in
        // which the execution thread pauses. This means it can be holding
        // arbitrary locks such as the malloc lock when it gets paused.
        // Therefore, the sampling thread can't malloc until the execution
        // thread is resumed or the sampling thread may deadlock.
        jscThread->pause();

        // Get interesting register values from the paused execution thread.
        void* machinePC = jscThread->PC();
        void* machineFramePointer = jscThread->framePointer();
        // JSC designates a machine register to hold the bytecode PC when
        // executing interpreter code. This register is only used when the
        // sampling thread pauses the top frame inside the interpreter.
        void* interpreterPC = jscThread->interpreterPC();

        void* framePointer = machineFramePointer;
        if (!isMachinePCInsideJITCode(machinePC)
            && !isMachinePCInsideTheInterpreter(machinePC)) {
            // When JSC's JIT code calls back into the C runtime, it will
            // store the frame pointer for the current JavaScript frame upon
            // entry into the runtime. This is needed for many reasons.
            // Because JSC does this, the sampling profiler can use that frame
            // as the top frame in a stack trace.
            framePointer = vm.topCallFrame;
        }

        bool success =
            takeConservativeStackTrace(framePointer, machinePC, interpreterPC);

        // The execution thread must be resumed even when taking the stack
        // trace failed; otherwise it would stay paused forever.
        jscThread->resume();

        if (!success)
            continue;

        // The sampling thread can now malloc and do interesting things with
        // other locks again.
        completeStackTrace();
    }
}


The actual implementation is only slightly more complicated than the above pseudocode. Because the sampling thread doesn’t control the state in which the JSC execution thread gets paused, the sampling thread can’t assume anything about which locks the execution thread holds. This means that if the sampling thread wants to guarantee that the execution thread doesn’t hold a particular lock before it gets paused, the sampling thread must acquire the lock prior to pausing the execution thread. There are other locks, though, that the sampling thread doesn’t want to acquire and that it makes sure aren’t acquired transitively while the execution thread is paused. The most interesting lock in this category is the malloc lock. The conservative stack trace that the sampling thread takes must not malloc any memory. If the sampling thread were to malloc memory, it would cause a deadlock if the execution thread were holding the malloc lock while it was paused. To prevent mallocing any memory, the sampling thread pre-allocates a buffer for the conservative stack trace to place its data. If it runs out of space in that buffer, it will end its stack trace early and it will grow its buffer after it resumes the execution thread.

The sampling profiler must be aware of where the machine’s program counter (PC) is when the execution thread is paused. There are four states that the sampling thread cares about. The first state is when the VM is idle; when encountering this state, the sampling thread just goes back to sleep. The last three are based on the machine’s PC when the VM is not idle. The PC can be inside JIT code, inside interpreter code, or neither. The sampling thread takes that last state to mean that the PC is inside the C runtime. The sampling thread is able to easily determine when the execution thread is inside the C runtime and attribute that time to the JavaScript caller into the C runtime. The sampling thread can do this by reading a field off the VM class which holds the frame pointer for the JavaScript frame that called into the C runtime. This prevents JSC’s sampling profiler implementation from suffering from one of the biggest downsides of the hypothetical safepoint-based implementation. The reason that the sampling thread must distinguish between the interpreter and JIT code is that the interpreter uses a virtual PC to hold the current bytecode instruction that’s being executed. The sampling thread uses the interpreter’s virtual PC to determine where in the function the program is executing. The machine PC does not hold enough information to do that inside an interpreter. When the machine PC is inside JIT code, JSC uses a mapping from PC to what JSC calls CodeOrigin. A CodeOrigin is used to determine the line and column number of the current operation that the program is executing.
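The PC-to-CodeOrigin mapping for JIT code can be pictured as an interval lookup. The sketch below is purely illustrative; the addresses, field names, and helper function are all invented:

```javascript
// Hypothetical sorted table of JIT code ranges and the source position
// (CodeOrigin) of the operation each range implements.
const codeOrigins = [
    { start: 0x1000, end: 0x1040, line: 3, column: 9 },
    { start: 0x1040, end: 0x10a0, line: 4, column: 1 },
];

// Resolve a sampled machine PC to a line/column, or null when the PC is
// outside JIT code (e.g. in the C runtime or the interpreter).
function codeOriginForPC(pc) {
    return codeOrigins.find(function (o) { return pc >= o.start && pc < o.end; }) || null;
}
```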

Making JSC Fast When Sampling

With the tracing profiler, Web Inspector would always enable JSC’s debugger, even when it was recording a timeline. Enabling the debugger prevents many interesting compiler optimizations from taking place. For example, JSC disables inlining. JSC also marks all variables as closure variables. This forces some variables that would have otherwise lived on the stack to live in a heap-allocated closure object, making every use of such a variable a read from the heap and every write to it a write into the heap. Also, all functions must do at least one object allocation for the closure object. Because the debugger drastically changes the executing state of the program, it biases the executing program away from its natural state. Also, the tracing profiler wasn’t implemented in JSC’s FTL JIT. Because of these shortcomings, the data recorded with the tracing profiler can be skewed at best, and completely wrong at worst.

To prevent these same biases with the sampling profiler, Web Inspector disables JSC’s debugger before starting a timeline recording. To make the timeline recording reflect the natural execution state of the program, JSC can’t be sampling a program that was compiled with debugging instrumentation. Unlike the tracing profiler, the sampling profiler doesn’t restrict which compiler tiers functions are allowed to execute in. The sampling profiler can take a stack trace with any frame in the stack being compiled under any tier. By removing these shortcomings, when using the sampling profiler in Web Inspector when recording a timeline, JSC both has great performance and also collects data that is more representative of the natural execution state of the program.

To see how much more accurate the sampling profiler can be than the tracing profiler, let’s examine a program in which the tracing profiler’s data set indicates that significant time is spent in a particular function when the program in its natural state will spend little time in that function:

var shouldComputeSin = false;
var shouldComputeCos = true;

function computeSin(obj, x) {
    if (shouldComputeSin)
        obj.sin = Math.sin(x);
}

function computeCos(obj, x) {
    if (shouldComputeCos)
        obj.cos = Math.cos(x);
}

function computeResults(x) {
    var results = {};
    computeSin(results, x);
    computeCos(results, x);
}

function run() {
    for (var i = 0; i < 1000000; i++) {
        computeResults(i);
    }
}
run();


When running the example program in JSC in its natural state, JSC will perform the following important optimizations that will remove most of the overhead of the program:

• The calls to computeSin and computeCos will be inlined into computeResults.
• The branches on shouldComputeSin and shouldComputeCos will be constant folded because JSC will realize that these variables are global and have been constant within the execution of the program.
• When computeResults tiers up to the FTL, the FTL will perform object allocation sinking which will prevent computeResults from performing an allocation for the results object. The FTL can do this because after it has inlined computeSin and computeCos, it proves that results is only used locally and doesn’t escape. Allocation sinking will then transform results‘s fields into local variables.
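Taken together, these optimizations leave computeResults with very little work to do. A hand-written approximation of the result, not actual JIT output, might look like:

```javascript
// After inlining computeSin/computeCos and constant folding the globals,
// the shouldComputeSin branch disappears entirely, and allocation sinking
// turns the `results` object's fields into locals.
function computeResultsOptimized(x) {
    var cos = Math.cos(x); // the only remaining real work
}
```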

The example program runs super fast when recording a timeline in Web Inspector in Safari Technology Preview; it executes in about 10ms on my machine. The sampling profiler shows the following distribution of where time is spent:

Time is only spent inside computeResults and computeCos; no time is spent in computeSin. This makes sense because once computeResults tiers up to the optimizing JITs, JSC will perform optimizations that will make computeSin turn into a no-op.

When recording a timeline of the same program inside Safari 9, the results are completely different:

The tracing profiler measures that the program runs in about 1.85 seconds. That’s 185x slower than what the sampling profiler measured inside Safari Technology Preview. Also, when looking at the call tree for the tracing profiler, the data shows that a significant amount of time was spent inside computeSin. This is just wrong. When the program runs in its natural, unobserved state, effectively no time is spent inside computeSin. It’s imperative for a profiler to report accurate information for programs like the example above. When a profiling tool reports misleading data, it causes users of the tool to spend time optimizing the wrong thing. If a user were to trust the tracing profiler’s data for this example, it might lead them to remove the function calls from computeResults and to manually inline computeSin and computeCos into computeResults. However, the sampling profiler shows that this isn’t necessary because JSC will perform this optimization on its own inside the DFG and FTL JITs.

To compare the performance of both profilers on a more realistic workload, let’s examine their performance on the Octane benchmark:

To gather this data, I wanted to run Octane in the browser both with Safari Technology Preview and Safari 9 while recording a timeline in Web Inspector. Unfortunately, Safari 9 crashes when recording a timeline while running Octane. So not only is the sampling profiler much faster, it’s also more reliable. To get Octane to run without crashing, I ran it using the jsc command line utility and used the necessary jsc command line options to simulate what recording a timeline in Safari Technology Preview with the sampling profiler is like and what recording a timeline in Safari 9 using the tracing profiler is like. These results clearly show that Web Inspector and JSC are an order of magnitude faster with the new profiling architecture than the old tracing profiling architecture.

Web Inspector Integration

To complement the new sampling profiler, Web Inspector has introduced a new Call Trees view inside the JavaScript & Events timeline. This view allows users to view compact call tree data about the entire program or about hand-selected time ranges. The call tree can be viewed Top Down or Bottom Up (which is my favorite view for doing performance analysis). Here are two images showing the new Call Trees view for the above example JavaScript program.

Summary

There were many changes made to JSC and Web Inspector to make the new sampling profiler both fast and accurate. The accuracy of the new profiler is a product of removing all instrumentation from the executing program. JSC moved to a sampling profiler instead of a tracing profiler because sampling profilers don’t need to compile instrumentation into the executing program, and they are more accurate for performance analysis because they introduce less bias into the measured data set. Also, Web Inspector now disables the debugger when recording a timeline to prevent debugging instrumentation from being compiled into the JavaScript program. Together, these changes make the new JavaScript & Events timeline a great experience for doing performance analysis work by making JavaScript run 30x faster than it used to. If you have any comments or questions regarding JSC’s new sampling profiler, please get in touch with me or Jon Davis on Twitter.

June 14, 2016

Next Steps for Legacy Plug-ins

WebKit Blog

The web platform is capable of amazing things. Thanks to the ongoing hard work of standards bodies, browser vendors, and web developers, web standards are feature-rich and continuously improving. The WebKit project in particular emphasizes security, performance, and battery life when evaluating and implementing web standards. These standards now include most of the functionality needed to support rich media and interactive experiences that used to require legacy plug-ins like Adobe Flash. When Safari 10 ships this fall, by default, Safari will behave as though common legacy plug-ins on users’ Macs are not installed.

On websites that offer both Flash and HTML5 implementations of content, Safari users will now always experience the modern HTML5 implementation, delivering improved performance and battery life. This policy and its benefits apply equally to all websites; Safari has no built-in list of exceptions. If a website really does require a legacy plug-in, users can explicitly activate it on that website.

If you’re a web developer, you should be aware of how this change will affect your users’ experiences if parts of your websites rely on legacy plug-ins. The rest of this post explains the implementation of this policy and touches on ways to reduce a website’s dependence on legacy plug-ins.

How This Works

By default, Safari no longer tells websites that common plug-ins are installed. It does this by not including information about Flash, Java, Silverlight, and QuickTime in navigator.plugins and navigator.mimeTypes. This convinces websites with both plug-in and HTML5-based media implementations to use their HTML5 implementation.
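The detection logic that sites use typically boils down to checking those two collections. A minimal sketch of the pattern (hasFlash and the navigator-shaped test object are illustrative, not Safari’s or any site’s actual code):

```javascript
// Hypothetical helper illustrating the common Flash-detection pattern.
// `nav` is any object shaped like `navigator` (pass the real one in a browser).
function hasFlash(nav) {
    // Sites typically scan navigator.plugins by name...
    for (let i = 0; i < nav.plugins.length; i++) {
        if (nav.plugins[i].name === "Shockwave Flash")
            return true;
    }
    // ...or look up the Flash MIME type in navigator.mimeTypes.
    return !!nav.mimeTypes["application/x-shockwave-flash"];
}

// With Safari 10's defaults, both collections come back empty, so
// detection code like this concludes Flash is unavailable:
const safariLikeNavigator = { plugins: [], mimeTypes: {} };
console.log(hasFlash(safariLikeNavigator)); // false
```

A site with an HTML5 fallback then takes its non-plug-in code path, which is exactly the behavior this policy encourages.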

Of these plug-ins, the most widely-used is Flash. Most websites that detect that Flash isn’t available, but don’t have an HTML5 fallback, display a “Flash isn’t installed” message with a link to download Flash from Adobe. If a user clicks on one of those links, Safari will inform them that the plug-in is already installed and offer to activate it just one time or every time the website is visited. The default option is to activate it only once. We have similar handling for the other common plug-ins.

When a website directly embeds a visible plug-in object, Safari instead presents a placeholder element with a “Click to use” button. When that’s clicked, Safari offers the user the options of activating the plug-in just one time or every time the user visits that website. Here too, the default option is to activate the plug-in only once.

Safari 10 also includes a menu command to reload a page with installed plug-ins activated; it’s in Safari’s View menu and the contextual menu for the Smart Search Field’s reload button. All of the settings controlling what plug-ins are visible to web pages and which ones are automatically activated can be found in Safari’s Security preferences.

Whenever a user enables a plug-in on a website, it’ll remain enabled as long as the user regularly visits the website and the website still uses the plug-in. More specifically, Safari expires a user’s request to activate a plug-in on a particular website after it hasn’t seen that plug-in used on that site for a little over a month.

Recommendations for Web Developers

Before Safari 10 is released this fall, we encourage you to test how these changes impact your websites. You can do that by installing a beta of macOS Sierra. There will be betas of Safari 10 for OS X Yosemite and OS X El Capitan later this summer.

To avoid making your users have to explicitly activate a plug-in on your website, you should try to implement features using technologies built into the web platform. You can use HTML5 <audio>, <video>, the Audio Context API, and Media Source Extensions to implement robust, secure, customized media players. New in Safari 10, text can be cut or copied to the clipboard using execCommand, which was previously only possible using a plug-in. A host of CSS features, including animations, backdrop filters, and font feature settings can add some visual polish to a site. And WebGL is great for creating interactive 2D or 3D content, like games.

If you serve a different version of your website to mobile browsers, it may already implement its media playback features using web standards. As browsers continue to transition away from legacy plug-ins, you can preserve the rest of your users’ experiences by serving those same implementations to all visitors of your site.

If you can’t replace a plug-in-based system in the short term, you may want to teach your users how to enable that plug-in for your website in Safari. In an enterprise setting, system administrators can deploy managed policies to enable a plug-in on specific websites, if necessary.

Help Us Help You

If you find that you can’t implement parts of your websites without using legacy plug-ins, you can help yourself and other developers by telling us about it. In general, any time the web platform falls short of your needs, we want to know about it. Your feedback has and will continue to shape the priorities of the WebKit project and the Safari team. To send that type of feedback, please email or tweet at Jonathan Davis.

And if you have questions about Safari’s policies for using Flash or other plug-ins, feel free to reach me on Twitter at @rmondello.

June 08, 2016

Release Notes for Safari Technology Preview 6

WebKit Blog

Safari Technology Preview Release 6 is now available for download. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. Release 6 of Safari Technology Preview covers WebKit revisions 201084–201541.

JavaScript

• Added support for trailing commas in function parameters per draft ECMAScript spec (r201488)
• Improved RegExp matching when the result array becomes large (r201451)
• Made RegExp throw an exception instead of a crash when matching deeply nested subexpressions (r201412)
• Made TypedArray.prototype.slice no longer throw an exception if no arguments are provided (r201364)
• Improved performance of TypedArray access by 30% in the 64-bit low-level interpreter (r201335)
• Fixed a regression where String.prototype.replace would fail after being used many times with different replace values (r201254)
• Improved integer to float conversion code generation in the B3 JIT Compiler (r201208)
• Fixed arrow functions as default parameter values so they capture this properly (r201122)
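The first and last of these JavaScript changes are easy to demonstrate (a sketch; the function and object names are made up):

```javascript
// Trailing commas are now allowed in parameter lists (and call sites,
// per the same proposal), which keeps diffs smaller when adding arguments:
function add(a, b,) {
    return a + b;
}
console.log(add(1, 2,)); // 3

// An arrow function used as a default parameter value captures `this`
// from the enclosing method, as the fix in r201122 ensures:
const counter = {
    count: 41,
    next(get = () => this.count + 1) {
        return get();
    }
};
console.log(counter.next()); // 42
```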

CSS

• Added support for normal keyword value per draft CSS Box Alignment Module Level 3 spec (r201498)
• Updated the parsing of CSS Grid’s fixed-size per the draft spec (r201399)
• Made elements with backdrop-filter clip when used with clip-path or mask (r201374)
• Made changing border-color and border-width on table cells with border-collapse: collapse repaint properly (r201296)
• Fixed overflow: hidden so it always repaints clipped content when overflow changes (r201407)

Web APIs

• Started allowing empty strings in the FontFace constructor, parsing them as if they were omitted (r201421)
• Stopped firing a popstate event with a null state when navigating back to a stateless cached page (r201310)
• Started allowing custom drag-and-drop even without placing data in the pasteboard (r201227)

Web Inspector

• Made the split console stay closed when using Inspect Element context menu item (r201222)
• Fixed a regression where CSS properties modified via JavaScript didn’t update in the DOM tree or Styles sidebar (r201192)
• Improved garbage collection time by 2x when recording heap snapshots (r201520)
• Made heap snapshot views remove objects that have been garbage collected (r201183)
• Corrected how transitively dominated objects in heap snapshots display their retained size (r201477)
• Made ShadowChicken properly handle when the entry stack frame is a tail deleted frame (r201465)
• Added indicators to show nesting levels for DOM elements in the Elements tab (r201454)
• Fixed a regression where WebSQL databases were no longer shown in the Storage tab on first open (r201409)
• Improved load time of Web Inspector by profiling with Web Inspector (r201245)
• Fixed resuming the debugger after breaking on an exception inside a Promise callback (r201211)
• Fixed the main resource not showing up in the Debugger tab sidebar after a reload (r201210)

Media

• Reduced flicker and jumpiness when entering and exiting fullscreen presentation mode (r201405, r201474, r201530)

Bug Fixes

• Fixed scrolling on iTunes Connect pages (r201218)
• Fixed autocorrection so it is easier to type contractions and email addresses (r201490)
• Fixed a crash during font download failure after garbage collection (r201358)
• Reverted the change to ignore clicks inside button elements when the mouse moves, due to a regression (r201292)
• Fixed a regression that broke Zoom In (⌘+) on pages (r201090)

June 06, 2016

Memory Debugging with Web Inspector

WebKit Blog

Web Inspector now includes two new timelines for debugging a webpage’s memory usage. The first is a high-level Memory timeline intended to help developers to better understand the memory characteristics of their webpages, to identify spikes, and to detect general memory growth. The second is a detailed JavaScript Allocations timeline that allows developers to record, compare, and analyze snapshots of the JavaScript heap; useful for finding and addressing JavaScript memory growth and leaks.

Memory Timeline

Webpages come in all shapes and sizes. There may be static pages with lots of images and presentational animations that spike in memory, which can cause a sudden termination on memory-constrained platforms. Or they may be long-living interactive JavaScript applications that start small, accumulate memory over time, and slow down after long use. The high-level Memory timeline helps categorize how memory is being used and identify what may need further investigation.

The Memory timeline shows the total memory footprint of the inspected page. It breaks the total memory out into four different categories:

• JavaScript – JavaScript heap size. This includes JavaScript objects, strings, functions, and corresponding engine data associated with these objects. This section will only ever decrease in size due to a garbage collection.

• Images – Decoded image data. Most often this corresponds with the images visible in the viewport.

• Layers – Graphics layer data. This includes WebKit’s tile grid, page content using compositing layers, and any other layers the engine may make as an implementation detail.

• Page – All other memory. This includes engine memory related to the DOM, styles, rendering data, memory caches, system allocations, etc.

We feel that these categories give a good overview of most memory used in a webpage. In some pages Layers and Image data will be the largest categories, but on others the JavaScript heap may be larger. Having this breakdown gives you a place to start for investigating spikes and growth.

When investigating memory spikes, the Max Comparison memory chart can be useful. After selecting a specific time range at the top, you can see how the total memory usage at the end of the selection compares to the peak memory seen during the recording. The timeline will also include markers when apps receive memory pressure events.

Once you have the breakdown, you have an idea of where to look to make reductions. For large Image data, inspect your Image resources. To debug Layer data, use Web Inspector to enable Layer Borders to highlight the visible content on the page that uses compositing layers. For inspecting the JavaScript heap, we have the new JavaScript Allocations timeline. Before we look at this new timeline, let’s refresh our understanding of JavaScript object lifetimes.

JavaScript Leaks

JavaScript is a garbage collected language. As objects are created and modified, the engine automatically allocates any necessary memory for the objects. Once an object is no longer referenced, the engine can reclaim (“collect”) the memory allocated for that object.

To determine if an object should be kept alive or collected, the engine has to check if the object is reachable from a set of root objects. The window object in a webpage is a root object. The engine may have its own internal list of other root objects. JavaScriptCore includes a conservative garbage collector, so it treats any address on the stack that points to a heap allocated object as a root.

By following references from these root objects to other objects, and recursively on to objects they reference, the engine can mark all of the objects that are reachable (“live”), and should be kept alive. At the end, all of the objects in the heap that are not marked are unreachable (“dead”) and can be collected.
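The mark phase described above can be sketched in a few lines (a toy model, not JSC’s actual collector):

```javascript
// Toy mark phase: the heap is a set of objects, and each object lists
// the objects it references. Anything not marked is collectable.
function markReachable(roots) {
    const live = new Set();
    const worklist = [...roots];
    while (worklist.length > 0) {
        const obj = worklist.pop();
        if (live.has(obj))
            continue; // already marked
        live.add(obj);
        // Follow references; the worklist stands in for recursion.
        for (const ref of obj.references)
            worklist.push(ref);
    }
    return live;
}

// a → b, while c is unreferenced: the sweep would collect c.
const b = { references: [] };
const a = { references: [b] };
const c = { references: [] };
const live = markReachable([a]);
console.log(live.has(b), live.has(c)); // true false
```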

JavaScript applications will grow in memory as new objects are created and referenced. A memory leak occurs when objects that are no longer needed are still referenced, causing their memory to not be released. In JavaScript this can happen unintentionally if application logic fails to clear a reference to an object that is no longer needed.

Many leaks may be obvious once pointed out. For example, in this snippet a global variable holds onto a NodeList and will be kept alive:

function addClickHandlers() {
    paragraphs = document.querySelectorAll("p");
    for (let p of paragraphs)
        p.addEventListener("click", () => console.log("clicked"));
}

addClickHandlers();


The NodeList in paragraphs is not needed after the function returns, but is leaked because it accidentally created a global variable window.paragraphs. Simple errors, like accidentally creating a global variable here, can be caught by using strict mode JavaScript. However, the same pattern can be less obvious:

class ElementDebugger {
    constructor() { this.enabled = false; }

    enable() { this.enabled = true; }
    disable() { this.enabled = false; }

    addElements(selector) {
        this.elements = document.querySelectorAll(selector);
        for (let elem of this.elements) {
            elem.addEventListener("click", (event) => {
                console.log("clicked", elem);
                if (this.enabled)
                    debugger;
            });
        }
    }
}

let paragraphDebugger = new ElementDebugger();
paragraphDebugger.addElements("p");
paragraphDebugger.enable();


In this example, we want the paragraphDebugger global object to be kept alive so that we can enable or disable it whenever we want. However, the elements NodeList may unintentionally be kept around. To avoid the leak here, we could have made a local variable for the list with let elements, or explicitly cleared the reference when it is determined to not be needed anymore, with this.elements = null or this.elements = undefined.
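Going back to the first snippet, strict mode would have caught that accidental global immediately. A self-contained sketch (with the DOM query replaced by a plain array so it runs anywhere):

```javascript
"use strict";

// In sloppy mode, `paragraphs = ...` silently creates window.paragraphs
// and leaks the list. In strict mode the same assignment throws,
// surfacing the bug the first time the function runs.
function addClickHandlersStrict() {
    paragraphs = []; // ReferenceError: assignment to an undeclared variable
}

try {
    addClickHandlersStrict();
} catch (e) {
    console.log(e instanceof ReferenceError); // true
}
```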

NOTE: It may be tempting to use the delete operator, but that can introduce its own performance penalties. For named properties on an object, delete should be avoided in favor of just setting the property to null or undefined.

The above examples included explicit direct references to objects (variables and object properties). However, data referenced by a closure is not explicit, and it is easy to encounter situations where objects are unnecessarily captured in closures and contribute to memory growth:

class MessageList {
    constructor() { this.messages = []; }
    addMessage(xhr) {
        this.messages.push({
            text() { return xhr.responseText; }
        });
    }
}

window.messageList = new MessageList();

// Add messages from completed XHRs.
messageList.addMessage(xhr1);
messageList.addMessage(xhr2);


In this example, the leak is not as obvious. In addMessage we add an object to our list of messages. Each message has a text method which will get the text for that message. However, this method is a closure: text() { return xhr.responseText; } captures xhr, so the complete XMLHttpRequest object is retained by the closure even though we only need a small portion of its data. This is unnecessarily wasteful.

Even worse, this XMLHttpRequest can have event listeners that it retains, and those event listeners may also be closures that retain even more objects! All this, when all we need to retain is just the text. To avoid retaining the XMLHttpRequest in this example, we can just avoid capturing it in our closure, and we can instead just keep the data we need:

addMessage(xhr) {
    let messageText = xhr.responseText;
    this.messages.push({
        text() { return messageText; }
    });
}
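Here is a runnable version of the fix, with a stubbed request object standing in for a real XMLHttpRequest (fakeXHR and its fields are hypothetical):

```javascript
class MessageList {
    constructor() { this.messages = []; }
    addMessage(xhr) {
        let messageText = xhr.responseText; // copy just the data we need
        this.messages.push({
            text() { return messageText; }
        });
    }
}

const list = new MessageList();
let fakeXHR = { responseText: "hello", hugePayload: new Array(1e6) };
list.addMessage(fakeXHR);
fakeXHR = null; // the request (and its large payload) can now be collected

console.log(list.messages[0].text()); // "hello"
```

Only the small copied string keeps the message alive; nothing in the closure references the request anymore.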


For many webpages small memory growth is not problematic. The page will use a bit of extra memory, but when the user navigates it will get cleaned up. Memory growth becomes a much bigger problem with long running JavaScript applications. As small to medium memory leaks build up over time, the application’s performance can start to degrade. Ultimately the memory footprint may reach the limits of memory-constrained devices and cause a crash.

JavaScript Allocations Timeline

The JavaScript Allocations Timeline gathers snapshots of the JavaScript heap which can then be analyzed. The timeline takes a snapshot at the start of recording, periodically during the recording, and at the end of recording. You can also use the button in the timeline’s navigation bar or call console.takeHeapSnapshot(<label>) in your code.

A heap snapshot performs a full garbage collection and builds a graph of nodes (the live JavaScript objects) and directed edges (the references between the nodes). Node data includes some basic information about the object: a unique identifier, type, and size. Edge data lets us later know exactly how this object was kept alive, so we record a name for the edge that will be useful when displaying this path. For example if the edge was an object property we would record the property name, or if it is a captured closure variable we record the name of the variable.

The snapshot itself does not retain any JavaScript objects. This is important for detecting leaks; you want to allow objects to get collected, so that later you can identify the leaked objects which were not collected.

When you drill into an individual snapshot, we provide a few different views that let you explore and inspect. There is the Object Graph view, which allows you to explore the heap from a set of root objects, namely Window objects. Then there is the Instances view, which groups objects by class. Because we are connected to the live page, if a particular object is still alive we can provide a preview of the object and you can even log the value to Web Inspector’s console and interact with the object directly. Collected objects are removed from the top level of the Instances view.

The Instances view is where you will spend most of your time, because it gives you quick access to any object no matter how deep or complex the path to the object may be. Its categorization also makes it easy to recognize potential issues. For example, if you notice that there are multiple XMLHttpRequest or Promise instances but you didn’t expect any such objects to exist, you can immediately investigate them. This view is also ideal for sorting by size, allowing you to quickly focus on the largest objects in a snapshot, which saves analysis time in the case of a group of leaked objects where the larger objects are often the root causes of the leaks.

When expanding an instance, you see the other objects it references. Explicit references, such as a property name or array index, will have a name. Implicit or internal references, such as a closure retaining variables defined in an enclosing scope, will not have a name.

Each instance has two sizes. A self size and a retained size. The self size is only the size of the individual instance. This is normally very small, enough to hold the object’s state. It can be larger for strings and certain system objects representing compiled code. The retained size is the size of the object plus the size of all of the nodes it dominates (the objects that this particular object solely keeps alive). An easy way to think about the retained size is if the object were to be deleted right now, the retained size would be the amount of memory that would be reclaimed. The Mozilla Developer Network (MDN) provides an excellent description of dominators in JavaScript.

After creating a few objects like so:

class Person {
    constructor(name) {
        this.name = name;
    }
}

class Group {
    constructor(...members) {
        this.members = members;
    }
}

let shared = new Person("Shared");
let p1 = new Person("Person 1");
let p2 = new Person("Person 2");
let p3 = new Person("Person 3");
p1.parent = p2.parent = p3.parent = shared;

let group = new Group(p1, p2, p3);


We can find the group object instance, expand it and see the objects it immediately dominates (members array), and if we keep expanding see the other objects it dominates (p1, p2, p3, shared) that ultimately contribute to its total retained size.
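The retained-size rule can be sketched with a toy graph based on this example. This assumes the p1/p2/p3/shared bindings are no longer roots, so group keeps the whole subgraph alive; the byte sizes are invented:

```javascript
// Toy retained-size calculation. Real snapshots measure actual bytes
// and compute dominators; this just applies the definition directly:
// retained(n) = reachable size from the root, minus reachable size
// when n is removed (i.e. the memory reclaimed if n disappeared).
function reachableSize(root, graph, skip) {
    const seen = new Set();
    const stack = [root];
    let total = 0;
    while (stack.length > 0) {
        const id = stack.pop();
        if (id === skip || seen.has(id))
            continue;
        seen.add(id);
        total += graph[id].size;
        stack.push(...graph[id].edges);
    }
    return total;
}

function retainedSize(root, graph, node) {
    return reachableSize(root, graph) - reachableSize(root, graph, node);
}

const graph = {
    root:    { size: 0,  edges: ["group"] },
    group:   { size: 16, edges: ["members"] },
    members: { size: 24, edges: ["p1", "p2", "p3"] },
    p1:      { size: 32, edges: ["shared"] },
    p2:      { size: 32, edges: ["shared"] },
    p3:      { size: 32, edges: ["shared"] },
    shared:  { size: 32, edges: [] },
};

console.log(retainedSize("root", graph, "members")); // 152
console.log(retainedSize("root", graph, "group"));   // 168
```

Deleting members would reclaim everything it dominates (the three Persons and shared), while deleting group would reclaim the entire subgraph.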

Perhaps the most powerful aspect of the memory tools is being able to determine the path to a particular object, so you can reason about what keeps it alive. When you hover the instance’s unique identifier, you get a popover showing the shortest path from a root to that instance. If you suspect an object should have gone away, this path will be invaluable for understanding why the object is kept alive.

You can click the unique identifier to log the live value to the console so that you can interact with it directly. Also, for functions, you can click the goto arrow to jump directly to the function declaration.

Detecting JavaScript Leaks

Heap snapshot comparisons are an effective technique for detecting leaks and unintended memory growth. The technique is often referred to as generational analysis. Analyzing an individual heap snapshot for leaks would be time consuming, and on pages with a large number of objects, small leaks would be hard to spot. This is where comparisons shine, letting you focus in on just the objects created between two points in time.

Generational analysis works best when comparing two snapshots before and after an operation that you expect to be memory neutral or have minimal growth. For example, showing and hiding a section of the page, creating and deleting a comment, toggling a preference on and off. You would not expect these actions to cause large memory growth. But if you perform them repeatedly and they do, then comparing a snapshot from before and after the operation would reveal created objects that have not been collected and may be leaks.

Put simply, the steps are:

1. Get your web application into a steady state.
2. Start recording the JavaScript Allocations Timeline.
3. Perform actions that are expected to be memory neutral. Take a snapshot after each repetition.
4. Stop recording.

It is best to repeat the action multiple times and end up with multiple snapshots. Often applications populate caches the first time an operation is performed, or just as likely the JavaScript engine itself may create its own internal objects early on. If you perform the action five times and memory only increased the first time, then there likely isn’t a problem, but if you saw a steady increase each time, then you’ve likely uncovered a leak. This style of analysis works great with console.takeHeapSnapshot() because it makes it easy to control the exact before and after points.

To compare two snapshots start at the snapshot list. Click the Compare button, select a baseline snapshot (before) and comparison snapshot (after) and you get the familiar Instances view for the comparison. The comparison shows only the objects created within that time range that are still alive.
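The comparison step can be modeled as a simple set difference over snapshot node lists (a sketch; real snapshots carry much more data than these { id, className } records):

```javascript
// Given a baseline and a later snapshot, report the objects that were
// created after the baseline and are still alive in the comparison,
// grouped by class the way the Instances view presents them.
function compareSnapshots(baseline, comparison) {
    const known = new Set(baseline.map(node => node.id));
    const created = comparison.filter(node => !known.has(node.id));
    const byClass = new Map();
    for (const node of created) {
        const list = byClass.get(node.className) || [];
        list.push(node);
        byClass.set(node.className, list);
    }
    return byClass;
}

const before = [{ id: 1, className: "Window" }];
const after = [
    { id: 1, className: "Window" },
    { id: 2, className: "XMLHttpRequest" },
    { id: 3, className: "XMLHttpRequest" },
];

// Two XMLHttpRequests survived a supposedly memory-neutral operation:
console.log(compareSnapshots(before, after).get("XMLHttpRequest").length); // 2
```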

Playing with the example above, it is easy to see that multiple XMLHttpRequest objects are kept alive, see that it is a closure keeping them alive, jump to the function capturing them, and address the issue.

Implementation Details

Exposing all of the objects in JavaScriptCore’s heap reveals internal, engine-allocated objects in the heap. These appear as nodes without a preview with names like Structure and FunctionExecutable. We felt it was useful to include these objects to accurately show how they contribute to the retained size of the actual objects exposed to the page. However, keep in mind that their names, and even their existence, are entirely internal implementation details that may change. For this reason, the Instances view filters out such objects from the top level categories, allowing you to focus on only the objects you have control over.

In JavaScriptCore, primitive values like numbers and booleans are not allocated as heap objects. Hence, they will not show up in any snapshots. Instead, they are stored as encoded values in JavaScript objects. The string primitive, on the other hand, is allocated as a heap object, and will show up in snapshots. You can always log a live value to the console and see all of its properties and values.

We put great effort into keeping the memory and performance costs of snapshots to a minimum. After all, if you are debugging a memory issue you don’t want the memory tools introducing more memory pressure. However, you should be aware that debugging both memory and performance at the same time won’t be as accurate as measuring either of them individually. Web Inspector has the ability to let you turn on and off individual timelines to get the most accurate recording possible.

Like other potentially expensive console APIs, console.takeHeapSnapshot does nothing unless Web Inspector is open. That said, it is always best practice to avoid including unnecessary debug code in production.

Feedback

You can try out the new Memory Timelines in the latest Safari Technology Preview. Let us know how they work for you. Send feedback on Twitter (@webkit, @JosephPecoraro) or by filing a bug.

May 26, 2016

Manuel Rego: CSS Grid Layout and positioned items

Igalia WebKit

As part of the work done by Igalia in the CSS Grid Layout implementation on Chromium/Blink and Safari/WebKit, we’ve been implementing the support for positioned items. Yeah, absolute positioning inside a grid. 😅

Probably the first idea that comes to your mind is that you don’t want to use positioned grid items, but in some use cases they can be needed. The idea of this post is to explain how they work inside a grid container, as they have some particularities.

Actually there’s not such a big difference compared to regular grid items. When the grid container is the containing block of the positioned items (e.g. using position: relative; on the grid container) they’re placed almost the same as regular grid items. But there are a few differences:

• Positioned items don't stretch by default.
• They don't use the implicit grid. They don't create implicit tracks.
• They aren't taken into account by the auto-placement feature and don't occupy cells.
• auto has a special meaning when referring to lines.

Let’s explain each of these features in more detail.

Positioned items shrink to fit

We’re used to regular items that stretch by default to fill their area. However, that’s not the case for positioned items: similar to regular positioned blocks, they shrink to fit.

This is pretty easy to grasp, but a simple example will make it crystal clear:

In this example we have a simple 2x2 grid. Both the regular item and the positioned one are placed with the same rules, taking the whole grid. This defines the area for those items, which takes the 1st & 2nd rows and 1st & 2nd columns.
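The styles for an example like this might look as follows (a sketch; the class names and track sizes are illustrative, not taken from the original demo):

```css
.grid {
    display: grid;
    /* Make the grid container the containing block for positioned items. */
    position: relative;
    grid-template-columns: 100px 100px;
    grid-template-rows: 100px 100px;
}

.item,
.positioned {
    /* Both items are placed over the whole 2x2 grid. */
    grid-column: 1 / 3;
    grid-row: 1 / 3;
}

.positioned {
    position: absolute;
}
```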

Positioned items shrink to fit

The regular item stretches by default both horizontally and vertically, so it takes the whole size of the grid area. However, the positioned item shrinks to fit and adapts its size to its contents.

For the examples in the next points I’m ignoring this difference, as I want to show the area that each positioned item takes. To get the same result as in the pictures, you’d need to set 100% width and height on the positioned items.

Positioned items and implicit grid

Positioned items don’t participate in the layout of the grid, nor do they affect how other items are placed.

You can place a regular item outside the explicit grid, and the grid will create the required tracks to accommodate it. However, in the case of positioned items, you cannot even refer to lines in the implicit grid; they’ll be treated as auto. This means that you cannot place a positioned item in the implicit grid: positioned items cannot create implicit tracks, as they don’t participate in the layout of the grid.

Let’s use an example to understand this better:

The example defines a 2x2 grid, but the positioned item uses grid-area: 4 / 4; so it tries to go to the 4th row and 4th column. However, positioned items cannot create those implicit tracks, so it’s positioned as if it had auto, which in this case makes it take the whole explicit grid. (auto has a special meaning for positioned items; it’ll be properly explained later.)
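A sketch of the styles involved (selectors and sizes are illustrative):

```css
.grid {
    display: grid;
    position: relative;
    grid-template-columns: 100px 100px;
    grid-template-rows: 100px 100px;
}

.positioned {
    position: absolute;
    /* Lines 4 are outside the 2x2 explicit grid, so for a positioned
       item they are treated as auto instead of creating implicit tracks. */
    grid-area: 4 / 4;
}
```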

Positioned items do not create implicit tracks

Imagine another example where regular items create implicit tracks:

In this case, the regular items create the implicit tracks, making a 4x4 grid in total. Now the positioned item can be placed on the 4th row and 4th column, even though those tracks are not part of the explicit grid.

Positioned items can be placed on the implicit grid

As you can see this part of the post has been modified, thanks to @fantasai for notifying me about the mistake.

Positioned items and placement algorithm

Again the positioned items do not affect the position of other items, as they don’t participate in the placement algorithm.

So, if you have a positioned item and you’re using auto-placement for some regular items, it’s expected that the positioned one overlaps the others. Positioned items are completely ignored during auto-placement.

Here’s a simple example showing this behavior:

Here we have again a 2x2 grid, with 3 auto-placed regular items and 1 absolutely positioned item. As you can see, the positioned item is placed on the 1st row and 2nd column, but there’s an auto-placed item in that cell too, below the positioned one. This shows that the grid container doesn’t care about positioned items; it simply ignores them when placing regular items.

Positioned items and placement algorithm

If none of the children were positioned, the last one would be placed in the given position (1st row and 2nd column), and the rest (auto-placed) would take the other cells without overlapping.

Positioned items and auto lines

This is probably the biggest difference compared to regular grid items. If you don’t specify a line, it’s considered that you’re using auto, but auto is not resolved as span 1 like it is for regular items. For positioned items, auto is resolved to the padding edge.

The specification introduces the concepts of the lines 0 and -0; despite how weird they may sound, they actually make sense. The auto lines reference those 0 and -0 lines, which represent the padding edges of the grid container.

Again let’s use a few examples to explain this:

Here we have a 2x2 grid container, which has some padding. The positioned item will be placed in the 2nd row and 1st column, but its area extends up to the padding edges (as the end line is auto in both axes).

Positioned items and auto lines

We could even place positioned grid items on the padding itself. For example, using grid-column: auto / 1; the item would be on the left padding.

Positioned items using auto lines to be placed on the left padding

Of course, if the grid is wider and there’s some free space in the content box, the items will take that space too. For example:

Here the grid columns are 500px wide, but the grid container is 600px wide. This means there’s 100px of free space in the grid content box. As you can see in the example, that space is also used when the positioned items extend up to the padding edges.

Positioned items taking free space and right padding

Offsets

Of course, you can use offsets to place your positioned items (the left, right, top and bottom properties).

These offsets will apply inside the grid area defined for the positioned items, following the rules explained above.

Let’s use another example:

Again, a 2x2 grid container with some padding. The positioned item has some offsets, which are applied inside its grid area.
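The setup might be sketched like this (selectors and sizes are illustrative):

```css
.grid {
    display: grid;
    position: relative;
    padding: 50px;
    grid-template-columns: 200px 200px;
    grid-template-rows: 100px 100px;
}

.positioned {
    position: absolute;
    grid-row: 2;
    grid-column: 1;
    /* Offsets resolve against the item's grid area, not the viewport. */
    top: 20px;
    left: 20px;
}
```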

Positioned items and offsets

Wrap-up

I’m not completely sure how important support for positioned elements is for web authors using Grid Layout. You’ll be the ones to tell us if you really find use cases that need it. I hope this post helps you understand positioned items better and make up your minds about real-life scenarios where they might be useful.

The good news is that you can already test this in the most recent versions of some major browsers: Chrome Canary, Safari Technology Preview and Firefox. We hope the 3 implementations are interoperable, but please let us know if you find any issues.

There’s one last thing missing: alignment support for positioned items. This hasn’t been implemented yet in any of the browsers, but the behavior will be pretty similar to the one you can already use with regular grid items. Hopefully, we’ll have time to add support for this in the coming months.

Igalia and Bloomberg working together to build a better web

Last but not least, thanks to Bloomberg for supporting Igalia in the CSS Grid Layout implementation in Blink and WebKit.

May 23, 2016

Igalia Compilers Team: Awaiting the future of JavaScript in V8

Igalia WebKit

On the evening of Monday, May 16th, 2016, we made history. We landed the initial implementation of “Async Functions” in V8, the JavaScript engine used by Google Chrome and Node.js. We do these things not because they are easy, but because they are hard. Because that goal will serve to organize and measure the best of our energies and skills, because that challenge is one we are willing to accept. It is very exciting to see this land after roughly 2 months of implementation, code review and standards finagling/discussion. It is truly an honour.

To introduce you to Async Functions, it’s first necessary to understand two things: the status quo of async programming in JavaScript, as well as Generators (previously implemented by fellow Igalian Andy).

Async programming in JavaScript has historically been implemented by callbacks, window.setTimeout(function toExecuteLaterOnceTimeHasPassed() {}, …) being the common example. Callbacks on their own are not scalable: when numerous nested asynchronous operations are needed, code becomes extremely difficult to read and reason about. Abstraction libraries have been tacked on to improve this, including caolan’s async package, or Promise libraries such as Q. These abstractions simplify control flow management and data flow management, and are a massive improvement over plain callbacks. But we can do better! For a more detailed look at Promises, have a look at the fantastic MDN article. Some great resources on why and how callbacks can lead to utter non-scalable disaster exist too; check out http://callbackhell.com!

The second concept, Generators, allows a runtime to return from a function at an arbitrary line and later re-enter that function at the following instruction in order to continue execution. So right away you can imagine where this is going — we can continue execution of the same function, rather than writing a closure to continue execution in a new function. Async Functions rely on this same mechanism (and in fact, on the underlying Generators implementation) to achieve their goal, immensely simplifying non-trivial coordination of asynchronous operations.
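
To see the suspend/resume mechanism in isolation, here is a minimal generator example (nothing V8-specific; just standard ES6, with made-up names):

```javascript
// A generator function: each call to next() resumes execution
// right after the last yield, preserving local state (n).
function* counter() {
  let n = 0;
  while (true) {
    n += 1;
    yield n; // suspend here; the next next() call resumes the loop
  }
}

const it = counter();
console.log(it.next().value); // 1
console.log(it.next().value); // 2
```

Async functions build the same kind of suspension point out of `await` instead of `yield`.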

As a simple example, let’s compare the following two approaches:
function deployApplication() {
  return cleanDirectory(__DEPLOYMENT_DIR__).
    then(fetchNpmDependencies).
    then(
      deps => Promise.all(
        deps.map(
          dep => moveToDeploymentSite(
            dep.files,
            `${__DEPLOYMENT_DIR__}/deps/${dep.name}`
          )))).
    then(() => compileSources(__SRC_DIR__,
                              __DEPLOYMENT_DIR__)).
    then(uploadToServer);
}


The Promise boilerplate makes this harder to read and follow than it could be. And what happens if an error occurs? Do we want to add catch handlers to each link in the Promise chain? That will only make it even more difficult to follow, with error handling interleaved in difficult-to-read ways.

Let’s refactor this using async functions:
async function deployApplication() {
  await cleanDirectory(__DEPLOYMENT_DIR__);
  let dependencies = await fetchNpmDependencies();

  // *see below*
  for (let dep of dependencies) {
    await moveToDeploymentSite(
      dep.files,
      `${__DEPLOYMENT_DIR__}/deps/${dep.name}`);
  }

  await compileSources(__SRC_DIR__,
                       __DEPLOYMENT_DIR__);
  return uploadToServer();
}



You’ll notice that the “moveToDeploymentSite” step is slightly different in the async function version, in that it completes each operation in a serial pipeline, rather than completing each operation in parallel, and continuing once finished. This is an unfortunate limitation of the async function specification, which will hopefully be improved on in the future.

In the meantime, it’s still possible to use the Promise API in async functions, as you can await any Promise and continue execution after it is resolved. This grants compatibility with numerous existing Web Platform APIs (such as fetch()), which is ultimately a good thing! Here’s an alternative implementation of this step, which performs the moveToDeploymentSite() bits in parallel rather than serially:

await Promise.all(dependencies.map(
  dep => moveToDeploymentSite(
    dep.files,
    `${__DEPLOYMENT_DIR__}/deps/${dep.name}`
  )));



Now, it’s clear from the let dependencies = await fetchNpmDependencies(); line that Promises are unwrapped automatically. What happens if the promise is rejected with an error, rather than resolved with a value? With try-catch blocks, we can catch rejected promise errors inside async functions! And if they are not caught, the async function will automatically return a rejected Promise.
function throwsError() { throw new Error("oops"); }

async function foo() { throwsError(); }

// will print the Error thrown in throwsError.
foo().catch(console.error);

async function bar() {
  try {
    var value = await foo();
  } catch (error) {
    // The rejected Promise is unwrapped automatically, and
    // execution continues here, allowing us to recover
    // from the error! error is new Error("oops")
  }
}



There are also lots of convenient forms of async function declarations, which hopefully serve lots of interesting use cases! You can concisely declare methods as asynchronous in Object literals and ES6 classes by preceding the method name with the async keyword (with no line terminator between the keyword and the method name!):

class C {
  async doAsyncOperation() {
    // ...
  }
}

var obj = {
  async getFacebookProfileAsynchronously() {
    /* ... */
  }
};



These features allow us to write more idiomatic, easier-to-understand asynchronous control flow in our applications, and future extensions to the ECMAScript specification will enable even more idiomatic forms for writing complex algorithms in a maintainable and readable fashion. We are very excited about this! There are numerous other resources on the web detailing async functions, their benefits, and perhaps ways they might be improved in the future. One good one is this piece from Google’s Jake Archibald (https://jakearchibald.com/2014/es7-async-functions/), so give it a read for more details. It’s a few years old, but it holds up nicely!

So, now that you’ve seen the overview of the feature, you might be wondering how you can try it out, and when it will be available for use. For the next few weeks, it’s still too experimental even for the “Experimental JavaScript” flag. But if you are adventurous, you can try it already! Fetch the latest Chrome Canary build and start Chrome with the command-line flag --js-flags="--harmony-async-await". We can’t make promises about the shipping timeline, but it could ship as early as Chrome 53 or Chrome 54, which will become stable in September or October.

We owe a shout out to Bloomberg, who have provided us with resources to improve the web platform that we love. Hopefully, we are providing their engineers with ways to write more maintainable, more performant, and more beautiful code. We hope to continue this working relationship in the future!

As well, shoutouts are owed to the Chromium team, who have assisted in reviewing the feature, verifying its stability, getting devtools integration working, and ultimately getting the code upstream. Terrific! In addition, the WebKit team has also been very helpful, and hopefully we will see the feature land in JavaScriptCore in the not too distant future.

April 16, 2016

Frédéric Wang: OpenType MATH in HarfBuzz

Igalia WebKit

TL;DR:

• Work is in progress to add OpenType MATH support in HarfBuzz and will be instrumental for many math rendering engines relying on that library, including browsers.

• For stretchy operators, an efficient way to determine the required number of glyphs and their overlaps has been implemented and is described here.

In the context of the Igalia browser team’s effort to implement MathML support using TeX rules and OpenType features, I have started the implementation of OpenType MATH support in HarfBuzz. This table from the OpenType standard is made of three subtables:

• The MathConstants table, which contains layout constants. For example, the thickness of the fraction bar of $\frac{a}{b}$.

• The MathGlyphInfo table, which contains glyph properties. For instance, the italic correction indicating how slanted an integral is, e.g. to properly place the subscript in $\displaystyle\int_{D}$.

• The MathVariants table, which provides larger size variants for a base glyph or data to build a glyph assembly. For example, either a larger parenthesis or an assembly of U+239B, U+239C, U+239D to write something like:

$$\left(\frac{\frac{\frac{a}{b}}{\frac{c}{d}}}{\frac{\frac{e}{f}}{\frac{g}{h}}}\right.$$

Code to parse this table was added to Gecko and WebKit two years ago. The existing code to build glyph assemblies in these Web engines was adapted to use the MathVariants data instead of only private tables. However, as we will see below, the MathVariants data to build glyph assemblies is more general, with an arbitrary number of glyphs or with additional constraints on glyph overlaps. Also, there are various fallback mechanisms for old fonts and other bugs that I think we could get rid of when we move to OpenType MATH fonts only.

In order to add MathML support in Blink, it is very easy to import the OpenType MATH parsing code from WebKit. However, after discussions with some Google developers, it seems that the best option is to directly add support for this table in HarfBuzz. Since this library is used by Gecko, by WebKit (at least the GTK port) and by many other applications such as Servo, XeTeX or LibreOffice, it makes sense to share the implementation to improve math rendering everywhere.

The idea for HarfBuzz is to add an API to:

1. Expose data from the MathConstants and MathGlyphInfo tables.

2. Shape stretchy operators to some target size with the help of the MathVariants table.

It is then up to a higher-level math rendering engine (e.g. TeX or MathML rendering engines) to beautifully display mathematical formulas using this API. The design choice for exposing MathConstants and MathGlyphInfo is almost obvious from a reading of the MATH table specification. The choice for the shaping API is a bit more complex and discussion is still in progress. For example, because we want to accept stretching after glyph-level mirroring (e.g. to draw RTL clockwise integrals), we should accept any glyph and not just an input Unicode string, as is the case for other HarfBuzz shaping functions. This shaping also depends on a stretching direction (horizontal/vertical) or on a target size (and Gecko even currently has various ways to approximate that target size). Finally, we should also have a way to expose the italic correction for a glyph assembly or to approximate the preferred width for Web rendering engines.

As I mentioned at the beginning, the data and algorithm to build glyph assemblies are the most complex part of the OpenType MATH table and deserve special interest. The idea is that you have a list of $n\geq 1$ glyphs available to build the assembly. For each $0\leq i\leq n-1$, the glyph $g_{i}$ has advance $a_{i}$ in the stretch direction. Each $g_{i}$ has a straight connector part at its start (of length $s_{i}$) and at its end (of length $e_{i}$) so that we can align the glyphs on the stretch axis and glue them together. Also, some of the glyphs are “extenders”, which means that they can be repeated 0, 1 or more times to make the assembly as large as possible. Finally, the end/start connectors of consecutive glyphs must overlap by at least a fixed value $o_{\mathrm{min}}$ to avoid gaps at some resolutions, but of course without exceeding the length of the corresponding connectors. This gives some flexibility to adjust the size of the assembly and get closer to the target size $t$.

To ensure that the width/height is distributed equally and the symmetry of the shape is preserved, the MATH table specification suggests the following iterative algorithm to determine the number of extenders and the connector overlaps to reach a minimal target size $t$:

1. Assemble all parts by overlapping connectors by the maximum amount, and removing all extenders. This gives the smallest possible result.

2. Determine how much extra width/height can be distributed into all connections between neighboring parts. If that is enough to achieve the size goal, extend each connection equally by changing overlaps of connectors to finish the job.

3. If all connections have been extended to minimum overlap and further growth is needed, add one of each extender, and repeat the process from the first step.

We note that at each step, each extender is repeated the same number of times $r\geq 0$. So if $I_{\mathrm{Ext}}$ (respectively $I_{\mathrm{NonExt}}$) is the set of indices $0\leq i\leq n-1$ such that $g_{i}$ is an extender (respectively is not an extender), we have $r_{i}=r$ (respectively $r_{i}=1$). The size we can reach at step $r$ is at most the one obtained with the minimal connector overlap $o_{\mathrm{min}}$, that is

$$\sum_{i=0}^{n-1}\left(\sum_{j=1}^{r_{i}}(a_{i}-o_{\mathrm{min}})\right)+o_{\mathrm{min}}=\left(\sum_{i\in I_{\mathrm{NonExt}}}(a_{i}-o_{\mathrm{min}})\right)+\left(\sum_{i\in I_{\mathrm{Ext}}}r(a_{i}-o_{\mathrm{min}})\right)+o_{\mathrm{min}}$$

We let $N_{\mathrm{Ext}}=|I_{\mathrm{Ext}}|$ and $N_{\mathrm{NonExt}}=|I_{\mathrm{NonExt}}|$ be the number of extenders and non-extenders. We also let $S_{\mathrm{Ext}}=\sum_{i\in I_{\mathrm{Ext}}}a_{i}$ and $S_{\mathrm{NonExt}}=\sum_{i\in I_{\mathrm{NonExt}}}a_{i}$ be the sums of advances for extenders and non-extenders. If we want the advance of the glyph assembly to reach the minimal size $t$ then

$$S_{\mathrm{NonExt}}-o_{\mathrm{min}}\left(N_{\mathrm{NonExt}}-1\right)+r\left(S_{\mathrm{Ext}}-o_{\mathrm{min}}N_{\mathrm{Ext}}\right)\geq t$$

We can assume $S_{\mathrm{Ext}}-o_{\mathrm{min}}N_{\mathrm{Ext}}>0$, or otherwise we would have the extreme case where the overlap takes at least the full advance of each extender. Then we obtain

$$r\geq r_{\mathrm{min}}=\max\left(0,\left\lceil\frac{t-S_{\mathrm{NonExt}}+o_{\mathrm{min}}\left(N_{\mathrm{NonExt}}-1\right)}{S_{\mathrm{Ext}}-o_{\mathrm{min}}N_{\mathrm{Ext}}}\right\rceil\right)$$

This provides a first simplification of the algorithm sketched in the MATH table specification: directly start the iteration at step $r_{\mathrm{min}}$. Note that at each step we start at possibly different maximum overlaps and decrease all of them by the same value. It is not clear what to do when one of the overlaps reaches $o_{\mathrm{min}}$ while others can still be decreased. However, the sketched algorithm says all the connectors should reach minimum overlap before the next increment of $r$, which means the target size will indeed be reached at step $r_{\mathrm{min}}$.

One possible interpretation is to stop decreasing the overlaps of the adjacent connectors that have reached minimum overlap and to continue the uniform decrease for the others until all the connectors reach minimum overlap. In that case we may lose equal distribution or symmetry. In practice, this should probably not matter much. So we propose instead the dual option, which should behave more or less the same in most cases: start with all overlaps set to $o_{\mathrm{min}}$ and increase them evenly to reach the same value $o$. By the same reasoning as above we want the inequality

$$S_{\mathrm{NonExt}}-o\left(N_{\mathrm{NonExt}}-1\right)+r_{\mathrm{min}}\left(S_{\mathrm{Ext}}-oN_{\mathrm{Ext}}\right)\geq t$$

which can be rewritten

$$S_{\mathrm{NonExt}}+r_{\mathrm{min}}S_{\mathrm{Ext}}-o\left(N_{\mathrm{NonExt}}+r_{\mathrm{min}}N_{\mathrm{Ext}}-1\right)\geq t$$

We note that $N=N_{\mathrm{NonExt}}+r_{\mathrm{min}}N_{\mathrm{Ext}}$ is just the exact number of glyphs used in the assembly. If there is only a single glyph, then the overlap value is irrelevant, so we can assume $N_{\mathrm{NonExt}}+r_{\mathrm{min}}N_{\mathrm{Ext}}-1=N-1\geq 1$. This provides the greatest theoretical value for the overlap $o$:

$$o_{\mathrm{min}}\leq o\leq o_{\mathrm{max}}^{\mathrm{theoretical}}=\frac{S_{\mathrm{NonExt}}+r_{\mathrm{min}}S_{\mathrm{Ext}}-t}{N_{\mathrm{NonExt}}+r_{\mathrm{min}}N_{\mathrm{Ext}}-1}$$

Of course, we also have to take into account the limit imposed by the start and end connector lengths. So $o_{\mathrm{max}}$ must also be at most $\min(e_{i},s_{i+1})$ for $0\leq i\leq n-2$. But if $r_{\mathrm{min}}\geq 2$ then consecutive copies of an extender are connected to each other, and so $o_{\mathrm{max}}$ must also be at most $\min(e_{i},s_{i})$ for $i\in I_{\mathrm{Ext}}$. To summarize, $o_{\mathrm{max}}$ is the minimum of $o_{\mathrm{max}}^{\mathrm{theoretical}}$, of the $e_{i}$ for $0\leq i\leq n-2$, of the $s_{i}$ for $1\leq i\leq n-1$, and possibly of $s_{0}$ (if $0\in I_{\mathrm{Ext}}$) and of $e_{n-1}$ (if $n-1\in I_{\mathrm{Ext}}$).
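
As a rough sketch of these computations (this is not the actual HarfBuzz API; the record fields and the function name below are made up for illustration), all the quantities can be gathered with simple loops over the parts:

```javascript
// Hypothetical part record: { advance, startConnector, endConnector, isExtender }.
// Computes the extender repetition count rMin, the glyph count N and the
// uniform connector overlap, following the derivation above.
function computeAssembly(parts, oMin, target) {
  let nExt = 0, nNonExt = 0, sExt = 0, sNonExt = 0;
  for (const p of parts) {
    if (p.isExtender) { nExt += 1; sExt += p.advance; }
    else { nNonExt += 1; sNonExt += p.advance; }
  }
  const growth = sExt - oMin * nExt; // size gained per extender repetition
  if (growth <= 0) return null;      // degenerate case excluded above
  // r_min = max(0, ceil((t - S_NonExt + o_min(N_NonExt - 1)) / growth))
  const rMin = Math.max(0,
    Math.ceil((target - sNonExt + oMin * (nNonExt - 1)) / growth));
  const N = nNonExt + rMin * nExt;   // glyph count of the final assembly
  if (N <= 1) return { rMin, N, overlap: oMin };
  // Greatest theoretical overlap that still reaches the target size...
  let oMax = (sNonExt + rMin * sExt - target) / (N - 1);
  // ...clamped by the connectors of adjacent parts...
  for (let i = 0; i + 1 < parts.length; i++)
    oMax = Math.min(oMax, parts[i].endConnector, parts[i + 1].startConnector);
  // ...and, when extenders repeat, by their own start/end connectors.
  if (rMin >= 2)
    for (const p of parts)
      if (p.isExtender)
        oMax = Math.min(oMax, p.startConnector, p.endConnector);
  return { rMin, N, overlap: Math.max(oMin, oMax) };
}
```

With the overlap $o$ in hand, the assembly’s advance is the sum of the repeated advances minus $o(N-1)$, so picking $o=o_{\mathrm{max}}$ lands exactly on the target size whenever the connectors permit it.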

With the algorithm described above, $N_{\mathrm{Ext}}$, $N_{\mathrm{NonExt}}$, $S_{\mathrm{Ext}}$, $S_{\mathrm{NonExt}}$, $r_{\mathrm{min}}$ and $o_{\mathrm{max}}$ can all be obtained using simple loops over the glyphs $g_{i}$, so the complexity is $O(n)$. In practice $n$ is small: for existing fonts, assemblies are made of at most three non-extenders and two extenders, that is $n\leq 5$ (incidentally, Gecko and WebKit do not currently support larger values of $n$). This means that all the operations described above can be considered to have constant complexity. This is much better than a naive implementation of the iterative algorithm sketched in the OpenType MATH table specification, which seems to require at worst

$$\sum_{r=0}^{r_{\mathrm{min}}-1}\left(N_{\mathrm{NonExt}}+rN_{\mathrm{Ext}}\right)=N_{\mathrm{NonExt}}r_{\mathrm{min}}+\frac{r_{\mathrm{min}}\left(r_{\mathrm{min}}-1\right)}{2}N_{\mathrm{Ext}}=O(n\times r_{\mathrm{min}}^{2})$$

operations, and at least $\Omega(r_{\mathrm{min}})$.

One issue is that the number of extender repetitions $r_{\mathrm{min}}$ and the number of glyphs in the assembly $N$ can become arbitrarily large, since the target size $t$ can take large values, e.g. if one writes \underbrace{\hspace{65535em}} in LaTeX. The improvement proposed here does not solve that issue, since setting the coordinates of each glyph in the assembly and painting them require $\Theta(N)$ operations, as well as (in the case of HarfBuzz) a glyph buffer of size $N$. However, such large stretchy operators do not happen in real-life mathematical formulas. Hence, to avoid possible hangs in Web engines, a solution is to impose a maximum limit $N_{\mathrm{max}}$ on the number of glyphs in the assembly so that the complexity is limited by the size of the DOM tree. Currently, the proposal for HarfBuzz is $N_{\mathrm{max}}=128$. This means that if each assembly glyph is 1em large you won’t be able to draw stretchy operators of size more than 128em, which sounds like a quite reasonable bound. With the above proposal, $r_{\mathrm{min}}$ and so $N$ can be determined very quickly and the cases $N\geq N_{\mathrm{max}}$ rejected, so that we avoid losing time with such edge cases…

Finally, because in our proposal we use the same overlap $o$ everywhere, an alternative for HarfBuzz would be to set the output buffer size to $n$ (i.e. ignore $r-1$ copies of each extender and only keep the first one). This will leave gaps that the client can fix by repeating extenders, as long as $o$ is also provided. Then HarfBuzz math shaping can be done with a complexity in time and space of just $O(n)$, and it will be up to the client to optimize or limit the painting of extenders for large values of $N$…

April 15, 2016

Frédéric Wang: OpenType MATH in HarfBuzz

Igalia WebKit

TL;DR:

• Work is in progress to add OpenType MATH support in HarfBuzz and will be instrumental for many math rendering engines relying on that library, including browsers.

• For stretchy operators, an efficient way to determine the required number of glyphs and their overlaps has been implemented and is described here.

In the context of Igalia browser team effort to implement MathML support using TeX rules and OpenType features, I have started implementation of OpenType MATH support in HarfBuzz. This table from the OpenType standard is made of three subtables:

• The MathConstants table, which contains layout constants. For example, the thickness of the fraction bar of ab\frac{a}{b}.

• The MathGlyphInfo table, which contains glyph properties. For instance, the italic correction indicating how slanted an integral is e.g. to properly place the subscript in ∫D\displaystyle\displaystyle\int_{D}.

• The MathVariants table, which provides larger size variants for a base glyph or data to build a glyph assembly. For example, either a larger parenthesis or a assembly of U+239B, U+239C, U+239D to write something like:

 (abcdefgh\left(\frac{\frac{\frac{a}{b}}{\frac{c}{d}}}{\frac{\frac{e}{f}}{\frac{g}{h}}}\right.

Code to parse this table was added to Gecko and WebKit two years ago. The existing code to build glyph assembly in these Web engines was adapted to use the MathVariants data instead of only private tables. However, as we will see below the MathVariants data to build glyph assembly is more general, with arbitrary number of glyphs or with additional constraints on glyph overlaps. Also there are various fallback mechanisms for old fonts and other bugs that I think we could get rid of when we move to OpenType MATH fonts only.

In order to add MathML support in Blink, it is very easy to import the OpenType MATH parsing code from WebKit. However, after discussions with some Google developers, it seems that the best option is to directly add support for this table in HarfBuzz. Since this library is used by Gecko, by WebKit (at least the GTK port) and by many other applications such as Servo, XeTeX or LibreOffice it make senses to share the implementation to improve math rendering everywhere.

The idea for HarfBuzz is to add an API to

1. 1.

Expose data from the MathConstants and MathGlyphInfo.

2. 2.

Shape stretchy operators to some target size with the help of the MathVariants.

It is then up to a higher-level math rendering engine (e.g. TeX or MathML rendering engines) to beautifully display mathematical formulas using this API. The design choice for exposing MathConstants and MathGlyphInfo is almost obvious from the reading of the MATH table specification. The choice for the shaping API is a bit more complex and discussions is still in progress. For example because we want to accept stretching after glyph-level mirroring (e.g. to draw RTL clockwise integrals) we should accept any glyph and not just an input Unicode strings as it is the case for other HarfBuzz shaping functions. This shaping also depends on a stretching direction (horizontal/vertical) or on a target size (and Gecko even currently has various ways to approximate that target size). Finally, we should also have a way to expose italic correction for a glyph assembly or to approximate preferred width for Web rendering engines.

As I mentioned at the beginning, the data and algorithm to build glyph assembly is the most complex part of the OpenType MATH and deserves a special interest. The idea is that you have a list of n≥1n\geq 1 glyphs available to build the assembly. For each 0≤i≤n-10\leq i\leq n-1, the glyph gig_{i} has advance aia_{i} in the stretch direction. Each gig_{i} has straight connector part at its start (of length sis_{i}) and at its end (of length eie_{i}) so that we can align the glyphs on the stretch axis and glue them together. Also, some of the glyphs are “extenders” which means that they can be repeated 0, 1 or more times to make the assembly as large as possible. Finally, the end/start connectors of consecutive glyphs must overlap by at least a fixed value omino_{\mathrm{min}} to avoid gaps at some resolutions but of course without exceeding the length of the corresponding connectors. This gives some flexibility to adjust the size of the assembly and get closer to the target size tt.

To ensure that the width/height is distributed equally and the symmetry of the shape is preserved, the MATH table specification suggests the following iterative algorithm to determine the number of extenders and the connector overlaps to reach a minimal target size tt:

1. 1.

Assemble all parts by overlapping connectors by maximum amount, and removing all extenders. This gives the smallest possible result.

2. 2.

Determine how much extra width/height can be distributed into all connections between neighboring parts. If that is enough to achieve the size goal, extend each connection equally by changing overlaps of connectors to finish the job.

3. 3.

If all connections have been extended to minimum overlap and further growth is needed, add one of each extender, and repeat the process from the first step.

We note that at each step, each extender is repeated the same number of times r≥0r\geq 0. So if IExtI_{\mathrm{Ext}} (respectively INonExtI_{\mathrm{NonExt}}) is the set of indices 0≤i≤n-10\leq i\leq n-1 such that gig_{i} is an extender (respectively is not an extender) we have ri=rr_{i}=r (respectively ri=1r_{i}=1). The size we can reach at step rr is at most the one obtained with the minimal connector overlap omino_{\mathrm{min}} that is

 ∑i=0N-1(∑j=1riai-omin)+omin=(∑i∈INonExtai-omin)+(∑i∈IExtr⁢(ai-omin))+omin\sum_{i=0}^{N-1}\left(\sum_{j=1}^{r_{i}}{a_{i}-o_{\mathrm{min}}}\right)+o_{ \mathrm{min}}=\left(\sum_{i\in I_{\mathrm{NonExt}}}{a_{i}-o_{\mathrm{min}}} \right)+\left(\sum_{i\in I_{\mathrm{Ext}}}r{(a_{i}-o_{\mathrm{min}})}\right)+o% _{\mathrm{min}}

We let NExt=|IExt|N_{\mathrm{Ext}}={|I_{\mathrm{Ext}}|} and NNonExt=|INonExt|N_{\mathrm{NonExt}}={|I_{\mathrm{NonExt}}|} be the number of extenders and non-extenders. We also let SExt=∑i∈IExtaiS_{\mathrm{Ext}}=\sum_{i\in I_{\mathrm{Ext}}}a_{i} and SNonExt=∑i∈INonExtaiS_{\mathrm{NonExt}}=\sum_{i\in I_{\mathrm{NonExt}}}a_{i} be the sum of advances for extenders and non-extenders. If we want the advance of the glyph assembly to reach the minimal size tt then

 SNonExt-omin⁢(NNonExt-1)+r⁢(SExt-omin⁢NExt)≥t{S_{\mathrm{NonExt}}-o_{\mathrm{min}}\left(N_{\mathrm{NonExt}}-1\right)}+{r% \left(S_{\mathrm{Ext}}-o_{\mathrm{min}}N_{\mathrm{Ext}}\right)}\geq t

We can assume 0" display="inline">SExt-omin⁢NExt>0S_{\mathrm{Ext}}-o_{\mathrm{min}}N_{\mathrm{Ext}}>0 or otherwise we would have the extreme case where the overlap takes at least the full advance of each extender. Then we obtain

 r≥rmin=max⁡(0,⌈t-SNonExt+omin⁢(NNonExt-1)SExt-omin⁢NExt⌉)r\geq r_{\mathrm{min}}=\max\left(0,\left\lceil\frac{t-{S_{\mathrm{NonExt}}+o_{ \mathrm{min}}\left(N_{\mathrm{NonExt}}-1\right)}}{S_{\mathrm{Ext}}-o_{\mathrm{ min}}N_{\mathrm{Ext}}}\right\rceil\right)

This provides a first simplification of the algorithm sketched in the MATH table specification: Directly start iteration at step rminr_{\mathrm{min}}. Note that at each step we start at possibly different maximum overlaps and decrease all of them by a same value. It is not clear what to do when one of the overlap reaches omino_{\mathrm{min}} while others can still be decreased. However, the sketched algorithm says all the connectors should reach minimum overlap before the next increment of rr, which means the target size will indeed be reached at step rminr_{\mathrm{min}}.

One possible interpretation is to stop overlap decreasing for the adjacent connectors that reached minimum overlap and to continue uniform decreasing for the others until all the connectors reach minimum overlap. In that case we may lose equal distribution or symmetry. In practice, this should probably not matter much. So we propose instead the dual option which should behave more or less the same in most cases: Start with all overlaps set to omino_{\mathrm{min}} and increase them evenly to reach a same value oo. By the same reasoning as above we want the inequality

 SNonExt-o⁢(NNonExt-1)+rmin⁢(SExt-o⁢NExt)≥t{S_{\mathrm{NonExt}}-o\left(N_{\mathrm{NonExt}}-1\right)}+{r_{\mathrm{min}} \left(S_{\mathrm{Ext}}-oN_{\mathrm{Ext}}\right)}\geq t

which can be rewritten

 SNonExt+rmin⁢SExt-o⁢(NNonExt+rmin⁢NExt-1)≥tS_{\mathrm{NonExt}}+r_{\mathrm{min}}S_{\mathrm{Ext}}-{o\left(N_{\mathrm{NonExt% }}+{r_{\mathrm{min}}N_{\mathrm{Ext}}}-1\right)}\geq t

We note that N=NNonExt+rmin⁢NExtN=N_{\mathrm{NonExt}}+{r_{\mathrm{min}}N_{\mathrm{Ext}}} is just the exact number of glyphs used in the assembly. If there is only a single glyph, then the overlap value is irrelevant so we can assume NNonExt+r⁢NExt-1=N-1≥1N_{\mathrm{NonExt}}+{rN_{\mathrm{Ext}}}-1=N-1\geq 1. This provides the greatest theorical value for the overlap oo:

 omin≤o≤omaxtheorical=SNonExt+rmin⁢SExt-tNNonExt+rmin⁢NExt-1o_{\mathrm{min}}\leq o\leq o_{\mathrm{max}}^{\mathrm{theorical}}=\frac{S_{ \mathrm{NonExt}}+r_{\mathrm{min}}S_{\mathrm{Ext}}-t}{N_{\mathrm{NonExt}}+{r_{ \mathrm{min}}N_{\mathrm{Ext}}}-1}

Of course, we also have to take into account the limit imposed by the start and end connector lengths. So omaxo_{\mathrm{max}} must also be at most min⁡(ei,si+1)\min{(e_{i},s_{i+1})} for 0≤i≤n-20\leq i\leq n-2. But if rmin≥2r_{\mathrm{min}}\geq 2 then extender copies are connected and so omaxo_{\mathrm{max}} must also be at most min⁡(ei,si)\min{(e_{i},s_{i})} for i∈IExti\in I_{\mathrm{Ext}}. To summarize, omaxo_{\mathrm{max}} is the minimum of omaxtheoricalo_{\mathrm{max}}^{\mathrm{theorical}}, of eie_{i} for 0≤i≤n-20\leq i\leq n-2, of sis_{i} 1≤i≤n-11\leq i\leq n-1 and possibly of e0e_{0} (if 0∈IExt0\in I_{\mathrm{Ext}}) and of of sn-1s_{n-1} (if n-1∈IExt{n-1}\in I_{\mathrm{Ext}}).

With the algorithm described above NExtN_{\mathrm{Ext}}, NNonExtN_{\mathrm{NonExt}}, SExtS_{\mathrm{Ext}}, SNonExtS_{\mathrm{NonExt}} and rminr_{\mathrm{min}} and omaxo_{\mathrm{max}} can all be obtained using simple loops on the glyphs gig_{i} and so the complexity is O⁢(n)O(n). In practice nn is small: For existing fonts, assemblies are made of at most three non-extenders and two extenders that is n≤5n\leq 5 (incidentally, Gecko and WebKit do not currently support larger values of nn). This means that all the operations described above can be considered to have constant complexity. This is much better than a naive implementation of the iterative algorithm sketched in the OpenType MATH table specification which seems to require at worst

 ∑r=0rmin-1NNonExt+r⁢NExt=NNonExt⁢rmin+rmin⁢(rmin-1)2⁢NExt=O⁢(n×rmin2)\sum_{r=0}^{r_{\mathrm{min}}-1}{N_{\mathrm{NonExt}}+rN_{\mathrm{Ext}}}=N_{ \mathrm{NonExt}}r_{\mathrm{min}}+\frac{r_{\mathrm{min}}\left(r_{\mathrm{min}}-% 1\right)}{2}N_{\mathrm{Ext}}={O(n\times r_{\mathrm{min}}^{2})}

and at least Ω⁢(rmin)\Omega(r_{\mathrm{min}}).

One of issue is that the number of extender repetitions rminr_{\mathrm{min}} and the number of glyphs in the assembly NN can become arbitrary large since the target size tt can take large values e.g. if one writes \underbrace{\hspace{65535em}} in LaTeX. The improvement proposed here does not solve that issue since setting the coordinates of each glyph in the assembly and painting them require Θ⁢(N)\Theta(N) operations as well as (in the case of HarfBuzz) a glyph buffer of size NN. However, such large stretchy operators do not happen in real-life mathematical formulas. Hence to avoid possible hangs in Web engines a solution is to impose a maximum limit NmaxN_{\mathrm{max}} for the number of glyph in the assembly so that the complexity is limited by the size of the DOM tree. Currently, the proposal for HarfBuzz is Nmax=128N_{\mathrm{max}}=128. This means that if each assembly glyph is 1em large you won’t be able to draw stretchy operators of size more than 128em, which sounds a quite reasonable bound. With the above proposal, rminr_{\mathrm{min}} and so NN can be determined very quickly and the cases N≥NmaxN\geq N_{\mathrm{max}} rejected, so that we avoid losing time with such edge cases…

Finally, because in our proposal we use the same overlap $o$ everywhere, an alternative for HarfBuzz would be to set the output buffer size to $n$ (i.e. ignore the $r-1$ extra copies of each extender and only keep the first one). This will leave gaps that the client can fix by repeating extenders, as long as $o$ is also provided. Then HarfBuzz math shaping can be done with a complexity in time and space of just $O(n)$, and it will be up to the client to optimize or limit the painting of extenders for large values of $N$…

March 31, 2016

Michael Catanzaro: Positive progress on WebKitGTK+ security updates

Igalia WebKit

I previously reported that, although WebKitGTK+ releases regular upstream security updates, most Linux distributions are not taking the updates. At the time, only Arch Linux and Fedora were reliably releasing our security updates. So I’m quite pleased that openSUSE recently released a WebKitGTK+ security update, and then Mageia did too. Gentoo currently has an update in the works. It remains to be seen if these distros regularly follow up on updates (expect a follow-up post on this in a few months), but, optimistically, you now have several independent distros to choose from to get an updated version of WebKitGTK+, plus any distros that regularly receive updates directly from these distros.

Unfortunately, not all is well yet. It’s still not safe to use WebKitGTK+ on the latest releases of Debian or Ubuntu, or on derivatives like Linux Mint, elementary OS, or Raspbian. (Raspbian is notable because it uses an ancient, insecure version of Epiphany as its default web browser, and Raspberry Pis are kind of popular.)

And of course, no distribution has been able to get rid of old, insecure WebKitGTK+ 2.4 compatibility packages, so many applications on distributions that do provide security updates for modern WebKitGTK+ will still be insecure. (Don’t be fooled by the recent WebKitGTK+ 2.4.10 update; it contains only a few security fixes that were easy to backport, and was spurred by the need to add GTK+ 3.20 compatibility. It is still not safe to use.) Nor have distributions managed to remove QtWebKit, which is also old and insecure. You still need to check individual applications to see if they are running safe versions of WebKit.

But at least there are now several distros providing WebKitGTK+ security updates. That’s good.

Special thanks to Apple and to my colleagues at Igalia for their work on the security advisories that motivate these updates.

Michael Catanzaro: Epiphany 3.20

Igalia WebKit

So, what’s new in Epiphany 3.20?

First off: overlay scrollbars. Because web sites have the ability to style their scrollbars (which you’ve probably noticed on Google sites), WebKit embedders cannot use a normal GtkScrolledWindow to display content; instead, WebKit has to paint the scrollbars itself. Hence, when overlay scrollbars appeared in GTK+ 3.16, WebKit applications were left out. Carlos García Campos spent some time to work on this, and the result speaks for itself (if you fullscreen this video to see it properly):

Overlay scrollbars did not actually require any changes in Epiphany itself — all applications using an up-to-date version of WebKit will immediately benefit — but I mention it here as it’s one of the most noticeable changes. Read about other WebKit improvements, like the new Faster Than Light (FTL/B3) JavaScript compilation tier, on Carlos’s blog.

Next up, there is a new downloads manager, also by Carlos García Campos. This replaces the old downloads bar that used to appear at the bottom of the screen:

I flipped the switch in Epiphany to enable WebGL:

If you watched that video in fullscreen, you might have noticed that page is marked as insecure, even though it doesn’t use HTTPS. Like most browsers, we used to have several confusing security states. Pages with mixed content received a security warning that all users ignored, but pages with no security at all received no such warning. That’s pretty dumb, which is why Firefox and Chrome have been talking about changing this for a year or so now. I went ahead and implemented it. We now have exactly two security states: secure and insecure. If your page loads any content not over HTTPS, it will be marked as insecure. The vast majority of pages will be displayed as insecure, but it’s no less than such sites deserve. I’m not concerned at all about “warning fatigue,” because users are not generally expected to take any action on seeing these warnings. In the future, we will take this further, and use the insecure indicator for sites that use SHA-1 certificates.

Moving on. By popular request, I exposed the previously-hidden setting to disable session restore in the preferences dialog, as “Remember previous tabs on startup:”

Meanwhile, Carlos worked in both WebKit and Epiphany to greatly improve session restoration. Previously, Epiphany would save the URLs of the pages loaded in each tab, and when started it would load each URL in a new tab, but you wouldn’t have any history for those tabs, for example, and the state of the tab would otherwise be lost. Carlos worked on serializing the WebKit session state and exposing it in the WebKitGTK+ API, allowing us to restore full back/forward history for each tab, plus details like your scroll position on each tab. Thanks to Carlos, we also now make use of this functionality when reopening closed tabs, so your reopened tab will have a full back/forward list of history, and also when opening new tabs, so the new tab will inherit the history of the tab it was opened from (a feature that we had in the past, but lost when we switched to WebKit2).

Interestingly, we found the session restoration was at first too good: it would restore the page exactly as you last viewed it, without refreshing the content at all. This means that if, for example, you were viewing a page in Bugzilla, then when starting the browser you would miss any comments posted since you last loaded the page, until you refreshed it manually. This is actually the current behavior in Safari; it’s desirable on iOS to make the browser launch instantly, but questionable for desktop Safari. Carlos decided to always refresh the page content when restoring the session for WebKitGTK+.

Last, and perhaps least, there’s a new empty state displayed for new users, developed by Lorenzo Tilve and polished up by me, so that we don’t greet new users with a completely empty overview (where your most-visited sites are normally displayed):

That, plus a bundle of the usual bugfixes, significant code cleanups, and internal architectural improvements (e.g. I converted the communication between the UI process and the web process extension to use private D-Bus connections instead of the session bus). The best things have not changed: it still starts up about 5-20 times faster than Firefox in my unscientific testing; I expect you’ll find similar results.

Enjoy!

December 15, 2014

Web Engines Hackfest 2014

Gustavo Noronha

For the 6th year in a row, Igalia has organized a hackfest focused on web engines. The 5 years before this one were actually focused on the GTK+ port of WebKit, but the number of web engines that matter to us as Free Software developers and consultancies has grown, and so has the scope of the hackfest.

It was a very productive and exciting event. It has already been covered by Manuel Rego, Philippe Normand, Sebastian Dröge and Andy Wingo! I am sure more blog posts will pop up. We had Martin Robinson telling us about the new Servo engine that Mozilla has been developing as a proof of concept for both Rust as a language for building big, complex products and for doing layout in parallel. Andy gave us a very good summary of where JS engines are in terms of performance and features. We had talks about CSS grid layouts, TyGL – a GL-powered implementation of the 2D painting backend in WebKit, the new Wayland port, announced by Zan Dobersek, and a lot more.

With help from my colleague ChangSeok OH, I presented a description of how a team at Collabora led by Marco Barisione made the combination of WebKitGTK+ and GNOME’s web browser a pretty good experience for the Raspberry Pi. It took a not so small amount of both pragmatic limitations and hacks to get to a multi-tab browser that can play youtube videos and be quite responsive, but we were very happy with how well WebKitGTK+ worked as a base for that.

One of my main goals for the hackfest was to help drive features that were lingering in the bug tracker for WebKitGTK+. I picked up a patch that had gone through a number of iterations and rewrites: the HTML5 notifications support, and with help from Carlos Garcia, managed to finish it and land it at the last day of the hackfest! It provides new signals that can be used to authorize notifications, show and close them.

To make notifications work in the best case scenario, the only thing that the API user needs to do is handle the permission request, since we provide a default implementation for the show and close signals that uses libnotify if it is available when building WebKitGTK+. Originally our intention was to use GNotification for the default implementation of those signals in WebKitGTK+, but it turned out to be a pain to use for our purposes.

GNotification is tied to GApplication. This allows for some interesting features, like notifications being persistent and able to reactivate the application, but those make no sense in our current use case, although that may change once service workers become a thing. It can also be a bit problematic given we are a library and thus have no GApplication of our own. That was easily overcome by using the default GApplication of the process for notifications, though.

The show stopper for us using GNotification was the way GNOME Shell currently deals with notifications sent using this mechanism. It will look for a .desktop file named after the application ID used to initialize the GApplication instance and reject the notification if it cannot find that. Besides making this a pain to test – our test browser would need a .desktop file to be installed, that would not work for our main API user! The application ID used for all Web instances is org.gnome.Epiphany at the moment, and that is not the same as any of the desktop files used either by the main browser or by the web apps created with it.

For the future we will probably move Epiphany towards this new era, and all users of the WebKitGTK+ API as well, but the strictness of GNOME Shell would hurt the usefulness of our default implementation right now, so we decided to stick to libnotify for the time being.

Other than that, I managed to review a bunch of patches during the hackfest, and took part in many interesting discussions regarding the next steps for GNOME Web and the GTK+ and Wayland ports of WebKit, such as the potential introduction of a threaded compositor, which is pretty exciting. We also tried to have Bastien Nocera as a guest participant for one of our sessions, but it turns out that requires more than a notebook on top of a bench hooked up to a TV to work well. We could think of something next time ;D.

I’d like to thank Igalia for organizing and sponsoring the event, Collabora for sponsoring and sending ChangSeok and myself over to Spain from far away Brazil and South Korea, and Adobe for also sponsoring the event! Hope to see you all next year!

Web Engines Hackfest 2014 sponsors: Adobe, Collabora and Igalia

December 08, 2014

How to build TyGL

University of Szeged

This is a follow-up blog post of our announcement of TyGL - the 2D-accelerated GPU rendering port of WebKit.

We have received lots of feedback about TyGL, and we would like to thank you for all the questions, suggestions and comments. As we promised, let’s get into some technical details.

read more

November 12, 2014

Announcing the TyGL-WebKit port to accelerate 2D web rendering with GPU

University of Szeged

We are proud to announce the TyGL port (link: http://github.com/szeged/TyGL) on top of EFL-WebKit. TyGL (pronounced as tigel) is part of WebKit and provides 2D-accelerated GPU rendering on embedded systems. The engine is purely GPU based. It has been developed on and tested against an ARM Mali GPU, but it is designed to work on any GPU conforming to OpenGL ES 2.0 or higher.

GPU involvement in future graphics is inevitable considering the pixel growth rate of displays, but harnessing GPU power requires a different approach than CPU-based optimizations.

read more

October 22, 2014

Fuzzinator reloaded

University of Szeged

It's been a while since I last (and actually first) posted about Fuzzinator. Now I think that I have enough new experiences worth sharing.

More than a year ago, when I started fuzzing, I was mostly focusing on mutation-based fuzzer technologies since they were easy to build and pretty effective. Having a nice error-prone test suite (e.g. LayoutTests) was the warrant for fresh new bugs. At least for a while.

read more

What is ASM.JS?

Now that mobile computers and cloud services have become part of our lives, more and more developers see the potential of the web and online applications. ASM.JS, a strict subset of JavaScript, is a technology that provides a way to achieve near-native speed in browsers, without the need for any plugin or extension. It is also possible to cross-compile C/C++ programs to it and run them directly in your browser.

In this post we will compare the JavaScript and ASM.JS performance in different browsers, trying out various kinds of web applications and benchmarks.

read more

August 28, 2014

CSS Shapes now available in Chrome 37 release

Adobe Web Platform

Support for CSS Shapes is now available in the latest Google Chrome 37 release.

What can I do with CSS Shapes?

CSS Shapes lets you think out of the box! It gives you the ability to wrap content around any shape. Shapes can be defined by geometric shapes, images, and even gradients. Using Shapes as part of your website design takes a visitor’s visual and reading experience to the next level. If you want to start with some tutorials, please visit Sarah Soueidan’s article about Shapes.

Demo

The following shapes use case is from the Good Looking Shapes Gallery blog post.

Without CSS Shapes
With CSS Shapes

In the first picture, we don’t use CSS Shapes. The text wraps around the rectangular image container, which leads to a lot of empty space between the text and the visible part of the image.

In the second picture, we use CSS Shapes. You can see the wrapping behavior around the image. In this case the white parts of the image are transparent, thus the browser can automatically wrap the content around the visible part, which leads to this nice and clean, visually more appealing wrapping behavior.
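The wrapping in the second picture needs only a couple of declarations on the floated image. As a minimal sketch (the class name and image URL are hypothetical), assuming a browser with CSS Shapes enabled:

```css
.jug-figure {
  float: left;
  /* Derive the wrapping contour from the image's alpha channel */
  shape-outside: url("jug.png");
  /* Pixels with alpha above this threshold count as inside the shape */
  shape-image-threshold: 0.0;
  /* Keep a small gap between the contour and the wrapping text */
  shape-margin: 1em;
}
```

With `shape-image-threshold` at its default of 0.0, only fully transparent pixels fall outside the shape, which is exactly what produces the tight wrap around the visible part of the image.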

How do I get CSS Shapes?

Just update your Chrome browser to the latest version from the Chrome/About Google Chrome menu, or download the latest stable version from https://www.google.com/chrome/browser/.

I’d like to thank the collaboration of WebKit and Blink engineers, and everyone else in the community who has contributed to this feature. The fact that Shapes is shipping in two production browsers — Chrome 37 now and Safari 8 later this year — is the upshot of the open source collaboration between the people who believe in a better, more expressive web. Although Shapes will be available in these browsers, you’ll need another solution for the other browsers. The CSS Shapes Polyfill is one method of achieving consistent behavior across browsers.

Where should I start?

For more info about CSS Shapes, please check out the following links:

Let us know your thoughts or if you have nice demos, here or on Twitter: @AdobeWeb and @ZoltanWebKit.

May 13, 2014

Good-Looking Shapes Gallery

Adobe Web Platform

As a modern consumer of media, you rarely crack open a magazine or a pamphlet or anything that would be characterized as “printed”. Let me suggest that you take a walk on the wild side. The next time you are in a doctor’s office, or a supermarket checkout lane, or a library, thumb though a magazine. Most of the layouts you’ll find inside can also be found on the web, but not all of them. Layouts where content hugs the boundaries of illustrations are common in print and rare on the web. One of the reasons non-rectangular contour-hugging layouts are uncommon on the web is that they are difficult to produce.

They are not difficult to produce anymore.

The CSS Shapes specification is now in the final stages of standardization. This feature enables flowing content around geometric shapes (like circles and polygons), as well as around shapes defined by an image’s alpha channel. Shapes make it easy to produce the kinds of layouts you can find in print today, with all the added flexibility and power that modern online media affords. You can use CSS Shapes right now with the latest builds of WebKit and Blink based browsers, like Safari and Chrome.

Development of CSS Shapes has been underway for about two years, and we’ve been regularly heralding its progress here. Many of those reports have focused on the evolution of the spec and implementations, and they’ve included examples that emphasized basics over beauty. This article is an attempt to tilt the balance back towards good-looking. Listed below are simple shapes demos that we think look pretty good. Everyone on Adobe’s CSS Shapes engineering team contributed at least one.

There’s a live CodePen.io version of each demo in the gallery. Click on the demo screenshot or one of the handy links to take a look. You’ll want to view the demos with a browser that supports Shapes and you’ll need to enable CSS Shapes in that browser. For example you can use a nightly build of the Safari browser or you can enable shapes in Chrome or Chrome Canary like this:

1. Copy and paste chrome://flags/#enable-experimental-web-platform-features into the address bar, then press enter.
2. Click the ‘Enable’ link within that section.
3. Click the ‘Relaunch Now’ button at the bottom of the browser window.

A few of the demos use the new Shapes Polyfill and will work in most browsers.

And now, without further ado, please have a look through our good-looking shapes gallery.

Ozma of Oz

This demo reproduces the layout style that opens many of the chapters of the L. Frank Baum books, including Ozma of Oz. The first page is often dominated by an illustration on the left or right. The chapter’s text conforms to the illustration, but not too tightly. The books were published over 100 years ago and they still look good in print. With CSS Shapes they can still look good on the web.

Top Cap

The conventional “drop-cap” opens a paragraph by enlarging and highlighting the first letter, word or phrase. The drop-cap’s goal is to draw your attention to where you can start reading. This demo delivers the same effect by crowning the entire opening paragraph with a “top cap” that funnels your attention into the article. In both cases, what’s going on is a segue from a graphic element to the text.

Violator

A violator is a small element that “violates” rectangular text layout by encroaching on a corner or a small part of an edge. This layout idiom is common in short-form magazines and product packaging. That “new and improved” banner which blazes through the corner of thousands of consumer products (whether or not they are new or improved) – it’s a violator.

Column Interest

When a print magazine feels the need to incorporate some column-layout melodrama, it often reaches for this idiom. The shape spans a pair of columns, which creates visual interest in the middle of the page. Without it you’d be faced with a wall of attention-sapping text and would more than likely turn the page.

Caption

The old-school approach for including a caption with an image is to put the caption text alongside or below the image. Putting a caption on top of an image requires a little more finesse, since you have to ensure that the text doesn’t obscure anything important and that the text is rendered in a way that preserves readability.  The result can be relatively attractive.

This photograph was taken by Zoltan Horvath who has pointed out that I’ve combined a quote about tea with a picture of a ceremonial wine jug.  I apologize for briefly breaching that beverage boundary. It’s just a demo.

Paging

With a layout like this, one could simply let the content wrap around the shape on the right and then expand into the usual rectangle. In this demo the content is served up a paragraph at a time, in response to the left and right arrow keys.

Note also: yes in fact the mate gourd is perched on exactly the same windowsill as the previous demo. Zoltan and Pope Francis are among the many fans of yerba mate tea.

Ersatz shape-inside

Originally the CSS Shapes spec included shape-inside as well as shape-outside. Sadly, shape-inside was promoted to “Level 2” of the spec and isn’t available in the current implementations. Fortunately for shape insiders everywhere, it’s still sometimes possible to mimic shape-inside with an adjacent pair of carefully designed shape-outside floats. This demo is a nice example of that, where the text appears inside a bowl of oatmeal.

Animation

This is an animated demo, so to appreciate it you’ll really need to take a look at the live version. It is an example of using an animated shape to draw the user’s attention to a particular message.  Of course one must use this approach with restraint, since an animated loop on a web page doesn’t just gently tug at the user’s attention. It drags at their attention like a tractor beam.

Performance

Advertisements are intended to grab the user’s attention, and a second or two of animation will do that. In this demo a series of transition motions have been strung together into a tiny performance that will temporarily get the reader’s attention. The highlight of the performance is – of course – the text snapping into the robot’s contour for the finale. Try to imagine a soundtrack that punctuates the action with some whirring and clanking noises; it’s even better that way.

April 24, 2014

Adobe Web Platform Goes to the 2014 WebKit Contributors’ Meeting

Adobe Web Platform

Last week, Apple hosted the 2014 WebKit Contributors’ Meeting at their campus in Cupertino. As usual it was an unconference-style event, with session scheduling happening on the morning of the first day. While much of the session content was very specific to WebKit implementation, there were topics covered that are interesting to the wider web community. This post is a roundup of some of these topics from the sessions that Adobe Web Platform Team members attended.

CSS Custom Properties for Cascading Variables

Alan Stearns suggested a session on planning a new implementation of CSS Custom Properties for Cascading Variables. While implementations of this spec have been attempted in WebKit in the past, they never got past the experimental stage. Despite this, there is still much interest in implementing this feature. In addition, the current version of the spec has addressed many of the issues that WebKit contributors had previously expressed. We talked about a possible issue with using variables in custom property values, which Alan is investigating. More detail is available in the notes from the Custom Properties session.

CSS Regions

Andrei Bucur presented the current state of the CSS Regions implementation in WebKit. The presentation was well received and well attended. Notably, this was one of the few sessions with enough interest that it had a time slot all to itself.

While CSS Regions shipped last year in iOS 7 and Safari 6.1 and 7, the implementation in WebKit hasn’t been standing still. Andrei mentioned the following short list of changes in WebKit since the last Safari release:

• correct painting of fragments and overflow
• scrollable regions
• accelerated content inside regions
• position: fixed elements
• the regionoversetchange event
• better selection
• better WebInspector integration
• and more…

Andrei’s slides outlining the state of CSS Regions also contain a roadmap for the feature’s future in WebKit as well as a nice demo of the fix to fragment and overflow handling. If you are following the progress of CSS Regions in WebKit, the slides are definitely worth a look. (As of this writing, the Regions demo in the slides only works in Safari and WebKit Nightly.)

CSS Shapes

Zoltan Horvath, Bear Travis, and I covered the current state of CSS Shapes in WebKit. We are almost done implementing the functionality in Level 1 of the CSS Shapes Specification (which is itself a Candidate Recommendation, the last step before becoming an official W3C standard). The discussion in this session was very positive. We received good feedback on use cases for shape-outside and even talked a bit about the possibilities for when shape-inside is revisited as part of CSS Shapes Level 2. While I don’t have any slides or demos to share at the moment, we will soon be publishing a blog post to bring everyone up to date on the latest in CSS Shapes. So watch this space for more!

Subpixel Layout

This session was mostly about implementation. However, Zalan Bujtas drew an interesting distinction between subpixel layout and subpixel painting. Subpixel layout allows for better space utilization when laying out elements on the page, as boxes can be sized and positioned more precisely using fractional units. Subpixel painting allows for better utilization of high DPI displays by actually drawing elements on the screen using fractional CSS pixels (For example: on a 2x “Retina” display, half of a CSS pixel is one device pixel). Subpixel painting allows for much cleaner lines and smoother animations on high DPI displays when combined with subpixel layout. While subpixel layout is currently implemented in WebKit, subpixel painting is currently a work in progress.

Web Inspector

The Web Inspector is full of shiny new features. The front-end continues to shift to a new design, while the back-end gets cleaned up to remove cruft. The architecture for custom visual property editors is in place and will hopefully enable quick and intuitive editing of gradients, transforms, and animations in the future. Other goodies include new breakpoint actions (like value logging), a redesigned timeline, and IndexedDB debugging support. The Web Inspector still has room for new features, and you can always check out the #webkit-inspector channel on freenode IRC for the latest and greatest.

Web Components

The Web Components set of features continues to gather interest from the browser community. Web Components is made up of four different features: HTML Components, HTML Imports, Shadow DOM, and HTML Templates. The general gist of the talk was that the Web Components concepts are desirable, but there are concerns that the features’ complexity may make implementation difficult. The main concerns seemed to center around performance and encapsulation with Shadow DOM, and will hopefully be addressed with a prototype implementation of the feature (in the works). You can also take a look at the slides from the Web Components session.

CSS Grid Layout

The WebKit implementation of the CSS Grid Layout specification is relatively advanced. After learning in this session that the only way to test out Grid Layout in WebKit was to make a custom build with it enabled, session attendees concluded that it should be turned on by default in the WebKit Nightlies. So in the near future, experimenting with Grid Layout in WebKit should be as easy as installing a nightly build.

More?

As I mentioned earlier, this was just a high-level overview of a few of the topics at this year’s WebKit Contributors’ Meeting. Notes and slides for some of the topics not mentioned here are available on the 2014 WebKit Meeting page in the wiki. The WebKit project is always welcoming new contributors, so if you happen to see a topic on that wiki page that interests you, feel free to get in touch with the community and see how you can get involved.

Acknowledgements

This post would not have been possible without the notes and editing assistance of my colleagues on the Adobe Web Platform Team that attended the meeting along with me: Alan Stearns, Andrei Bucur, Bear Travis, and Zoltan Horvath.

March 18, 2014

QtWebKit is no more, what now?

Gustavo Noronha

Driven by the technical choices of some of our early clients, QtWebKit was one of the first web engines Collabora worked on, building the initial support for NPAPI plugins and more. Since then we had kept in touch with the project from time to time when helping clients with specific issues, hardware or software integration, and particularly GStreamer-related work.

With Google forking Blink off WebKit, a decision had to be made by all vendors of browsers and platform APIs based on WebKit on whether to stay or follow Google instead. After quite a bit of consideration and prototyping, the Qt team decided to take the second option and build the QtWebEngine library to replace QtWebKit.

The main advantage of WebKit over Blink for engine vendors is the ability to implement custom platform support. That meant QtWebKit was able to use Qt graphics and networking APIs and other Qt technologies for all of the platform-integration needs. It also enjoyed the great flexibility of using GStreamer to implement HTML5 media. GStreamer brings hardware-acceleration capabilities, support for several media formats and the ability to expand that support without having to change the engine itself.

People who are using QtWebKit because it is GStreamer-powered will probably be better served by switching to one of the remaining GStreamer-based ports, such as WebKitGTK+. Those who don’t care about the underlying technologies but really need or want to use Qt APIs will be better served by porting to the new QtWebEngine.

It’s important to note though that QtWebEngine drops support for Android and iOS as well as several features that allowed tight integration with the Qt platform, such as DOM manipulation through the QWebElement APIs, making QObject instances available to web applications, and the ability to set the QNetworkAccessManager used for downloading resources, which allowed for fine-grained control of the requests and sharing of cookies and cache.

It might also make sense to go Chromium/Blink, either by using the Chrome Content API or by switching to one of its siblings (QtWebEngine included), if the goal is to make a browser that needs no integration with existing toolkits or environments. You will be limited to the formats supported by Chrome and the hardware platforms targeted by Google. Blink does not allow multiple implementations of the platform support layer, so you are stuck with what upstream decides to ship, or with a fork to maintain.

It is a good alternative when Android itself is the main target. That is the technology used to build its main browser. The main advantage here is you get to follow Chrome’s fast-paced development and great support for the targeted hardware out of the box. If you need to support custom hardware or to be flexible on the kinds of media you would like to support, then WebKit still makes more sense in the long run, since that support can be maintained upstream.

At Collabora we’ve dealt with several WebKit ports over the years, and still actively maintain the custom WebKit Clutter port out of tree for clients. We have also done quite a bit of work on Chromium-powered projects. Some of the decisions you have to make are not easy and we believe we can help. Not sure what to do next? If you have that on your plate, get in touch!

February 25, 2014

Improving your site’s visual details: CSS3 text-align-last

Adobe Web Platform

In this post, I want to give a status report regarding the text-align-last CSS3 property. If you are interested in taking control of the small visual details of your site with CSS, I encourage you to keep reading.

The problem

First, let’s talk about why we need this property. You’ve probably already seen many text blocks on pages that don’t quite seem visually correct, because the last line isn’t justified with the previous lines. Check out the example paragraph below:

In the first column, the last line isn’t justified. This is the expected behavior when you apply the ‘text-align: justify’ CSS property to a container. On the other hand, in the second column, the content is entirely justified, including the last line.

The solution

The magic comes from the ‘text-align-last’ CSS3 property, which is set to ‘justify’ on the second container. The text-align-last property is part of the CSS Text Module Level 3 specification, which is currently a working draft. It describes how the last line of a block, or a line right before a forced line break, is aligned when ‘text-align’ is ‘justify’, which means you gain full control over the alignment of the last line of a block. The property allows several more options, which you can read about on the WebPlatform.org docs or in the CSS Text Module Level 3 W3C specification.
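As a rough sketch of the two-column example described above (the class names are made up for illustration; depending on the build, the property may still require a vendor prefix):

```css
/* First column: the last line stays ragged, the default with justify. */
.column-default {
    text-align: justify;
}

/* Second column: the last line is stretched to the full width as well. */
.column-justify-last {
    text-align: justify;
    text-align-last: justify;
}
```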

A possible use case (Added April 2014)

After looking at the previous example (which focused on the functionality of the property), let’s move on to a more realistic use case. The feature is perfect for making multi-line captions look better. Check out the centered and the justified image caption examples below.

And now, compare them with a justified, multi-line caption whose last line has been centered by ‘text-align-last: center’.
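In CSS, that caption styling could look something like the following (the figcaption selector is an assumption about the markup):

```css
figcaption {
    text-align: justify;     /* justify every full line of the caption */
    text-align-last: center; /* ...but center the final, partial line */
}
```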

I think the proper alignment of the last line gives the caption a more polished look.

Browser Support

I recently added rendering support for the property in WebKit (Safari) based on the latest specification. Dongwoo Joshua Im from Samsung added rendering support in Blink (Chrome). If you’d like to try it out in WebKit, you’ll need to make a custom developer build with the CSS3 text support build flag (--css3-text) enabled.

The property is already included in Blink’s developer nightlies by default, so after launching your latest Chrome Canary you only need to enable ‘Enable experimental Web Platform features’ under chrome://flags, and enjoy full control over your last lines.

Developer note

Please keep in mind that both the W3C specification and the implementations are under experimental status. I’ll keep blogging about the feature and let you know if anything changes, including when the feature ships for production use!

December 11, 2013

WebKitGTK+ hackfest 5.0 (2013)!

Gustavo Noronha

For the fifth year in a row the fearless WebKitGTK+ hackers have gathered in A Coruña to bring GNOME and the web closer. Igalia has organized and hosted it as usual, welcoming a record 30 people to its office. The GNOME Foundation has sponsored my trip, allowing me to fly the cool 18-seat propeller airplane from Lisbon to A Coruña, which is a nice adventure, and to have pulpo a feira for dinner, which I simply love! That in addition to enjoying the company of so many great hackers.

Web with wider tabs and the new prefs dialog

The goals for the hackfest have been ambitious, as usual, but we made good headway on them. Web the browser (AKA Epiphany) has seen a ton of little improvements, with Carlos splitting the shell search provider into a separate binary, which allowed us to remove some hacks from the browser’s session management code. It also makes testing changes to Web more convenient again. Jon McCann has been pounding at Web’s UI, making it more sleek, with tabs that expand to make better use of the available horizontal space in the tab bar, and new dialogs for preferences, cookies and password handling. I have made my tiny contribution by making it no longer keep around tabs that were created just for what turned out to be a download. For this last day of the hackfest I plan to also fix an issue with text encoding detection and help track down a hang that happens upon page load.

Martin Robinson and Dan Winship hack

Martin Robinson and myself have, as usual, dived into the more disgusting and wide-reaching maintainership tasks that we have lots of trouble pushing forward in our day-to-day lives. Porting our build system to CMake has been one of these long-term goals, not because we love CMake (we don’t) or because we hate autotools (we do), but because it should make people’s lives easier when adding new files to the build, and should also make our build less hacky and quicker – it is sad to see how slow our build can be compared to something like Chromium, and we think a big part of the problem lies in how complex and dumb autotools and make can be. We have picked up a few of our old branches, brought them up to date, and landed them, which now lets us build the main WebKit2GTK+ library through CMake in trunk. This is an important first step, but there’s plenty to do.

Hackers take advantage of the icecream network for faster builds

Under the hood, Dan Winship has been pushing HTTP/2 support for libsoup forward, with a dead-tree version of the spec by his side. He is refactoring libsoup internals to accommodate the new code paths. Still on the HTTP front, I have been updating soup’s MIME type sniffing support to match the newest living specification, which includes sniffing rules for several new types and a new security feature introduced by Internet Explorer and later adopted by other browsers. The huge task of preparing the ground for one process per tab (or other kinds of process separation; this will still be a topic for discussion for a while) has been pushed forward by several hackers, with Carlos Garcia and Andy Wingo leading the charge.

Jon and Guillaume battling code

Other than that, I have been putting in some more work on improving the integration of the new Web Inspector with WebKitGTK+. Carlos has reviewed the patch to allow attaching the inspector to the right side of the window, but we have decided to split it in two: one part providing the functionality, and the other the API that will allow browsers to customize how that is done. There’s a lot of work to be done here; I plan to land at least this first patch during the hackfest. I have also fought one more battle in the never-ending User-Agent sniffing war, which, it looks like, we cannot win.

Hackers chillin’ at A Coruña

I am very happy to be here for the fifth year in a row, and I hope we will be meeting here for many more years to come! Thanks a lot to Igalia for sponsoring and hosting the hackfest, and to the GNOME foundation for making it possible for me to attend! See you in 2014!

August 27, 2013

HTML Alchemy – Combining CSS Shapes with CSS Regions

Adobe Web Platform

Note: Support for shape-inside was only available up to the following nightly builds: WebKit r166290 (2014-03-26); Chromium 260092 (2014-03-28).

I have been working on rendering for almost a year now. I landed the initial implementation of Shapes on Regions in both Blink and WebKit, and I’m incredibly excited to talk a little bit about these features and how you can combine them.

Don’t know what CSS Regions and Shapes are? Start here!

The first ingredient in my HTML alchemy kitchen is CSS Regions. With CSS Regions, you can flow content into multiple styled containers, which gives you enormous creative power to make magazine style layouts. The second ingredient is CSS Shapes, which gives you the ability to wrap content inside or outside any shape. In this post I’ll talk about the “shape-inside” CSS property, which allows us to wrap content inside an arbitrary shape.

Let’s grab a bowl and mix these two features, CSS Regions and CSS Shapes, together to produce some really interesting layouts!

In the latest Chrome Canary and Safari WebKit Nightly, after enabling the required experimental features, you can flow content continuously through multiple kinds of shapes. This rocks! You can step out from the rectangular text flow world and break up text into multiple, non-rectangular shapes.

Demo

If you already have the latest Chrome Canary/Safari WebKit Nightly, you can just go ahead and try a simple example on codepen.io. If you are too lazy, or if you want to extend your mouse button life by saving a few button clicks, you can continue reading.

In the picture above we see that the “Lorem ipsum” story flows through 4 different, colorful regions. There is a circle shape on each of the first two fixed-size regions. Check out the code below to see how we apply the shape to the region. It’s pretty straightforward, right?
#region1, #region2 {
    -webkit-flow-from: flow;
    background-color: yellow;
    width: 200px;
    height: 200px;
    -webkit-shape-inside: circle(50%, 50%, 50%);
}
The content flows into the third (percentage-sized) region, which represents a heart (drawn by me, all rights reserved). I defined the heart’s coordinates in percentages, so the heart will stretch as you resize the window.
#region3 {
    -webkit-flow-from: flow;
    width: 50%;
    height: 400px;
    background-color: #EE99bb;
    -webkit-shape-inside: polygon(11.17% 10.25%,2.50% 30.56%,3.92% 55.34%,12.33% 68.87%,26.67% 82.62%,49.33% 101.25%,73.50% 76.82%,85.17% 65.63%,91.63% 55.51%,97.10% 31.32%,85.79% 10.21%,72.47% 5.35%,55.53% 14.12%,48.58% 27.88%,41.79% 13.72%,27.50% 5.57%);
}

The content that doesn’t fit in the first three regions flows into the fourth region. The fourth region (see the retro-blue background color) has its CSS width and height set to auto, so it grows to fit the remaining content.
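A minimal sketch of how such an auto-sized overflow region could be declared, following the same prefixed syntax as the snippets above (the selector and the exact color are my assumptions):

```css
#region4 {
    -webkit-flow-from: flow;   /* receives whatever didn't fit in the earlier regions */
    width: auto;               /* grows to fit the remaining content */
    height: auto;
    background-color: #99bbee; /* the "retro-blue" background */
}
```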

Real world examples

After trying the demo and checking out the links above, I’m sure you’ll see the opportunities for using shape-inside with regions in your next design. If you have some thoughts on this topic, don’t hesitate to comment. Please keep in mind that these features are under development, and you might run into bugs. If you do, you should report them on WebKit’s Bugzilla for Safari or Chromium’s issue tracker for Chrome. Thanks for reading!

August 06, 2013

WebGL, at last!

Brent Fulgham

It's been a long time since I've written an update -- but my lack of blog posting is not an indication of a lack of progress in WebKit or the WinCairo port. Since I left my former employer (who *still* hasn't gotten around to updating the build machine I set up there), we've:

• Migrated from Visual Studio 2005 to Visual Studio 2010 (and soon, VS2012)
• Enabled New-run-webkit-tests
• Updated the WinCairo Support Libraries to support 64-bit builds
• Integrated a ton of cURL improvements and extensions thanks to the TideSDK guys
• and ...
... thanks to the hard work of Alex Christensen, brought up WebGL on the WinCairo port.  This is a little exciting for me, because it marks the first time (I can recall) where the WinCairo port actually gained a feature that was not already part of the core Apple Windows port.

The changes needed to see these circa-1992 graphics in all their three-dimensional glory are already landed in the WebKit tree.  You just need to:

1. Enable the libEGL, libGLESv2, translator_common, translator_glsl, and translator_hlsl projects for the WinCairo build (they are currently turned off).
2. Make the following change to WTF/wtf/FeatureDefines.h:

Brent Fulgham@WIN7-VM ~/WebKit/Source/WTF/wtf

2. Directory setup

It is suggested (and actually required by some build scripts) to have a base directory holding the Qt5, Qt Components, and WebKit project sources. The suggested base directory can be created by running:

3.2. rsync-scripts

$ wget http://trac.webkit.org/attachment/wiki/SettingUpDevelopmentEnvironmentForN9/rsync-scripts.tar.gz?format=raw
$ tar xzf rsync-scripts.tar.gz

4. Download required sources

$ git clone git://gitorious.org/qtwebkit/testfonts.git

4.2. Qt5, QtComponents and WebKit

The script below, when run successfully, will create the ~/swork/qt5, ~/swork/qtcomponents and ~/swork/webkit directories:

$ browser-scripts/clone-sources.sh --no-ssh

NOTE: You can also manually download sources, but remember to stick with the directory names described above.

5. Pre-build hacks

5.1. Qt5 translations

Qt5 translations are not properly handled by the cross-platform toolchain. This happens mainly because the lrelease application is called to generate Qt message files, but since it is an ARMEL binary your system is probably not capable of running it natively (unless you have the misc_runner kernel module properly set up, in which case you can safely skip this step). Instead, you can use lrelease from your system’s Qt binaries without any worries.

If you have a Scratchbox environment set up, it is suggested that you stop its service first:

$ sudo service scratchbox-core stop

Now you can manually generate the Qt message files by running:

$ cd ~/swork/qt5/qttranslations/translations
$ for file in `ls *ts`; do lrelease $file -qm `echo "$file" | sed 's/ts$/qm/'`; done

5.2. Disable jsondb-client tool

The QtJsonDB module from Qt5 contains a tool called jsondb-client, which depends on libedit (not available on the MADDE target). It is safe to disable its compilation for now:

$ ln -s ~/swork/qt5/qtbase/mkspecs ~/QtSDK/Madde/sysroots/harmattan_sysroot_10.2011.34-1_slim/home/<USER>/swork/qt5/mkspecs

6. Build sources

You can execute the script that will build all sources using the cross-compilation setup:

$ browser-scripts/build-sources.sh --cross-compile

If everything went well, you now have the most up-to-date binaries for Qt5/WebKit2 development for the Nokia N9. Please have a look at WebKit’s wiki for more information about how to update sources after a previous build and how to keep files in sync with the device. The guide assumes the PR1.1 firmware for the N9 device, which is already outdated, so I might follow up with updated instructions on how to safely sync files to your PR1.2-enabled device.

That’s all for now, I appreciate your comments and feedback!

March 10, 2012

WebKitGTK+ Debian packaging repository changes

Gustavo Noronha

For a while now the git repository used for packaging WebKitGTK+ has been broken. Broken as in nobody was able to clone it. In addition to that, the packaging workflow had been changing over time, from a track-upstream-git/patches-applied one to an import-orig-only/patches-not-applied one.

After spending some more time trying to unbreak the repository for the third time, I decided it might be a good time for a clean-up. I created a new repository and imported all upstream versions for the 1.2.x series (which is in squeeze), 1.6.x (unstable), and 1.7.x (experimental). I also imported packaging-related commits for those versions using git format-patch and black magic.

One of the good things about doing this move, which should make hacking on the WebKitGTK+ Debian package more pleasant and accessible, can be seen here:

kov@goiaba ~/s/debian-webkit> du -sh webkit/.git webkit.old/.git
 27M    webkit/.git
1.6G    webkit.old/.git

If you care about the old repository, it’s on git.debian.org still, named old-webkit.git. Enjoy!

December 07, 2011

WebKitGTK+ hackfest \o/

Gustavo Noronha

It’s been a couple days since I returned from this year’s WebKitGTK+ hackfest in A Coruña, Spain. The weather was very nice, not too cold and not too rainy, we had great food, great drinks and I got to meet new people, and hang out with old friends, which is always great!

Hackfest black board, photo by Mario

I think this was a very productive hackfest, and as usual a very well organized one! Thanks to the GNOME Foundation for the travel sponsorship, to our friends at Igalia for doing an awesome job at making it happen, and to Collabora for sponsoring it and granting me the time to go there! We got a lot done, and although, as usual, our goals list had many items not crossed off, we did cross off a few very important ones. I took part in discussions about the new WebKit2 APIs, got to know the new design for GNOME’s Web application, which looks great, discussed Accelerated Compositing with Joone, Alex, Nayan and Martin Robinson, hacked libsoup a bit to port the multipart/x-mixed-replace patch I wrote to the awesome gio-based infrastructure Dan Winship is building, and did some random miscellaneous work.

The biggest chunk of time, though, ended up being devoted to a very uninteresting (to outsiders, at least), but very important task: making it possible to more easily reproduce our test results. TL;DR? We made our bots’ and development builds use jhbuild to automatically install dependencies; if you’re using tarballs, don’t worry, your usual autogen/configure/make/make install have not been touched. Now to the more verbose version!

The need

Our three build slaves reporting a few failures

For a couple of years now we have supported an increasingly complex and very demanding automated testing infrastructure. We have three buildbot slaves: one provided by Collabora (which I maintain), and two provided by Igalia (maintained by their WebKitGTK+ folks). Those bots build as many check-ins as possible with 3 different configurations: 32-bit release, 64-bit release, and 64-bit debug.

In addition to those, we have another bot called the EWS, or Early Warning System. There are two of those at this moment: one VM provided by Collabora and my desktop, provided by myself. These bots build every patch uploaded to the bugzilla, and report build failures or passes (you can see the green bubbles). They are very important to our development process because if the patch causes a build failure for our port people can often know that before landing, and try fixes by uploading them to bugzilla instead of doing additional commits. And people are usually very receptive to waiting for EWS output and acting on it, except when they take way too long. You can have an idea of what the life of an EWS bot looks like by looking at the recent status for the WebKitGTK+ bots.

Maintaining all of those bots is at times a rather daunting task. The tests require a very specific set of packages, fonts, themes and icons to always report the same size for objects in a render. Upgrades, for instance, had to be synchronized, and usually involved generating new baselines for a large number of tests. You can see in these instructions, for instance, how strict the environment requirements are – yes, we need specific versions of fonts, because they often cause layouts to change in size! At one point we had tests fail after a compiler upgrade, which made rounding act a bit differently!

So stability was a very important aspect of maintaining these bots. All of them have the same version of Debian, and most of the packages are pinned to the same version. On the other hand, and in direct contradiction to the stability requirement, we often require bleeding edge versions of some libraries we rely on, such as libsoup. Since we started pushing WebKitGTK+ to be libsoup-only, its own progress has been pretty much driven by WebKitGTK+’s requirements, and Dan Winship has made it possible to make our soup backend much, much simpler and way more featureful. That meant, though, requiring very recent versions of soup.

To top it off, for anyone not running Debian testing and tracking the exact same versions of packages as the bots it was virtually impossible to get the tests to pass, which made it very difficult for even ourselves to make sure all patches were still passing before committing something. Wow, what a mess.

The explosion^Wsolution

So a few weeks back Martin Robinson came up with a proposed solution, which, as he says, is the “nuclear bomb” solution. We would have a jhbuild environment which would build and install all of the dependencies necessary for reproducing the test expectations the bots have. So over the first three days of the hackfest Martin and myself hacked away in building scripts, buildmaster integration, a jhbuild configuration, a jhbuild modules file, setting up tarballs, and wiring it all in a way that makes it convenient for the contributors to get along with. You’ll notice that our buildslaves now have a step just before compiling called “updated gtk dependencies” (gtk is the name we use for our port in the context of WebKit), which runs jhbuild to install any new dependencies or version bumps we added. You can also see that those instructions I mentioned above became a tad simpler.

It took us way more time than we thought for the dust to settle, but it eventually began to. The great thing of doing it during the hackfest was that we could find and fix issues with weird configurations on the spot! Oh, you build with AR_FLAGS=cruT and something doesn’t like it? OK, we fix it so that the jhbuild modules are not affected by that variable. Oh, turns out we missed a dependency, no problem, we add it to the modules file or install them on the bots, and then document the dependency. I set up a very clean chroot which we could use for trying out changes so as to not disrupt the tree too much for the other hackfest participants, and I think overall we did good.

The aftermath

By the time we were done, our colleagues who run other distributions such as Fedora were already able to get substantial improvements to the number of tests passing, and so did we! Also, the ability to seamlessly upgrade all the bots with a simple commit made it possible for us to very easily land a change that required a very recent (as in unreleased) version of soup, which made our networking backend way simpler. All that red looks great, doesn’t it? And we aren’t done yet; we’ll certainly be making more tweaks to this infrastructure to make it more transparent and more helpful to the users (contributors and other people interested in running the tests).

If you’ve been hit by the instability we caused, sorry about that, poke mrobinson or myself in the #webkitgtk+ IRC channel on FreeNode, and we’ll help you out or fix any issues. If you haven’t, we hope you enjoy all the goodness that a reproducible testing suite has to offer! That’s it for now, folks, I’ll have more to report on follow-up work started at the hackfest soon enough, hopefully =).

November 29, 2011

Accelerated Compositing in webkit-clutter

Gustavo Noronha

For a while now my fellow Collaboran Joone Hur has been working on implementing the Accelerated Compositing infrastructure available in WebKit in webkit-clutter, so that we can use Clutter’s powers for compositing separate layers and performing animations. This work is being done by Collabora and is sponsored by BOSCH, whom I’d like to thank! What does all this mean, you ask? Let me tell you a bit about it.

The way animations usually work in WebKit is by repainting parts of the page every few milliseconds. What that means in technical terms is that an area of the page gets invalidated, and since the whole page is one big image, all of the pieces in that part of the page have to be repainted: the background, and any divs, images, and text that are in that area.

What the accelerated compositing code paths allow is the creation of separate pieces to represent some of the layers, allowing the composition to happen on the GPU, removing the need to perform lots of cairo paint operations per second in many cases. So if we have a semi-transparent video moving around the page, we can have that video be a separate texture that is layered on top of the page, made transparent and animated by the GPU. In webkit-clutter’s case this is done by having separate actors for each of the layers.
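As a generic illustration (not code from webkit-clutter itself), content animated with a CSS 3D transform is a typical candidate for its own compositing layer, so an animation like the following can be handled by the GPU instead of triggering repeated cairo repaints:

```css
.floating-video {
    opacity: 0.5;                                         /* semi-transparent layer */
    -webkit-transform: translate3d(0, 0, 0);              /* hints the engine to give the element its own layer */
    -webkit-transition: -webkit-transform 2s ease-in-out; /* animated by compositing, not by repainting */
}

.floating-video.moved {
    -webkit-transform: translate3d(300px, 150px, 0);      /* moves the layer across the page on the GPU */
}
```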

I have been looking at this code on and off, and recently joined Joone in the implementation of some of the pieces. The accelerated compositing infrastructure was originally built by Apple and, for that reason, works in a way that is very similar to Core Animation. The code is still a bit all over the place as we work on figuring out how best to translate the concepts into Clutter concepts, and there are several bugs, but some cool demos are already possible! Below is one of the CSS3 demos that Apple made to show off this new functionality, running in our MxLauncher test browser.

You can also see that the non-accelerated version is unable to represent the 3D space correctly. Also, can you guess which of the two MxLauncher instances is using less CPU? In this second video I show the debug borders being painted around the actors that were created to represent layers.

The code, should you like to peek or test is available in the ac2 branch of our webkit-clutter repository: http://gitorious.org/webkit-clutter/webkit-clutter/commits/ac2

We still have plenty of work to do, so expect to hear more about it. During our annual hackfest in A Coruña we plan to discuss how this work could also be integrated into the WebKitGTK+ port, perhaps by taking advantage of clutter-gtk, which would benefit both ports by sharing code and maintenance, and provide this great functionality to Epiphany users. Stay tuned!

October 09, 2011

Tests Active

Brent Fulgham

Looking back over this blog, I see that it was around a year ago that I got the initial WinCairo buildbot running. I'm very pleased to announce that I have gotten ahold of a much more powerful machine, and am now able to run a full build and tests in slightly under an hour -- a huge improvement over the old hardware which took over two hours just to build the software!

This is a big step, because we can now track regressions and gauge correctness compared to the other platforms. Up to now, testing has largely consisted of periodic manual runs of the test suite, and a separate set of high-level tests run as part of a larger application. This was not ideal, because it was easy for low-level functions in WebKit that I rarely use to be broken and missed.

All is not perfect, of course. Although over 12,000 tests now run (successfully) with each build, that is effectively two thirds of the full test suite. Most of the tests I have disabled are due to small differences in the output layout. I'm trying to understand why these differences exist, but I suspect many of them simply reflect small differences in Cairo compared to the CoreGraphics rendering layer.

If any of you lurkers are interested in helping out, trying out some of the tests I have disabled and figuring out why they fail would be a huge help!

July 14, 2011

An Unseasonable Snowfall

Brent Fulgham

A year or two ago I ported the Cocoa "CallJS" application to MFC for use with WebKit. The only feedback I ever got on the topic was a complaint that it would not build under the Visual Studio Express software many people used.

After seeing another few requests on the webkit-help mailing list for information on calling JavaScript from C++ (and vice-versa), I decided to dust off the old program and convert it to pure WINAPI calls so that VS Express would work with it.

Since my beloved Layered Window patches finally landed in WebKit, I also incorporated a transparent WebKit view floating over the main application window. Because I suck at art, I stole... er, appropriated... the Let It Snow animation example to give the transparent layer something to do.

Want to see what it looks like?

July 10, 2011

Updated WebKit SDK (@r89864)

Brent Fulgham

I have updated the WebKit SDK to correspond to SVN revision r89864.

Major changes in this revision:
* JavaScript engine improvements.
* Rendering improvements.
* New 'Transparent Web View' support.
* General performance and memory use improvements.

This ZIP file also contains updated versions of Zlib, OpenSSL, cURL, and OpenCFLite.

Note that I have stopped statically linking Cairo; I'm starting to integrate some more recent Cairo updates (working towards some new rendering features), and wanted to be able to update it incrementally as changes are made.

This package contains the same Cairo library (in DLL form) as used in previous versions.

As usual, please let me know if you encounter any problems with this build.

[Update] I forgot to include zlib1.dll! Fixed in the revised zip file.

July 05, 2011

WinCairoRequirements Sources Archive

Brent Fulgham

I've posted the 80 MB source archive of the requirements needed to build the WinCairo port of WebKit.

Note that you do NOT need these sources unless you plan on building them yourself or wish to archive the source code for these modules. The binaries are always present in the WinCairoRequirements.zip file, which is downloaded and unzipped to the proper place when you execute the update-webkit --wincairo command.

June 28, 2011

Towards a Simpler WinCairo Build

Brent Fulgham

For the past couple of years, anyone interested in trying to build the WinCairo port of WebKit had to track down a number of support libraries, place them in their development environment's include (and link search) paths, and then cross their fingers and hope everything built.

To make things a little easier, I wrapped up the libraries and headers I use for building and posted them as a zip file on my .Mac account. This made things a little easier, but you still had to figure out where to drop the files and figure out if I had secretly updated my 'requirements.zip' file without telling anyone. Not ideal.

A couple of days ago, while trolling through the open review queue, I ran across a bug filed by Carl Lobo, whose patch automated the task of downloading the requirements file when running build-webkit --wincairo. This was a huge improvement!

Today, I hijacked Carl's changes and railroaded the patch through the review process (making a few modifications along the way):

• I renamed my requirements file WinCairoRequirements.zip.

• I added a timestamp file, so that build-webkit --wincairo can check to see if the file changed, and download it if necessary.

• I propagated Carl's changes to update-webkit, so that now by adding the --wincairo argument it will update the WinCairoRequirements file.

I'm really excited about this update. If you've been wanting to try out the WinCairo port of WebKit, this would be a great time to try it out. I'd love to hear your experiences!

June 14, 2011

Benchmarking Javascript engines for EFL

Lucas De Marchi

The Enlightenment Foundation Libraries have several bindings for other languages in order to ease the creation of end-user applications and speed up their development. Among them there’s a binding for Javascript using the Spidermonkey engine. The questions are: is it fast enough? Does it slow down your application? Is Spidermonkey the best JS engine to use?

To answer these questions Gustavo Barbieri created some C, JS and Python benchmarks to compare the performance of EFL using each of these languages. The JS benchmarks used Spidermonkey as the engine, since elixir was already available for EFL. I then created new engines (with only the necessary functions) to also compare other well-known JS engines: V8 from Google and JSC (or Nitro) from WebKit.

Libraries setup

For all benchmarks, EFL revision 58186 was used. The setup for each engine follows:

• Spidermonkey: I’ve used version 1.8.1-rc1 with the bindings already available in the EFL repository, elixir;
• V8: version 3.2.5.1, using a simple binding I created for EFL. I named this binding ev8;
• JSC: WebKit’s sources are needed to compile JSC. I’ve used revision 83063. Compiling with CMake, I chose the EFL port and enabled the SHARED_CORE option in order to have a separate library for Javascript;

Benchmarks

Startup time: This benchmark measures the startup time by executing a simple application that imports evas, ecore, ecore-evas and edje, brings in some symbols, and then iterates the main loop once before exiting. I measured the startup time for both hot and cold cache cases. In the former, the application is executed several times in sequence; the latter includes a call to drop all caches, so we have to load the libraries again from disk.

Runtime – Stress: This benchmark executes as many frames per second as possible of a render-intensive operation. The application is not so heavy, but it does some loops, math and interacts with EFL. Usually a common application would do far fewer operations every frame, because many operations are done in EFL itself, in C, such as list scrolling, which is done entirely in elm_genlist. This benchmark is made of 4 phases:

• Phase 0 (P0): Un-scaled blend of the same image 16 times;
• Phase 1 (P1): Same as P0, with additional 50% alpha;
• Phase 2 (P2): Same as P0, with additional red coloring;
• Phase 3 (P3): Same as P0, with additional 50% alpha and red coloring;

The C and Elixir’s versions are available at EFL repository.

Runtime – animation: usually an application doesn't need "as many FPS as possible"; instead it will limit itself to a certain number of frames per second. E.g., the iPhone's browser tries to keep a constant 60 FPS, which is the value I used in this benchmark. The same application as in the previous benchmark is executed, but it always tries to keep the same frame rate.

Results

The first computer I used to test these benchmarks on was my laptop. It’s a Dell Vostro 1320, Intel Core 2 Duo with 4 GB of RAM and a standard 5400 RPM disk. The results are below.

The first thing to notice is that there are no results for the "Runtime – animation" benchmark. This is because all the engines kept a constant 60 FPS, hence there were no interesting results to show. The first benchmark shows that V8's startup time is the shortest one when we have to load the application and libraries from disk. JSC was the slowest and Spidermonkey was in between.

With hot caches, however, the scenario is completely different, with JSC being almost as fast as the native C application, followed by V8 with a slightly larger delay and Spidermonkey as the slowest one.

The runtime-stress benchmark shows that all the engines perform well when there is considerable load in the application, i.e. leaving P0 out of this scenario. JSC was always at the same speed as native code; Spidermonkey and V8 had an impact only when considering P0 alone.

The next computer on which I executed these benchmarks was a Pandaboard, to see how well the engines perform on an embedded platform. The Pandaboard has an ARM Cortex-A9 processor with 1 GB of RAM, and the partition containing the benchmarks is on an external flash storage drive. The results for each benchmark follow:

Once again, runtime-animation is not shown since it had the same results for all engines. For the startup tests Spidermonkey was now much faster than the others, followed by V8 and JSC, in both hot and cold cache cases. In the runtime-stress benchmark all the engines performed well, as on the first computer, but now JSC was the clear winner.

There are several points to be considered when choosing an engine to use as a binding for a library such as EFL. The raw performance and startup time seem to be very near those achieved with native code. Recently there were some discussions on the EFL mailing list regarding which engine to choose, so I think it is good to share the numbers above. It is also important to note that these bindings take a similar approach to elixir, mapping each function call in JavaScript to the corresponding native function. I did this to be fair in the comparison among them, but depending on the use case it would be good to have a JS binding similar to what Python's does, embedding the function call in real Python objects.

April 29, 2011

Collection of WebKit ports

Holger Freyther

WebKit is a very successful project. It is that in many ways: the code produced seems to be very fast, the code is nice to work on, the people are great, and the parties involved collaborate with each other in the interest of the project. The project is also very successful in the mobile/smartphone space. All the major smartphone platforms but Windows Phone 7 are using WebKit. This all looks great, a big success, but there is one thing that stands out.

Of all the smartphone platforms, not one has fully upstreamed its port. There might be many reasons for that, and I think the most commonly heard one is the time needed to get it upstreamed. It is especially difficult in a field that is moving as fast as the mobile industry. And then again, there is absolutely no legal obligation to work upstream.

For most of today I collected the ports I am aware of, put them into one git repository, tried to find the points where they branched, and rebased their changes. The goal is to make it easier to find interesting things and move them back upstream. One can find the combined git tree with the tags here. I started with WebOS, moved to iOS, then to Bada and stopped at Android, as I would have to pick the source code for each Android release for each phone from each vendor. I think I will just be happy with the Android git tree for now. At this point I would like to share some of my observations, in the order I did the import.

Palm

Palm's release process is manual. In the last two releases they called the file .tgz but forgot to gzip it; in 2.0.0 the tarball name was in camel case. The thing that is very nice about Palm is that they provide their base and their changes (as a patch) separately. From looking at the 2.1.0 release it looks like they want to implement complex font rendering for the desktop version. Earlier versions (maybe it is still the case) lacked support for animated GIFs.

iOS

Apple's release process seems to be very structured. The source can be downloaded here. Worth noting is that the release tarball contains some implementations in WebCore only as .o files, and that Apple has stopped releasing the WebKit source code beginning with iOS 4.3.0.

Bada

This port is probably not known by many. The release process seems to be manual as well, the names of directories changed a lot between releases, they come with a WML Script engine, and they ship something they should not ship.

I really hope that this combined tree is useful for porters that want to see the tricks used in the various ports and don't want to spend the time looking for each port separately.

February 13, 2011

How to make the GNU Smalltalk Interpreter slower

Holger Freyther

This is another post about a modern Linux based performance measurement utility. It is called perf, it is included in the Linux kernel sources, and it entered the kernel in v2.6.31-rc1. In many ways it is obsoleting OProfile; in fact, for many architectures oprofile is just a wrapper around the perf support in the kernel. perf comes with a few nice applications: perf top provides statistics about which symbols in user and in kernel space are being called, perf record records an application (or starts an application to record it), and perf report lets one browse this recording with a very simple CLI utility. There are also tools to bundle the recording and the application in an archive, and a diff utility.

For the last year I have been playing a lot with GNU Smalltalk, and someone posted the results of a very simplistic VM benchmark run across many different Smalltalk implementations. In one of the benchmarks GNU Smalltalk scores last among the interpreters, and I wanted to understand why it is slower. In many ways the JavaScriptCore interpreter is a lot like the GNU Smalltalk one: a simple direct-threaded bytecode interpreter that uses computed goto (it is even compiled with -fno-gcse as indicated by the online help, not that it changed anything for JSC) and heavily inlines many functions.

There are also some differences: the GNU Smalltalk implementation is a lot older and in C. The first notable difference is that it is a stack machine and not register based; there are global pointers for the SP and the IP. Some magic makes sure that in the hot loop the IP/SP is 'local' in a register and, depending on the available registers, the current argument is kept in one as well. The interpreter definition is in a special file format but is mostly similar to how Interpreter::privateExecute looks. The global state mostly comes from the fact that it needs to support switching processes, and there might be some event during the run that requires access to the IP to store it in order to resume the old process. But in general the implementation is already optimized, there is little low-hanging fruit, and most experiments result in a slowdown.

The two important things are again: having a stable benchmark, and having a tool that helps you know where to look. In my case the important tools are perf stat, perf record, perf report and perf annotate. I have put a copy of the output at the end of this blog post. The stat utility provides one with the number of instructions executed, branches, branch misses (e.g. badly predicted), and L1/L2 cache hits and misses.

The stable benchmark helps me judge whether a change is good, bad or neutral for performance within the margin of error of the test. E.g. if I attempt to reduce the code size, the number of instructions executed should decrease; if I start putting __builtin_expect hints into my code, the number of branch misses should go down as well. The other useful utility is perf report, which allows one to browse the recorded data. This can help to identify the methods one wants to start optimizing; it allows one to annotate these functions inside the simple TUI interface, but it does not support searching.

Because the codebase is already highly optimized, any of my attempts should either decrease the code size (and the pressure on the i-cache) or the data size (d-cache), remove stores or loads from memory (e.g. reorder instructions), or fix branch predictions. The sad truth is that most of my changes were either slowdowns or neutral to performance, and it is really important to undo these changes and not have false pride (unless the change was also a code cleanup or such).

So after about 14 hours of toying with it, the speed-ups I have managed to make come from inlining a method to unwind a context (callframe), reordering some compares on the GC path, and disabling the __builtin_expect branch hints as they were mostly wrong (something the kernel people found to be true in 2010 as well). I will just try harder, or try to work on the optimizer, or attempt something more radical...

This probe will be executed whenever the sqlite3_get_table function of the mentioned library is called. The $zSql is a variable passed to the sqlite3_get_table function and contains the query to be executed. I am converting the pointer to a local variable and then can print it. Using this simple probe helped me to see which queries were executed by the database library and helped me to do an easy optimisation. In general it could be very useful to build a set of probes (I think one calls such a set a tapset) that check for API misuse, e.g. calling functions with certain parameters where something else might be better. E.g. in GLib use truncate instead of assigning "" to the GString, or check for calls to QString::fromUtf16 coming from Qt code itself. On second thought this might be better as a GCC plugin, or both.

December 17, 2010

In the name of performance

Holger Freyther

I tend to see people doing weird things and then claiming that the change improves performance. This can be re-ordering instructions to help the compiler, attempting to use multiple cores of your system, or writing a memfill in assembly. On the one hand people can be right and the change does make things faster; on the other hand they could be using assembly to make things look very complicated and justify their pay, and you might feel awkward questioning whether it makes any sense.

In the last couple of weeks I have stumbled on some of those things. For some reason I found this bug report about GLIBC changing the memcpy routine for SSE and breaking the Flash plugin (because it uses memcpy in the wrong way). The breakage was justified on the grounds that the new memcpy was optimized and is faster. As Linus points out with his benchmark, the performance improvement is mostly just wishful thinking. Another case was someone providing MIPS-optimized pixman code to speed up all drawing, which turned out to be wishful thinking as well...

The conclusion is: if someone claims that things are faster with his patch, do not simply trust him. Make sure he refers to his benchmark and provides numbers from before and after, and maybe even try to run it yourself. If he can not provide this, you should wonder how he measured the speed-up! There should be no place for wishful thinking in benchmarking. This is one of the areas where Apple's WebKit team is constantly impressing me.

December 16, 2010

Benchmarking QtWebKit-V8 on Linux

University of Szeged

For some time it has been possible to build and run QtWebKit on Linux using Google's V8 JavaScript engine instead of the default JavaScriptCore. I thought it would be good to see some numbers comparing the runtime performance of the two engines in the same environment and also measuring the performance of the browser bindings.

read more

October 23, 2010

Easily embedding WebKit into your EFL application

Lucas De Marchi

This is the first of a series of posts that I’m planning to do using basic examples in EFL, the Enlightenment Foundation Libraries. You may have heard that EFL is reaching its 1.0 release. Instead of starting from the very beginning with the basic functions of these libraries, I decided to go the opposite way, showing the fun stuff that is possible to do. Since I’m also a WebKit developer, let’s put the best of both pieces of software together and have a basic window rendering a webpage. Before starting off, just some remarks:

1. I’m using here the basic EFL + WebKit-EFL (sometimes called ewebkit). Developing an EFL application can be much simpler, particularly if you use an additional library with pre-made widgets like Elementary. However, it’s good to know how the underlying stuff works, so I’m providing this example.

2. This could have been the last post in a series when talking about EFL, since it uses at least 3 libraries. Don’t be afraid if you don’t understand what a certain function is for or if you can’t get all of EFL and WebKit running right now. Use the comment section below and I’ll do my best to help you.

Getting EFL and WebKit

In order to be able to compile the example here, you will need to compile two libraries from source: EFL and WebKit. For both libraries, you can either get the latest version from svn or use the latest snapshots provided.

• EFL: Grab a snapshot from the download page. How to check out the latest version from svn is detailed here, as well as some instructions on how to compile it.
• WebKit-EFL: A very detailed explanation on how to get WebKit-EFL up and running is available on trac. Recently, though, WebKit-EFL started to be released too. It’s not detailed in the wiki yet, but you can grab a snapshot instead of checking out from svn.

hellobrowser!

In the spirit of “hello world” examples, our goal here is to make a window showing a webpage rendered by WebKit. For the sake of simplicity, we will use a default start page and put a WebKit-EFL “widget” covering the entire window. See below a screenshot:

The code for this example is available here. Pay attention to a comment at the beginning of this file that explains how to compile it:

    gcc -o hellobrowser hellobrowser.c \
        -DEWK_DATADIR="\"$(pkg-config --variable=datadir ewebkit)\"" \
        $(pkg-config --cflags --libs ecore ecore-evas evas ewebkit)

The things worth noting here are the dependencies and a variable. We depend directly on ecore and evas from EFL and on WebKit. We define a variable, EWK_DATADIR, using pkg-config, so our browser can use the default theme for web widgets defined in WebKit. Ecore handles events like mouse and keyboard input, timers, etc., whilst evas is the library responsible for drawing. In a later post I'll detail them a bit more. For now, you can read more about them on their official site.

The main function is really simple. Let’s divide it by pieces:

    // Init all EFL stuff we use
    evas_init();
    ecore_init();
    ecore_evas_init();
    ewk_init();

Before you use a library from EFL, remember to initialize it. All of them use their own namespace, so it’s easy to know which library you have to initialize: for example, if you call a function starting by “ecore_”, you know you first have to call “ecore_init()”. The last initialization function is WebKit’s, which uses the “ewk_” namespace.

    window = ecore_evas_new(NULL, 0, 0, 800, 600, NULL);
    if (!window) {
        fprintf(stderr, "something went wrong... :(\n");
        return 1;
    }

Ecore-evas then is used to create a new window with size 800×600. The other options are not relevant for an introduction to the libraries and you can find its complete documentation here.

    // Get the canvas off the just-created window
    evas = ecore_evas_get(window);

From the Ecore_Evas object we just created, we grab a pointer to the evas, which is the space in which we can draw by adding Evas_Objects. Basically, an Evas_Object is an object that you draw somewhere, i.e. in the evas. We want to add only one object to our window: the one in which WebKit will render the webpages. So we ask WebKit to create this object:

    // Add a View object into this canvas. A View object is where WebKit will
    // render stuff.
    browser = ewk_view_single_add(evas);

Below I demonstrate a few Evas functions that you can use to manipulate any Evas_Object. Here we are manipulating the just-created WebKit object: moving it to the desired position, resizing it to 780x580 px and then telling Evas to show it. Finally, we tell Evas to show the window we created too. This way we have a window with a WebKit object inside, with a little border.

    // Make a 10px border, resize and show
    evas_object_move(browser, 10, 10);
    evas_object_resize(browser, 780, 580);
    evas_object_show(browser);
    ecore_evas_show(window);

We need to set up a few more things before having a working application. The first one is to give focus to the Evas_Object we are interested in, so it will receive keyboard events. Then we connect a function that will be called when the window is closed, so we can properly exit our application.

    // Focus it so it will receive pressed keys
    evas_object_focus_set(browser, 1);

    // Add a callback so clicks on "X" on top of window will call
    // main_signal_exit() function
    ecore_event_handler_add(ECORE_EVENT_SIGNAL_EXIT, main_signal_exit, window);

After this, we are ready to show our application, so we start the mainloop. This function will only return when the application is closed:

    ecore_main_loop_begin();

The function called when the application is closed just tells Ecore to exit the main loop, so the function above returns and the application can shut down. See its implementation below:

static Eina_Bool main_signal_exit(void *data, int ev_type, void *ev)
{
    ecore_evas_free(data);
    ecore_main_loop_quit();
    return EINA_TRUE;
}

Before the application exits, we shut down all the libraries that were initialized, in the opposite order:

    // Destroy all the stuff we have used
    ewk_shutdown();
    ecore_evas_shutdown();
    ecore_shutdown();
    evas_shutdown();

This is a basic working browser with which you can navigate through pages, but you don't have an entry to set the current URL, nor "go back" and "go forward" buttons, etc. All you have to do is start adding more Evas_Objects to your evas and connect them to the object we just created. For a still basic example, but with more stuff implemented, refer to EWebLauncher, which we ship with the WebKit source code. You can find it in the "WebKitTools/EWebLauncher/" folder or online at WebKit's trac. Eve is another browser, with a lot more features, that uses Elementary in addition to EFL and WebKit. See a blog post about it with some nice pictures.

Now, let's do something funny with our browser. With a few more lines of code you can turn your browser upside down. Not really useful, but it's fun. All you have to do is rotate the Evas_Object WebKit is rendering on. This is implemented by the following function:

// Rotate an evas object by 180 degrees
static void _rotate_obj(Evas_Object *obj)
{
    Evas_Map *map = evas_map_new(4);

    evas_map_util_points_populate_from_object(map, obj);
    evas_map_util_rotate(map, 180.0, 400, 300);
    evas_map_alpha_set(map, 0);
    evas_map_smooth_set(map, 1);
    evas_object_map_set(obj, map);
    evas_object_map_enable_set(obj, 1);

    evas_map_free(map);
}

See the screenshot below and get the complete source code.

October 02, 2010

Deploying WebKit, common issues

Holger Freyther

From my exposure to people deploying QtWebKit or WebKit/GTK+ there are some things that re-appear and I would like to discuss these here.

• Weird compile error in JavaScript?
• It is failing in JavaScriptCore as that is the first thing that is built. It is most likely that the person who provided you with the toolchain has placed a config.h into it. There are some resolutions: one would be to remove the config.h from the toolchain (many things will break), another to use -isystem instead of -I for system includes.
The best way to find out if you suffer from this problem is to use -E instead of -c to only pre-process the code and see where the various includes are coming from. It is a strategy that is known to work very well.

• No pages are loaded.
• Most likely you do not have a DNS server set, or no networking at all, or the system your board is connected to is not forwarding the data. Make sure you can ping a website that is supposed to work, e.g. ping www.yahoo.com. The next thing would be to use nc to execute a simple HTTP/1.1 GET on the site and see if it is working. In most cases you simply lack networking connectivity.

• HTTPS does not work
• It might be either an issue with Qt or an issue with your system time. SSL certificates have at least two dates (creation and expiration), and if your system time is after the expiration or before the creation you will have issues. The easiest fix is to add ntpd to your root filesystem to make sure you have the right time.

The possible issue with Qt is a bit more complex. You can build Qt without OpenSSL support, you can make it link to OpenSSL, or you can make it dlopen OpenSSL at runtime. If SSL does not work, it is most likely that you have either built it without SSL support, or built it with runtime support but failed to install the OpenSSL library.

Depending on your skills it might be best to go back to ./configure and make Qt link to OpenSSL to avoid the runtime issue. strings is a very good tool to find out if your libQtNetwork.so contains SSL support; together with objdump -x and searching for NEEDED entries, you will find out which configuration you have.

• Local pages are not loaded
• This is a pretty common issue for WebKit/GTK+. In WebKit/GTK+ we use GIO for local files, and to determine the file type it uses the freedesktop.org shared-mime-info. Make sure you have that installed.

• The page only displays blank
• This is another issue that comes up from time to time. It only appears on WebKit/GTK+ with the DirectFB backend, but sadly people never report back if and how they have solved it. You could make a difference and contribute back to the WebKit project.

In general most of these issues can be avoided by using a pre-packaged Embedded Linux Distribution like Ångström (or even Debian). The biggest benefit of that approach is that someone else made sure that when you install WebKit, all dependencies will be installed as well and it will just work for your ARM/MIPS/PPC system. It will save you a lot of time.

August 28, 2010

WebKit

Lucas De Marchi

After some time working with the EFL port of WebKit, I have been nominated as an official WebKit developer. Now I have super powers in the official repository :-), but I swear I intend to use them with caution and responsibility. I'll not forget Uncle Ben's advice: "with great power comes great responsibility".

I’m preparing a post to talk about WebKit, EFL, eve (a new web browser based on WebKit + EFL) and how to easily embed a browser in your application. Stay tuned.

August 10, 2010

Coscup2010/GNOME.Asia with strong web focus

Holger Freyther

On the following weekend the Coscup 2010/GNOME.Asia is taking place in Taipei. The organizers have decided to have a strong focus on the Web as can be seen in the program.

On Saturday there is a keynote and various talks about HTML5 and node.js. Sunday will see three talks touching on WebKit/GTK+: one about building a tablet OS with WebKit/GTK+, one by Xan Lopez on how to build hybrid applications (a topic I have devoted moiji-mobile.com to), and a talk by me using gdb to explain how WebKit/GTK+ works and how the porting layer interacts with the rest of the code.

I hope the audience will enjoy the presentations and I am looking forward to attend the conference, there is also a strong presence of the ex-Openmoko Taiwan Engineering team. See you on Saturday/Sunday and drop me an email if you want to talk about WebKit or GSM...

July 16, 2010

Cross-compiling QtWebKit for Windows on Linux using MinGW

University of Szeged

In this post I'll show you how to configure and compile a MinGW toolchain for cross-compilation on Linux, then how to build Qt using this toolchain and finally compile the Qt port of WebKit from trunk.

read more

September 06, 2008

Skia graphics library in Chrome: First impressions

Alp Toker

With the release of the WebKit-based Chrome browser, Google also introduced a handful of new backends for the browser engine including a new HTTP stack and the Skia graphics library. Google’s Android WebKit code drops have previously featured Skia for rendering, though this is the first time the sources have been made freely available. The code is apparently derived from Google’s 2005 acquisition of North Carolina-based software firm Skia and is now provided under the Open Source Apache License 2.0.

Weighing in at some 80,000 lines of code (to Cairo’s 90,000 as a ballpark reference) and written in C++, some of the differentiating features include:

• Optimised software-based rasteriser (module sgl/)
• Optional GL-based acceleration of certain graphics operations including shader support and textures (module gl/)
• Animation capabilities (module animator/)
• Some built-in SVG support (module svg/)
• Built-in image decoders: PNG, JPEG, GIF, BMP, WBMP, ICO (module images/)
• Text capabilities (no built-in support for complex scripts)
• Some awareness of higher-level UI toolkit constructs (platform windows, platform events): Mac, Unix (sic. X11, incomplete), Windows, wxWidgets
• Performance features
• Copy-on-write for images and certain other data types
• Extensive use of the stack, both internally and for API consumers to avoid needless allocations and memory fragmentation
• Thread-safety to enable parallelisation

The library is portable and has (optional) platform-specific backends:

• Fonts: Android / Ascender, FreeType, Windows (GDI)
• Threading: pthread, Windows
• XML: expat, tinyxml
• Android shared memory (ashmem) for inter-process image data references

Skia Hello World

In this simple example we draw a few rectangles to a memory-based image buffer. This also demonstrates how one might integrate with the platform graphics system to get something on screen, though in this case we’re using Cairo to save the resulting image to disk:

#include "SkBitmap.h"
#include "SkCanvas.h"
#include "SkDevice.h"
#include "SkPaint.h"
#include "SkRect.h"
#include <cairo.h>

int main() {
    SkBitmap bitmap;
    bitmap.setConfig(SkBitmap::kARGB_8888_Config, 100, 100);
    bitmap.allocPixels();
    SkDevice device(bitmap);
    SkCanvas canvas(&device);
    SkPaint paint;
    SkRect r;

    paint.setARGB(255, 255, 255, 255);
    r.set(10, 10, 20, 20);
    canvas.drawRect(r, paint);

    paint.setARGB(255, 255, 0, 0);
    r.offset(5, 5);
    canvas.drawRect(r, paint);

    paint.setARGB(255, 0, 0, 255);
    r.offset(5, 5);
    canvas.drawRect(r, paint);

    {
        SkAutoLockPixels image_lock(bitmap);
        cairo_surface_t* surface = cairo_image_surface_create_for_data(
            (unsigned char*)bitmap.getPixels(), CAIRO_FORMAT_ARGB32,
            bitmap.width(), bitmap.height(), bitmap.rowBytes());
        cairo_surface_write_to_png(surface, "snapshot.png");
        cairo_surface_destroy(surface);
    }

    return 0;
}

You can build this example yourself, linking statically against the libskia.a archive generated during the Chrome build process on Linux.

Not just for Google Chrome

The Skia backend in WebKit, the first parts of which are already hitting SVN (r35852, r36074), isn't limited to use in the Chrome/Windows configuration, and some work has already been done to get it up and running on Linux/GTK+ as part of the ongoing porting effort.

The post Skia graphics library in Chrome: First impressions appeared first on Alp Toker.

June 12, 2008

WebKit Meta: A new standard for in-game web content

Alp Toker

Over the last few months, our browser team at Nuanti Ltd. has been developing Meta, a brand new WebKit port suited to embedding in OpenGL and 3D applications. The work is being driven by Linden Lab, who are eagerly investigating WebKit for use in Second Life.

While producing Meta we’ve paid great attention to resolving the technical and practical limitations encountered with other web content engines.

uBrowser running with the WebKit Meta engine

High performance, low resource usage

Meta is built around WebKit, the same engine used in web browsers like Safari and Epiphany, and features some of the fastest content rendering around as well as nippy JavaScript execution with the state-of-the-art SquirrelFish VM. The JavaScript SDK is available independently of the web renderer for sandboxed client-side game scripting and automation.

It’s also highly scalable. Some applications may need only a single browser context but virtual worlds often need to support hundreds of web views or more, each with active content. To optimize for this use case, we’ve cut down resource usage to an absolute minimum and tuned performance across the board.

Stable, easy to use cross-platform SDK

Meta features a single, rock-solid API that works identically on all supported platforms including Windows, OS X and Linux. The SDK is tailored specifically to embedding and allows tight integration (shared main loop or operation in a separate rendering thread, for example) and hooks to permit seamless visual integration and extension. There is no global setup or initialization and the number of views can be adjusted dynamically to meet resource constraints.

Minimal dependencies

Meta doesn’t need to use a conventional UI toolkit and doesn’t need any access to the underlying windowing system or the user’s filesystem to do its job, so we’ve done away with these concepts almost entirely. It adds only a few megabytes to the overall redistributable application’s installed footprint and won’t interfere with any pre-installed web browsers on the user’s machine.

Nuanti will be offering commercial and community support and is anticipating involvement from the gaming industry and homebrew programmers.

In the mid term, we aim to submit components of Meta to the WebKit Open Source project, where our developers are already actively involved in maintaining various subsystems.

Find out more

Today we’re launching meta.nuanti.com and two mailing lists to get developers talking. We’re looking to make this site a focal point for embedders, chock-full of technical details, code samples and other resources.

The post WebKit Meta: A new standard for in-game web content appeared first on Alp Toker.

April 21, 2008

Acid3 final touches

Alp Toker

Recently we’ve been working to finish off and land the last couple of fixes to get a perfect pixel-for-pixel match against the reference Acid3 rendering in WebKit/GTK+. I believe we’re the first project to achieve this on Linux — congratulations to everyone on the team!

Epiphany using WebKit r32284

We also recently announced our plans to align more closely with the GNOME desktop and mobile platform. To this end we’re making a few technology and organisational changes that I hope to discuss in an upcoming post.

The post Acid3 final touches appeared first on Alp Toker.

April 06, 2008

WebKit Summer of Code Projects

Alp Toker

With the revised deadline for Google Summer of Code ’08 student applications looming, we’ve been getting a lot of interest in browser-related student projects. I’ve put together a list of some of my favourite ideas.

If in doubt, now’s the time to submit proposals. Already-listed ideas are the most likely to get mentored but students are free to propose their own ideas as well. Proposals for incremental improvements will tend to be favoured over ideas for completely new applications, but a proof of concept and/or roadmap can help when submitting plans for larger projects.

Update: There’s no need to keep asking about the status of an application on IRC/private mail etc. It’s a busy time for the upstream developers but they’ll get back in touch as soon as possible.

March 27, 2008

WebKit gets 100% on Acid3

Alp Toker

Today we reached a milestone with WebKit/GTK+ as it became the first browser engine on Linux/X11 to get a full score on Acid3, shortly after the Acid3 pass by WebKit for Safari/Mac.

Epiphany using WebKit r31371

There is actually still a little work to be done before we can claim a flawless Acid3 pass. Two of the most visible remaining issues in the GTK+ port are :visited (causing the “LINKTEST FAILED” notice in the screenshot) and the lack of CSS text shadow support in the Cairo/text backend which is needed to match the reference rendering.
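For context, text-shadow is the standard CSS property the Cairo/text backend needed to support; a minimal illustrative declaration (the selector and values here are hypothetical, not taken from the Acid3 test itself) looks like this:

```css
/* Illustrative only: text-shadow takes offset-x, offset-y,
   an optional blur radius, and a color. */
h1 {
  text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.5);
}
```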

It’s amazing to see how far we’ve come in the last few months, and great to see the WebKit GTK+ team now playing an active role in the direction of WebCore as WebKit continues to build momentum amongst developers.

Update: We now also match the reference rendering.

March 15, 2008

Bossa Conf ’08

Alp Toker

Am here in the LHR lounge. In a couple of hours, we take off for the INdT Bossa Conference, Pernambuco, Brazil via Lisbon. Bumped into Pippin, who will be presenting Clutter. Also looking forward to Lennart’s PulseAudio talk amongst others.

If you happen to be going, drop in on my WebKit Mobile presentation at 14:00 in Room 01 this Monday. We have a small surprise waiting for Maemo developers.
