Tags

, ,

Let’s review our performance measures for 2013.

Highlights:

– significant regressions in “time to throbber start/stop” and Eideticker startup tests

– most Talos measurements stable, or regressions addressed

– slight, gradual improvements to frame rate and responsiveness

Talos

This section tracks Perfomatic graphs from graphs.mozilla.org for mozilla-central builds of Native Fennec (Android 2.2 opt). The test names shown are those used on tbpl. See https://wiki.mozilla.org/Buildbot/Talos for background on Talos.

tcanvasmark

This test runs the third-party CanvasMark benchmark suite, which measures the browser’s ability to render a variety of canvas animations at a smooth framerate as the scenes grow more complex. Results are a score “based on the length of time the browser was able to maintain the test scene at greater than 30 FPS, multiplied by a weighting for the complexity of each test type”. Higher values are better.

tcanvas

7800 (start of period) – 7700 (end of period).

This test was introduced in September and has been fairly stable ever since. There does however seem to be a slight, gradual regression over this period.

tcheck2

Measure of “checkerboarding” during simulation of real user interaction with page. Lower values are better.

tcheck2

4.4 (start of period) – 2.8 (end of period)

This test saw lots of regressions and improvements over 2013, ending on a stable high note.

trobopan

Panning performance test. Value is square of frame delays (ms greater than 25 ms) encountered while panning. Lower values are better.

tpan

14000 (start of period) – 110000 (end of period)

The nature of this test measurement makes it one of the most variable Talos tests. We overcame the worst of the Sept/Oct regression, but still ended the year worse off than we started.

tprovider

Performance of history and bookmarks’ provider. Reports time (ms) to perform a group of database operations. Lower values are better.

troboprovider

375 (start of period) – 375 (end of period).

This test has hardly ever reported significant change — is it a useful test?

tsvgx

An svg-only number that measures SVG rendering performance. About half of the tests are animations or iterations of rendering. This ASAP test (tsvgx) iterates in unlimited frame-rate mode thus reflecting the maximum rendering throughput of each test. The reported value is the page load time, or, for animations/iterations – overall duration the sequence/animation took to complete. Lower values are better.

tsvg

1200 (start of period) – 7200 (end of period).

Introduced in September; the regression of Nov 27 is tracked in bug 944429.

tp4m

Generic page load test. Lower values are better.

tp4m

700 (start of period) – 700 (end of period).

This version of tp4m was introduced in September; no significant changes here.

ts_paint

Startup performance test. Lower values are better.

tspaint

4300 (start of period) – 4300 (end of period)

Introduced in September; there are no significant regressions here, but there is a lot of variability, possibly related to the frequent test failures — see bug 781107.

Throbber Start / Throbber Stop

These graphs are taken from http://phonedash.mozilla.org.  Browser startup performance is measured on real phones (a variety of popular devices).

“Time to throbber start” measures the time from process launch to the start of the throbber animation. Smaller values are better.

throbberstart1

There is so much data here, it is hard to see what is happening, but a troubling upward trend over the year is evident.

“Time to throbber stop” measures the time from process launch to the end of the throbber animation. Smaller values are better.

throbberstop1

Again, there is a lot of data here. Here’s another graph that hides the data for all but a few devices:

throbberstop2

Evidently we have lost a lot of ground over the year, with an increase in “time to throbber stop” of nearly 80% for some devices.

Eideticker

These graphs are taken from http://eideticker.mozilla.org. Eideticker is a performance harness that measures user perceived performance of web browsers by video capturing them in action and subsequently running image analysis on the raw result.

More info at: https://wiki.mozilla.org/Project_Eideticker

Eideticker confirms that startup time has regressed over the year:

eide1

eide2

Most checkerboarding and frame rate measurements have been steady, or show slight improvement:

eide5

Responsiveness — a new measurement added this year — is similarly steady with slight improvement:

eide3

eide4

awsy

See https://www.areweslimyet.com/mobile/ for content and background information.

awsy1

awsy2

awsy3

There is an upward trend here for many of the measurements, but what I find striking is how often we have managed to keep these numbers stable while adding new features.

Advertisements