More Robocop tests running…and more disabled


The Robocop test suite continues to evolve, providing more and more UI-level testing for Firefox for Android. Recently added tests include:

testImportFromAndroid
testOrderedBroadcast
testSharedPreferences
testAddSearchEngine
testInputAwesomeBar

testOrderedBroadcast and testSharedPreferences are examples of hybrid tests: they make xpcshell-like assertions in JavaScript, in a page loaded by the full browser, inside the Robocop framework. They build on infrastructure changes landed by :nalexander in bug 870908.

Of course, Robocop has had its share of problems too. We have had to disable some tests because of repeated failures. There is active, ongoing work to fix and re-enable some of the disabled tests, but I fear that we have all but forgotten others. I am concerned about these long-disabled tests:

# [test_bug720538] # see bug 746876
# [testPasswordEncrypt] # see bug 824067
# [testThumbnails] # see bug 813107
# [testPermissions] # see bug 757475
# [testJarReader] # see bug 738890

Do you have insight into any of these neglected tests? Want to see them enabled ASAP? Or is it time to remove them completely? Let me know — :gbrown on #mobile.

Firefox for Android Performance Measures – April Check-up


The April 2013 summary of performance measures for Firefox for Android.

Minor regressions in tcheck2 and tp4m. Slight improvement in “time to throbber stop”.

Talos

This section tracks Perfomatic graphs from graphs.mozilla.org for mozilla-central builds of Native Fennec (Android opt). The test names shown are those used on tbpl. See https://wiki.mozilla.org/Buildbot/Talos for background on Talos.

tcheckerboard

Simple measure of “checkerboarding”. Lower values are better.

0.0 (start of month) – 0.0 (end of month).

tcheck2

Measure of “checkerboarding” during simulation of real user interaction with page. Lower values are better.

3.5 (start of month) – 4.0 (end of month).

Bug 866113 – Talos Regression tcheck2 26% on Android 2.2, Apr 26

trobopan

Panning performance test. Value is square of frame delays (ms greater than 25 ms) encountered while panning. Lower values are better.
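As a rough illustration of how a value like this could be computed, here is a minimal Python sketch. It assumes the metric sums the squares of the portion of each frame delay that exceeds 25 ms. That reading is only one interpretation of the description above, and the function name and sample delays are hypothetical; the real test harness may compute the value differently.

```python
# Threshold taken from the test description: only delays over 25 ms count.
FRAME_DELAY_THRESHOLD_MS = 25


def pan_score(frame_delays_ms, threshold_ms=FRAME_DELAY_THRESHOLD_MS):
    """Sum the squares of the part of each frame delay above threshold_ms.

    Assumption: the excess over the threshold is squared and summed.
    Delays at or below the threshold contribute nothing.
    """
    return sum((d - threshold_ms) ** 2
               for d in frame_delays_ms
               if d > threshold_ms)


if __name__ == "__main__":
    # Hypothetical frame delays in milliseconds.
    print(pan_score([16, 25, 35, 125]))
```

Under this reading, squaring penalizes a single long stall much more heavily than several short ones, which would explain why the reported values are so large and so noisy.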

12000 (start of month) – 14000 (end of month).

There is noise in this test; it’s not clear if there was an actual regression this month.

tprovider

Performance of history and bookmarks’ provider. Reports time (ms) to perform a group of database operations. Lower values are better.

375 (start of month) – 375 (end of month).

tsvg_nochrome

Page load test for SVG. Lower values are better.

4000 (start of month) – 3800 (end of month).

tp4m_nochrome

Generic page load test. Lower values are better.

720 (start of month) – 780 (end of month).

Bug 864637 – 11% Android Tp4 NoChrome regression on 2013-04-18

tp4m_main_rss_nochrome

88000000 (start of month) – 90000000 (end of month).

Gradual regression — no bug.

tp4m_shutdown_nochrome

25000000 (start of month) – 25000000 (end of month).

ts

Startup performance test. Lower values are better.

3800 (start of month) – 3800 (end of month).

ts_shutdown

Shutdown performance test. Lower values are better.

25000000 (start of month) – 25000000 (end of month).

Throbber Start / Throbber Stop

These graphs are taken from http://mrcote.info/phonedash/#/.  Browser startup performance is measured on real phones (a variety of popular devices).

“Time to throbber start” measures the time from process launch to the start of the throbber animation. Smaller values are better.

[Graph: time to throbber start]

“Time to throbber stop” measures the time from process launch to the end of the throbber animation. Smaller values are better.

[Graph: time to throbber stop]

Eideticker

These graphs are taken from http://eideticker.wrla.ch. Eideticker is a performance harness that measures user perceived performance of web browsers by video capturing them in action and subsequently running image analysis on the raw result.

More info at: https://wiki.mozilla.org/Project_Eideticker

[Graphs: Eideticker results]

awsy

See https://www.areweslimyet.com/mobile/ for content and background information.

[Graphs: awsy mobile results]

Firefox for Android Performance Measures – March Check-up


The March 2013 summary of performance measures for Firefox for Android.

No Talos regressions this month. RSS measurements are improving.

Talos

This section tracks Perfomatic graphs from graphs.mozilla.org for mozilla-central builds of Native Fennec (Android opt). The test names shown are those used on tbpl. See https://wiki.mozilla.org/Buildbot/Talos for background on Talos.

tcheckerboard

Simple measure of “checkerboarding”. Lower values are better.

0313tcheck

0.23 (start of month) – 0.0 (end of month).

tcheck2

Measure of “checkerboarding” during simulation of real user interaction with page. Lower values are better.

0313tcheck2

3.47 (start of month) – 3.47 (end of month).

trobopan

Panning performance test. Value is square of frame delays (ms greater than 25 ms) encountered while panning. Lower values are better.

0313tpan

12000 (start of month) – 12000 (end of month).

tprovider

Performance of history and bookmarks’ provider. Reports time (ms) to perform a group of database operations. Lower values are better.

0313provider

375 (start of month) – 375 (end of month).

tsvg_nochrome

Page load test for SVG. Lower values are better.

0313tsvg

4020 (start of month) – 4000 (end of month).

tp4m_nochrome

Generic page load test. Lower values are better.

0313tp

720 (start of month) – 720 (end of month).

tp4m_main_rss_nochrome

0313tp-rss

95000000 (start of month) – 88000000 (end of month).

Several improvements

tp4m_shutdown_nochrome

0313tp-shutdown

28000000 (start of month) – 25000000 (end of month).

This “improvement” is bogus — see bug 797339.

ts

Startup performance test. Lower values are better.

0313ts

3400 (start of month) – 3800 (end of month).

There is significant noise in this test; it’s not clear if there was an actual regression this month.

ts_shutdown

Shutdown performance test. Lower values are better.

0313ts-shutdown

28000000 (start of month) – 25000000 (end of month).

This “improvement” is bogus — see bug 797339.

Throbber Start / Throbber Stop

These graphs are taken from http://mrcote.info/phonedash/#/.  Browser startup performance is measured on real phones (a variety of popular devices).

This month’s graphs are incomplete (end early) but reflect the best data available at time of writing.

“Time to throbber start” measures the time from process launch to the start of the throbber animation. Smaller values are better.

0313throbstart

“Time to throbber stop” measures the time from process launch to the end of the throbber animation. Smaller values are better.

0313throbstop

Eideticker

These graphs are taken from http://eideticker.wrla.ch. Eideticker is a performance harness that measures user perceived performance of web browsers by video capturing them in action and subsequently running image analysis on the raw result. More info at: https://wiki.mozilla.org/Project_Eideticker

0313eide1

0313eide2

0313eide3

0313eide4

0313eide5

0313eide6

0313eide7

0313eide8

0313eide9

0313eide10

0313eide11

awsy

See https://www.areweslimyet.com/mobile/ for content and background information.

awsy-mobile had a difficult month; we are re-generating the March graph data now. I’m going to skip these graphs this month since they are not quite ready yet.

Screenshots for Robocop


Robocop supports UI testing of Firefox for Android using Robotium. With the changes in bug 854549, Robocop now automatically generates a screenshot when a test fails, so you can see what was happening at the moment the test failed.

I am hoping that this will help diagnose intermittent test failures, especially those that seem to only happen during automated testing.

With these changes, Robocop takes a screenshot whenever a test assertion fails or a test throws an exception. It does that with Robotium’s Solo.takeScreenshot(), which simply generates a JPEG file. Unfortunately, takeScreenshot is confused by our content view: instead of capturing the displayed web page, it shows a big black rectangle. That’s going to be a big disappointment when debugging some failures, and I hope we can find a solution to this limitation. At least we can see the chrome, and a lot of tests fail on the awesome screen or when interacting with menus or other Java UI elements.

The screenshot is written to /mnt/sdcard/Robotium-Screenshots on the device. To allow this to work on tbpl, the Robocop framework pulls the screenshot, base64-encodes it, and dumps it to the log. When debugging Robocop failures on tbpl, look for “SCREENSHOT” in the log. You should see something like:

SCREENSHOT: data:image/jpg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAEB ...

Copy/paste the base64 text to a file and decode it:

base64 -d myfile > screenshot.jpg

And now you can view your screenshot:

firefox screenshot.jpg
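If you do this often, the copy/paste step can be scripted. The following Python sketch scans a saved log for SCREENSHOT lines and writes each one out as a .jpg file. It assumes the whole base64 payload sits on a single log line in the format shown above; the function names and output file names are made up for illustration and are not part of Robocop.

```python
import base64
import re

# Matches the "SCREENSHOT: data:image/jpg;base64,<payload>" line format shown
# above. Assumption: the payload is on one line and contains no whitespace.
SCREENSHOT_RE = re.compile(
    r"SCREENSHOT: data:image/jpg;base64,([A-Za-z0-9+/=]+)")


def extract_screenshots(log_text):
    """Return the decoded bytes of every screenshot found in log_text."""
    return [base64.b64decode(match.group(1))
            for match in SCREENSHOT_RE.finditer(log_text)]


def save_screenshots(log_path, prefix="screenshot"):
    """Write each screenshot in the log to <prefix>-N.jpg; return the names."""
    with open(log_path, "r", errors="replace") as log_file:
        images = extract_screenshots(log_file.read())
    names = []
    for index, data in enumerate(images):
        name = "%s-%d.jpg" % (prefix, index)
        with open(name, "wb") as out:
            out.write(data)
        names.append(name)
    return names
```

For example, save_screenshots("robocop.log") would write screenshot-0.jpg, screenshot-1.jpg, and so on to the current directory, one file per failure recorded in the (hypothetical) robocop.log.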

bookmark-fail

This is still a little rough. One of these days I hope we can:

- capture content…maybe use a gecko screenshot?

- display the screenshot directly off of tbpl, similar to the reftest analyzer (see bug 747440)

Firefox for Android Performance Measures – February Check-up


The February 2013 summary of performance measures for Firefox for Android.

Note modest improvements on tcheck2 and ts. A new RSS regression was found with awsy.

Talos

This section tracks Perfomatic graphs from graphs.mozilla.org for mozilla-central builds of Native Fennec (Android opt). The test names shown are those used on tbpl. See https://wiki.mozilla.org/Buildbot/Talos for background on Talos.

I am not including the actual graphs this month — there’s nothing very interesting to show.

tcheckerboard

Simple measure of “checkerboarding”. Lower values are better.

0.00 (start of month) – 0.023 (end of month).

tcheck2

Measure of “checkerboarding” during simulation of real user interaction with page. Lower values are better.

5.0 (start of month) – 3.47 (end of month).

trobopan

Panning performance test. Value is square of frame delays (ms greater than 25 ms) encountered while panning. Lower values are better.

12000 (start of month) – 12000 (end of month).

tprovider

Performance of history and bookmarks’ provider. Reports time (ms) to perform a group of database operations. Lower values are better.

375 (start of month) – 375 (end of month).

tsvg_nochrome

Page load test for SVG. Lower values are better.

4100 (start of month) – 4020 (end of month).

tp4m_nochrome

Generic page load test. Lower values are better.

730 (start of month) – 720 (end of month).

tp4m_main_rss_nochrome

96000000 (start of month) – 95000000 (end of month).

Gradual improvements

tp4m_shutdown_nochrome

28000000 (start of month) – 28000000 (end of month).

ts

Startup performance test. Lower values are better.

4000 (start of month) – 3400 (end of month).

ts_shutdown

Shutdown performance test. Lower values are better.

28000000 (start of month) – 28000000 (end of month).

Throbber Start / Throbber Stop

These graphs are taken from http://mrcote.info/phonedash/#/.  Browser startup performance is measured on real phones (a variety of popular devices).

“Time to throbber start” measures the time from process launch to the start of the throbber animation. Smaller values are better.

This month’s graphs are incomplete (start late) due to maintenance on the phonedash system.

0213-throbber-start

“Time to throbber stop” measures the time from process launch to the end of the throbber animation. Smaller values are better.

0213-throbber-stop

Eideticker

These graphs are taken from http://eideticker.wrla.ch. Eideticker is a performance harness that measures user perceived performance of web browsers by video capturing them in action and subsequently running image analysis on the raw result. More info at: https://wiki.mozilla.org/Project_Eideticker

0213eide1

0213eide2

0213eide3

0213eide4

0213eide5

0213eide6

0213eide7

0213eide8

0213eide9

0213eide10

0213eide11

Note: Cold startup for Galaxy Nexus was unavailable — here are the results for LG-P999 instead.

awsy

See https://www.areweslimyet.com/mobile/ for content and background information.

0213awsy1

Significant regression on Feb 25 tracked in bug 846832.

0213awsy2

0213awsy3

Firefox for Android Performance Measures – January Check-up


The January 2013 summary of performance measures for Firefox for Android. This month’s highlights:

- new tool to track memory use: https://www.areweslimyet.com/mobile/

- improvements in eideticker’s “Cold startup to about:home” and tp4m_nochrome

- gradual regression in tp4m_main_rss_nochrome

Talos

This section shows Perfomatic graphs from graphs.mozilla.org for mozilla-central builds of Native Fennec (Android opt). The test names shown are those used on tbpl. See https://wiki.mozilla.org/Buildbot/Talos for background on Talos.

Some of these graphs are difficult to read because the scale of the y-axis is completely inappropriate; we’re *still* working on that (see bug 705293).

tcheckerboard

Simple measure of “checkerboarding”. Lower values are better.

0113tcheck

0.03 (start of month) – 0.00 (end of month).

Improvement -> 0.00 (Jan 30)

tcheck2

Measure of “checkerboarding” during simulation of real user interaction with page. Lower values are better.

0113tcheck2

5.0 (start of month) – 5.0 (end of month).

trobopan

Panning performance test. Value is square of frame delays (ms greater than 25 ms) encountered while panning. Lower values are better.

0113tpan

12000 (start of month) – 12000 (end of month).

Improvement -> 8000 (Jan 9)

Regression -> 12000 (Jan 19) Bug 833000

tprovider

Performance of history and bookmarks’ provider. Reports time (ms) to perform a group of database operations. Lower values are better.

0113tprov

375 (start of month) – 375 (end of month).

tsvg_nochrome

Page load test for SVG. Lower values are better.

0113tsvg

4100 (start of month) – 4100 (end of month).

tp4m_nochrome

Generic page load test. Lower values are better.

0113tp4

880 (start of month) – 730 (end of month).

Gradual improvement + bug 831123

tp4m_main_rss_nochrome

0113tp4rss

88000000 (start of month) – 96000000 (end of month).

Gradual regression + bug 836429

tp4m_shutdown_nochrome

0113tp4shutdown

28000000 (start of month) – 28000000 (end of month).

ts

Startup performance test. Lower values are better.

0113ts

4200 (start of month) – 4000 (end of month).

ts_shutdown

Shutdown performance test. Lower values are better.

0113tsshutdown

28000000 (start of month) – 28000000 (end of month).

Throbber Start / Throbber Stop

These graphs are taken from http://mrcote.info/phonedash/#/.  Browser startup performance is measured on real phones (a variety of popular devices).

“Time to throbber start” measures the time from process launch to the start of the throbber animation. Smaller values are better.

0113-throbberstart

“Time to throbber stop” measures the time from process launch to the end of the throbber animation. Smaller values are better.

0113throbberstop

Regression in time to throbber stop on Jan 3: bug 827361.

Regression in time to throbber stop on Jan 23 (intermittent, Nexus S): bug 836886.

Eideticker

These graphs are taken from http://eideticker.wrla.ch. Eideticker is a performance harness that measures user perceived performance of web browsers by video capturing them in action and subsequently running image analysis on the raw result. More info at: https://wiki.mozilla.org/Project_Eideticker

0113eide1 0113eide2 0113eide3 0113eide4 0113eide5 0113eide6 0113eide7 0113eide8 0113eide9 0113eide10 0113eide11

awsy

Beginning this month, we start tracking awsy (are we slim yet?) for mobile. See https://www.areweslimyet.com/mobile/ for content and background information.

0113awsy1 0113awsy2 0113awsy3

Firefox for Android Performance Measures – December Check-up


The December 2012 summary of performance measures for Firefox for Android. This month’s highlights:

- 2 minor regressions in throbber start/stop tests

- regression in eideticker’s “Cold startup to about:home” test

- improvement in Talos tsvg_nochrome

Talos

This section shows Perfomatic graphs from graphs.mozilla.org for mozilla-central builds of Native Fennec (Android opt). The test names shown are those used on tbpl. See https://wiki.mozilla.org/Buildbot/Talos for background on Talos.

Some of these graphs are difficult to read because the scale of the y-axis is completely inappropriate; we’re *still* working on that (see bug 705293).

tcheckerboard

Simple measure of “checkerboarding”. Lower values are better.

tcheck

0.03 (start of month) – 0.03 (end of month).

tcheck2

Measure of “checkerboarding” during simulation of real user interaction with page. Lower values are better.

tcheck2

5.0 (start of month) – 5.0 (end of month).

Temporary regression: bug 820603 (Dec 10, backed out Dec 13)

trobopan

Panning performance test. Value is square of frame delays (ms greater than 25 ms) encountered while panning. Lower values are better.

tpan

15000 (start of month) – 15000 (end of month).

tprovider

Performance of history and bookmarks’ provider. Reports time (ms) to perform a group of database operations. Lower values are better.

tprovider

375 (start of month) – 375 (end of month).

tsvg_nochrome

Page load test for SVG. Lower values are better.

tsvg

7000 (start of month) – 4100 (end of month).

Temporary regression: Bug 820603 (Dec 10, backed out Dec 13)

Improvement: Bug 822755? (Dec 26)

tp4m_nochrome

Generic page load test. Lower values are better.

tp4m

880 (start of month) – 880 (end of month).

tp4m_main_rss_nochrome

tp4m_b

84000000 (start of month) – 88000000 (end of month).

tp4m_shutdown_nochrome

tp4m_c

28000000 (start of month) – 28000000 (end of month).

ts

Startup performance test. Lower values are better.

ts

3600 (start of month) – 4200 (end of month).

ts_shutdown

Shutdown performance test. Lower values are better.

ts_b

28000000 (start of month) – 28000000 (end of month).

Throbber Start / Throbber Stop

These graphs are taken from http://mrcote.info/phonedash/#/.  Browser startup performance is measured on real phones (a variety of popular devices).

“Time to throbber start” measures the time from process launch to the start of the throbber animation. Smaller values are better.

throbber_start

“Time to throbber stop” measures the time from process launch to the end of the throbber animation. Smaller values are better.

throbber_stop

Regression in time to throbber start on Dec 5: bug 819984.

Regression in time to throbber start and time to throbber stop on Dec 22: bug 825612.

Eideticker

These graphs are taken from http://eideticker.wrla.ch. Eideticker is a performance harness that measures user perceived performance of web browsers by video capturing them in action and subsequently running image analysis on the raw result. More info at: https://wiki.mozilla.org/Project_Eideticker

eide01

eide02

eide03

eide04

eide05

eide06

eide07

eide08

eide09

eide10

eide11

Regression in second half of December: bug 825612.

Mobile Firefox Performance Measures – November Check-up


The November episode in the popular series…

Talos

This section shows Perfomatic graphs from graphs.mozilla.org for mozilla-central builds of Native Fennec (Android opt). The test names shown are those used on tbpl. See https://wiki.mozilla.org/Buildbot/Talos for background on Talos.

Some of these graphs are difficult to read because the scale of the y-axis is completely inappropriate; we’re *still* working on that (see bug 705293).

tdhtml_nochrome

This test has been retired and will no longer be reported.

tcheckerboard

Simple measure of “checkerboarding”. Lower values are better.

rck

20 (start of month) – 0.03 (end of month).

Improvement: bug 814437 (Nov 29). ** This bug changed the way we measure these results. **

tcheck2

Measure of “checkerboarding” during simulation of real user interaction with page. Lower values are better.

rck2

1600 (start of month) – 9.5 (end of month).

Regression: bug 814437 / 783368 (Nov 21).

Improvement: bug 802400 (Nov 26).

Improvement: bug 814272 (Nov 28).

Improvement: bug 814437 (Nov 29). ** This bug changed the way we measure these results. **

tcheck3

This test has been retired and will no longer be reported.

trobopan

Panning performance test. Value is square of frame delays (ms greater than 25 ms) encountered while panning. Lower values are better.

rp

642000 (start of month) – 15000 (end of month).

Temporary improvement: bug 809199, 810933 (Nov 8, Nov 12).

Improvement: bug 802400 (Nov 26).

tprovider

Performance of history and bookmarks’ provider. Reports time (ms) to perform a group of database operations. Lower values are better.

rpr

375 (start of month) – 375 (end of month).

tsvg_nochrome

Page load test for SVG. Lower values are better.

tsvg

7000 (start of month) – 7000 (end of month).

tp4m_nochrome

Generic page load test. Lower values are better.

tp4m

830 (start of month) – 880 (end of month).

tp4m_main_rss_nochrome

tp4m-rss

84000000 (start of month) – 84000000 (end of month).

tp4m_shutdown_nochrome

tp4m-shutdown

25000000 (start of month) – 28000000 (end of month).

“Regression”: Bug 809190 (Nov 5).

We learned a lot about shutdown tests this month: see bug 797339.

ts

Startup performance test. Lower values are better.

ts

3100 (start of month) – 3100 (end of month).

ts_shutdown

Shutdown performance test. Lower values are better.

ts_shutdown

25000000 (start of month) – 28000000 (end of month).

“Regression”: Bug 809190 (Nov 5).

We learned a lot about shutdown tests this month: see bug 797339.

Throbber Start / Throbber Stop

These graphs are taken from http://mrcote.info/phonedash/#/.  Browser startup performance is measured on real phones (a variety of popular devices).

“Time to throbber start” measures the time from process launch to the start of the throbber animation. Smaller values are better.

throbber_start

“Time to throbber stop” measures the time from process launch to the end of the throbber animation. Smaller values are better.

throbber_stop

Eideticker

These graphs are taken from http://eideticker.wrla.ch. Eideticker is a performance harness that measures user perceived performance of web browsers by video capturing them in action and subsequently running image analysis on the raw result. More info at: https://wiki.mozilla.org/Project_Eideticker

eide1

eide2

eide3

eide4

eide5

eide6

eide7

eide8

eide9

eide10

eide11

Running Firefox for Android in the Android-x86 emulator


There is ongoing work to make Firefox for Android available for x86 hardware (see bug 802527, for instance). Still, I found that it is already possible to build, install, and run without much difficulty. I don’t have an x86 Android device, so I installed and tested on the android-x86 emulator.

Here are some notes on my experience.

SDK, NDK, and Emulator setup

I updated my Android SDK and NDK installations to:

  • Android 4.1.2 (API 16) SDK
  • android-ndk-r7b

I understand that r7b is the minimum required NDK version. Using r8b creates a minor complication — see bug 802527.

Using the “android” setup program, I installed an x86 image: “Intel x86 Atom System Image”.

Then I created an AVD to use the x86 image (Tools / Manage AVDs… / New…):

Be sure to:

  • select the x86 image
  • enable GPU emulation
  • give it lots of RAM.

Build for x86:

You just need the right mozconfig! I had trouble with the mozconfigs I found at mobile/android/config/mozconfigs/android-x86, so I made my own:

. "$topsrcdir/mobile/android/config/mozconfigs/common"
ac_add_options --enable-application=mobile/android
ac_add_options --target=i686-linux-android
ac_add_options --with-android-ndk="/media/extra/android-ndk-r7b"
ac_add_options --with-android-sdk="/home/mozdev/android-sdk-linux_x86/platforms/android-16"
ac_add_options --with-android-platform="/media/extra/android-ndk-r7b/platforms/android-14/arch-x86"
ac_add_options --with-android-version=16
ac_add_options --with-system-zlib
# IonMonkey disabled in bug 789373
ac_add_options --disable-ion
ac_add_options --with-branding=mobile/android/branding/nightly
ac_add_options --with-ccache=/usr/bin/ccache
mk_add_options MOZ_OBJDIR=/media/extra/objdir-x86-droid
mk_add_options MOZ_MAKE_FLAGS="-j9 -s"

Now build and package as normal: make -f client.mk, etc.

Start the emulator with the configured AVD:

mozdev:~$ emulator -avd gyb16x86b

Install Fennec on the emulator:

mozdev:/media/extra/objdir-x86-droid$ adb devices
List of devices attached
03700149427f5657    device
emulator-5554    offline
mozdev:/media/extra/objdir-x86-droid$ adb -e install dist/fennec*.apk
4210 KB/s (20333055 bytes in 4.715s)
pkg: /data/local/tmp/fennec-19.0a1.en-US.android-i686.apk
Success
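Note that the adb devices listing above still reports the emulator as offline while it boots. If you script the install, it may help to check the emulator's state first. The Python sketch below parses adb devices output of the form shown above and reports whether an emulator is online. The helper names are hypothetical, and it assumes each device line consists of a serial followed by a state.

```python
def parse_adb_devices(output):
    """Map each serial in `adb devices` output to its reported state."""
    devices = {}
    for line in output.splitlines():
        parts = line.split()
        # Device lines have exactly two fields: serial and state.
        # This skips the "List of devices attached" header and blank lines.
        if len(parts) != 2:
            continue
        serial, state = parts
        devices[serial] = state
    return devices


def emulator_ready(output):
    """Return True if any emulator-* entry is in the online 'device' state."""
    return any(serial.startswith("emulator-") and state == "device"
               for serial, state in parse_adb_devices(output).items())


if __name__ == "__main__":
    # Sample text copied from the listing above.
    sample = ("List of devices attached\n"
              "03700149427f5657    device\n"
              "emulator-5554    offline\n")
    print(parse_adb_devices(sample))
    print(emulator_ready(sample))
```

In a real script, the text passed in would come from running adb devices (for example via subprocess), repeated until emulator_ready returns True.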

Launch Firefox for Android on the emulator:
