OProfile

This page describes how to use oprofile on the Sigma 8654 board. Currently, oprofile on MIPS supports only the 24K core.

oprofile requires a patch to the Sigma kernel. Until that patch is included in Sigma’s kernel, please apply it yourself.

This page assumes you already know oprofile and only need the MIPS-specific details. For general information, manuals, etc., please see the project page here: http://oprofile.sourceforge.net/docs

Getting started

Type the commands in this section at the Android shell prompt, using either the serial console or an adb shell.

Load the kernel module by typing:

    insmod /lib/modules/oprofile.ko

Do a simple setup (profiling by CYCLES only):

    opcontrol --setup --event=CYCLES:40000 --vmlinux=/vmlinux --kernel-range=0x84000000,0x84400000
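
The number after the event name is the sampling count: oprofile records one sample each time the counter has seen that many events, so `CYCLES:40000` means roughly one sample per 40000 CPU cycles. The small Python sketch below works through that arithmetic. The 500 MHz clock is a placeholder assumption, not a measured figure for the 8654, so substitute your board's actual CPU clock.

```python
def samples_per_second(cpu_hz, cycles_per_sample):
    """Approximate CYCLES sample rate while the CPU is busy."""
    return float(cpu_hz) / cycles_per_sample


# Hypothetical clock speed, used only to illustrate the arithmetic.
ASSUMED_CPU_HZ = 500 * 1000 * 1000

rate = samples_per_second(ASSUMED_CPU_HZ, 40000)
print("about %.0f samples per second" % rate)
```

A smaller count gives finer resolution but more profiling overhead; a larger count does the opposite.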

Start profiling:

    opcontrol --start

Run the workload that you want to profile.

Stop profiling:

    opcontrol --stop
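
If you would rather drive this sequence from your host than type it by hand, something like the following Python sketch may help. It is only an illustration: it assumes `adb` is on your PATH with a working connection to the board, that the kernel module is already loaded, and the helper names (`adb_opcontrol`, `profile`) are made up for this example. Depending on your adb version, a failing `opcontrol` on the target may not be reported back as a nonzero exit status.

```python
import subprocess

# Same setup arguments as the manual command above.
SETUP_ARGS = [
    "--setup",
    "--event=CYCLES:40000",
    "--vmlinux=/vmlinux",
    "--kernel-range=0x84000000,0x84400000",
]


def adb_opcontrol(*args):
    """Build the argv that runs one opcontrol command on the target via adb."""
    return ["adb", "shell", "opcontrol"] + list(args)


def profile(workload, run=subprocess.check_call):
    """Set up, start profiling, call workload(), and always stop afterwards."""
    run(adb_opcontrol(*SETUP_ARGS))
    run(adb_opcontrol("--start"))
    try:
        workload()
    finally:
        # Stop even if the workload raises, so the profiler is not left running.
        run(adb_opcontrol("--stop"))
```

Calling `profile(my_workload)` with your own function runs it between `--start` and `--stop`. The `run` parameter exists only so the command sequence can be inspected without a board attached.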

Pulling samples to your host workstation

At this point, you have collected sample data and you need to move it from your target board to your host and analyze it there.

I use the Python script provided at external/oprofile/opimport_pull. I copied it to my working directory and changed the first line from this:

    #!/usr/bin/python2.4 -E

to this:

    #!/usr/bin/python -E

Before you can run this script, you must have set up your paths by doing this at the top of your build tree:

    source build/envsetup.sh
    setpaths

Then make sure that you have a live ADB connection:

    export ADBHOST=<your_target_board_IP_addr>
    adb kill-server
    adb shell
        (log out of adb shell, once you know it works)

Then you can run the script like this:

    opimport_pull -r my_samples

That tells the script to make a subdirectory called “my_samples” (and first remove it if it already exists), and pull the samples over from your target board.

A little digression

On my system, adb pull (used inside the opimport_pull script) runs very slowly, and it takes 5 minutes to pull over the samples. I wrote a workaround for this that reads the samples directly off the NFS server from which my target board mounts its root filesystem.

See the script nfs_opimport_pull which is attached to this page. You will need to modify line 9 of this script to point to the location of your root filesystem, which must also be mounted on your host workstation. The script copies the files as root using “sudo”, because Android makes its /data directory private, and therefore only readable by root.

Sample Analysis

Finally, if all the above steps are OK, the opimport_pull script will fire off opreport to give you a summary analysis of your trace. The relevant part will look something like this:

CPU: MIPS 24K, speed 0 MHz (estimated)
Counted CYCLES events (Cycles) with a unit mask of 0x00 (No unit mask) count 40000
samples  %        app name                 symbol name
57230    37.4630  libcutils.so             /system/lib/libcutils.so
40837    26.7321  libskia.so               /system/lib/libskia.so
18923    12.3871  libwebcore.so            /system/lib/libwebcore.so
14348     9.3923  vmlinux                  /vmlinux
7862      5.1465  libc.so                  /system/lib/libc.so
7463      4.8853  libdvm.so                /system/lib/libdvm.so
1774      1.1613  oprofiled                /system/xbin/oprofiled
996       0.6520  oprofile                 /oprofile
616       0.4032  libffi.so                /system/lib/libffi.so
586       0.3836  libbinder.so             /system/lib/libbinder.so
533       0.3489  libm.so                  /system/lib/libm.so
424       0.2776  libutils.so              /system/lib/libutils.so
279       0.1826  libandroid_runtime.so    /system/lib/libandroid_runtime.so
245       0.1604  libui.so                 /system/lib/libui.so
115       0.0753  libsurfaceflinger.so     /system/lib/libsurfaceflinger.so
(truncated)
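
If you want to post-process a report like this rather than read it by eye, the rows are easy to parse. The Python sketch below is one possible approach, not part of opimport_pull: it assumes the column layout shown above and uses a made-up helper name, `parse_report`. It reads the first few rows of the listing and adds up the share of the top three images.

```python
def parse_report(text):
    """Parse opreport rows into (samples, percent, app_name, symbol) tuples."""
    rows = []
    for line in text.splitlines():
        # Split into at most 4 fields so spaces inside the last column survive.
        fields = line.split(None, 3)
        # Data rows start with an integer sample count; skip headers and notes.
        if len(fields) == 4 and fields[0].isdigit():
            rows.append((int(fields[0]), float(fields[1]), fields[2], fields[3]))
    return rows


SUMMARY = """\
CPU: MIPS 24K, speed 0 MHz (estimated)
samples  %        app name                 symbol name
57230    37.4630  libcutils.so             /system/lib/libcutils.so
40837    26.7321  libskia.so               /system/lib/libskia.so
18923    12.3871  libwebcore.so            /system/lib/libwebcore.so
14348     9.3923  vmlinux                  /vmlinux
"""

rows = parse_report(SUMMARY)
top3 = sum(percent for _, percent, _, _ in rows[:3])
print("top 3 images: %.4f%% of samples" % top3)
```

In this trace the three busiest libraries account for a little over three quarters of all samples, which tells you where to look first with `--symbols`.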

You can then call opreport with the --symbols option to break the profile down by function name. You have to tell it where your symbols are located. I cd into the sample directory and give it a command like this (note that the -p option seems to want an absolute path, so substitute your own path to the top of your Android tree):

    opreport -p /u/android/mips2-eclair/out/target/product/smp86xx/symbols/system/lib --symbols --session-dir=. | head -100

That will produce this type of detail:

samples  %        app name                 symbol name
29443    19.2822  libcutils.so             mips_memset
26507    17.3594  libcutils.so             android_memset16
14348     9.3965  vmlinux                  /vmlinux
8777      5.7481  libskia.so               S32A_D565_Opaque_Dither(unsigned short*, unsigned int const*, int, unsigned int, int, int)
8445      5.5306  libskia.so               SkRGB16_Black_Blitter::blitMask(SkMask const&, SkIRect const&)
2247      1.4716  libwebcore.so            WebCore::WidthIterator::advance(int, WebCore::GlyphBuffer*)
2079      1.3615  libskia.so               SkRGB16_Opaque_Blitter::blitMask(SkMask const&, SkIRect const&)
1774      1.1618  oprofiled                /system/xbin/oprofiled
1440      0.9431  libwebcore.so            WebCore::Font::glyphDataForCharacter(int, bool, bool) const
1165      0.7630  libc.so                  memcmp
1023      0.6700  libc.so                  free
1018      0.6667  libwebcore.so            WebCore::InlineFlowBox::paint(WebCore::RenderObject::PaintInfo&, int, int)
1007      0.6595  libwebcore.so            WebCore::InlineTextBox::paint(WebCore::RenderObject::PaintInfo&, int, int)
996       0.6523  oprofile                 /oprofile
921       0.6032  libskia.so               SkPaint::descriptorProc(SkMatrix const*, void (*)(SkDescriptor const*, void*), void*) const
861       0.5639  libwebcore.so            WebCore::GlyphBuffer::add(unsigned short, WebCore::SimpleFontData const*, float, WebCore::FloatSize const*)
778       0.5095  libdvm.so                dvmMterp_invokeMethod
717       0.4696  libskia.so               SkPicturePlayback::draw(SkCanvas&)
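
A symbol-level report can also be rolled back up per library, which shows how much of a library's time a few functions explain. The Python sketch below is only an illustration using the first rows of the listing above; the helper name `samples_by_image` is invented here. It splits each row into at most four fields so that C++ signatures containing spaces stay intact.

```python
from collections import defaultdict

REPORT = """\
samples  %        app name                 symbol name
29443    19.2822  libcutils.so             mips_memset
26507    17.3594  libcutils.so             android_memset16
14348     9.3965  vmlinux                  /vmlinux
8777      5.7481  libskia.so               S32A_D565_Opaque_Dither(unsigned short*, unsigned int const*, int, unsigned int, int, int)
8445      5.5306  libskia.so               SkRGB16_Black_Blitter::blitMask(SkMask const&, SkIRect const&)
"""


def samples_by_image(text):
    """Sum sample counts per app name (third column) over all data rows."""
    totals = defaultdict(int)
    for line in text.splitlines():
        fields = line.split(None, 3)
        if len(fields) == 4 and fields[0].isdigit():
            totals[fields[2]] += int(fields[0])
    return dict(totals)


totals = samples_by_image(REPORT)

# 57230 is the libcutils.so total from the summary report earlier on this page.
memset_share = float(totals["libcutils.so"]) / 57230
print("memset routines: %.1f%% of libcutils.so samples" % (100 * memset_share))
```

Here the two memset routines explain nearly all of the libcutils.so samples, so they are the obvious place to start optimizing.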

Finally, you can also give it the path to your kernel vmlinux file, and it will resolve those symbols as well. Again I give it a full path; substitute your own prefix:

    opreport -p /u/android/mips2-eclair/out/target/product/smp86xx/symbols/system/lib,/u/android/kernel-sigma --symbols --session-dir=. | head -100

The vmlinux samples are now broken down by kernel function:

samples  %        app name                 symbol name
29443    19.4091  libcutils.so             mips_memset
26507    17.4736  libcutils.so             android_memset16
8777      5.7859  libskia.so               S32A_D565_Opaque_Dither(unsigned short*, unsigned int const*, int, unsigned int, int, int)
8445      5.5670  libskia.so               SkRGB16_Black_Blitter::blitMask(SkMask const&, SkIRect const&)
2247      1.4812  libwebcore.so            WebCore::WidthIterator::advance(int, WebCore::GlyphBuffer*)
2079      1.3705  libskia.so               SkRGB16_Opaque_Blitter::blitMask(SkMask const&, SkIRect const&)
1799      1.1859  vmlinux                  __do_softirq
1774      1.1694  oprofiled                /system/xbin/oprofiled
1440      0.9493  libwebcore.so            WebCore::Font::glyphDataForCharacter(int, bool, bool) const
1205      0.7943  vmlinux                  ring_buffer_consume
1188      0.7831  vmlinux                  finish_task_switch
1165      0.7680  libc.so                  memcmp
1023      0.6744  libc.so                  free

A few other notes:

Once you have taken a sample and pulled it to your host, if you want to take another using the same settings, do a --reset first to clear out the old samples:

    opcontrol --reset
    opcontrol --start
    <do something interesting>
    opcontrol --stop

If you want to change the setup, you must first shut down the oprofile daemon and clean up:

    opcontrol --shutdown
    opcontrol --setup --event=CYCLES:50000 --event=ICACHE_MISSES:3000 --vmlinux=/vmlinux --kernel-range=0x84000000,0x84400000
    opcontrol --start
    <do something interesting>
    opcontrol --stop

The 24K has two performance counters, and they differ in capability. Please read the 24K core manual to see what you can profile, and use the “opcontrol --list-events” command to see the event names. You can pass two “--event” specs to opcontrol, as above.

It makes the most sense to choose events that you think may be correlated with each other, because opreport prints their sample counts side by side.
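
When comparing two events, keep in mind that their sample counts are not directly comparable, because each event has its own sampling count. Multiplying samples by the count gives an estimate of the underlying events. The Python sketch below shows the arithmetic for the CYCLES:50000 and ICACHE_MISSES:3000 setup above. The sample numbers are invented for illustration and do not come from a real trace.

```python
def estimated_events(samples, count):
    """Each sample stands for roughly `count` occurrences of the event."""
    return samples * count


def misses_per_kilocycle(cycle_samples, miss_samples,
                         cycle_count=50000, miss_count=3000):
    """Estimate instruction cache misses per 1000 cycles for one symbol."""
    cycles = estimated_events(cycle_samples, cycle_count)
    misses = estimated_events(miss_samples, miss_count)
    return 1000.0 * misses / cycles


# Hypothetical sample counts for one function, as read off a two-event report.
print(misses_per_kilocycle(cycle_samples=2000, miss_samples=500))
```

A function whose ratio stands well above the rest is a candidate for code layout or size tuning, even if its raw miss sample count looks modest.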