User experience is critical for mobile devices to win the market. To tune a device's software stack for great experiences, an engineer needs to understand the system state transitions during user interactions. The required skill set differs from traditional performance tuning, where hotspots are the major target of attack, and the needed toolkits differ as well. In this document, we introduce UXtune, an engineering toolkit for optimizing Android user experience on Intel® devices. It includes a touch-input generation tool, a system-interaction visualization tool, and metrics output tools. We have found the UXtune toolkit quite useful in our daily Android optimization work.
User Experience Optimization Methodology and Toolkit
User experience with a device involves a sequence of interactions, as shown in Figure 1. The interactions fall into four categories.
- Inputs to the device from the user, sensors, the network, etc. This category evaluates whether inputs trigger the device to act accurately as expected. For touch-screen inputs, it measures touch speed, pressure, range, etc.
- Device response to the inputs. This category evaluates how responsive the device is to the inputs.
- System state transition. This category evaluates how smoothly the graphics transition on the screen. It can be a follow-up to the device response to some input.
- Continuous control of the device. People operating a device not only provide inputs, but sometimes continuously control graphic objects on the screen, such as steering a jet plane in a game or dragging an application icon. This category evaluates the controllability of the device.
Among them, "inputs to the device" and "control of the device" relate to how a user controls the device, while "device response to the inputs" and "system state transition" relate to how the device reacts to the user. Each of these must be good enough for users to be satisfied with the experience. We use the following steps in our optimization methodology:
Step 1. Identify the user experience issue from the user's perspective
Step 2. Define the software stack scenarios and metrics that transform the user experience issue into a software symptom
Step 3. Develop a software workload to reproduce the issue in a measurable and repeatable way. The workload reports the metric values that reflect the user experience issue.
Step 4. Use the workload and related tools to analyze and optimize the software stack. The workload also verifies the optimization.
Step 5. Get feedback from the users and try more applications with the optimization to confirm the user experience improvement.
Based on this methodology, we have established a systematic working model on Android user experience optimization. In this working model, there are four key components:
- An optimization methodology: the overall understanding of user experience and its relation to the software stack
- Android workload suite (AWS): a workload suite that includes almost all the typical use cases of an Android device (except the communications part)
- Android UXtune toolkit: a toolkit that assists user-interaction analysis in the software stack. Unlike traditional performance tuning tools, UXtune correlates user-visible events with low-level system events.
- Sightings and feedback: these are important for the software engineering team, because such inputs complement our methodology with the subjective side of user experience.
In this document, we focus on the third component, the UXtune toolkit.
Workloads and tools are closely related but subtly different. Workloads characterize representative usage models of the system, while tools analyze system behavior. A tool does not itself represent a use case of the device; it analyzes the use case. At the same time, functionality common to multiple workloads can be abstracted into a tool and reused across workloads. The UXtune toolkit provides the following tools:
- Generate repeatable events to be used by a workload as input:
- Android Input-Gestures: Generate event sequences for touch gestures.
- Android Input-Sensors (under development): Generate event sequences for sensor inputs.
- Android system event-log post-processing and visualization:
- Android UXtune: Visualize the system components' interactions.
- Measurement tools:
- Android Meter-FPS: Measure the FPS value of the system drawing.
- Android app-launch: Measure the launch time of applications.
- Android touch-pressure: Measure the platform pressure-value resolution.
The following sections describe the Input-Gestures, UXtune, and Meter-FPS tools in detail.
The Input Tool: Android Input-Gestures
Many user interactions with Android devices occur through touch gestures. To automate the process, engineers need a tool to generate the gestures for them. More importantly, engineers expect the inputs to be repeatable; otherwise, the data of every measurement could differ due to variance in manual operation. Input-Gestures is a tool that can generate the event sequence for common touch gestures and inject the event sequence into a device without human attendance. The typical touch gestures supported by Input-Gestures are:
- Scroll: up/down/left/right from specified start position to specified end position in specified time
- Fling: up/down/left/right at specified position
- Zoom: in/out at specified position with specified span
- Tap (and double taps): at specified position
- Long press: at specified position for specified duration
- Multi-touch: variant of the common gestures
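As a sketch of how such a tool might synthesize one of these gestures, a scroll can be built as one DOWN event, linearly interpolated MOVE events, and one UP event. The function and the event representation below are illustrative, not Input-Gestures' actual format:

```python
# Illustrative gesture synthesis: a scroll as one DOWN event, linearly
# interpolated MOVE events at a fixed interval, and one UP event.
def make_scroll(x0, y0, x1, y1, duration_ms, interval_ms=10):
    """Return a list of (time_ms, action, x, y) tuples for a scroll."""
    events = [(0, "DOWN", x0, y0)]
    steps = max(1, duration_ms // interval_ms)
    for i in range(1, steps):
        t = i * interval_ms
        f = t / duration_ms
        events.append((t, "MOVE", round(x0 + (x1 - x0) * f),
                                  round(y0 + (y1 - y0) * f)))
    events.append((duration_ms, "UP", x1, y1))
    return events

# a 200 ms scroll: the finger moves from (300, 900) up to (300, 300)
seq = make_scroll(300, 900, 300, 300, duration_ms=200)
```

Because the interpolation is deterministic, every injected run replays exactly the same sequence, which is the repeatability property the text describes.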
When the Input-Gestures tool injects an event sequence into a device, the raw events are written to the device's input event node (i.e., a /dev/input/event* file). The raw events are then sent to the framework and transformed into motion events such as ACTION_DOWN, ACTION_MOVE, and ACTION_UP. The motion events are finally dispatched to the application, which detects the sequence as touch gestures such as scroll, fling, etc.
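For illustration, the raw-event format such injection relies on can be sketched as follows. The `input_event` layout (time, type, code, value) and the EV_ABS/EV_SYN constants come from the Linux input subsystem headers; the helper names are our own, and we only build the byte stream here, whereas a real injector would write it to the touch device's /dev/input/event* node:

```python
import struct

# Linux input_event layout: struct timeval (sec, usec), then the 16-bit
# type and code fields and the 32-bit value.
EV_SYN, EV_ABS = 0x00, 0x03
ABS_MT_POSITION_X, ABS_MT_POSITION_Y = 0x35, 0x36
SYN_REPORT = 0x00
FMT = "llHHi"                     # assumes a 64-bit userland

def pack_event(sec, usec, etype, code, value):
    return struct.pack(FMT, sec, usec, etype, code, value)

def touch_point(sec, usec, x, y):
    """One sampled touch point: X, Y, then a SYN_REPORT frame marker."""
    return (pack_event(sec, usec, EV_ABS, ABS_MT_POSITION_X, x)
            + pack_event(sec, usec, EV_ABS, ABS_MT_POSITION_Y, y)
            + pack_event(sec, usec, EV_SYN, SYN_REPORT, 0))

blob = touch_point(0, 0, 500, 800)
```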
With manual touch, the execution path of the touch events has two more stages than with Input-Gestures, as shown in Figure 2. The two extra stages are the "touch sensor sampling" and "input driver delivery" stages at the beginning. The total touch-processing latency with manual touch includes both physical latency and software latency. Since the touch sensor's sampling rate is usually higher than 200 Hz (i.e., at most about one 5 ms sample period of delay) and the software latency is usually about 100 ms, the physical latency is very small and negligible in total touch latency measurement.
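A quick back-of-envelope check of this claim, using the numbers quoted above:

```python
# A >= 200 Hz touch sensor adds at most one sample period of physical
# latency on top of roughly 100 ms of software latency.
sampling_rate_hz = 200
software_latency_ms = 100.0
physical_latency_ms = 1000.0 / sampling_rate_hz    # worst case: 5 ms
share = physical_latency_ms / (physical_latency_ms + software_latency_ms)
# the physical share is below 5% of the total, hence negligible
```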
One example of why the Input-Gestures tool is important for user experience analysis is shown in Figure 3. The chart shows touch event coordinates recorded from a manual scroll-down gesture. The X values remain almost constant, and the Y values decrease monotonically. We can see from the chart that there is a short duration (about 50 ms) at the beginning of the scroll where both the X and Y values are almost constant. This 50 ms should not be counted as part of the scroll gesture, because there is no finger movement in this period. This is a disadvantage of manual touch. When the Input-Gestures tool generates a scroll gesture, it does not generate the leading constant events, thus avoiding inaccurate response-time measurement.
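A hypothetical post-processing filter for this problem could drop the leading near-constant samples of a manual trace before measuring response time (the function name and threshold are illustrative):

```python
# Drop the leading samples of a manual scroll trace that were recorded
# before the finger actually started moving.
def trim_leading_rest(points, threshold=3):
    """points: list of (t_ms, x, y).  Skip samples until movement from
    the first point exceeds `threshold` pixels on either axis."""
    if not points:
        return points
    _, x0, y0 = points[0]
    for i, (_, x, y) in enumerate(points):
        if abs(x - x0) > threshold or abs(y - y0) > threshold:
            return points[i:]
    return points  # no movement at all: keep the trace unchanged

# ~30 ms of "finger down but not yet moving" before the scroll begins
trace = [(0, 100, 500), (10, 101, 500), (20, 100, 499),
         (30, 100, 480), (40, 100, 450)]
moving = trim_leading_rest(trace)
```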
Another example of the tool's importance for user experience analysis is that, even when an engineer can operate the same device quite consistently, it is hard to operate consistently across different devices with different screen sizes. Input-Gestures can transform a gesture from one device to another according to the screen-size differences. Figure 4 shows the raw event sequences on two different devices for the same gesture. (3284, 2747) are the (X, Y) values of a touch event on Device 1, which corresponds to (1810, 1515) on Device 2.
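Such a transformation can be sketched as simple linear scaling between the two devices' raw coordinate ranges. The ranges below are made up for illustration, not the real devices' values, so the result only lands close to the Device 2 coordinates quoted above:

```python
# Illustrative linear scaling between two devices' raw coordinate ranges.
def scale_point(x, y, src_range, dst_range):
    """Map a raw touch coordinate from one device's reporting range
    to another's."""
    return (round(x * dst_range[0] / src_range[0]),
            round(y * dst_range[1] / src_range[1]))

# assuming a 4096x4096 source range and a 2258x2258 destination range
p = scale_point(3284, 2747, (4096, 4096), (2258, 2258))
```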
The Input-Gestures tool has to be adapted to support a new device according to its screen size, resolution, and event format.
The Processing Tool: Android UXtune
In traditional performance tuning, a tool usually counts the total number of events during workload execution, since engineers care mostly about hotspots. For user experience tuning, this is not enough, because a user interaction usually involves several threads, processes, I/Os, etc. The situation is somewhat similar to scalability analysis of enterprise applications on multi-core systems, where the engineer needs information about the threads' one-way and mutual synchronization events, including their types, timestamps, and the synchronizing entities. Figure 5 shows the difference in event-logging mechanism between traditional performance tuning and user experience tuning.
As in multi-core scalability analysis, the key in user experience analysis is to understand the state transitions and the events that trigger them. To assist the engineer's analysis, UXtune maps system events across layers to user-level activities such as events, gestures, and frames; it also correlates the runtime activities of different system entities, such as one thread triggering a garbage collection. UXtune then visualizes both the vertical and the horizontal correlations along a timeline. Figure 6 shows an example of UXtune usage. It shows all the system entities' states at different levels, including CPU, OS, Android, and user-level activities.
In the future, information from other IP blocks such as the GPU should be included as well.
Here we provide a case study to demonstrate how UXtune helps Android optimizations. Figure 7 shows the CPU status when running the benchmark CaffeineMark. It shows only a portion of the full execution, about 20% of the total time; the other 80% of the execution does not have the intensive grey spots. The idle time (grey spots) is about 20% of the shown period, so the performance impact of the CPU idle time is about 20% × 20% = 4%.
In our further analysis with UXtune, we found the idle time occurred during garbage collections (GC). Further analysis revealed that the Android GC design led to the CPU idle time, which happened when the GC needed to suspend the application threads for root-set enumeration. Figure 8 shows the possible scenarios of thread interactions between the GC thread (collector) and the application thread (mutator).
In a bad scenario, there is a period when both the collector and the mutator are suspended, hence the idle CPU. The detailed steps are:
- GC thread sets a flag asking app thread(s) to suspend for GC root set enumeration
- GC thread checks if app is suspended. If not yet, GC thread yields to let app run to suspend
- GC thread comes back to check again. If the app is still not suspended, the GC thread sleeps for 10ms
- App is suspended at some time point (Both GC and app sleep, hence possible CPU idle)
- GC thread wakes up, finishes root enumeration, and lets app resume
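The bad scenario above can be sketched with two threads. The names and the Python threading primitives here are purely illustrative; Dalvik's actual suspension handshake is implemented differently:

```python
import threading, time

# Illustrative sketch of the original handshake: the collector polls a
# "suspended" flag and sleeps 10 ms between checks.  If the mutator parks
# right after a check, both threads sleep and the CPU can go idle.
suspend_request = threading.Event()
mutator_suspended = threading.Event()

def mutator():
    while not suspend_request.is_set():
        pass                      # app work, running to a GC-safe point
    mutator_suspended.set()       # park for root-set enumeration

def collector_wait():
    suspend_request.set()         # step 1: ask the app to suspend
    checks = 0
    while not mutator_suspended.is_set():
        checks += 1
        time.sleep(0.010)         # the problematic 10 ms sleep
    return checks                 # root-set enumeration would follow

t = threading.Thread(target=mutator)
t.start()
checks = collector_wait()
t.join()
```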
UXtune confirmed our observation, as shown in Figure 9. The thread "com.android.cm3" is the application thread, and the thread "GC" is the GC thread. We can see that both threads sleep at the beginning of a GC cycle.
To optimize the CPU idle issue, we modified the GC algorithm, so that in the GC bad scenario, the 10ms sleeping was replaced with a CPU-yielding action. In this way, the GC thread just gives up the CPU to other threads instead of going into sleep when the application thread is not suspended. The scenarios with modified GC algorithm are shown in Figure 10, where the CPU is always active throughout the root set enumeration process.
The detailed steps with the optimized GC algorithm are:
- GC thread notifies the app thread(s) to suspend
- GC thread checks if app is suspended. If not yet, goto Step 3. If yes, goto Step 4
- GC thread yields to let app run to suspend. When GC thread comes back, goto Step 2
- App is suspended at some time point. If no other thread is runnable, the GC thread is scheduled to run
- GC thread finishes root enumeration, then lets app resume. GC thread continues collection concurrently
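The optimized handshake can be sketched the same way, with the 10 ms sleep replaced by a yield (again an illustrative sketch, not Dalvik's actual code):

```python
import threading, time

# The collector yields the CPU (sleep(0), akin to sched_yield) instead of
# sleeping 10 ms, so it is rescheduled as soon as the mutator parks and
# the CPU stays busy while the mutator runs toward its safe point.
suspend_request = threading.Event()
mutator_suspended = threading.Event()

def mutator():
    while not suspend_request.is_set():
        pass                      # app runs to a GC-safe point
    mutator_suspended.set()

def collector_wait():
    suspend_request.set()
    while not mutator_suspended.is_set():
        time.sleep(0)             # yield the CPU instead of sleeping
    # root-set enumeration starts here with no idle gap

t = threading.Thread(target=mutator)
t.start()
collector_wait()
t.join()
```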
With the optimization, the CaffeineMark result did improve by 4% in our measurement. This example shows that UXtune is useful not only for user experience tuning, but also for performance tuning. After all, user experience is a superset of performance.
The Output Tool: Android Meter-FPS
Frame-related data, such as FPS (frames per second), maximal frame time, and frame-time variance, is among the most important for user experience measurement. Android Meter-FPS is a tool that reports this information, showing the data on screen in real time. There are two issues to consider in the Meter-FPS design:
First, the device screen might show multiple visible areas at the same time, each with its own FPS data. For example, a camera application can have three areas on the screen: the preview window, the control panel, and the status bar. Meter-FPS gives the FPS information for all three areas so that the engineer can choose what is really needed.
Second, sometimes the screen is refreshed by the window manager instead of the application. In this situation, the tool should be able to get the FPS data at a lower level than the application. Meter-FPS collects the information at the SurfaceFlinger component, which can catch all of the window manager's refreshes.
What Meter-FPS really does is intercept the graphics-processing paths in the system to log every frame, and then compute and output the FPS data. For overlaid video playback, the Android media framework is instrumented to provide the frame logs. Figure 11 shows real examples of Android Meter-FPS running.
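The metrics themselves are straightforward to compute from the per-frame timestamps; the sketch below is an illustrative function, not Meter-FPS's actual code:

```python
# Compute the frame metrics mentioned above from a list of per-frame
# timestamps (in milliseconds).
def frame_metrics(timestamps_ms):
    """Return (fps, max_frame_time_ms, frame_time_variance)."""
    deltas = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    fps = 1000.0 * len(deltas) / (timestamps_ms[-1] - timestamps_ms[0])
    mean = sum(deltas) / len(deltas)
    variance = sum((d - mean) ** 2 for d in deltas) / len(deltas)
    return fps, max(deltas), variance

# a perfectly steady 20 ms cadence: 50 FPS, no variance
fps, worst_ms, var = frame_metrics([0, 20, 40, 60, 80, 100])
```

A high FPS with a large maximal frame time or variance still feels janky, which is why Meter-FPS reports more than the average rate.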
The figure shows three screenshots, with the FPS values shown in the black box on top of the application. "Fruit FPS" is the FPS value of the game application, and "SystemUI FPS" is the FPS value of the status bar. The latter updates rarely, hence its FPS value of 1. We can see that the SDK version of the application runs with almost the same FPS on Device A as on Device B, while the NDK version of the same application runs much faster.
The engineer can configure Meter-FPS to specify the sampling time period, black box position in the screen, etc.
With Android UXtune, we can identify the root cause of the slow execution of the SDK application on Device A, as shown in Figure 12. The execution timelines of the two major threads are marked with leading stars. One is the drawing thread (surfaceFlinger) that sends the rendered data to the frame buffer; the other is the application thread (fruit) that renders the data. The short grey areas are the periods when the threads are busy working.
From the figure, we can see that both the drawing thread and the application thread have idle periods. The CPU actually goes to sleep when both threads are idle. Deeper analysis reveals that the idle time is caused by the two threads competing for the memory bus and GPU resources.
The author thanks his colleagues Greg Zhu and Ke Chen for their great support in developing the methodology for Android user experience optimizations.
About the Author
Xiao-Feng Li is a software architect in the System Optimization Technology Center of the Software and Services Group of Intel Corporation. Xiao-Feng has extensive experience in parallel software design and runtime technologies. Before joining Intel in 2001, Xiao-Feng was a manager at Nokia Research Center. Xiao-Feng enjoys ice skating and Chinese calligraphy in his leisure time.