by Xiao-Feng Li
Traditional performance is inadequate to characterize modern client devices. Performance concerns the steady execution state of the software stack, and is usually reported as a final score of the total throughput of the processor or other subsystems. User experience, in contrast, concerns the dynamic state transitions of the system triggered by user inputs: the user-perceivable responsiveness, smoothness, coherence, accuracy, etc. Traditional performance can measure every link of the chain of a user interaction, but it does not evaluate the full chain of the user interaction as a whole. Thus the traditional performance optimization methodology cannot simply be applied to user experience optimization. In this document, we describe the concepts in user interactions, and introduce a methodology for quantifying and optimizing user interactions with Android devices.
Client Device User Interactions
In a recent performance measurement of a few Android devices on the market, we found a device X that behaved uniformly worse than another device Y on common benchmarks in graphics, media, and browsing. Yet the user-perceivable experience with device X was better than with device Y. The root cause we identified was that traditional benchmarks, or benchmarks designed in traditional ways, did not really characterize user interactions, but measured the computing capability (e.g., executed instructions) or the throughput (e.g., processed disk reads) of the system and its subsystems.
Take video evaluation as an example. Traditional benchmarks only measure video playback performance with metrics like FPS (frames per second) or frame drop rate. This methodology has at least two problems in evaluating user experience. The first problem is that video playback is only part of the user interactions in playing video. A typical life-cycle of user interaction usually includes at least the following links: "launch player" → "start playing" → "seek progress" → "video playback" → "back to home screen". Good performance in video playback alone cannot characterize the real user experience in playing video. User interaction evaluation is a superset of traditional performance evaluation.
The other problem is that FPS as the key metric of smoothness does not always reflect the actual user experience. For example, when we flung a picture in the Gallery3D application, device Y had obvious stuttering during the picture scrolling, yet its FPS value was higher than that of device X. In order to quantify the difference between the two devices, we collected the data of every frame during a picture fling operation in the Gallery3D application on both device X and device Y, as shown in Figure 1 and Figure 2 respectively. Each frame is given as a vertical bar, where the x-axis position is the time when the frame is drawn, and the height of the bar is the time it takes the system to draw the frame. From the figures, we can see that device X has an obviously lower FPS value than device Y, but a smaller maximal frame time, fewer frames longer than 30 ms, and a smaller frame time variance. This means that, to characterize the user experience of the picture fling operation, metrics like maximal frame time and frame time variance should also be considered.
Figure 1. Frame times of a fling operation in Gallery3D application on device X.
Figure 2. Frame times of a fling operation in Gallery3D application on device Y.
As a comparison, Figure 3 shows the frame data of a fling operation after we optimized device Y. All the metrics improved, and the frame time distribution became much more uniform.
Figure 3. Frame times of a fling operation in Gallery3D application on device Y after optimization.
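The smoothness metrics compared in the figures can be computed directly from the frame-draw timestamps. Below is a minimal sketch; the class and method names are ours for illustration and are not part of AWS or UXtune.

```java
import java.util.Arrays;

// Illustrative sketch: compute the smoothness metrics discussed above from a
// series of frame-draw timestamps (in milliseconds).
public class FrameMetrics {
    // Frame times are the gaps between consecutive draw timestamps.
    static double[] frameTimes(double[] drawTimestampsMs) {
        double[] times = new double[drawTimestampsMs.length - 1];
        for (int i = 1; i < drawTimestampsMs.length; i++) {
            times[i - 1] = drawTimestampsMs[i] - drawTimestampsMs[i - 1];
        }
        return times;
    }

    // Average frames per second over the whole recording.
    static double fps(double[] frameTimesMs) {
        return 1000.0 * frameTimesMs.length / Arrays.stream(frameTimesMs).sum();
    }

    static double maxFrameTime(double[] frameTimesMs) {
        return Arrays.stream(frameTimesMs).max().orElse(0);
    }

    static double variance(double[] frameTimesMs) {
        double mean = Arrays.stream(frameTimesMs).average().orElse(0);
        return Arrays.stream(frameTimesMs)
                     .map(t -> (t - mean) * (t - mean))
                     .average().orElse(0);
    }

    // Count of frames exceeding a perceptibility threshold, e.g. 30 ms.
    static long framesLongerThan(double[] frameTimesMs, double thresholdMs) {
        return Arrays.stream(frameTimesMs).filter(t -> t > thresholdMs).count();
    }
}
```

Note how a recording with a higher average FPS can still yield a larger maximal frame time and variance, which is exactly the device Y behavior shown in Figure 2.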
As noted earlier, user experience is about the dynamic state transitions of the system triggered by user inputs: the user-perceivable responsiveness, smoothness, coherence, and accuracy. It must be evaluated over the full chain of the user interaction, not merely link by link.
Another important note is that user experience is a subjective process; just consider the experience of watching a movie or appreciating music. Current academic research uses various methodologies such as eyeball tracking, heartbeat monitoring, or simply polling to understand user experience. For software engineering purposes, in order to analyze and optimize user interactions systematically, we categorize the interaction scenarios into four kinds:
- Inputs to the device from the user, sensor, network, etc. This category evaluates whether the inputs trigger the device to act as accurately (or as fuzzily) as expected. For touch screen inputs, it measures the touch speed, pressure, range, etc.
- Device response to the inputs. This category evaluates how responsive the device is to the inputs.
- System state transition. This category especially evaluates how smooth graphics transition on the screen. It can be a follow up of the device response to some input.
- Continuous control of the device. People operating the device not only give single inputs, but sometimes also control graphic objects on the screen, such as steering a jet plane in a game or dragging an application icon. This category evaluates the controllability of the device.
Among them, "inputs to the device" and "control of the device" are related to the user experience aspect of how a user controls a device. "Device response to the inputs" and "system state transition" are related to the aspect of how the device reacts to the user. We can map a user interaction life-cycle into scenarios that fall into the categories above; then for each scenario, we can identify the key metrics in the software stack to measure and optimize.
Android User Interaction Optimizations
As we described in the last section, there is no silver bullet in measuring the user experience. We set up the following criteria for our measurement of user interactions.
- Perceivable. The metric has to be perceivable by a human being. Otherwise, it is irrelevant to the user experience.
- Measurable. The metric should be measurable by different teams. It should not depend on special infrastructure that is available only to certain teams.
- Repeatable. The measured result should be repeatable in different measurements. Large deviations in the measurement mean that it is a bad metric.
- Comparable. The measured data should be comparable across different systems. Software engineers can use the metric to compare the different systems.
- Reasonable. The metric should help reason about the causality of software stack behavior. In other words, the metric should map to software behavior, and be computable from the software stack execution.
- Verifiable. The metric can be used to verify an optimization. The measured result before and after the optimization should reflect the change of the user experience.
- Automatable. For software engineering purposes, we expect the metric to be measurable largely unattended. This is especially useful in regression or pre-commit testing. This criterion is not strictly required, though, because it is not directly related to user experience analysis and optimization.
Guided by the measurement criteria, we focus on the following complementary aspects of the user experience.
- How a user controls a device. This aspect has mainly two measurement areas.
- Accuracy/fuzziness. It evaluates what accuracy, fuzziness, resolution, and range are supported by the system for inputs from the touch screen, sensors, and other sources. For example, how many pressure levels are supported by the system, how close the sampled touch events' coordinates are to the fingertip's track on the screen, how many fingers can be sampled at the same time, etc.
- Coherence. It evaluates the drag lag distance between the fingertip and the dragged graphic object on the screen. It also evaluates the coherence between the user operations and sensor-controlled objects, e.g., the angular difference between the tilt-controlled water flow and the device's oblique angle.
- How a device reacts to a user. This aspect also has two measurement areas:
- Responsiveness. It evaluates the time between an input being delivered to the device and the device showing a visible response. It also includes the time spent to finish an action.
- Smoothness. This area evaluates graphic transition smoothness with the maximal frame time, frame time variance, FPS, frame drop rate, etc. As we have discussed, FPS alone cannot fully characterize the user experience of smoothness.
For these four measurement areas, once we identify a concrete metric to use, we need to understand how the metric relates to "good" user experience. Since user experience is a subjective topic that highly depends on a human being's physiological status and personal taste, there is not always a scientific conclusion about what value of a metric constitutes "good" user experience. For those cases, we simply adopt industry experience values. Table 1 below gives some examples of the industry experience values.
Table 1. The example industry experience values for user experience
Due to human nature, there are two notes software engineers should pay attention to in user experience optimizations.
The value of a metric usually has a range for "good" user experience. A "better" value than the range does not necessarily bring "better" user experience. Anything beyond the range limit could be invisible to the user.
The values here are only rough guidelines for common cases with common people. For example, a seasoned game player may not be satisfied with even a 120 fps animation. On the other hand, a well-designed cartoon may feel perfectly smooth with a 20 fps animation.
Now we can set up our methodology for user experience optimization. It can be summarized in the following steps.
Step 1. Identify the user interaction issue to resolve
Step 2. Define the software stack scenarios and metrics that transform the user experience issue into a software symptom
Step 3. Develop a software workload to reproduce the issue in a measurable and repeatable way. The workload reports the metric values that reflect the user experience issue.
Step 4. Use the workload and related tools to analyze and optimize the software stack. The workload also verifies the optimization.
Step 5. Get feedback from users and try more applications with the optimization to confirm the user experience improvement.
Based on this methodology, we have established a systematic working model on Android user experience optimization. In this working model, there are four key components:
- An optimization methodology. This has been described.
- Android workload suite (AWS). We have developed a workload suite, AWS, that includes almost all the typical use cases of an Android device (except the communications part).
- Android UXtune toolkit. We have developed a toolkit that assists user interaction analysis in the software stack. Different from the traditional performance tuning tools, UXtune correlates the user visible events and the system low-level events.
- Sightings and feedback. These are important for the software engineering team, because they complement our methodology with the subjective side of user experience.
Table 2 shows the AWS 2.0 workloads. The use cases were selected based on our extensive survey of the mobile device industry, market applications, and user feedback. AWS is still evolving based on user feedback and Android platform changes.
Table 2. Android workload suite (AWS) v2.0
The Android UXtune toolkit includes three kinds of tools.
- Generate repeatable inputs to operate the device. Currently the Android input-Gestures tool is available for touch gesture input generation, so that engineers do not need to operate the device manually. A tool for sensor input generation is under development.
- Visualize the system interactions between the software components, such as events, frames, and threads. Currently Android UXtune 1.0 is available. Integration with PMU (performance monitoring unit) events is under development.
- Extract important user experience metrics. Currently Android meter-FPS, Android app-launch, and Android touch-pressure are available to get the system FPS value, application launch time, and touch pressure resolution respectively.
The details of AWS and UXtune are described in other whitepapers.
A Case Study of Android User Interaction Optimization
We use the "Drag" example to illustrate our Android optimization methodology.
According to our optimization methodology, the first step is to identify the user interaction issue. When a user drags an icon on the screen, there is usually a lag distance between the icon and the fingertip, which is especially obvious when the finger moves fast.
As shown in Figure 4, at time T0, the finger starts to drag the icon at position P0. When the finger moves to position P1 at time T1, the icon starts to move. At time T2, when the icon reaches position P1, the finger is already at position P2. We can see the lag distance between the icon and the fingertip. To summarize:
- T0: the time when the finger starts to drag the icon at position P0;
- T1: the time when the icon starts to move, with the finger at position P1;
- T2: the time when the icon reaches P1, with the finger at position P2.
Figure 4. Drag lag distance.
The next step is to identify the scenario and metrics in the software execution that can expose and characterize the drag lag distance issue. In this example, we choose the scenario of dragging an application icon in the Android home-screen. One metric can be the distance between positions P0 and P1, i.e., P1 - P0. An alternative or complementary metric can be the time for the icon to start moving upon the drag operation, i.e., T1 - T0. The time metric could be better than the distance metric depending on the distance representation, because the physical distance (in inches) differs from the pixel distance, so the same pixel distance may produce a different visual effect on different screens. Another metric for drag distance can be the distance P2 - P1, which can also be reflected by the time for the icon to move from P0 to P1, i.e., T2 - T1. For all these metrics, T1 - T0, P1 - P0, P2 - P1, and T2 - T1, smaller is better for user experience.
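Once the times and positions are sampled, the four metrics are simple differences. An illustrative computation (the class and field names are ours, not from the AWS workload; units are milliseconds and pixels):

```java
// Illustrative holder for the four drag metrics defined above.
public class DragMetrics {
    final double reactionTime;  // T1 - T0: time before the icon starts to move
    final double reactionDist;  // P1 - P0: finger travel before the icon moves
    final double followDist;    // P2 - P1: finger-to-icon gap at time T2
    final double followTime;    // T2 - T1: time for the icon to reach P1

    DragMetrics(double t0, double p0, double t1, double p1,
                double t2, double p2) {
        reactionTime = t1 - t0;
        reactionDist = p1 - p0;
        followDist = p2 - p1;
        followTime = t2 - t1;
    }
}
```

For all four fields, smaller values correspond to better drag coherence.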
Then the third step is to construct a workload that reproduces the symptom and reports the selected metrics. We construct the workload based on the Android home-screen, and map the metrics computation to software stack engineering values. Figure 5 shows the implementation diagram of the workload. Following the Android programming convention, the majority of the workload code is in the onTouch() and onDraw() callbacks. The callback onTouch() receives a touch event as its parameter and computes the content update that should be shown on the screen. The callback onDraw() does the real drawing with the content updated by the onTouch() callback. onTouch() mainly implements a four-step state machine to deal with the touch event inputs, from the user's finger pressing on the screen through the final touch release. Correspondingly, there are four steps in the onDraw() callback to provide the user with the visible state transitions on the screen. We record the frame drawing timestamps and the touch event coordinates in the third step as shown in Figure 5, where the finger keeps moving and the screen keeps showing a moving icon. Then we compute the engineering values of T1, P1, T2, and P2 as below:
- T1 = Time when Frame F1 is drawn by SurfaceFlinger
- P1 = Position value of the touch event at time T1
- T2 = Time of the frame in which the icon's position is P1
- P2 = Position value of the touch event at time T2
Figure 5. Construct a workload for drag lag distance optimization.
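The recording in the third step can be sketched as a simplified state machine. This is an illustrative sketch, not the actual AWS workload code; the event and action types below are stand-ins for Android's MotionEvent.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the touch-handling state machine in Figure 5:
// record every touch sample while a drag is in progress, so the metric
// computation can later look up the positions at times T1 and T2.
public class DragRecorder {
    enum Action { DOWN, MOVE, UP }   // stand-in for MotionEvent actions

    static class TouchEvent {
        final Action action;
        final long timeMs;
        final float x, y;
        TouchEvent(Action action, long timeMs, float x, float y) {
            this.action = action; this.timeMs = timeMs; this.x = x; this.y = y;
        }
    }

    final List<TouchEvent> moveSamples = new ArrayList<>();
    boolean dragging = false;

    // Analogous to onTouch(): drive the state machine from the finger
    // pressing down, through the moves, to the final touch release.
    void onTouch(TouchEvent e) {
        switch (e.action) {
            case DOWN: dragging = true; break;
            case MOVE: if (dragging) moveSamples.add(e); break;
            case UP:   dragging = false; break;
        }
    }
}
```

In the real workload these samples are correlated with the frame drawing timestamps to obtain T1, P1, T2, and P2 as listed above.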
In the workload implementation, the finger move is a continuous operation, so there is a lag distance for every finger position. We use the maximal lag distance as the reported metric result.
The fourth step is to analyze and optimize the drag lag distance, possibly with tool assistance. Figure 6 below clearly shows the root cause of the drag lag and gives a lag distance computation at one time point.
Figure 6. Analysis of the drag lag distance
The execution time chart in Figure 6 is generated by Android UXtune, the toolkit we developed for UX analysis. The comments and arrows on the chart were added by the author to help the analysis. The three threads marked with stars are the most relevant. They are the drawing thread (SurfaceFlinger), the event thread (InputDispatcher), and the application thread that processes the event. The execution path of one touch event (here Event M-1) is marked with orange arrows.
Following the orange arrows, Event M-1 with coordinates (850, 542.8) is dispatched from the event thread to the application thread. It waits in the application's message queue for handling. The application thread takes it out of the queue, and computes the frame content in its surface buffer for this event position. Then the application thread passes the surface buffer to the drawing thread. The drawing thread loads the buffer and composites it with other surfaces of the system, and then posts to the frame buffer of the display.
By the time Event M-1 has been fully processed and its frame drawn on the display, the event thread has already dispatched Event M+1 with coordinates (852, 404.6), which is the finger's current position. We can deduce the drag lag distance at this moment as 542.8 - 404.6 = 138.2 pixels. Note that the touch event timestamp in the event thread is not the exact time of the finger touching the screen; there can be up to a few milliseconds' difference, but it does not impact the drag lag distance analysis and optimization.
In order to optimize the drag coherence user experience, the most obvious approach is to reduce the time spent in the event processing path, including the time in the application thread and the drawing thread, so that the time between the event dispatch and its final drawing is as small as possible. Since some processing time is mandatory for each event, this approach cannot completely eliminate the drag lag distance.
The other, complementary optimization approach is more intelligent. The application tries to predict where the finger will be when the frame becomes visible, then draws the icon directly at that position, even though the finger is not yet there when the application starts to draw. The idea of the finger-position prediction algorithm is as follows:
The application first computes the move speed of the finger (Speed_finger) based on the touch events it receives. Then the application computes how far the finger can move (Distance_finger) during the period of one frame's processing (Time_frame), i.e., from the beginning of one frame's drawing to the beginning of the next frame's drawing. The application predicts the finger position (NextPosition_finger) at the time when the next frame becomes visible: the current position (CurrPosition_finger) plus the move distance during one frame's drawing (Distance_finger). When the application draws the next frame, it draws the icon at the predicted finger position (NextPosition_icon). The key formulae used in the computation are given below.
Time_frame = 1 / FPS
Distance_finger = Speed_finger × Time_frame
NextPosition_finger = CurrPosition_finger + Distance_finger
NextPosition_icon = NextPosition_finger
We have successfully developed the prediction algorithm for the Android system on Intel® platforms. Note that some throttling mechanism should be added to prevent the icon from surpassing the finger.
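As an illustration of these formulae (not the actual Intel implementation), the prediction with a simple throttling clamp can be sketched as below; the names and the clamp parameter maxLeadPx are our assumptions.

```java
// Illustrative sketch of the finger-position prediction described above,
// with a simple clamp so the predicted icon never leads the latest known
// finger position by more than maxLeadPx.
public class DragPredictor {
    // currPositionPx: latest finger coordinate (CurrPosition_finger);
    // speedPxPerMs:   finger speed estimated from recent touch events;
    // fps:            current drawing rate, so Time_frame = 1000 / fps ms;
    // maxLeadPx:      throttle on how far the icon may run ahead.
    static float predictIconPosition(float currPositionPx, float speedPxPerMs,
                                     double fps, float maxLeadPx) {
        double frameTimeMs = 1000.0 / fps;             // Time_frame
        double distance = speedPxPerMs * frameTimeMs;  // Distance_finger
        distance = Math.min(distance, maxLeadPx);      // throttling clamp
        return (float) (currPositionPx + distance);    // NextPosition_icon
    }
}
```

For example, at 50 fps one frame takes 20 ms, so a finger moving at 1 pixel/ms has the icon drawn 20 pixels ahead of its current position; a sudden speed spike is bounded by the clamp instead of overshooting the finger.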
Technical Factors That Impact User Interactions
The user experience of an Android device can be impacted by various technical factors. We list the most important ones that we have experienced in our software engineering work. This list only covers the technical design part of user experience; it does not include the feature part, such as hardware and software features that can also largely impact the user experience.
Some online public websites have useful information on user interactions and experience.
The author thanks his colleagues Greg Zhu and Ke Chen for their great support in developing the methodology for Android user experience optimizations.
About the Author
Xiao-Feng Li is a software architect in the System Optimization Technology Center of the Software and Services Group of Intel Corporation. Xiao-Feng has extensive experience in parallel software design and runtime technologies. Before joining Intel in 2001, Xiao-Feng was a manager at Nokia Research Center. He enjoys ice-skating and Chinese calligraphy in his leisure time.