Aggregation into function groups enables you to decide on what level of detail to look at the threads or thread groups' activity. In many cases it might be enough to see that a code spends some percent of its time in MPI without knowing in which particular function. In some cases optimizing the serial parts of the program might seem more rewarding than optimizing the communication structure.
However, if the fraction spent in MPI exceeds the expectation, then it is interesting to know in which particular MPI call the time was spent. Function grouping allows exactly this shift in perspective by ungrouping the function group MPI.
While the argumentation given in Thread Aggregation for having nested thread groups may not be that compelling, the reason for having nested function groups comes quite clear as soon as there occur nested modules, classes and/or name spaces.
Provided that there are adequate function groups, it is also much easier to categorize code by library or by author. In this way, it is possible to concentrate precisely on the code that is considered tunable while code that is controlled by third parties is aggregated into coarse categories.
Selecting a function group generally results in displaying the information for the group's children. That is the reason why single functions cannot be selected for aggregation.
In the case of timelines, certain events may not be visible at all times. It does not necessarily mean that they are not there. It can happen because the finer your grouping is, the less time is spent in each individual function/group. On aggregating over processes in a time interval where some processes are idle, nothing is displayed because of the idle state of the processes. Zooming in helps to see these events better. For a better overview, check one of the corresponding profiles. If the timeline and the profile seem to contradict, then the information from the profile is more precise.