The scheduler works best with tree-structured task graphs, because that is where the strategy of "breadth-first theft and depth-first work" applies very well. Also, tree-structured task graphs allow fast creation of many tasks. For example, if a master task tries to create N children directly, it will take O(N) steps. But with tree structured forking, it takes only O(lg(N)) steps.
Often domains are not obviously tree structured, but you can easily map them to trees. For example, parallel_for (in tbb/parallel_for) works over an iteration space, such as a sequence of integers. Template function parallel_for uses that definition to recursively map the iteration space onto a binary tree.