I am wondering whether future versions will incorporate the mesh compression techniques described in your paper "Memory Efficient Ray Tracing with Hierarchical Mesh Quantization." Also, I have compared Embree to the Intel IPP Ray Tracing KD-Tree, both implemented in my ray tracer. For primary rays (roughly 1 million rays) against an aircraft model with 100,000 triangles, the two approaches gave very similar timings (roughly 0.1 seconds). Hardly a comprehensive comparison, but interesting.