After speaking with one of the MKL developers recently, I was wondering whether it would be worthwhile to add a routine to MKL for computing triangular-triangular matrix products. Upper (or lower) triangular matrices are closed under multiplication, so the product of two of them is again triangular and can be computed efficiently by performing only the flops needed for the nonzero triangle.
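To make the saving concrete, here is a minimal sketch in pure Python, assuming both inputs are upper triangular and stored as lists of lists. The name `triu_matmul` is hypothetical and is not an MKL routine. For n-by-n inputs the loop performs n(n+1)(n+2)/6 multiply-adds, compared with n^3 for a general product.

```python
def triu_matmul(A, B):
    """Multiply two n-by-n upper triangular matrices (lists of lists).

    Only entries on or above the diagonal are read or written.
    Returns (C, madds), where madds counts the multiply-add operations.
    Hypothetical helper for illustration; not an MKL routine.
    """
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    madds = 0
    for i in range(n):
        for j in range(i, n):          # only the upper triangle of C
            s = 0.0
            for k in range(i, j + 1):  # A[i][k] = 0 for k < i, B[k][j] = 0 for k > j
                s += A[i][k] * B[k][j]
                madds += 1
            C[i][j] = s
    return C, madds


A = [[1.0, 2.0, 3.0],
     [0.0, 4.0, 5.0],
     [0.0, 0.0, 6.0]]
B = [[7.0, 8.0, 9.0],
     [0.0, 1.0, 2.0],
     [0.0, 0.0, 3.0]]
C, madds = triu_matmul(A, B)
print(C)      # the strictly lower triangle stays zero
print(madds)  # 10 multiply-adds for n = 3, versus 27 for a general product
```

An optimized routine would of course block the loops and use vectorized kernels; the sketch only shows which entries need to be touched.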
In particular, I work with matrix functions, where we often need powers of the Schur factor of a matrix, which is triangular. Such a routine would help anyone wishing to compute a polynomial or rational function of a matrix, and it would be used extensively when computing the logarithm, powers, and trigonometric functions of a matrix (see http://eprints.ma.man.ac.uk/2431/01/covered/MIMS_ep2016_3.pdf for a list of software that could potentially benefit from such a specialized routine). These algorithms are used in various applications, including the solution of PDEs and network analysis.
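As an illustration of the polynomial case, the sketch below evaluates p(T) for an upper triangular T with Horner's rule, so every step is a triangular-triangular product. It is pure Python under the assumption that T is stored as a list of lists; the names `triu_matmul` and `triu_polyval` are hypothetical, and this is not how any particular matrix function library is implemented.

```python
def triu_matmul(A, B):
    """Product of two upper triangular matrices, touching only the upper triangle."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            C[i][j] = sum(A[i][k] * B[k][j] for k in range(i, j + 1))
    return C


def triu_polyval(coeffs, T):
    """Evaluate p(T) = coeffs[0]*I + coeffs[1]*T + ... + coeffs[d]*T^d.

    Uses Horner's rule: P <- P*T + c*I, starting from the leading coefficient.
    Hypothetical helper for illustration only.
    """
    n = len(T)
    # Start with P = coeffs[d] * I.
    P = [[coeffs[-1] if i == j else 0.0 for j in range(n)] for i in range(n)]
    for c in reversed(coeffs[:-1]):
        P = triu_matmul(P, T)   # stays upper triangular
        for i in range(n):
            P[i][i] += c        # add c * I
    return P


T = [[2.0, 1.0],
     [0.0, 3.0]]
# p(x) = 1 + 2x + x^2 = (1 + x)^2
P = triu_polyval([1.0, 2.0, 1.0], T)
print(P)
```

A degree-d polynomial needs d - 1 nontrivial triangular products this way (the first Horner step multiplies a scaled identity by T), which is where a dedicated routine would pay off.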
This new routine could perhaps have a calling sequence similar to <?>GEMMT (https://software.intel.com/en-us/node/590135), since that routine targets a similar problem: it computes a general matrix product but updates only the upper or lower triangular part of the result.
I am sure there are plenty of other applications I don't currently know about that would benefit greatly from this functionality. If you know of any, please leave a comment so that the MKL developers receive some feedback and can consider implementing this extension.