It is great to have this forum. I'm new to ISA extensions and would appreciate your insight :)
My code snippet, which performs a convolution, is attached as a figure. Here is what confuses me:
Assigning the computed result to the image buffer takes a surprisingly long time. The computation with the extension instructions (lines 512~544) takes only about 7~8 ms, but the assignment (line 548) takes about 25~26 ms.
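Since the real code is only in the attached figure, here is a minimal C sketch of how the two phases might be timed separately. Everything in it is an assumption for illustration: `compute_pixel`, `convolve_sum`, `convolve_store`, and `time_phases` are hypothetical names, the arithmetic is a placeholder rather than the extension instructions, and `clock()` stands in for whatever timer the target provides. The compute-only phase returns a checksum so that its result is actually used; a compiler is otherwise free to drop or defer work whose result is never read, which can make the compute step look cheaper than it is.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for the per-pixel convolution work
 * (placeholder arithmetic, not the real extension instructions). */
static uint8_t compute_pixel(const uint8_t *src, size_t n, size_t i) {
    return (uint8_t)((3u * src[i] + src[(i + 1) % n]) / 4u);
}

/* Phase A: compute only. The checksum keeps the results "live"
 * so the work cannot be optimized away. */
static uint32_t convolve_sum(const uint8_t *src, size_t n) {
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += compute_pixel(src, n, i);
    }
    return sum;
}

/* Phase B: compute and assign each result to the image buffer. */
static void convolve_store(const uint8_t *src, uint8_t *dst, size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = compute_pixel(src, n, i);
    }
}

/* Convert a clock() interval to milliseconds. */
static double ms_between(clock_t start, clock_t end) {
    return 1000.0 * (double)(end - start) / CLOCKS_PER_SEC;
}

/* Time both phases and print the results. */
static void time_phases(const uint8_t *src, uint8_t *dst, size_t n) {
    clock_t t0 = clock();
    uint32_t sum = convolve_sum(src, n);
    clock_t t1 = clock();
    convolve_store(src, dst, n);
    clock_t t2 = clock();

    printf("compute only:    %.3f ms (checksum %u)\n",
           ms_between(t0, t1), (unsigned)sum);
    printf("compute + store: %.3f ms\n", ms_between(t1, t2));
}
```

Calling `time_phases(src, image, n)` on the real buffers would show whether the gap comes from the stores themselves or from the compute work only being carried out once its result is needed.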