Hello Westminster,
You have already taken care of almost everything the filter needs: image size, rowStrides, image border, components.
About the failure in this function call:
// apply filter
retval = ippiConvFull_32f_C3R(
&scratch.pixels[scratch.StepBytes/sizeof(Ipp32f) * border + border * 3], scratch.StepBytes, dims,//scratch.dims,
krnl.pixels, krnl.StepBytes, krnl.dims,
input32f.pixels, input32f.StepBytes);
I guess it is because the size of your filter result image (input32f) is not what the function expects.
Let's assume your original image pImageP is 100x20, so input32f is a 100x20 32f image.
Your kernel image is a 3x3 32f image.
The scratch image is your temporary image with a border of 6 on each side, so it is (100+2x6) x (20+2x6) = 112x32, also 32f.
With your input (passing dims here is right, rather than scratch.dims), the ConvFull result size is
Mh = Mf + Mg - 1 and Nh = Nf + Ng - 1, so you get (100+3-1) x (20+3-1) = 102x22.
But the image input32f is only 100x20, so the function writes past the end of the destination buffer and the memory access is invalid.
If you want the result to be 100x20, you may consider changing the dims parameter to roiDim = 98x18.
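To make the size arithmetic above easy to check, here is a small sketch in plain C. It does not call IPP at all: Size2D is only a hypothetical stand-in for IppiSize, and the two helper names are made up for this example. It just applies Mh = Mf + Mg - 1 in both directions.

```c
#include <assert.h>

/* Hypothetical stand-in for IppiSize so the sketch builds without IPP. */
typedef struct { int width; int height; } Size2D;

/* Size of a full convolution result: Mh = Mf + Mg - 1, Nh = Nf + Ng - 1. */
Size2D conv_full_dst_size(Size2D src, Size2D kernel)
{
    assert(src.width > 0 && src.height > 0);
    assert(kernel.width > 0 && kernel.height > 0);

    Size2D dst = { src.width + kernel.width - 1,
                   src.height + kernel.height - 1 };
    return dst;
}

/* Inverse: the source ROI whose full convolution result is exactly dst. */
Size2D conv_full_src_roi_for(Size2D dst, Size2D kernel)
{
    assert(dst.width >= kernel.width && dst.height >= kernel.height);

    Size2D roi = { dst.width - kernel.width + 1,
                   dst.height - kernel.height + 1 };
    return roi;
}
```

With a 100x20 source and a 3x3 kernel, conv_full_dst_size gives 102x22, and conv_full_src_roi_for with a 100x20 destination gives 98x18, which matches the numbers above.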
But why do you prefer to use ippiConvFull to implement the filter? Besides the function ippiFilterSharpen, there are general filter functions such as ippiFilter_32f_C3R()/ippiFilter32f_8u_C3R() and so on. These functions let you set any kernel, kernel size, and dst size. You may check them in the IPP manual ippiman.pdf and see if they meet your needs.
Here is a KB article about ippiFilter for your reference:
http://software.intel.com/en-us/articles/intel-integrated-performance-primitives-intel-ipp-processing-an-image-from-edge-to-edge/
Regards,
Ying