Why is strcat quicker than ippsConcat_8u?


I have a small question about IPP speedup.

I thought that IPP was supposed to accelerate computation. But when I compare the speed of ippsConcat_8u with the classic strcat, strcat is just as fast or even faster.
Maybe I misunderstood the principle, or my source code is bad.

Are there any speed comparisons between IPP functions and 'identical' non-IPP functions?

Maybe string operations are not suitable for demonstrating a speedup, or I am doing it completely wrong...

Thanks for any advice

(Sorry for my English (Google translate is better than me :-) )

Source code and tech details are included below

I am running on openSUSE 11.2 (x86_64)
I have an AMD Athlon 64 X2 Dual Core Processor 5200+ (is this the main problem?)
I am using ipp/

CpuType = 42
CpuFeatures = 15

Here is the source:
#include "ipp.h"
#include "ippcore.h"
#include "ipps.h"
#include "ippch.h"
#include <stdio.h>   /* printf */
#include <string.h>  /* strcat, strlen */
#include <omp.h>     /* omp_get_wtime */

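IppStatus concat_ipp( void ) {
    int i;

    /* Zero-initialized, so strlen() still finds a terminator after each append. */
    Ipp8u string[301] = "";
    Ipp8u suffix[4] = "100";

    for (i = 0; i < 100; i++)
        ippsConcat_8u(string, (int)strlen((const char*)string),
                      suffix, (int)strlen((const char*)suffix), string);

    return ippStsNoErr;
}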


IppStatus strcat_normal( void ) {
    int i;

    char string[301] = "";
    char suffix[4] = "100";

    for (i = 0; i < 100; i++)
        strcat(string, suffix);

    /* The original returned an uninitialized status variable. */
    return ippStsNoErr;
}


int main()
{
    double wtime;

    wtime = omp_get_wtime ();
    strcat_normal();
    wtime = omp_get_wtime () - wtime;
    printf("Time strcat: \t\t%fs\n",wtime );

    wtime = omp_get_wtime ();
    concat_ipp();
    wtime = omp_get_wtime () - wtime;
    printf("Time ippsConcat_8u: \t%fs\n",wtime );

    return 0;
}


ipp_lib_patch = /opt/intel/ipp/
ipp_static = /opt/intel/ipp/

strings: strings.o
	gcc -o strings strings.o -I $(ipp_lib_patch)/include -L -ltbb -L $(ipp_lib_patch)/sharedlib -lippimx -lippsmx -liomp5 -lpthread -lippchmx -lippcoreem64t

strings.o: strings.c
	gcc -c strings.c -I $(ipp_lib_patch)/include -I $(ipp_static)

Time strcat: 0.000008s
Time ippsConcat_8u: 0.000032s



From your linker options I see that you link with the generic C code implementation of IPP (the MX libraries). That basically means that no SIMD instructions are used in the IPP implementation. I would recommend linking with the IPP dispatcher libraries to allow IPP to select the best code. However, I'm not sure whether the processor you are running on supports the SSE4 or a later instruction set.

You should also take into account that the IPP concatenation function is implemented as a simple call to the IPP copy function. This adds some call overhead, which may diminish the performance gain if the strings are not big enough.
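
One way to see how much of the measured time is per-call overhead is to average over many repetitions, since a single run of 100 short appends finishes in a few microseconds and is hard to time reliably. Below is a minimal sketch in plain C, without IPP. The names `seconds_per_call`, `strcat_workload`, `workload_calls` and `sink` are hypothetical helpers made up for this illustration, and `clock()` is used here instead of `omp_get_wtime()` only to keep the sketch free of OpenMP.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

unsigned long workload_calls = 0;  /* counts workload invocations */
volatile char sink;                /* keeps the compiler from discarding the work */

/* Same work as strcat_normal() in the question: 100 appends of "100". */
void strcat_workload(void)
{
    char string[301] = "";
    int i;

    for (i = 0; i < 100; i++)
        strcat(string, "100");

    sink = string[299];            /* last character written */
    workload_calls++;
}

/* Average CPU time in seconds for one call of fn, measured over reps calls. */
double seconds_per_call(void (*fn)(void), long reps)
{
    clock_t start, stop;
    long i;

    assert(fn != NULL && reps > 0);

    start = clock();
    for (i = 0; i < reps; i++)
        fn();
    stop = clock();

    return ((double)(stop - start) / CLOCKS_PER_SEC) / (double)reps;
}
```

Calling `seconds_per_call(strcat_workload, 100000)` and doing the same for an IPP-based workload would give averaged figures that are easier to compare than two single measurements. Repeating the comparison with much longer strings is a way to check whether the call overhead stops dominating.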

Below is pseudocode for the ippsConcat function.

IppStatus ippsConcat_8u(const Ipp8u* pSrc1, int len1,
                        const Ipp8u* pSrc2, int len2,
                        Ipp8u* pDst)
{
    ippsCopy_8u(pSrc1, pDst, len1);
    ippsCopy_8u(pSrc2, pDst + len1, len2);
    return ippStsNoErr;
}
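
For comparison, the same two-copy structure can be written with the standard library alone. This is only a sketch, not IPP code: `concat_bytes` is a hypothetical name, and it returns the new length, which the IPP function does not do. `memmove` is used for the first copy on the assumption that the destination may be the same buffer as the first source, as in the in-place call from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies src1 then src2 into dst and returns the combined length.
   Like the pseudocode above, it writes no terminating NUL. */
size_t concat_bytes(const unsigned char *src1, size_t len1,
                    const unsigned char *src2, size_t len2,
                    unsigned char *dst)
{
    assert(src1 != NULL && src2 != NULL && dst != NULL);

    memmove(dst, src1, len1);        /* safe when dst == src1 */
    memcpy(dst + len1, src2, len2);  /* src2 must not overlap dst + len1 */

    return len1 + len2;
}
```

Because the caller passes the lengths in, a loop that keeps the returned length can append repeatedly without calling strlen on the growing buffer each time, which the test program in the question does on every iteration.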

