(ippiMalloc_XXX | malloc) & ippiFilterGauss*

Hello.

Do I need to allocate memory for arrays only with ippiMalloc (which allocates 32-byte-aligned memory), or not?

Does using other memory allocation functions (malloc, ...) affect performance?

Code:

IppStatus GaussFilter (const Ipp32f* pSrc, const int nWidth, const int nHeight, const Ipp32f fSigma, Ipp32f** ppDst)
{
  const int nKernelSize = 7;
  IppiSize tWholePic = {nWidth, nHeight};

  // The destination is returned through ppDst so that the caller can see it.
  int nStepBytes = 0;
  *ppDst = ippiMalloc_32f_C1 (nWidth, nHeight, &nStepBytes);

  // Query the work buffer size first, then allocate the buffer.
  int nBorderBufferSize = 0;
  ippiFilterGaussGetBufferSize_32f_C1R (tWholePic, nKernelSize, &nBorderBufferSize);
  Ipp8u* pBorderBuffer = ippsMalloc_8u (nBorderBufferSize);

  IppStatus tStatus = ippiFilterGaussBorder_32f_C1R (pSrc, nWidth * sizeof (Ipp32f)
                                               , *ppDst, nWidth * sizeof (Ipp32f)  // Should I use this one or the nStepBytes returned by ippiMalloc?
                                               , tWholePic
                                               , nKernelSize, fSigma, ippBorderRepl, 0.
                                               , pBorderBuffer);

  ippsFree (pBorderBuffer);
  return tStatus;
}

void tmain(...)
{
  const int nWidth = 12, nHeight = 15;
  Ipp32f fSigma = 1.;
  Ipp32f pSrc[nWidth * nHeight];
  Ipp32f* pDst = NULL;

  // Initialize pSrc

  GaussFilter (pSrc, nWidth, nHeight, fSigma, &pDst);

  // Do something with pDst.

  if (pDst != NULL)
    ippiFree (pDst);
}

Regards,

Mark

 


Hello,

 

Other allocation functions such as malloc also work. ippsMalloc actually calls the system malloc function and then aligns the returned memory to a 32-byte or 64-byte boundary. From the performance point of view, it is better if the input data address is aligned to 32 or 64 bytes (on machines that support AVX instructions).

 

Thanks,
Chao
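
To see whether a given buffer actually sits on such a boundary, the address can be tested directly. The following is only a minimal sketch in plain C; `is_aligned` is a hypothetical helper name, not an IPP function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of IPP): returns 1 if p is aligned to
   `alignment` bytes, 0 otherwise. `alignment` must be a power of two. */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

For example, `is_aligned(pSrc, 32)` would tell whether a buffer obtained from plain malloc happens to start on a 32-byte boundary.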

Hi,

In the source code, where you ask which step to use for pDst, you need to use nStepBytes, because the step returned by ippiMalloc is not always equal to nWidth * sizeof (Ipp32f). Otherwise, you risk losing the benefits of memory alignment.

Regards,
Sergey
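
The point about the step can be illustrated without IPP at all. Below is a minimal sketch in plain C, assuming a float image whose rows are `stepBytes` apart; `row_ptr` and `fill_image` are hypothetical names chosen for this illustration. Rows are located by byte offset, never by `y * width`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pointer to the first pixel of row y in a float image
   whose consecutive rows start stepBytes apart. */
float *row_ptr(float *base, int stepBytes, int y)
{
    assert(base != NULL && y >= 0);
    assert(stepBytes > 0 && (size_t)stepBytes % sizeof(float) == 0);
    return (float *)((unsigned char *)base + (size_t)y * (size_t)stepBytes);
}

/* Writes `value` to the width x height visible pixels only; any padding
   bytes at the end of each row are left untouched. */
void fill_image(float *base, int stepBytes, int width, int height, float value)
{
    for (int y = 0; y < height; ++y) {
        float *row = row_ptr(base, stepBytes, y);
        for (int x = 0; x < width; ++x)
            row[x] = value;
    }
}
```

With a 12-pixel-wide image and a 64-byte step, row 1 starts 16 floats after row 0, not 12.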


Thanks.

 

regards,

Mark.

Hello, guys.

I have one more question related to the topic above. :)

If I use aligned memory, how can I tell the memory holding real data from the trash memory (allocated to reach the 32/64-byte boundary)? Are there helper functions to distinguish the necessary memory from the trash memory? Do other functions know about the trash memory? It seems that this is handled by the nStepBytes parameter, isn't it?

const int nHeight = 15, nWidth = 8;
IppiSize tsWholePic = {nWidth, nHeight};

Ipp32f pSrc[nHeight * nWidth];
ippiSet_32f_C1R (2., pSrc, nWidth * sizeof (Ipp32f), tsWholePic);

int nDstStepBytes = 0;
Ipp32f* pDst = ippiMalloc_32f_C1 (nWidth, nHeight, &nDstStepBytes); // Because of memory alignment, pDst contains trash memory parts. See the picture.
ippiCopy_32f_C1R (pSrc, nWidth * sizeof (Ipp32f), pDst, nDstStepBytes, tsWholePic); // Is this copy correct? The step bytes differ for pSrc and pDst.

Regards,

Mark

 

 

Attachment: MemAllocatingSheme._1.jpg (9.74 KB)

Mark,

That's correct. Any "step_bytes" parameter in an IPP image processing function defines how many bytes to add to the beginning of one image row to reach the beginning of the next image row. So "nWidth*sizeof(Ipp32f)" and "nDstStepBytes" are both correct as the src and dst steps, respectively.

Regards,
Sergey
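
A copy between two buffers with different steps can be sketched in plain C as follows. This is not how ippiCopy_32f_C1R is implemented; `copy_rows_32f` is a hypothetical function that only illustrates the step semantics described above: each pointer advances by its own step, and only the visible pixels of a row are copied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration: copies a width x height float image from a
   buffer with row step srcStepBytes to one with row step dstStepBytes. */
void copy_rows_32f(const float *src, int srcStepBytes,
                   float *dst, int dstStepBytes,
                   int width, int height)
{
    const size_t rowBytes = (size_t)width * sizeof(float);
    const unsigned char *s = (const unsigned char *)src;
    unsigned char *d = (unsigned char *)dst;

    /* Each step must be at least as long as one row of visible pixels. */
    assert(srcStepBytes > 0 && (size_t)srcStepBytes >= rowBytes);
    assert(dstStepBytes > 0 && (size_t)dstStepBytes >= rowBytes);

    for (int y = 0; y < height; ++y) {
        memcpy(d, s, rowBytes);  /* padding in dst is never written */
        s += srcStepBytes;
        d += dstStepBytes;
    }
}
```

The source can be tightly packed (step equal to `width * sizeof(float)`) while the destination is padded, which matches the pSrc/pDst situation in the question.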


I see. Payment for speed and comfort. :)

Merry Christmas,

Thanks a lot,

Mark

hello

I see. Thanks a lot.

Merry Christmas, guys (yesterday I couldn't post to the forum; something happened with the site).

Regards,

Mark

Ok, thanks a lot.

regards,

Mark

Hello,

As I understand it, in memory allocated with ippiMalloc every image row is aligned to a 32/64-byte boundary, so trash bytes are added to the end of the previous row. The situation seems to be as follows: for static arrays and dynamic arrays allocated with malloc, I can use nWidth * sizeof (Type_Of_Array). For dynamic arrays allocated with ippiMalloc, I have to use nStepBytes.

Layout of an array allocated with ippiMalloc:

xxxx1............a............b............c............xxxxxxxxxx2............a............b............c............ixxxxxxxxxx

1, 2 - addresses in memory aligned to 32/64 bytes

xxxxx - trash memory, added for alignment

2 - 1 = nStepBytes,

&c - 1 = nWidth * sizeof(Type_Of_Array)

regards,

Mark

Hello,

Thanks a lot.

Regards,

Mark

Mark,

There are aligned mallocs in various OSes (_aligned_malloc, posix_memalign and others), but they align only the very first byte of the allocated memory, whereas in image processing the beginning of each image line should be aligned for better performance.

Regards,
Sergey
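
One way to get aligned row starts from a generic aligned allocator is to pad the row step up to a multiple of the alignment. The sketch below assumes a C11 runtime that provides `aligned_alloc`; `round_up` and `alloc_image_32f` are hypothetical names, and the step that ippiMalloc actually returns may differ from this simple rounding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Rounds n up to the next multiple of `alignment` (a power of two). */
size_t round_up(size_t n, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (n + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical allocator: returns a float image whose every row starts on an
   `alignment`-byte boundary and stores the row step in *pStepBytes.
   Returns NULL on failure. Release the memory with free(). */
float *alloc_image_32f(int width, int height, size_t alignment, int *pStepBytes)
{
    assert(width > 0 && height > 0 && pStepBytes != NULL);

    /* Padded step: a multiple of the alignment, so if row 0 is aligned,
       every following row is aligned too. */
    size_t step = round_up((size_t)width * sizeof(float), alignment);

    /* step * height is a multiple of alignment, as aligned_alloc requires. */
    float *p = aligned_alloc(alignment, step * (size_t)height);
    if (p != NULL)
        *pStepBytes = (int)step;
    return p;
}
```

For a 12 x 15 image with 32-byte alignment, a row holds 48 bytes of pixels and the step becomes 64 bytes, so 16 bytes per row are padding.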


I see. I have made a lot of tests and discovered this feature of ippiMalloc. :) 
 

Thanks a lot.

best regards,

Mark.
 
