I am trying to decompress and recompress a JPEG file with IPP, using the JPEG library that is part of the Linux samples on the download page. It is about 20% slower than jpeglib on Linux. Where is the problem? Is it possible to speed this up with some JPEG encoding or decoding parameters? I am on a 3.4 GHz machine and the result is 20 JPEGs/s. The JPEG is only 25 KB! I need 15x better performance.
When I build a memory-to-memory example with the fast parameters, jpeglib is 5x faster. How do I do the same in IPP?
IPP 4, Linux, gcc or icc.
jpeg speed
Hi,
It looks like you are using the "generic" (PX) IPP libraries. Could you please check whether that is the case?
Regards,
Vladimir
IPPROOT=/opt/intel_cc_80/bin/ipp40
endif
CC = ./icc
CXX = ./icc
CFLAGS = -Wall -O3 -Ob2 -axN -ip -march=pentium4 -D_REENTRANT -DQT_THREAD_SUPPORT -DNO_DEBUG
CXXFLAGS= -Wall -O3 -Ob2 -axN -ip -march=pentium4 -D_REENTRANT -DQT_THREAD_SUPPORT -DNO_DEBUG
INCPATH = -I$(IPPROOT)/include -I./jpegcodec -I$(QTDIR)/include
LNK = ./icc
LFLAGS =
LIBS = -L$(QTDIR)/lib -lpthread -lm -L$(IPPROOT)/lib -lippjemerged -lippjmerged -lippiemerged -lippimerged -lippsemerged -lippsmerged -lippcore -static
MOC = $(QTDIR)/bin/moc
####### Files
I use the standard IPP libraries from the makefile that is part of the Intel JPEG samples (library) for Linux.
IPP 4.0 for Linux.
Also, what are the TRC1 and TRC0 commands in the Intel encoder sample? Are they logging? If so, I can remove them to gain a few microseconds.
>> TRC1(" id ",id[i]);
>> TRC1(" dc_selector ",m_comp[id[i]]->m_dc_selector);
>> TRC1(" ac_selector ",m_comp[id[i]]->m_ac_selector);
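If TRC0 and TRC1 are trace macros, as they appear to be, a common pattern is to compile them out of release builds instead of deleting the calls by hand. The sketch below is an assumption about how such macros could be defined, not the sample's actual definition; the ENABLE_TRACE switch, trace_sink() and trace_component() are hypothetical names.

```cpp
#include <sstream>

// Hypothetical sink so the trace output can be inspected; the real sample
// may write to a file or to stderr instead.
inline std::ostringstream& trace_sink()
{
    static std::ostringstream sink;
    return sink;
}

#ifdef ENABLE_TRACE
// Tracing on: record a message, or a message plus one value.
#define TRC0(msg)      (trace_sink() << (msg) << '\n')
#define TRC1(msg, val) (trace_sink() << (msg) << (val) << '\n')
#else
// Tracing off: expand to a no-op. The arguments are never evaluated,
// so the call sites cost nothing at run time.
#define TRC0(msg)      ((void)0)
#define TRC1(msg, val) ((void)0)
#endif

// Example call site resembling the lines quoted above.
inline int trace_component(int id, int dc_selector)
{
    (void)dc_selector;  // silences the unused warning when tracing is off
    TRC1(" id ", id);
    TRC1(" dc_selector ", dc_selector);
    return id;
}
```

Built without -DENABLE_TRACE, the calls disappear entirely, so any speed gain from removing them by hand would already be there.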
OK, thanks. Could you also tell me what information you see when you choose the Help > About IPP command from the menu?
Regards,
Vladimir
ippjw7.a
IPP 4.0
It looks optimized for Pentium 4.
I have also attached the JPEG in case you want to run a real test :)
The Linux time command shows 0.050.
When I write a simple app that only calls the djpeg and cjpeg commands with the fast switch, it gives slightly faster results!
And that also includes creating the processes and loading the two commands. Heh. When I use libjpeg from memory to memory, it gives 100+ JPEGs/s.
Yes, IPP is using the correct library. I think the key here is that you are counting process creation time. I think a GUI application loads more slowly than a simple command-line application.
Regards,
Vladimir
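To separate codec time from process start-up, one option is to time the decode and encode calls inside a single process and loop over them many times. This is a minimal sketch using only the C++ standard library; measure_rate() is a hypothetical helper, and dummy_codec_pass() is a stand-in for a decode/encode pair on a buffer that is already in memory.

```cpp
#include <chrono>
#include <cstddef>

// Runs `work` `iterations` times and returns the rate in calls per second.
// File loading and process creation happen outside, so they are not counted.
template <typename Work>
double measure_rate(Work work, std::size_t iterations)
{
    using clock = std::chrono::steady_clock;
    const clock::time_point start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i)
        work();
    const std::chrono::duration<double> elapsed = clock::now() - start;
    // Guard against a zero reading on very coarse clocks.
    const double seconds = elapsed.count() > 0.0 ? elapsed.count() : 1e-9;
    return static_cast<double>(iterations) / seconds;
}

// Stand-in workload so the sketch runs without IPP or libjpeg.
inline unsigned dummy_codec_pass()
{
    static volatile unsigned sink = 0;
    unsigned acc = 0;
    for (unsigned i = 0; i < 10000; ++i)
        acc += i * i;  // unsigned arithmetic wraps, which is well defined
    sink = acc;        // keeps the loop from being optimized away
    return acc;
}
```

Passing a lambda that calls the real decoder and encoder in place of dummy_codec_pass() gives a JPEGs/s figure that can be compared directly between IPP and libjpeg.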
No, my app is a command-line one :) It only loads, decodes/encodes, and saves. Same as libjpeg. Nothing more.
How does IPP compute the DCT in JPEG? With floating point? Maybe libjpeg in fast mode uses a precomputed integer table. But the IPP website says IPP is 300% faster than libjpeg :)
Here is my decode/encode code; it is clean and pasted from the Intel sample for Linux:
It looks like IPP is about 1500% slower than libjpeg :)
#include <stdio.h>
#include "ippdefs.h"
#include "ippcore.h"
#include "ipps.h"
#include "ippi.h"
#include "ippj.h"
#include "encoder.h"
#include "decoder.h"
#include <stdlib.h>
#include <string.h>  // memcpy()
#define BI_RGB 0
#define BI_RLE8 1
#define BI_RLE4 2
#define BI_BITFIELDS 3
Ipp8u* m_imageData = NULL;
int m_width;
int m_height;
int m_nChannels;
int m_lineStep;
JCOLOR m_color;
// Converts RGBA (FlashPix) pixels to BGRA in place, scaling each color by alpha.
void RGBA_FPX_to_BGRA(Ipp8u* pSrc,int width,int height)
{
  int i;
  int j;
  int lineStep;
  Ipp8u r, g, b, a;
  Ipp8u* ptr;
  lineStep = width*4;
  for(i = 0; i < height; i++)
  {
    ptr = pSrc + i*lineStep;
    for(j = 0; j < width; j++)
    {
      r = ptr[0];
      g = ptr[1];
      b = ptr[2];
      a = ptr[3];
      ptr[2] = (Ipp8u)((r*a+1) >> 8);
      ptr[1] = (Ipp8u)((g*a+1) >> 8);
      ptr[0] = (Ipp8u)((b*a+1) >> 8);
      ptr += 4;
    }
  }
  return;
} // RGBA_FPX_to_BGRA()
// Swaps the B and R channels of 4-channel pixels in place; alpha is untouched.
void BGRA_to_RGBA(Ipp8u* pSrc,int width,int height)
{
  int i, j, line_width;
  Ipp8u r, g, b;
  Ipp8u* ptr;
  line_width = width*4;
  for(i = 0; i < height; i++)
  {
    ptr = pSrc + i*line_width;
    for(j = 0; j < width; j++)
    {
      b = ptr[0];
      g = ptr[1];
      r = ptr[2];
      ptr[0] = r;
      ptr[1] = g;
      ptr[2] = b;
      ptr += 4;
    }
  }
  return;
} // BGRA_to_RGBA()
const char* getSamplingStr(JSS sampling)
{
  const char* s;
  switch(sampling)
  {
    case JS_444: s = "444"; break;
    case JS_422: s = "422"; break;
    case JS_411: s = "411"; break;
    default: s = "Other"; break;
  }
  return s;
} // getSamplingStr()
const char* getColorStr(JCOLOR color)
{
  const char* s;
  switch(color)
  {
    case JC_GRAY: s = "Gray"; break;
    case JC_RGB: s = "RGB"; break;
    case JC_BGR: s = "BGR"; break;
    case JC_YCBCR: s = "YCbCr"; break;
    case JC_CMYK: s = "CMYK"; break;
    case JC_YCCK: s = "YCCK"; break;
    default: s = "Unknown"; break;
  }
  return s;
} // getColorStr()
// Decodes a JPEG held in memory into the global m_imageData buffer.
// Returns 1 on success, 0 on failure.
int DecodeJpeg(int size, unsigned char *data)
{
  Ipp8u* buf = (Ipp8u*) data;
  JCOLOR jpeg_color;
  JSS jpeg_sampling;
  int jpeg_nChannels;
  int imageSize;
  JERRCODE jerr;
  IppStatus status;
  CJPEGDecoder decoder;

  jerr = decoder.SetSource(buf,(int)size);
  if(JPEG_OK != jerr)
    return 0;

  jerr = decoder.ReadHeader(
    &m_width,
    &m_height,
    &jpeg_nChannels,
    &jpeg_color,
    &jpeg_sampling);
  if(JPEG_OK != jerr)
    return 0;

  // Choose the output layout from the number of channels in the file.
  switch(jpeg_nChannels)
  {
    case 1:
      m_nChannels = 3;
      m_color = JC_RGB;
      break;
    case 3:
      m_nChannels = 3;
      m_color = JC_BGR;
      break;
    case 4:
      m_nChannels = 4;
      m_color = JC_CMYK;
      break;
    default:
      jpeg_color = JC_UNKNOWN;
      m_color = JC_UNKNOWN;
      m_nChannels = jpeg_nChannels;
      break;
  }

  // Release the previous image before allocating the new one.
  if (m_imageData != NULL)
  {
    ippFree(m_imageData);
    m_imageData = NULL;
  }
  m_lineStep = m_width * m_nChannels;
  imageSize = m_lineStep * m_height;
  m_imageData = (Ipp8u*)ippMalloc(imageSize);
  if(NULL == m_imageData)
    return 0;

  jerr = decoder.SetDestination(
    m_imageData,
    m_width,
    m_lineStep,
    m_height,
    m_nChannels,
    m_color);
  if(JPEG_OK != jerr)
    return 0;

  jerr = decoder.ReadData();
  if(JPEG_OK != jerr) return 0;
  return 1;
}
// Encodes the global m_imageData buffer as JPEG into `destiny`.
// Returns the number of bytes written, or 0 on failure.
int EncodeJpeg(int quality, unsigned char *destiny)
{
  Ipp8u* buf;
  int imageSize = m_width * m_nChannels * m_height;
  // Keep a minimum output buffer size for very small images.
  if(imageSize < 4096) imageSize = 4096;
  buf = (Ipp8u*)ippMalloc(imageSize);
  if (buf == NULL) return 0;

  int jpeg_quality = quality;
  JSS jpeg_sampling = JS_411;
  JCOLOR in_color;
  JCOLOR jpeg_color;
  JERRCODE jerr;
  CJPEGEncoder encoder;

  switch(m_nChannels)
  {
    case 3:
      in_color = m_color;
      jpeg_color = JC_YCBCR;
      break;
    case 4:
      in_color = JC_CMYK;
      jpeg_color = JC_YCCK;
      break;
    default:
      in_color = JC_UNKNOWN;
      jpeg_color = JC_UNKNOWN;
      break;
  }

  encoder.SetSource(m_imageData,m_width,m_height,m_nChannels,m_lineStep,in_color);
  encoder.SetDestination(jpeg_quality,jpeg_sampling,jpeg_color,0);
  if(JC_CMYK == in_color)
    BGRA_to_RGBA(m_imageData,m_width,m_height);

  jerr = encoder.WriteWholeImage(buf,&imageSize);
  if(JPEG_OK != jerr) {
    ippFree(buf);
    return 0;
  }
  memcpy(destiny,(const char*)buf,imageSize);
  ippFree(buf);
  return imageSize;
}
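On the question raised before the listing, whether a DCT uses floating point or integer arithmetic: libjpeg's fast integer DCT replaces each floating-point multiply with an integer multiply and a shift, using constants scaled by 2^8. The sketch below illustrates that one step only. What IPP does internally is not stated in this thread, and fixed_mul() and float_mul() are hypothetical helper names.

```cpp
#include <cstdint>

// Number of fractional bits in the scaled constants (8 in libjpeg's fast
// integer DCT; treat the exact value as an assumption for other codecs).
const int CONST_BITS = 8;

// cos(pi/4) = 0.707106781..., scaled by 2^8 and rounded: 181.
const std::int32_t FIX_0_707106781 = 181;

// Fixed-point multiply: one integer multiply and one shift, no floating point.
// The shift truncates, so the result can be off by one from the exact value.
// Right-shifting a negative value is an arithmetic shift on common compilers.
inline std::int32_t fixed_mul(std::int32_t value, std::int32_t constant)
{
    return (value * constant) >> CONST_BITS;
}

// Floating-point reference for comparison.
inline double float_mul(std::int32_t value)
{
    return value * 0.707106781;
}
```

For an input of 100, the fixed-point version gives 70 where the floating-point product is about 70.71; the fast mode trades this small error for speed.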
Thanks again, we will check this issue. The numbers on the IPP website come from a performance comparison of the IPP JPEG codec under Windows against IJG v6b. According to our tests, the IPP JPEG codec outperforms IJG on Windows.
Regards,
Vladimir
You are linking with the static libraries without specifying the architecture. Try including ipp_w7.h after ipp.h;
otherwise use the dynamic libraries.
This made all the difference for me...
- Fabio
I saw on the IPP website that JPEG decoding performance is 200% better. We use a JPEG decoder in our application, and decoding JPEG at 25 fps takes 25% of the CPU on a 2.0 GHz processor.
If we use the IPP optimized machine code, could we expect CPU usage under 10% in the same environment and under the same conditions?
Note: We are using the C-compiled IJG JPEG library at the moment.
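The expected load can be estimated with simple arithmetic, assuming decode time is the only cost and scales inversely with decoder speed. "200% better" is ambiguous: it may mean 2x or 3x as fast, so both are worth evaluating. cpu_load_after_speedup() and ms_per_frame() are hypothetical helpers, not part of any library.

```cpp
// CPU share after the decoder becomes `speedup` times faster,
// assuming the frame rate and everything else stay the same.
inline double cpu_load_after_speedup(double current_load_percent, double speedup)
{
    return current_load_percent / speedup;
}

// Average decode time per frame implied by a CPU share at a given frame rate.
inline double ms_per_frame(double load_percent, double fps)
{
    return (load_percent / 100.0) * 1000.0 / fps;
}
```

With the figures from the post, 25% at 25 fps is 10 ms per frame. A 2x speedup gives 12.5% and a 3x speedup about 8.3%, so only the 3x reading gets under 10%.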
Hi,
We have done various tests with JPEG-IPP, IJL, IJL optimized with IPP, and the previous IJL 1.51 version. The result is obvious: JPEG-IPP is the best solution in terms of speed and CPU usage. The tests were run at:
1024 x 768 x 24 bits at 40 fps
640 x 480 x 24 bits at 94 fps
On our web site you can find test applications to compute what we call jpegBandwidth.
SRSR
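For scale, the two test points can be converted to uncompressed pixel throughput. This assumes that 24 bits means 3 bytes per pixel and that the bandwidth figure refers to raw image data; the post does not define it, and raw_bytes_per_second() is a hypothetical helper.

```cpp
#include <cstdint>

// Uncompressed image data processed per second, in bytes.
inline std::uint64_t raw_bytes_per_second(std::uint64_t width,
                                          std::uint64_t height,
                                          std::uint64_t bits_per_pixel,
                                          std::uint64_t fps)
{
    return width * height * (bits_per_pixel / 8) * fps;
}
```

Under those assumptions, 1024 x 768 at 40 fps is 94,371,840 bytes/s (about 94.4 MB/s) and 640 x 480 at 94 fps is 86,630,400 bytes/s (about 86.6 MB/s).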
Hi,
We have several JPEG implementations, for the following reasons:
1 - IJL-IPP, to simplify migration to IPP for IJL users. This implementation is not always faster than the old IJL because of overhead and a non-optimal implementation.
2 - IJG-IPP, a sample which demonstrates how you can speed up a well-known JPEG implementation, the IJG library (www.ijg.org).
3 - JPEG-IPP, our fastest JPEG codec, because of its optimal implementation. In IPP v4.1 it still has some limitations in functionality, but we continue to extend this sample.
So, my recommendation is to try the JPEG-IPP codec if you need maximum speed and can accept some limitations; hopefully they concern features that are not used frequently. For example, in IPP v4.1 we support only 444, 422 and 411 sampling, we do not support scan-interleaved JPEG images, we do not support 255 color components, and so on.
Note that all these limitations are limitations of the sample code only. You do not need any additional IPP functionality to lift them, and you can extend the sample to support the missing features if you need them right now.
Regards,
Vladimir
Hi,
Could you post a direct link to the results here? I think it would be an interesting resource for many people.
Regards,
Vladimir


