How to Use IPP Functions in Linux Kernel Space

Hi,

I am trying to write a Linux driver module that uses IPP functions, but I have not been able to get the IPP functions working inside the module. I have attached my simple test code and makefile. Could anybody look into it and tell me how to use IPP functions in kernel space correctly?

Thanks in advance,

Paul

Attachments:
main.c (748 bytes)
makefile.txt (653 bytes)

Paul,

What happens? Does it compile? Does it link? Or does it crash when run?

- Chuck

>>...But I failed to use ipp funcs in module...

It should work. Just in case, I'll check your test case (on a Windows platform, in user mode) and report my results as soon as it is done.

Hi Chuck,

It compiles and links successfully. But when I insert the module into the Linux kernel, ippsSum throws an exception. It seems that IPP functions can't directly access memory allocated in kernel space. I also can't allocate memory with ippMalloc/ippFree, because these functions use the C runtime library, which is not loaded by the kernel.

Paul

Quote:

Chuck De Sylva (Intel) wrote:

Paul,

What happens? Does it compile? Does it link? Or does it crash when run?

- Chuck

It works. If the source image is initialized with 1s, then:
...
[ Output ]
...
Image:
1 1 1 1 1 1 1 1 1 1 1 1
Sum: 12
...
[ Core processing ]
...
status = ippsSum_16s_Sfs( ( Ipp16s * )&iImg[0], ( int )iLen, ( Ipp16s * )&iSum, 0 );
if( status == ippStsNoErr )
printf( "Sum: %d\n", iSum );
else
printf( "Error: %d\n", status );
...

Hi Sergey,

Thanks for your reply. I ran the same experiment in user space on Windows and Linux, and it works fine there too. But when I turn it into a driver, it does not work. I don't have much experience with Linux driver development. I just hope somebody on this forum can tell me how to integrate IPP functions with my driver code.

Thanks,

Paul

Quote:

Sergey Kostrov wrote:

It works. If the source image is initialized with 1s, then:
...
[ Output ]
...
Image:
1 1 1 1 1 1 1 1 1 1 1 1
Sum: 12
...
[ Core processing ]
...
status = ippsSum_16s_Sfs( ( Ipp16s * )&iImg[0], ( int )iLen, ( Ipp16s * )&iSum, 0 );
if( status == ippStsNoErr )
printf( "Sum: %d\n", iSum );
else
printf( "Error: %d\n", status );
...

Hi Paul,

>>...But when I insert this module into Linux kernel, ippsSum throw an exception...

Could you provide more technical details about that exception? Is there any message text for it? Also, in your test case I changed:
...
int img[12];
...
to
...
Ipp16s img[12] = { 0 };
...
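One plausible reason for this change (the post does not spell it out) is that the call casts the buffer to `Ipp16s *`, so a buffer declared as `int` is walked in 16-bit steps and only half of each 32-bit element carries the value. The user-space sketch below illustrates that effect with plain C types. `sum_as_16s` and the two buffer helpers are hypothetical names, not IPP functions, and the sketch assumes `int` is 32 bits wide. It affects the computed sum only and would not by itself explain a kernel oops.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not an IPP function): sums the first `len` 16-bit
 * elements found at `buf`, the way a routine that takes a 16-bit pointer
 * walks memory. memcpy keeps the reinterpretation well defined in C. */
int32_t sum_as_16s(const void *buf, int len)
{
    const unsigned char *p = buf;
    int32_t sum = 0;
    for (int i = 0; i < len; i++) {
        int16_t v;
        memcpy(&v, p + (size_t)i * sizeof v, sizeof v);
        sum += v;
    }
    return sum;
}

/* Twelve ones stored as 32-bit ints, then read as twelve 16-bit values.
 * Each int contributes one half holding 1 and one half holding 0, and
 * only the first six ints are reached. */
int32_t sum_of_int_buffer(void)
{
    int32_t img[12];
    for (int i = 0; i < 12; i++)
        img[i] = 1;
    return sum_as_16s(img, 12);
}

/* Twelve ones stored as 16-bit values: every element is read as intended. */
int32_t sum_of_16s_buffer(void)
{
    int16_t img[12];
    for (int i = 0; i < 12; i++)
        img[i] = 1;
    return sum_as_16s(img, 12);
}
```

With these helpers, the 32-bit buffer yields 6 while the 16-bit buffer yields 12, which matches the "Sum: 12" output reported for the corrected declaration.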

Hi Paul,

Can you implement the IPP code in a user-mode application and have it call the kernel-mode driver? Is this possible on Linux? On Windows, a user-mode app can send IRPs to the driver to request that some work be done on behalf of the user-mode code.

I'd like to bring to the attention of IDZ community members a possible case of impersonation of IT experience.

>>...can you implement IPP code in user mode application and call the kernel mode driver is this possible in Linux? in windows
>>user mode app can send the driver IRP's in order to request some work being done on behalf of user mode code...

Dear Iliya Polak,

As far as I know, you don't work as a software developer, don't use any Intel software, libraries, or SDKs, have never posted any real source code, and have a computer with a failed HDD. Now, do you really think people don't see who you actually are? You comment on almost every topic on IDZ, and I don't understand why a person without real software development experience, especially with Intel software and tools, does that.

Hi Sergey,
Do I need to work as a software developer in order to post on IDZ? Am I not allowed to write software as a hobby? Am I not allowed to share my knowledge with others?

>>>I don't understand why a person without a real software development experience>>>

Am I not allowed to learn and accumulate knowledge?
Please show me which of my posts are, in your judgment, not appropriate for the standards of this forum.

>>>never posted any real source codes>>>

IIRC, a few months ago I sent you a special and elementary functions class written in Java. A few times I have also posted code snippets written in C and Java.

>>...elementary functions class written in Java. Few times a posted a code snippets written in C and Java...

That is not enough, unfortunately; you need to use Intel software in order to be a valuable member of the IDZ community. When a developer has a real problem, he expects a real, practical solution. It must be relevant to the problem, and you need to have experience in that subject area as well.

Iliya, during the last 4 weeks it was a good day if I slept more than 5 hours, because I have too many things on my shoulders. Once again, think about a developer who has a real problem (!) and pressure from the company's management to solve it as fast as possible. In reality it is harsh, and in software development companies some problems can stay unsolved for a long time.

Please respect everybody's time. I've expressed my point of view, and that is it.

Hi Sergey,

Here is the system log from when I insert the module into the Linux kernel. I appreciate your help.

Paul

[  149.074810]  ippSP SSE2 (w7) 7.1.1 (r37466)
[  149.074814] 0 1 2 3 4 5 6 7 8 9 10 11
[  149.074831] BUG: unable to handle kernel NULL pointer dereference at 00000009
[  149.074837] IP: [<f86d8982>] LGLAST2gas_2+0xa/0x1c [ipptest]
[  149.074848] *pde = 00000000
[  149.074851] Oops: 0000 [#1] SMP
[  149.074855] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
[  149.074858] Modules linked in: ipptest(+) binfmt_misc vesafb snd_hda_codec_idt nvidia(P) snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq ppdev snd_timer snd_seq_device dcdbas snd parport_pc psmouse serio_raw i82975x_edac soundcore snd_page_alloc edac_core lp parport usbhid hid ahci libahci tg3
[  149.074892]
[  149.074896] Pid: 1687, comm: insmod Tainted: P            2.6.38-8-generic #42-Ubuntu Dell Inc.                 Precision WorkStation 390    /0DN075
[  149.074903] EIP: 0060:[<f86d8982>] EFLAGS: 00010206 CPU: 0
[  149.074910] EIP is at LGLAST2gas_2+0xa/0x1c [ipptest]
[  149.074913] EAX: 00010000 EBX: 0000000c ECX: 00000003 EDX: 00010000
[  149.074916] ESI: 00000001 EDI: 00010000 EBP: f86db3e8 ESP: ef313ea4
[  149.074919]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[  149.074922] Process insmod (pid: 1687, ti=ef312000 task=ef0dd860 task.ti=ef312000)
[  149.074925] Stack:
[  149.074926]  ef313f04 00000001 00000001 f86d82d0 00000001 0000000b 39343120 3437302e
[  149.074934]  5d343138 01310020 00000000 f86d651a 00000001 0000000b 00000000 00000006
[  149.074942]  00000002 ef313ef0 f86dc040 f86d636b 00000001 0000000b ef313f04 00000000
[  149.074950] Call Trace:
[  149.074958]  [<f86d82d0>] ? w7_ownippsSum_16s32s_Sfs+0x10/0x20 [ipptest]
[  149.074966]  [<f86d651a>] ? w7_ippsSum_16s32s_Sfs+0x2a/0x50 [ipptest]
[  149.074973]  [<f86d636b>] ? w7_ippsSum_16s_Sfs+0x2b/0x80 [ipptest]
[  149.074980]  [<f86d6063>] ? init_module+0x63/0x80 [ipptest]
[  149.074987]  [<c1001255>] ? do_one_initcall+0x35/0x170
[  149.074993]  [<f86d6000>] ? init_module+0x0/0x80 [ipptest]
[  149.074999]  [<c108899b>] ? sys_init_module+0xdb/0x230
[  149.075004]  [<c1125925>] ? sys_close+0x75/0xd0
[  149.075010]  [<c1509bf4>] ? syscall_call+0x7/0xb
[  149.075012] Code: 7c 1f f3 0f 6f 06 f3 0f 6f 4e 10 66 0f f5 c6 83 c6 20 66 0f f5 ce 83 e9 10 66 0f fe e0 66 0f fe e9 83 c1 10 7e 5f 83 e9 08 7c 12 <f3> 0f 6f 06 83 c6 10 66 0f f5 c6 83 e9 08 66 0f fe e0 83 c1 08
[  149.075060] EIP: [<f86d8982>] LGLAST2gas_2+0xa/0x1c [ipptest] SS:ESP 0068:ef313ea4
[  149.075070] CR2: 0000000000000009
[  149.075074] ---[ end trace d3fa9a9bc3cf75ea ]---

Paul,

Do you have 64-bit machine?

I'm not experienced in Linux debugging, but I will try to help you as much as I can. From the call trace, the faulting instruction is at LGLAST2gas_2+0xa/0x1c, and it dereferences a near-NULL pointer (the log reports address 0x00000009). I'm not sure whether, in the Linux kernel, code that dereferences a NULL pointer can be trapped by an exception handler, or whether oops_begin() is simply called to bring the system down. The last call should have been made from the routine w7_ownippsSum_16s32s_Sfs+0x10/0x20 [ipptest]. I think your system is using FPO (frame pointer omission), because EBP points into an executable area while ESP points into stack space. It would be interesting to dump the memory at ESP and try to find the dereferenced pointer.

Paul,

Can you resolve this address: ESP = 0xef313ea4?

>>...Here is the system logs when I insert module into Linux kernel...

Here are a couple of notes:

1.
>>...
>>ipptest-objs := main.o libipps_l.a libippcore_l.a
>>...

You're using the static non-threaded libraries libipps_l.a and libippcore_l.a. Could you try the threaded versions instead? The names of these two libraries are as follows:

libipps_t.a and libippcore_t.a

2. Try to initialize IPP libraries with:

- ippStaticInit() - provides the best available optimization
or
- ippStaticInitCpu() - forces usage of some CPU specific implementation of IPP functions

I see that the w7 Waterfall library (SSE2 instruction set) is being used; make sure that the library is accessible from the kernel layer. That is, check the access-rights attributes and the path to the Waterfall library.

Update: this is what I've found in older IPP docs:
...
The function ippStaticInit should not be used in the driver implementation.
...

Hi Paul,

It is still an open question for me whether IPP functions can be used in a driver on a Linux platform. However, I've found an example (in IPP v3 (!)) of using IPP functions in a driver for the Windows NT and Windows 95 operating systems. So, if it is supported on Windows (or at least was supported), then I expect it to be supported on Linux as well. It would be strange if IPP were not supported in driver code for any OS now.

Please provide updates on your status. Thanks in advance.

Hi Sergey,

I tried linking against the threaded IPP libraries as you suggested, but it still doesn't work. The system log shows the following:

[  893.449624] ipptest: Unknown symbol _GLOBAL_OFFSET_TABLE_ (err 0)

I guess this is because the threaded IPP libraries use the POSIX pthread library internally, which can't be used in kernel space.

Paul

>>...I guess this is because the IPP threaded libraries used POSIX pthread library internally which can't be used in kernel-space...

Any comments from Intel software engineers of IPP team?

>>...It shows a system log as follows.
>>
>>[ 893.449624] ipptest: Unknown symbol _GLOBAL_OFFSET_TABLE_ (err 0)

My quick search shows that it is defined in the libbfd.a library. Could you try to find that library on your system and link against it?

Note: check other libraries as well; there are too many Linux systems.

Here is some information about using IPP in kernel mode on both Windows and Linux:
Link: //software.intel.com/en-us/articles/code-samples-for-intel-integrated-performance-primitives-intel-ipp-library-71

Please check the directory advanced-usage/application/ippsdrv in the zip file.

Hi Paul,

Please look into the IPP User's Guide for Linux. There's something in it about PIC and non-PIC libraries. It looks like the non-PIC libraries are necessary for driver mode; they should not contain offset tables.

Regards,
Sergey 

Hi Sergey,

You are right. Non-PIC libraries are necessary for driver mode. But it still doesn't work for me: errors occur like those in the system log I posted. I also found an official IPP sample for driver mode, "ippsdrv", but I failed to build it on my Linux box; it seems that some floating-point operations can be found in kernel space.

Paul

Quote:

Sergey Khlystov (Intel) wrote:

Hi Paul,

Please look into the IPP User's Guide for Linux. There's something in it about PIC and non-PIC libraries. It looks like the non-PIC libraries are necessary for driver mode; they should not contain offset tables.

Regards,
Sergey 

>>...it seems like that some floating operations can be found in kernel space.

Could you provide more details on these floating operations?

I do not know how it is implemented on Linux, but on Windows a kernel-mode driver is subject to many restrictions. For example, you must save the FPU context and restore it later.
Regarding Linux, here is Linus's answer on when to use floating-point operations in the Linux kernel:
Link: http://lkml.indiana.edu/hypermail/linux/kernel/0405.3/1620.html

Here is a quoted sentence from one of the Linux dev forums regarding floating-point code in a driver:

"Basically you have to compile hardware floating-point capabilities into your module (with -mhard-float) and use two kernel functions kernel_fpu_begin() and kernel_fpu_end()"

Use the following define before including the IPP headers:

#define  IPPAPI( type,name,arg )   extern type name arg __attribute__ ((regparm(0)));

or use -mregparm=3 on the GCC command line. Linux kernel-mode modules use a different ABI, the so-called "fast call" convention that corresponds to the Borland definition (parameters are passed in eax, edx, ecx), not the MS one.

regards

Igor

Sorry about "-mregparm=3": that is not correct. It is the default for all kernel-mode APIs, and you have to show the compiler that the IPP API uses a different ABI from the kernel-mode one, so only __attribute__((regparm(0))) can help.

regards

Igor
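The effect of that define can be sketched in user space. The example assumes the IPP headers only define `IPPAPI` when it is not already defined, which is what the advice above implies. `IPP_DEMO_ABI`, `demo_sum_16s`, and `demo_call` are hypothetical names introduced for illustration; the function body is a stand-in, not the IPP implementation. Because `regparm` only has meaning on 32-bit x86, the attribute is compiled out on other targets.

```c
#include <stdint.h>

/* regparm only applies to 32-bit x86; expand to nothing elsewhere so the
 * sketch still compiles. IPP_DEMO_ABI is a hypothetical helper macro. */
#if defined(__i386__)
#define IPP_DEMO_ABI __attribute__((regparm(0)))
#else
#define IPP_DEMO_ABI
#endif

/* Same shape as the define from the post: every prototype declared through
 * IPPAPI is marked as taking all of its arguments on the stack. */
#define IPPAPI(type, name, arg) extern type name arg IPP_DEMO_ABI;

/* In a real module the IPP header would be included at this point.
 * demo_sum_16s is a stand-in prototype, not an IPP function. */
IPPAPI(int, demo_sum_16s, (const int16_t *src, int len, int16_t *sum))

/* Stand-in body so the sketch links without the IPP libraries. */
int IPP_DEMO_ABI demo_sum_16s(const int16_t *src, int len, int16_t *sum)
{
    int32_t acc = 0;
    for (int i = 0; i < len; i++)
        acc += src[i];
    *sum = (int16_t)acc; /* no saturation or scaling: illustration only */
    return 0;
}

/* A caller built with -mregparm=3 would still pass these arguments on the
 * stack, because the prototype above carries regparm(0). */
int demo_call(void)
{
    int16_t img[12];
    int16_t sum = 0;
    for (int i = 0; i < 12; i++)
        img[i] = (int16_t)i;
    if (demo_sum_16s(img, 12, &sum) != 0)
        return -1;
    return sum;
}
```

Putting the attribute on the prototype rather than changing the module-wide compiler flag keeps the kernel's own calling convention intact for every other function the module calls.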

Hi Igor,

So are the routines kernel_fpu_begin() and kernel_fpu_end() not called from within IPP?

This is what the command-line help of the GCC compiler displays for mregparm:
...
-mregparm = Number of registers used to pass integer arguments
...

IPP functions are general-purpose primitives; internally they have nothing specific to user-mode or kernel-mode execution. So it is the user's responsibility to wrap IPP calls with kernel_fpu_begin()/kernel_fpu_end().

regards, Igor
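The wrapping pattern described above can be sketched as follows. This is a user-space mock under stated assumptions, not module code: `kernel_fpu_begin()` and `kernel_fpu_end()` are replaced by counting stand-ins so the example compiles outside the kernel, and `demo_ipp_sum` and `guarded_sum` are hypothetical names. In a real module the two FPU functions come from the kernel itself (the header that declares them depends on the kernel version) and must not be redefined.

```c
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins so the sketch compiles outside the kernel. They only
 * count open sections; the real functions save and restore FPU/SIMD state. */
int fpu_sections_open = 0;
void kernel_fpu_begin(void) { fpu_sections_open++; }
void kernel_fpu_end(void)   { fpu_sections_open--; }

/* Stand-in for an IPP primitive (hypothetical, not the IPP implementation). */
int demo_ipp_sum(const int16_t *src, int len, int32_t *sum)
{
    int32_t acc = 0;
    for (int i = 0; i < len; i++)
        acc += src[i];
    *sum = acc;
    return 0;
}

/* Wrapper that keeps begin/end strictly paired around the SIMD-using call.
 * Arguments are validated first so no path leaves a section open. */
int guarded_sum(const int16_t *src, int len, int32_t *sum)
{
    int status;

    if (src == NULL || sum == NULL || len <= 0)
        return -1;

    kernel_fpu_begin();
    status = demo_ipp_sum(src, len, sum);
    kernel_fpu_end();

    return status;
}
```

In the kernel, the code between the two calls runs with preemption disabled, so it should be kept short and must not sleep; that is one reason to confine the bracket to a small wrapper like this instead of spreading it through the driver.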

Quote:

Igor Astakhov (Intel) wrote:

IPP functions are general-purpose primitives; internally they have nothing specific to user-mode or kernel-mode execution. So it is the user's responsibility to wrap IPP calls with kernel_fpu_begin()/kernel_fpu_end().

regards, Igor

Thanks Igor for explaining this.

Hi Igor,

I used the "__attribute__((regparm(0)))" method to fix my problem. Thank you.

And I appreciate all the comments from Sergey and iliyapolak.

Paul
