Finger IDs are unreliable... (very)

SMing's picture

Hi all,

Not sure if this is already a known issue to everyone. Looking at the coordinates shown numerically in real time for each finger, one notices that the delivered coordinates (i.e. the fingers as the computer "perceives" them) do not necessarily match the actual fingers in action.

For example: posing the "german three" hand sign with the THUMB, INDEX and MIDDLE fingers stretched out (and the RING and PINKIE fingers folded), as shown in this picture: http://4.bp.blogspot.com/_nn4b2gkHsEU/S3jmdfEsRVI/AAAAAAAAAfo/kJb0wdyKpts/s400/three1.jpg

The detected finger combinations that one gets from the real-time readings are VERY, VERY random; in my tests they included the following:

1. Index, Ring, Pinkie
2. Thumb, Index, Pinkie
3. Thumb, Ring, Pinkie
4. Thumb, Middle, Ring
5. etc....

This problem occurs when posing the simple "PEACE" sign too; the reported finger pair is very often some arbitrary combination, unfortunately...
I hope the team can address this and make the SDK more robust, as otherwise it is very hard to put it into a commercial product. Thanks.


I have noticed that too.

One helpful thing is to include the distances between the fingers. If you want the "german three", make sure that the distance between one finger and the other two is greater than the distance between those two (thumb - index is far, thumb - middle is far, index - middle is short).

You can apply this method to some hand signs, but not all of them (thumbs up, for instance, is one I haven't solved yet).
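
A minimal sketch of the distance idea, assuming you already have the three detected fingertip positions as 3D points (the struct, function names and thresholds here are hypothetical placeholders, tune them for your setup):

    #include <cmath>

    struct Point3 { float x, y, z; };

    static float Dist(const Point3 &a, const Point3 &b) {
        float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
        return std::sqrt(dx * dx + dy * dy + dz * dz);
    }

    // "German three": exactly one of the three pairwise distances is short
    // (index - middle), while the other two (thumb - index, thumb - middle)
    // are long, regardless of which IDs the SDK assigned to the fingers.
    bool LooksLikeGermanThree(const Point3 &f1, const Point3 &f2, const Point3 &f3,
                              float nearThreshold, float farThreshold) {
        float d12 = Dist(f1, f2), d13 = Dist(f1, f3), d23 = Dist(f2, f3);
        return (d12 < nearThreshold && d13 > farThreshold && d23 > farThreshold)
            || (d13 < nearThreshold && d12 > farThreshold && d23 > farThreshold)
            || (d23 < nearThreshold && d12 > farThreshold && d13 > farThreshold);
    }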

I hope this helps.

SMing's picture

Quote:

WolfTW wrote:

One helpful thing is to include the distances between the fingers. If you want the "german three", make sure that the distance between one finger and the other two is greater than the distance between those two (thumb - index is far, thumb - middle is far, index - middle is short).

You can apply this method to some hand signs, but not all of them (thumbs up, for instance, is one I haven't solved yet).

I hope this helps.

Hi WolfTW,

Many thanks for verifying this and for proposing a solution. It's very kind of you ;-)

The "german three" was actually just an example for demonstrating the detection issue. In practice, I use angles for detecting some (custom) complex gestures and poses without the finger IDs for the moment, which is very similar to the distance approach that you have mentioned. But imagine if one could get reliable finger IDs out of this, the potential and flexibility will be enormous.

On the other hand, since this is no longer a beta, one would expect more reliable data from the SDK, to be frank, as otherwise it is very difficult to use it in commercial products. Isn't it?
I hope the team can speed up the fine-tuning of the accuracy a bit, because otherwise it would be a pity if this cool technology were overshadowed by the one from LeapMotion. Thanks.

Angles are probably even better, I haven't even gone that far yet.

I guess the problem lies with the model behind Intel's software. I'm not sure if I understand completely how it works, but to me it seems that they take almost all of the information frame by frame. Have you ever heard of Omek Grasp? It's another depth camera using software which apparently calculates a model of the hand and only takes the bending angles of each finger to recreate the hand position. Should be more reliable than Intel and a better competitor to LeapMotion. Unfortunately they have no official release date yet.

I'm experiencing the same here. I'm trying to map the data from the geonodes to a virtual 3D representation of a hand, and what I found is that the finger-detection algorithm performs really poorly.
I mean, sure, for the detection of the actual hand it's quite good, but what's really interesting at near range is the finger positions, and in their current state those are quite unreliable. Maybe it's because the depth data is actually quite noisy? Seems like we have to wait for Leap et al. after all...

PONRAM's picture

Oh well, it seems there may be some unwanted algorithms at work inside.

SMing's picture

Quote:

WolfTW wrote:

... Have you ever heard of Omek Grasp? It's another depth camera using software which apparently calculates a model of the hand and only takes the bending angles of each finger to recreate the hand position. Should be more reliable than Intel and a better competitor to LeapMotion. Unfortunately they have no official release date yet.

Hi WolfTW,

Thanks for the pointer! Wow, yet another interesting technology, indeed! Good to know that  :-)

cheers
SMing 

Oh thanks, WolfTW, that indeed looks very promising!

I just hope Intel implements a better tracking algorithm, similar to this: http://www.openni.org/files/3d-hand-tracking-library/#.UXguIbVJ9yo (and that's using the pretty inaccurate Kinect sensor...)

Mitch R.'s picture

A few clarifications on the above

- Omek Grasp is not a camera - it is middleware that runs on top of a depth camera by analyzing a 3D depth map, same as the PerC SDK.  In other words - should they decide to support it - Omek's middleware could run on top of the PerC camera.

- Leap detects movement and edges, and won't help you recognise which three fingers you suddenly put in front of it, because it does not generate a 3D depth map. Instead it uses very wide-angle lenses, low-res sensors and a fast frame rate to track movement. So it is good for tracking, but not for recognition.

- I find the use case interesting: three random fingers put in front of a camera should be automatically, instantly and flawlessly identified. Most machine vision systems need some sort of "fiducial" to get their bearings in the world; lacking one, you need more time or more processing power, or you get a higher error rate. In the case of the PerC, the fiducial is the open hand (Big 5), and from there you get high accuracy of identification since all the fingers are identified and stored. Absent this, you are going to get errors unless you take more time or processing, or write custom software that looks specifically for that use case (or a similar instance).

SMing's picture

Hi Mitch,

Thanks for the interesting comments. I think you have somewhat misunderstood the points raised here...

I suppose most of us are aware that it is the algorithm software (or middleware) that does the trick of the actual recognition, coordinate detection, etc., while the camera is merely a sensor that delivers the raw and depth imagery. Even at this stage, it would be possible for us developers to feed the image data from the depth cam directly into the Omek layer in our application (or into our own detection algorithm) if we wanted to. But the point is that we are trying to encourage improvement of this part of the PerC SDK and hope to get a better update soon.

If you read the mentioned example correctly, it is definitely NOT about a random three-finger pose suddenly showing up in front of the depth cam; instead, it is a fixed hand pose with well-defined fingers in context, as shown in the picture. The problem is that the finger data queried from the SDK shows the random finger combinations listed above. Note that this would NOT be noticeable at all if the data were simply rendered (drawn) onto the image, since the finger order does not matter there. BUT if we are talking about detecting fine gestures or hand signs, the accuracy of the actual finger identity is crucial. Not sure if you are familiar with the options delivered in the PerC SDK; developers actually have the option to query the availability and the coordinates of a specific finger explicitly. So what we are facing right now is that the reliable coordinates of finger 1 might end up being retrieved via the finger 2/3/4/5 query.
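
For anyone unfamiliar with these per-finger queries, this is roughly what I mean (a sketch from memory of the PXCGesture::QueryNodeData / GeoNode interface; do check the exact label names and signature against the SDK docs):

    // Ask explicitly for the index finger of the primary hand.
    PXCGesture::GeoNode node;
    pxcStatus sts = gesture->QueryNodeData(
        0,  // 0 = most recent frame
        PXCGesture::GeoNode::Label(PXCGesture::GeoNode::LABEL_BODY_HAND_PRIMARY |
                                   PXCGesture::GeoNode::LABEL_FINGER_INDEX),
        &node);
    if (sts >= PXC_STATUS_NO_ERROR) {
        // The bug discussed in this thread: the returned coordinates may in
        // fact belong to a different physical finger than the label names.
        float x = node.positionWorld.x;
    }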

Since we are talking about gestures and reliable finger tracking here, it doesn't really matter how the algorithms do it behind the scenes, after all. In order to deliver reliable software products (especially commercial ones) based on such input, all the developers need is reliable input data, that's all :-)

No offence, Mitch, but you sounded rather defensive here. In any case, I hope you understand that we are not trying to criticize or defame the technology. Though we may have sounded upset, our intention is to report a potential bug in the hope that it can be addressed soon, so that the SDK becomes more robust.

Thanks.

SMing is right, I did not mean to offend anyone or attack Intel (be it the software, staff or anything else). I was just sharing information and thoughts.

@ Omek: I wrote "It's another depth camera using software [...]". It was meant to be read as "depth-camera-using-software", which you (just my guess) read as "depth-camera using software". I hope you get the difference ;-}. Anyway, that seems to have been my fault, as I shouldn't have chosen a wording open to interpretation.

@ Leap: I don't have one yet; all my information comes from demo videos on YouTube. It doesn't use a depth map, but it can still show the exact positions of the fingers and the hand. I don't know if it exposes the same variables (position of each finger), since it can track basically anything, but I hope so.

@ the problem: what I would like to have is the following: I put up my pinky, middle and index fingers and get information telling me pinky = true, ring = false, middle = true, index = true, thumb = false (or an array of integers, or 5 single integers, or whatever). I have not found this information yet, as the fingers are scrambled randomly every time you switch the pose. Again, since Grasp calculates the angles, I'm sure it can deliver this information. And it shouldn't be too hard for Intel either ;-)
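
In other words, something like this sketch, built on the same assumed PXCGesture::QueryNodeData / GeoNode calls as the snippet above (the label names are unverified assumptions):

    #include <array>

    // One flag per finger of the primary hand, in the order
    // {thumb, index, middle, ring, pinky}: true if the SDK reports
    // that finger in the current frame.
    std::array<bool, 5> QueryFingerFlags(PXCGesture *gesture) {
        static const PXCGesture::GeoNode::Label kFingers[5] = {
            PXCGesture::GeoNode::LABEL_FINGER_THUMB,
            PXCGesture::GeoNode::LABEL_FINGER_INDEX,
            PXCGesture::GeoNode::LABEL_FINGER_MIDDLE,
            PXCGesture::GeoNode::LABEL_FINGER_RING,
            PXCGesture::GeoNode::LABEL_FINGER_PINKY,
        };
        std::array<bool, 5> flags = {};
        for (int i = 0; i < 5; ++i) {
            PXCGesture::GeoNode node;
            flags[i] = gesture->QueryNodeData(
                0,
                PXCGesture::GeoNode::Label(
                    PXCGesture::GeoNode::LABEL_BODY_HAND_PRIMARY | kFingers[i]),
                &node) >= PXC_STATUS_NO_ERROR;
        }
        // Caveat from this thread: even when a query succeeds, the returned
        // position may belong to the wrong physical finger.
        return flags;
    }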

Maneesh k.'s picture

Hi all,

During my coding I faced a difficulty: when using the QueryNode function, the primary hand is identified as the left hand, even if I show the right hand... Can we solve this problem?

Maneesh k.'s picture

Can any device be used as the input for the voice recognition module?

SMing's picture

From my understanding, the left/right identification does not work correctly for the time being. I guess it is meant to be a placeholder for a feature in future SDKs.

So, instead of relying on a precise left or right ID, use the primary/secondary concept in your app. "Usually", the first hand to enter the FOV is the primary hand delivered by the GeoNode.
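
A tiny sketch of that approach, again assuming the PXCGesture::QueryNodeData / GeoNode interface used in the snippets above:

    // Query by hand role instead of by left/right identity.
    PXCGesture::GeoNode hand;
    if (gesture->QueryNodeData(0, PXCGesture::GeoNode::LABEL_BODY_HAND_PRIMARY,
                               &hand) >= PXC_STATUS_NO_ERROR) {
        // "Primary" is usually the first hand to enter the FOV, regardless
        // of whether it is physically the left or the right hand.
    }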
