Tuesday, February 25, 2014

So, after much consultation

It is possible that I may not use optical motion capture at all for my project.

OK, I probably will.

I am looking into using Shimmer sensors. They have software which syncs the sensors (accelerometers, gyroscopes, et al.) to each other AND to cameras. So I could have everyone wear one (ideally across the back) & get ID & location THAT way. Then I can still get some rich motion data from the legs using the live video motion capture.

Or else, I just use all Shimmer motion sensors, but I'd actually like to avoid that. I don't want something bulky on their ankles.

I looked into some active and passive marker optical motion capture systems, but ultimately they were too rich for my blood & my department's (I'm in the arts!). They recommended at least 20 cameras to deal with the occlusion, and that's, well... I can already tell you that 20 motion capture cameras of that caliber are out of my price range.

Anyways, the folks at Phase Space & Qualisys & APDM were particularly helpful, and I would recommend them highly if you have the funding.

PS. Not that anyone follows this blog, but I did finish my Variations V paper, and I have feedback on it. I am hoping to do the edits and publish it AFTER Buenos Aires. Possibly after my dissertation defense. After that I will probably also dinosaur more (and again -- yes, I use dinosaur as a verb now, you would, too). Oh, AND! My! Dinosaur! Got! Accepted! Into! NIME! as! an! installation! in! London! UK! Crazy trip since I'll be in BsAs & Sharif (my partner in this dinosaur business) will be bringing the dino, so I have to send it to him (across the country) before I leave.

Tuesday, February 18, 2014

Markers, I will use them.

Feature detection has broken me & I have mostly decided that I will incorporate markers to a limited extent. Well, it hasn't broken me, but motion tracking is only a part of my dissertation, and I need to build things on top of it. So, I cannot devote all my resources to solving problems that have not yet been solved in motion capture. Here's the current version (still markerless...):



So basically, I need to be able to distinguish between heads, so that it creates a new tracker for a new person. It obviously still loses where it's tracking, non-trivially -- although it's better than before. It updates with face detection every 250 frames or so & also when it's detecting too many zero pixels (this is after depth segmentation & only looking at the top 25% of blobs for heads -- yes, the Kinect data is noisy).
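
For the curious, the re-detection trigger boils down to something like this (a Python/numpy sketch; the 50% zero-pixel cutoff is illustrative, not a tuned value):

```python
import numpy as np

REDETECT_EVERY = 250  # frames, roughly what I describe above

def needs_redetect(frame_idx, depth_patch, max_zero_frac=0.5):
    """Re-run face detection periodically, or when the tracked patch of the
    depth map is mostly invalid (zero) Kinect pixels."""
    zero_frac = np.count_nonzero(depth_patch == 0) / float(depth_patch.size)
    return frame_idx % REDETECT_EVERY == 0 or zero_frac > max_zero_frac
```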

Facial recognition is too CPU-intensive, I think. I tried tracking markers using some AR libraries (aruco, ARma) -- just as a prototype -- they were really lightweight & easy to implement -- but they were not meant for applications such as mine (nothing comes cheap in my case). I think I'm also ready to do better depth segmentation... & perhaps there is a way to discard some of the Kinect noise.
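
For reference, marker detection with OpenCV's contrib aruco module looks roughly like this. (Caveat: that module postdates the standalone libraries I tried, & its function names shift between OpenCV versions -- treat this as a sketch, not gospel.)

```python
import cv2

# needs opencv-contrib; this is the OpenCV 4.x-era API style
aruco_dict = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
params = cv2.aruco.DetectorParameters_create()

def find_markers(gray):
    """Detect fiducial markers in a grayscale frame.
    Returns corner points & marker IDs (ids is None when nothing is found)."""
    corners, ids, _rejected = cv2.aruco.detectMarkers(gray, aruco_dict,
                                                      parameters=params)
    return corners, ids
```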

I am worried about varied lighting conditions, etc. I am half-thinking about just turning the Kinect into a cheap IR sensor sans depth -- since the resolution of the depth information is fairly low & noisy for my purposes. Or just buying really high quality & fast webcams.

Another problem to solve: right now, the Kinect is sucking up CPU -- like 120%... eek. I've traced the problem to the libfreenect driver, but replacing the driver with an up-to-date version (the one on Homebrew is two iterations behind) either crashes or runs once in debug mode, using even more CPU than before...

Wednesday, February 12, 2014

face tracking tango dancers, cont.


Soooo.... I am using Haar face detection to find the faces when I lose them and to update them every 75 frames. I am using Zhang's Real-time Compressive Tracker for the tracking. It does not perform as well on tango dancers. Well, nothing does! I am using pretty crude depth segmentation, skin color tracking (also the crude version) & only looking at the upper fourth of any blob to track something or detect a face. Only looking at the upper fourth of any blob means it will not catch lifts and dips (unless I can separate the couple), but those are really rare in tango social dance.

So, it does NOT have continuity between the faces at the moment. Or at least, every 75 frames it loses continuity. So the next step will be creating that continuity. That means when I lose a face, I need to find it again via face detection & check whether THAT image is similar. I think I may dig into the code for the tracker I'm using, since it is obviously doing SOMETHING like that. Also, since the tracker updates & learns, it should drift less (I'm guessing). That means that eventually I can trust it not to drift...
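
The similarity check could be as crude as comparing hue histograms of the old & newly detected face crops -- a sketch, with a made-up 0.6 threshold:

```python
import cv2

def same_face(old_patch, new_patch, threshold=0.6):
    """Crude continuity check between two BGR face crops via hue histograms."""
    def hue_hist(patch):
        hsv = cv2.cvtColor(patch, cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([hsv], [0], None, [32], [0, 180])
        return cv2.normalize(hist, hist)
    # HISTCMP_CORREL is the OpenCV 3+ name (2.4 used cv2.cv.CV_COMP_CORREL)
    score = cv2.compareHist(hue_hist(old_patch), hue_hist(new_patch),
                            cv2.HISTCMP_CORREL)
    return score > threshold
```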

I could also make my face detection faster....

Also, to be honest, I'm a little horrified at some of my mistakes (dancing) in this video, especially after repeated viewing. I was tired! Nevertheless!

Sunday, February 9, 2014

adventures in depth segmentation

k-means segmentation gone rogue. I just find this amusing. I think I might try out another way, though. I am awesome at finding the slowest ways of doing things. All CPU + RAM are belong to us.
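
For anyone wondering, "the slowest way" looks roughly like this in Python/OpenCV -- k-means over every depth pixel, every frame (using the newer cv2.kmeans signature):

```python
import cv2
import numpy as np

def kmeans_depth_layers(depth, k=3):
    """Cluster every depth pixel into k layers (near/mid/far) with k-means.
    Running this per frame on a 640x480 depth map is exactly the slow part."""
    samples = depth.reshape(-1, 1).astype(np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    _compactness, labels, _centers = cv2.kmeans(
        samples, k, None, criteria, attempts=3,
        flags=cv2.KMEANS_RANDOM_CENTERS)
    return labels.reshape(depth.shape)  # per-pixel cluster index
```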






Wednesday, February 5, 2014

Face Detection using Skin Detection -- so much better!!

Edit: I accidentally deleted this post, but I was able to restore it here.... Yay...

Ok, so I really need a trained Haar cascade for, like, a 3/4 face, because that's the one that I'm not detecting. But anyways, it is performing pretty well. The only thing is that I really just need to get on with my depth segmentation & get the background out of there in a way that doesn't obscure my faces. I found an interesting paper which uses both depth AND color to segment -- & it is straightforward to implement, so maybe I'll try that. I'm worried about the performance aspect of that, though, since it requires that I translate from depth into IRL coordinates.
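
The translation itself is just pinhole back-projection. A sketch with ballpark (assumed, uncalibrated) Kinect v1 depth intrinsics -- vectorized like this it's one pass over the frame, so the performance hit might be tolerable:

```python
import numpy as np

# ballpark Kinect v1 depth-camera intrinsics -- assumed, not calibrated
FX, FY = 594.2, 591.0
CX, CY = 339.5, 242.7

def depth_to_world(depth_m):
    """Back-project a metric depth image (meters) into per-pixel x/y/z."""
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - CX) * depth_m / FX
    y = (v - CY) * depth_m / FY
    return np.dstack((x, y, depth_m))  # shape (h, w, 3)
```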


Anyhow, here are my skin/face combo results. Narrowing down the search area definitely created a lot more correct positives... AND still some false positives. Note that once the background is gone, the results will improve immensely.

The blue is detected skin regions & the pink, again, is faces. Also, my algorithm is (in theory) not racist. So, ya, that's important, right?!! Apparently human skin is more or less the same color once you disregard luminance... I'm actually working on better skin detection right now, from the paper I mentioned earlier... THEN, off to image/depth segmentation.
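
Concretely, that means thresholding only the chroma channels. A sketch using the commonly cited Cr/Cb skin box (the bounds are from the literature & worth tuning):

```python
import cv2
import numpy as np

def skin_mask(bgr):
    """Skin detection in YCrCb: ignore luminance (Y), threshold chroma (Cr/Cb)."""
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    lower = np.array((0, 133, 77), dtype=np.uint8)     # any Y, Cr >= 133, Cb >= 77
    upper = np.array((255, 173, 127), dtype=np.uint8)  # any Y, Cr <= 173, Cb <= 127
    mask = cv2.inRange(ycrcb, lower, upper)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)  # drop speckle noise
```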





















Tuesday, February 4, 2014

face detection disappointing so far on tango couples (EDITED: not as bad as I thought)

This is just my first, naive try using the Haar cascades in OpenCV. I'm using one frontal cascade & one profile.

It could be that if the Kinect were closer, it would get better results... but then, of course, there would be more occlusion for the theoretical other dancers. I did enlarge the image in order to do the detection, but this didn't help enough. My problem might also be that the Kinect is at an angle from above, and so the faces are not straight-on enough. Wonder if I could solve that with rotations? Seems CPU-expensive. Le sighz.
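
For reference, the detection loop is shaped roughly like this (Python/OpenCV sketch; the cv2.data.haarcascades path is from newer opencv-python builds, & the parameters are illustrative). Mirroring the frame for the second profile is the trick I mention in the edit below:

```python
import cv2

frontal = cv2.CascadeClassifier(cv2.data.haarcascades +
                                "haarcascade_frontalface_default.xml")
profile = cv2.CascadeClassifier(cv2.data.haarcascades +
                                "haarcascade_profileface.xml")

def detect_faces(gray, scale=2.0):
    """Enlarge the frame, run frontal + profile cascades; the profile cascade
    only sees one side, so also scan a mirrored copy & map the boxes back."""
    big = cv2.resize(gray, None, fx=scale, fy=scale)
    boxes = list(frontal.detectMultiScale(big, scaleFactor=1.1, minNeighbors=3))
    boxes += list(profile.detectMultiScale(big, scaleFactor=1.1, minNeighbors=3))
    flipped = cv2.flip(big, 1)
    for (x, y, w, h) in profile.detectMultiScale(flipped, scaleFactor=1.1,
                                                 minNeighbors=3):
        boxes.append((big.shape[1] - x - w, y, w, h))  # un-mirror the box
    return [tuple(int(c / scale) for c in box) for box in boxes]
```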

Here are some shots of it half-way working. It is, very disappointingly, abysmal. I'm wondering if I can tweak it into functionality, though. I did try a quick skin detection thing, and that did really well. I think I could even tinker with it and get rid of the false positives.


 The back of the head isn't bad, considering. It would work for my purposes... 


Close but no cigar

Just with the background subtraction [below]... I'm going to also use the depth to subtract the background when I get around to it...
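
A sketch of what combining the two could look like (MOG2 is the newer OpenCV background-subtraction API; the 3.5 m depth cutoff is arbitrary):

```python
import cv2

# MOG2 background model on the RGB stream (newer-OpenCV API)
bg_model = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                              detectShadows=False)

def foreground_mask(frame_bgr, depth_mm=None, max_depth_mm=3500):
    """Color-based background subtraction, optionally ANDed with a depth cutoff
    (0 is an invalid Kinect reading, so it's excluded too)."""
    mask = bg_model.apply(frame_bgr)
    if depth_mm is not None:
        near = ((depth_mm > 0) & (depth_mm < max_depth_mm)).astype('uint8') * 255
        mask = cv2.bitwise_and(mask, near)
    return mask
```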


Checking out: http://blog.mashape.com/post/53379410412/list-of-50-face-detection-recognition-apis

http://www.semanticvisiontech.com/

http://chenlab.ece.cornell.edu/projects/FaceTracking/

http://pointclouds.org/blog/hrcs/aaldoma/index.php ??

I think I might go into the skin detection more... 

EDIT: So it turns out I forgot to draw the profile detection boxes... plus, the other Kinect seemed to get a better view. It was mostly my face. I think I could do a mirror and get the other profile... can't do this for all angles though, and it hardly catches my partner at all. This is still mostly open embrace. Close embrace is going to be tricky.













EDIT #2: 

Ok, so actually the background subtraction was hurting more in this case than helping. I was skeptical, but... I think what I need to do is just put a bounding box around the couple, and detect the face in there -- would be faster (sketched below). Anyways, ya, so with the first Kinect & only the frontal face cascade, the background subtraction helped A LOT (for the few positives it produced), but in this case, NOT subtracting the background really helped. I mean, looking at the visuals, you can see where it would get confused, but it really was better in the other case. Anyways... I also fooled around with the brightness/contrast. I added the other side profile & I was actually detecting my partner a few times.
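
The bounding-box idea, sketched (note: cv2.findContours returns three values in OpenCV 3.x & two in 4.x -- this assumes the two-value form):

```python
import cv2

def couple_roi(fg_mask, pad=20):
    """Bounding box around the largest foreground blob -- presumably the couple.
    Run the face cascades only inside this box instead of the whole frame."""
    contours, _hierarchy = cv2.findContours(fg_mask, cv2.RETR_EXTERNAL,
                                            cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
    return max(x - pad, 0), max(y - pad, 0), w + 2 * pad, h + 2 * pad
```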












smaller face detections!

So, again, I think this all has to be supplemented by skin detection. It will definitely fix the relatively few false negatives. However, it is still pretty iffy in close embrace... so I think I need to find another way (in addition? instead?). Damn you, close embrace!!! You keep on foiling me!! Also, I can do template matching for a while once I find the face. But I do need to have a robust way of finding the face the first time. Again, I think I also could do a bounding box on the couple, and only search that image. Could do that from the depth image maybe? Plus, seeing where the skin color falls?
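
The template-matching piece would be something like this (the ~0.5 fall-back cutoff is a guess):

```python
import cv2

def track_by_template(gray, face_patch):
    """Follow an already-detected face with normalized cross-correlation.
    Fall back to full face detection when the match score drops."""
    result = cv2.matchTemplate(gray, face_patch, cv2.TM_CCOEFF_NORMED)
    _min_val, score, _min_loc, top_left = cv2.minMaxLoc(result)
    return top_left, score  # re-detect when score < ~0.5
```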

Brighter lighting would help, but this is SUPPOSED to be for a tango milonga. Wonder if there is a way to make the Kinect's RGB camera perform better in dim light?