Sunday, January 5, 2014

Actual Progress!!! Clustering Self-Similarity Matrices via Normalized Compression Distance (NCD)

See last post for awesome paper that led me to this!


Ok, so what does that MEAN??! you may ask. Well, this is the result of my clustering analysis (via NCD) of all the similarity matrices I have so far, and the clusters roughly correspond with how similar I perceive the pieces to be. I used the NCD command from CompLearn. I highly recommend it! Simple to use! You do have to install graphviz for the neato command, which the CompLearn documentation seems to assume you already have.

***The closer the ovals are to each other, the more similar they are perceived to be.

Why is .DS_Store there? BC I haven't taken the time to figure out how to exclude it. Grrrr!
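
One way around the stray .DS_Store, sketched in Python: filter out hidden files before handing anything to the distance computation. This is only an illustration. The helper names keep_visible and data_files are made up for this example, and whether the CompLearn ncd command has its own option for skipping files is not something confirmed here.

```python
from pathlib import Path


def keep_visible(names):
    """Drop hidden entries (names starting with '.'), such as .DS_Store."""
    return [n for n in names if not n.startswith(".")]


def data_files(directory):
    """Sorted paths of the non-hidden regular files in `directory`."""
    folder = Path(directory)
    names = sorted(p.name for p in folder.iterdir() if p.is_file())
    return [folder / n for n in keep_visible(names)]
```

Calling data_files on the folder of similarity matrices would return only the real data files, so .DS_Store never shows up as an oval. Deleting the file by hand works too, but macOS tends to recreate it.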

Abbreviations are:
v5cover_move - Movement data from Cage 100 Festival @ Chicago performance of Variations V
v5cover_music - MFCC (audio) data from Cage 100 Festival @ Chicago performance of Variations V
giselle_move - Movement data from Giselle ballet
giselle_music - MFCC (audio) data from Giselle ballet
v5_original_move - Movement data from Cage/Cunningham's Variations V (1966) Hamburg perf.
v5_original_music - MFCC (audio) data from Cage/Cunningham's Variations V (1966) Hamburg perf.
decibel_v5_move - Movement data from decibel's Variations V
decibel_v5_music - MFCC (audio) data from decibel's Variations V

So, the Cage 100 movement/music data pair (i.e., v5cover_), in which the movement corresponds heavily to the music, appears super close together. The other music/movement pairs are sort of spread out, but still kinda close... y'know. AND the music/movement pair that is furthest apart is the Giselle data, the only non-interactive work in the analysis (the only work that isn't a performance of Variations V). So, this is good! If this is the only computational measure of similarity I have for my paper for the class, I think that would be good enough!
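
For anyone curious what NCD actually computes: the standard definition is NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is the compressed size and xy is x followed by y. Values near 0 mean the two inputs share a lot of structure, and values near 1 mean they share very little. Here is a minimal Python sketch of that formula. It assumes bz2 as the compressor and is not CompLearn's code. The helper names are made up for illustration, and the compressor and settings CompLearn used for the results above may differ.

```python
import bz2


def compressed_size(data):
    """Length in bytes of the bz2-compressed data (stands in for C(x))."""
    return len(bz2.compress(data))


def ncd(x, y):
    """Normalized Compression Distance between two byte strings."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # C(xy): compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


def ncd_matrix(items):
    """Pairwise NCD for every ordered pair in a {name: bytes} dict."""
    return {(a, b): ncd(items[a], items[b]) for a in items for b in items}
```

To try it on files, read each one with open(path, "rb").read() and pass the bytes in a dict keyed by name (for example "giselle_move"). The resulting distances are the kind of numbers a layout tool such as neato turns into the oval picture.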

So, clearly, this is a good start. I would like to try this with recurrence plots, like the previous paper mentioned. However, apparently they screen people who request access to the CRP toolkit, so I'll have to wait for the download link email (or reapply in 4 days?). If, on the off chance, I can't get access, I'll just make sure my similarity maps are up to snuff. These results aren't publication-ready by any means, but OMG! A start. I definitely would also like more data for comparison.

Maybe I will be able to publish on some of my results and methods. I had been feeling very gloomy about that prospect these past couple of days, until now.
