Coding and Tango: self-similarity matrix


Monday, January 13, 2014

More results

Amplitudes instead of MFCCs. Interestingly, the Giselle music & movement are closer together using this measure... perhaps hinting that timbre wasn't such a structural element in the work, while variation in amplitude (beats?) was. Makes sense. I'm using self-similarity matrices since the recurrence plots needed too much tweaking.

I used GIFs as the inputs instead of text files, which seemed to increase the accuracy. Using text files, two matrices of the same data (movement data), which did look extremely similar, were not clustered together. Switching to GIFs solved the problem. Will probably redo the MFCCs that way as well. Seems writing the files out as Matlab data might not be the best thing for this process.

Sunday, January 5, 2014

Actual Progress!!! Clustering Self-Similarity Matrices via Normalized Compression Distance (NCD)

See last post for awesome paper that led me to this!

Ok, so what does that MEAN??! you may ask. Well, this is the result of my clustering analysis (via NCD) of all the similarity matrices I have so far, and the clusters roughly correspond with how similar I perceive the matrices to be. I used the NCD command from CompLearn. I highly recommend it! Simple to use! You have to install Graphviz for the neato command, which the CompLearn documentation seems to assume you already have.

***The closer the ovals are to each other, the more similar they are perceived to be.

Why is .DS_Store there? Because I haven't taken the time to figure out how to exclude it. Grrrr!
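For anyone who wants to poke at the idea without CompLearn, here is a rough sketch of the same kind of pairwise analysis in Python (standard library only, not the Octave used elsewhere on this blog). It is not CompLearn's implementation: zlib is swapped in as the compressor, and the function names `load_files`, `ncd` and `ncd_matrix` are made up for illustration. It also skips hidden files, which would take care of the .DS_Store problem.

```python
import zlib
from itertools import combinations
from pathlib import Path


def load_files(folder):
    """Read every regular file in `folder`, skipping hidden ones such as .DS_Store."""
    return {
        p.name: p.read_bytes()
        for p in sorted(Path(folder).iterdir())
        if p.is_file() and not p.name.startswith(".")
    }


def compressed_size(data):
    """Length in bytes of the zlib-compressed data (zlib is a stand-in compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x, y):
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    return (compressed_size(x + y) - min(cx, cy)) / max(cx, cy)


def ncd_matrix(items):
    """Pairwise NCD for a {name: bytes} dict, returned as {(name_a, name_b): distance}."""
    dist = {(name, name): 0.0 for name in items}  # diagonal set to 0 by convention
    for a, b in combinations(sorted(items), 2):
        # Computed once per pair and mirrored, so the result is symmetric.
        dist[(a, b)] = dist[(b, a)] = ncd(items[a], items[b])
    return dist
```

Something like `ncd_matrix(load_files("matrices/"))` would give the distances for a (hypothetical) folder of matrix images. Smaller values mean the compressor found more shared structure. Turning those distances into the oval diagram is a separate step that this sketch does not attempt.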

Abbreviations are:
v5cover_move - Movement data from Cage 100 Festival @ Chicago performance of Variations V
v5cover_music - MFCC (audio) data from Cage 100 Festival @ Chicago performance of Variations V
giselle_move - Movement data from Giselle ballet
giselle_music - MFCC (audio) data from Giselle ballet
v5_original_move - Movement data from Cage/Cunningham's Variations V (1966) Hamburg perf.
v5_original_music - MFCC (audio) data from Cage/Cunningham's Variations V (1966) Hamburg perf.
decibel_v5_move - Movement data from decibel's Variations V
decibel_v5_music - MFCC (audio) data from decibel's Variations V

So, the Cage 100 movement/music data pair (i.e., v5cover_), in which the movement corresponds heavily to the music, sits super close together. The other music/movement pairs are sort of spread out, but still kinda close... y'know. AND the furthest-apart music/movement pair is the Giselle data, the only non-interactive work in the analysis (the only work that isn't a performance of Variations V). So, this is good! If this is the only computational measure of similarity I have for my paper for the class, I think that would be good enough!

So, clearly, this is a good start. I would like to try this with recurrence plots, like the paper from the last post does. However, apparently they screen people before giving access to the CRP toolbox, so I'll have to wait for the download link email (or reapply in 4 days?). If, on the off chance, I can't get access, I'll just make sure my similarity maps are up to snuff. These results aren't publication-ready, by any means, but OMG! a start. I definitely would also like more data for comparison.

Maybe I will be able to publish on some of my results and methods. I was feeling very gloomy about that prospect these past couple days until now.

Finally! A paper for self-similarity matrix comparison in the Music Information Retrieval literature -- doing what I want it to do!

Measuring Structural Similarity in Music - Juan P. Bello

I can definitely use this to compare music to movement.... It must be sensitive to small variations since he is using it to compare different performances of the same piece of music. He is using it for tonal music, but I can just use MFCCs (& maybe amplitude or STMSPs?) instead of the chroma-based features and CENS he's using, since the music I am analyzing is not necessarily pitch-based (except for the Giselle null-comparison example). But anyways, that won't matter. I just need the comparison methods.

He is also using Recurrence Plots instead of self-similarity matrices, but they are similar measures. He then uses the Normalized Compression Distance (NCD) -- something that is commonly used in bioinformatics for genetic comparison -- to come up with a similarity score between the recurrence plots. Then, it looks like he finds threshold values for pairwise similarity and uses the measure to retrieve performances of the same work of music. The measure is not tolerant of global structure changes (e.g., if someone repeats a section and another does not) but that actually doesn't even matter for my application.
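For reference, this is the standard definition of NCD as I understand it, written for a generic compressor. C(x) is the compressed length of x and xy is x and y concatenated. The notation is mine, not copied from the paper.

```latex
\mathrm{NCD}(x, y) = \frac{C(xy) - \min\{C(x),\, C(y)\}}{\max\{C(x),\, C(y)\}}
```

With a real compressor, values near 0 mean the two inputs share most of their structure and values near 1 mean they share almost none.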

Tools that he uses:

Toolbox for Recurrence plots:
http://tocsy.pik-potsdam.de/CRPtoolbox/

Normalized Compression Distance:
http://www.complearn.org/ncd.html

I also had this idea in the shower that perhaps I could define the relationship between the structure of dance and music by their distance matrices. So if I had enough data, I could do ANOTHER NCD on the NCD data and see if the music/movement relationships of the different performances of Variations V (i.e., the distance matrix between their similarity measures, whether RP or SSM) cluster together -- as opposed to other interactive and non-interactive dance.

I found this link which provides a lot of good information about music information retrieval -- mostly because I accessed the textbook for the course through my university's library (he had a lot of papers on structural analysis of audio via SSM so I just looked up his name). I thought about using the method described in the textbook for the segmentation of self-similarity matrices to determine what the actual repetitive structure is, but I think that Bello's approach is more apropos to my musical analysis problem. I would definitely apply the segmentation / path method as well if I had time... but man, do I need this paper to be over.

Saturday, January 4, 2014

More image similarity analysis... & my last resort method, rules for the eyeball

I found some useful image analysis code in Matlab online from Siddhant Ahuja. I found it during a handy Google search on "texture similarity image analysis" or some such. I will be using some of the transforms to see if I get any good results...?

Similarity Matrix for MFCCs of the audio from Variations V from Chicago Cage 100 Festival* :

Similarity Matrix for movement from Variations V from Chicago Cage 100 Festival* (from video analysis, tracking torso and wrists by hand, then using the summed velocity)

Similarity Matrix for movement during the ballet Giselle (a sort of null hypothesis), movement information obtained in the same way as the previous example.*

* These are excerpts, not from the entire piece.

If all else fails, very quickly I will have to go to less computational means of analysis. Create strict rules for what similarity means, and create a ranking. I can then move on to use this as a justification for further work in my conclusion. This is very dissatisfying but this has been an enormous amount of brute force, technical, and programming work for what should be a music history paper for a class, so at some point I have to stop the search for the holy grail self-similarity matrix similarity...

Friday, January 3, 2014

Literature Search & Analysis with Self-Similarity Matrix Tutorial, Part Deux

Lit Search

Useful Matlab tools for audio feature analysis:
http://www.audiolabs-erlangen.de/meinard/data/

I am reading his textbook on MIR, the chapter on musical structure to see if it has any aha's for me. Alas, not so far. Although it has made me think about extracting different features...

I am now extracting volume and MFCC's for my audio clips... which I think make sense. But I might also extract STMSPs, since that might be better than just volume. I'm not sure, due to the nature of Variations V.
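Here is a rough sketch of the "better than just volume" idea, in Python with only the standard library. I am assuming STMSP means short-time mean-square power, i.e. the average of the squared samples in each window. The function name and the `window`/`hop` parameters are mine, not from any toolbox.

```python
def short_time_power(samples, window, hop):
    """Mean-square power of each `window`-sample frame, stepping by `hop` samples."""
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    powers = []
    # Only full frames are used; a trailing partial frame is dropped.
    for start in range(0, len(samples) - window + 1, hop):
        frame = samples[start:start + window]
        powers.append(sum(s * s for s in frame) / window)
    return powers


if __name__ == "__main__":
    # A loud half followed by silence gives one high and one zero power value.
    print(short_time_power([1, 1, 1, 1, 0, 0, 0, 0], window=4, hop=4))
```

Making `hop` smaller than `window` gives overlapping frames and a smoother curve.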

Foote's papers are a good place to start for applications of similarity matrices to music & video
http://www.rotorbrain.com/foote/vanity.htm

I thought this one could be helpful in comparing similarity matrices but I'm not sure:
http://pdf.aminer.org/000/009/514/summarizing_video_using_non_negative_similarity_matrix_factorization.pdf

btw -- Non-Negative Matrix Factorization Toolbox for Matlab
https://sites.google.com/site/nmftool/

OK, I have found more helpful papers on the technical side of self-similarity maps, but still no actual examples of any actual musical analysis applications in actual musicology/music theory papers. I have one article on Interlibrary Loan which MAY have something I can use... but if that lead dies, then I'm considering this part of the lit search over, and at least, have the examples that may sorta apply. I have an interesting performance analysis of some of Xenakis' works so that should help guide writing style and presentation, I guess... Le sigh.

I think I'm not done with some of my analysis... it really is an art, sometimes, statistics.

Octave codin' or, OK, so where was I in all this analysis??

Construct a self-similarity matrix in Octave/Matlab


pkg load statistics % pdist and squareform come from Octave's statistics package

beats = audioread('/Users/me/mywave.wav'); % read in the file (wavread in older versions)

beats_dist = pdist(beats); % pairwise distance btw all the points (huge for raw audio, so downsample first)

beat_self_matrix = squareform(beats_dist); % turns it into a square matrix

imshow(beat_self_matrix, []); % show the matrix visually, scaled to its own min/max

Those are the basics, but then you actually need to downsample, smooth the data, & extract a feature (MFCC, amplitude, STMSP, CENS).
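To make those extra steps concrete, here is a minimal sketch in Python (standard library only, so not the Octave functions I actually used). It smooths a feature curve, downsamples it, and then builds the distance matrix the same way pdist + squareform does, using Euclidean distance. All three function names are made up for this example.

```python
import math


def moving_average(values, width):
    """Centered moving average; near the edges it averages whatever samples exist."""
    half = width // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def downsample(values, factor):
    """Keep every `factor`-th value (smooth first to limit aliasing)."""
    return values[::factor]


def self_similarity(features):
    """Square matrix of Euclidean distances between equal-length feature vectors."""
    n = len(features)
    return [[math.dist(features[i], features[j]) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    amplitude = [0.0, 0.1, 0.9, 1.0, 0.2, 0.1, 0.8, 1.0]  # toy feature curve
    reduced = downsample(moving_average(amplitude, 3), 2)
    # A 1-D feature is wrapped into 1-element vectors for the distance call.
    matrix = self_similarity([[v] for v in reduced])
    for row in matrix:
        print(" ".join(f"{d:.2f}" for d in row))
```

As with the Octave version, 0 on the diagonal means identical, so strictly speaking this is a self-distance matrix.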

In terms of the movement data, I am dealing with 2D x,y data that I collapse into distances and then velocity measures. Since this data is from 2D video analysis, I also smooth the data via an averaging filter and then divide by the standard deviation.
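Roughly, that movement pipeline looks like the sketch below, written in Python with only the standard library rather than my Octave functions. The assumptions are mine: "velocity" is taken as frame-to-frame speed, the tracked points (torso, wrists) are summed per frame, and the standard deviation is the sample (N-1) one, which is what Octave's std uses by default. The names `speeds`, `summed_speed` and `smooth_and_scale` are invented for the example.

```python
import math
import statistics


def speeds(points, dt):
    """Frame-to-frame speed of one tracked point; `points` is a list of (x, y)."""
    return [math.dist(p, q) / dt for p, q in zip(points, points[1:])]


def summed_speed(tracks, dt):
    """Add up the speeds of all tracked points, frame by frame."""
    return [sum(frame) for frame in zip(*(speeds(t, dt) for t in tracks))]


def smooth_and_scale(values, width):
    """Centered moving average, then division by the sample standard deviation."""
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    sd = statistics.stdev(smoothed)  # needs at least two values
    if sd == 0:
        raise ValueError("constant signal: standard deviation is zero")
    return [v / sd for v in smoothed]


if __name__ == "__main__":
    torso = [(0, 0), (3, 4), (3, 4), (6, 8)]  # toy hand-tracked positions
    wrist = [(0, 0), (0, 1), (0, 3), (0, 3)]
    print(smooth_and_scale(summed_speed([torso, wrist], dt=1.0), width=3))
```

The result is the single collapsed vector that the movement versions of the self-similarity functions expect.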

Here's a folder of the octave functions I've written for self-similarity maps. The audio versions may have more general use, but the movement ones do assume that you've collapsed your features into one vector. They are documented somewhat sketchily, so use with care. They require the signal and statistics packages from Octave, and the packages that THOSE require. Well, ya know.