I’ll be giving a talk this Thursday, Feb. 19, to the Austin Science Club, our mostly-neuroscience salon / journal club.
A recent Kaggle.com competition for seizure prediction provided examples of interictal and preictal (up to one hour before a seizure) ECoG data from five canine and two human subjects. I was working on the contest but was excluded from competing on a technicality (you must make a submission one week before the deadline). I was exploring a novel descriptor that looks at relationships between channels.
Many features have been used historically, including energy in various bands, cross-correlations, signal decomposition and nonlinear complexity measures. I used normalized compression distance (NCD) to judge similarity between 16 channels at various, and used the resulting measure as a feature to train support vector classifiers for each subject.
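For anyone unfamiliar with NCD, here is a minimal sketch of how pairwise channel distances might be turned into a feature vector. It uses the standard definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is compressed size. The choice of zlib as the compressor, the 8-bit quantization step, and the helper names (`quantize`, `ncd`, `ncd_features`) are assumptions made for illustration, not necessarily what my contest code did, and the demo data is synthetic noise rather than ECoG.

```python
import random
import zlib


def quantize(signal, levels=256):
    """Map a sequence of floats onto integer bins and return them as bytes.

    Assumption: 8-bit quantization (levels <= 256) is good enough for a sketch.
    """
    lo, hi = min(signal), max(signal)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat signal
    return bytes(min(levels - 1, int((v - lo) / span * levels)) for v in signal)


def compressed_size(data):
    """Length in bytes of the zlib-compressed input (zlib is an assumed choice)."""
    return len(zlib.compress(data, 9))


def ncd(x, y):
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def ncd_features(channels):
    """Return the NCD of every channel pair (upper triangle) as a flat list."""
    encoded = [quantize(ch) for ch in channels]
    n = len(encoded)
    return [ncd(encoded[i], encoded[j]) for i in range(n) for j in range(i + 1, n)]


# Demo on synthetic data: 16 hypothetical channels of Gaussian noise.
rng = random.Random(0)
channels = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(16)]
features = ncd_features(channels)
print(len(features))  # 16 * 15 / 2 = 120 channel pairs
```

A vector like `features`, computed per data segment, is the kind of input that could then be handed to a classifier. Note that zlib only looks back 32 KB, so long segments would need a different compressor or shorter windows.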
I’ll briefly review some prediction features, what the contest winners did, the limits of NCD and possible improvements (including additional features I’ve written code for which have not been commonly applied).
Most of my previous experience in data science was unsupervised learning, so I viewed this as an opportunity to improve my skills in supervised learning and begin to build a data science portfolio and library of EEG analysis tools.
We’ll look at the open source tools I used for research in this area: SciPy, signal processing libraries and scikit-learn. I’ve also ported code for multi-scale permutation entropy from MATLAB and will look at this feature set in future work.
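As a rough illustration of the multi-scale permutation entropy idea, here is a minimal pure-Python sketch. It is not the ported MATLAB code. It assumes the common recipe of coarse-graining the signal by averaging non-overlapping windows and then computing normalized permutation entropy at each scale. The function names, the default order of 3, and the default scales are all hypothetical choices.

```python
import math
from collections import Counter


def coarse_grain(signal, scale):
    """Average non-overlapping windows of length `scale`."""
    n = len(signal) // scale
    return [sum(signal[i * scale:(i + 1) * scale]) / scale for i in range(n)]


def permutation_entropy(signal, order=3, delay=1):
    """Permutation entropy normalized by log(order!), so it lies in [0, 1]."""
    n = len(signal) - (order - 1) * delay
    if n <= 0:
        raise ValueError("signal too short for the requested order and delay")
    # Count how often each ordinal pattern (rank ordering of samples) occurs.
    counts = Counter(
        tuple(sorted(range(order), key=lambda k: signal[i + k * delay]))
        for i in range(n)
    )
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(math.factorial(order))


def multiscale_permutation_entropy(signal, order=3, scales=(1, 2, 3, 4, 5)):
    """Permutation entropy of the coarse-grained signal at each scale."""
    return [permutation_entropy(coarse_grain(signal, s), order) for s in scales]
```

A strictly increasing signal contains a single ordinal pattern and scores 0, while white noise scores close to 1. The entropy values across scales would form one feature vector per channel.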