Submit on Canvas by Tuesday 10/25/22 at 11:53 PM. Ask questions and discuss on Piazza. Have fun!
The data (in the file ReachData.npz) is available here and can be loaded as shown below. Please note that, for those who prefer MATLAB, a .mat file of the same data has been made available under Files --> Data on Canvas due to size restrictions on GitHub. Unfortunately, I will not be able to provide any MATLAB coding support. A demo notebook playing around a little with the data will be shared at this link here (notebook from lecture).
import numpy as np

# Load the data; each array has one row (or entry) per trial, in trial order.
reachDataFile = np.load("ReachData.npz", allow_pickle=True)
spikeTimes = reachDataFile['spikeTimes']                # per-trial, per-neuron spike times (ms)
timeTouchHeld = reachDataFile['timeTouchHeld']          # target onset (start of plan window)
timeGoCue = reachDataFile['timeGoCue']                  # go cue time per trial
timeTargetAcquire = reachDataFile['timeTargetAcquire']  # time the target is acquired per trial
cfr = reachDataFile['cfr']                              # additional per-trial information
Data description: There are $N = 1127$ trials and $D = 190$ neurons. The 1127 rows of the data arrays are in trial order, meaning row $n$ of each variable corresponds to row $n$ of every other variable. Please note that all times are given in ms.
Define two windows of spikes: the "plan window", encompassing spikes emitted between target onset and the go cue, and the "movement window", encompassing spikes emitted between the go cue and the time movement ends (note that both windows have different lengths on different trials). Here is example code to find the array of spike counts in the plan window:
planSpikes = []
for trialIdx, trialSpikes in enumerate(spikeTimes):
    # Count each neuron's spikes between target onset and the go cue
    planSpikes.append([np.sum((st > timeTouchHeld[trialIdx]) &
                              (st < timeGoCue[trialIdx])) for st in trialSpikes])
planSpikes = np.array(planSpikes)  # will be 1127 x 190
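The movement window counts can be found the same way; below is a minimal sketch, assuming (as the variable name suggests) that timeTargetAcquire marks the time movement ends on each trial:

moveSpikes = []
for trialIdx, trialSpikes in enumerate(spikeTimes):
    # Count each neuron's spikes between the go cue and target acquisition
    moveSpikes.append([np.sum((st > timeGoCue[trialIdx]) &
                              (st < timeTargetAcquire[trialIdx])) for st in trialSpikes])
moveSpikes = np.array(moveSpikes)  # will be 1127 x 190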
Randomly choose 50 trials per target to be set as the training data (400 trials total). The remaining trials will be used as test data. We can describe the number of spikes that a neuron produces in each window (denote the vectors of neural counts by $\textbf{d}_{plan}$ and $\textbf{d}_{move}$, where $\textbf{d}\in\mathbb{R}^{D}$ and $D$ is the number of neurons) using a wide variety of random process models. Use the training data to estimate the parameters of a target-direction-dependent vector Poisson process, i.e., $\Pr(\textbf{d}_n \mid \mathrm{target}_k) \sim \mathrm{Poisson}(\boldsymbol{\lambda}_k) = \prod_{d=1}^{D} \mathrm{Poisson}(\lambda_{k,d})$.
In other words, calculate a Poisson model for each neuron and class - the rate for a class will be a vector the size of the number of neurons. For each of the remaining test trials, estimate the maximum-log-likelihood reach target (one way to do this is sketched below). What is the overall decoding accuracy (i.e., what fraction of trials are decoded correctly) using only the plan window data? Using only the movement window data? Combining both plan and movement data? Why might these be different? Do the accuracies match your expectations? Why or why not?
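Below is a minimal sketch of one way to implement this decoder. It assumes a hypothetical length-1127 integer array targetLabels giving the reach target (0 through 7) of each trial; in the provided data this information may need to be extracted from cfr.

rng = np.random.default_rng(0)
nTargets = 8

# Randomly pick 50 training trials per target; the rest are test trials.
trainIdx = np.concatenate([
    rng.choice(np.where(targetLabels == k)[0], size=50, replace=False)
    for k in range(nTargets)])
testIdx = np.setdiff1d(np.arange(len(targetLabels)), trainIdx)

# ML estimate of each target's Poisson rate vector: the mean spike count of
# each neuron over that target's 50 training trials.
rates = np.stack([planSpikes[trainIdx][targetLabels[trainIdx] == k].mean(axis=0)
                  for k in range(nTargets)])           # nTargets x 190
rates = np.maximum(rates, 1e-6)                        # avoid log(0) for silent neurons

# Log-likelihood of each test trial under each target, dropping the log(d!)
# term since it is constant across targets:
#   log Pr(d | target_k) = sum_d [ d_d * log(lambda_{k,d}) - lambda_{k,d} ]
counts = planSpikes[testIdx]                           # nTest x 190
logL = counts @ np.log(rates).T - rates.sum(axis=1)    # nTest x nTargets
decoded = np.argmax(logL, axis=1)
print("Plan-window accuracy:", np.mean(decoded == targetLabels[testIdx]))

The same code decodes from the movement window by swapping in moveSpikes; if the two windows are modeled as independent, their likelihoods multiply, so combining plan and movement data amounts to summing the two log-likelihood matrices before the argmax.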
In your model in Problem 1, you decoded maximum-likelihood targets using all 190 neurons. Now, perform a "neuron dropping analysis". Starting with your full model, randomly eliminate neurons and evaluate decoding accuracy on the reduced dataset. Eliminate between 20 and 180 neurons in steps of 20 (so decode using between 10 and 190 neurons, in steps of 20 neurons). Average each point (number of neurons) over 20 random choices of which neurons to drop, to account for variability, since some neurons probably contribute more information than others. Plot decoding accuracy as a function of the number of neurons available to the decoder. Describe what you see and hypothesize why this might be the case. A sketch of this loop follows below.
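Below is a minimal sketch of the neuron-dropping loop, reusing trainIdx, testIdx, nTargets, and the hypothetical targetLabels from the decoder sketch above; decodeAccuracy refits the Poisson rates using only the kept neuron columns.

def decodeAccuracy(kept):
    # Refit rates and decode using only the neuron columns listed in `kept`
    X = planSpikes[:, kept]
    rates = np.stack([X[trainIdx][targetLabels[trainIdx] == k].mean(axis=0)
                      for k in range(nTargets)])
    rates = np.maximum(rates, 1e-6)
    logL = X[testIdx] @ np.log(rates).T - rates.sum(axis=1)
    return np.mean(np.argmax(logL, axis=1) == targetLabels[testIdx])

nNeurons = planSpikes.shape[1]        # 190
dropCounts = np.arange(0, 181, 20)    # drop 0, 20, ..., 180 neurons
meanAcc = []
for nDrop in dropCounts:
    # Average over 20 random choices of which neurons to drop
    accs = [decodeAccuracy(rng.choice(nNeurons, size=nNeurons - nDrop, replace=False))
            for _ in range(20)]
    meanAcc.append(np.mean(accs))

import matplotlib.pyplot as plt
plt.plot(nNeurons - dropCounts, meanAcc, 'o-')
plt.xlabel('Number of neurons available to the decoder')
plt.ylabel('Decoding accuracy')
plt.show()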