Deep Mouse Trap

Neural Regression, Representational Similarity, Model Zoology & Neural Taskonomy at Scale in Rodent Visual Cortex

Colin Conwell (Harvard University, https://colinconwell.github.io/), David Mayo (MIT CSAIL / CBMM), Michael A. Buice (Allen Institute for Brain Science), Boris Katz (MIT CSAIL / CBMM), George Alvarez (Harvard University, https://scorsese.wjh.harvard.edu/George/), Andrei Barbu (MIT CSAIL / CBMM)
December 8th, 2021

This article is an abridged summary of a longer work appearing at NeurIPS 2021, as well as a conceptual introduction to the Deep Mouse Trap GitHub repo.

Neurons in Vivo & Silico

What goes on in the brain of a mouse? It’s a seemingly simple question that belies a devilishly complicated scientific endeavor: understanding how the firing and wiring together of neurons in a nervous system produces intelligent behavior. The mouse is arguably the centerpiece of a modern neuroscientific praxis that has availed itself of everything from genetics to cybernetics, yet we still know very little about certain key aspects of its neural software. In this project, we’ll be looking at vision, and in particular the ways we’ve increasingly come to model it.

The relative paucity of models we have for characterizing the vision of mice is made all the more conspicuous by the relative excess of models we have for characterizing the vision of another paradigmatic lab animal: the monkey (and in particular the rhesus macaque). Over the last 5 years, our ability to characterize and predict the neural activity of macaque visual cortex has surged, in large part thanks to a singular class of model: object-recognizing deep neural networks. So powerful are these models that we can even use them as a sort of neural remote control, synthesizing visual stimuli that drive neural activity beyond the range evoked by handpicked natural images. The success of these models in predicting mouse visual cortex, on the other hand, has been a bit more modest, with some even suggesting that randomly initialized neural networks (neural networks that have never actually learned anything) are as predictive as trained ones – a particularly worrisome suggestion if we’d like to make the mechanistic claim that the neural activity we’re predicting has something to do with visual intelligence.

Here, we re-examine the state of neural network modeling in mouse visual cortex, using a massive optical physiology dataset of almost 6,600 highly reliable visual cortical neurons (courtesy of the Allen Brain Observatory), a large battery of neural networks (both trained and randomly initialized), and multiple methods of comparing the activity of those networks to the brain (including both representational similarity and linear mapping). Our intent in this is not necessarily to converge on the single best model of the mouse brain per se, but to better understand the kinds of pressures that shape the representations inherent to the models with greater or lesser neural predictivity.

The Approach

Neural Data & Models

We first preprocess the neural data such that we have the average response per neuron to each of the 119 natural scene images that were used by the Brain Observatory as a free-viewing probe. (The 6,619 neurons in our final set are a subsample of a larger set of 37,398 unique neurons, filtered for reliability.) Our neural sample includes neurons from 6 different cortical areas that span what has (neuroanatomically) been demarcated as the rodent ventral and dorsal visual pathways.
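For intuition, here is a minimal sketch of this kind of reliability filtering and trial averaging, assuming responses arranged as a neurons × images × trials array; the placeholder data and threshold below are illustrative, not the paper’s exact procedure:

```python
import numpy as np

# Hypothetical response array: (n_neurons, n_images, n_trials). In practice,
# these responses come from the Allen Brain Observatory (via the AllenSDK).
responses = np.random.rand(1000, 119, 50)

# Split-half reliability: correlate each neuron's mean responses across two
# random halves of trials.
rng = np.random.default_rng(seed=0)
order = rng.permutation(responses.shape[-1])
half_a = responses[..., order[: len(order) // 2]].mean(axis=-1)
half_b = responses[..., order[len(order) // 2 :]].mean(axis=-1)

def rowwise_pearson(a, b):
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    return (a * b).sum(axis=1) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))

reliability = rowwise_pearson(half_a, half_b)

# Keep only highly reliable neurons, then average over all trials; the
# threshold here is illustrative, not the paper's exact criterion.
mean_responses = responses[reliability > 0.8].mean(axis=-1)  # (n_reliable, 119)
```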

We then compare these responses systematically to the responses of artificial neurons across the layers of a variety of deep net models, selected deliberately to engender meaningful experimental foils we can use to answer thematic questions about representations in the mouse brain. These models include over 90 distinct architectures (e.g. ConvNets, transformers, MLP-Mixers), all trained on object classification with the ImageNet training set; the randomly initialized (untrained) versions of these same architectures; the 24 models of the Taskonomy project (all of which share a common encoder architecture as backbone); and 20 models (all ResNet50 architectures) trained on a variety of self-supervised tasks. A list of the models we use is available in the table below.

Model  Description
AlexNet AlexNet trained on image classification with the ImageNet dataset.
VGG11 VGG11 trained on image classification with the ImageNet dataset.
VGG13 VGG13 trained on image classification with the ImageNet dataset.
VGG16 VGG16 trained on image classification with the ImageNet dataset.
VGG19 VGG19 trained on image classification with the ImageNet dataset.
VGG11-BatchNorm VGG11-BatchNorm trained on image classification with the ImageNet dataset.
VGG13-BatchNorm VGG13-BatchNorm trained on image classification with the ImageNet dataset.
VGG16-BatchNorm VGG16-BatchNorm trained on image classification with the ImageNet dataset.
VGG19-BatchNorm VGG19-BatchNorm trained on image classification with the ImageNet dataset.
ResNet18 ResNet18 trained on image classification with the ImageNet dataset.
ResNet34 ResNet34 trained on image classification with the ImageNet dataset.
ResNet50 ResNet50 trained on image classification with the ImageNet dataset.
ResNet101 ResNet101 trained on image classification with the ImageNet dataset.
ResNet152 ResNet152 trained on image classification with the ImageNet dataset.
SqueezeNet1.0 SqueezeNet1.0 trained on image classification with the ImageNet dataset.
SqueezeNet1.1 SqueezeNet1.1 trained on image classification with the ImageNet dataset.
DenseNet121 DenseNet121 trained on image classification with the ImageNet dataset.
DenseNet161 DenseNet161 trained on image classification with the ImageNet dataset.
DenseNet169 DenseNet169 trained on image classification with the ImageNet dataset.
DenseNet201 DenseNet201 trained on image classification with the ImageNet dataset.
GoogleNet GoogleNet trained on image classification with the ImageNet dataset.
ShuffleNet-V2-x0.5 ShuffleNet-V2-x0.5 trained on image classification with the ImageNet dataset.
ShuffleNet-V2-x1.0 ShuffleNet-V2-x1.0 trained on image classification with the ImageNet dataset.
MobileNet-V2 MobileNet-V2 trained on image classification with the ImageNet dataset.
ResNext50-32x4D ResNext50-32x4D trained on image classification with the ImageNet dataset.
ResNext50-32x8D ResNext50-32x8D trained on image classification with the ImageNet dataset.
Wide-ResNet50 Wide-ResNet50 trained on image classification with the ImageNet dataset.
Wide-ResNet101 Wide-ResNet101 trained on image classification with the ImageNet dataset.
MNASNet0.5 MNASNet0.5 trained on image classification with the ImageNet dataset.
MNASNet1.0 MNASNet1.0 trained on image classification with the ImageNet dataset.
Inception-V3 Inception-V3 trained on image classification with the ImageNet dataset.
Autoencoder Image compression and decompression.
Object Classification 1000-way object classification (via knowledge distillation from ImageNet).
Scene Classification Scene classification (via knowledge distillation from MIT Places).
Curvatures Magnitude of 3D principal curvatures.
Denoising Uncorrupted version of corrupted image.
Euclidean Depth Depth estimation.
Z-Buffer Depth Depth estimation.
Occlusion Edges Edges which include parts of the scene.
Texture Edges Edges computed from RGB only (texture edges).
Egomotion Odometry (camera poses) given three input images.
Camera Pose (Fixated) Relative camera pose with matching optical centers.
Inpainting Filling in masked center of image.
Jigsaw Putting scrambled image pieces back together.
2D Keypoints Keypoint estimation from RGB-only (texture features).
3D Keypoints 3D Keypoint estimation from underlying scene 3D.
Camera Pose (Nonfixated) Relative camera pose with distinct optical centers.
Surface Normals Pixel-wise surface normals.
Point Matching Classifying if centers of two images match or not.
Reshading Reshading with new lighting placed at camera location.
Room Layout Orientation and aspect ratio of cubic room layout.
Semantic Segmentation Pixel-wise semantic labeling (via knowledge distillation from MS COCO).
Unsupervised 2.5D Segmentation Segmentation (graph cut approximation) on RGB-D-Normals-Curvature image.
Unsupervised 2D Segmentation Segmentation (graph cut approximation) on RGB.
Vanishing Point Three Manhattan-world vanishing points.
Random Weights Taskonomy architecture randomly initialized.
CaIT-S24 CaIT-S24 trained on image classification with the ImageNet dataset.
CoaT-Lite-Mini CoaT-Lite-Mini trained on image classification with the ImageNet dataset.
ConViT-B ConViT-B trained on image classification with the ImageNet dataset.
ConViT-S ConViT-S trained on image classification with the ImageNet dataset.
CSP-DarkNet53 CSP-DarkNet53 trained on image classification with the ImageNet dataset.
CSP-ResNet50 CSP-ResNet50 trained on image classification with the ImageNet dataset.
DLA34 DLA34 trained on image classification with the ImageNet dataset.
DLA169 DLA169 trained on image classification with the ImageNet dataset.
ECA-NFNeT-L0 ECA-NFNeT-L0 trained on image classification with the ImageNet dataset.
ECA-NFNeT-L1 ECA-NFNeT-L1 trained on image classification with the ImageNet dataset.
ECA-Resnet50-D ECA-Resnet50-D trained on image classification with the ImageNet dataset.
ECA-Resnet101-D ECA-Resnet101-D trained on image classification with the ImageNet dataset.
EfficientNet-V2-S EfficientNet-V2-S trained on image classification with the ImageNet dataset.
FBNetC100 FBNetC100 trained on image classification with the ImageNet dataset.
GerNet-L GerNet-L trained on image classification with the ImageNet dataset.
GerNet-S GerNet-S trained on image classification with the ImageNet dataset.
GhostNet100 GhostNet100 trained on image classification with the ImageNet dataset.
HardCoreNAS-A HardCoreNAS-A trained on image classification with the ImageNet dataset.
HardCoreNAS-F HardCoreNAS-F trained on image classification with the ImageNet dataset.
LeViT128 LeViT128 trained on image classification with the ImageNet dataset.
LeViT256 LeViT256 trained on image classification with the ImageNet dataset.
Inception-Resnet-V2 Inception-Resnet-V2 trained on image classification with the ImageNet dataset.
Inception-V3 Inception-V3 trained on image classification with the ImageNet dataset.
Inception-V4 Inception-V4 trained on image classification with the ImageNet dataset.
MLP-Mixer-B16 MLP-Mixer-B16 trained on image classification with the ImageNet dataset.
MLP-Mixer-L16 MLP-Mixer-L16 trained on image classification with the ImageNet dataset.
MixNet-L MixNet-L trained on image classification with the ImageNet dataset.
MixNet-S MixNet-S trained on image classification with the ImageNet dataset.
MNASNet100 MNASNet100 trained on image classification with the ImageNet dataset.
MobileNet-V3 MobileNet-V3 trained on image classification with the ImageNet dataset.
NASNet-A-Large NASNet-A-Large trained on image classification with the ImageNet dataset.
NF-ResNet50 NF-ResNet50 trained on image classification with the ImageNet dataset.
NF-Net-L0 NF-Net-L0 trained on image classification with the ImageNet dataset.
PNASNet-5-Large PNASNet-5-Large trained on image classification with the ImageNet dataset.
RegNetX-64 RegNetX-64 trained on image classification with the ImageNet dataset.
RegNetY-64 RegNetY-64 trained on image classification with the ImageNet dataset.
RepVGG-B3 RepVGG-B3 trained on image classification with the ImageNet dataset.
RepVGG-B3G4 RepVGG-B3G4 trained on image classification with the ImageNet dataset.
Res2Net50-26W-4S Res2Net50-26W-4S trained on image classification with the ImageNet dataset.
ResNest50D ResNest50D trained on image classification with the ImageNet dataset.
ResNetRS50 ResNetRS50 trained on image classification with the ImageNet dataset.
RexNet100 RexNet100 trained on image classification with the ImageNet dataset.
SemNASNet100 SemNASNet100 trained on image classification with the ImageNet dataset.
SEResNet152D SEResNet152D trained on image classification with the ImageNet dataset.
SEResNext50-32x4D SEResNext50-32x4D trained on image classification with the ImageNet dataset.
SKResNet18 SKResNet18 trained on image classification with the ImageNet dataset.
SKResNext50-32x4D SKResNext50-32x4D trained on image classification with the ImageNet dataset.
SPNasNet100 SPNasNet100 trained on image classification with the ImageNet dataset.
Swin-B-P4-W7-224 Swin-B-P4-W7-224 trained on image classification with the ImageNet dataset.
Swin-L-P4-W7-224 Swin-L-P4-W7-224 trained on image classification with the ImageNet dataset.
Swin-S-P4-W7-224 Swin-S-P4-W7-224 trained on image classification with the ImageNet dataset.
EfficientNet-B1 EfficientNet-B1 trained on image classification with the ImageNet dataset.
EfficientNet-B3 EfficientNet-B3 trained on image classification with the ImageNet dataset.
EfficientNet-B5 EfficientNet-B5 trained on image classification with the ImageNet dataset.
EfficientNet-B7 EfficientNet-B7 trained on image classification with the ImageNet dataset.
Visformer Visformer trained on image classification with the ImageNet dataset.
ViT-L-P16-224 ViT-L-P16-224 trained on image classification with the ImageNet dataset.
ViT-S-P16-224 ViT-S-P16-224 trained on image classification with the ImageNet dataset.
ViT-B-P16-224 ViT-B-P16-224 trained on image classification with the ImageNet dataset.
XCeption XCeption trained on image classification with the ImageNet dataset.
XCeption65 XCeption65 trained on image classification with the ImageNet dataset.
ResNet50-JigSaw-P100 ResNet50-JigSaw-P100 trained via self supervision with the ImageNet dataset.
ResNet50-JigSaw-Goyal19 ResNet50-JigSaw-Goyal19 trained via self supervision with the ImageNet dataset.
ResNet50-RotNet ResNet50-RotNet trained via self supervision with the ImageNet dataset.
ResNet50-ClusterFit-16K-RotNet ResNet50-ClusterFit-16K-RotNet trained via self supervision with the ImageNet dataset.
ResNet50-NPID-4KNegative ResNet50-NPID-4KNegative trained via self supervision with the ImageNet dataset.
ResNet50-PIRL ResNet50-PIRL trained via self supervision with the ImageNet dataset.
ResNet50-SimCLR ResNet50-SimCLR trained via self supervision with the ImageNet dataset.
ResNet50-DeepClusterV2 ResNet50-DeepClusterV2-2x224 trained via self supervision with the ImageNet dataset.
ResNet50-DeepClusterV2 ResNet50-DeepClusterV2-2x224+6x96 trained via self supervision with the ImageNet dataset.
ResNet50-SwAV-BS4096 ResNet50-SwAV-BS4096-2x224 trained via self supervision with the ImageNet dataset.
ResNet50-SwAV-BS4096 ResNet50-SwAV-BS4096-2x224+6x96 trained via self supervision with the ImageNet dataset.
ResNet50-MoCoV2-BS256 ResNet50-MoCoV2-BS256 trained via self supervision with the ImageNet dataset.
ResNet50-BarlowTwins-BS2048 ResNet50-BarlowTwins-BS2048 trained via self supervision with the ImageNet dataset.
Dino-VIT-S16 Dino-VIT-S16 trained via self supervision with the ImageNet dataset.
Dino-VIT-S8 Dino-VIT-S8 trained via self supervision with the ImageNet dataset.
Dino-VIT-B16 Dino-VIT-B16 trained via self supervision with the ImageNet dataset.
Dino-VIT-B8 Dino-VIT-B8 trained via self supervision with the ImageNet dataset.
Dino-XCIT-S12-P16 Dino-XCIT-S12-P16 trained via self supervision with the ImageNet dataset.
Dino-XCIT-S12-P8 Dino-XCIT-S12-P8 trained via self supervision with the ImageNet dataset.
Dino-XCIT-M24-P16 Dino-XCIT-M24-P16 trained via self supervision with the ImageNet dataset.
Dino-XCIT-M24-P8 Dino-XCIT-M24-P8 trained via self supervision with the ImageNet dataset.
Dino-ResNet50 Dino-ResNet50 trained via self supervision with the ImageNet dataset.
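Before any brain comparison, each model must be run on the 119 scenes to collect its artificial neural responses. Below is a minimal sketch with torchvision, pairing a trained model with its randomly initialized twin and grabbing one intermediate layer via a forward hook; the architecture and layer choice here are illustrative:

```python
import torch
from torchvision import models

# A trained model and its randomly initialized twin (same architecture).
trained = models.resnet18(pretrained=True).eval()
untrained = models.resnet18(pretrained=False).eval()

features = {}

def save_features(name):
    def hook(module, inputs, output):
        # Flatten each image's activation map into one long feature vector.
        features[name] = output.flatten(start_dim=1).detach()
    return hook

# Hook one intermediate layer; in the full survey, every layer is assessed.
trained.layer3.register_forward_hook(save_features('layer3'))

stimuli = torch.rand(119, 3, 224, 224)  # stand-in for the 119 natural scenes
with torch.no_grad():
    trained(stimuli)

layer3_features = features['layer3']  # (119, n_artificial_neurons)
```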

Mapping Methods

Equipped with our neural data and models, we then employ two distinct methods for mapping the responses of our biological neurons to the responses of our artificial neurons. The first, often called classic representational similarity analysis, is designed to assess representational structure (sometimes referred to as representational geometry) at the level of neural populations – in our case, the neural populations of 6 different visual cortical areas. The key component of representational similarity analysis is the representational (dis)similarity matrix (RDM): a distance matrix computed by taking the pairwise distance (1 - Pearson correlation coefficient, in our case) of each stimulus to every other stimulus across all neural responses in the target population. A given model’s neural predictivity score in this classic representational similarity analysis is simply the average second-order distance (in our case, another 1 - Pearson correlation coefficient) between the RDM of its maximally correspondent layer and each of the cortical RDMs in our sample. Note that this kind of classic representational similarity analysis is a nonparametric mapping, and requires no fits or transformations – just an emergent similarity in how stimuli are organized across the responses of the two neural populations (one artificial, one biological) being compared.
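As a concrete (if minimal) sketch of this analysis, assuming stimulus-by-neuron response matrices, the snippet below computes RDMs and the second-order distance between them; the placeholder data and variable names are ours, not the repo’s exact API:

```python
import numpy as np
from scipy.stats import pearsonr

def compute_rdm(responses):
    """responses: (n_stimuli, n_neurons) -> RDM of pairwise 1 - Pearson r between stimuli."""
    return 1.0 - np.corrcoef(responses)

def second_order_distance(rdm_a, rdm_b):
    """1 - Pearson r between the upper triangles of two RDMs (lower = more similar)."""
    upper = np.triu_indices_from(rdm_a, k=1)
    return 1.0 - pearsonr(rdm_a[upper], rdm_b[upper])[0]

# Placeholder responses for one cortical area and two model layers.
brain_responses = np.random.rand(119, 500)
layer_responses = [np.random.rand(119, 1024), np.random.rand(119, 2048)]

brain_rdm = compute_rdm(brain_responses)
# The model's score comes from its maximally correspondent (best) layer.
best_distance = min(second_order_distance(compute_rdm(l), brain_rdm) for l in layer_responses)
```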

Rather than target an entire neural population simultaneously, we can also more closely scrutinize individual neural responses using a method broadly called neural encoding or neural regression. With this method, we take the artificial neural responses of our model as the set of predictors in a regression where we try to predict (always with some form of cross-validation) the responses of a biological neuron to images not included in the regression. What we’re effectively doing in this method is mixing and matching our set of artificial neurons (often with some sort of dimensionality reduction along the way) to approximate the representational profile of a single biological neuron. The better suited those artificial neurons are to this mixing and matching (which we measure with a correlation between the neural responses predicted by the regression and the actual responses of the target neuron), the higher the score of the model that hosts them. (A schematic of our neural regression method can be seen in the figure below.)
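Here is a minimal sketch of this encoding pipeline using scikit-learn; the dimensionality reduction, ridge penalty, and fold count are illustrative stand-ins, not the paper’s exact settings:

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

# Placeholder data: X holds one model layer's responses to the 119 images,
# y holds one biological neuron's trial-averaged responses to the same images.
X = np.random.rand(119, 4096)
y = np.random.rand(119)

predictions = np.zeros_like(y)
for train_idx, test_idx in KFold(n_splits=4, shuffle=True, random_state=0).split(X):
    # Dimensionality reduction fit on the training folds only.
    pca = PCA(n_components=40).fit(X[train_idx])
    ridge = Ridge(alpha=1.0).fit(pca.transform(X[train_idx]), y[train_idx])
    predictions[test_idx] = ridge.predict(pca.transform(X[test_idx]))

# The neuron's predictivity score: correlation of held-out predictions with truth.
score = pearsonr(predictions, y)[0]
```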

Results

Model Rankings

Combining our neural data, models, and mapping methods, an intuitive first result we obtain is a large set of model rankings, which in the plots below we’ve organized broadly into one of three categories.

ImageNet Architectures

Taskonomy Models

Self-Supervised Models

These rankings may seem like an end in and of themselves, but scrutinize them without context for more than a few minutes and it will (in all probability) quickly become clear that they are largely insufficient for meaningful insight.

What we need, in their place or at least as a supplement, are targeted contrasts between certain kinds of models that directly arbitrate on questions (theoretical or practical) we have about the mouse brain. In the section below, we show a few examples of the kinds of questions a neural modeling survey of this nature allows us to address when we choose models we can meaningfully separate into subgroups.

Contrasts + Questions

Does training matter?

A first question we might ask, and one we’ve deliberately designed our survey of models to assess, is whether the kind of learning a deep neural network does when trained on a task like object recognition actually matters for the prediction of the mouse brain. The answer may seem a trivial and intuitive yes, but some prior work suggests something far less simple. There are, for example, a number of cases in nature of perceptual machineries that (unlike the highly structured, hierarchical primate visual cortex) are characterized by somewhat chaotic, random connectivity – rodent olfactory cortex being a prime example among them. That rodent visual cortex may itself be defined by this kind of randomness is thus not unfathomable. Compound this with the finding of a prominent research team a few years ago that a classic convolutional neural network model called VGG16 is just as predictive of the mouse brain before training as after, and it becomes entirely possible that the first major divergence in the deep neural network modeling of mouse versus monkey brains is the necessity of training – in other words, of task optimization.

How, then, does our modeling arbitrate on the possibility that mouse visual cortex is effectively a randomly initialized neural network?

In brief, it strongly suggests it isn’t. In the plot above, each line represents the difference in score between a model trained on ImageNet and its randomly initialized counterpart. In both our neural encoding and representational similarity methods, pretrained models strongly outperform their untrained counterparts – in representational similarity, this effect is so strong as to be categorical.

The finding here underscores an important point about deep neural network modeling more generally: while individual models can be informative, it’s often worthwhile to assess many models in the aggregate. We do replicate the field’s previous finding that a randomly initialized VGG16 does just as well as (if not better than) an ImageNet-trained VGG16, but we also demonstrate that this effect is far from consistent across architectures.
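As a sketch of what such an aggregate assessment might look like, the snippet below takes one score per architecture for the trained and untrained variants and summarizes the paired effect with a nonparametric test; the scores themselves are placeholders for the outputs of the mapping methods above:

```python
import numpy as np
from scipy.stats import wilcoxon

# One score per architecture, paired: the same architecture trained vs.
# randomly initialized. Placeholder values, not actual results.
rng = np.random.default_rng(seed=0)
trained_scores = rng.uniform(0.3, 0.8, size=90)
untrained_scores = rng.uniform(0.1, 0.6, size=90)

differences = trained_scores - untrained_scores
statistic, p_value = wilcoxon(trained_scores, untrained_scores)  # paired, nonparametric
print(f'median paired difference: {np.median(differences):.3f}, p = {p_value:.2g}')
```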

All in all, this result suggests some sort of task optimization is key to predicting the mouse brain. To better assess what kind of task optimization matters, though, we turn to the other models in our survey.

Does object recognition matter?

The overarching goal in much of the neural benchmarking literature to date has been a deeper understanding of visual object recognition in the ventral stream of macaque visual cortex. This has naturally entailed a focus on deep convolutional network models of object recognition as the natural (and empirically reified) gold standard of neural and behavioral predictivity.

In a species like the mouse, however, the centrality of object recognition as an ethologically relevant task is immediately suspect. Mice do recognize objects, it seems, and do use their visual systems to ground and calibrate advanced behaviors like hunting insects… but there are a number of reasons (both first-principles and observational) to believe object recognition might not be the apex model when it comes to explaining how mouse visual cortex is organized.

So what other options are available? Immediately proximate to object recognition, but obviating the need for category labels, are self-supervised, contrastive learning models. Instead of learning invariances indirectly by dint of discriminating object classes, these models are often taught invariances directly – learning, for example, to treat a single image and various transformations of that image (rotations, dilations, reilluminations, translations) as the same, all without explicit category supervision.

These models, then, allow us to ask whether the kinds of predictivity we see from trained models in the mouse brain are really about objects per se, or more about the fine-grained, separable, invariant representations learnable through both category supervision and self-supervision. The answer, it seems, is closer to the latter. Modern contrastive learning algorithms (e.g. BarlowTwins, SimCLR, SwAV) often do just as well as category-supervised models when controlling for the effects of architecture. Object recognition as an explicit task, then, may be sufficient, but is (in the end) unnecessary to capture the responses of mouse visual cortex to naturalistic stimuli. (The comparison of category-supervised and self-supervised models is available in the 3rd tab of the ranking plots above, and across the various bar plots of the figure below.)

Less proximate to object recognition, of course, are a whole variety of alternative computer vision tasks that necessitate an arguably kaleidoscopic variety of latent representations. In our survey, this diversity is captured most succinctly (and with the most empirical control) by the Taskonomy encoders: a single encoder architecture deployed across 24 different canonical computer vision tasks ranging from autoencoding to edge detection. While trained on a rather limited visual diet (video tours of model houses), these models nonetheless allow us to triangulate the contributions of different kinds of training procedures to neural predictivity.

Clusters of these tasks (organized in terms of how well representations from one task transfer to the others) are depicted in the figure above. (Descriptions of individual tasks, alongside their clusters, may be found in the table below.) The facets in each of these plots are the six cortical areas assayed in our neural data. The individual lines atop each bar mark the best-performing model from that particular subgroup of models. The ‘intermouse’ predictivity score (calculable only with the neural encoding metric) is a measure of how well we can do when swapping out the artificial neural network models in our pipeline for other mice. (In other words, how well can we do when modeling one mouse if our model were an aggregate of its conspecifics?)
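For intuition, here is a rough sketch of how such an intermouse score might be computed, reusing the cross-validated regression sketched earlier with conspecific neurons standing in for model features; the data structure and helper function are hypothetical:

```python
import numpy as np

# Hypothetical data structure: mouse_id -> (119 images, n_neurons) responses.
mouse_responses = {m: np.random.rand(119, 200) for m in ('mouse_a', 'mouse_b', 'mouse_c')}

def intermouse_scores(mouse_responses, target_id, score_neuron):
    """Predict each of the target mouse's neurons from all other mice's neurons.

    score_neuron(X, y) should run the cross-validated regression sketched
    earlier and return a single predictivity score (a Pearson correlation).
    """
    X = np.hstack([r for mid, r in mouse_responses.items() if mid != target_id])
    Y = mouse_responses[target_id]
    return [score_neuron(X, Y[:, i]) for i in range(Y.shape[1])]
```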

The major takeaway from this figure (and from our results more generally) is that object recognition (a member of the ‘Semantic’ task cluster) is by no means the only contender in the game of neural predictivity. In fact, a ‘2D’ task – unsupervised segmentation – is one of the overall best models in our survey, and is perhaps the more intuitive task for a visual system like the mouse’s, configured as it most likely is for navigation and foraging in the dark.

Model  Task Cluster  Description
AlexNet Semantic AlexNet trained on image classification with the ImageNet dataset.
VGG11 Semantic VGG11 trained on image classification with the ImageNet dataset.
VGG13 Semantic VGG13 trained on image classification with the ImageNet dataset.
VGG16 Semantic VGG16 trained on image classification with the ImageNet dataset.
VGG19 Semantic VGG19 trained on image classification with the ImageNet dataset.
VGG11-BatchNorm Semantic VGG11-BatchNorm trained on image classification with the ImageNet dataset.
VGG13-BatchNorm Semantic VGG13-BatchNorm trained on image classification with the ImageNet dataset.
VGG16-BatchNorm Semantic VGG16-BatchNorm trained on image classification with the ImageNet dataset.
VGG19-BatchNorm Semantic VGG19-BatchNorm trained on image classification with the ImageNet dataset.
ResNet18 Semantic ResNet18 trained on image classification with the ImageNet dataset.
ResNet34 Semantic ResNet34 trained on image classification with the ImageNet dataset.
ResNet50 Semantic ResNet50 trained on image classification with the ImageNet dataset.
ResNet101 Semantic ResNet101 trained on image classification with the ImageNet dataset.
ResNet152 Semantic ResNet152 trained on image classification with the ImageNet dataset.
SqueezeNet1.0 Semantic SqueezeNet1.0 trained on image classification with the ImageNet dataset.
SqueezeNet1.1 Semantic SqueezeNet1.1 trained on image classification with the ImageNet dataset.
DenseNet121 Semantic DenseNet121 trained on image classification with the ImageNet dataset.
DenseNet161 Semantic DenseNet161 trained on image classification with the ImageNet dataset.
DenseNet169 Semantic DenseNet169 trained on image classification with the ImageNet dataset.
DenseNet201 Semantic DenseNet201 trained on image classification with the ImageNet dataset.
GoogleNet Semantic GoogleNet trained on image classification with the ImageNet dataset.
ShuffleNet-V2-x0.5 Semantic ShuffleNet-V2-x0.5 trained on image classification with the ImageNet dataset.
ShuffleNet-V2-x1.0 Semantic ShuffleNet-V2-x1.0 trained on image classification with the ImageNet dataset.
MobileNet-V2 Semantic MobileNet-V2 trained on image classification with the ImageNet dataset.
ResNext50-32x4D Semantic ResNext50-32x4D trained on image classification with the ImageNet dataset.
ResNext50-32x8D Semantic ResNext50-32x8D trained on image classification with the ImageNet dataset.
Wide-ResNet50 Semantic Wide-ResNet50 trained on image classification with the ImageNet dataset.
Wide-ResNet101 Semantic Wide-ResNet101 trained on image classification with the ImageNet dataset.
MNASNet0.5 Semantic MNASNet0.5 trained on image classification with the ImageNet dataset.
MNASNet1.0 Semantic MNASNet1.0 trained on image classification with the ImageNet dataset.
Inception-V3 Semantic Inception-V3 trained on image classification with the ImageNet dataset.
AlexNet Semantic AlexNet randomly initialized, with no training.
VGG11 Semantic VGG11 randomly initialized, with no training.
VGG13 Semantic VGG13 randomly initialized, with no training.
VGG16 Semantic VGG16 randomly initialized, with no training.
VGG19 Semantic VGG19 randomly initialized, with no training.
VGG11-BatchNorm Semantic VGG11-BatchNorm randomly initialized, with no training.
VGG13-BatchNorm Semantic VGG13-BatchNorm randomly initialized, with no training.
VGG16-BatchNorm Semantic VGG16-BatchNorm randomly initialized, with no training.
VGG19-BatchNorm Semantic VGG19-BatchNorm randomly initialized, with no training.
ResNet18 Semantic ResNet18 randomly initialized, with no training.
ResNet34 Semantic ResNet34 randomly initialized, with no training.
ResNet50 Semantic ResNet50 randomly initialized, with no training.
ResNet101 Semantic ResNet101 randomly initialized, with no training.
ResNet152 Semantic ResNet152 randomly initialized, with no training.
SqueezeNet1.0 Semantic SqueezeNet1.0 randomly initialized, with no training.
SqueezeNet1.1 Semantic SqueezeNet1.1 randomly initialized, with no training.
DenseNet121 Semantic DenseNet121 randomly initialized, with no training.
DenseNet161 Semantic DenseNet161 randomly initialized, with no training.
DenseNet169 Semantic DenseNet169 randomly initialized, with no training.
DenseNet201 Semantic DenseNet201 randomly initialized, with no training.
GoogleNet Semantic GoogleNet randomly initialized, with no training.
ShuffleNet-V2-x0.5 Semantic ShuffleNet-V2-x0.5 randomly initialized, with no training.
ShuffleNet-V2-x1.0 Semantic ShuffleNet-V2-x1.0 randomly initialized, with no training.
MobileNet-V2 Semantic MobileNet-V2 randomly initialized, with no training.
ResNext50-32x4D Semantic ResNext50-32x4D randomly initialized, with no training.
ResNext50-32x8D Semantic ResNext50-32x8D randomly initialized, with no training.
Wide-ResNet50 Semantic Wide-ResNet50 randomly initialized, with no training.
Wide-ResNet101 Semantic Wide-ResNet101 randomly initialized, with no training.
MNASNet0.5 Semantic MNASNet0.5 randomly initialized, with no training.
MNASNet1.0 Semantic MNASNet1.0 randomly initialized, with no training.
Inception-V3 Semantic Inception-V3 randomly initialized, with no training.
CaIT-S24 Semantic CaIT-S24 trained on image classification with the ImageNet dataset.
CoaT-Lite-Mini Semantic CoaT-Lite-Mini trained on image classification with the ImageNet dataset.
ConViT-B Semantic ConViT-B trained on image classification with the ImageNet dataset.
ConViT-S Semantic ConViT-S trained on image classification with the ImageNet dataset.
CSP-DarkNet53 Semantic CSP-DarkNet53 trained on image classification with the ImageNet dataset.
CSP-ResNet50 Semantic CSP-ResNet50 trained on image classification with the ImageNet dataset.
DLA34 Semantic DLA34 trained on image classification with the ImageNet dataset.
DLA169 Semantic DLA169 trained on image classification with the ImageNet dataset.
ECA-NFNeT-L0 Semantic ECA-NFNeT-L0 trained on image classification with the ImageNet dataset.
ECA-NFNeT-L1 Semantic ECA-NFNeT-L1 trained on image classification with the ImageNet dataset.
ECA-Resnet50-D Semantic ECA-Resnet50-D trained on image classification with the ImageNet dataset.
ECA-Resnet101-D Semantic ECA-Resnet101-D trained on image classification with the ImageNet dataset.
EfficientNet-V2-S Semantic EfficientNet-V2-S trained on image classification with the ImageNet dataset.
FBNetC100 Semantic FBNetC100 trained on image classification with the ImageNet dataset.
GerNet-L Semantic GerNet-L trained on image classification with the ImageNet dataset.
GerNet-S Semantic GerNet-S trained on image classification with the ImageNet dataset.
GhostNet100 Semantic GhostNet100 trained on image classification with the ImageNet dataset.
HardCoreNAS-A Semantic HardCoreNAS-A trained on image classification with the ImageNet dataset.
HardCoreNAS-F Semantic HardCoreNAS-F trained on image classification with the ImageNet dataset.
LeViT128 Semantic LeViT128 trained on image classification with the ImageNet dataset.
LeViT256 Semantic LeViT256 trained on image classification with the ImageNet dataset.
Inception-Resnet-V2 Semantic Inception-Resnet-V2 trained on image classification with the ImageNet dataset.
Inception-V3 Semantic Inception-V3 trained on image classification with the ImageNet dataset.
Inception-V4 Semantic Inception-V4 trained on image classification with the ImageNet dataset.
MLP-Mixer-B16 Semantic MLP-Mixer-B16 trained on image classification with the ImageNet dataset.
MLP-Mixer-L16 Semantic MLP-Mixer-L16 trained on image classification with the ImageNet dataset.
MixNet-L Semantic MixNet-L trained on image classification with the ImageNet dataset.
MixNet-S Semantic MixNet-S trained on image classification with the ImageNet dataset.
MNASNet100 Semantic MNASNet100 trained on image classification with the ImageNet dataset.
MobileNet-V3 Semantic MobileNet-V3 trained on image classification with the ImageNet dataset.
NASNet-A-Large Semantic NASNet-A-Large trained on image classification with the ImageNet dataset.
NF-ResNet50 Semantic NF-ResNet50 trained on image classification with the ImageNet dataset.
NF-Net-L0 Semantic NF-Net-L0 trained on image classification with the ImageNet dataset.
PNASNet-5-Large Semantic PNASNet-5-Large trained on image classification with the ImageNet dataset.
RegNetX-64 Semantic RegNetX-64 trained on image classification with the ImageNet dataset.
RegNetY-64 Semantic RegNetY-64 trained on image classification with the ImageNet dataset.
RepVGG-B3 Semantic RepVGG-B3 trained on image classification with the ImageNet dataset.
RepVGG-B3G4 Semantic RepVGG-B3G4 trained on image classification with the ImageNet dataset.
Res2Net50-26W-4S Semantic Res2Net50-26W-4S trained on image classification with the ImageNet dataset.
ResNest50D Semantic ResNest50D trained on image classification with the ImageNet dataset.
ResNetRS50 Semantic ResNetRS50 trained on image classification with the ImageNet dataset.
RexNet100 Semantic RexNet100 trained on image classification with the ImageNet dataset.
SemNASNet100 Semantic SemNASNet100 trained on image classification with the ImageNet dataset.
SEResNet152D Semantic SEResNet152D trained on image classification with the ImageNet dataset.
SEResNext50-32x4D Semantic SEResNext50-32x4D trained on image classification with the ImageNet dataset.
SKResNet18 Semantic SKResNet18 trained on image classification with the ImageNet dataset.
SKResNext50-32x4D Semantic SKResNext50-32x4D trained on image classification with the ImageNet dataset.
SPNasNet100 Semantic SPNasNet100 trained on image classification with the ImageNet dataset.
Swin-B-P4-W7-224 Semantic Swin-B-P4-W7-224 trained on image classification with the ImageNet dataset.
Swin-L-P4-W7-224 Semantic Swin-L-P4-W7-224 trained on image classification with the ImageNet dataset.
Swin-S-P4-W7-224 Semantic Swin-S-P4-W7-224 trained on image classification with the ImageNet dataset.
EfficientNet-B1 Semantic EfficientNet-B1 trained on image classification with the ImageNet dataset.
EfficientNet-B3 Semantic EfficientNet-B3 trained on image classification with the ImageNet dataset.
EfficientNet-B5 Semantic EfficientNet-B5 trained on image classification with the ImageNet dataset.
EfficientNet-B7 Semantic EfficientNet-B7 trained on image classification with the ImageNet dataset.
Visformer Semantic Visformer trained on image classification with the ImageNet dataset.
ViT-L-P16-224 Semantic ViT-L-P16-224 trained on image classification with the ImageNet dataset.
ViT-S-P16-224 Semantic ViT-S-P16-224 trained on image classification with the ImageNet dataset.
ViT-B-P16-224 Semantic ViT-B-P16-224 trained on image classification with the ImageNet dataset.
XCeption Semantic XCeption trained on image classification with the ImageNet dataset.
XCeption65 Semantic XCeption65 trained on image classification with the ImageNet dataset.
CaIT-S24 Semantic CaIT-S24 randomly initialized, with no training.
CoaT-Lite-Mini Semantic CoaT-Lite-Mini randomly initialized, with no training.
ConViT-B Semantic ConViT-B randomly initialized, with no training.
ConViT-S Semantic ConViT-S randomly initialized, with no training.
CSP-DarkNet53 Semantic CSP-DarkNet53 randomly initialized, with no training.
CSP-ResNet50 Semantic CSP-ResNet50 randomly initialized, with no training.
DLA34 Semantic DLA34 randomly initialized, with no training.
DLA169 Semantic DLA169 randomly initialized, with no training.
ECA-NFNeT-L0 Semantic ECA-NFNeT-L0 randomly initialized, with no training.
ECA-NFNeT-L1 Semantic ECA-NFNeT-L1 randomly initialized, with no training.
ECA-Resnet50-D Semantic ECA-Resnet50-D randomly initialized, with no training.
ECA-Resnet101-D Semantic ECA-Resnet101-D randomly initialized, with no training.
EfficientNet-V2-S Semantic EfficientNet-V2-S randomly initialized, with no training.
FBNetC100 Semantic FBNetC100 randomly initialized, with no training.
GerNet-L Semantic GerNet-L randomly initialized, with no training.
GerNet-S Semantic GerNet-S randomly initialized, with no training.
GhostNet100 Semantic GhostNet100 randomly initialized, with no training.
HardCoreNAS-A Semantic HardCoreNAS-A randomly initialized, with no training.
HardCoreNAS-F Semantic HardCoreNAS-F randomly initialized, with no training.
LeViT128 Semantic LeViT128 randomly initialized, with no training.
LeViT256 Semantic LeViT256 randomly initialized, with no training.
Inception-Resnet-V2 Semantic Inception-Resnet-V2 randomly initialized, with no training.
Inception-V3 Semantic Inception-V3 randomly initialized, with no training.
Inception-V4 Semantic Inception-V4 randomly initialized, with no training.
MLP-Mixer-B16 Semantic MLP-Mixer-B16 randomly initialized, with no training.
MLP-Mixer-L16 Semantic MLP-Mixer-L16 randomly initialized, with no training.
MixNet-L Semantic MixNet-L randomly initialized, with no training.
MixNet-S Semantic MixNet-S randomly initialized, with no training.
MNASNet100 Semantic MNASNet100 randomly initialized, with no training.
MobileNet-V3 Semantic MobileNet-V3 randomly initialized, with no training.
NASNet-A-Large Semantic NASNet-A-Large randomly initialized, with no training.
NF-ResNet50 Semantic NF-ResNet50 randomly initialized, with no training.
NF-Net-L0 Semantic NF-Net-L0 randomly initialized, with no training.
PNASNet-5-Large Semantic PNASNet-5-Large randomly initialized, with no training.
RegNetX-64 Semantic RegNetX-64 randomly initialized, with no training.
RegNetY-64 Semantic RegNetY-64 randomly initialized, with no training.
RepVGG-B3 Semantic RepVGG-B3 randomly initialized, with no training.
RepVGG-B3G4 Semantic RepVGG-B3G4 randomly initialized, with no training.
Res2Net50-26W-4S Semantic Res2Net50-26W-4S randomly initialized, with no training.
ResNest50D Semantic ResNest50D randomly initialized, with no training.
ResNetRS50 Semantic ResNetRS50 randomly initialized, with no training.
RexNet100 Semantic RexNet100 randomly initialized, with no training.
SemNASNet100 Semantic SemNASNet100 randomly initialized, with no training.
SEResNet152D Semantic SEResNet152D randomly initialized, with no training.
SEResNext50-32x4D Semantic SEResNext50-32x4D randomly initialized, with no training.
SKResNet18 Semantic SKResNet18 randomly initialized, with no training.
SKResNext50-32x4D Semantic SKResNext50-32x4D randomly initialized, with no training.
SPNasNet100 Semantic SPNasNet100 randomly initialized, with no training.
Swin-B-P4-W7-224 Semantic Swin-B-P4-W7-224 randomly initialized, with no training.
Swin-L-P4-W7-224 Semantic Swin-L-P4-W7-224 randomly initialized, with no training.
Swin-S-P4-W7-224 Semantic Swin-S-P4-W7-224 randomly initialized, with no training.
EfficientNet-B1 Semantic EfficientNet-B1 randomly initialized, with no training.
EfficientNet-B3 Semantic EfficientNet-B3 randomly initialized, with no training.
EfficientNet-B5 Semantic EfficientNet-B5 randomly initialized, with no training.
EfficientNet-B7 Semantic EfficientNet-B7 randomly initialized, with no training.
Visformer Semantic Visformer randomly initialized, with no training.
ViT-L-P16-224 Semantic ViT-L-P16-224 randomly initialized, with no training.
ViT-S-P16-224 Semantic ViT-S-P16-224 randomly initialized, with no training.
ViT-B-P16-224 Semantic ViT-B-P16-224 randomly initialized, with no training.
XCeption Semantic XCeption randomly initialized, with no training.
XCeption65 Semantic XCeption65 randomly initialized, with no training.
ResNet50-JigSaw-P100 SelfSupervised ResNet50-JigSaw-P100 trained via self supervision with the ImageNet dataset.
ResNet50-JigSaw-Goyal19 SelfSupervised ResNet50-JigSaw-Goyal19 trained via self supervision with the ImageNet dataset.
ResNet50-RotNet SelfSupervised ResNet50-RotNet trained via self supervision with the ImageNet dataset.
ResNet50-ClusterFit-16K-RotNet SelfSupervised ResNet50-ClusterFit-16K-RotNet trained via self supervision with the ImageNet dataset.
ResNet50-NPID-4KNegative SelfSupervised ResNet50-NPID-4KNegative trained via self supervision with the ImageNet dataset.
ResNet50-PIRL SelfSupervised ResNet50-PIRL trained via self supervision with the ImageNet dataset.
ResNet50-SimCLR SelfSupervised ResNet50-SimCLR trained via self supervision with the ImageNet dataset.
ResNet50-DeepClusterV2 SelfSupervised ResNet50-DeepClusterV2-2x224 trained via self supervision with the ImageNet dataset.
ResNet50-DeepClusterV2 SelfSupervised ResNet50-DeepClusterV2-2x224+6x96 trained via self supervision with the ImageNet dataset.
ResNet50-SwAV-BS4096 SelfSupervised ResNet50-SwAV-BS4096-2x224 trained via self supervision with the ImageNet dataset.
ResNet50-SwAV-BS4096 SelfSupervised ResNet50-SwAV-BS4096-2x224+6x96 trained via self supervision with the ImageNet dataset.
ResNet50-MoCoV2-BS256 SelfSupervised ResNet50-MoCoV2-BS256 trained via self supervision with the ImageNet dataset.
ResNet50-BarlowTwins-BS2048 SelfSupervised ResNet50-BarlowTwins-BS2048 trained via self supervision with the ImageNet dataset.
Dino-VIT-S16 SelfSupervised Dino-VIT-S16 trained via self supervision with the ImageNet dataset.
Dino-VIT-S8 SelfSupervised Dino-VIT-S8 trained via self supervision with the ImageNet dataset.
Dino-VIT-B16 SelfSupervised Dino-VIT-B16 trained via self supervision with the ImageNet dataset.
Dino-VIT-B8 SelfSupervised Dino-VIT-B8 trained via self supervision with the ImageNet dataset.
Dino-XCIT-S12-P16 SelfSupervised Dino-XCIT-S12-P16 trained via self supervision with the ImageNet dataset.
Dino-XCIT-S12-P8 SelfSupervised Dino-XCIT-S12-P8 trained via self supervision with the ImageNet dataset.
Dino-XCIT-M24-P16 SelfSupervised Dino-XCIT-M24-P16 trained via self supervision with the ImageNet dataset.
Dino-XCIT-M24-P8 SelfSupervised Dino-XCIT-M24-P8 trained via self supervision with the ImageNet dataset.
Dino-ResNet50 SelfSupervised Dino-ResNet50 trained via self supervision with the ImageNet dataset.

Does our modeling tell us anything about perceptual organization?

So far, our questions about the mouse brain have been mostly geared towards function – in effect, what is the mouse visual system doing? Another set of questions, perhaps a bit more contentious, is geared towards organization. One particular question is whether (as has been clearly established in primate visual neuroscience) there exists a meaningful information-processing hierarchy across the various cortical areas that define mouse visual cortex. Recent work in neuroanatomy (especially in large-scale connectomics) has begun to suggest the answer is yes, but significant debate remains. Does our modeling weigh in on this debate in any way?

As a matter of fact, it just might. A somewhat approximate, but statistically robust method of assessing the presence of an information-processing hierarchy in a target biological brain is to calculate the mean (or median) depth (in the network) of the model layers that maximally correspond to each target brain area. In deep neural networks (at least in feedforward models), deeper layers almost always tend to host more complex, sophisticated representations than earlier layers. If a given cortical area is earlier in the information-processing hierarchy than another area, the model layer that corresponds to that earlier area should (on average) be shallower than the layer that corresponds to the later area.
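As a schematic of this depth analysis, the sketch below assumes we already have per-layer predictivity scores for each of the six cortical areas (random placeholders here) and computes the relative depth of each area’s maximally correspondent layer:

```python
import numpy as np

# Placeholder per-layer predictivity scores for each cortical area; in
# practice these come from the mapping methods above, aggregated over models.
n_layers = 50
areas = ['VISp', 'VISl', 'VISal', 'VISrl', 'VISam', 'VISpm']
scores = {area: np.random.rand(n_layers) for area in areas}

# Relative depth (0 = first layer, 1 = last) of each area's best-predicting layer.
best_depths = {area: np.argmax(s) / (n_layers - 1) for area, s in scores.items()}

# A hierarchy predicts that anatomically 'later' areas have greater best depths;
# pairwise statistics over many models can then test that prediction.
for area in areas:
    print(f'{area}: relative depth {best_depths[area]:.2f}')
```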

As it turns out, this precise trend occurs in our sample of mouse visual cortex. Beginning from primary visual cortex (VISp), successive areas in the neuroanatomically defined hierarchy are best captured by successively deeper layers of our deep net models. Pairwise statistics between the median depths of maximally correspondent model layers verify this ‘data-driven’ hierarchy; in other words, wherever there exists a significant pairwise difference, that difference favors the hierarchy.

Conclusion

While far from complete, the questions we’ve covered here are at least a minimally representative sample of the kinds of questions that can be asked (and ideally answered) by a large-scale, deliberately designed neural benchmarking survey.

Perhaps the most important question we’ve addressed here is whether or not deep neural networks are useful models of mouse visual cortex. While significant work remains, with a number of further modifications necessary to more fully account for the anatomical and ethological idiosyncrasies of mice, we hope you’ll grant at least preliminarily that the answer is yes.

Our code is freely available for you to test your own models, and maybe even to craft models as yet beyond conception. So universal and yet still so misunderstood, the mouse is a marvelous creature for modelers of all persuasions. Happy trapping!

Citation

For attribution, please cite this work as

Conwell, et al. (2021, Dec. 8). Deep Mouse Trap. Retrieved from https://colinconwell.github.io/DeepMouseTrap/

BibTeX citation

@misc{conwell2021deep,
  author = {Conwell, Colin and Mayo, David and Buice, Michael A. and Katz, Boris and Alvarez, George and Barbu, Andrei},
  title = {Deep Mouse Trap},
  url = {https://colinconwell.github.io/DeepMouseTrap/},
  year = {2021}
}