Neural Regression, Representational Similarity, Model Zoology & Neural Taskonomy at Scale in Rodent Visual Cortex
This article is an abridged summary of a longer work appearing at NeurIPS 2021, as well as a conceptual introduction to the Deep Mouse Trap GitHub repo.
What goes on in the brain of a mouse? It’s a seemingly simple question that belies a devilishly complicated scientific endeavor: understanding how the firing and wiring together of neurons in a nervous system produce intelligent behavior. The mouse is arguably the centerpiece of a modern neuroscientific praxis that has availed itself of everything from genetics to cybernetics, yet we still know very little about key aspects of its neural software. In this project, we’ll be looking at vision, and in particular the ways we’ve increasingly come to model it.
The relative paucity of models we have for characterizing the vision of mice is made all the more conspicuous by the relative excess of models we have for characterizing the vision of another paradigmatic lab animal: the monkey (in particular, the rhesus macaque). Over the last 5 years, our ability to characterize and predict the neural activity of macaque visual cortex has surged, in large part thanks to a singular class of model: object-recognizing deep neural networks. So powerful are these models that we can even use them as a sort of neural remote control, synthesizing visual stimuli that drive neural activity beyond the range evoked by handpicked natural images. The success of these models in predicting mouse visual cortex, on the other hand, has been more modest, with some even suggesting that randomly initialized neural networks (networks that have never actually learned anything) are as predictive as trained ones – a particularly worrisome suggestion if we’d like to claim that the neural activity we’re predicting has something to do with visual intelligence.
Here, we re-examine the state of neural network modeling in mouse visual cortex, using a massive optical physiology dataset of almost 6,600 highly reliable visual cortical neurons (courtesy of the Allen Brain Observatory), a large battery of neural networks (both trained and randomly initialized), and multiple methods of comparing the activity of those networks to the brain (including both representational similarity and linear mapping). Our intent here is not necessarily to converge on a single best model of the mouse brain per se, but to better understand the kinds of pressures that shape the representations inherent to models with greater or lesser neural predictivity.
We first preprocess the neural data such that we have the average response per neuron to each of the 119 natural scene images the Brain Observatory used as a free-viewing probe. (The 6,619 neurons in our final set are subsampled from a larger set of 37,398 unique neurons that we’ve filtered for reliability.) Our neural sample includes neurons from 6 different cortical areas that span what has (neuroanatomically) been demarcated as the rodent ventral and dorsal visual pathways.
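As a minimal sketch of what this preprocessing amounts to in practice – assuming a hypothetical (neurons × images × trials) array layout, and with a Spearman-Brown split-half estimator and a 0.5 cutoff standing in illustratively for our actual reliability filter – it might look something like this:

```python
import numpy as np

def splithalf_reliability(responses, n_splits=100, seed=0):
    """Spearman-Brown-corrected split-half reliability for a single neuron.
    responses: (n_images, n_trials) array of single-trial responses."""
    rng = np.random.default_rng(seed)
    n_images, n_trials = responses.shape
    corrs = []
    for _ in range(n_splits):
        perm = rng.permutation(n_trials)
        half1 = responses[:, perm[:n_trials // 2]].mean(axis=1)
        half2 = responses[:, perm[n_trials // 2:]].mean(axis=1)
        r = np.corrcoef(half1, half2)[0, 1]
        corrs.append(2 * r / (1 + r))  # Spearman-Brown correction
    return np.mean(corrs)

# all_responses: hypothetical (n_neurons, n_images, n_trials) array
reliability = np.array([splithalf_reliability(neuron) for neuron in all_responses])
reliable_neurons = all_responses[reliability > 0.5]   # illustrative threshold
mean_responses = reliable_neurons.mean(axis=-1)       # (n_neurons, 119): one response per image
```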
We then compare these responses systematically to the responses of artificial neurons across the layers of a variety of deep net models, selected deliberately to engender meaningful experimental foils we can use to answer thematic questions about representations in the mouse brain. These models include over 90 distinct architectures (e.g. ConvNets, transformers, MLP-Mixers), all trained on object classification with the ImageNet training set; randomly initialized (untrained) versions of these same architectures; the 24 models of the Taskonomy project (all of which share a single encoder architecture as their backbone); and a set of models (mostly ResNet50 architectures, plus several DINO-trained vision transformers) trained on a variety of self-supervised tasks. A list of the models we use is available in the table below.
Model | Description |
---|---|
AlexNet | AlexNet trained on image classification with the ImageNet dataset. |
VGG11 | VGG11 trained on image classification with the ImageNet dataset. |
VGG13 | VGG13 trained on image classification with the ImageNet dataset. |
VGG16 | VGG16 trained on image classification with the ImageNet dataset. |
VGG19 | VGG19 trained on image classification with the ImageNet dataset. |
VGG11-BatchNorm | VGG11-BatchNorm trained on image classification with the ImageNet dataset. |
VGG13-BatchNorm | VGG13-BatchNorm trained on image classification with the ImageNet dataset. |
VGG16-BatchNorm | VGG16-BatchNorm trained on image classification with the ImageNet dataset. |
VGG19-BatchNorm | VGG19-BatchNorm trained on image classification with the ImageNet dataset. |
ResNet18 | ResNet18 trained on image classification with the ImageNet dataset. |
ResNet34 | ResNet34 trained on image classification with the ImageNet dataset. |
ResNet50 | ResNet50 trained on image classification with the ImageNet dataset. |
ResNet101 | ResNet101 trained on image classification with the ImageNet dataset. |
ResNet152 | ResNet152 trained on image classification with the ImageNet dataset. |
SqueezeNet1.0 | SqueezeNet1.0 trained on image classification with the ImageNet dataset. |
SqueezeNet1.1 | SqueezeNet1.1 trained on image classification with the ImageNet dataset. |
DenseNet121 | DenseNet121 trained on image classification with the ImageNet dataset. |
DenseNet161 | DenseNet161 trained on image classification with the ImageNet dataset. |
DenseNet169 | DenseNet169 trained on image classification with the ImageNet dataset. |
DenseNet201 | DenseNet201 trained on image classification with the ImageNet dataset. |
GoogleNet | GoogleNet trained on image classification with the ImageNet dataset. |
ShuffleNet-V2-x0.5 | ShuffleNet-V2-x0.5 trained on image classification with the ImageNet dataset. |
ShuffleNet-V2-x1.0 | ShuffleNet-V2-x1.0 trained on image classification with the ImageNet dataset. |
MobileNet-V2 | MobileNet-V2 trained on image classification with the ImageNet dataset. |
ResNext50-32x4D | ResNext50-32x4D trained on image classification with the ImageNet dataset. |
ResNext50-32x8D | ResNext50-32x8D trained on image classification with the ImageNet dataset. |
Wide-ResNet50 | Wide-ResNet50 trained on image classification with the ImageNet dataset. |
Wide-ResNet101 | Wide-ResNet101 trained on image classification with the ImageNet dataset. |
MNASNet0.5 | MNASNet0.5 trained on image classification with the ImageNet dataset. |
MNASNet1.0 | MNASNet1.0 trained on image classification with the ImageNet dataset. |
Inception-V3 | Inception-V3 trained on image classification with the ImageNet dataset. |
Autoencoder | Image compression and decompression. |
Object Classification | 1000-way object classification (via knowledge distillation from ImageNet). |
Scene Classification | Scene Classification (via knowledge distillation from MIT Places). |
Curvatures | Magnitude of 3D principal curvatures. |
Denoising | Uncorrupted version of corrupted image. |
Euclidean Depth | Depth estimation (distance from the camera’s optical center). |
Z-Buffer Depth | Depth estimation (distance along the camera’s optical axis). |
Occlusion Edges | Edges which occlude parts of the scene. |
Texture Edges | Edges computed from RGB only (texture edges). |
Egomotion | Odometry (camera poses) given three input images. |
Camera Pose (Fixated) | Relative camera pose with matching optical centers. |
Inpainting | Filling in masked center of image. |
Jigsaw | Putting scrambled image pieces back together. |
2D Keypoints | Keypoint estimation from RGB-only (texture features). |
3D Keypoints | 3D Keypoint estimation from underlying scene 3D. |
Camera Pose (Nonfixated) | Relative camera pose with distinct optical centers. |
Surface Normals | Pixel-wise surface normals. |
Point Matching | Classifying if centers of two images match or not. |
Reshading | Reshading with new lighting placed at camera location. |
Room Layout | Orientation and aspect ratio of cubic room layout. |
Semantic Segmentation | Pixel-wise semantic labeling (via knowledge distillation from MS COCO). |
Unsupervised 2.5D Segmentation | Segmentation (graph cut approximation) on RGB-D-Normals-Curvature image. |
Unsupervised 2D Segmentation | Segmentation (graph cut approximation) on RGB. |
Vanishing Point | Three Manhattan-world vanishing points. |
Random Weights | Taskonomy architecture randomly initialized. |
CaIT-S24 | CaIT-S24 trained on image classification with the ImageNet dataset. |
CoaT-Lite-Mini | CoaT-Lite-Mini trained on image classification with the ImageNet dataset. |
ConViT-B | ConViT-B trained on image classification with the ImageNet dataset. |
ConViT-S | ConViT-S trained on image classification with the ImageNet dataset. |
CSP-DarkNet53 | CSP-DarkNet53 trained on image classification with the ImageNet dataset. |
CSP-ResNet50 | CSP-ResNet50 trained on image classification with the ImageNet dataset. |
DLA34 | DLA34 trained on image classification with the ImageNet dataset. |
DLA169 | DLA169 trained on image classification with the ImageNet dataset. |
ECA-NFNeT-L0 | ECA-NFNeT-L0 trained on image classification with the ImageNet dataset. |
ECA-NFNeT-L1 | ECA-NFNeT-L1 trained on image classification with the ImageNet dataset. |
ECA-Resnet50-D | ECA-Resnet50-D trained on image classification with the ImageNet dataset. |
ECA-Resnet101-D | ECA-Resnet101-D trained on image classification with the ImageNet dataset. |
EfficientNet-V2-S | EfficientNet-V2-S trained on image classification with the ImageNet dataset. |
FBNetC100 | FBNetC100 trained on image classification with the ImageNet dataset. |
GerNet-L | GerNet-L trained on image classification with the ImageNet dataset. |
GerNet-S | GerNet-S trained on image classification with the ImageNet dataset. |
GhostNet100 | GhostNet100 trained on image classification with the ImageNet dataset. |
HardCoreNAS-A | HardCoreNAS-A trained on image classification with the ImageNet dataset. |
HardCoreNAS-F | HardCoreNAS-F trained on image classification with the ImageNet dataset. |
LeViT128 | LeViT128 trained on image classification with the ImageNet dataset. |
LeViT256 | LeViT256 trained on image classification with the ImageNet dataset. |
Inception-Resnet-V2 | Inception-Resnet-V2 trained on image classification with the ImageNet dataset. |
Inception-V3 | Inception-V3 trained on image classification with the ImageNet dataset. |
Inception-V4 | Inception-V4 trained on image classification with the ImageNet dataset. |
MLP-Mixer-B16 | MLP-Mixer-B16 trained on image classification with the ImageNet dataset. |
MLP-Mixer-L16 | MLP-Mixer-L16 trained on image classification with the ImageNet dataset. |
MixNet-L | MixNet-L trained on image classification with the ImageNet dataset. |
MixNet-S | MixNet-S trained on image classification with the ImageNet dataset. |
MNASNet100 | MNASNet100 trained on image classification with the ImageNet dataset. |
MobileNet-V3 | MobileNet-V3 trained on image classification with the ImageNet dataset. |
NASNet-A-Large | NASNet-A-Large trained on image classification with the ImageNet dataset. |
NF-ResNet50 | NF-ResNet50 trained on image classification with the ImageNet dataset. |
NF-Net-L0 | NF-Net-L0 trained on image classification with the ImageNet dataset. |
PNASNet-5-Large | PNASNet-5-Large trained on image classification with the ImageNet dataset. |
RegNetX-64 | RegNetX-64 trained on image classification with the ImageNet dataset. |
RegNetY-64 | RegNetY-64 trained on image classification with the ImageNet dataset. |
RepVGG-B3 | RepVGG-B3 trained on image classification with the ImageNet dataset. |
RepVGG-B3G4 | RepVGG-B3G4 trained on image classification with the ImageNet dataset. |
Res2Net50-26W-4S | Res2Net50-26W-4S trained on image classification with the ImageNet dataset. |
ResNest50D | ResNest50D trained on image classification with the ImageNet dataset. |
ResNetRS50 | ResNetRS50 trained on image classification with the ImageNet dataset. |
RexNet100 | RexNet100 trained on image classification with the ImageNet dataset. |
SemNASNet100 | SemNASNet100 trained on image classification with the ImageNet dataset. |
SEResNet152D | SEResNet152D trained on image classification with the ImageNet dataset. |
SEResNext50-32x4D | SEResNext50-32x4D trained on image classification with the ImageNet dataset. |
SKResNet18 | SKResNet18 trained on image classification with the ImageNet dataset. |
SKResNext50-32x4D | SKResNext50-32x4D trained on image classification with the ImageNet dataset. |
SPNasNet100 | SPNasNet100 trained on image classification with the ImageNet dataset. |
Swin-B-P4-W7-224 | Swin-B-P4-W7-224 trained on image classification with the ImageNet dataset. |
Swin-L-P4-W7-224 | Swin-L-P4-W7-224 trained on image classification with the ImageNet dataset. |
Swin-S-P4-W7-224 | Swin-S-P4-W7-224 trained on image classification with the ImageNet dataset. |
EfficientNet-B1 | EfficientNet-B1 trained on image classification with the ImageNet dataset. |
EfficientNet-B3 | EfficientNet-B3 trained on image classification with the ImageNet dataset. |
EfficientNet-B5 | EfficientNet-B5 trained on image classification with the ImageNet dataset. |
EfficientNet-B7 | EfficientNet-B7 trained on image classification with the ImageNet dataset. |
Visformer | Visformer trained on image classification with the ImageNet dataset. |
ViT-L-P16-224 | ViT-L-P16-224 trained on image classification with the ImageNet dataset. |
ViT-S-P16-224 | ViT-S-P16-224 trained on image classification with the ImageNet dataset. |
ViT-B-P16-224 | ViT-B-P16-224 trained on image classification with the ImageNet dataset. |
XCeption | XCeption trained on image classification with the ImageNet dataset. |
XCeption65 | XCeption65 trained on image classification with the ImageNet dataset. |
ResNet50-JigSaw-P100 | ResNet50-JigSaw-P100 trained via self supervision with the ImageNet dataset. |
ResNet50-JigSaw-Goyal19 | ResNet50-JigSaw-Goyal19 trained via self supervision with the ImageNet dataset. |
ResNet50-RotNet | ResNet50-RotNet trained via self supervision with the ImageNet dataset. |
ResNet50-ClusterFit-16K-RotNet | ResNet50-ClusterFit-16K-RotNet trained via self supervision with the ImageNet dataset. |
ResNet50-NPID-4KNegative | ResNet50-NPID-4KNegative trained via self supervision with the ImageNet dataset. |
ResNet50-PIRL | ResNet50-PIRL trained via self supervision with the ImageNet dataset. |
ResNet50-SimCLR | ResNet50-SimCLR trained via self supervision with the ImageNet dataset. |
ResNet50-DeepClusterV2 | ResNet50-DeepClusterV2-2x224 trained via self supervision with the ImageNet dataset. |
ResNet50-DeepClusterV2 | ResNet50-DeepClusterV2-2x224+6x96 trained via self supervision with the ImageNet dataset. |
ResNet50-SwAV-BS4096 | ResNet50-SwAV-BS4096-2x224 trained via self supervision with the ImageNet dataset. |
ResNet50-SwAV-BS4096 | ResNet50-SwAV-BS4096-2x224+6x96 trained via self supervision with the ImageNet dataset. |
ResNet50-MoCoV2-BS256 | ResNet50-MoCoV2-BS256 trained via self supervision with the ImageNet dataset. |
ResNet50-BarlowTwins-BS2048 | ResNet50-BarlowTwins-BS2048 trained via self supervision with the ImageNet dataset. |
Dino-VIT-S16 | Dino-VIT-S16 trained via self supervision with the ImageNet dataset. |
Dino-VIT-S8 | Dino-VIT-S8 trained via self supervision with the ImageNet dataset. |
Dino-VIT-B16 | Dino-VIT-B16 trained via self supervision with the ImageNet dataset. |
Dino-VIT-B8 | Dino-VIT-B8 trained via self supervision with the ImageNet dataset. |
Dino-XCIT-S12-P16 | Dino-XCIT-S12-P16 trained via self supervision with the ImageNet dataset. |
Dino-XCIT-S12-P8 | Dino-XCIT-S12-P8 trained via self supervision with the ImageNet dataset. |
Dino-XCIT-M24-P16 | Dino-XCIT-M24-P16 trained via self supervision with the ImageNet dataset. |
Dino-XCIT-M24-P8 | Dino-XCIT-M24-P8 trained via self supervision with the ImageNet dataset. |
Dino-ResNet50 | Dino-ResNet50 trained via self supervision with the ImageNet dataset. |
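The raw material for every comparison that follows is a stimulus-by-feature matrix of activations from each layer of each model. As a minimal sketch of how such activations might be extracted (using timm for model loading and PyTorch forward hooks; the module names like 'layer1' are specific to ResNet-style models, and `stimuli` is a hypothetical preprocessed image tensor):

```python
import torch
import timm  # many of the architectures above are available via timm

@torch.no_grad()
def get_layer_activations(model, images, layer_names):
    """Run images through a model, caching the (flattened) output of each named module."""
    activations, hooks = {}, []
    for name, module in model.named_modules():
        if name in layer_names:
            hooks.append(module.register_forward_hook(
                lambda mod, inp, out, name=name: activations.update({name: out.flatten(1)})))
    model.eval()
    model(images)
    for hook in hooks:
        hook.remove()
    return activations

trained = timm.create_model('resnet50', pretrained=True)
untrained = timm.create_model('resnet50', pretrained=False)  # randomly initialized control
# stimuli: hypothetical (119, 3, 224, 224) tensor of the Brain Observatory natural scenes
features = get_layer_activations(trained, stimuli, layer_names=['layer1', 'layer4'])
```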
Equipped with our neural data and models, we then employ two distinct methods for mapping the responses of our biological neurons to the responses of our artificial neurons. The first, often called classic representational similarity analysis, is designed to assess representational structure (sometimes referred to as representational geometry) at the level of neural populations – in our case, the neural populations of 6 different visual cortical areas. The key component of representational similarity analysis is the representational (dis)similarity matrix (RDM): a distance matrix computed by taking the pairwise distance (1 - Pearson correlation coefficient, in our case) between the responses to each stimulus and the responses to every other stimulus across all neurons in the target population. A given model’s neural predictivity score in this classic representational similarity analysis is simply the average second-order distance (in our case, another 1 - Pearson correlation coefficient) between the RDM of its maximally correspondent layer and each of the cortical RDMs in our sample. Note that this kind of classic representational similarity analysis is a nonparametric mapping, and requires no fits or transformations – just an emergent similarity in how stimuli are organized across the responses of the two neural populations (one artificial, one biological) being compared.
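As a concrete sketch of this analysis (scoring RDM agreement here as the Pearson correlation of the RDMs’ upper triangles, i.e. one minus the second-order distance described above; `cortical_responses` and `layer_features` are hypothetical inputs):

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import pearsonr

def compute_rdm(responses):
    """responses: (n_stimuli, n_units) -> RDM of 1 - Pearson distances between stimuli."""
    return squareform(pdist(responses, metric='correlation'))

def rdm_agreement(rdm_a, rdm_b):
    """Second-order similarity: Pearson r between the RDMs' upper triangles."""
    triu = np.triu_indices_from(rdm_a, k=1)
    return pearsonr(rdm_a[triu], rdm_b[triu])[0]

# cortical_responses: (119, n_neurons) for one area; layer_features: {layer: (119, n_units)}
brain_rdm = compute_rdm(cortical_responses)
layer_scores = {layer: rdm_agreement(compute_rdm(f), brain_rdm)
                for layer, f in layer_features.items()}
best_layer = max(layer_scores, key=layer_scores.get)  # the 'maximally correspondent' layer
```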
Rather than target an entire neural population simultaneously, we can also more closely scrutinize individual neural responses using a method broadly called neural encoding or neural regression. With this method, we take the artificial neural responses of our model as the set of predictors in a regression where we try to predict (always with some form of cross-validation) the responses of a biological neuron to images not included in the regression. What we’re effectively doing is mixing and matching our set of artificial neurons (often with some sort of dimensionality reduction along the way) to approximate the representational profile of a single biological neuron. The better suited those artificial neurons are to this mixing and matching (which we measure with a correlation between the neural responses predicted by the regression and the actual responses of a target neuron), the higher the score of the model that hosts them. (A schematic of our neural regression method can be seen in the figure below.)
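A minimal sketch of this pipeline, using partial least squares as the dimensionality-reducing regression (the component count and fold structure here are illustrative defaults, not necessarily our exact hyperparameters):

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import KFold

def encoding_score(layer_features, neural_responses, n_components=25, n_folds=5):
    """Mean cross-validated Pearson r between predicted and actual responses.
    layer_features: (n_images, n_units); neural_responses: (n_images, n_neurons), float."""
    predictions = np.zeros_like(neural_responses)
    folds = KFold(n_splits=n_folds, shuffle=True, random_state=0)
    for train, test in folds.split(layer_features):
        pls = PLSRegression(n_components=n_components)
        pls.fit(layer_features[train], neural_responses[train])
        predictions[test] = pls.predict(layer_features[test])  # held-out predictions only
    per_neuron = [np.corrcoef(predictions[:, i], neural_responses[:, i])[0, 1]
                  for i in range(neural_responses.shape[1])]
    return np.mean(per_neuron)
```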
Combining our neural data, models, and mapping methods, an intuitive first result is a large set of model rankings, which in the plots below we’ve organized broadly into one of three categories.
These rankings may seem like an end in and of themselves, but scrutinize them without context for more than a few minutes and it quickly becomes clear that rankings alone are largely insufficient for meaningful insight.
What we need, in their place or at least as a supplement, are targeted contrasts between certain kinds of models that directly arbitrate on questions (theoretical or practical) we have about the mouse brain. In the section below, we show a few examples of the kinds of questions a neural modeling survey of this nature allows us to address when we choose models we can meaningfully separate into subgroups.
Does training matter?
A first question we might ask, and one we’ve deliberately designed our survey of models to assess, is whether the kind of learning a deep neural network does when trained on a task like object recognition actually matters for the prediction of the mouse brain. The answer may seem a trivial and intuitive yes, but some prior work suggests something far less simple. There are, for example, a number of cases in nature of perceptual machineries that (unlike the highly structured, hierarchical primate visual cortex) are characterized by somewhat chaotic, random connectivity – rodent olfactory cortex a prime example among them. That rodent visual cortex may itself be defined by this kind of randomness is thus not unfathomable. Compound this with a prominent research team’s finding a few years ago that VGG16, a classic convolutional neural network, is just as predictive of the mouse brain before training as after it, and it’s entirely possible that the first major divergence in the deep neural network modeling of mouse versus monkey brain is the necessity of training – in other words, task optimization.
How then does our modeling arbitrate on the possibility of mouse visual cortex being a randomly initialized neural network?
In brief, it strongly suggests it isn’t. In the plot above, each line represents the difference in score between a model trained on ImageNet and its randomly initialized counterpart. In both our neural encoding and representational similarity methods, pretrained models strongly outperform their untrained counterparts – in representational similarity, this effect is so strong as to be categorical.
The finding here underscores an important point about deep neural network modeling more generally: while individual models can be informative, it’s often worthwhile to assess many models in the aggregate. While we replicate the field’s previous finding that randomly initialized VGG16 does just as well as (if not better than) ImageNet-trained VGG16, we also demonstrate that this effect is not consistent across different architectures.
All in all, this result suggests some sort of task optimization is key in predicting the mouse brain. To better assess what kind of task optimization matters, though, we turn to other models in our survey.
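One simple way to aggregate this paired contrast in the plot above is sketched below, assuming a hypothetical dict of per-architecture score pairs (the Wilcoxon signed-rank test here is an illustrative choice of statistic, not necessarily the one used in our paper):

```python
import numpy as np
from scipy.stats import wilcoxon

# paired_scores: hypothetical {architecture: (trained_score, untrained_score)}
diffs = np.array([trained - untrained for trained, untrained in paired_scores.values()])
statistic, p_value = wilcoxon(diffs)  # paired nonparametric test across architectures
print(f"median trained-minus-untrained gap: {np.median(diffs):.3f} (p = {p_value:.2g})")
print(f"architectures favoring training: {(diffs > 0).mean():.0%}")
```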
Does object recognition matter?
The overarching goal in much of the neural benchmarking literature to date has been the deeper understanding of visual object recognition in the ventral stream of macaque visual cortex. This has naturally entailed a focus on deep convolutional network models of object recognition as the natural (and empirically reified) gold standard of neural and behavioral predictivity.
In a species like the mouse, however, the centrality of object recognition as an ethologically relevant task is immediately suspect. Mice do recognize objects, it seems, and do use their visual systems to support and calibrate advanced behaviors like hunting insects… but there are a number of reasons (both theoretical and observational) to believe object recognition might not be the apex model when it comes to explaining how mouse visual cortex is organized.
So what other options are available? Immediately proximate to object recognition, but obviating the need for category labels, are self-supervised, contrastive learning models. Instead of being taught to discriminate object classes indirectly by dint of invariances, these models are often taught invariances directly – learning, for example, to treat a single image and various transformations of that image (rotations, dilations, reilluminations, translations) as the same, all without explicit category supervision.
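To make this objective concrete, here’s a minimal sketch of an NT-Xent (InfoNCE-style) loss of the kind at the heart of SimCLR-style training, in which two augmented views of the same image are pulled together and all other images in the batch are pushed apart:

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.1):
    """z1, z2: (batch, dim) embeddings of two augmented views of the same images."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)   # (2B, d), unit norm
    similarity = z @ z.T / temperature            # scaled cosine similarities
    similarity.fill_diagonal_(float('-inf'))      # a view can't match itself
    batch = z1.shape[0]
    # each view's positive target is the other view of the same image
    targets = torch.cat([torch.arange(batch, 2 * batch), torch.arange(batch)])
    return F.cross_entropy(similarity, targets)
```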
These models, then, allow us to answer the question as to whether the kinds of predictivity we see from trained models in the mouse brain are really about objects per se, or whether they’re more about the fine-grained, separable, invariant representations learnable through both category supervision and self-supervision. The answer, it seems, is closer to the latter. Modern contrastive learning algorithms (e.g. BarlowTwins, SimCLR, SwAV) often do just as well as category-supervised models when controlling for the effects of architecture. Object recognition, then – as an explicit task – may be sufficient, but is (in the end) unnecessary to capture the responses of mouse visual cortex to naturalistic stimuli. (The comparison of category-supervised and self-supervised models is available in the third tab of the ranking plots above, and across the various bar plots of the figure below.)
Less proximate to object recognition, of course, are a whole variety of alternative computer vision tasks that necessitate an arguably kaleidoscopic variety of latent representations. In our survey, this diversity is captured most succinctly (and with the most empirical control) by the Taskonomy encoders: a single encoder architecture deployed in 24 different canonical computer vision tasks, ranging from autoencoding to edge detection. While trained on a rather limited visual diet (video tours of model houses), these models nonetheless allow us to triangulate the contributions of different kinds of training procedures to neural predictivity.
Clusters of these tasks (organized in terms of how well representations from one task transfer to the others) are depicted in the figure above. (Descriptions of individual tasks, alongside their clusters, may be found in the table below.) The facets in each of these plots are the six cortical areas assayed in our neural data. The individual lines atop each bar mark the best-performing model from that particular subgroup of models. The ‘intermouse’ predictivity score (calculable only with the neural encoding metric) is a measure of how well we can do when swapping out the artificial neural network models in our pipeline for other mice. (In other words, how well can we model one mouse if our ‘model’ is an aggregate of its conspecifics?)
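The intermouse score can be sketched by reusing the `encoding_score` function from the neural regression sketch above, with the pooled neurons of every other mouse standing in for a model layer’s artificial neurons (`responses_by_mouse` is a hypothetical dict of per-mouse response matrices):

```python
import numpy as np

def intermouse_score(responses_by_mouse, target_mouse):
    """Predict one mouse's neurons from the pooled neurons of all other mice."""
    other_neurons = np.hstack([resp for mouse, resp in responses_by_mouse.items()
                               if mouse != target_mouse])  # (n_images, n_other_neurons)
    return encoding_score(other_neurons, responses_by_mouse[target_mouse])
```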
The major takeaway from this figure (and from our results more generally) is that object recognition (a member of the ‘Semantic’ task cluster) is by no means the only contender in the game of neural predictivity. In fact, a ‘2D’ task – unsupervised segmentation – yields one of the overall best models in our survey, and is perhaps the more intuitive task for a visual system like the mouse’s, configured as it most likely is for navigation and foraging in the dark.
Model | Task Cluster | Description |
---|---|---|
AlexNet | Semantic | AlexNet trained on image classification with the ImageNet dataset. |
VGG11 | Semantic | VGG11 trained on image classification with the ImageNet dataset. |
VGG13 | Semantic | VGG13 trained on image classification with the ImageNet dataset. |
VGG16 | Semantic | VGG16 trained on image classification with the ImageNet dataset. |
VGG19 | Semantic | VGG19 trained on image classification with the ImageNet dataset. |
VGG11-BatchNorm | Semantic | VGG11-BatchNorm trained on image classification with the ImageNet dataset. |
VGG13-BatchNorm | Semantic | VGG13-BatchNorm trained on image classification with the ImageNet dataset. |
VGG16-BatchNorm | Semantic | VGG16-BatchNorm trained on image classification with the ImageNet dataset. |
VGG19-BatchNorm | Semantic | VGG19-BatchNorm trained on image classification with the ImageNet dataset. |
ResNet18 | Semantic | ResNet18 trained on image classification with the ImageNet dataset. |
ResNet34 | Semantic | ResNet34 trained on image classification with the ImageNet dataset. |
ResNet50 | Semantic | ResNet50 trained on image classification with the ImageNet dataset. |
ResNet101 | Semantic | ResNet101 trained on image classification with the ImageNet dataset. |
ResNet152 | Semantic | ResNet152 trained on image classification with the ImageNet dataset. |
SqueezeNet1.0 | Semantic | SqueezeNet1.0 trained on image classification with the ImageNet dataset. |
SqueezeNet1.1 | Semantic | SqueezeNet1.1 trained on image classification with the ImageNet dataset. |
DenseNet121 | Semantic | DenseNet121 trained on image classification with the ImageNet dataset. |
DenseNet161 | Semantic | DenseNet161 trained on image classification with the ImageNet dataset. |
DenseNet169 | Semantic | DenseNet169 trained on image classification with the ImageNet dataset. |
DenseNet201 | Semantic | DenseNet201 trained on image classification with the ImageNet dataset. |
GoogleNet | Semantic | GoogleNet trained on image classification with the ImageNet dataset. |
ShuffleNet-V2-x0.5 | Semantic | ShuffleNet-V2-x0.5 trained on image classification with the ImageNet dataset. |
ShuffleNet-V2-x1.0 | Semantic | ShuffleNet-V2-x1.0 trained on image classification with the ImageNet dataset. |
MobileNet-V2 | Semantic | MobileNet-V2 trained on image classification with the ImageNet dataset. |
ResNext50-32x4D | Semantic | ResNext50-32x4D trained on image classification with the ImageNet dataset. |
ResNext50-32x8D | Semantic | ResNext50-32x8D trained on image classification with the ImageNet dataset. |
Wide-ResNet50 | Semantic | Wide-ResNet50 trained on image classification with the ImageNet dataset. |
Wide-ResNet101 | Semantic | Wide-ResNet101 trained on image classification with the ImageNet dataset. |
MNASNet0.5 | Semantic | MNASNet0.5 trained on image classification with the ImageNet dataset. |
MNASNet1.0 | Semantic | MNASNet1.0 trained on image classification with the ImageNet dataset. |
Inception-V3 | Semantic | Inception-V3 trained on image classification with the ImageNet dataset. |
AlexNet | Semantic | AlexNet randomly initialized, with no training. |
VGG11 | Semantic | VGG11 randomly initialized, with no training. |
VGG13 | Semantic | VGG13 randomly initialized, with no training. |
VGG16 | Semantic | VGG16 randomly initialized, with no training. |
VGG19 | Semantic | VGG19 randomly initialized, with no training. |
VGG11-BatchNorm | Semantic | VGG11-BatchNorm randomly initialized, with no training. |
VGG13-BatchNorm | Semantic | VGG13-BatchNorm randomly initialized, with no training. |
VGG16-BatchNorm | Semantic | VGG16-BatchNorm randomly initialized, with no training. |
VGG19-BatchNorm | Semantic | VGG19-BatchNorm randomly initialized, with no training. |
ResNet18 | Semantic | ResNet18 randomly initialized, with no training. |
ResNet34 | Semantic | ResNet34 randomly initialized, with no training. |
ResNet50 | Semantic | ResNet50 randomly initialized, with no training. |
ResNet101 | Semantic | ResNet101 randomly initialized, with no training. |
ResNet152 | Semantic | ResNet152 randomly initialized, with no training. |
SqueezeNet1.0 | Semantic | SqueezeNet1.0 randomly initialized, with no training. |
SqueezeNet1.1 | Semantic | SqueezeNet1.1 randomly initialized, with no training. |
DenseNet121 | Semantic | DenseNet121 randomly initialized, with no training. |
DenseNet161 | Semantic | DenseNet161 randomly initialized, with no training. |
DenseNet169 | Semantic | DenseNet169 randomly initialized, with no training. |
DenseNet201 | Semantic | DenseNet201 randomly initialized, with no training. |
GoogleNet | Semantic | GoogleNet randomly initialized, with no training. |
ShuffleNet-V2-x0.5 | Semantic | ShuffleNet-V2-x0.5 randomly initialized, with no training. |
ShuffleNet-V2-x1.0 | Semantic | ShuffleNet-V2-x1.0 randomly initialized, with no training. |
MobileNet-V2 | Semantic | MobileNet-V2 randomly initialized, with no training. |
ResNext50-32x4D | Semantic | ResNext50-32x4D randomly initialized, with no training. |
ResNext50-32x8D | Semantic | ResNext50-32x8D randomly initialized, with no training. |
Wide-ResNet50 | Semantic | Wide-ResNet50 randomly initialized, with no training. |
Wide-ResNet101 | Semantic | Wide-ResNet101 randomly initialized, with no training. |
MNASNet0.5 | Semantic | MNASNet0.5 randomly initialized, with no training. |
MNASNet1.0 | Semantic | MNASNet1.0 randomly initialized, with no training. |
Inception-V3 | Semantic | Inception-V3 randomly initialized, with no training. |
CaIT-S24 | Semantic | CaIT-S24 trained on image classification with the ImageNet dataset. |
CoaT-Lite-Mini | Semantic | CoaT-Lite-Mini trained on image classification with the ImageNet dataset. |
ConViT-B | Semantic | ConViT-B trained on image classification with the ImageNet dataset. |
ConViT-S | Semantic | ConViT-S trained on image classification with the ImageNet dataset. |
CSP-DarkNet53 | Semantic | CSP-DarkNet53 trained on image classification with the ImageNet dataset. |
CSP-ResNet50 | Semantic | CSP-ResNet50 trained on image classification with the ImageNet dataset. |
DLA34 | Semantic | DLA34 trained on image classification with the ImageNet dataset. |
DLA169 | Semantic | DLA169 trained on image classification with the ImageNet dataset. |
ECA-NFNeT-L0 | Semantic | ECA-NFNeT-L0 trained on image classification with the ImageNet dataset. |
ECA-NFNeT-L1 | Semantic | ECA-NFNeT-L1 trained on image classification with the ImageNet dataset. |
ECA-Resnet50-D | Semantic | ECA-Resnet50-D trained on image classification with the ImageNet dataset. |
ECA-Resnet101-D | Semantic | ECA-Resnet101-D trained on image classification with the ImageNet dataset. |
EfficientNet-V2-S | Semantic | EfficientNet-V2-S trained on image classification with the ImageNet dataset. |
FBNetC100 | Semantic | FBNetC100 trained on image classification with the ImageNet dataset. |
GerNet-L | Semantic | GerNet-L trained on image classification with the ImageNet dataset. |
GerNet-S | Semantic | GerNet-S trained on image classification with the ImageNet dataset. |
GhostNet100 | Semantic | GhostNet100 trained on image classification with the ImageNet dataset. |
HardCoreNAS-A | Semantic | HardCoreNAS-A trained on image classification with the ImageNet dataset. |
HardCoreNAS-F | Semantic | HardCoreNAS-F trained on image classification with the ImageNet dataset. |
LeViT128 | Semantic | LeViT128 trained on image classification with the ImageNet dataset. |
LeViT256 | Semantic | LeViT256 trained on image classification with the ImageNet dataset. |
Inception-Resnet-V2 | Semantic | Inception-Resnet-V2 trained on image classification with the ImageNet dataset. |
Inception-V3 | Semantic | Inception-V3 trained on image classification with the ImageNet dataset. |
Inception-V4 | Semantic | Inception-V4 trained on image classification with the ImageNet dataset. |
MLP-Mixer-B16 | Semantic | MLP-Mixer-B16 trained on image classification with the ImageNet dataset. |
MLP-Mixer-L16 | Semantic | MLP-Mixer-L16 trained on image classification with the ImageNet dataset. |
MixNet-L | Semantic | MixNet-L trained on image classification with the ImageNet dataset. |
MixNet-S | Semantic | MixNet-S trained on image classification with the ImageNet dataset. |
MNASNet100 | Semantic | MNASNet100 trained on image classification with the ImageNet dataset. |
MobileNet-V3 | Semantic | MobileNet-V3 trained on image classification with the ImageNet dataset. |
NASNet-A-Large | Semantic | NASNet-A-Large trained on image classification with the ImageNet dataset. |
NF-ResNet50 | Semantic | NF-ResNet50 trained on image classification with the ImageNet dataset. |
NF-Net-L0 | Semantic | NF-Net-L0 trained on image classification with the ImageNet dataset. |
PNASNet-5-Large | Semantic | PNASNet-5-Large trained on image classification with the ImageNet dataset. |
RegNetX-64 | Semantic | RegNetX-64 trained on image classification with the ImageNet dataset. |
RegNetY-64 | Semantic | RegNetY-64 trained on image classification with the ImageNet dataset. |
RepVGG-B3 | Semantic | RepVGG-B3 trained on image classification with the ImageNet dataset. |
RepVGG-B3G4 | Semantic | RepVGG-B3G4 trained on image classification with the ImageNet dataset. |
Res2Net50-26W-4S | Semantic | Res2Net50-26W-4S trained on image classification with the ImageNet dataset. |
ResNest50D | Semantic | ResNest50D trained on image classification with the ImageNet dataset. |
ResNetRS50 | Semantic | ResNetRS50 trained on image classification with the ImageNet dataset. |
RexNet100 | Semantic | RexNet100 trained on image classification with the ImageNet dataset. |
SemNASNet100 | Semantic | SemNASNet100 trained on image classification with the ImageNet dataset. |
SEResNet152D | Semantic | SEResNet152D trained on image classification with the ImageNet dataset. |
SEResNext50-32x4D | Semantic | SEResNext50-32x4D trained on image classification with the ImageNet dataset. |
SKResNet18 | Semantic | SKResNet18 trained on image classification with the ImageNet dataset. |
SKResNext50-32x4D | Semantic | SKResNext50-32x4D trained on image classification with the ImageNet dataset. |
SPNasNet100 | Semantic | SPNasNet100 trained on image classification with the ImageNet dataset. |
Swin-B-P4-W7-224 | Semantic | Swin-B-P4-W7-224 trained on image classification with the ImageNet dataset. |
Swin-L-P4-W7-224 | Semantic | Swin-L-P4-W7-224 trained on image classification with the ImageNet dataset. |
Swin-S-P4-W7-224 | Semantic | Swin-S-P4-W7-224 trained on image classification with the ImageNet dataset. |
EfficientNet-B1 | Semantic | EfficientNet-B1 trained on image classification with the ImageNet dataset. |
EfficientNet-B3 | Semantic | EfficientNet-B3 trained on image classification with the ImageNet dataset. |
EfficientNet-B5 | Semantic | EfficientNet-B5 trained on image classification with the ImageNet dataset. |
EfficientNet-B7 | Semantic | EfficientNet-B7 trained on image classification with the ImageNet dataset. |
Visformer | Semantic | Visformer trained on image classification with the ImageNet dataset. |
ViT-L-P16-224 | Semantic | ViT-L-P16-224 trained on image classification with the ImageNet dataset. |
ViT-S-P16-224 | Semantic | ViT-S-P16-224 trained on image classification with the ImageNet dataset. |
ViT-B-P16-224 | Semantic | ViT-B-P16-224 trained on image classification with the ImageNet dataset. |
XCeption | Semantic | XCeption trained on image classification with the ImageNet dataset. |
XCeption65 | Semantic | XCeption65 trained on image classification with the ImageNet dataset. |
CaIT-S24 | Semantic | CaIT-S24 randomly initialized, with no training. |
CoaT-Lite-Mini | Semantic | CoaT-Lite-Mini randomly initialized, with no training. |
ConViT-B | Semantic | ConViT-B randomly initialized, with no training. |
ConViT-S | Semantic | ConViT-S randomly initialized, with no training. |
CSP-DarkNet53 | Semantic | CSP-DarkNet53 randomly initialized, with no training. |
CSP-ResNet50 | Semantic | CSP-ResNet50 randomly initialized, with no training. |
DLA34 | Semantic | DLA34 randomly initialized, with no training. |
DLA169 | Semantic | DLA169 randomly initialized, with no training. |
ECA-NFNeT-L0 | Semantic | ECA-NFNeT-L0 randomly initialized, with no training. |
ECA-NFNeT-L1 | Semantic | ECA-NFNeT-L1 randomly initialized, with no training. |
ECA-Resnet50-D | Semantic | ECA-Resnet50-D randomly initialized, with no training. |
ECA-Resnet101-D | Semantic | ECA-Resnet101-D randomly initialized, with no training. |
EfficientNet-V2-S | Semantic | EfficientNet-V2-S randomly initialized, with no training. |
FBNetC100 | Semantic | FBNetC100 randomly initialized, with no training. |
GerNet-L | Semantic | GerNet-L randomly initialized, with no training. |
GerNet-S | Semantic | GerNet-S randomly initialized, with no training. |
GhostNet100 | Semantic | GhostNet100 randomly initialized, with no training. |
HardCoreNAS-A | Semantic | HardCoreNAS-A randomly initialized, with no training. |
HardCoreNAS-F | Semantic | HardCoreNAS-F randomly initialized, with no training. |
LeViT128 | Semantic | LeViT128 randomly initialized, with no training. |
LeViT256 | Semantic | LeViT256 randomly initialized, with no training. |
Inception-Resnet-V2 | Semantic | Inception-Resnet-V2 randomly initialized, with no training. |
Inception-V3 | Semantic | Inception-V3 randomly initialized, with no training. |
Inception-V4 | Semantic | Inception-V4 randomly initialized, with no training. |
MLP-Mixer-B16 | Semantic | MLP-Mixer-B16 randomly initialized, with no training. |
MLP-Mixer-L16 | Semantic | MLP-Mixer-L16 randomly initialized, with no training. |
MixNet-L | Semantic | MixNet-L randomly initialized, with no training. |
MixNet-S | Semantic | MixNet-S randomly initialized, with no training. |
MNASNet100 | Semantic | MNASNet100 randomly initialized, with no training. |
MobileNet-V3 | Semantic | MobileNet-V3 randomly initialized, with no training. |
NASNet-A-Large | Semantic | NASNet-A-Large randomly initialized, with no training. |
NF-ResNet50 | Semantic | NF-ResNet50 randomly initialized, with no training. |
NF-Net-L0 | Semantic | NF-Net-L0 randomly initialized, with no training. |
PNASNet-5-Large | Semantic | PNASNet-5-Large randomly initialized, with no training. |
RegNetX-64 | Semantic | RegNetX-64 randomly initialized, with no training. |
RegNetY-64 | Semantic | RegNetY-64 randomly initialized, with no training. |
RepVGG-B3 | Semantic | RepVGG-B3 randomly initialized, with no training. |
RepVGG-B3G4 | Semantic | RepVGG-B3G4 randomly initialized, with no training. |
Res2Net50-26W-4S | Semantic | Res2Net50-26W-4S randomly initialized, with no training. |
ResNest50D | Semantic | ResNest50D randomly initialized, with no training. |
ResNetRS50 | Semantic | ResNetRS50 randomly initialized, with no training. |
RexNet100 | Semantic | RexNet100 randomly initialized, with no training. |
SemNASNet100 | Semantic | SemNASNet100 randomly initialized, with no training. |
SEResNet152D | Semantic | SEResNet152D randomly initialized, with no training. |
SEResNext50-32x4D | Semantic | SEResNext50-32x4D randomly initialized, with no training. |
SKResNet18 | Semantic | SKResNet18 randomly initialized, with no training. |
SKResNext50-32x4D | Semantic | SKResNext50-32x4D randomly initialized, with no training. |
SPNasNet100 | Semantic | SPNasNet100 randomly initialized, with no training. |
Swin-B-P4-W7-224 | Semantic | Swin-B-P4-W7-224 randomly initialized, with no training. |
Swin-L-P4-W7-224 | Semantic | Swin-L-P4-W7-224 randomly initialized, with no training. |
Swin-S-P4-W7-224 | Semantic | Swin-S-P4-W7-224 randomly initialized, with no training. |
EfficientNet-B1 | Semantic | EfficientNet-B1 randomly initialized, with no training. |
EfficientNet-B3 | Semantic | EfficientNet-B3 randomly initialized, with no training. |
EfficientNet-B5 | Semantic | EfficientNet-B5 randomly initialized, with no training. |
EfficientNet-B7 | Semantic | EfficientNet-B7 randomly initialized, with no training. |
Visformer | Semantic | Visformer randomly initialized, with no training. |
ViT-L-P16-224 | Semantic | ViT-L-P16-224 randomly initialized, with no training. |
ViT-S-P16-224 | Semantic | ViT-S-P16-224 randomly initialized, with no training. |
ViT-B-P16-224 | Semantic | ViT-B-P16-224 randomly initialized, with no training. |
XCeption | Semantic | XCeption randomly initialized, with no training. |
XCeption65 | Semantic | XCeption65 randomly initialized, with no training. |
ResNet50-JigSaw-P100 | SelfSupervised | ResNet50-JigSaw-P100 trained via self supervision with the ImageNet dataset. |
ResNet50-JigSaw-Goyal19 | SelfSupervised | ResNet50-JigSaw-Goyal19 trained via self supervision with the ImageNet dataset. |
ResNet50-RotNet | SelfSupervised | ResNet50-RotNet trained via self supervision with the ImageNet dataset. |
ResNet50-ClusterFit-16K-RotNet | SelfSupervised | ResNet50-ClusterFit-16K-RotNet trained via self supervision with the ImageNet dataset. |
ResNet50-NPID-4KNegative | SelfSupervised | ResNet50-NPID-4KNegative trained via self supervision with the ImageNet dataset. |
ResNet50-PIRL | SelfSupervised | ResNet50-PIRL trained via self supervision with the ImageNet dataset. |
ResNet50-SimCLR | SelfSupervised | ResNet50-SimCLR trained via self supervision with the ImageNet dataset. |
ResNet50-DeepClusterV2 | SelfSupervised | ResNet50-DeepClusterV2-2x224 trained via self supervision with the ImageNet dataset. |
ResNet50-DeepClusterV2 | SelfSupervised | ResNet50-DeepClusterV2-2x224+6x96 trained via self supervision with the ImageNet dataset. |
ResNet50-SwAV-BS4096 | SelfSupervised | ResNet50-SwAV-BS4096-2x224 trained via self supervision with the ImageNet dataset. |
ResNet50-SwAV-BS4096 | SelfSupervised | ResNet50-SwAV-BS4096-2x224+6x96 trained via self supervision with the ImageNet dataset. |
ResNet50-MoCoV2-BS256 | SelfSupervised | ResNet50-MoCoV2-BS256 trained via self supervision with the ImageNet dataset. |
ResNet50-BarlowTwins-BS2048 | SelfSupervised | ResNet50-BarlowTwins-BS2048 trained via self supervision with the ImageNet dataset. |
Dino-VIT-S16 | SelfSupervised | Dino-VIT-S16 trained via self supervision with the ImageNet dataset. |
Dino-VIT-S8 | SelfSupervised | Dino-VIT-S8 trained via self supervision with the ImageNet dataset. |
Dino-VIT-B16 | SelfSupervised | Dino-VIT-B16 trained via self supervision with the ImageNet dataset. |
Dino-VIT-B8 | SelfSupervised | Dino-VIT-B8 trained via self supervision with the ImageNet dataset. |
Dino-XCIT-S12-P16 | SelfSupervised | Dino-XCIT-S12-P16 trained via self supervision with the ImageNet dataset. |
Dino-XCIT-S12-P8 | SelfSupervised | Dino-XCIT-S12-P8 trained via self supervision with the ImageNet dataset. |
Dino-XCIT-M24-P16 | SelfSupervised | Dino-XCIT-M24-P16 trained via self supervision with the ImageNet dataset. |
Dino-XCIT-M24-P8 | SelfSupervised | Dino-XCIT-M24-P8 trained via self supervision with the ImageNet dataset. |
Dino-ResNet50 | SelfSupervised | Dino-ResNet50 trained via self supervision with the ImageNet dataset. |
Does our modeling tell us anything about perceptual organization?
So far, our questions about the mouse brain have been mostly geared towards function – in effect, what is the mouse visual system doing? Another set of questions, perhaps a bit more contentious, is geared towards organization. One particular question is whether (as has been clearly established in primate visual neuroscience) there exists a meaningful information-processing hierarchy across the various cortical areas that define mouse visual cortex. Recent work in neuroanatomy (especially in large-scale connectomics) has begun to suggest the answer is yes, but significant debate remains. Does our modeling weigh in on this debate in any way?
As a matter of fact, it just might. A somewhat indirect but statistically robust method of assessing the presence of an information-processing hierarchy in a target biological brain is to calculate the mean (or median) depth in the network of the model layers that maximally correspond to each target brain area. In deep neural networks (at least feedforward ones), deeper layers almost always tend to host more complex, sophisticated representations than earlier layers. If a given cortical area is earlier in the information-processing hierarchy than another area, the layer of the deep net that corresponds to that earlier area should (on average) be shallower than the layer that corresponds to the later area.
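A sketch of this depth analysis follows, with layer depths normalized to [0, 1] and one plausible anatomical ordering of the six areas supplied by hand; the rank correlation at the end is an illustrative way to summarize the trend, not necessarily the pairwise statistics reported below:

```python
import numpy as np
from scipy.stats import spearmanr

def best_layer_depths(area_layer_scores, layer_order):
    """Relative depth of each area's maximally correspondent layer.
    area_layer_scores: {area: {layer: score}}; layer_order: layers from input to output."""
    depth = {layer: i / (len(layer_order) - 1) for i, layer in enumerate(layer_order)}
    return {area: depth[max(scores, key=scores.get)]
            for area, scores in area_layer_scores.items()}

# hypothetical anatomical ordering of the six areas, VISp first
anatomical_rank = {'VISp': 0, 'VISl': 1, 'VISrl': 2, 'VISal': 3, 'VISpm': 4, 'VISam': 5}
depths = best_layer_depths(area_layer_scores, layer_order)
rho, p = spearmanr([anatomical_rank[area] for area in depths], list(depths.values()))
```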
As it turns out, this precise trend occurs in our sample of mouse visual cortex. Beginning from primary visual cortex (VISp), successive areas in the neuroanatomically defined hierarchy are best captured by successively deeper layers of our deep net models. Pairwise statistics between the median depths of maximally correspondent model layers verify this ‘data-driven’ hierarchy; in other words, wherever there exists a significant pairwise difference, that difference favors the hierarchy.
While far from complete, the questions we’ve covered here are at least a representative sample of the kinds of questions that can be asked (and ideally answered) by a large-scale, deliberately designed neural benchmarking survey.
Perhaps the most important question we’ve addressed here is whether deep neural networks are useful models of mouse visual cortex. While significant work remains, with a number of further modifications necessary to more fully account for the anatomical and ethological idiosyncrasies of mice, we hope you’ll grant at least preliminarily that the answer is yes.
Our code is freely available for you to test your own models, and maybe even to craft models as yet beyond conception. So universal and yet still so misunderstood, the mouse is a marvelous creature for modelers of all persuasions. Happy trapping!
For attribution, please cite this work as
Conwell, et al. (2021, Dec. 8). Deep Mouse Trap. Retrieved from https://colinconwell.github.io/DeepMouseTrap/
BibTeX citation
@misc{conwell2021deep,
  author = {Conwell, Colin and Mayo, David and Buice, Michael A. and Katz, Boris and Alvarez, George and Barbu, Andrei},
  title = {Deep Mouse Trap},
  url = {https://colinconwell.github.io/DeepMouseTrap/},
  year = {2021}
}