Predicting brain activity using Transformers
Update: 2023-08-05
Description
Link to bioRxiv paper:
http://biorxiv.org/cgi/content/short/2023.08.02.551743v1?rss=1
Authors: Adeli, H., Minni, S., Kriegeskorte, N.
Abstract:
The Algonauts challenge (Gifford et al. [2023]) called on the community to provide novel solutions for predicting brain activity of humans viewing natural scenes. This report provides an overview and technical details of our submitted solution. We use a general transformer encoder-decoder model to map images to fMRI responses. The encoder model is a vision transformer trained using self-supervised methods (DINOv2). The decoder uses queries corresponding to different brain regions of interest (ROIs) in the two hemispheres to gather relevant information from the encoder output for predicting neural activity in each ROI. The output tokens from the decoder are then linearly mapped to the fMRI activity. The predictive success (challenge score: 63.5229, rank 2) suggests that features from self-supervised transformers may deserve consideration as models of human visual brain representations, and shows the effectiveness of transformer mechanisms (self- and cross-attention) in learning the mapping from features to brain responses.
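The sketch below illustrates the kind of encoder-decoder mapping the abstract describes, not the authors' actual implementation: a frozen self-supervised vision transformer produces image tokens, learnable per-ROI query tokens cross-attend to them in a transformer decoder, and a linear readout maps each ROI token to that ROI's fMRI voxels. The class name ROIDecoder, the number of ROIs, layer sizes, and voxel counts are illustrative assumptions.

import torch
import torch.nn as nn

class ROIDecoder(nn.Module):
    """Hypothetical decoder: one query token per (hemisphere, ROI) pair."""
    def __init__(self, num_rois, d_model=768, n_heads=8, n_layers=2, voxels_per_roi=100):
        super().__init__()
        # Learnable ROI query embeddings (one per ROI/hemisphere)
        self.roi_queries = nn.Parameter(torch.randn(num_rois, d_model))
        layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, n_layers)
        # Linear map from each ROI output token to that ROI's voxel activations
        self.readout = nn.Linear(d_model, voxels_per_roi)

    def forward(self, image_tokens):
        # image_tokens: (batch, n_patches, d_model) from the vision encoder
        b = image_tokens.size(0)
        queries = self.roi_queries.unsqueeze(0).expand(b, -1, -1)
        roi_tokens = self.decoder(queries, image_tokens)  # cross-attend to image features
        return self.readout(roi_tokens)                   # (batch, num_rois, voxels_per_roi)

# Example usage with a frozen DINOv2 encoder (assumes the official torch.hub entry point):
# encoder = torch.hub.load('facebookresearch/dinov2', 'dinov2_vitb14').eval()
# with torch.no_grad():
#     feats = encoder.forward_features(images)['x_norm_patchtokens']  # (B, P, 768)
# preds = ROIDecoder(num_rois=32)(feats)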
Copyright belongs to the original authors. Visit the link for more info.
Podcast created by Paper Player, LLC