Visual Intelligence Online Seminar: Layer-wise Analysis of Transformer Models in Vision and Audio Processing
In this talk, Teresa Dorszewski will present a layer-wise analysis of Vision Transformer models (ViTs) and speech representation models, providing a detailed understanding of state-of-the-art transformer architectures.