spring 2025
FYS-8604 NORA research school on multi-modal learning - 5 ECTS

Type of course

The course can be taken as a single course.

Registration is open for UiT students and members of NORA Research School.

It will be conducted as a concentrated course in the style of a summer school, 10th-14th of June 2025, at Høgskolen i Østfold, Fredrikstad.


Admission requirements

PhD students or holders of a Norwegian master's degree of five years (300 ECTS), or of 3 years (180 ECTS) + 2 years (120 ECTS), or equivalent, may be admitted.

PhD students must upload a document from their university stating that they are registered PhD students. This group of applicants does not have to prove English proficiency and is exempt from the semester fee.

Holders of a master's degree must upload a master's diploma with Diploma Supplement or English translation.

PhD students at UiT The Arctic University of Norway register for the course through StudentWeb. External applicants apply for admission through SøknadsWeb. All external applicants have to attach a confirmation of their status as a PhD student from their home institution. Applicants who hold a Master of Science degree but are not yet enrolled as PhD students have to attach a copy of their master's degree diploma. These applicants are also required to pay the semester fee.

Recommended prerequisites: Programming skills in Python and hands-on experience with Python programming for deep learning.

Application code: 9317, application deadline: May 1st

The course is limited to 40 places. If there are more qualified applicants than available places, places are allocated by lottery.


Course content

This course will provide both a fundamental understanding of the techniques and methods used in multi-modal learning and an overview of the most recent innovations within the field. The course will cover multi-modal learning in the context of important data domains such as image, text, and time series data, and give the students a deeper understanding of the theory that underpins current multi-modal methods. The course will consist of 5 days of teaching with both lectures and practical components.

The course is evaluated through a take-home exam.

Core concepts

  • History of multi-modal learning
  • Motivation and fundamental concepts
  • Encoders, data fusion, and loss functions

Discriminative multi-modal learning

  • Classification-based methods
  • Clustering-based methods

Self-supervised multi-modal learning

  • Contrastive methods
  • Non-contrastive methods
  • Masking-based methods
  • Data augmentation

Generative multi-modal learning theory

  • Auto-regressive methods
  • Predictive methods

Advanced multi-modal learning theory

  • On the design of multi-modal loss functions
  • Understanding data fusion in multi-modal learning
  • Noisy and missing data in multi-modal learning

Relevance of course in program of study: Fusing information from multiple modalities is a fundamental challenge in machine learning. Nevertheless, recent works have achieved impressive performance on a wide range of tasks involving modalities from a variety of sources, such as images, time series, and text. Recent multi-modal approaches rely heavily on deep neural networks to process the different data types and fuse them into a common representation that contains the complementary information found in the different sources. Performing this fusion in a reliable and precise manner is key to achieving good performance. Furthermore, the added complexity of having multiple neural networks for different modalities exacerbates existing neural network-related challenges such as a lack of explainability and a limited capability for uncertainty modeling.


Objectives of the course

Knowledge - The student is able to

  • Describe advanced multi-modal techniques
  • Describe the role of domain-knowledge in multi-modal learning
  • Describe the development of multi-modal learning
  • Discuss recent developments in the field
  • Discuss advanced multi-modal learning in an applied setting

Skills - The student is able to

  • Explain the fundamental ideas behind multi-modal learning
  • Apply the learned material to new applications or problem settings
  • Use multi-modal learning in research and industrial settings using software libraries such as PyTorch or TensorFlow
  • Make appropriate method and architecture choices for a given application or problem setting

General competence - The student is able to

  • Give an interpretation of recent developments and provide an intuition of the open questions in the field of multi-modal learning
  • Show an understanding of why multi-modal learning has shown great improvements over the last couple of years

Language of instruction and examination

The language of instruction is English, and all the syllabus material is in English. The final report needs to be submitted in English.

Teaching methods

Lectures: 20 hours

Self-study sessions: 40 hours

Project work: spread over 8 weeks - net time 50 hours

Hands-on sessions: 10 hours

Net effort: 120 hours


Final exam

The course is being discontinued. After this semester, the last opportunity to take the exam is spring 2026.

More information about exams in discontinued courses can be found here

Schedule

Examination

Examination:      Duration:   Grade scale:
Off campus exam   8 Weeks     Passed / Not Passed
UiT Exams homepage

Re-sit examination

No re-sit exam will be offered for this course.
  • About the course
  • Campus: Fredrikstad | Other
  • ECTS: 5
  • Course code: FYS-8604