Spring 2024
FYS-8602 Winter School NLDL 2024 - 5 ECTS

Type of course

The course can be taken as a singular course. Registration is open for UiT students, members of the NORA Research School, and NLDL participants. It will be conducted as a concentrated course in the style of a winter school, 8 to 12 January 2024.

Admission requirements

PhD students or holders of a Norwegian master's degree of five years (300 ECTS) or 3 years (180 ECTS) + 2 years (120 ECTS), or equivalent, may be admitted. PhD students do not have to prove English proficiency and are exempt from the semester fee. Holders of a master's degree must upload a master's diploma with a Diploma Supplement in English. PhD students at UiT The Arctic University of Norway can register for the course through StudentWeb. External applicants apply for admission through SøknadsWeb. All external applicants must attach a confirmation of their status as a PhD student from their home institution. Please note that students who hold a Master of Science degree but are not yet enrolled as PhD students must attach a copy of their master's degree diploma. These students are also required to pay the semester fee.

Recommended prerequisites: Programming skills in Python and hands-on experience with Python programming for deep learning. Knowledge of machine learning at Master's level from study programs in computer science, physics and technology, mathematics and statistics, or equivalent.

Application code: 9315, application deadline: 1 November.

The course is limited to 40 places. If there are more qualified applicants than available places, admission is decided by lottery.


Course content

This course will provide a study of several emerging topics of high relevance within advanced deep learning, from a basic understanding of the techniques to the latest state-of-the-art developments in the field. Synthetic data generation for addressing common problems of data scarcity, privacy-preserving data sharing, and bias (illustrated through case studies), reliability of AI, and generative models will be treated in depth in the form of tutorials, as will high-performance computing. Additional directions within deep learning, complementing those already mentioned, will be covered at an introductory level via a series of keynote talks. In addition, participants will be exposed to the latest advances and applications in deep learning through the oral and poster presentations in the main conference program.

The course will thus consist of 5 full days of the NLDL conference including: tutorials, keynote sessions, oral presentations and poster presentations, as well as practical components.

Synthetic data generation will focus on the use of synthcity, an open-source Python library that implements state-of-the-art synthetic data generators for addressing data scarcity, privacy, and bias. The topics treated include i) the use of privacy-preserving synthetic data as a substitute for sensitive data when training AI systems; ii) the use of synthetic data to de-bias training data and promote fairness in AI analytics; and iii) the use of synthetic data to augment the available training data in domains where data is scarce, difficult to acquire, or costly, by leveraging data from multiple other sources (i.e. multi-source learning).
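
As an illustration of topic i), the sketch below trains a privacy-preserving generator with synthcity and samples a synthetic substitute for a sensitive table. It is a minimal sketch based on synthcity's documented plugin interface; the plugin name ("dpgan"), the file name, and the target column are assumptions, not the tutorial code.

    # Minimal synthcity sketch (assumed plugin interface; names are placeholders).
    import pandas as pd
    from synthcity.plugins import Plugins
    from synthcity.plugins.core.dataloader import GenericDataLoader

    # Load a sensitive tabular dataset (hypothetical file and target column).
    real_df = pd.read_csv("patients.csv")
    loader = GenericDataLoader(real_df, target_column="outcome")

    # Fit a differentially private generator and sample a synthetic dataset of equal size.
    generator = Plugins().get("dpgan")
    generator.fit(loader)
    synthetic_df = generator.generate(count=len(real_df)).dataframe()

    # synthetic_df can now be shared or used to train downstream AI systems
    # in place of the sensitive original.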

Reliability in AI will focus on the expressivity, generalization, and explainability of deep learning models, with emphasis on graph convolutional neural networks, spectral graph convolution, rate-distortion theory, the use of applied harmonic analysis for better explainability approaches, and the correctness of deep learning with respect to computability on digital machines.
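
To make the spectral viewpoint concrete, the following self-contained sketch (NumPy only; the toy graph, signal, and filter are invented for illustration) filters a graph signal in the eigenbasis of the normalized graph Laplacian, which is the operation that spectral graph convolutional networks approximate with learnable filters.

    # Toy spectral graph convolution: filter a node signal in the Laplacian eigenbasis.
    import numpy as np

    # Adjacency matrix of a small undirected graph (4 nodes, illustrative only).
    A = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=float)

    # Symmetrically normalized graph Laplacian: L = I - D^{-1/2} A D^{-1/2}.
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
    L = np.eye(len(A)) - d_inv_sqrt @ A @ d_inv_sqrt

    # Graph Fourier basis: eigenvalues lam (graph frequencies) and eigenvectors U.
    lam, U = np.linalg.eigh(L)

    # A graph signal (one value per node) and a low-pass spectral filter g(lam).
    x = np.array([1.0, 0.0, 2.0, 0.0])
    g = np.exp(-2.0 * lam)            # attenuate high graph frequencies

    # Spectral convolution: transform to the spectral domain, filter, transform back.
    x_filtered = U @ (g * (U.T @ x))
    print(x_filtered)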

Generative models will focus on neural networks of the encoder-decoder form, where the encoder maps inputs to a stochastic latent space from which the decoder maps back to the input space. The specific model under study will be the variational autoencoder, also investigated in a multi-modal learning setting.
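
Such models are trained by maximizing the evidence lower bound (ELBO), log p(x) >= E_{q(z|x)}[log p(x|z)] - KL(q(z|x) || p(z)). The sketch below implements a small Gaussian variational autoencoder in PyTorch and minimizes the negative ELBO; the layer sizes and the random input batch are placeholders, not the course implementation.

    # Minimal Gaussian VAE sketch (PyTorch); dimensions and data are illustrative only.
    import torch
    import torch.nn as nn

    class VAE(nn.Module):
        def __init__(self, x_dim=784, z_dim=16, h_dim=256):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
            self.mu = nn.Linear(h_dim, z_dim)       # mean of q(z|x)
            self.logvar = nn.Linear(h_dim, z_dim)   # log-variance of q(z|x)
            self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                     nn.Linear(h_dim, x_dim), nn.Sigmoid())

        def forward(self, x):
            h = self.enc(x)
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization trick
            return self.dec(z), mu, logvar

    def neg_elbo(x, x_rec, mu, logvar):
        # Negative ELBO = reconstruction term + KL(q(z|x) || N(0, I)).
        rec = nn.functional.binary_cross_entropy(x_rec, x, reduction="sum")
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return rec + kl

    model = VAE()
    x = torch.rand(32, 784)                  # placeholder batch of inputs in [0, 1]
    x_rec, mu, logvar = model(x)
    loss = neg_elbo(x, x_rec, mu, logvar)
    loss.backward()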

Introduction to High Performance Computing (HPC) will focus on the use of HPC for training deep learning models for very large scientific problems, including the use of generative AI models such as variational autoencoders in practical applications.
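
As a hint of what this looks like in practice, the sketch below wraps a model in PyTorch's DistributedDataParallel so that each GPU in a cluster job processes a different shard of the data; the model, dataset, and launch setup are placeholders and are not tied to any specific cluster used in the course. A run would typically be started with something like torchrun --nproc_per_node=4 train.py.

    # Data-parallel training sketch (PyTorch DDP); intended to be launched with torchrun.
    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP
    from torch.utils.data import DataLoader, TensorDataset, DistributedSampler

    dist.init_process_group(backend="nccl")           # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Placeholder dataset and model; a real large-scale scientific problem would go here.
    dataset = TensorDataset(torch.randn(10_000, 64), torch.randn(10_000, 1))
    sampler = DistributedSampler(dataset)             # each rank sees a distinct shard
    loader = DataLoader(dataset, batch_size=256, sampler=sampler)

    model = DDP(torch.nn.Linear(64, 1).cuda(local_rank), device_ids=[local_rank])
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    for epoch in range(5):
        sampler.set_epoch(epoch)                      # reshuffle the shards each epoch
        for xb, yb in loader:
            xb, yb = xb.cuda(local_rank), yb.cuda(local_rank)
            loss = torch.nn.functional.mse_loss(model(xb), yb)
            opt.zero_grad()
            loss.backward()                           # gradients are all-reduced across ranks
            opt.step()

    dist.destroy_process_group()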

Core concepts

Synthetic data generation

  • Use of privacy-preserving synthetic data.
  • Use of synthetic data to de-bias training data and promote fairness in AI analytics.
  • Data augmentation for scarce data domains.

Reliability in AI

  • Generalization and explainability in deep learning models.
  • Graph Convolutional Neural Networks.
  • Using applied harmonic analysis for explainability.
  • Reliability from a digital hardware perspective.

Generative models

  • Core concepts pertaining to variational autoencoder (VAE) models.
  • Using evidence lower bound (ELBO) methods to train VAEs.
  • Extending VAEs to multimodal domains.

High Performance Computing for deep learning

  • Using clusters and handling large datasets.
  • Training deep learning models using HPC for very large problems.

In addition, keynote talks will address fundamental topics such as:

  • Trustworthiness of AI models.
  • Using AI for prediction of patient health conditions based on radiology and pathology imaging and electronic health records.
  • Applications of deep learning to vision and multi-modal data without using human supervision.

Relevance of course in program of study: Deep learning has revolutionized technology and science by enabling information to be extracted with much better precision and at much larger scales than only a few years ago from data sources such as images, text, speech, biological material, chemical components, and sensory measurements in general. Most study programs offer courses on the fundamental theory and applications of neural networks and machine learning systems in general, which provide the backbone of deep learning. This course treats topics within deep learning that are not covered in standard courses, such as the use of AI for synthetic data generation, which allows data to be shared, augmented, and de-biased for building performant and socially responsible AI algorithms. Furthermore, reliability of AI is of fundamental importance and will be treated from a mathematical perspective with the goal of producing explainable deep learning models. Generative models have recently been developed within deep learning and have received much public attention, and there is a need to expose the next generation of researchers and practitioners in depth to these emerging topics. At the same time, the course will give an introduction to additional developments, specialized subfields, and applications such as trustworthiness of AI, the use of AI for predicting patient health conditions, and the use of unsupervised deep learning for vision and multi-modal data, making it naturally relevant to the wider study program.


Recommended prerequisites

FYS-3012 Pattern recognition, FYS-3033 Deep learning

Objectives of the course

Knowledge - The student is able to

  • Describe synthetic data generation techniques and the use of privacy-preserving synthetic data.
  • Describe the process of de-biasing training data and synthetic data generation by multi-source learning.
  • Describe generalization and explainability.
  • Discuss recent developments in the field of fair and reliable AI.
  • Discuss advanced generative models in an applied setting, including their advantages and limitations.
  • Implement advanced deep learning models.

Skills - The student is able to

  • Explain the fundamental ideas behind fair synthetic data generation techniques.
  • Explain generalization, fairness in synthetic data, and explainability.
  • Apply the learned material to a new application or problem setting.
  • Use fair synthetic data generation and explainable AI for research and industrial settings.
  • Explain VAEs and the significance of deriving the evidence lower bound (ELBO).
  • Make appropriate method and architecture choices for a given application or problem setting.

General competence - The student is able to

  • Give an interpretation of recent developments and provide an intuition of the open questions in the field of fair synthetic data generation and reliability in AI.
  • Show an understanding of fair synthetic data generation techniques and reliability in AI, and of why these methods are important for different applications.
  • Show a deeper understanding of VAEs and their extension to multimodal domains.
  • Understand the role of HPC and the practical implementation of advanced deep learning models on HPC systems.

Language of instruction and examination

The language of instruction is English, and all the syllabus material is in English. The final report needs to be submitted in English.

Teaching methods

Lectures: 40 hours (full 5 days of the winter school)

Self-study sessions: 40 hours

Mandatory assignment: 5 hours (poster presentation)

Project work: spread over 8 weeks - net time 50 hours

Net effort: approximately 135 hours


Final exam

The course is being discontinued; the last opportunity to sit the exam after this semester is autumn 2024.

Here you will find more information about exams in discontinued courses.


Examination

Examination: Off campus exam
Date: Hand out 15.01.2024 09:00; hand in 11.03.2024 14:00
Duration: 8 weeks
Grade scale: Passed / Not passed

Coursework requirements:

To take an examination, the student must have passed the following coursework requirements:

Poster: Approved / Not approved

More info about the coursework requirements

Submission of a poster and a poster presentation at the winter school.

Re-sit examination

No re-sit exam will be given for this course.

About the course

  • Campus: Tromsø
  • ECTS: 5
  • Course code: FYS-8602