Identifying predictive signalling networks

The field of bioinformatics has been defined as conceptualizing biology in terms of macromolecules and then applying computer-based techniques to understand and organize the information associated with those macromolecules. It combines aspects of data analysis, statistics, mathematics, and software engineering to collect, integrate, analyse, visualize and interpret biological data. The field intersects with signal processing on the instrument side and with mathematical modelling of complex systems on the interpretation side. Within this broad scope, this project will focus on developing a network-based modelling framework for transcriptomics data in complex disease. The framework will integrate biological background information, such as the regulatory networks of gene expression and networks describing protein-protein and protein-drug target interactions. It will generate specific, testable hypotheses from these data sources by providing a computational representation of increasingly accepted theories of epigenetics and transcription factor regulation, of signal transduction networks, and of drugs acting as inhibitors of signal transduction molecules.

Network representations are ubiquitous in bioinformatics. A network model consists of nodes and edges, typically with nodes representing genes or proteins and edges representing the relationship between two nodes. This relationship can be a physical interaction between two proteins, as in protein-protein interaction networks; a regulatory relationship between a transcription factor and a target gene, as in gene regulatory networks; or a statistical relationship, such as an observed correlation between the expression levels of two genes. Central to the utility of network models is that they can integrate many types of data in the same model system, because a single network can include several types of nodes and edges. An example is the structured knowledge representation of the Gene Ontology (GO), where cellular compartments, molecular functions and biological processes are ordered from more general to more specific. However, network models used purely as visualization tools can produce confusing graphics, resulting in the infamous “furball” plots familiar to readers of bioinformatics papers. The aim of this project is to use network models in conjunction with a class of mathematical analysis tools known as network propagation to generate specific, falsifiable, quantitative predictions.
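
As an illustration of what network propagation computes, the sketch below implements random walk with restart, one common formulation of the idea. It is not taken from the project itself: the function name, the toy four-node chain and the restart probability of 0.5 are all assumptions chosen for the example. Scores start on a seed node and spread along edges, so that nodes close to the seed in the network receive higher scores than distant ones.

```python
import numpy as np


def random_walk_with_restart(adjacency, seed_scores, restart=0.5,
                             tol=1e-8, max_iter=1000):
    """Propagate seed scores over a network (illustrative sketch only).

    adjacency   : symmetric (n, n) array, non-zero where two nodes interact
    seed_scores : length-n array, non-zero for the starting (seed) nodes
    restart     : probability of jumping back to the seeds at each step
                  (assumed value, not taken from the project description)
    """
    adjacency = np.asarray(adjacency, dtype=float)
    degree = adjacency.sum(axis=0)
    degree[degree == 0] = 1.0            # avoid division by zero for isolated nodes
    transition = adjacency / degree      # column-normalised: each column sums to 1

    p0 = np.asarray(seed_scores, dtype=float)
    p0 = p0 / p0.sum()                   # seeds form a probability distribution
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (transition @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:   # L1 change below tolerance
            return p_next
        p = p_next
    return p


# Hypothetical toy network: a chain of four proteins, 0 - 1 - 2 - 3.
toy_adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
toy_seeds = np.array([1.0, 0.0, 0.0, 0.0])   # node 0 is the only seed

scores = random_walk_with_restart(toy_adjacency, toy_seeds)
print(scores)
```

In this toy chain the score decreases with distance from the seed node. In a real analysis the seed scores might come from differential expression and the adjacency matrix from a protein-protein interaction database; both choices are outside the scope of this sketch.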

The model framework will be applied and tested in the context of inflammatory bowel disease (IBD). IBD, an autoimmune disease of the gastrointestinal tract, encompasses ulcerative colitis (UC) and Crohn’s disease (CD). The incidence of the disease is increasing, and an estimated 2.5-3.5 million individuals in Europe have an IBD diagnosis, incurring an annual treatment cost of 4.6-5.6 billion euros. The aetiology and pathogenesis of IBD are not completely understood, but current theories propose that a combination of genetic predisposition and environmental factors causes an aberrant activation of the immune system.

A large number of transcriptomics datasets have been published within the IBD field, using both microarrays and RNA-seq. First, this project will use network component analysis (NCA) to identify the key transcription factors underlying the observed patterns of gene expression variability between individual patients in these data. NCA represents the transcriptomic data as a linear combination of the activities of a small number of transcription factors and estimates a quantitative measure of the activity of these regulators. Second, network propagation models will combine protein-protein interaction networks with the gene expression data to suggest protein subnetworks that contribute to the activity of these regulators. This will provide insight into the mechanisms underlying individual differences in gene expression between IBD patients, which may be relevant for the further development of precision medicine in IBD.
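
The linear model behind NCA can be written as E = A P, where E is the gene-by-sample expression matrix, A holds the strength with which each transcription factor regulates each gene, and P holds the activity of each transcription factor in each sample. The sketch below is a minimal illustration under stated assumptions and not the project's implementation: it fits the model by alternating least squares, keeping A at zero wherever a prior regulatory network says that a transcription factor does not regulate a gene. The function name, the synthetic data and the connectivity mask are all hypothetical.

```python
import numpy as np


def nca_als(expression, connectivity_mask, n_iter=200, seed=0):
    """Alternating least-squares fit of E ~ A @ P (illustrative sketch only).

    expression        : (genes, samples) expression matrix E
    connectivity_mask : (genes, tfs) boolean array, True where a transcription
                        factor is allowed to regulate a gene
    Returns A (genes, tfs) and P (tfs, samples).
    """
    E = np.asarray(expression, dtype=float)
    mask = np.asarray(connectivity_mask, dtype=bool)
    n_genes = mask.shape[0]
    rng = np.random.default_rng(seed)
    # Random starting strengths on the permitted edges only.
    A = np.where(mask, rng.normal(size=mask.shape), 0.0)

    for _ in range(n_iter):
        # Step 1: estimate regulator activities given the current strengths.
        P, *_ = np.linalg.lstsq(A, E, rcond=None)
        # Step 2: re-estimate strengths gene by gene, using only permitted TFs.
        for g in range(n_genes):
            idx = np.flatnonzero(mask[g])
            if idx.size == 0:
                continue
            coef, *_ = np.linalg.lstsq(P[idx].T, E[g], rcond=None)
            A[g, idx] = coef
    # Final activities consistent with the last update of A.
    P, *_ = np.linalg.lstsq(A, E, rcond=None)
    return A, P


# Hypothetical synthetic example: 6 genes, 2 transcription factors, 8 samples.
rng = np.random.default_rng(1)
mask = np.array([[1, 0], [1, 0], [1, 1], [0, 1], [0, 1], [1, 1]], dtype=bool)
A_true = mask * rng.uniform(0.5, 2.0, size=mask.shape)
P_true = rng.normal(size=(2, 8))
E = A_true @ P_true                      # noise-free data for illustration

A_hat, P_hat = nca_als(E, mask)
rel_error = np.linalg.norm(E - A_hat @ P_hat) / np.linalg.norm(E)
print(rel_error)
```

Note that a decomposition of this kind determines each transcription factor's activity only up to a scaling factor, since rescaling a column of A and the matching row of P leaves the product unchanged. Estimated activities are therefore best compared between samples rather than read as absolute quantities.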

Members:

Ruth H Paulssen (Principal investigator) (Project manager)

Christopher Graham Fenton

Endre Anderssen

Financial/grant information:

Helse-Nord