New method for inference of gene regulatory networks

  • 13 July 2020
  • Topic
    Life sciences & medicine

The team of Prof. Jorge Gonçalves at the Luxembourg Centre for Systems Biomedicine of the University of Luxembourg, together with collaborators from Finland, has developed a new method that infers gene regulatory networks more easily and more accurately than was previously possible. The work has recently been published in the scientific journal Nature Communications.

The complexity of biological systems is encoded in gene regulatory networks. Such a network is a set of genes, or parts of genes, that interact with each other to control a specific cell function. Gene regulatory networks play an important role in development, differentiation and the response to environmental cues. “Unraveling this intricate web is a fundamental step in understanding the mechanisms of life and eventually developing efficient therapies to treat and cure diseases,” explains Prof. Jorge Gonçalves, Head of the Systems Control group at the Luxembourg Centre for Systems Biomedicine and senior author of the study.

With many different genes involved in such a network, inferring its structure is not a trivial task. “The major obstacle in inferring gene regulatory networks is the lack of data,” explains Dr Atte Aalto, first author of the study. “While time series data are nowadays widely available, they are typically noisy, with low sampling frequency and overall small number of samples.”

New method overcomes data scarcity  

In their paper, the team describes a newly developed method called BINGO (Bayesian Inference of Networks using Gaussian prOcess dynamical models), designed specifically to deal with these issues. The novelty of BINGO lies in a nonparametric approach featuring statistical sampling of continuous gene expression profiles. (Gene expression is the process by which the instructions in our DNA are converted into a functional product, such as a protein.) Capturing the complex dynamics of gene expression requires a rich enough model class, and with such a model class it is often not feasible to simulate trajectories that can be compared with the data. A common alternative is to estimate derivatives from the time series data and solve the resulting regression problem. This strategy, however, suffers badly from the low sampling frequency and the noise in the data. Trajectory sampling is a new idea that allows the use of a nonparametric approach in a truly continuous-time framework.
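To see why derivative estimation struggles with sparse sampling, consider the following minimal sketch. It is not taken from the BINGO code: the test signal sin(t), the function names and the sampling intervals are all assumptions chosen for illustration, and the data are noise-free so that only the effect of the sampling interval is visible.

```python
import math

def finite_difference(times, values):
    """Forward-difference estimate of dx/dt at every sample except the last."""
    return [
        (values[i + 1] - values[i]) / (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    ]

def max_derivative_error(step, t_end=6.0):
    """Largest error of the estimate for the toy signal x(t) = sin(t)."""
    n = int(round(t_end / step))
    times = [i * step for i in range(n + 1)]
    values = [math.sin(t) for t in times]  # hypothetical noise-free measurements
    estimates = finite_difference(times, values)
    # The true derivative of sin(t) is cos(t).
    return max(abs(est - math.cos(t)) for est, t in zip(estimates, times))

dense_error = max_derivative_error(step=0.1)   # frequent sampling
sparse_error = max_derivative_error(step=2.0)  # infrequent sampling
print(f"dense: {dense_error:.3f}, sparse: {sparse_error:.3f}")
```

With the longer sampling interval the estimated derivative deviates far more from the true one, and measurement noise, which this sketch leaves out, would make matters worse.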

“It was important for us to overcome the low sampling frequency, the inherent problem of most gene expression experiments,” details Atte Aalto. “BINGO is hence based on modelling gene expression with a nonlinear stochastic differential equation where the dynamics function (or drift function) is modelled as a Gaussian process. This defines gene expression as a stochastic process. Only then could we use Markov chain Monte Carlo (MCMC) techniques, which are the key to overcoming low sampling frequency by sampling the trajectory computationally also between measurement times.”
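The idea of representing a trajectory between measurement times can be illustrated with a small sketch. It uses the standard Euler-Maruyama discretisation to simulate a stochastic differential equation on a fine grid between two hypothetical measurement times. This is not BINGO's sampler: the drift here is a fixed, made-up decay function rather than a Gaussian process, no MCMC is involved, and all names and parameter values are assumptions.

```python
import math
import random

def euler_maruyama(drift, x0, t0, t1, n_steps, noise_scale, rng):
    """Simulate dx = drift(x) dt + noise_scale dW on a fine grid from t0 to t1."""
    dt = (t1 - t0) / n_steps
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        path.append(x + drift(x) * dt + noise_scale * dw)
    return path

def decay(x):
    # Hypothetical drift chosen only for illustration: simple degradation.
    return -x

rng = random.Random(0)  # fixed seed so the run is reproducible

# Two hypothetical measurement times (t = 0 and t = 1), 100 grid steps in between.
path = euler_maruyama(decay, x0=1.0, t0=0.0, t1=1.0,
                      n_steps=100, noise_scale=0.1, rng=rng)

# Without noise the simulation should approximate the exact solution exp(-t).
noise_free = euler_maruyama(decay, x0=1.0, t0=0.0, t1=1.0,
                            n_steps=100, noise_scale=0.0, rng=rng)
print(len(path), noise_free[-1], math.exp(-1.0))
```

In a method such as the one described above, the values on the fine grid would be treated as unknowns to be sampled given the measurements, rather than simulated forward from a known drift as in this sketch.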

BINGO outperforms previously existing methods

Not only does BINGO overcome the problem of data scarcity, it also clearly and consistently outperforms state-of-the-art methods when benchmarked on both real and simulated time-series data covering many different gene regulatory networks. “BINGO’s superior performance and ease of use even by non-specialists make gene regulatory network inference available to any researcher, helping to decipher the complex mechanisms of life,” concludes Prof. Gonçalves. BINGO is available for download at https://github.com/AtteAalto/BINGO.

Reference: Aalto, A., Viitasaari, L., Ilmonen, P. et al. Gene regulatory network inference from sparsely sampled noisy data. Nat Commun 11, 3493 (2020). https://doi.org/10.1038/s41467-020-17217-1

The project has been funded by the Fonds National de la Recherche, ERASysApp and the University of Luxembourg Internal Research Project program.
