Analytical Genomics


The group works on problems connected with the functioning and evolution of biological systems. We use mathematical tools from statistics and combinatorics, together with algorithmic and molecular physics tools, to study the basic principles of cellular functioning starting from genomic data. We run several closely linked projects in parallel, all aimed at understanding the basic principles of evolution and co-evolution of molecular structures in the cell.

Four projects concern protein evolution and the development of bioinformatics and molecular modeling tools for the detection of:

1. distantly related proteins. We develop novel computational approaches to annotation based on sequence and protein family learning.
2. networks of co-evolved and/or dynamically correlated residues. On the one hand, a fine combinatorial analysis of phylogenetic trees allows us to reconstruct networks of co-evolved residues from sequence analysis. On the other hand, a thorough characterization of inter-residue dynamical correlations enables the detection of communication pathways across protein structures. Combining these approaches, we aim to predict interaction sites, mechanical and allosteric properties, and folding pathways in proteins.
3. functional sites on protein complexes and detection of potential protein partners. We combine evolutionary information (how evolution modified proteins to enhance their function) and molecular modeling (computational determination of the relative position of two interacting protein partners) to identify potential interactions.
4. alternative functional conformations of proteins. We develop and apply methods to describe the complex transitions of biomolecules, with the aim of predicting alternative conformations that play a functional role and that are suitable for drug targeting.

Four projects concern sequence evolution:

5. in microbial organisms: essential genes, synthetic biology and genome evolution. We extract information concerning environmental organization and essential metabolic networks from codon bias analysis. We aim at metabolic network reconstruction from metagenomic samples, genome synthesis, and modeling the evolvability of gene expression under changing environmental conditions.
6. in eukaryotic organisms: ab initio detection of miRNAs. We work on a novel ab initio approach to discover miRNAs organized in clusters, and we study the functional organisation of these clusters.
7. reconstruction of ancestral genomes and of chromosomal rearrangement dynamics. We develop general methods for reconstructing ancestral genomes and the history of their rearrangements. We focus on the phylogenetic tree of the Lachancea genus.
8. statistical methods for transcriptome analysis by deep sequencing. New sequencing technologies enable profiling the transcriptome of given cell types with unprecedented precision. We develop methods for detection and estimation of transcript expression levels.

Applications are numerous and include directed mutagenesis, synthetic biology, metagenomic data organisation, and gene annotation.

Selected Publications
Mathelier A, Carbone A.
Large scale chromosomal mapping of human microRNA structural clusters.
Nucleic Acids Res. 41(8), pp.4392-408 (2013).
Mirauta B, Nicolas P, Richard H.
Pardiff: Inference of Differential Expression at Base-Pair Level from RNA-Seq Experiments.
In IEEE International Conference on Image Analysis and Processing (ICIAP) 2013 Workshops, LNCS 8158. Springer. pp.418-427 (2013).
Lopes A, Sacquin-Mora S, Dimitrova V, Laine E, Ponty Y, Carbone A.
Protein-protein interactions in a crowded environment: an analysis via cross-docking simulations and evolutionary information.
PLoS Comput Biol. 9(12), pp.e1003369 (2013).
Laine E, Auclair C, Tchertanov L.
Allosteric communication across the native and mutated KIT receptor tyrosine kinase.
PLoS Comput Biol. 8(8), pp.e1002661 (2012).
Laine E, Goncalves C, Karst JC, Lesnard A, Rault S, Tang W-J, Malliavin TE, Ladant D, Blondel A.
Use of allostery to identify inhibitors of calmodulin-induced activation of Bacillus anthracis edema factor.
Proc Natl Acad Sci U S A. 107(25), pp.11277-82 (2010).
Mathelier A, Carbone A.
Chromosomal periodicity and positional networks of genes in Escherichia coli.
Mol Syst Biol. 6, pp.366 (2010).
Mathelier A, Carbone A.
MIReNA: finding microRNAs with high accuracy and no learning at genome scale and from deep sequencing data.
Bioinformatics. 26(18), pp.2226-34 (2010).
Richard H, Schulz MH, Sultan M, Nürnberger A, Schrinner S, Balzereit D, Dagand E, Rasche A, Lehrach H, Vingron M, Haas SA, Yaspo M-L.
Prediction of alternative isoforms from exon expression levels in RNA-Seq experiments.
Nucleic Acids Res. 38(10), pp.e112 (2010).
Engelen S, Trojan LA, Sacquin-Mora S, Lavery R, Carbone A.
Joint evolutionary trees: a large-scale method to predict protein interfaces based on sequence sampling.
PLoS Comput Biol. 5(1), pp.e1000267 (2009).
Sultan M, Schulz MH, Richard H, Magen A, Klingenhoff A, Scherf M, Seifert M, Borodina T, Soldatov A, Parkhomchuk D, Schmidt D, O'Keeffe S, Haas S, Vingron M, Lehrach H, Yaspo M-L.
A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome.
Science. 321(5891), pp.956-60 (2008).