BIOINFORMATICS

The Bioinformatics Group at INGM focuses on the analysis, development, optimization, and application of tools for processing biological and genomic data generated by the Institute’s researchers and their collaborators. The facility works closely with research groups, supporting standard and customized bioinformatic analyses for both basic and translational research. The group ensures access to up-to-date analytical methodologies, fostering the adoption of innovative approaches.
The team operates in an open space shared with bioinformaticians, students, and PhD candidates from various laboratories, promoting skill exchange and interdisciplinary collaboration.
The group’s expertise spans biology, systems biology, computer science, and biostatistics. This multidisciplinary background allows for an integrated and contextualized interpretation of biological data, with continuous attention to their biomedical and clinical implications.
Support from the Information Systems staff ensures constant access to high-performance computing (HPC) infrastructure and to the Institute’s network connectivity—both essential for the group’s activities.

Activities, Technologies and Methodologies

NGS Data Analysis (Next Generation Sequencing): Includes analysis of RNA-seq, ChIP-seq, ATAC-seq, RADICL-seq, SAMMY-seq and C-technologies (4C-seq, Hi-C). The entire analytical workflow is covered, from raw data processing to biological interpretation.
Single-cell Data Analysis: Use of 10X Genomics and Smart-seq-based technologies in a variety of applications. Integration of multi-omic approaches including VDJ-seq, feature barcoding, ATAC-seq, transposable element analysis, and LIBRA-seq. Transcriptome reconstruction and data integration using the Seurat and Scanpy ecosystems.
Long-read Sequencing Data Analysis (Oxford Nanopore Technologies): Bulk and single-cell transcriptomics, DNA methylation. Full data management, from basecalling to genome alignment and downstream analyses. Development of custom pipelines for bulk and single-cell applications, including gene/transcript quantification, transcriptome reconstruction, and modification calling.
Multiplexed Tissue Imaging Data Analysis (MACSima Technology): Accurate quantification of fluorescence intensity (MFI) and automated cell population labeling. Efficient segmentation of histological regions and identification of complex tissue structures and contexts. Cellular neighborhood analysis and topological assessment to enable detailed spatial characterization of the microenvironment.
Functional Analysis for Biological Contextualization: Includes enrichment tests for Gene Ontology (GO), Gene Set Enrichment Analysis (GSEA), co-expression analysis (WGCNA), and Gene Regulatory Network (GRN) inference. Integration with interaction or prediction databases (e.g., GRID, MINT, miRBase, TargetScan) for in-depth functional analysis.
Collaboration and Custom Solutions: Active collaboration with research groups to develop tailored solutions, adapting existing pipelines or creating new analytical workflows to meet specific project needs.
Training and Education: Internal training for new thesis students, fellows, and interns involved in bioinformatics analyses.
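
As an illustration of the enrichment tests mentioned under Functional Analysis, the following is a minimal sketch of a one-sided over-representation test based on the hypergeometric distribution. It is not taken from the group's pipelines: the function name, its parameters, and the example numbers are hypothetical, and real analyses would normally also apply multiple-testing correction across gene sets.

```python
from math import comb


def enrichment_pvalue(universe_size, set_size, n_selected, n_overlap):
    """One-sided over-representation p-value (hypothetical helper).

    Returns P(X >= n_overlap), where X is the number of gene-set members
    found among n_selected genes drawn without replacement from a universe
    of universe_size genes, set_size of which belong to the gene set.
    """
    max_overlap = min(set_size, n_selected)
    total = comb(universe_size, n_selected)
    # Sum the hypergeometric probabilities of the observed overlap or larger.
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Illustrative numbers only: 20 genes in the universe, 5 in the gene set,
# 5 selected genes, all 5 of which fall in the gene set.
p = enrichment_pvalue(universe_size=20, set_size=5, n_selected=5, n_overlap=5)
```

A small p-value suggests that the selected genes overlap the gene set more than expected by chance under this model.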

Team

Unit Coordinator

Prof. Beatrice Bodega, PhD

Staff

Name	Role	Email
Valeria Ranzani, PhD	Researcher	ranzani@ingm.org
Eugenia Galeota, PhD	Researcher	galeota@ingm.org
Andrea Gobbini	Researcher	gobbini@ingm.org
Alberto Carignano	Researcher	carignano@ingm.org

Affiliated members

Name	Role	Affiliated Lab	Email
Riccardo Nodari	Researcher	De Francesco	nodari@ingm.org
Francesco Panariello	Post Doc	Bodega	panariello@ingm.org
Mattia Battistella	Post Doc	Cattaneo	battistella@ingm.org
Simone Maestri	Post Doc	Cattaneo	maestri@ingm.org
Ivan Ferrari	Post Doc	Biffo	ferrari@ingm.org
Michele Panepuccia	PhD student	Bodega	panepuccia@ingm.org
Alen Stambolliu	PhD student	Bodega	stambolliu@ingm.org
Camilla Righetti	PhD student	Geginat	righetti@ingm.org
Marinicla Pascale	Predoctoral Fellow	Lanzuolo	pascale@ingm.org
Amanda Bianchi	Predoctoral Fellow	Bodega	abianchi@ingm.org
Marco Cominelli	Predoctoral Fellow	Bodega	cominelli@ingm.org
Francesca Conti	Predoctoral Fellow	Bodega	conti@ingm.org
Elisa Arsuffi	Predoctoral Fellow	Biffo	arsuffi@ingm.org
Christian dall'Ava	Predoctoral Fellow	Cattaneo	dallava@ingm.org
Francesco Giuseppe Tagliabue	Master Student	Cattaneo	tagliabue@ingm.org
Andrine Risoy	Master Student	Cattaneo	risoy@ingm.org
Davide Sandrelli	Master Student	Bodega	sandrelli@ingm.org
John Villis	Master Student	Bodega	villis@ingm.org
Behrad Kashefi	Master Student	Biffo	kashefi@ingm.org

In-house developed tools and software

CIA (Cluster Independent Annotation): A computational tool for analyzing single-cell RNA-seq (scRNA-seq) data, enabling accurate identification of cell populations based on predefined transcriptional signatures. CIA offers an intuitive, fast, and reproducible approach for single-cell functional annotation. PyPI | GitHub
IRescue: A tool for quantifying the expression of transposable element (TE) subfamilies in single-cell RNA sequencing (scRNA-seq) data. It performs UMI deduplication with sequencing error correction (for 10X or UMI-based libraries) or read quantification (for UMI-less libraries, such as SMART-seq), followed by probabilistic assignment of multi-mapping reads using an Expectation-Maximization (EM) procedure. GitHub
TEcount: A package for quantifying transposable elements (TEs) in bulk RNA-seq experiments at the subfamily, family, and class levels. GitHub
OligoMiner: A web app for designing oligo probes against any genomic target, suitable for hybridization-based applications (e.g., FISH). Website
CombiROC: A web application for generating and interactively analyzing ROC curves from multi-marker panels. Website
combiroc (R package): An R package representing a new implementation of the CombiROC method. It includes functions for automatic selection and optimization of gene signatures, including in single-cell RNA-seq contexts. GitHub | CRAN
myVCF: A desktop application for managing mutation data from high-throughput sequencing projects producing VCF files. It allows users with no bioinformatics background to explore, query, visualize, and export mutational data in a simple and intuitive way. Documentation
miRiadne (application no longer available): A web tool for re-annotating microRNA lists or datasets. Outdated annotations (due to older miRBase versions or profiling platforms) can be updated. The tool has been discontinued and the project is no longer maintained. For any requests, please contact the original authors of the publication.
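
The IRescue entry above mentions probabilistic assignment of multi-mapping reads through an Expectation-Maximization (EM) procedure. The sketch below shows the general idea on a toy example. It is a generic EM allocation written for illustration, not IRescue's actual implementation: the function name, its parameters, and the input layout are assumptions.

```python
def em_assign_multimappers(read_candidates, n_iter=100, tol=1e-9):
    """Distribute ambiguous reads among features with a simple EM loop.

    read_candidates: iterable of collections, each listing the features
    (e.g. TE subfamilies) that one read is compatible with.
    Returns a dict mapping each feature to its estimated fractional count.
    """
    reads = [set(c) for c in read_candidates if c]
    features = sorted({f for cands in reads for f in cands})
    if not features:
        return {}
    # Start from a uniform relative abundance.
    abundance = {f: 1.0 / len(features) for f in features}
    for _ in range(n_iter):
        counts = {f: 0.0 for f in features}
        # E-step: split each read among its candidates, proportionally
        # to the current abundance estimates.
        for cands in reads:
            norm = sum(abundance[f] for f in cands)
            for f in cands:
                counts[f] += abundance[f] / norm
        # M-step: re-estimate abundances from the fractional counts.
        total = sum(counts.values())
        updated = {f: counts[f] / total for f in features}
        delta = max(abs(updated[f] - abundance[f]) for f in features)
        abundance = updated
        if delta < tol:
            break
    return {f: abundance[f] * len(reads) for f in features}


# Toy input: three reads unique to "L1", one unique to "Alu", one ambiguous.
toy_reads = [{"L1"}, {"L1"}, {"L1"}, {"L1", "Alu"}, {"Alu"}]
estimates = em_assign_multimappers(toy_reads)
```

In this toy case the ambiguous read is shared in favor of the feature with more uniquely assigned reads, rather than being discarded or split evenly.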

INGM bioinformaticians rely on a high-performance computing (HPC) cluster with over 400 CPUs (AMD Opteron), 1.5 TB of RAM, and approximately 100 TB of storage. The infrastructure is maintained by the Information Systems staff in collaboration with the bioinformatics team. It features ultra-fast connectivity and is equipped with security and backup systems to ensure data protection.

Publications

IRescue: uncertainty-aware quantification of transposable elements expression at single cell level
Polimeni B, Marasca F, Ranzani V, Bodega B
Nucleic Acids Research, (2024)
[preprint] CIA: Unveiling Cellular Identities with Cluster-Independent Annotation in Single-Cell RNA Sequencing Data for Comprehensive Cell Type Characterization and Exploration
Ferrari, Battistella, Vincenti, Gobbini, Marini, Notarbartolo, Costanza, Biffo, Grifantini, Abrignani, Galeota
bioRxiv 2023.11.30.569382 (2023)
[preprint] Combinatorial selection of biomarkers to optimize gene signatures in diagnostics and single cell applications
Ferrari I, Mazzara S, Abrignani S, Grifantini R, Bombaci M, Rossi RL.
bioRxiv 2022.01.17.476603 (2022)
Novel interferon-sensitive genes unveiled by correlation-driven gene selection and systems biology
Cheroni C, Manganaro L, Donnici L, Bevilacqua V, Bonnal RJP, Rossi RL, De Francesco R.
Scientific Reports 11, 18043 (2021)
OligoMinerApp: a web-server application for the design of genome-scale oligonucleotide in situ hybridization probes through the flexible OligoMiner environment
Passaro M, Martinovic M, Bevilacqua V, Hershberg EA, Rossetti G, Beliveau BJ, Bonnal RJP, Pagani M.
Nucleic Acids Research, Volume 48, Issue W1, 02 July 2020, Pages W332–W339
Computation and Selection of Optimal Biomarker Combinations by Integrative ROC Analysis Using CombiROC
Bombaci M, Rossi RL.
In: Brun V., Couté Y. (eds) Proteomics for Biomarker Discovery. Methods in Molecular Biology, vol 1959. Humana Press, New York, NY. (2019)
Big Data: Challenge and Opportunity for Translational and Industrial Research
Rossi RL, Grifantini RM.
Front. Digit. Humanit. 5:13 (2018)
myVCF: a desktop application for high-throughput mutations data management
Pietrelli A, Valenti L.
Bioinformatics btx475 (2017)
CombiROC: an interactive web tool for selecting accurate marker combinations of omics data
Mazzara S, Rossi RL, Grifantini R, Donizetti S, Abrignani S, Bombaci M.
Sci Rep (2017) 7:45477
Normalization of circulating microRNA expression data obtained by quantitative real-time RT-PCR
Marabita F, de Candia P, Torri A, Tegnér J, Abrignani S, Rossi RL.
Brief Bioinform (2016) 17:204-12
miRiadne: a web tool for consistent integration of miRNA nomenclature
Bonnal RJ, Rossi RL, Carpi D, Ranzani V, Abrignani S, Pagani M.
Nucleic Acids Res (2015) 43:W487-92
