BIOINFORMATICS

The Bioinformatics group is a team of professional scientists focusing on the study, application, development, and optimization of tools for analyzing the genomic and biological data generated by INGM scientists and collaborators. The facility works closely with the Institute’s researchers and makes biological data analysis and processing available to all research groups. It supports basic and translational research with both standard and customized analyses, and it provides and facilitates access to up-to-date as well as novel analytical methods.

Our team works in the Institute’s bioinformatics open space, in close contact with other bioinformatics researchers and graduate students. Team members are either dedicated to single projects or collaborate on multiple ones, according to the Institute’s needs, workloads, and the requirements of principal investigators. Our background is broad and covers biology, systems biology, computer science, and biostatistics; our multidisciplinary nature gives us a fresh and systemic view of the data and of their biomedical and clinical context.
The IT staff provides us with access to INGM’s in-house, state-of-the-art high-performance computing infrastructure and connectivity.

Activity

  • Support on experimental design for data-intensive projects, data cleaning and data exploration.
  • Medium- to high-throughput gene expression profiling: from RT-qPCR arrays to microarrays.
  • Next generation sequencing analyses: RNA sequencing, whole exome sequencing, custom panels, ChIP sequencing.
  • Analysis of non-coding RNA data, including cellular and circulating microRNAs.
  • Multivariate analyses for transcriptomics, genomics and proteomics; feature selection, biomarker prioritization, descriptive and inferential biostatistics.
  • Functional analyses for biological contextualization: gene ontology and pathway analysis methods.
  • Advanced functional analyses based on pathway impact and network metrics.
  • Design and development of software applications for computational biology and genomics.
  • Training for students and interns as developing bioinformaticians.
  • Internal training in general data literacy and scripting principles, aimed at all INGM researchers.

Team

Unit Coordinator

Prof. Beatrice Bodega, PhD

Staff

Name | Role | Email
Valeria Ranzani, PhD | Research Scientist | ranzani@ingm.org
Eugenia Galeota, PhD | Research Scientist | galeota@ingm.org

Affiliated members

Name | Role | Affiliated lab | Email
Andrea Gobbini | Researcher | Grifantini | gobbini@ingm.org
Ivan Ferrari | Researcher | Biffo | ferrari@ingm.org
Riccardo Nodari | Researcher | De Francesco | nodari@ingm.org
Benedetto Polimeni | PhD student | Bodega | polimeni@ingm.org
Lorenzo Salviati | PhD student | Bodega | salviati@ingm.org
Emanuele Di Patrizio Soldateschi | PhD student | Lanzuolo | soldateschi@ingm.org
Mattia Battistella | PhD student | Cattaneo | battistella@ingm.org
Francesca Vincenti | Research fellow | Abrignani | vincenti@ingm.org
Gianluca Damaggio | Research fellow | Cattaneo | gianluca.damaggio@unimi.it
Michele Panepuccia | Research fellow | Bodega | panepuccia@ingm.org
Isidora Bijelović | Master's student | Bodega | bijelovic@ingm.org
Alen Stambolliu | Master's student | Bodega | stambolliu@ingm.org
Carola Miuccio | II level Master intern | Manganaro | miuccio@ingm.org

Equipment

INGM bioinformaticians rely on an in-house high-performance computing (HPC) cluster with more than 300 CPUs, 1.5 TB of RAM and about 100 TB of disk storage. The infrastructure was deployed and is maintained by the Information Technology personnel in collaboration with the bioinformatics group. The whole infrastructure is wired with high-speed connectivity and protected by security and backup systems.

Computational activities are performed on the HPC and managed by the Torque/PBS queue system on a series of virtual machines. Both the HPC and the VMs run the Ubuntu Linux operating system. Minor computational tasks can also be performed locally on personal workstations (Xeon PCs, Windows OS) and/or laptops (Windows/macOS), which are also used as HPC clients.
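As an illustration of how work is typically handed to a Torque/PBS queue, the following minimal Python sketch builds the text of a single-node job script. The function name, the default resource values and the example command are assumptions made for illustration only; they do not describe INGM's actual queue configuration or limits.

```python
def make_pbs_script(job_name, command, cores=4, mem_gb=8, walltime="02:00:00"):
    """Return the text of a single-node Torque/PBS job script.

    All defaults are illustrative assumptions, not site settings.
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",             # job name shown in the queue
        f"#PBS -l nodes=1:ppn={cores}",    # one node, `cores` processors
        f"#PBS -l mem={mem_gb}gb",         # requested memory
        f"#PBS -l walltime={walltime}",    # maximum run time (HH:MM:SS)
        "#PBS -j oe",                      # merge stderr into stdout
        'cd "$PBS_O_WORKDIR"',             # start in the submission directory
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job: the command is a placeholder, not a real pipeline step.
    print(make_pbs_script("demo_job", "echo 'analysis step goes here'"))
```

The printed text can be saved to a file and submitted with `qsub`; the resources actually available depend on how the cluster queues are configured.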

Applications / Software

  • combiroc (R package) (GitHub) (CRAN)
    The combiroc R package is our most recent implementation of the CombiROC method, introducing additional functions for the automatic selection and optimization of gene signatures, including in the context of single-cell RNA sequencing experiments.
  • CombiROC (http://www.combiroc.eu)
    CombiROC is a web application for guided and interactive generation of multimarker panels.
  • myVCF (http://myvcf.readthedocs.io/en/latest/)
    myVCF is an application for managing high-throughput mutation data across multiple sequencing projects created from VCF files; it allows end users without strong programming or bioinformatics skills to explore, query, visualize and export mutation data in a simple and straightforward way.
  • miRiadne
    miRiadne is a tool for re-annotating miRNA name lists or datasets. Obsolete annotations (due either to older miRBase versions or to outdated profiling platforms) can be converted into newer ones while enforcing mature sequence correspondence. This project is no longer maintained and the application is no longer available; for any enquiry, please contact the paper’s main author (see below).
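To give a feel for what a multimarker panel is in the CombiROC entries above, here is a generic Python sketch that scores every combination of markers by sensitivity and specificity. It is not the combiroc API and does not reproduce the CombiROC workflow: the rule used (a sample is called positive when at least `min_positive` markers of the panel reach a shared `threshold`), the function name and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def panel_performance(samples, labels, markers, threshold, min_positive=1):
    """Return {panel: (sensitivity, specificity)} for every marker combination.

    samples: list of dicts mapping marker name -> signal value
    labels:  list of bools, True for cases and False for controls
    Assumes at least one case and one control are present.
    """
    results = {}
    for size in range(max(1, min_positive), len(markers) + 1):
        for panel in combinations(markers, size):
            tp = fn = tn = fp = 0
            for sample, is_case in zip(samples, labels):
                hits = sum(sample[m] >= threshold for m in panel)
                called_positive = hits >= min_positive
                if is_case:
                    tp += called_positive
                    fn += not called_positive
                else:
                    fp += called_positive
                    tn += not called_positive
            results[panel] = (tp / (tp + fn), tn / (tn + fp))
    return results


# Toy, made-up data: two hypothetical markers, two cases, two controls.
toy_samples = [{"A": 5, "B": 1}, {"A": 1, "B": 6}, {"A": 1, "B": 1}, {"A": 2, "B": 0}]
toy_labels = [True, True, False, False]
toy_results = panel_performance(toy_samples, toy_labels, ["A", "B"], threshold=3)
```

In this toy data each single marker detects only one of the two cases, while the two-marker panel detects both without calling any control positive, which is the kind of gain a combinatorial search is meant to surface.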

Publications
