Bioinformatics
To read more about the Bioinformatics theme within the 19th IHIWS, click on each of the following subthemes. You will learn about the project leaders, project description, milestones, data required, and more.
Project Leaders:
- Steven Mack
- Martin Maiers
- Kazutoyo Osoegawa
Detailed Project Description: Development of methods, standards, software tools, and online services to foster the standardized analysis, collection, exchange and storage of highly polymorphic immune-related genetic data for the purposes of basic and clinical research, the advancement of medical therapies, and understanding the genomics of the vertebrate immune system. A specific focus of this project involves development of pangenomic graphs for the HLA, KIR and LILR genes.
A key goal of this project is the development of an IHIWS database (dbIHIW) structure and system that fosters community data access along with database continuity across subsequent IHIWS iterations.
Milestones in Years:
2023: Community outreach, enrolment and new project formulation, continuation of existing projects, dbIHIW design.
2024: Construction of datasets, satellite DaSH meetings, publication of papers and standards, continued enrolment, continued dbIHIW design and implementation.
2025: Identification of sub-projects, satellite DaSH meetings, publication of papers and standards, continued enrolment, dbIHIW data migration and population.
2026: Presentation and dbIHIW launch.
Data Required (number, type of data, inclusion/exclusion criteria): TBD, but will include existing data from prior DaSH efforts along with data from prior IHIW efforts, going back to at least the 13th IHIW (e.g., dbMHC), and including data from the 17th and 18th IHIW efforts.
Samples required (if applicable, number, type of samples, inclusion/exclusion criteria): No biological samples will be *required* as part of this project, but data describing and generated from current and prior IHIW efforts will be requested.
Reagents/Additional Assays Required: NONE
Data Infrastructure Required: We look forward to discussing this with the 19th IHIW organizers and data team. Certainly AWS/Cloud resources will be used for development.
Project Name: SNP-HLA Reference Consortium (SHLARC)
Project Leaders: Nicolas Vince, Pierre-Antoine Gourraud
Detailed Project Description:
Over the past 15 years, genome-wide association studies (GWAS) have identified more than 10,000 associations. In particular, the HLA genomic region stands out as the most highly associated locus in GWAS, predominantly in immune-related diseases. SNPs are the hallmark of GWAS; however, the information carried by this type of genetic marker is very limited, especially in the HLA region, where linkage disequilibrium (LD; defined as the non-random association of alleles at different loci) is strong and extends over several megabases. To advance our understanding of functional mechanisms and potentially identify therapeutic targets, we must move beyond these simple associations, especially when dealing with HLA alleles. HLA typing techniques are expensive, require specialized laboratory infrastructure, and are in constant evolution.
However, recent developments in statistical inference enable us to impute HLA alleles from genotyped GWAS SNPs. Successful implementation of this technique relies on the availability of adequate reference panels for imputation. The objective of this project is to create diverse reference panels that enhance HLA imputation accuracy from GWAS datasets. To achieve this goal, we still need to:
1- Collect additional HLA and SNP data from numerous sources.
2- Improve our understanding of how diverse haplotypes and populations influence HLA imputation accuracy.
3- Maintain a digital platform (SHLARC, the SNP-HLA reference consortium: https://hla.univ-nantes.fr) accessible to scientists for their own data imputation needs.
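As an illustration of what "HLA imputation accuracy" in point 2 can mean in practice, the following is a minimal sketch of allele-level concordance between laboratory-typed and imputed genotypes at one locus. The function name, the dictionary layout, and the sample data are hypothetical and are not taken from the SHLARC platform; other accuracy measures may be used by the project.

```python
from collections import Counter


def allele_concordance(typed, imputed):
    """Return the fraction of typed alleles recovered by imputation.

    Both arguments map a sample ID to a pair of allele names for one
    locus. Allele order within a pair is ignored.
    """
    matched = 0
    total = 0
    for sample_id, typed_pair in typed.items():
        if sample_id not in imputed:
            continue  # skip samples that have no imputed call
        # Multiset intersection handles homozygotes and unordered pairs.
        overlap = Counter(typed_pair) & Counter(imputed[sample_id])
        matched += sum(overlap.values())
        total += len(typed_pair)
    return matched / total if total else float("nan")


# Hypothetical example data for HLA-A (not real study data).
typed = {
    "S1": ("A*02:01", "A*01:01"),
    "S2": ("A*03:01", "A*03:01"),
    "S3": ("A*24:02", "A*11:01"),
}
imputed = {
    "S1": ("A*01:01", "A*02:01"),  # same alleles, different order
    "S2": ("A*03:01", "A*02:01"),  # one of two alleles correct
    "S3": ("A*24:02", "A*11:01"),
}

accuracy = allele_concordance(typed, imputed)
print(f"Allele-level concordance: {accuracy:.3f}")
```

Computing this measure separately per locus and per ancestry group would be one way to compare how reference panel diversity affects the result.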
Practically, we have successfully gathered more than 10,000 samples from several sources including public data (the 1000 Genomes project), semi-public data (via access to the dbGAP and EGA data repositories), and direct collaborations. These latter datasets come from diverse ancestry backgrounds such as Brazil (European + African + Native American), Benin (African), and various European-ancestry populations (Western Europe, USA, Finland). We are still open to expanding the diversity of our data sources.
We have also developed a freely accessible online platform to perform HLA imputation using the datasets mentioned above: https://hla.univ-nantes.fr.
Milestones in years:
- 2023: Launch of the SHLARC website.
- 2024: Joint SHLARC/SIP workshop in Nantes, France (around September).
- 2026: Final Report on database diversity, HLA imputation performance, and applicability for research projects.
Data Required (number, type of data, inclusion/exclusion criteria):
Several types of data are suitable, but all must contain both SNP genotypes and molecular HLA typing at second-field resolution or higher for all HLA genes.
SNP genotypes: all types of GWAS chip data, sequencing data covering 500 kb around HLA genes, whole-genome sequencing (WGS) data.
Minimal HLA typing resolution: second-field. HLA can also be called from WGS data.
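The second-field requirement can be checked automatically before data submission. The sketch below counts the colon-separated fields of an allele name written in standard HLA nomenclature (for example `HLA-A*02:01:01:01`). The regular expression and function names are illustrative assumptions, not part of any SHLARC tooling, and the pattern deliberately does not handle G or P group codes or ambiguity strings.

```python
import re

# Matches names such as "A*02:01", "HLA-DRB1*15:01:01", or "HLA-A*24:02:01:02L".
# An optional expression suffix (N, L, S, C, A, Q) may follow the last field.
ALLELE_PATTERN = re.compile(
    r"^(?:HLA-)?(?P<gene>[A-Z0-9]+)\*"
    r"(?P<fields>\d+(?::\d+){0,3})"
    r"(?P<suffix>[NLSCAQ]?)$"
)


def field_count(allele):
    """Return the number of nomenclature fields, or 0 if the name is not parseable."""
    match = ALLELE_PATTERN.match(allele.strip())
    if match is None:
        return 0
    return len(match.group("fields").split(":"))


def has_second_field(allele):
    """True if the allele is typed to at least second-field resolution."""
    return field_count(allele) >= 2


for name in ["HLA-A*02:01", "A*02", "DRB1*15:01:01:01", "A2"]:
    print(name, has_second_field(name))
```

Serological names such as `A2` and first-field calls such as `A*02` fail the check, which matches the minimal resolution stated above.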
Data Infrastructure Required:
Data infrastructure will be hosted in the Nantes Université data center. Additionally, we will use our local high-throughput computing center (Glicid, Nantes Université) to build reference panels with the help of high-performance GPUs (NVIDIA A100).
Project Name: Clinical Histocompatibility Laboratory Informatics
Project Leaders: Loren Gragert and Nicholas Brown
Detailed Project Description:
The project will build key infrastructure to standardize collection and reporting of clinical histocompatibility data and improve analysis tools and resources to aid in virtual crossmatch assessments.
We will query the databases underlying the histocompatibility laboratory information systems (LIS) to extract detailed information on molecular HLA typing (high resolution NGS typing and intermediate resolution deceased donor typing), solid phase antibody screens, flow crossmatch, and post-transplant donor-specific antibody (DSA) assessments.
This more detailed information usually does not leave the HLA laboratory in an electronic form and is not being adequately captured in organ allocation systems and outcomes registries. To make it easier to transfer HLA lab information into electronic medical records (EMRs), biobanks for clinical research, and transplant registries, we are developing data standards for reporting results from HLA antibody screens and histocompatibility assessments. The project will continue development of the first XML-based format for HLA antibody data, HLA Antibody Markup Language (HAML), initially created by Eric Spierings, Gottfried Fischer, and Loren Gragert.
For organ allocation systems, we plan to build informatics tools that analyze and integrate histocompatibility data between donors and recipients to aid in virtual crossmatch assessments. We also plan to build tools that would help perform data cleaning/curation for large-scale reanalysis of historical histocompatibility data for research.
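To make the idea of a virtual crossmatch concrete, here is a minimal sketch that compares a donor's HLA typing against a recipient's antibody screen results and lists potential donor-specific antibodies. It is not the project's tool: the class and function names, the antigen-level matching, and the MFI cutoff of 2000 are all assumptions chosen for illustration, and laboratories set their own thresholds and matching rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AntibodyResult:
    specificity: str  # HLA antigen targeted by the antibody, e.g. "A2"
    mfi: float        # mean fluorescence intensity from a solid phase assay


def virtual_crossmatch(donor_antigens, recipient_antibodies, mfi_cutoff=2000.0):
    """Return recipient antibodies that target a donor antigen.

    mfi_cutoff is a hypothetical positivity threshold, not a recommended value.
    Results are sorted from strongest to weakest signal.
    """
    donor = set(donor_antigens)
    hits = [
        ab for ab in recipient_antibodies
        if ab.specificity in donor and ab.mfi >= mfi_cutoff
    ]
    return sorted(hits, key=lambda ab: ab.mfi, reverse=True)


# Hypothetical donor typing and recipient screen (not real patient data).
donor_typing = ["A2", "A24", "B7", "B44", "DR15", "DR4"]
screen = [
    AntibodyResult("A2", 8500.0),
    AntibodyResult("B7", 900.0),     # below the assumed cutoff
    AntibodyResult("B57", 12000.0),  # strong, but not a donor antigen
    AntibodyResult("DR15", 3000.0),
]

dsa = virtual_crossmatch(donor_typing, screen)
print([ab.specificity for ab in dsa])
```

An empty result would correspond to a negative virtual crossmatch under these assumptions. A production tool would also need to map allele-level typing to antigens and handle epitope-level specificities, which this sketch omits.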
Beyond organ allocation systems, as multiple mismatched unrelated donors (MMUD) become more common in hematopoietic stem cell transplantation (HSCT), registries such as the National Marrow Donor Program (NMDP) and the World Marrow Donor Association (WMDA) are recognizing an increased need for their donor selection systems to capture HLA antibody screen data and use it to automatically screen out incompatible donors from the search.
Our team plans to make all our informatics tools available to the transplant and immunogenetics community to benefit other research consortia, including Clinical Trials in Organ Transplantation (CTOT).
Milestones in Years:
- 2023: Scripts developed to extract detailed information from histocompatibility lab information systems from leading vendors on antibody screens, intermediate resolution molecular typing, and crossmatch results.
- 2024: Publication of the HLA Antibody Markup Language (HAML) XML standard.
- 2025: Publish standards for transmitting histocompatibility data in electronic medical records (HL7 FHIR Orders and Observations implementation guide for communicating HLA antibody data and histocompatibility assessments).
- 2026: Test advanced virtual crossmatch tools on simulated donor and recipient pairings.
Data Required (number, type of data, inclusion/exclusion criteria):
- Historical data on HLA typing (molecular and antigen level), antibody assays, and crossmatch results
- Scripts will be provided for extracting and de-identifying detailed data from laboratory information systems of leading vendors
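As a rough idea of what a de-identification step might involve, the sketch below drops direct identifiers from an extracted record and replaces the patient ID with a keyed hash, so that records from the same patient can still be linked without exposing the original ID. This is not the project's extraction script: the field names, the record layout, and the choice of HMAC-SHA256 are assumptions for illustration, and real de-identification must follow local privacy regulations.

```python
import hashlib
import hmac


def pseudonymize(patient_id, secret_key):
    """Return a stable pseudonym for patient_id using a keyed SHA-256 hash."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(record, secret_key, drop_fields=("name", "date_of_birth")):
    """Remove direct identifiers and replace the patient ID with a pseudonym.

    The field names are hypothetical; a real LIS export will differ.
    """
    cleaned = {k: v for k, v in record.items() if k not in drop_fields}
    cleaned["patient_id"] = pseudonymize(str(record["patient_id"]), secret_key)
    return cleaned


# Hypothetical record; the key would be kept secret by the contributing lab.
key = b"site-specific-secret"
record = {
    "patient_id": "MRN-001234",
    "name": "Jane Doe",
    "date_of_birth": "1970-01-01",
    "hla_typing": ["A*02:01", "A*24:02"],
    "crossmatch_result": "negative",
}

safe_record = deidentify(record, key)
print(sorted(safe_record))
```

Because the hash is keyed, the same patient ID always maps to the same pseudonym within a site, while someone without the key cannot recompute it from a guessed ID.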
Samples Required (if applicable, number, type of samples, inclusion/exclusion criteria):
- Physical samples are not required for this project. The project will involve secondary data analysis.
Reagents/additional assays required:
- Participants will not be required to run additional assays or utilize reagents.
Data Infrastructure Required:
- Participants will need access to their clinical histocompatibility information systems.
- The project will host a web server providing web-based analysis tools to aid in virtual crossmatching, along with user-authenticated access to de-identified datasets.