Advanced Bioinformatics for Clinical Genomics Training Course

Biotechnology and Pharmaceutical Development

Course Overview

Introduction

The Advanced Bioinformatics for Clinical Genomics Training Course is designed to bridge the critical knowledge gap between raw Next-Generation Sequencing (NGS) data and actionable clinical insights. The curriculum covers the computational and statistical methods required for secure, reproducible analysis of high-throughput genomic data, with a specific focus on applications in Precision Medicine and diagnostic healthcare. Participants will master complex bioinformatics workflows, from FASTQ quality control and read alignment to somatic and germline variant calling. Key trending areas, including single-cell genomics, liquid biopsy analysis, and the application of Machine Learning (ML) to variant classification and drug discovery, are central themes.

The course emphasizes hands-on, project-based learning in a cloud computing environment, using the Linux command line and contemporary scripting languages such as Python and R. By focusing on real-world clinical case studies such as cancer research (oncology), rare disease diagnosis, and pharmacogenomics, graduates will be prepared to design, deploy, and validate robust clinical-grade analysis pipelines. This mastery positions them as high-value Clinical Bioinformaticians capable of contributing to translational research and enhancing patient-centered services within hospitals, research institutions, and the biotechnology sector.

Course Duration

10 days

Course Objectives

  1. Implement Next-Generation Sequencing (NGS) quality control and pre-processing using industry-standard tools such as FastQC.
  2. Execute robust read-mapping and genome alignment protocols for WGS, WES, and RNA-Seq data to generate BAM/SAM files.
  3. Master advanced techniques for germline and somatic variant calling across diverse clinical data sets.
  4. Apply variant annotation tools for clinical significance and pathogenicity assessment.
  5. Perform comprehensive RNA-Seq analysis, including differential gene expression (DGE) and isoform detection.
  6. Analyze single-cell RNA-Seq data, including clustering, dimensionality reduction, and cell-type identification.
  7. Apply Machine Learning (ML) algorithms for variant prioritization, biomarker discovery, and patient stratification.
  8. Develop scalable bioinformatics workflows using cloud-native tools and workflow managers.
  9. Enforce FAIR data principles and reproducible research practices using containerization.
  10. Analyze genomic data for drug response prediction and personalized therapeutic recommendations.
  11. Develop pipelines for tumor-normal pair analysis, identifying driver mutations and tumor heterogeneity.
  12. Understand information governance, ethical, and regulatory challenges in handling Electronic Health Records (EHR) and patient genomic data.
  13. Create high-impact genomic visualizations and interactive reports in R and Python for clinical presentation.

Target Audience

  1. Clinical Scientists and Pathologists.
  2. Bioinformatics Analysts.
  3. Genomic Medicine Trainees and Medical Fellows.
  4. Computational Biologists.
  5. R&D Scientists in Biotech and Pharmaceuticals.
  6. Data Scientists.
  7. Laboratory Directors.
  8. Software Engineers.

Course Modules

Module 1: Command Line & High-Performance Computing (HPC) for Genomics

  • Mastering the Linux command line for file manipulation and process management.
  • Introduction to HPC environments and cluster job submission.
  • Essential Python and Bash scripting for task automation.
  • Case Study: Setting up an entire bioinformatics analysis environment on an AWS EC2 instance.
  • Understanding data storage (BAM/VCF) and transfer protocols.

Module 2: Core Principles of Next-Generation Sequencing (NGS) Data

  • Review of major NGS platforms and data outputs.
  • FASTQ format deep dive.
  • Executing FastQC for initial data quality assessment and visualization.
  • Case Study: Troubleshooting a low-quality FASTQ dataset by identifying adapter contamination and overrepresented sequences.
  • Implementing read trimming and filtering to optimize input data.
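
As a taste of the FASTQ material in this module, the following minimal Python sketch decodes the quality line of a single four-line FASTQ record into Phred scores. It assumes the common Phred+33 encoding; the record shown is made up for illustration, and the function names are hypothetical rather than part of any course toolkit.

```python
from statistics import mean

def phred_scores(quality_line, offset=33):
    """Decode an ASCII quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]

def parse_fastq_record(lines):
    """Split one four-line FASTQ record into (read_id, sequence, scores)."""
    header, sequence, separator, quality = (ln.strip() for ln in lines)
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a well-formed FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    return header[1:], sequence, phred_scores(quality)

# Illustrative record (invented for this example, not real data).
record = ["@read1", "ACGT", "+", "II5!"]
read_id, seq, scores = parse_fastq_record(record)
print(read_id, seq, scores, mean(scores))
```

A low mean score, or a sharp drop toward the end of the read, is the kind of signal that motivates the trimming and filtering step listed above.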

Module 3: Advanced DNA Alignment and Reference Genomes

  • Algorithms for short read mapping and best-practice alignment parameters.
  • Understanding and working with the BAM/SAM format, flags, and headers.
  • Post-alignment processing.
  • Case Study: Aligning whole-exome sequencing data from a trio study against the GRCh38 human reference.
  • Quality control of alignments using metrics and coverage analysis.
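
The SAM/BAM flag field covered in this module is a bitmask, which can be illustrated with a short Python sketch. The bit values follow the SAM specification; the short label names and the `decode_flag` helper are hypothetical names chosen for this example.

```python
# Bit values as defined in the SAM specification; labels are illustrative.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}

def decode_flag(flag):
    """Return the labels of every bit set in a SAM flag value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]

for value in (99, 147, 1187):
    print(value, decode_flag(value))
```

Decoding flags this way is useful when deciding which reads to exclude during post-alignment processing, for example unmapped or duplicate reads.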

Module 4: Germline Variant Calling and Annotation

  • Principles of germline variant calling using industry-standard pipelines.
  • In-depth analysis of the Variant Call Format (VCF) structure and fields.
  • Filtering variants based on quality scores, depth, and population frequencies.
  • Case Study: Identifying the causal mutation in a simulated Mendelian rare disease pedigree using a VCF file.
  • Introduction to variant effect prediction tools.
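
The filtering topic in this module can be sketched in a few lines of plain Python. This is a minimal illustration only: the thresholds are arbitrary, the `POP_AF` INFO key is a hypothetical name for an annotated population allele frequency, each record is assumed to have a single ALT allele, and the VCF lines are invented. A variant with no population frequency annotation is treated here as rare, which is itself an assumption.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'DP=35;POP_AF=0.01' into a dict."""
    info = {}
    for item in info_field.split(";"):
        if not item or item == ".":
            continue
        key, _, value = item.partition("=")
        info[key] = value if value else True  # flag-style keys carry no value
    return info

def passes_filters(vcf_line, min_qual=30.0, min_depth=10,
                   max_pop_af=0.01, pop_af_key="POP_AF"):
    """Keep a variant only if QUAL, depth, and population frequency all pass."""
    fields = vcf_line.rstrip("\n").split("\t")
    qual, info = fields[5], parse_info(fields[7])
    if qual == "." or float(qual) < min_qual:
        return False
    if int(info.get("DP", 0)) < min_depth:
        return False
    # Assumption: a missing annotation means the variant is absent from the
    # population database, so it is treated as frequency 0.
    return float(info.get(pop_af_key, 0.0)) <= max_pop_af

# Invented example records: CHROM POS ID REF ALT QUAL FILTER INFO
example_lines = [
    "chr1\t12345\t.\tA\tG\t50\tPASS\tDP=35;POP_AF=0.0002",
    "chr1\t22345\t.\tC\tT\t50\tPASS\tDP=35;POP_AF=0.12",
    "chr2\t500\t.\tG\tA\t12\tPASS\tDP=35",
    "chr2\t900\t.\tT\tC\t60\tPASS\tDP=4",
]
kept = [line for line in example_lines if passes_filters(line)]
print(len(kept), "of", len(example_lines), "variants kept")
```

In a rare disease analysis, a filter of this kind narrows the candidate list before inheritance patterns in the pedigree are considered.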

Module 5: Somatic Variant Analysis in Oncology (Tumor-Normal)

  • Specific challenges and strategies for detecting somatic mutations in cancer.
  • Advanced callers for tumor-normal pairs.
  • Identifying and filtering sequencing artifacts and benign polymorphisms.
  • Case Study: Analyzing a matched tumor-normal pair to find a clinically relevant driver mutation in a lung cancer patient.
  • Interpreting allele frequencies and tumor purity estimates.
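
The relationship between allele frequency and tumor purity can be made concrete with a small Python sketch. It assumes a clonal mutation and uses the standard mixture model in which the expected variant allele frequency (VAF) equals the mutant copies contributed by tumor cells divided by all copies at the locus from tumor and normal cells combined. Function and parameter names are hypothetical.

```python
def observed_vaf(alt_reads, ref_reads):
    """Fraction of reads at a site that support the alternate allele."""
    total = alt_reads + ref_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return alt_reads / total

def expected_vaf(purity, mutant_copies=1, tumor_copy_number=2,
                 normal_copy_number=2):
    """Expected VAF of a clonal mutation in a tumor/normal cell mixture."""
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    mutant = purity * mutant_copies
    total = purity * tumor_copy_number + (1.0 - purity) * normal_copy_number
    return mutant / total

# Heterozygous mutation in a diploid region of a sample with 60% tumor cells.
print(expected_vaf(0.6))      # mutant share of all copies at the locus
print(observed_vaf(30, 70))   # 30 alternate reads out of 100
```

Comparing the observed VAF with the value expected at the estimated purity helps separate clonal from subclonal mutations and flags sites where copy number changes may be involved.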

Module 6: Copy Number Variation (CNV) and Structural Variant (SV) Detection

  • Computational methods for detecting CNVs from NGS depth-of-coverage data.
  • Analyzing large-scale Structural Variants (SVs).
  • Visualization of CNVs/SVs using tools like IGV or Circos.
  • Case Study: Using CNV analysis to diagnose a known microdeletion syndrome that standard SNP calling missed.
  • Assessing the clinical impact of CNVs on gene dosage.

Module 7: Introduction to RNA Sequencing (RNA-Seq) Workflows

  • Alignment strategies for RNA-Seq data.
  • Quantifying gene and transcript abundance.
  • Normalization methods for count data.
  • Case Study: Identifying differentially expressed genes (DEGs) between two cancer subtypes using DESeq2 in R.
  • Quality control for RNA-Seq.

Module 8: Advanced Transcriptomics: Differential Expression & Fusion Genes

  • Statistical testing for Differential Gene Expression (DGE) and multiple testing correction.
  • Pathway and Gene Ontology (GO) enrichment analysis to interpret gene lists.
  • Detecting gene fusion events relevant to oncology.
  • Case Study: Functional interpretation of a DGE list to hypothesize a drug's mechanism of action.
  • Introduction to alternative splicing and differential exon usage analysis.
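
Multiple testing correction, listed in this module, is often done with the Benjamini-Hochberg procedure. The sketch below is a plain-Python illustration of that procedure, not the implementation used by any particular DGE package, and the p-values are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

raw = [0.01, 0.04, 0.03, 0.005]  # invented p-values for four genes
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"p={p:.3f}  adjusted={q:.3f}")
```

Genes whose adjusted value falls below the chosen false discovery rate threshold would be carried forward to pathway and GO enrichment analysis.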

Module 9: Single-Cell and Spatial Genomics Bioinformatics

  • Specialized pre-processing for single-cell RNA-Seq data.
  • Dimensionality reduction and cell type clustering.
  • Analyzing cell-cell communication and trajectory inference.
  • Case Study: Identifying novel cell populations in a scRNA-Seq dataset from a tumor biopsy.
  • Fundamentals of Spatial Transcriptomics data analysis and visualization.

Module 10: Machine Learning (ML) for Genomic Interpretation

  • Fundamentals of supervised vs. unsupervised Machine Learning in genomics.
  • Feature engineering from VCF and gene expression data for predictive models.
  • Applying Random Forests/SVMs for variant pathogenicity prediction.
  • Case Study: Building an ML model to predict patient response to a specific chemotherapy based on somatic mutations.
  • Evaluating model performance.
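
Model evaluation for a binary classifier, such as responder versus non-responder, usually starts from the confusion matrix. The following plain-Python sketch computes a few common metrics from it; the labels are invented and the helper names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = positive, 0 = negative)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = pairs.count((1, 1))
    fp = pairs.count((0, 1))
    tn = pairs.count((0, 0))
    fn = pairs.count((1, 0))
    return tp, fp, tn, fn

def ratio(numerator, denominator):
    """Divide safely, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")

def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, precision, and accuracy from binary labels."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }

# Invented labels: 1 = responded to therapy, 0 = did not respond.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
print(classification_metrics(y_true, y_pred))
```

In a clinical setting, sensitivity and specificity are typically reported separately rather than relying on accuracy alone, because the two kinds of error carry different costs for the patient.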

Module 11: Clinical Variant Interpretation and Reporting

  • Applying ACMG/AMP guidelines for variant classification.
  • Curating and searching public clinical databases.
  • Structured generation of Clinical Genomics Reports for diagnostic use.
  • Case Study: Producing a final clinical report for a patient with an ambiguous variant of uncertain significance (VUS), arguing for its reclassification based on in silico evidence.
  • Understanding ethical, legal, and social implications (ELSI) of genomic data.

Module 12: Pharmacogenomics (PGx) and Drug Response

  • Genomic markers that influence drug metabolism and efficacy.
  • Bioinformatics tools for PGx analysis and reporting.
  • Analyzing genome-wide association study (GWAS) data for drug toxicity and adverse event prediction.
  • Case Study: Analyzing a patient's PGx panel to recommend the correct starting dose for a common antidepressant.
  • Integration of PGx data into Electronic Health Records (EHR).

Module 13: Reproducible Workflows and Cloud Deployment

  • Implementing FAIR data principles.
  • Using containerization for dependency management and portability.
  • Developing and deploying robust pipelines with workflow languages.
  • Case Study: Converting a complex NGS pipeline into a Nextflow workflow and deploying it on a cloud platform.
  • Best practices for version control (Git) and collaborative development.

Module 14: Population and Comparative Genomics

  • Fundamentals of Population Genetics and concepts like linkage disequilibrium (LD).
  • Performing Genome-Wide Association Studies (GWAS) and interpreting Manhattan plots.
  • Tools for comparative genomics and evolutionary analysis.
  • Case Study: Replicating a published GWAS finding using a publicly available dataset and validating the associated gene.
  • Understanding the challenge of population-specific variants and reference bias.

Module 15: Capstone Project and Pipeline Validation

  • Designing an end-to-end clinical diagnostics pipeline.
  • Rigorous pipeline validation and testing using synthetic and real-world controls.
  • Presentation of final project results and methodology to a peer-review panel.
  • Case Study: Validating a full pipeline against an external gold-standard dataset from a proficiency testing provider.
  • Strategies for continuous integration/continuous deployment (CI/CD) in a clinical lab setting.

Training Methodology

  • Active Learning.
  • Hands-on Labs.
  • Project-Based.
  • Flipped Classroom.
  • Pair Programming/Group Work.
  • Guest Speakers.

Register as a group of 3 or more participants for a discount.

Email us at info@datastatresearch.org or call +254724527104.

Certification

Upon successful completion of this training, participants will be issued a globally recognized certificate.

Tailor-Made Course

We also offer tailor-made courses based on your needs.

Key Notes

a. The participant must be conversant with English.

b. Upon completion of the training, the participant will be issued an Authorized Training Certificate.

c. Course duration is flexible and the contents can be modified to fit any number of days.

d. The course fee includes facilitation, training materials, two coffee breaks, a buffet lunch, and a certificate upon successful completion of the training.

e. One year of post-training support, consultation, and coaching is provided after the course.

f. Payment should be made at least a week before the training commences, to the DATASTAT CONSULTANCY LTD account indicated in the invoice, so that we can prepare better for you.
