Epidemiology Analytics with Python and Power BI Training Course

About the Course

This intensive five-day training course integrates epidemiological principles with modern data science tools: Python for analysis and Power BI for visualization and reporting. Participants gain the skills needed to process, analyze, model, and visualize public health data so they can assess disease burden, identify risk factors, monitor outbreaks, and communicate findings to stakeholders. The course balances rigorous epidemiological methodology with hands-on practice in coding and business intelligence platforms.

The curriculum is structured across 16 progressive modules spanning the full analytic workflow in public health. Key topics include calculating core epidemiological measures (incidence, prevalence, risk ratios), manipulating data with the Python pandas library, building data models and DAX measures in Power BI, implementing regression and survival analysis, and creating interactive disease surveillance dashboards. Every module includes a practical session using real or simulated public health data to reinforce the analytical and visualization concepts covered.

Who should attend the training

· Public Health Professionals

· Epidemiologists and Biostatisticians

· Healthcare Data Analysts

· Biomedical Researchers

· Data Scientists interested in health applications

Objectives of the training

· Personal benefits

o Master data cleaning and analysis of complex health data using Python's Pandas

o Confidently calculate and interpret key epidemiological measures (OR, RR, etc.)

o Design and build professional, interactive dashboards in Power BI

o Apply inferential statistical models like logistic regression and survival analysis

o Integrate coding, statistics, and visualization into a seamless analytical pipeline

· Organizational benefits

o Enable the organization to perform rapid, in-house outbreak analysis and reporting

o Improve the quality and clarity of public health surveillance data visualization

o Enhance predictive modeling capabilities for resource allocation and intervention planning

o Standardize data reporting processes using powerful BI tools

o Foster a stronger data-driven culture within public health and healthcare departments

Course Duration: 5 days

Training fee: USD 3000

Training methodology

· Instructor-led coding workshops using Python (Jupyter Notebooks)

· Hands-on lab sessions for building Power BI dashboards from scratch

· Case study analysis of real or simulated disease outbreak datasets

· Collaborative data interpretation and presentation exercises

Trainer Experience

Our trainers are experienced public health informaticians and epidemiologists with doctoral-level expertise and practical experience in government health agencies and global health organizations. They specialize in leveraging open-source tools (Python) and commercial BI platforms (Power BI) to translate complex health data into actionable insights for policymakers and clinicians.

Quality Statement

We are committed to delivering a high-quality, specialized training program that meets the highest standards of epidemiological rigor and modern data analytics proficiency. Our curriculum emphasizes practical skill mastery in Python and Power BI, ensuring participants can immediately apply advanced analytical techniques to real-world public health challenges.

Tailor-made courses

This course can be customized to focus on specific disease areas (e.g., non-communicable diseases, infectious diseases), specialized datasets (e.g., genomics data), or alternative software stacks (e.g., R and Tableau). We offer flexible delivery options, including on-site, virtual, and blended learning solutions to meet your organizational needs.

Module 1: Foundations of Epidemiology and Public Health Data

· Definition, scope, and types of epidemiology (descriptive, analytic, experimental)

· Data sources in public health: Surveillance systems, surveys, and electronic health records (EHR)

· Understanding the epidemiological triad (Host, Agent, Environment)

· Types of study designs: Cohort, Case-Control, Cross-Sectional

· Introduction to common public health data standards and formats

· Practical session: Reviewing and classifying various real-world public health datasets by source and study design

Module 2: Python Environment Setup and Data Handling with Pandas

· Setting up the analytical environment (Anaconda, Jupyter Notebooks)

· Introduction to fundamental Python syntax and data structures

· Loading, viewing, and basic inspection of data using the Pandas library

· Essential data indexing, slicing, and filtering operations in Pandas

· Handling missing values and basic data imputation techniques

· Practical session: Loading a large CSV file of health records, inspecting the first 10 rows, and summarizing key variables
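
As a rough illustration of the loading and inspection steps listed above, the following sketch reads a tiny in-memory CSV instead of a large file. The column names and values are hypothetical and are not taken from the course dataset.

```python
import io

import pandas as pd

# Hypothetical health records; a real session would pass a file path to read_csv.
csv_text = """patient_id,age,sex,diagnosis
1,34,F,influenza
2,,M,measles
3,51,F,influenza
4,7,M,
"""

records = pd.read_csv(io.StringIO(csv_text))

# First rows of the table (head(10) returns fewer rows if the table is shorter).
preview = records.head(10)

# Count missing values per column before deciding on any imputation.
missing_per_column = records.isna().sum()

# Frequency of each diagnosis; missing diagnoses are excluded by default.
diagnosis_counts = records["diagnosis"].value_counts()

print(preview)
print(missing_per_column)
print(diagnosis_counts)
```

Counting missing values first makes it easier to judge whether a simple imputation is defensible for a given variable.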

Module 3: Core Epidemiological Measures and Descriptive Analytics

· Calculating and interpreting incidence (cumulative and density)

· Calculating and interpreting prevalence (point and period)

· Standardization of rates (direct and indirect methods)

· Measures of mortality (Crude death rate, cause-specific rate)

· Calculating person-time and adjusting for follow-up loss

· Practical session: Calculating and comparing the crude and age-standardized incidence rates for a chronic disease using Python
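
The comparison of crude and age-standardized rates can be sketched in plain Python as follows. This is a minimal example of direct standardization; the regions, counts, and standard-population weights are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Hypothetical (cases, person-years) per age band for two regions.
region_a = {"0-39": (20, 50_000), "40-64": (90, 30_000), "65+": (200, 20_000)}
region_b = {"0-39": (10, 20_000), "40-64": (100, 30_000), "65+": (480, 50_000)}

# Hypothetical standard population weights; they must sum to 1.
standard_weights = {"0-39": 0.5, "40-64": 0.3, "65+": 0.2}


def crude_rate(strata, per=100_000):
    """Total cases divided by total person-time, scaled to `per`."""
    cases = sum(c for c, _ in strata.values())
    person_years = sum(py for _, py in strata.values())
    return cases / person_years * per


def direct_standardized_rate(strata, weights, per=100_000):
    """Weighted sum of age-specific rates using the standard weights."""
    return sum(weights[age] * c / py for age, (c, py) in strata.items()) * per


for name, strata in [("A", region_a), ("B", region_b)]:
    crude = crude_rate(strata)
    adjusted = direct_standardized_rate(strata, standard_weights)
    print(f"Region {name}: crude={crude:.1f}, age-standardized={adjusted:.1f}")
```

With these made-up numbers the crude rate in region B is almost double that of region A, while the standardized rates are close, because region B has a much older population.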

Module 4: Inferential Epidemiology: Risk, Association, and Hypothesis Testing

· Measuring association: Odds Ratio (OR) and Relative Risk (RR)

· Calculating Attributable Risk and Population Attributable Risk (PAR)

· Introduction to Chi-Squared ($\chi^2$) tests for categorical data

· Calculating and interpreting confidence intervals for OR and RR

· Formulating and testing epidemiological hypotheses

· Practical session: Analyzing a simulated 2x2 contingency table for a case-control study and calculating the OR, its $95\%$ CI, and the p-value using Python
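
A minimal sketch of the 2x2 analysis described above, using only the standard library. It assumes the Woolf (log-based) method for the confidence interval and a Pearson chi-squared test without continuity correction; the cell counts are hypothetical.

```python
import math


def odds_ratio_2x2(a, b, c, d, z=1.96):
    """Analyze a case-control 2x2 table.

    a = exposed cases,    b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    Returns (OR, CI lower, CI upper, chi-squared, p-value).
    """
    odds_ratio = (a * d) / (b * c)

    # Woolf method: standard error of log(OR).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)

    # Pearson chi-squared statistic (1 degree of freedom, no correction).
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

    # Upper-tail probability of a chi-squared variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, lower, upper, chi2, p_value


# Hypothetical counts: 40 of 100 cases exposed, 20 of 100 controls exposed.
or_, lower, upper, chi2, p_value = odds_ratio_2x2(40, 60, 20, 80)
print(f"OR={or_:.2f}, 95% CI=({lower:.2f}, {upper:.2f}), chi2={chi2:.2f}, p={p_value:.4f}")
```

The function fails with a division error if any cell is zero; a common workaround, not shown here, is to add 0.5 to every cell.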

Module 5: Data Cleaning and Validation for Epidemiological Datasets

· Identifying and correcting data entry errors and inconsistencies

· Techniques for merging and joining multiple epidemiological datasets

· Reshaping data (wide-to-long and long-to-wide) for analysis

· Handling date and time variables for time-based analysis

· Data validation strategies and quality checks (range, uniqueness)

· Practical session: Cleaning a simulated outbreak dataset with missing values, inconsistent formats, and merging it with a demographic dataset in Python
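
The cleaning and merging steps above might look like the following pandas sketch. The line list, district names, and population figures are hypothetical, and the specific checks are one reasonable choice among many.

```python
import pandas as pd

# Hypothetical line list with a duplicate row, untidy labels, and a bad date.
cases = pd.DataFrame({
    "case_id": [1, 2, 2, 3, 4],
    "district": [" north", "South", "South", "NORTH ", "east"],
    "onset_date": ["2024-01-03", "2024-01-05", "2024-01-05", "not recorded", "2024-01-09"],
    "age": [34, None, None, 51, 7],
})

# Hypothetical demographic table keyed by district.
demographics = pd.DataFrame({
    "district": ["North", "South", "East"],
    "population": [12000, 8500, 4300],
})

clean = (
    cases.drop_duplicates(subset="case_id")  # uniqueness check on the case ID
    .assign(
        # Harmonize inconsistent district spellings.
        district=lambda d: d["district"].str.strip().str.title(),
        # Unparseable dates become NaT instead of raising an error.
        onset_date=lambda d: pd.to_datetime(d["onset_date"], errors="coerce"),
        # Range check: missing ages pass, implausible ages are flagged False.
        age_valid=lambda d: d["age"].isna() | d["age"].between(0, 120),
    )
)

# validate= raises if a district appears more than once in the demographic table.
merged = clean.merge(demographics, on="district", how="left", validate="many_to_one")
print(merged)
```

A left join keeps every case even when its district has no demographic match, so unmatched rows show up as missing `population` values rather than disappearing.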

Module 6: Data Visualization for Epidemiology using Matplotlib and Seaborn

· Principles of effective visualization in public health reporting

· Creating Epidemic Curves (histograms over time)

· Generating Choropleth Maps for visualizing geographical spread

· Creating and customizing box plots and violin plots for comparing distributions

· Customizing plot aesthetics for professional reporting

· Practical session: Generating an epidemic curve for a simulated outbreak and annotating key events using Matplotlib

Module 7: Introduction to Time-Series Analysis and Epidemic Curves

· Decomposing time-series data: Trend, Seasonality, and Residual

· Utilizing moving averages for smoothing and trend identification

· Introduction to simple time-series forecasting models (ARIMA overview)

· Identifying clusters and anomalies in temporal data

· Overview of calculating and visualizing the basic reproduction number ($R_0$)

· Practical session: Applying a simple moving average to a weekly influenza case count dataset in Python to identify the underlying trend
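
The smoothing step can be sketched without any libraries. This example computes a trailing simple moving average over hypothetical weekly case counts; the window length of 3 is an arbitrary choice for illustration.

```python
def moving_average(values, window):
    """Trailing simple moving average; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical weekly influenza case counts.
weekly_cases = [12, 15, 30, 22, 18, 40, 35, 50]
smoothed = moving_average(weekly_cases, window=3)
print([round(x, 1) for x in smoothed])
```

In pandas the same result, padded with leading missing values, comes from `Series.rolling(window).mean()`.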

Module 8: Introduction to Power BI: Connecting and Transforming Health Data

· Understanding the Power BI interface (Desktop, Service, Report)

· Connecting to various public health data sources (CSV, Excel, Databases)

· Using the Power Query Editor for Extract, Transform, Load (ETL) operations

· Basic data transformations (unpivoting, grouping, calculated columns)

· Data loading best practices and troubleshooting common connection issues

· Practical session: Connecting Power BI to a raw health data source and performing three essential cleaning and transformation steps in Power Query

Module 9: Data Modeling and DAX for Public Health Metrics in Power BI

· Principles of dimensional modeling (Star and Snowflake schemas)

· Creating relationships between fact and dimension tables

· Introduction to Data Analysis Expressions (DAX) language

· Writing fundamental DAX measures (SUM, COUNTROWS, CALCULATE)

· Creating Time Intelligence functions in DAX for year-over-year comparison

· Practical session: Building a star schema data model and writing three core DAX measures to calculate Incidence Rate and Case Fatality Rate

Module 10: Building Interactive Epidemiological Dashboards

· Principles of dashboard design for public health audiences

· Selecting appropriate visuals for different epidemiological metrics

· Implementing slicers, filters, and cross-filtering for interactivity

· Designing a clear, scannable layout for key performance indicators (KPIs)

· Utilizing drill-through and tooltips for deeper data exploration

· Practical session: Building the first page of a disease surveillance dashboard in Power BI, including at least five interactive visuals

Module 11: Regression Modeling in Epidemiology (Linear and Logistic)

· Review of when to use Linear vs. Logistic Regression

· Interpreting coefficients in a logistic regression model (Log-odds, OR)

· Assessing model fit and performance metrics (R-squared, AIC, AUC)

· Feature selection and model building techniques in Python (statsmodels or scikit-learn)

· Documenting and presenting regression results to non-statisticians

· Practical session: Building a logistic regression model in Python to predict disease status based on risk factors and interpreting the resulting Odds Ratios
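
To make the link between log-odds coefficients and odds ratios concrete, the sketch below exponentiates a coefficient and its Wald confidence limits. The coefficient and standard error are hypothetical values, not output from a fitted model.

```python
import math


def coefficient_to_odds_ratio(beta, se, z=1.96):
    """Convert a logistic regression coefficient (log-odds) to an OR with a Wald CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical coefficient for a binary risk factor: log(2), i.e. a doubling of the odds.
beta = math.log(2)
se = 0.25

odds_ratio, lower, upper = coefficient_to_odds_ratio(beta, se)
print(f"OR={odds_ratio:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

The interval is symmetric on the log scale, so it is asymmetric around the odds ratio itself.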

Module 12: Survival Analysis Fundamentals and Kaplan-Meier Curves

· Introduction to time-to-event data and censoring

· Calculating and plotting Kaplan-Meier Survival Curves

· Interpreting the Hazard Function and median survival time

· Overview of the Cox Proportional Hazards Model

· Utilizing Python libraries (lifelines) for survival analysis

· Practical session: Generating and interpreting a Kaplan-Meier curve for a clinical trial dataset in Python, comparing survival between two treatment groups
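
The Kaplan-Meier estimate itself is simple enough to compute by hand. The sketch below is a hand-rolled estimator for a single group, written for illustration rather than taken from the lifelines library; the follow-up times and event indicators are hypothetical.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the event was observed at durations[i], 0 if censored.
    Subjects censored at time t are counted as at risk at t.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        deaths = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        leaving = sum(1 for d in durations if d == t)  # events plus censorings at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (months) and event indicators for one treatment group.
durations = [2, 3, 3, 5, 8, 8, 10]
events = [1, 1, 0, 1, 0, 1, 0]

curve = kaplan_meier(durations, events)
for time, prob in curve:
    print(f"t={time}: S(t)={prob:.3f}")
```

Comparing two treatment groups would mean running the estimator once per group and plotting both step functions on the same axes.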

Module 13: Spatial Epidemiology and Data Mapping with Python

· Concepts of spatial clustering and autocorrelation (Moran's I overview)

· Utilizing GeoPandas for reading and manipulating spatial data (shapefiles)

· Joining epidemiological data to geographical boundaries

· Creating visually effective density and Choropleth Maps in Python

· Limitations and ethical considerations in spatial data visualization

· Practical session: Loading a geographic boundary file and plotting disease incidence by administrative unit using GeoPandas/Folium

Module 14: Bias, Confounding, and Effect Modification (Statistical Adjustment)

· Understanding key epidemiological biases (Selection, Information)

· Identifying and controlling for Confounding variables

· Techniques for adjustment (stratification and multivariable regression)

· Detecting and interpreting Effect Modification (interaction)

· Overview of Propensity Score Matching

· Practical session: Using stratification in Python to check for confounding between a risk factor and an outcome
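
One way to check for confounding by stratification is to compare the crude odds ratio with a Mantel-Haenszel pooled estimate. The sketch below assumes a single binary confounder (labelled age group here) and uses hypothetical counts constructed so that the exposure has no effect within either stratum.

```python
def odds_ratio(a, b, c, d):
    """a = exposed cases, b = unexposed cases, c = exposed controls, d = unexposed controls."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata, each given as an (a, b, c, d) tuple."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical 2x2 tables by age group.
older = (80, 20, 40, 10)
younger = (10, 40, 20, 80)
strata = [older, younger]

# Collapse the strata into one table to get the crude estimate.
crude_table = tuple(sum(cells) for cells in zip(*strata))

crude = odds_ratio(*crude_table)
adjusted = mantel_haenszel_or(strata)
print(f"Crude OR={crude:.2f}, stratum ORs={[odds_ratio(*s) for s in strata]}, MH OR={adjusted:.2f}")
```

A crude estimate that differs clearly from similar stratum-specific estimates points to confounding; stratum-specific estimates that differ from each other would instead suggest effect modification.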

Module 15: Advanced Data Storytelling and Power BI Custom Visuals

· Structuring a narrative around public health data insights

· Best practices for designing reports for executive and policy audiences

· Utilizing Power BI Custom Visuals for unique data representations

· Creating tooltips and report pages that function as mini-dashboards

· Ensuring accessibility and usability in Power BI reports

· Practical session: Refining a Power BI dashboard by applying design principles and integrating a custom visual to enhance the data story

Module 16: Public Health Surveillance Systems and Data Pipelines

· Overview of national and international disease surveillance systems

Requirements:

· Participants should be reasonably proficient in English.

· Applicants must meet Armstrong Global Institute admission criteria.

Terms and Conditions

1. Discounts: Organizations sponsoring four participants will have a fifth participant attend free.

2. What is covered by the course fees: Fees cover all requirements for the training: learning materials, lunches, teas, snacks, and certification. Participants are responsible for their own travel and accommodation expenses, visa application, insurance, and other personal expenses.

3. Certificate Awarded: Participants are awarded Certificates of Participation at the end of the training.

4. The program content shown here is for guidance purposes only. Our continuous course improvement process may lead to changes in topics and course structure.

5. Approval of Course: Our Programs are NITA Approved. Participating organizations can therefore claim reimbursement on fees paid in accordance with NITA Rules.

Booking for Training

Simply send an email to the Training Officer at training@armstrongglobalinstitute.com and we will send you a registration form. We advise you to book early to secure a seat in this training.

Or call us on +254720272325 / +254725012095 / +254724452588

Payment Options

We provide 3 payment options. Choose the one most convenient for you, and kindly make payment at least 5 days before the training start date to reserve your seat:

1. Groups of 5 people and above: cheque payments to Armstrong Global Training & Development Center Limited should be made at least 5 days before the training.

2. Invoice: We can send a bill directly to you or your company.

3. Deposit directly into Bank Account (Account details provided upon request)

Cancellation Policy

1. Payment for all courses includes a registration fee, which is non-refundable, and equals 15% of the total sum of the course fee.

2. Participants may cancel attendance 14 days or more prior to the training commencement date.

3. No refunds will be made 14 days or less before the training commencement date. However, participants who are unable to attend may opt to attend a similar training course at a later date or send a substitute participant provided the participation criteria have been met.

Tailor Made Courses

This training course can also be customized for your institution upon request for a minimum of 5 participants. You can have it conducted at our Training Centre or at a convenient location. For further inquiries, please contact us on Tel: +254720272325 / +254725012095 / +254724452588 or Email training@armstrongglobalinstitute.com

Accommodation and Airport Transfer

Accommodation and airport transfer are arranged upon request and at extra cost. For reservations, contact the Training Officer by email at training@armstrongglobalinstitute.com or by telephone on +254720272325 / +254725012095 / +254724452588

Instructor-led Training Schedule

Course Dates	Venue	Fees
Apr 06 - Apr 10 2026	Zoom	$1,300
Jul 06 - Jul 10 2026	Nairobi	$1,500
Mar 02 - Mar 06 2026	Mombasa	$1,500
May 04 - May 08 2026	Nakuru	$1,500
Apr 13 - Apr 17 2026	Kisumu	$1,500
Jul 06 - Jul 10 2026	Kigali	$2,500
Mar 02 - Mar 06 2026	Kampala	$2,500
Jun 01 - Jun 05 2026	Arusha	$2,500
Mar 09 - Mar 13 2026	Johannesburg	$4,500
Jul 06 - Jul 10 2026	Cape Town	$4,500
May 11 - May 15 2026	Pretoria	$4,500
May 04 - May 08 2026	Accra	$4,500
Jun 08 - Jun 12 2026	Addis Ababa	$4,500
Jul 06 - Jul 10 2026	Dubai	$5,000
Jun 15 - Jun 19 2026	Riyadh	$5,000
Jun 01 - Jun 05 2026	Doha	$5,000
Apr 20 - Apr 24 2026	London	$6,500
Apr 13 - Apr 17 2026	Paris	$6,500
Apr 13 - Apr 17 2026	Geneva	$6,500
Apr 27 - May 01 2026	Berlin	$6,500
May 18 - May 22 2026	Zurich	$6,500
Jul 13 - Jul 17 2026	New York	$6,950
Jul 20 - Jul 24 2026	Los Angeles	$6,950
May 04 - May 08 2026	Washington DC	$6,950
May 11 - May 15 2026	Toronto	$7,000
May 18 - May 22 2026	Vancouver	$7,000
