2 Course Syllabus

COMPUTATIONAL APPROACHES TO POPULATION GENETICS
Instructor: Andrew Bortvin
Times: Tuesday, Thursday 3:00PM - 4:30PM
Location: UG Teaching Lab (UTL) 189
Office Hours: Wednesday 9:00-10:00 AM and 5:00-6:00 PM, or by appointment; Levi 251
Course website: https://andrew-bortvin.github.io/popGenModeling/

2.0.1 Course Description

The size, composition, and genetics of populations fluctuate over time. These fluctuations are the product of dynamics between individuals, the interactions between populations, and the context of a population within a broader ecological landscape. The quantitative tools developed to study population genetics allow biologists to discover the simple fundamental principles that govern these complex systems. This course will introduce the basic theory of population genetics while teaching students the fundamental skills of programming in the R programming language, which will allow them to directly implement and visualize theoretical concepts. Students will model and simulate theoretical populations and analyze population-scale genomic data. This course will examine evolution on a variety of scales, ranging from the competition between cells within a single organism, to population dynamics in conservation biology that span decades, to the evolution of contemporary human populations over hundreds of thousands of years.

2.0.2 Learning Goals

By the end of this course, students will be able to:

• outline, using biological theory and terms, how populations grow and interact with each other
• describe the external factors that can cause changes in genetic diversity and use this information to predict how specific demographic scenarios would impact a population
• manipulate, analyze, and visualize data using the R programming language
• describe and interpret common formats used to store genomic data, and implement standard analytic protocols used to analyze these data types

2.0.3 Grading

30% Participation
40% Weekly Assignments
30% Final Project

Weekly Assignments

Class sessions will consist of a lecture in which theory is introduced, instructor-led live-coding sessions that implement the models studied in class, and time for students to work independently on assignments that further develop the models designed in class. Most weeks, there will be a take-home assignment that extends concepts studied in class. Students will have a week to complete problem sets. After submission, students will receive feedback from the instructor on each assignment and will have until the end of the semester to submit any revisions necessary.

Each assignment will consist of a set of required exercises that can be completed by students of any coding background. These will be followed by optional, more computationally-focused exercises, which will allow students to examine more intricate evolutionary scenarios and implement more complicated computational models.

Work will be graded on reasonable completion–that is to say, code that demonstrates an understanding of an algorithm and its general implementation will receive full marks, regardless of whether output is exactly correct. Students will also be assessed on the clarity and interpretability of the data visualizations that their code outputs and the accuracy of their responses to short questions prompting biological interpretation of their results.

Google and AI

Googling is always an acceptable way to find answers or help, and I encourage you to utilize it extensively. If you adopt a solution following a Google search, make sure you understand what you incorporate, rather than just copy/paste without comprehension of the logic or code. Google is also a good way to learn more about any error messages you encounter in your code.

You may be familiar with ChatGPT and other large language models. After trying each problem/assignment/task on your own, if you’re still running into issues, feel free to use ChatGPT as you would any other online resource (Google, stack overflow, etc.). Learning how to succinctly describe exactly what you want to accomplish is a skillset in itself, so this can be good practice. If you find code that seems to work (e.g., from Google) but you’re not sure how exactly it works, you can also type it into ChatGPT and ask it to explain what’s happening. As always, please do not submit any code if you are not familiar entirely with how it works; flag it and ask an instructor for assistance. Be aware that ChatGPT might confidently offer an answer that is not correct; so always check the output on your own.

2.0.4 Schedule

First Week - Introduction to Population Genetics Modeling

Date Topic Assignments
August 27 Welcome; Course Overview
August 29 Introduction to R Programming - Working with Data, Plotting DUE: Create a Posit account

Unit One: Population Biology
How do population sizes change? Models for one and two populations. Cooperation, Competition, and predation.

Week Two: One Population Models

Date Topic Assignments
September 3 The Exponential and Logistic Growth Models
September 5 Density-Dependent Growth Problem Set 1 Assigned

Week Three: Multiple Populations

September 10 Lotka-Volterra dynamics 1: Competition and Cooperation
September 12 Lotka-Volterra dynamics 2: Predation and Parasitism Problem Set 1 Due; Problem Set 2 Assigned

Week Four: Advanced Topics in Population Biology

September 17 Spatial Models
September 19 Social Evolution and Game Theory Problem Set 2 Due; Problem Set 3 Assigned

Unit Two: Population Genetics
How do we measure the genetic relationships between individuals? Between Populations? Between Species?
What determines the fate of a genetic element in a population? How does the size and demographic history of a population impact its genetic composition?
Biological Simulation

Week Five: The Wright Fisher Model

September 24 The Wright-Fisher Model: Evolutionary Neutrality
September 26 The Wright-Fisher Mode 2: Types of Selection, Selective Sweeps Problem Set 3 Due; Problem Set 4 Assigned

Week Six: Multiple Loci - Measures of Genetic Variation

October 1 Nucleotide Diversity, the Site Frequency Spectrum
October 3 F statistics Problem Set 4 Due

Week Seven: Biological Simulation and Population Size Changes

October 8 The SLiM Programming Language and slimr Problem Set 5 Assigned
October 10: Population Bottlenecks, Population Expansion, and Genetic Diversity

Week Eight: Biological Simulation and Population Size Changes Continued

October 15 Population Size Changes and the Site Frequency Spectrum Problem Set 5 Due
October 17 Fall Break

Week Nine: Multiple Populations and Genetic Relatedness

October 22 Simulation with Multiple Populations - Migration Problem Set 6 Assigned
October 24 Simulation with Multiple Populations - Admixture, Local Adaptation

Unit Three: Analyzing Genetic Data
How are population-scale genetic variants represented?
How do we quantify relatedness between populations?
Tests for selection, association testing, fine mapping. Phylogeny

Week Ten: Association Testing

October 29 The Variant Call Format and population-scale data Problem Set 6 Due
October 31 GWAS, linkage disequilbrium, Fine Mapping Problem Set 7 Assigned

Week Eleven: Population Structure and Phylogeny

November 5 Population structure: PCA, STRUCTURE, and clustering methods
November 7 The Coalescent - Inferring Timing of Selection Problem Set 7 Due

Week Twelve: Constructing and Interpreting Phylogenies

November 12 Working with Phylogenetic Trees - the ape and phytools packages Problem Set 8 Assigned
November 14 Tree Comparison Methods, Advanced trees DUE: Finalized proposals for independent/small Group projects

Week Thirteen: Independent/Small Group Projects

November 19 Work on Independent/Small Group Projects Problem Set 8 Due
November 21 Work on Independent/Small Group Projects

Week Fourteen:

December 3 Evolutionary Methods in Other Fields (Linguistics, Economics, etc.)
Current Directions in Population Genetics
December 5 Project Presentations and Discussion Independent/Small Group Projects Due

December 19: All revisions for weekly assignments due