
My research has been a series of projects that fall into two broad categories: studying dynamical systems that arise in complex biological problems, and using data analysis to address systemic bias in society. These varied projects have given me valuable experience in a wide array of analytical and numerical methods, and my exposure to these topics has given me the ability to consider problems from different perspectives.
My thesis focused primarily on the analytical and numerical treatments of nonlinear integro-differential equations, with applications in population dynamics of single species and food chain systems. Since graduating, I have used what I learned from these endeavors to take on projects that seek to address societal ills.
Quantifying Gerrymandering
I have spent the past two years working with the Quantifying Gerrymandering group at Duke, led by Jonathan Mattingly. The long-term goal of this group is to develop `the ensemble method for outlier analysis,' which is used to generate a representative sample of non-partisan maps from a distribution on redistricting plans. These samples are then used as a comparison against potentially partisan proposals.
The ensemble method works by first reading precinct-level data for the state or county in question, and creating a planar graph where the vertices correspond to individual precincts and the edges indicate physical adjacencies. It then creates a random initial districting plan, and generates new plans by randomly sampling from possible alterations to this initial plan, ensuring that all new proposals comply with specified demographic and geographic criteria.
My work has been the implementation of a merge-split procedure for generating new district proposals. In this method, we impose a random spanning tree on each individual district, and then select two random adjacent districts to be merged. A single merged district is created with its own random spanning tree. We then scan the edges of this merged district's tree seeking `valid cuts,' where the removal of the edge would result in two split districts that each comply with the given demographic and geographic criteria. We randomly select one of these `valid cuts,' and this creates the two new districts. The benefit of this method is that it is reversible. The probability of our sampler accepting the new proposal depends on the total number of possible spanning trees and the number of alternative cuts we could have selected to create the same proposal, both of which can be reasonably computed. We submitted a paper, `Metropolized Forest Recombination for Monte Carlo Sampling of Graph Partitions,' to the SIAM Journal on Applied Mathematics, and are working on revisions now.
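As a minimal illustration of the valid-cut search described above, the following Python sketch draws a spanning tree on a merged district and lists the tree edges whose removal leaves two parts within a population range. It is a sketch under simplifying assumptions, not the group's implementation: the function names, the toy grid, and the use of population bounds as the only criterion are all hypothetical, the tree is drawn by Kruskal's algorithm on shuffled edges (which is not uniform over spanning trees), and the acceptance probability is not computed.

```python
import random

def random_spanning_tree(adj, rng):
    """Spanning tree via Kruskal on shuffled edges (NOT uniform over trees)."""
    parent = {v: v for v in adj}

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    edges = sorted({tuple(sorted((u, v))) for u in adj for v in adj[u]})
    rng.shuffle(edges)
    tree = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:          # edge joins two components, so keep it
            parent[ru] = rv
            tree.append((u, v))
    return tree

def valid_cuts(tree, pop, lo, hi):
    """Tree edges whose removal leaves two parts with population in [lo, hi]."""
    nbrs = {v: [] for v in pop}
    for u, v in tree:
        nbrs[u].append(v)
        nbrs[v].append(u)
    # Root the tree and record a preorder traversal.
    root = next(iter(pop))
    order, parent = [], {root: None}
    stack = [root]
    while stack:
        v = stack.pop()
        order.append(v)
        for w in nbrs[v]:
            if w not in parent:
                parent[w] = v
                stack.append(w)
    # Accumulate subtree populations from the leaves upward.
    below = dict(pop)
    for v in reversed(order):
        if parent[v] is not None:
            below[parent[v]] += below[v]
    total = below[root]
    return [(parent[v], v) for v in order
            if parent[v] is not None
            and lo <= below[v] <= hi
            and lo <= total - below[v] <= hi]

# Toy merged district: a 2 x 3 grid of unit-population precincts (hypothetical).
rng = random.Random(0)
grid_adj = {
    (r, c): [(r + dr, c + dc)
             for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0))
             if 0 <= r + dr < 2 and 0 <= c + dc < 3]
    for r in range(2) for c in range(3)
}
pop = {v: 1 for v in grid_adj}
tree = random_spanning_tree(grid_adj, rng)
cuts = valid_cuts(tree, pop, lo=3, hi=3)
new_split = rng.choice(cuts) if cuts else None  # None means no valid cut on this tree
```

Rooting the tree and accumulating subtree populations lets every edge be checked in a single pass, which is one reason a tree-based split is cheap to evaluate.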
This past year, we have worked to extend this merge-split method into a multi-scale framework, where we perform the merge-split procedure at each level (county, precinct, census block), but only descend levels when more resolution is required for a valid split. This framework has given some promising results so far. Our paper on this work, `Multi-Scale Merge-Split Markov Chain Monte Carlo for Redistricting,' has recently been accepted for publication in the SIAM Journal on Multiscale Modeling and Simulation. We are currently working to extend this method and apply it to the newly released census data in North Carolina, Georgia, and Pennsylvania in preparation for the upcoming redistricting cycle.
Des Moines Traffic Citations (continuing)
I have been in contact with several members of the ACLU in my home state of Iowa, who have collected data on traffic citations, arrests, and dismissed cases in my hometown of Des Moines, the state capital. Their preliminary analysis, obtained through the volunteer work of a retired statistician, indicates the presence of police bias. My role will be to perform my own separate analysis of the data to verify his results. Ultimately, we would like to connect all of the collected data sets to determine the rates at which traffic citations lead to arrests or dismissals for minority populations. The end goal of this project is to publish a paper on our results. This is a critical step in the litigation process, as it adds credibility to the expert testimony, and this process will hopefully force the city to enact reasonable restrictions on bias in traffic stops.
Traveling Waves in a Fisher Equation with Time-Dependent Growth (continuing)
I have recently been working with a fellow Northwestern graduate who approached me last spring about a cancer model he was working on in his position at the Moffitt Cancer Center. A simplified version of his problem, reached by assuming that the complex nutrient dynamics resolve themselves on a very fast time scale, closely resembles a Fisher-KPP equation with a time-varying growth rate. We have found several analytic results regarding the spreading speed of the cancer, and are working to numerically validate these results using a semi-implicit solver I wrote. In the future, we hope to consider the full cancer model, and compare our analytical predictions to experimental results.
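To make the numerical side concrete, here is a minimal Python sketch of one common semi-implicit scheme for u_t = D u_xx + r(t) u (1 - u): diffusion is treated implicitly (backward Euler, solved with the Thomas algorithm) and the reaction term explicitly. This is an assumed form for illustration only, not the solver used in the project; the growth rate r(t), the zero-flux boundaries, the grid, and all function names are hypothetical. For constant r, the classical minimal front speed is 2*sqrt(r*D), which gives a useful sanity check.

```python
import math

def thomas(sub, diag, sup, rhs):
    """Solve a tridiagonal system (sub[0] and sup[-1] are ignored)."""
    n = len(rhs)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0] = sup[0] / diag[0]
    dp[0] = rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - sub[i] * cp[i - 1]
        cp[i] = sup[i] / m
        dp[i] = (rhs[i] - sub[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def step(u, t, dt, dx, D, r):
    """One semi-implicit step: implicit diffusion, explicit logistic growth."""
    lam = D * dt / dx ** 2
    n = len(u)
    sub = [-lam] * n
    diag = [1.0 + 2.0 * lam] * n
    sup = [-lam] * n
    # Zero-flux (Neumann) boundaries via ghost-point reflection.
    sup[0] = -2.0 * lam
    sub[-1] = -2.0 * lam
    rhs = [ui + dt * r(t) * ui * (1.0 - ui) for ui in u]
    return thomas(sub, diag, sup, rhs)

def front_position(u, dx, level=0.5):
    """Location of the first grid point where u drops below `level`."""
    for i, ui in enumerate(u):
        if ui < level:
            return i * dx
    return len(u) * dx

# Hypothetical setup: step initial condition, oscillating growth rate.
D, dx, dt = 1.0, 0.5, 0.05
r = lambda t: 1.0 + 0.5 * math.sin(t)
u = [1.0 if i * dx < 10.0 else 0.0 for i in range(400)]
t = 0.0
for _ in range(400):
    u = step(u, t, dt, dx, D, r)
    t += dt
print("front position at t = %.1f: %.1f" % (t, front_position(u, dx)))
```

Tracking the front position over time and fitting its slope gives a numerical spreading speed that can be compared against an analytic prediction.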
Understanding Voting Patterns and Interactions with Gerrymandering (2021)
This past summer, I worked with a team of undergraduates from the Data+ program at Duke in an exploratory analysis of elections in North Carolina. Our goal was to examine a set of statewide elections from the past 20 years to look for potential spatial patterns, and see how these patterns interacted with the geographic complexities of districting plans.
The first task of the summer was to try clustering the elections. The students first used PCA to get a measure of the distance between elections. They also developed several of their own distance metrics based on the well-known `uniform swing' hypothesis. For each set of pairwise distances obtained, the students applied an agglomerative clustering algorithm. The resulting clusters were quite stable.
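The clustering step can be sketched as follows. The distance here is one hypothetical uniform-swing-style metric (the root-mean-square difference in precinct vote shares after removing the average shift between two elections); it is not necessarily any of the metrics the students developed, and the election names and vote shares are toy values chosen only to exercise the code. The clustering is plain average-linkage agglomeration.

```python
def swing_distance(e1, e2):
    """RMS precinct-level difference between two elections after removing
    the mean (uniform) swing. e1, e2: vote shares per precinct, same order."""
    n = len(e1)
    shift = sum(a - b for a, b in zip(e1, e2)) / n
    return (sum((a - b - shift) ** 2 for a, b in zip(e1, e2)) / n) ** 0.5

def agglomerate(names, dist, k):
    """Average-linkage agglomerative clustering down to k clusters."""
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        return sum(dist[a][b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        # Merge the closest pair of clusters (j > i, so index i stays valid).
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] += clusters.pop(j)
    return clusters

# Toy vote shares in three precincts for four made-up elections.
elections = {
    "A": [0.40, 0.50, 0.60],
    "B": [0.45, 0.55, 0.65],  # A plus a uniform swing
    "C": [0.60, 0.50, 0.40],
    "D": [0.62, 0.52, 0.42],  # C plus a uniform swing
}
names = list(elections)
dist = {a: {b: swing_distance(elections[a], elections[b]) for b in names}
        for a in names}
clusters = agglomerate(names, dist, k=2)
```

Because the metric removes the mean shift, two elections that differ only by a uniform swing sit at distance zero and are merged first.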
Once the students had identified the clusters of interest, they started exploring various generative methods for creating new synthetic elections. The goal was to examine an ensemble of potential districting maps for a set of synthetic elections. My students presented a poster of our results at the end of the summer.
American Predatory Lending
In the summer of 2020, I mentored a group of students from the Data+ program at Duke. This project was a continuation of a much larger American Predatory Lending group at Duke, which is attempting to compile a history of the 2007/2008 financial crisis. In past years, this group has used a variety of proprietary data sets to help describe the mortgage market of the 2000s at a national level. That summer, my team's task was to focus this analysis on North Carolina.
We first had to find ways to use the Home Mortgage Disclosure Act (HMDA) data set to help identify predatory practices, because we did not have access to the proprietary data used previously. Predatory practices can usually be identified by looking at delinquent loans or repeated refinancing loans for the same property, but the HMDA data set did not allow for either of those more typical approaches. Instead, we used indicators like the rate spread (the difference between a loan's APR and a survey-based estimate of current APRs on comparable prime mortgage loans) and denial rates to help identify racial disparity. Our analysis showed that black applicants were denied loans at almost twice the rate of white applicants across the state, and received significantly higher rate spreads when they were able to obtain loans, even when accounting for applicant income or the value-to-income ratio for the loan. My students presented a poster of our results at the end of the summer.
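The two indicators are simple to compute once the records are in hand. The sketch below is illustrative only: the record layout (`group`, `denied`) is a hypothetical simplification rather than the actual HMDA schema, and the six records are made-up toy values, not results from the project.

```python
from collections import defaultdict

def rate_spread(apr, benchmark_apr):
    """Rate spread: a loan's APR minus the benchmark APR on comparable prime loans."""
    return apr - benchmark_apr

def denial_rates(applications):
    """Fraction of applications denied within each applicant group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [denied, total]
    for app in applications:
        counts[app["group"]][0] += int(app["denied"])
        counts[app["group"]][1] += 1
    return {group: denied / total for group, (denied, total) in counts.items()}

# Toy records (hypothetical), used only to exercise the functions.
applications = [
    {"group": "G1", "denied": True},
    {"group": "G1", "denied": False},
    {"group": "G2", "denied": True},
    {"group": "G2", "denied": False},
    {"group": "G2", "denied": False},
    {"group": "G2", "denied": False},
]
rates = denial_rates(applications)
disparity = rates["G1"] / rates["G2"]  # ratio of denial rates between groups
```

Controlling for income or the value-to-income ratio would mean computing the same ratio within strata of those variables, or fitting a regression with them as covariates.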
Women in Professional Hierarchies
After graduation, I worked on a collaboration attempting to model how the distribution of women at each level of a professional hierarchy changes over time, which led to a paper `Mathematical Model of Gender Bias and Homophily in Professional Hierarchies.’ In our model, we examined the role that cultural and psychological factors, such as promotion bias and homophily (self-seeking), play in the ascension of women through these hierarchies. Incorporating these factors into a model that accounts for the proportion of women at each level in a hierarchy resulted in a system of nonlinear differential equations. This system exhibits a rich range of dynamics, and, importantly, indicates that gender parity is not inevitable. We collected data on the gender fractionation over time of a number of different professional hierarchies, mainly examining the progression of women through different fields in academia. We used this data to verify our model, and quantify the degree of homophily and bias in each hierarchy. For this project, I helped in the derivation of the original model, and contributed many of the numerical results.
Nonlocal Population Dynamics (2014-2018)
My graduate work focused on migratory traveling wave behaviors in models of population dynamics governed by nonlocal species interactions. Nonlocality here refers to interactions between individuals that occur over a distance. To model this, we considered the weighted spatial average of the population, which is represented by a convolution of the population against a kernel function. The addition of these convolutions leads to nonlinear integro-partial differential equations that can be analytically intractable. My research focused on the formation of patterned states in these models and how they interact with the traveling-wave species migration problem.
My first project led to the paper `Traveling Waves in a Nonlocal, Piecewise Linear Reaction-Diffusion Population Model.' In this model, small populations were governed by local natural growth and decay. When the population increased beyond a threshold, growth was instead controlled by nonlocal competition. This piecewise linear assumption, along with a specific choice of kernel function, allowed us to reduce the integro-partial differential equation to a system of algebraic equations, which enabled an analytic characterization of traveling waves in the presence of nonlocality.
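For concreteness, a representative equation of this type is the nonlocal Fisher-KPP equation, written here in nondimensional form with symbols chosen for illustration (u is the population density, D the diffusion coefficient, and \phi a normalized kernel). It is a standard example of the class, not necessarily the exact model studied in each project.

```latex
\frac{\partial u}{\partial t}
  = D \frac{\partial^2 u}{\partial x^2} + u \left( 1 - \phi * u \right),
\qquad
(\phi * u)(x,t) = \int_{-\infty}^{\infty} \phi(x - y)\, u(y,t)\, dy,
\qquad
\int_{-\infty}^{\infty} \phi(z)\, dz = 1 .
```

Taking \phi to be a Dirac delta recovers the classical local Fisher-KPP equation, so the width of \phi measures the extent of the nonlocality.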
My second project involved a three-species food chain system in the context of biological control, and led to the paper `Biological Control with Nonlocal Interactions.' The ecological system in question consisted of a crop, a pest infesting the crop, and an artificially introduced superpredator designed to devour the pest. The goal of the model was to examine the possibility of biological control, where the pest was fully eliminated. In my research, I extended this model to a system of integro-partial differential equations to allow for species mobility and superpredator nonlocality. I found that resurgences of the pest species can occur in certain parameter regimes, such as when the pest is highly mobile relative to the superpredator. I was able to identify parameter regimes where robust control is attainable.
Future Interest In Population Dynamics
While my research interests in recent years have branched out into a wide variety of different projects, my interest in population dynamics abides.
One promising avenue of research is comparing nonlocal population systems with real-world data. The inclusion of nonlocality in models of population dynamics is often justified with arguments that a scarcity of resources will force competition between individuals over larger distances, as they all search for the shared, scarce resource. This means that the species these models are typically applied to tend to be few in number and live in remote environments, making them difficult to study. Nonlocal behaviors, however, can also arise when the species involved are highly mobile, so that two individuals can interact over large distances. This means that some migratory bird species might be good candidates for comparison, and it will be interesting to examine the available data on various migrations.
Another area of interest is the study of these migratory problems in higher dimensions. A problem of interest would be when a large, planar wave hits some obstacle (such as a migratory herd maneuvering around a large lake or a city). Boundary interactions are known to give rise to patterned states in these ecological models, so the interactions with the obstacle could provide some interesting dynamics. In numerical simulations, it would be necessary to ensure that the artificially imposed boundary around the simulated domain does not cause spurious patterns to form. One approach might be to introduce artificial damping in an extended layer around the obstacle, with the artificial domain boundaries determined by the unobstructed planar wave solution.
A third area of interest would be to examine a nonlocal model where the extent of the nonlocality itself varied in space, i.e., the kernel function would be dependent on the spatial variable(s). Since nonlocality is often used as a means of representing a scarcity of a shared resource, this type of model could represent a situation where the relative abundance/scarcity of a resource was not uniformly distributed in the environment.
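One way such a model might be written, as a sketch with hypothetical notation, is to let the kernel carry a width parameter \sigma that depends on position, so that the spatial average of the population u seen at x is taken over a neighborhood whose size varies with x:

```latex
(\phi * u)(x,t)
  \;\longrightarrow\;
  \int_{-\infty}^{\infty} \phi\bigl(x - y;\, \sigma(x)\bigr)\, u(y,t)\, dy,
\qquad
\int_{-\infty}^{\infty} \phi\bigl(z;\, \sigma(x)\bigr)\, dz = 1
\ \text{ for each } x .
```

The resulting integral is no longer a true convolution, so techniques that rely on translation invariance, such as Fourier analysis of the linearized problem, would need to be adapted.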