SIAM Conference on Applications of Dynamical Systems
Minisymposium on Data, Inference, and Dynamics in Complex Social Systems
Denver, CO || May 13th, 2025
What mechanisms drive ongoing lack of gender representation in academic mathematics?
What can we expect to happen in our profession if these mechanisms continue to operate as is?
What could the effects of interventions be on long-term gender representation?
Heather Brooks
Harlin Lee
Mason Porter
Juan G. Restrepo
Anna Haensch
Phil Chodrow
This talk presents work in progress using data about our mathematics community. All results are preliminary!
This talk focuses on the production of PhD graduates and therefore almost exclusively considers doctoral universities.
Gender is not binary, but unfortunately our data (and therefore our story and our model) are.
Quantitative work complements, but never replaces: voices of marginalized scholars, qualitative research and critical theory, activism, and implementation of initiatives and policies.
Mentorship: Female advisors are more effective in attracting or retaining female graduate students.
Belonging: Greater representation in the grad student community attracts women to programs and subfields.
Attrition: Addressing disparities in career attrition for female faculty would help to close the gender gap.
Leadership: A small number of influential women can dramatically change the culture of a department or research community.
Ben Brill, UCLA ’21
Total of 116,306 advisor-student pairs in the US since 1950, representing 21,781 distinct advisors. We observe or estimate math subfields for 94% of these pairs (predictions based on thesis titles). We estimate gender for 95% of PhD students and 97% of advisors.
Misgendering
Incorrect MSCs inferred
Nonrandom missing data
Various shenanigans
We could treat this as a branching process and read off parameters, but this would assume that we are already at stationarity – our predictions would just be averages over the recent past.
Advisor production
Model the number of graduate students produced by a given advisor.
Technique: maximum likelihood estimation in a bespoke stochastic model.
Student gender
Model the gender of students produced by a given advisor.
Technique: logistic regression.
Assumptions:
We model an observed sequence of students \(\color{#086788}{\mathbf{S}} = (\color{#086788}{S}_1, \color{#086788}{S}_2, \ldots, \color{#086788}{S}_T)\) produced by an advisor as a function of an unobserved advisor career \(\color{#07A0C3}{C}\) specified by the startup period length and retirement year.
\[ \begin{aligned} p(\color{#086788}{\mathbf{S}};\color{#F25C54}{\boldsymbol{\theta}}) &= \sum_{\color{#07A0C3}{C}\in\mathcal{C}} p(\color{#086788}{\mathbf{S}}|\color{#07A0C3}{C};\color{#F25C54}{\boldsymbol{\theta}})p(\color{#07A0C3}{C};\color{#F25C54}{\boldsymbol{\theta}}) \end{aligned} \]
The vector \(\color{#F25C54}{\boldsymbol{\theta}}\) contains the parameters to be estimated.
We do this using a hybrid expectation-maximization algorithm: some parameters can be estimated efficiently via EM, while others must be estimated by hill-climbing.
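To make the marginalization over careers concrete, here is a minimal sketch. The talk does not spell out \(p(\mathbf{S}|C;\boldsymbol{\theta})\) or \(p(C;\boldsymbol{\theta})\), so everything specific below is an assumption for illustration: a career is an active window of years, an active advisor produces a Poisson number of students per year with a single rate `lam`, no students are produced outside the window, and all careers are equally likely a priori. A coarse grid search stands in for the hill-climbing step; the EM step is not shown.

```python
import math

def poisson_logpmf(k, lam):
    """Log-probability of k events under a Poisson distribution with rate lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def log_marginal_likelihood(S, lam):
    """log p(S; lam) = log sum_C p(S | C; lam) p(C), summing over latent careers C.

    Assumed toy model (not the talk's actual model):
      - a career C = (start, end) means the advisor is active in years start..end-1
      - active years produce Poisson(lam) students; inactive years produce none
      - the prior over careers is uniform
    """
    T = len(S)
    careers = [(a, b) for a in range(T) for b in range(a + 1, T + 1)]
    log_prior = -math.log(len(careers))
    terms = []
    for a, b in careers:
        # A career that leaves a year with students outside the window has likelihood 0.
        if any(S[t] > 0 for t in range(T) if not (a <= t < b)):
            continue
        log_lik = sum(poisson_logpmf(S[t], lam) for t in range(a, b))
        terms.append(log_prior + log_lik)
    # Log-sum-exp for numerical stability; the full-length career is always feasible.
    m = max(terms)
    return m + math.log(sum(math.exp(x - m) for x in terms))

# Hypothetical yearly student counts for one advisor.
S = [0, 0, 1, 2, 1, 0]

# Crude stand-in for hill-climbing: pick the best rate on a grid.
grid = [0.1 * k for k in range(1, 51)]
lam_hat = max(grid, key=lambda lam: log_marginal_likelihood(S, lam))
print(f"estimated rate: {lam_hat:.1f} students per active year")
```

Because the career is unobserved, leading and trailing zeros are ambiguous: they may be inactive years or active years with no graduates. Summing over careers lets the estimate account for both readings rather than committing to one.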
We hypothesize that greater student production per year reflects unequal access to research resources; cf. Zhang et al. (2022)
Estimate the odds that the next student produced by an advisor is female, based on subfield, advisor gender, and the representation of women in the advisor's group and subfield.
\[ \begin{aligned} \log (\text{odds F}) = {} & \beta_0 + \beta_a \times (\text{advisor gender}) + \beta_f \times (\text{subfield}) \\ & + \beta_g \times (\text{proportion F advisees in group}) \\ & + \gamma_{p} \times (\text{proportion F in subfield}) \end{aligned} \]
We tried many other models with additional features (e.g., decade, nonlinear transformations), but this one performed best in cross-validation.
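As a rough illustration of the student-gender model, the sketch below fits a logistic regression by plain gradient ascent. It is not the analysis behind the talk: the data are synthetic, the coefficient values in `true_beta` are made up, and the subfield term is dropped (a single subfield, absorbed into the intercept) to keep the example short.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, n_iter=500):
    """Fit logistic-regression coefficients by gradient ascent on the mean log-likelihood."""
    n, d = len(X), len(X[0])
    beta = [0.0] * d
    for _ in range(n_iter):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            resid = yi - sigmoid(sum(b * v for b, v in zip(beta, xi)))
            for j in range(d):
                grad[j] += resid * xi[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

# Synthetic toy data (NOT the talk's data). Feature order:
# [intercept, advisor is female, proportion F advisees in group, proportion F in subfield]
rng = random.Random(0)
true_beta = [-1.5, 1.5, 1.0, 1.0]  # hypothetical values, chosen only for illustration
X, y = [], []
for _ in range(500):
    x = [1.0, 1.0 if rng.random() < 0.3 else 0.0, rng.random(), rng.random()]
    p = sigmoid(sum(b * v for b, v in zip(true_beta, x)))
    X.append(x)
    y.append(1 if rng.random() < p else 0)

beta_hat = fit_logistic(X, y)

# Predicted probability that the next student is female for a female advisor
# whose group is 40% women, in a subfield that is 25% women.
p_next = sigmoid(sum(b * v for b, v in zip(beta_hat, [1.0, 1.0, 0.4, 0.25])))
print([round(b, 2) for b in beta_hat], round(p_next, 2))
```

In practice one would use a statistics library with cross-validation; the hand-rolled fit here only shows what the coefficients in the equation above mean.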
If \(p^*\) is the stationary proportion of women in the subfield, then \(p^*\) approximately satisfies the equation \[ \begin{aligned} p^* = \color{#ffaf03}{w_f}\color{#ffaf03}{\sigma_f}(p^*) + \color{#5b427c}{w_m} \color{#5b427c}{\sigma_m}(p^*) \end{aligned} \]
Mean-field assumption: advisor groups represent the subfield as a whole.
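The stationary equation can be solved numerically by fixed-point iteration. The sketch below makes several assumptions that the slides do not state: \(w_f\) and \(w_m\) are read as fixed shares of students produced by female and male advisors (summing to 1), and \(\sigma_f\), \(\sigma_m\) are logistic curves with made-up coefficients.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def stationary_proportion(w_f, w_m, sigma_f, sigma_m, p0=0.5, tol=1e-10, max_iter=10_000):
    """Iterate p <- w_f * sigma_f(p) + w_m * sigma_m(p) until the update is below tol."""
    p = p0
    for _ in range(max_iter):
        p_next = w_f * sigma_f(p) + w_m * sigma_m(p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    raise RuntimeError("fixed-point iteration did not converge")

# Hypothetical response curves: probability that a new student is female,
# as a function of the current proportion p of women in the subfield.
def sigma_f(p):
    return sigmoid(-1.0 + 2.0 * p)  # female advisors (made-up coefficients)

def sigma_m(p):
    return sigmoid(-1.5 + 2.0 * p)  # male advisors (made-up coefficients)

p_star = stationary_proportion(w_f=0.2, w_m=0.8, sigma_f=sigma_f, sigma_m=sigma_m)
print(f"stationary proportion of women: {p_star:.3f}")
```

With these curves the update map has slope at most 0.5, so the iteration is a contraction and converges to a unique fixed point. Steeper curves could admit multiple fixed points, in which case the result would depend on `p0`.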
Two strategies: numerically compute the stationary proportion of female advisors, or simulate the model forward in time.
We can do both of these either with or without parameter uncertainty.
Two candidate interventions:
Improve retention and resourcing of female advisors:
We can model this by setting the career and student production parameters of women equal to those of men in the advisor production model.
Train men as equally appealing mentors for female PhD students:
We can model this by setting the propensity of men to produce female PhD advisees equal to that of women in the advisee gender model.
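The two interventions above can be mimicked in a toy version of the stationary equation \(p^* = w_f \sigma_f(p^*) + w_m \sigma_m(p^*)\). This is a crude sketch under stated assumptions, not the talk's scenario model: the retention intervention is represented only as a larger share `w_f` of students produced by female advisors, the mentorship intervention replaces \(\sigma_m\) with \(\sigma_f\), and all numbers are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def solve_p_star(w_f, sigma_f, sigma_m, p=0.5, n_iter=200):
    """Fixed-point iteration for p = w_f * sigma_f(p) + (1 - w_f) * sigma_m(p)."""
    w_m = 1.0 - w_f
    for _ in range(n_iter):
        p = w_f * sigma_f(p) + w_m * sigma_m(p)
    return p

# Hypothetical response curves (made-up coefficients).
def sigma_f(p):
    return sigmoid(-1.0 + 2.0 * p)  # female advisors

def sigma_m(p):
    return sigmoid(-1.5 + 2.0 * p)  # male advisors

scenarios = {
    # Status quo: women advise a hypothetical 20% of students.
    "baseline": solve_p_star(0.2, sigma_f, sigma_m),
    # Retention/resourcing: crude proxy, women's share of advising rises to 30%.
    "retention": solve_p_star(0.3, sigma_f, sigma_m),
    # Mentorship: male advisors get the same response curve as female advisors.
    "mentorship": solve_p_star(0.2, sigma_f, sigma_f),
}
for name, p_star in scenarios.items():
    print(f"{name:>10}: stationary proportion of women = {p_star:.3f}")
```

The ordering of the scenarios here follows from how the toy is constructed and says nothing about the size of the effects in the real data.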
Limitations of scenario modeling: we are not explicitly modeling the supply of female students entering graduate school (“the pipeline”).
Mentorship: Female advisors are more effective in attracting or retaining female graduate students. Yes! Female advisors are substantially more likely than male advisors to produce female PhD graduates.
Belonging: Greater representation in the grad student community attracts women to programs and subfields. Yes! Subfields/advisor groups with greater representation of women tend to attract more women.
Attrition: Addressing disparities in career attrition for female faculty would help to close the gender gap. Yes, but… Resourcing and retaining female faculty is an inclusive, equitable goal to pursue but may not lead to large-scale change.
Leadership: A small number of influential women can dramatically change the culture of a department or research community. We’re exploring… We’re developing models and data analysis to try to detect these effects in our data set.
Heather Brooks
Harlin Lee
Mason Porter
Juan G. Restrepo
Anna Haensch
Ben Brill
National Science Foundation
ICERM @Brown
Preprint coming soon 😬😬😬