Middlebury Faculty At Home | November 28th, 2023
I am an applied mathematician, data scientist, and STEM educator.
I like…
A (generative) language model is an algorithm that generates text in partially random ways.
A language model is “large” when it is difficult to explain all of its behaviors purely in terms of its structure and training (a phenomenon sometimes called “emergence”).
Situation: Large language models (LLMs) are now powerful and widely available.
Why? Who benefits from the spread of artificial text generators?
What information can I trust about the abilities of these models?
What is the actual social impact of LLMs? How does it compare to the rhetoric of motivated actors?
How can we cultivate critical perspectives on the impact of this technology?
LLMs use next-token prediction to create human-like sentences.
LLMs use reinforcement learning from human feedback (RLHF) to make those sentences desirable (true, relevant, helpful, non-harmful).
artichoke | the | subtract | college | black | paint | |
boldly | a | small | but | pinch | pepper | |
Now | bubble | Phil | more | add | drink | river |
triangle | draw | bit | raisin | cumin | pinch | |
add | escape | lies | of | jogging | ice |
Now | add | a | bit | of | black | pepper |
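The word grid above appears to illustrate next-token prediction: at each step the model weighs many candidate words and picks one, partly at random. Here is a minimal sketch of a single step in Python. The probabilities are hypothetical numbers invented for illustration, and the names `NEXT_WORD_PROBS` and `sample_next_word` are not from any real library. A real LLM computes such probabilities with a neural network over a vocabulary of many thousands of tokens.

```python
import random

# Hypothetical probabilities for the word that follows
# "Now add a bit of black". These numbers are invented for illustration.
NEXT_WORD_PROBS = {
    "pepper": 0.80,
    "paint": 0.10,
    "ice": 0.07,
    "river": 0.03,
}


def sample_next_word(probs, rng):
    """Pick one word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]


# Repeat the same step many times to see the "partially random" behavior:
# likely words are chosen often, unlikely words only occasionally.
rng = random.Random(0)
samples = [sample_next_word(NEXT_WORD_PROBS, rng) for _ in range(1000)]
counts = {word: samples.count(word) for word in NEXT_WORD_PROBS}
print(counts)
```

Most draws should come out as "pepper", but not all of them. That leftover randomness is why the same prompt can produce different text on different runs.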
1. “Buzz Aldrin took the first steps on the moon in 1967.”
2. “Neil Armstrong took the first steps on the moon in 1969.”
Reinforcement learning from human feedback (RLHF) encourages the model to produce high-quality (helpful, correct, non-offensive) responses.
Image source: Chip Huyen
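One common way to turn human comparisons, like the choice between the two moon-landing sentences, into a training signal is a pairwise preference model. The sketch below is an assumption about how this step might look, not a method described in this talk. The reward scores are hypothetical numbers, and the function names are invented for illustration. In practice a separate "reward model" network produces the scores and is trained to reduce this loss.

```python
import math


def preference_probability(reward_a, reward_b):
    """Bradley-Terry style probability that response A is preferred over B."""
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))


def preference_loss(reward_preferred, reward_rejected):
    """Negative log-likelihood of the human rater's choice (lower is better)."""
    return -math.log(preference_probability(reward_preferred, reward_rejected))


# Hypothetical scores a reward model might assign to the two sentences.
reward_aldrin_1967 = -1.0     # sentence 1 (incorrect)
reward_armstrong_1969 = 2.0   # sentence 2 (correct)

# Suppose the human rater preferred sentence 2.
p_agree = preference_probability(reward_armstrong_1969, reward_aldrin_1967)
loss_agree = preference_loss(reward_armstrong_1969, reward_aldrin_1967)

# If the scores were swapped, the reward model would disagree with the
# rater, and the loss would be larger.
loss_disagree = preference_loss(reward_aldrin_1967, reward_armstrong_1969)

print(p_agree, loss_agree, loss_disagree)
```

The loss is small when the reward model ranks the two sentences the same way the human rater did, and large when it does not. Whatever the raters prefer is what gets rewarded, which is why the question of who the raters are matters.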
ChatGPT seems so human because it was trained by an AI that was mimicking humans who were rating an AI that was mimicking humans who were pretending to be a better version of an AI that was trained on human writing.
Google research estimates “millions” of annotation workers.
Reporting by CNBC.
Whose values? Whose benefits? Whose ideology? Whose identity?
Many scholars, especially female scholars of color, are helping us ask critical questions about these technologies.
Safiya Umoja Noble
Modern automated information systems reproduce harmful representations of marginalized identities.
Ruha Benjamin
Modern automated information systems reinforce ideologies of white supremacy and colonialism.
Joy Buolamwini
Modern automated information systems serve people of color (especially women of color) worse and affect them more harshly.
Virginia Eubanks
Many modern automation systems are explicitly designed to control marginalized populations.
Timnit Gebru
Rhetoric from contemporary tech leaders continues intellectual lineages with roots in eugenics.
Situation: Large language models (LLMs) are now powerful and widely available.
Why? Who benefits from the spread of artificial text generators?
What information can I trust about the abilities of these models?
What is the actual social impact of LLMs? How does it compare to the rhetoric of motivated actors?
How can we cultivate critical perspectives on the impact of this technology?
Thanks y’all!