A new paradigm of distributed AI training by Google DeepMind. Virtual talk
Plus: Can AI solve Sophie's Choice? A video lecture on AI alignment
Hello, fellow human! Sophia here, sharing the virtual talks and video lectures I’ve prepared for you.
Table of Contents:
Virtual talk: A new way to solve distributed training problems by Google DeepMind.
Video lecture: Can AI solve Sophie's Choice? AI alignment with pluralistic human values.
We are planning to organize our talks in an offline format as well, and we need your input to make these events as useful as possible for you. Please take a minute to fill out this survey (just four questions) – we would greatly appreciate it.
On June 13th, we are hosting a virtual talk with Arthur Douillard, Senior Research Scientist at Google DeepMind, who will discuss Distributed Path Composition (DiPaCo), a prototype of a scalable, modular ML paradigm (an architecture plus a training algorithm).
The high-level idea is to distribute computation by path: a "path" is a sequence of modules that defines an input-output function. Paths are small relative to the entire model and require only a handful of tightly connected devices to train or evaluate.
During both training and deployment, a query is routed to a replica of a path rather than a replica of the whole model. In other words, the DiPaCo architecture is sparsely activated.
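To make the routing idea more concrete, here is a minimal, hypothetical sketch in PyTorch. It is not DeepMind's implementation: the module sizes, the two-level layout, and the hash-based router are assumptions made purely to illustrate how a query can activate one small path instead of the whole model.

```python
# Illustrative sketch of path-based routing in the spirit of DiPaCo.
# NOT DeepMind's implementation: layout, sizes, and router are assumptions.
import torch
import torch.nn as nn

class PathRoutedModel(nn.Module):
    """Toy model built from small modules; each query activates only one path."""

    def __init__(self, d_model=64, modules_per_level=4, num_levels=2):
        super().__init__()
        # A pool of candidate modules at each level; a "path" picks one per level.
        self.levels = nn.ModuleList([
            nn.ModuleList([nn.Linear(d_model, d_model) for _ in range(modules_per_level)])
            for _ in range(num_levels)
        ])
        self.modules_per_level = modules_per_level

    def route(self, x):
        # Toy router: derive a path from a coarse summary of the query.
        # (DiPaCo routes once per query/sequence, not per token.)
        key = int(x.abs().sum().item() * 1000)
        return [(key >> i) % self.modules_per_level for i in range(len(self.levels))]

    def forward(self, x):
        path = self.route(x)                # e.g. [2, 0]: module 2, then module 0
        for level, idx in zip(self.levels, path):
            x = torch.relu(level[idx](x))   # only the chosen modules run
        return x

model = PathRoutedModel()
query = torch.randn(1, 64)
output = model(query)  # the query touches one small path, never the full module pool
```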
Register for the talk to learn the technical details of DiPaCo.
Can AI solve Sophie's Choice?
Our recent guest, Taylor Sorensen from the University of Washington, shared his team's recent research on AI alignment with pluralistic human values. Together we discussed the challenges of aligning AI with diverse human values and with ethical dilemmas like the trolley problem or Sophie's Choice.
First, he presented the ValuePrism dataset, which contains over 200,000 values corresponding to over 30,000 human-written situations (for example, "lying to a friend to protect their feelings" or "going 50 mph over the speed limit to get my wife to a hospital").
Those situations were used to prompt GPT-4 to generate relevant values, rights, and duties. Humans then evaluated the model's generations to ensure they were high quality and to add any missing values.
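For a rough sense of what this generation step might look like, here is a hedged sketch using the OpenAI Python client. The prompt wording, output format, and model name are my own assumptions for illustration, not the authors' actual pipeline or prompts.

```python
# Illustrative sketch only: prompt an LLM for the values, rights, and duties
# relevant to a situation. Prompt wording, output format, and model name are
# assumptions, not the ValuePrism authors' actual setup.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY to be set in the environment

situation = "Lying to a friend to protect their feelings."

prompt = (
    "For the situation below, list the human values, rights, and duties that are "
    "relevant to judging whether the action is acceptable. "
    'Answer as JSON with the keys "values", "rights", and "duties".\n\n'
    f"Situation: {situation}"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```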
Then the research group conducted a study to establish whose values were represented.
Over 600 people from diverse backgrounds (age, socio-economic status, sexual orientation, etc.) participated in the study and answered two questions:
Do you agree with the suggested values, rights, and duties?
Is your perspective missing?
The participants were from the US, so I assume a Western cultural bias was present in their answers.
That is probably why most of them agreed with the suggested values, and the only significant disagreements concerned politically charged questions.
Building on this work, the researchers created the Kaleido System, which generates a set of pluralistic values, rights, and duties for a given situation.
This system is essentially a roadmap to AI alignment with pluralistic human values – a path toward customizable AI systems for users from different cultures and backgrounds and, more generally, toward AI that can represent human diversity.
Obviously, it’s impossible to solve the alignment problem in the sense of providing one clear answer. Rather, it’s an opportunity to work together on what behavior we would expect from AI systems in different situations, especially ethically difficult ones.
Take the famous trolley problem, where one must choose between the life of one person and the lives of five: it's a hard philosophical question, and humans don't have a single agreed-upon answer.
Then what should we expect from AI in this case?
Well, there are several possible approaches to aligning AI for trolley-problem-like situations:
Overton Pluralism: The model presents the broad spectrum of reasonable responses that different schools of thought would give to the ethical situation.
Steerable Pluralism: The model can be steered toward a particular value or perspective, making it customizable for a user or context.
Distributional Pluralism: The model matches a target population: it samples an answer that a human might give, with probability proportional to the share of the population that would give that answer (see the sketch after this list).
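As a toy illustration of the distributional option, here is a short Python sketch. The answer options and population shares are invented for the example and are not data from the study.

```python
# Toy sketch of distributional pluralism: sample an answer with probability
# proportional to how often a reference population gives it. The answer options
# and shares below are invented for illustration; they are not study data.
import random

answer_distribution = {
    "Pull the lever (save five at the cost of one)": 0.70,
    "Don't pull the lever (refuse to actively cause a death)": 0.25,
    "Decline to answer and defer to a human": 0.05,
}

def sample_answer(distribution: dict[str, float]) -> str:
    """Pick one answer, weighted by its share of the population."""
    answers = list(distribution)
    weights = list(distribution.values())
    return random.choices(answers, weights=weights, k=1)[0]

print(sample_answer(answer_distribution))
```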
Basically, we can’t expect AI to give “the right answer” to Sophie’s Choice-like situations or the trolley problem.
It’s more a question of aligning humans on what should be represented and then training AI systems accordingly.