March 21, 2024
The cycle between man and machine becomes particularly interesting when the machines think. From the beginning there has been a close connection between the development of artificial intelligence, psychology, and theories of learning. The connection between man and machine was built in from the beginning: one was made in the image of the other.
The connections between computer and human learning are more than superficial analogies. Demis Hassabis, one of the world's leading AI researchers, received his doctorate in cognitive neuroscience and has written influential articles on how neuroscience, psychology and AI interact and inspire each other (Hassabis et al., 2017).
It is also possible that we can learn something about children's learning from studying how AI algorithms learn. Hassabis believes that the similarities between AI and children's learning are profound, and that through these interactions we are beginning to discern basic principles for learning.
AI pioneers and cognition
The term "artificial intelligence" was likely introduced at a meeting in 1956 with some of the people who would become key figures in the field for the next twenty years: Marvin Minsky, Herbert Simon, and Allen Newell. They had a great interest in psychology and learning: one of the articles that resulted from the meeting is entitled "Elements of a theory of human problem solving", which illustrates how they started from human thinking as a model for how machines should solve problems.
Psychology at this time was dominated by behaviorism, which cared only about stimulus and response and was content to regard everything that happened in between as a black box. The new vision was to open the black box, understand the cognitive processes, and construct thinking machines based on the same principles – an electronic brain.
The computers were primitive, as big as closets, and programmed from large control tables with switches and punch cards. The programs also worked differently than most of today's AI algorithms, which are trained to recognize patterns and images. The early programs were constructed by manipulating symbols according to the rules of logic, the way a human was thought to solve problems.
Even with these primitive tools, they managed to program machines that proved mathematical theorems by manipulating symbols. A 1970 article summarizing the progress of the preceding fifteen years concluded with a paragraph on education: "The theory of problem solving described here gives us a new basis for approaching the psychology of education and the learning process" (Simon and Newell, 1971).
One of the researchers who would take the theories further toward education was John R. Anderson, who joined the group of AI pioneers and was particularly interested in mathematics learning. He developed a theory of the mental processes that occur in mathematical problem solving: ACT (Adaptive Control of Thought). The research, which largely took place at Carnegie Mellon University, led, among other things, to the development of the first and most successful "intelligent tutoring systems", or "digital mentors". These had pre-programmed rules for how mathematical thinking proceeds, with the intention of being able to give help and tips, for example when children were learning algebra.
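To make the rule-based idea concrete, here is a hypothetical, minimal sketch of how such a tutor might check a single algebra step. The equation form, the single rule (subtract the constant from both sides), and the hint text are invented for this illustration; real systems like Anderson's relied on large sets of production rules and a model of the student's thinking.

```python
# Hypothetical sketch of the rule-based idea behind early intelligent tutoring
# systems: the tutor encodes the expected next step as a rule and compares the
# student's answer against it, offering a hint when they diverge.
# The equation form, rule, and hint text are invented for this illustration.

def expected_next_step(a, b, c):
    """For an equation a*x + b = c, the tutored step is to subtract b from both sides."""
    return c - b                    # the new right-hand side of a*x = c - b

def check_step(a, b, c, student_rhs):
    """Compare the student's proposed right-hand side with the rule's result."""
    target_rhs = expected_next_step(a, b, c)
    if student_rhs == target_rhs:
        return f"Correct: {a}x = {target_rhs}"
    return f"Hint: try subtracting {b} from both sides."

# A student working on 2x + 3 = 7 who writes 2x = 10 gets a hint,
# while 2x = 4 is accepted as the right next step.
print(check_step(2, 3, 7, student_rhs=10))
print(check_step(2, 3, 7, student_rhs=4))
```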
Papert, Piaget and constructivism
The history of AI and education also has a parallel track, which began when a young South African mathematician, Seymour Papert, traveled to Geneva to visit Jean Piaget, one of the most famous developmental psychologists of the 20th century. Piaget lived most of his life in Geneva, where he developed philosophical theories about how children develop in stages, including through interaction with the outside world. Papert's collaboration with Piaget would lead to Papert's special version of Piaget's theories: that children's development could take place through interaction with computers. The seed for digital education was sown.
In 1963, Papert was recruited to the United States by Marvin Minsky. Together, the two led MIT's Artificial Intelligence Laboratory and wrote a book titled Perceptrons, which turned out to be an important piece of the puzzle in the development of artificial neural networks.
Papert's work with children's learning began to take shape when he developed the Logo programming language, and he was particularly fond of a variant called Turtle Logo. In a console window, the children wrote commands to a turtle, which was sometimes on the screen, but was also made as a small robot turtle that ran around on the floor. By programming commands like "FORWARD" and "RIGHT" you moved, or rotated, the turtle. With the "PENDOWN" command, the turtle's movement was recorded with a line on the screen.
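For readers who want to see what this looked like in practice, here is a minimal sketch using Python's standard turtle module, a direct descendant of Logo's turtle graphics; the Logo commands FORWARD, RIGHT, and PENDOWN correspond to forward(), right(), and pendown() below. The program draws a square, much as a child's Logo program would have.

```python
# A sketch of Papert-style turtle commands, written with Python's standard
# "turtle" module (a descendant of Logo's turtle graphics).
import turtle

t = turtle.Turtle()
t.pendown()            # PENDOWN: leave a line behind the turtle as it moves
for _ in range(4):     # four sides of a square
    t.forward(100)     # FORWARD 100: move 100 steps in the current direction
    t.right(90)        # RIGHT 90: turn 90 degrees clockwise

turtle.done()          # keep the drawing window open
```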
The idea was that through programming, children would learn math the same way a child living in France learns French: without taking any lessons or being aware that they are learning. Not as a "language bath", but as a "math bath". Mathland was Papert's name for the fictional country where children would learn to speak the language of mathematics with the help of programming.
Logo was used in a collaboration with the company Lego and resulted in the product Mindstorms, where robots were built from Lego pieces and their movements controlled by computer programs that the children could write themselves. Papert also helped develop the Scratch programming language, which is widely used to teach children their first steps in programming. He also co-founded the MIT Media Lab, the maker movement, and the One Laptop per Child project, which aimed to get a laptop into the hands of every child on the planet (spoiler alert: this didn't happen). Papert's life thus connects three areas that came to be central to the future of digital education: mathematics, computers, and children's learning.
But Papert was a visionary, not a scientist. His ideas were usually not tested in controlled trials, and when individual ideas have been tested in retrospect, they have not been shown to work very well for teaching children mathematics. For example, the idea behind Mathland, that children should teach themselves, can be traced back to Jean Piaget, and even to Jean-Jacques Rousseau, who suggested that children learn best through their own exploration. It later developed into the constructivist theory that children should learn for themselves through exploration, not through instruction.
It may sound good, but it has repeatedly proven to be a largely useless methodology for learning. A summary of some of that research can be found in the article "Do learners really know best? Urban legends in education", where the authors debunk three popular myths (Kirschner and van Merrienboer, 2013). The first myth is that children who grew up with computers know how to acquire knowledge themselves and cannot learn in traditional ways, such as through teacher-led instruction. The reason this doesn't work, according to the authors, is that prior knowledge is required to be able to search for, analyze, and compile new knowledge. The knowledge thus comes before the search. Another myth is that children are self-taught and should be given maximum control over what they learn and how they learn. The reason this does not work is that children have difficulty assessing their own level of knowledge and the knowledge goals, and that what they like or think is good for them is not always so.
The researcher Stanislas Dehaene, in his book How We Learn: The New Science of Education and the Brain (Dehaene, 2020), comments on the idea of learning solely through independent exploration:
"The theory is appealing ... Unfortunately, repeated studies, conducted over several decades, show that the educational value is next to zero - and this has been replicated so many times that one researcher titled his review article "Should There be a three-strikes rule against pure discovery learning?” (Dehaene, 2020)
When Uruguay gave a computer to every child, as part of Papert's One Laptop per Child project, neither the children's knowledge nor their learning improved. Children obviously do not learn automatically just by experimenting with a computer, or by searching for information on the Internet themselves. The Rousseau-Piaget-inspired idea that children should learn without formal instruction, just by interacting with the environment, does not work.
Curiosity and false rewards
There are other cases where AI development has been inspired by how the human brain and psychology work, including how children learn. In 2015, researchers from the company DeepMind published an article describing how they succeeded in constructing an AI that, through machine learning, taught itself to play a number of computer games (Mnih et al., 2015). Out of around 50 different Atari games, it outperformed a human on more than half. In some games it was more than ten times better than a human. But at the bottom of the list was one game on which the AI algorithm scored zero: Montezuma's Revenge.
The reason seemed to be that many steps were required at the beginning of the game before you got any reward points at all: you had to enter a room, swing on a rope, climb a ladder, and so on. Only then did the player get a point and could tell they were on the right track. The reward signal was thus too sparse for the machine to know, early in learning, whether it was on the right path, which meant that it never progressed.
The solution, it turned out, was to incorporate something that children have a lot of but machines don't: curiosity. A child is often fascinated by something simply because it is new. Babies as young as a few weeks old look more at new pictures than at pictures they have seen before. They have a drive to explore, just for the sake of discovery. A similar principle has now been introduced in machine learning, where the algorithms are rewarded when they see something new. A funny episode is when an AI with such a curiosity rule encountered a TV screen and got completely stuck, because a TV constantly shows new images (Christian, 2020). Algorithms with curiosity principles also avoid dying when playing computer games, because dying means starting over from the beginning, which is incredibly dull. As the physicist Richard Feynman is reported to have said on his deathbed: "I would hate to die twice. It is so boring."
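As an illustration of the principle, here is a minimal sketch in Python. A toy agent learns with ordinary Q-learning in a corridor where the only game reward sits at the far end, mimicking the sparse rewards of Montezuma's Revenge, and an intrinsic bonus for visiting rarely seen states is added on top. The corridor, the count-based bonus, and all the numbers are assumptions made for the example; this is not the method DeepMind actually used, only a simple stand-in for the same idea.

```python
# Toy sketch of count-based "curiosity": the agent receives an intrinsic bonus
# for visiting rarely seen states, on top of the sparse game reward.
# Everything here is a simplified illustration, not DeepMind's actual method.
import random
from collections import defaultdict

random.seed(0)
N_STATES = 20          # a corridor: the only game reward is at the far end
ACTIONS = [-1, +1]     # step left or step right

def run(curiosity_weight, episodes=200, steps=60):
    q = defaultdict(float)        # Q-values for (state, action) pairs
    visits = defaultdict(int)     # how many times each state has been seen
    total_game_reward = 0.0
    for _ in range(episodes):
        state = 0
        for _ in range(steps):
            # epsilon-greedy choice between the two actions
            if random.random() < 0.1:
                action = random.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state = min(max(state + action, 0), N_STATES - 1)
            visits[next_state] += 1
            game_reward = 1.0 if next_state == N_STATES - 1 else 0.0
            # curiosity bonus: large for novel states, shrinking with repeat visits
            bonus = curiosity_weight / (visits[next_state] ** 0.5)
            # standard Q-learning update on the combined reward
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += 0.5 * (game_reward + bonus + 0.9 * best_next - q[(state, action)])
            total_game_reward += game_reward
            state = next_state
    return total_game_reward

print("game reward without curiosity:", run(curiosity_weight=0.0))
print("game reward with curiosity:   ", run(curiosity_weight=1.0))
```

In this toy setup, the agent without the bonus tends never to find the reward at the end of the corridor, while the curious agent explores its way there and then keeps collecting it.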
Just as curiosity is critical to human learning, so is attention. Attention, too, has been imitated in machine learning, leading to great improvements. For example, when an algorithm analyzes images, it is much more efficient if it places more emphasis on certain parts of the image instead of trying to use the information in every pixel. It thus selects certain information to be given priority, just as humans do when we learn.
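To show what this looks like in code, here is a minimal sketch of scaled dot-product attention, the mechanism behind much of modern machine learning, written in Python with NumPy. The input sizes and the random projection matrices are arbitrary assumptions for the example (in a trained model the projections are learned); the point is that the resulting weight matrix spells out how much emphasis each part of the input receives.

```python
# Minimal sketch of scaled dot-product attention: each input position computes
# how relevant the other positions are to it and takes a weighted mix of them.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))        # 4 input positions, 8 features each

# in a real model these projection matrices are learned; here they are random
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
q, k, v = x @ W_q, x @ W_k, x @ W_v

scores = q @ k.T / np.sqrt(k.shape[-1])           # similarity between positions
weights = np.exp(scores)
weights /= weights.sum(axis=-1, keepdims=True)    # softmax: each row sums to 1
output = weights @ v                              # weighted mix of the values

print(np.round(weights, 2))   # larger entries = positions given more attention
```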
Another interesting similarity between human and machine learning is how both can easily go in the wrong direction. In his book The Alignment Problem, AI researcher Brian Christian describes how a friend's daughter, who was 3 years old at the time, accidentally knocked over a jar of crayons on the floor (Christian, 2020). Her father asked her to pick up the crayons and put them back in the jar. When she did this, he thought she was so good at cleaning up that he said she could take a cookie as a reward. The girl took a cookie and then immediately went back to the kitchen table and knocked the jar of crayons down on the floor again so she could have another cookie when she picked them up.
Actually, it was the father who used a faulty reward system. What he really wanted was to teach his child to always keep the kitchen floor clean. But he rewarded another specific behavior: picking up crayons. His daughter interpreted the rule literally, disregarding the unspoken and underlying goal of keeping the kitchen clean. Since Brian Christian's friend is also an AI researcher, he immediately saw the similarity between his daughter's behavior and how it often goes wrong when teaching artificial intelligence a specific behavior or to cope with a specific task. The core of the problem is the gap between what you really want to achieve and what is rewarded.
In studies of how animals learn, behavior is often shaped gradually, following principles that go back to Skinner's research on conditioning. When Skinner wanted to teach a pigeon to peck a particular button, the pigeon first got a reward simply for being in the vicinity of the button. Then it got a reward if it pecked anywhere near the button. As a final step, it only got a reward if it pecked the button exactly. Similar strategies are used when computers have to learn. One example Brian Christian describes in his book is an AI that was taught to cycle from one point to another (Christian, 2020). The AI was rewarded every time it headed in the right direction. But after letting it train for a few hours, the researchers saw that it had got stuck in a loop, rapidly cycling around and around in a circle, because it scored points every time it was pointed in the right direction. It was caught in a reward loop that did not match the goal the researchers actually had.
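A toy calculation makes the mismatch visible. The sketch below assumes a simple proxy reward, one point for every time step in which the cyclist is pointed roughly toward the goal, and compares a policy that rides straight to the goal with one that merely circles near the start. The geometry and numbers are invented for the illustration and have nothing to do with the original experiment's setup; they only show that the circling policy can collect far more proxy reward while never solving the task.

```python
# Toy model of the cycling example: the proxy reward ("you are pointed toward
# the goal") is paid every time step, so a policy that circles forever can
# collect more of it than a policy that actually reaches the goal.
import math

def straight_to_goal(distance=100, speed=1.0):
    """Ride directly at the goal: one proxy point per step, then the episode ends."""
    steps = int(distance / speed)
    return float(steps)                          # 100 steps -> 100 proxy reward, done

def circle_forever(steps=10_000, turn_per_step=0.1):
    """Circle near the start: a proxy point whenever the heading is roughly at the goal."""
    heading, reward = 0.0, 0.0
    for _ in range(steps):
        heading = (heading + turn_per_step) % (2 * math.pi)
        if math.cos(heading) > 0.9:              # pointed (roughly) toward the goal
            reward += 1.0
    return reward

print("straight to goal:", straight_to_goal())   # bounded reward, task solved
print("circling forever:", circle_forever())     # far more reward, task never solved
```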
Another, theoretical, example of how rewards can go wrong concerns how one might get a super-intelligent AI to solve humanity's most complex problems. What if a future AI could not only suggest experiments but also perform them? Let's say we were to give it this instruction: "Find a cure for cancer." What could go wrong? Well, one way for an AI to solve that problem is to give half the world's population cancer and then systematically test 1 million different drugs on the 4 billion people who now have cancer. Again, this would be an example of the machine taking the task literally, failing to see the unspoken, underlying goal of ensuring that as many people as possible live healthy, long, and happy lives.
When you open your eyes to the gap between rewards and underlying goals, you can find examples of misdirected rewards everywhere in society. Take research, for example. Defining the goal of research is not easy, but a possible, simplified formulation could be: "Seek the truth, be useful." But what researchers are actually rewarded for is publishing many articles. The result can be many articles that neither seek the truth nor do any good, that often cannot be replicated and, in the worst case, contain deliberate errors.
How we define goals and reward children's behavior is highly relevant to understanding child rearing, children's learning, and schooling. Children are rewarded for small changes in behavior such as attending classes, doing homework, performing well on tests, and getting good grades. But the underlying goal we really have is for them to acquire knowledge, partly for the knowledge's own sake, partly because it will be useful in the future. Cheating on a test is an example of the immediate rewards working against the underlying, unspoken goal we have for school and learning.
In his book Mindstorms: Children, Computers, and Powerful Ideas (Papert, 1984), Papert writes that computers will revolutionize learning and that learning will be as different from previous teaching as the car was from the horse and cart. Papert was far ahead of his time, unscientific, and wrong in many respects. But through his visions he was influential, and in one respect he was perhaps right: digitalization has already had a huge influence on virtually every aspect of society and research, and it is likely to revolutionize learning as well. Perhaps we are right now on the threshold of that revolution.
(This text is a slightly shortened excerpt from my book "The Future of Digital Learning", published in Swedish in 2023 as "Framtidens digitala lärande".)
References
Christian B (2020) The Alignment Problem: Machine Learning and Human Values. New York: Norton & Company.
Dehaene S (2020) How We Learn: The New Science of Education and the Brain. London: Allen Lane.
Hassabis D, Kumaran D, Summerfield C, Botvinick M (2017) Neuroscience-Inspired Artificial Intelligence. Neuron 95:245-258.
Kirschner PA, van Merrienboer JJG (2013) Do learners really know best? Urban legends in education. Educ Psychol 48:169-183.
Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, Graves A, Riedmiller M, Fidjeland AK, Ostrovski G, Petersen S, Beattie C, Sadik A, Antonoglou I, King H, Kumaran D, Wierstra D, Legg S, Hassabis D (2015) Human-level control through deep reinforcement learning. Nature 518:529-533.
Papert S (1984) Mindstorms: Children, Computers, and Powerful Ideas. New York: Basic Books.
Simon HA, Newell A (1971) Human problem solving: The state of the theory in 1970. Am Psychol 26:145.
Professor of Cognitive Neuroscience