In “Toward a Critical Technical Practice: Lessons Learned in Trying to Reform AI”, Philip E. Agre points out that, in the field of artificial intelligence research, “each [AI implementation] technique is both a method for designing artifacts and a thematics for narrating its operation.” Thus, many of the terms used to discuss the implementation of AI systems take on a kind of dual meaning.
Consider, for instance, the term “planning”. On the one hand, “planning” refers to a particular family of techniques employed in the design and implementation of AI systems – things like classical planning, reactive planning, and so on. On the other hand, the term also implicitly frames these techniques as equivalent to a particular cognitive faculty found in human beings: in this case, the ability to make and execute plans for the future.
Dual-meaning terms like this, grounded in our understanding of human cognition, serve an important purpose: they help us to understand the role that a given AI implementation technique might play in a larger design. But, as Agre goes on to suggest, they can also lead to confusion. In areas where the human faculty being modeled does not map perfectly to the attempted formalization of this faculty, discrepancies between the formal model and the faculty itself are often covered up by their conflation under a single common term.
Many such implicit conflations of human faculty and implementation technique are present in the language we use when talking about AI, but at the moment, I’m especially concerned with the way we apply the term “learning”. In modern AI discourse, “learning” generally refers to a process by which a system that ingests more and more training data can gradually become better and better at making a certain kind of decision based on that data. To “learn”, in the “machine learning” sense, is to ingest more existing knowledge, seemingly to the near-total exclusion of anything else.
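In code, this acquisition-style sense of “learning” might be sketched as nothing more than a loop that fits parameters to an ever-growing pile of examples. Here’s a deliberately minimal illustration (a toy linear model trained by gradient descent, not any particular library’s API):

```python
def learn(examples, epochs=500, lr=0.01):
    """Acquisition-style 'learning': fit a line y = w*x + b to examples.

    The system 'learns' only by ingesting (x, y) pairs -- no
    feedback-seeking, hypothesis-testing, or play is involved.
    """
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in examples:
            error = (w * x + b) - y   # prediction minus target
            w -= lr * error * x       # gradient step on the squared error
            b -= lr * error
    return w, b

# The more data the system ingests, the closer it gets to the
# underlying rule (here, y = 2x + 1).
data = [(x, 2 * x + 1) for x in range(-5, 6)]
w, b = learn(data)
```

Everything the system will ever “know” arrives through `examples`; this is the vessel-being-filled picture in its purest form.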
Education researcher Anna Sfard refers to this perspective on learning as the “acquisition metaphor”. Under this perspective, the learner’s mind is viewed as a vessel into which the fluid of knowledge is poured. Sfard draws a contrast between this and another metaphor for learning: the “participation metaphor”, in which…
> […] learning a subject is now conceived of as a process of becoming a member of a particular community. This entails, above all, the ability to communicate in the language of the community and act according to its particular norms. […] From a lone entrepreneur, the learner turns into an integral part of a team.
More than merely pointing out that these two contrasting metaphors compete to shape our understanding of learning, Sfard also argues that elements of both metaphors are needed to understand learning fully. The acquisition metaphor, arguably the dominant perspective “since the dawn of human civilization” (and certainly throughout most of the history of Western education research), more or less entirely ignores the role of groups and communities in learning. The participation metaphor, on the other hand, situates knowledge entirely within delimited social contexts, leaving little or no room for the notion of knowledge transfer from one context to another.
Thus, lest one particular metaphor for learning become too dominant (and thereby begin to obscure key parts of the learning process), Sfard stresses the importance of having multiple metaphors on hand:
> As long as a metaphor enjoys full hegemony, its normative implications are usually taken for granted; introduction of a new metaphor is often enough to bring the issue of norms to the fore and turn it into an object of explicit reflection.
And beyond just Sfard’s acquisition and participation metaphors for learning, there’s a good case to be made for at least one more: the construction or “knowledge creation” metaphor suggested by Sami Paavola and Kai Hakkarainen, in which the focus of the learning process is neither on acquisition of knowledge nor on participation in communities but rather on the learner’s role as a creator and developer of “new material and conceptual artifacts”. All three of these metaphors account for weaknesses in the others and contribute to a balanced understanding of the learning process.
What does it mean for AI research, then, that the sense of “learning” we use when we discuss it is so tightly and exclusively bound up with the acquisition metaphor? For one thing, I think it means we might be missing out on a wide variety of potentially important new research directions. Here’s a (hugely non-exhaustive) list of activities that definitely play a role in the learning process in humans, but that I rarely, if ever, see discussed under the “machine learning” label:[^1]
- Using both internal (“does this feel right?”) and external (“what do my peers/teachers/critics/audience think? what happens in the environment?”) feedback to evaluate your performance
- Deciding how to weigh potentially conflicting or mutually contradictory feedback from a variety of different sources
- Forming and testing hypotheses: “what will happen if I do X?”
- Curiosity, intellectual need: proactively seeking out new information to clarify an existing understanding
- Experimenting in a playful or trial-and-error fashion with alternative ways of approaching the subject, some of which you don’t actually expect to work
- Analogizing, forming connections between the thing you’re attempting to learn and other things you already understand
- Decomposing an activity you’re attempting to learn into distinct subactivities, and practicing these subactivities in isolation
One thing I notice immediately about a lot of these activities is that they rely on the learner’s existence as either a social or embodied being. Most systems that get discussed under the label of “machine learning” today make no attempt to situate themselves in the physical world (or even a simulation thereof), nor are they enabled or encouraged to form social relationships with other agents – whether humans or bots.
My own interest in AI is focused primarily on its application to creative or expressive purposes, and here especially I suspect the dominance of the acquisition metaphor in machine learning discourse leaves us shortchanged. Learning in creative fields often depends heavily on both enculturation into communities of creative practice (better supported by the participation metaphor) and on the practice of creating new artifacts (better supported by the knowledge creation metaphor). To set about developing a “learning” creative AI system with only the acquisition metaphor in hand seems likely, in my view, to produce a highly static system, extraordinarily limited in its ability to produce anything that might legitimately surprise its creator.[^2]
There’s also the issue of creative inspiration to consider. A number of existing learning-based content recommendation systems have managed to put together a pretty good profile of what I like, but they all suffer from a certain common problem: as good as they are at surfacing more of the stuff I’m already known to enjoy, they have no ability whatsoever to suggest things outside my known interests. This creates a self-reinforcing feedback loop: I click the things that are recommended to me, so I get more recommendations for that kind of thing. Gradually, over time, I become trapped in the local maximum of who the system thinks I am.
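The loop described here can be simulated in a few lines. Below is a deliberately crude sketch (with invented category names, not any real recommender) in which recommendations are drawn in proportion to past clicks and every click feeds back into the counts:

```python
import random

def simulate_feedback_loop(steps=1000, seed=0):
    """A recommender that only reinforces what the user already clicks."""
    rng = random.Random(seed)
    categories = ["jazz", "ambient", "folk", "noise"]   # invented labels
    clicks = {c: 1 for c in categories}                 # start near-uniform
    clicks["jazz"] = 2                                  # ...one early click
    for _ in range(steps):
        # Recommend in proportion to the existing click profile.
        pick = rng.choices(categories,
                           weights=[clicks[c] for c in categories])[0]
        clicks[pick] += 1   # the user clicks what they're shown
    return clicks

profile = simulate_feedback_loop()
# Whichever category happens to pull ahead early tends to keep a lasting
# share of the recommendations -- the "local maximum" described above.
```

The dynamic is self-reinforcing by construction: the system never probes outside the existing profile, so an early, possibly accidental lead compounds over time.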
This is bad for creative work, and particularly for the surfacing of creative inspiration. When I’m doing creative work, what I often want is to break out of the rut I’m currently stuck in – to be presented not with stuff that I or people like me have already expressed an interest in, but with possible alternatives to the familiar. The acquisition metaphor doesn’t really afford this way of thinking; it has no room for the idea that I might want to temporarily escape my current cultural context (participation metaphor) or produce new knowledge by juxtaposing apparently disconnected things (creation metaphor).[^3]
When I look more closely at my expanded list of activities that “learning” might entail, I realize that I have, in fact, encountered AI systems that engage in many of these activities in some form or another. But relatively few of these systems would seem to fit under the label of “machine learning”, whose usage today has become remarkably narrow.
Consider one of my favorite creative AI projects: Techne, an “artbot commune” that plays host to a variety of different art-creating bots. These bots not only create art but also learn from one another, sharing techniques and critiquing one another’s creations according to their own internal tastes. A bot may even update its own taste based on the feedback it receives from other bots. Altogether, this project amounts to a computational model of a community of creative practice – perhaps the only such model I have ever encountered.
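To give a sense of the kind of dynamic at work, here’s a loose toy sketch of a community of creating-and-critiquing bots. This is emphatically not Techne’s actual implementation – the class names, the one-number “artworks”, and the taste-update rule are all invented for illustration:

```python
import random

class ArtBot:
    """A toy 'artbot': makes one-number 'artworks' near its current taste."""

    def __init__(self, name, taste, rng):
        self.name = name
        self.taste = taste          # the style value this bot prefers
        self.rng = rng

    def create(self):
        # An 'artwork' is just a number near the bot's taste.
        return self.taste + self.rng.uniform(-1, 1)

    def critique(self, artwork):
        # Approval is higher the closer the work sits to this bot's taste.
        return -abs(artwork - self.taste)

    def update_taste(self, artwork, approvals, rate=0.1):
        # Well-received work pulls the creator's own taste toward it
        # (the approval threshold here is arbitrary).
        if sum(approvals) / len(approvals) > -2.0:
            self.taste += rate * (artwork - self.taste)

rng = random.Random(42)
bots = [ArtBot(f"bot{i}", taste=rng.uniform(0, 10), rng=rng)
        for i in range(4)]

for _ in range(50):
    for bot in bots:
        work = bot.create()
        approvals = [other.critique(work) for other in bots if other is not bot]
        bot.update_taste(work, approvals)
# Works that land nearer the rest of the community are more likely to be
# approved, so accepted taste-updates are biased toward the group.
```

Even in this crude form, the learning signal is social: what a bot comes to prefer depends on the community it creates within, not on a training set it ingests.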
Yet despite the fact that bots in Techne clearly learn from one another, this project is unlikely to be considered an example of a “machine learning” system. The aspects of learning that it models, and the implementation techniques that it uses, are a poor fit for the dominant acquisition metaphor used to understand learning in machine learning communities. Learning in Techne would likely be better described by a combination of the participation and creation metaphors, which are generally viewed today as outside the purview of machine learning.
This is a problem for AI research because the narrow use of “machine learning” as a term prevents us from making the connection between projects like Techne and the very concept of learning. When someone sets about asking the question of how to make a machine that learns, they are steered automatically toward a certain family of techniques – techniques that don’t necessarily represent the best way to approach the problem of learning across the board. As a result, potentially promising research directions are obscured from view.
Thus the importance of what Agre refers to as a “critical technical practice”. By interrogating the imperfect, “leaky” metaphors we use to frame our understanding of technology, we can uncover perspectives and problems that the currently dominant metaphors exclude. Then, as Sfard suggests, we can bring in new metaphors to encapsulate the alternative understandings that we create, ensuring that no one metaphor can enjoy “full hegemony” and opening up whole new ways of looking at the subjects of our research.
My own research in the development of AI for creative purposes has already benefitted significantly from one such perspective shift, away from the dominant metaphor of using software to “solve a problem” and toward the alternative metaphor of using software to “explore a space”. As I continue in this direction, I’m looking forward to seeing what other perspective shifts will become necessary, and how exactly they’ll change my understanding of the issues involved.
[^1]: Of course, this is also partly due to my own lack of familiarity with some branches of machine learning research! Matthew Guzdial suggests on Twitter that machine learning researchers have given at least some thought to all of the specific elements of learning that I propose here.
[^2]: Not to say, of course, that the goal is always to develop a system that can surprise its creator! Sometimes the last thing you want as a creator is to be caught off guard by your own tools. But if we intend to cast an AI system in the role of a creative collaborator, it’s important to note that creative collaboration is often most enjoyable and generative when your collaborators make suggestions or introduce elements that you would not have considered alone.
[^3]: It might be tempting to dismiss this argument as unimportant. After all, there’s nothing that outright prevents us from rephrasing these concerns in terms of the acquisition metaphor – it just takes a bit of extra work. But this is rather like saying that, since C and Haskell and PowerPoint are all Turing complete programming languages, it doesn’t really matter which one you choose to program in. The reason we have multiple different languages is because each one makes certain ideas easier (or, as the case may be, harder) to express. It’s not about what each language makes theoretically possible, but about what it makes easy.
Likewise, even though it’s totally possible to rephrase concerns raised by one metaphor for learning in terms of another metaphor, it rarely results in the most natural expression of those concerns. And, if you’re thinking in terms of one metaphor only, the concerns initially raised by other metaphors might well never even occur to you – even though you totally could express these concerns in terms of your chosen metaphor once you became aware of them in the first place.