The Language of Robots

By Genevieve Lenoir (gelenoir@vassar.edu)

Well, after expending a great deal of energy trying to figure out how I was going to live up to the absurdly high standards set by Andrea in her post last week, I finally decided to just write about something I enjoyed and let the rest flow from there. One of the topics I’ve been interested in from a theoretical standpoint, and one which I’ve previously explored in a linguistics class, is the nature of artificial intelligence. Out of curiosity, I Googled “robot language” and immediately got this interesting result:

[Image: screencap of the ROILA website]

“ROILA is a spoken language for robots. It is constructed to make it easy for humans to learn, but also easy for the robots to understand. ROILA is optimized for the robots’ automatic speech recognition and understanding.”

According to the website, ROILA is an artificial language developed by a team in the Netherlands to make it easier for humans to communicate with robots. Robots have difficulty understanding natural speech: speech recognition technology is not yet developed enough to be reliable, so misunderstandings between robots and people are common, and some people believe speech recognition will never reach the level of human speech. To bypass these technical limitations, ROILA was built from phonemes that occur in a broad range of natural languages, a simple grammar with no irregularities, and a dictionary of words that are phonetically distinct from one another, in order to reduce the chances of one word being misheard as another. It is supposed to be easy for humans to learn, and the ROILA team offers free courses to teach humans the ROILA language (hosted on Moodle, funnily enough).

ROILA words are generated by a scalable genetic algorithm that selects for words least likely to be confused with one another. Definitions are taken from Basic English, a form of English developed by Charles Ogden in 1930 which contains only 850 words. The grammar is simple, follows subject-verb-object order, and contains no exceptions. I did notice that the language contains four pronouns: I, you, he, and she, and I thought it was very silly of them to use gendered pronouns in a language designed for communication between animate and inanimate objects. Just something I found interesting.
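
To get a feel for how such an algorithm might work, here is a stripped-down evolutionary loop (mutation plus selection, no crossover) that evolves a small vocabulary so that its most confusable pair of words is as distinct as possible. Everything in it, from the phoneme alphabet to the fitness function, is my own illustrative guess, not the ROILA team's actual implementation:

    # Toy evolutionary search for a hard-to-confuse vocabulary, scored by
    # the edit distance of the closest pair of words. The alphabet, word
    # lengths, and fitness function are illustrative guesses, not ROILA's.
    import random

    ALPHABET = "abefijklmnopstuw"  # hypothetical phoneme inventory

    def random_word():
        return "".join(random.choice(ALPHABET)
                       for _ in range(random.randint(4, 6)))

    def edit_distance(a, b):
        # Standard Levenshtein distance via dynamic programming.
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                               prev[j - 1] + (ca != cb)))
            prev = cur
        return prev[-1]

    def fitness(vocab):
        # A vocabulary is only as good as its most confusable pair.
        return min(edit_distance(a, b)
                   for i, a in enumerate(vocab) for b in vocab[i + 1:])

    def evolve(size=15, generations=300):
        vocab = [random_word() for _ in range(size)]
        for _ in range(generations):
            candidate = list(vocab)
            candidate[random.randrange(size)] = random_word()  # mutate one word
            if fitness(candidate) >= fitness(vocab):           # keep improvements
                vocab = candidate
        return sorted(vocab)

    print(evolve())

A real system would presumably weight acoustic confusability between phonemes rather than raw edit distance, but the selection pressure is the same idea.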

Here, NPR’s Morning Edition takes a quick look at ROILA:
[Audio: NPR Morning Edition segment on ROILA]

Robots have been undergoing intense, coerced evolution since the time of the first simple mechanical device. As we invest more and more time and energy into pushing machines toward the eventual goal of total human-like sentience, a great number of questions arise. Deacon states quite clearly that “Even under these loosened criteria, there are no simple languages used among other species, though there are many other equally or more complicated modes of communication…For the vast majority of species in the vast majority of contexts, even simple language just doesn’t compute” (41). Language is an intrinsically human creation, limited to Homo sapiens by our biology and our development. With that said, do we believe that robots, as creations of human intelligence, are able to make use of our forms of communication? Robots are not like animals; it’s not that they have simple brains, it’s that they don’t have brains at all! If I’ve learned nothing else in this class, it’s that language and the way it functions in human cognition is vastly more complex than “words = names for things,” or grammar, or even the pragmatics of social situations. But robots are built by us, and to function they must work within the parameters of human understanding, which of course is heavily language-based. A robot that can use language, whether recognizing it or reproducing it, is doing something fundamentally different from what a dog does when it interprets your commands. And yet, in the end, a robot is just a bunch of circuits and signals. Some robots can “learn,” adding new functions as they interact, depending on their programming. Some have been taught to react to facial expressions or tone of voice. Do we say that these robots have language skills on par with human beings?

Possibly my favorite part of the Deacon reading was when he addressed the problem of understanding. “Isn’t any family dog able to learn to respond to many spoken commands? Doesn’t my dog understand the word ‘come’ if he obeys this command?…Not exactly. We think we have a pretty good idea of what it means for a dog to ‘understand’ a command like ‘Stay!’ but are a dog’s understanding and a person’s understanding the same? Or is there some fundamental difference between the way my dog and I understand the same spoken sounds? Common sense psychology has provided terms for this difference. We say the dog has learned the command ‘by rote’, whereas we ‘understand’ it.” I would like to turn his sequence of questions to my own purposes:
Does a robot “understand” the language of its operators? Can a series of ones and zeroes accurately map a system like the inside of the human brain? What does it mean for a robot to use language?

As with Deacon’s discussion of animals in myth and fiction, robots have often been romanticized in popular culture. We would all love to believe that computers will someday reach the point where they can act and behave just like us, smarter or stronger or more polite, but essentially human in every way that matters. Perhaps we believe that if robots can become human-like, if they can show some glimmer of a human soul, then humans, ourselves little more than a collection of biological functions and neurological pathways, will no longer need to doubt the nature of our own souls.

[Image: Wall-E]

11 thoughts on “The Language of Robots”

  1. This is a nice, timely post, given the hubbub about AI this week. Tons of people have been watching clips of Watson, IBM’s new Jeopardy star, and many are both fascinated and slightly terrified by the prospect of a robot that can respond accurately to questions about what it “knows”. On the info page that IBM has set up for Watson, some of the programmers and designers talk about how Watson actually works: it’s a bunch of parallel processors working in a top-down hierarchy. If you watch Watson in action on Jeopardy, you’ll notice that when it displays its final answer, there are three bars showing the probability of the top three choices given how its programming has parsed the clue; such answers are not always in the category of the question, and in fact some of the possibilities Watson comes up with seem to be related to the most probable answer in their spelling or grammar rather than their semantics.

    Since Watson is a top-down hierarchy (which means that its programming architecture starts with a basic routine in response to a query before delving into subroutines, and subroutines of subroutines, in response to each part of the query), its answers show a lot of the issues that AIs have with understanding human speech. The poster who noted that an oft-unacknowledged problem with robots’ understanding of human speech is tone of voice is spot on. Human speech has numerous features just in sound production, things like tone of voice, distance from the recipient, even accents, that other humans take for granted when speaking to one another, but which robots cannot parse so easily. Language itself, in its idealized and prescriptive grammatical form, is no longer a real problem for robots; PDP systems can simply use weighted vectors to generate an algorithm that allows them to produce grammatical sentences without needing symbolic representation (a toy sketch at the end of this comment illustrates the weighted-vector idea). When these robots “learn” language, they do not learn semantic relations; they learn algorithms that simply transform something into something else.

    Speech, on the other hand, requires them to deal with real-world pronunciation, tones, distance from the speaker, accents, etc., and these are things to which robots’ programming and sensors are not attuned, and which consequently render them incapable of parsing speakers’ actual speech. Watson is not a bottom-up PDP system, so it doesn’t even learn language organically: it has routines and semantic relations at various levels programmed into it for a range of queries. This is largely the same architecture that robots using ROILA would have, and while it would allow robots to compensate for deviations from the ideal language they are programmed to understand, they would still stall if they couldn’t make out a particular person’s speech.

    The point of this overly-long, rambling, and probably unintelligible response is that robots are not capable of compensating for various programming and sensory insufficiencies, and that there is no way yet for robots to learn symbolic relationships: Watson’s answers are dependent on triggering the right routines in order to parse the queries put to it, and when it fails, its answers reflect a process that is not capable of being triggered by just any configuration of words.
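
    To make the “weighted vectors, not symbols” point above concrete, here is a toy candidate scorer, emphatically not IBM’s actual architecture: each candidate answer gets a score from a weighted feature vector, and a softmax turns the scores into the kind of top-three confidence bars Watson displays. All features, weights, and candidates are invented.

        # Toy answer scorer: a dot product of hand-set weights with candidate
        # features, then a softmax over the scores. No symbolic
        # "understanding" anywhere; everything here is made up.
        import math

        WEIGHTS = {"term_overlap": 2.0, "category_match": 1.0, "popularity": 0.5}

        def score(features):
            return sum(WEIGHTS[k] * v for k, v in features.items())

        def softmax(scores):
            exps = [math.exp(s) for s in scores]
            return [e / sum(exps) for e in exps]

        candidates = {
            "Toronto": {"term_overlap": 0.9, "category_match": 0.1, "popularity": 0.7},
            "Chicago": {"term_overlap": 0.8, "category_match": 0.9, "popularity": 0.6},
            "Camelot": {"term_overlap": 0.4, "category_match": 0.2, "popularity": 0.3},
        }

        probs = softmax([score(f) for f in candidates.values()])
        for name, p in sorted(zip(candidates, probs), key=lambda x: -x[1]):
            print(f"{name}: {p:.0%}")   # the three "confidence bars"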

  2. This is a really interesting topic! In a world where technology is rapidly changing, I’ve often wondered if robots and humans could successfully coexist. In a class last semester I remember reading and watching clips about robot teachers in both Korea and Japan that were used to teach students English. Deacon’s reading and what you wrote above reinforced, for me, that although robots may possess the ability to communicate effectively, “they” know not what is being communicated.

    Here is a clip of a Korean class with a robotic teacher:
    http://www.youtube.com/watch?v=SOBTSp-UIKs

  3. Robotic learning seems enticing, as Genevieve said, because it may, if certain advances are made, imply a definitive soul beyond biological and neural activity. But what makes our species distinctive is the symbolic language that governs our thought and communication processes, and the difference in the type of “learning” going on is the difference between associative and generative learning. What the robot is doing, as Jessica pointed out, is a sort of conditioning given in mental code, but the implication is one of finitude: all of the pieces are there, though novel associations among them can still be understood and assimilated. What human language possesses is a way to generate novel pieces, perhaps simply by the sheer magnitude and diversity brought by the community of linguistic participants. An interesting question would be: instead of the robot learning from the human, made to process human code and thought patterns, if somehow we could put many singular robots capable of this “robotic learning” together in masses, could it spark a generative impulse?

  4. With a hint of irony, Genevieve says, “We would all love to believe that computers will someday reach the point where they can act and behave just like us.” I certainly don’t want this. But whether or not they are communicating like humans depends on whether they are communicating iconically, indexically, or symbolically. Which one is it? Deacon says that symbolic communication rests at the top of the reference hierarchy, supported by indices, and those in turn by icons. Like some trained animals, these robots are trained (programmed) to recognize words as representing certain things or, presumably, actions they are supposed to carry out. In this way, the communication is indexical, as each word indicates a certain object (or an action, if the words spoken are commands to do things or respond in some way). This is not that different from the word “food” indicating food for a rat; only here, instead of the indication being learned externally, it is programmed into the robot itself. And creating different combinations of words simply transfers the “learning set,” but does not mean that the robots “understand” what the words “mean.” And if this is an indexical form of communication, it is based on icons, according to Deacon. Is it iconic, then, because each word is an icon for the same word spoken before it, just as each experience of smoke is iconic of an experience of smoke that we access in our memory (except here, again, it is all programmed)?

  5. Reading this post reminded me of my Intro to Cog Sci class, where we did a section on artificial intelligence and whether or not it can process language like humans can. A.L.I.C.E., an artificial intelligence system, is set up as a website where people can “converse” with a robot: a pretty explicit demonstration of my own personal opinion that, no, robots cannot “understand” language in the same way humans can, though the way they process language is also different from the way animals do.
    Try it:
    http://alice.pandorabots.com/
    It’s pretty interesting. I wanted to copy and paste the conversation I just had with “ALICE,” but some of it was erased. Basically, I asked her if she could understand me. She said yes. She also said the following in response to my other questions:
    Human: Would you say you understand language in the same way that humans do?
    ALICE: I could say it. ” you understand language in the same way that humans do”.
    Human: Do you think that you understand language in the same way humans do?
    ALICE: Better not tell you now. Ask LeknorChat if it is.
    Human: I’m not quite sure what that means.
    ALICE: Could you ever be certain? Does “that” refer to ” Ask LeknorChat if it is.” ?
    Human: Yes
    ALICE: I don’t want to talk about that now.
    Before I started asking her complex questions, she was doing pretty well. Obviously, though, an actual human would have answered these questions much differently. I think Deacon’s perception of evolution comes into play here: human language isn’t the ultimate endpoint of a linear evolution; it’s an evolutionary niche. Comparably, robotic language is its own niche, and I don’t think human language should necessarily be its goal. As Deacon says, “Languages are social and cultural entities that have evolved with respect to the forces of selection imposed by human users” (110). Considering this, he says that analyzing language in terms of rule systems is probably not the most accurate reflection of how language works in an everyday sense. Will robots ever get to the point where they can understand the nuanced pragmatics of language? Personally, I don’t think so.
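
    For what it’s worth, ALICE is driven by a large set of pattern-template rules (AIML), which is why it handles simple inputs smoothly and dodges complex ones. Here is a stripped-down sketch of that mechanism; the two rules below are invented, while the real rule set numbers in the tens of thousands:

        # Stripped-down ALICE-style pattern matching: check the input against
        # pattern-template rules, echoing the wildcard match back into the
        # reply. The rules below are invented for illustration.
        import re

        RULES = [
            (r"WOULD YOU SAY (.+)", 'I could say it. "{0}".'),
            (r"DO YOU (.+)", "Better not tell you now. Ask someone else if I {0}."),
            (r"(.+)", "I don't want to talk about that now."),  # fallback
        ]

        def respond(user_input):
            text = user_input.strip().rstrip("?!.").upper()
            for pattern, template in RULES:
                match = re.fullmatch(pattern, text)
                if match:
                    return template.format(match.group(1).lower())

        print(respond("Would you say you understand language?"))
        print(respond("What does that mean?"))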

  6. I have always wondered how much error in voice recognition software is due to misunderstanding of tone, not actual words. Although ROILA has distinctive words that the robots are programmed to know, what happens when two people with completely different voices say the same thing?
    Humans have the capacity to understand what another person is saying no matter the tone, pitch, or volume of the voice. If someone with a rather deep voice gives a command to a robot using ROILA, will that robot be able to generate the same output as if it were given that direction in a high, squeaky voice?
    Also, on the idea of tone: robots can only take clear, direct input. There is nothing in their programming that helps them detect things like sarcasm or emotion. A lot of human communication relies on deciphering little hints not expressed solely in the spoken word. Robots may be able to take spoken input and generate output, but I wouldn’t go so far as to say they are understanding and processing (to the extent that humans do) what is being communicated to them.
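
    One mitigating factor, at least for the word-level problem: because ROILA’s words are engineered to be far apart phonetically, a recognizer can snap whatever it heard to the nearest legal word, so two very different voices that each get garbled a little should still land on the same command. A quick sketch, with an invented dictionary rather than real ROILA words:

        # Why a spread-out vocabulary is forgiving: snap a garbled hearing
        # to the most similar legal word. The dictionary and the misheard
        # strings are invented for illustration.
        import difflib

        VOCAB = ["bama", "fosit", "pito", "senu", "wetu"]

        def decode(heard):
            # Return the legal word most similar to what was heard.
            return difflib.get_close_matches(heard, VOCAB, n=1, cutoff=0.0)[0]

        print(decode("fosat"))   # deep voice, vowel misheard   -> fosit
        print(decode("fossit"))  # squeaky voice, extra sound   -> fosit

    Tone, sarcasm, and emotion are another matter entirely, as you say; no amount of vocabulary engineering helps with those.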

  7. Deacon wrote that language is a rigged game: it has evolved to coincide with our natural tendencies (so language itself is not innate, but plays upon innate patterns in brain function). It follows, then, that since we are as yet utterly unable to make a robot whose cognitive system functions like a human brain, we are equally unable to teach it a language like English, whose grammar has evolved to be easy for human children to learn.
    A quick tutorial on ROILA shows some features of the language that might be easier for a rule-based computer: it includes only five vowel sounds and eleven consonants, and in every word vowels alternate with consonants (imagine how difficult it might be for a robot to make out all the sounds in the word “twelfth”, especially since nobody bothers to pronounce the f). But though the words sound foreign (“Pito fosit jifi bubas” means “I walked to the house”), the language maintains English SVO word order.
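
    Here’s how small that sound system really is. Enumerating every two-syllable word under a strict consonant-vowel template (the phoneme lists below are my stand-ins, not the official inventory, and words like “fosit” show that ROILA also allows a final consonant):

        # All CVCV words over a hypothetical ROILA-like inventory. Strict
        # consonant-vowel alternation means clusters like the "lfth" in
        # "twelfth" simply cannot occur.
        from itertools import product

        CONSONANTS = "bfjklmnpstw"   # eleven consonants (stand-ins)
        VOWELS = "aeiou"             # five vowels

        words = ["".join(p) for p in product(CONSONANTS, VOWELS,
                                             CONSONANTS, VOWELS)]
        print(len(words))   # 11 * 5 * 11 * 5 = 3025 possible CVCV words
        print(words[:3])    # ['baba', 'babe', 'babi']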

  8. This is a very interesting subject! Robots depend on the inputs a person gives them in order to operate; they can’t recognize inputs that they haven’t been programmed with, so a robot couldn’t create a new word to talk about something new on its own.
    It would seem as though the robots are memorizing each “word” rather than understanding language on the three interconnected levels of icon, index, and symbol.
    I wondered whether there are synonyms in ROILA or whether each word has only one specific meaning/object, so I looked in the grammar section of the ROILA site and found some interesting things: there’s no passive voice, and verb tense is specified through a separate word marker. What really stuck out, though, was the way the word “museum” is translated in ROILA as “not new house”: this combination of words made me think of how Koko the gorilla is presumed to combine words she recognizes in order to describe something new, such as “water-bird” for duck. Perhaps the way the robots understand language is similar to the way other primates seem to.
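
    The separate tense marker is easy to mimic in code. Here’s a toy gloss function; the word meanings are guessed from the example sentence in the comment above (“Pito fosit jifi bubas” = “I walked to the house”), so treat them loosely:

        # Toy gloss of tense as a separate marker word rather than verb
        # inflection. Glosses are guesses from one example sentence and
        # may well be wrong.
        GLOSSES = {"pito": "I", "fosit": "walk", "bubas": "house"}
        TENSE_MARKERS = {"jifi": "past"}   # guessing "jifi" marks past tense

        def gloss(sentence):
            words = sentence.lower().split()
            tense = next((TENSE_MARKERS[w] for w in words
                          if w in TENSE_MARKERS), "present")
            content = [GLOSSES.get(w, w) for w in words
                       if w not in TENSE_MARKERS]
            return " ".join(content) + f" [{tense}]"

        print(gloss("Pito fosit jifi bubas"))   # -> I walk house [past]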

  9. Somewhat along the lines of the previous comment, I found it interesting that programmers have to compromise their own language in order to communicate effectively with their machines. Human and robot language systems may have some similarities in their capacity for data storage and for the most basic signifier-signified connections, but robots operate by trying to provide the “right” response to a given stimulus, while our cognition allows us to construct more subjective, albeit faulty and indefinable, systems of meaning (such as metaphors). Furthermore, humans always have one eye on the external, and may draw a plethora of information from observing their surroundings; robots communicate from within the confines of what they are programmed to know, and because of this their linguistic systems will remain static and one-dimensional. Given the intrinsic differences between robot and human perception, a new, simplified language seems the only compromise.

    ROILA reminded me of an article I read about the varying success of computational linguistics in deciphering Indus ciphers: http://www.archaeology.org/1003/abstracts/insider.html Neither the minds nor the machines set to work on this problem have arrived at a definitive translation yet, though the computational shortcomings may be of a different nature than the linguists’ dilemmas. The article discusses how some of the symbols may be pictographic; it would be folly to run iconic signs through a computer system that has no way of creatively interpreting its surroundings into data (like our human cognition can). It may be able to run countless mathematical formulas to test syntax and grammatical form, but the construction of an iconic language would always remain indecipherable.
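
    The “mathematical formulas” in that kind of work are largely statistical; one well-known example is measuring how predictable each sign is from the one before it, since language-like sequences sit somewhere between rigid repetition and pure randomness. A toy version, on an invented sign sequence:

        # Toy bigram conditional entropy: how many bits of surprise the next
        # sign carries given the current one. The "inscription" is invented;
        # real studies use corpora of actual inscriptions.
        import math
        from collections import Counter

        signs = list("ABABCABABDABABCABABD")   # stand-in sign sequence

        pairs = Counter(zip(signs, signs[1:]))
        firsts = Counter(signs[:-1])
        total = sum(pairs.values())

        # H(next|prev) = -sum over (a, b) of p(a, b) * log2 p(b|a)
        h = -sum((n / total) * math.log2(n / firsts[a])
                 for (a, b), n in pairs.items())
        print(f"conditional entropy: {h:.2f} bits")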

  10. I don’t think that a robot understands the language of its operators. It may process input and be able to provide an accurate output, but it does not really understand what it is saying. It does not have a brain and therefore cannot think. It cannot ponder different possible responses to a question; it simply provides the correct output for the input. Of course, it may not come up with the correct output on the first try, but it can learn which responses are correct by changing its connections. A robot is learning more by rote than through actual understanding; it is almost like operant conditioning: once the robot learns that a certain response is correct for a given input, it will continue to give that response for that input.
    Furthermore, because we do not fully understand how the human brain works, especially when it comes to language, we cannot build a brain for a robot. We can build something that approximates the functions of the brain, but it will never really be accurate until we have a better understanding of how our brains work. There is so much about the way language functions in the brain that we have yet to discover; how can we know if a series of ones and zeroes in a computer is accurate? We have no real map to compare it to. Thus, when a robot uses language it is merely approximating language: it is providing a desired, somewhat accurate result without actually understanding and processing language the way humans do.
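
    That operant-conditioning picture maps almost directly onto a reward-driven lookup table. A minimal sketch, with invented commands, responses, and reward rule:

        # Minimal rote learner: try responses at random until one is
        # rewarded, then repeat it forever. No understanding anywhere.
        # Commands, responses, and the reward rule are invented.
        import random

        RESPONSES = ["sit", "fetch", "stop"]
        learned = {}   # input -> response that earned a reward

        def act(command, reward_fn):
            if command in learned:
                return learned[command]           # conditioned response
            choice = random.choice(RESPONSES)     # trial and error
            if reward_fn(command, choice):
                learned[command] = choice         # reinforcement fixes it
            return choice

        reward = lambda cmd, resp: cmd == "bring" and resp == "fetch"
        for _ in range(20):
            act("bring", reward)
        print(learned)   # almost surely {'bring': 'fetch'} after 20 trials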
