If I say to you the word “university,” what do you think I’ll say next? You might not know exactly, but deep in your brain, unbeknownst to you, you’ve already just run some pretty sophisticated guessing software, and you’ve figured out that the next word out of my mouth is probably not going to be “rutabaga,” or “glasses,” or even “university.” I’m much more likely, you’ve deduced, to say “students,” or “officials,” or even “of” (as in “University of Washington”).
We may not be aware of it, but we do this all day, every day.
And that’s something that computers are only just starting to get, as researchers develop ways for voice-recognition software — from transcription services to Rosetta Stone to the voice command on your smartphone — to start thinking like humans. That is, to consider the odds of what you’ll say next. Programs that can adapt to the nuances of their particular user’s voice are getting more accurate, but machines are still daunted by continuous speech and florid vocabularies.
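To make "considering the odds" concrete, here is a minimal sketch of one simple way a program can guess the next word: count which words have followed which in text it has already seen. The four-sentence corpus and the function names below are invented for illustration. This is a generic bigram count, not the software De Palma's team built.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_odds(counts, word):
    """Return the estimated probability of each word seen after `word`."""
    following = counts.get(word.lower(), Counter())
    total = sum(following.values())
    if total == 0:
        return {}
    return {w: c / total for w, c in following.items()}

# Hypothetical toy corpus; a real system would train on far more text.
corpus = [
    "university students protested the tuition increase",
    "university officials announced the new budget",
    "university students returned to campus",
    "the university of washington opened a new lab",
]

model = train_bigrams(corpus)
odds = next_word_odds(model, "university")

# Print the candidates from most to least likely.
for word, p in sorted(odds.items(), key=lambda item: -item[1]):
    print(f"{word}: {p:.2f}")
```

In this toy corpus, "rutabaga" never follows "university," so it gets no odds at all, while "students," "officials" and "of" split the probability among them.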
Gonzaga professor Paul De Palma, along with colleagues at the University of New Mexico and the Spokane software company Next IT, thinks we can do better. De Palma, a computational linguist, uses computer algorithms to solve problems from voice recognition to bridge construction to graph theory.
“An algorithm is just a very precise step-by-step solution to a problem,” he says. “[Say] I want to make chocolate chip cookies. You go to Rosauers and you buy the little Toll House bag and you look at the back and you find the recipe: That’s an algorithm.”
But in De Palma’s case, of course, it’s much more sophisticated than that. His algorithms are complicated computer systems built to figure out complex problems.
Take the voice-recognition software. He’s using algorithms not only to get the software to understand what you’re saying but also to get it to guess more effectively what you’ll say next, which increases its accuracy (as other researchers have done before him).
“Our particular contribution to all of this is that instead of trying to match up sequences of words [with speech], we break those down further and match up sequences of syllables,” he says. “The reason for this is clear once you think about it. There are just fewer syllables in English than there are words, fewer places to go wrong.”
Using syllables rather than words makes the software 15 percent more accurate, he says, but even this approach produces mixed results.
“Unfortunately, with our system, what you’re left with is a sequence of more or less correct syllables rather than more or less correct words,” he admits. “So there has to be some way to get those back into usable form.”
Their latest effort digs even deeper into human language processing.
“We got this insight from what we think is the way that human beings actually think — that you take this acoustic signal that comes in and you map it to some sense of concepts that are floating around in your head,” he says, moving to a practical example: those automated travel reservation systems used by airlines. “If I said to you, ‘I wanna fly to Spokane,’ ‘I wanna book a ticket to Spokane,’ ‘I wanna go to Spokane,’ ‘I wanna return to Spokane.’ All of these things mean, in the context of a travel reservation system, ‘Go Spokane.’”
If his team can get the software to understand that all of these separate syllables and words are related to a unifying concept, he believes, then the software will have a smaller pool of syllables and words to consider (only those related to “Travel to Spokane,” in this example), increasing the probability that it will guess the right ones.
“The idea is we’ll have this set of concepts that we have constructed, themselves probabilistically, from collections of spoken words, and we will probabilistically map the syllables back into these concepts, and then we’ll send these concepts off to an existing language-understanding system,” he says.
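The idea of mapping syllables to a concept by probability can be sketched in a few lines. The article does not say which statistical method the team uses, so the example below swaps in a plain naive Bayes classifier as a stand-in. The hand-split syllables, the concept labels (`GO_SPOKANE`, `CANCEL`) and the training phrases are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-syllabified utterances, each labeled with a concept.
TRAINING = [
    (["i", "wan", "na", "fly", "to", "spo", "kane"], "GO_SPOKANE"),
    (["i", "wan", "na", "book", "a", "tick", "et", "to", "spo", "kane"], "GO_SPOKANE"),
    (["i", "wan", "na", "re", "turn", "to", "spo", "kane"], "GO_SPOKANE"),
    (["can", "cel", "my", "tick", "et"], "CANCEL"),
    (["i", "wan", "na", "can", "cel", "my", "flight"], "CANCEL"),
]

def train(examples):
    """Count syllables per concept and how often each concept occurs."""
    syllable_counts = defaultdict(Counter)
    concept_counts = Counter()
    vocab = set()
    for syllables, concept in examples:
        concept_counts[concept] += 1
        syllable_counts[concept].update(syllables)
        vocab.update(syllables)
    return syllable_counts, concept_counts, vocab

def most_likely_concept(syllables, model):
    """Pick the concept with the highest naive Bayes score (log space)."""
    syllable_counts, concept_counts, vocab = model
    total_examples = sum(concept_counts.values())
    best, best_score = None, -math.inf
    for concept, n in concept_counts.items():
        score = math.log(n / total_examples)
        # Add-one smoothing, with one extra slot for unseen syllables.
        denom = sum(syllable_counts[concept].values()) + len(vocab) + 1
        for s in syllables:
            score += math.log((syllable_counts[concept][s] + 1) / denom)
        if score > best_score:
            best, best_score = concept, score
    return best

model = train(TRAINING)
# "go" never appears in the training data, yet the other syllables
# still point toward the travel concept.
guess = most_likely_concept(["i", "wan", "na", "go", "to", "spo", "kane"], model)
print(guess)
```

The point of the sketch is that a phrasing the system has never heard ("I wanna go to Spokane") can still land on the right concept, because most of its syllables were seen in other phrasings of the same request.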
This isn’t even De Palma’s most interesting work. In a whole series of other projects, with Gonzaga profs Sara Ganzerli and Shannon Overbay, he uses “evolutionary” algorithms to solve problems. In an attempt to build the strongest bridge at the lowest cost, for example, his algorithms test 64 different models of 64-bar bridge trusses, kill off the weakest examples, and mate the strongest ones. The result is a “survival of the fittest” effect that yields more and more efficient bridge designs.
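The test, cull and mate loop can be sketched roughly as follows. Only the two numbers, 64 designs and 64 bars, come from the description above. The scoring function is a made-up toy that trades material cost against a penalty for undersized bars; it is not real structural analysis, and the crossover and mutation rules are generic textbook choices rather than the researchers' own.

```python
import random

BARS = 64        # bars per truss
POPULATION = 64  # candidate designs per generation

def fitness(design, loads):
    """Toy score (an assumption): cheaper is better, undersized bars are penalized."""
    cost = sum(design)
    shortfall = sum(max(0.0, load - area) for area, load in zip(design, loads))
    return -(cost + 10.0 * shortfall)

def mate(a, b, rng):
    """One-point crossover of two parents, plus a small random mutation."""
    cut = rng.randrange(1, BARS)
    child = a[:cut] + b[cut:]
    i = rng.randrange(BARS)
    child[i] = max(0.0, child[i] + rng.uniform(-0.1, 0.1))
    return child

def evolve(generations=50, seed=0):
    rng = random.Random(seed)
    # Hypothetical load each bar must carry, in arbitrary units.
    loads = [rng.uniform(0.5, 1.5) for _ in range(BARS)]
    population = [[rng.uniform(0.0, 3.0) for _ in range(BARS)]
                  for _ in range(POPULATION)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda d: fitness(d, loads), reverse=True)
        history.append(fitness(population[0], loads))
        # Kill off the weaker half, then refill by mating the survivors.
        survivors = population[: POPULATION // 2]
        children = [mate(*rng.sample(survivors, 2), rng)
                    for _ in range(POPULATION - len(survivors))]
        population = survivors + children
    population.sort(key=lambda d: fitness(d, loads), reverse=True)
    history.append(fitness(population[0], loads))
    return population[0], history

best, history = evolve()
print(f"best score at start: {history[0]:.2f}, at end: {history[-1]:.2f}")
```

Because the surviving half is carried over unchanged into the next generation, the best score in this sketch can never get worse from one generation to the next. Whether it improves quickly depends on the toy scoring rule and the random seed.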
But De Palma’s excitement shows when he talks about the linguistic research. Swiveling in his chair, he motions toward the huge, gleaming iMac on his desk and the small keyboard sitting prostrate at its pedestal.
“What we have here is this fabulous 21st-century electronic device, but it’s chained to a 19th-century input system,” he muses excitedly. “Wouldn’t it be nice if we could just speak to our computers and they would do what we ask them to do?
“This could be really great if this works. This could be just great.”