It makes sense to communicate with computers
The art of communication becomes a science when dealing with computers. PF-STAR’s speech and gesture databases and its virtual agents lay the foundations for future research in human-computer interaction and open up new approaches to machine-based communication.
Completed in September 2004, the IST project PF-STAR aimed to lay the foundations for future research efforts in Multilingual and Multisensorial Communication, or MMC for short. Over the project’s two-year term, researchers worked to develop a range of advanced technological baselines, comparative speech and non-verbal communication evaluations, as well as an assessment of the prospects in some key areas of technology.
Machines that can communicate like human beings?
Project coordinator Fabio Pianesi of the Istituto Trentino di Cultura in Italy explains MMC as follows: “It’s the kind of technology that you need if you want to communicate with the same facility to both the PC and other human beings. The PC needs to be capable of interpreting and reproducing your gestures and facial expressions, as well as the emotion expressed in your speech, in the same way as humans do.”
Interpreting such subtle visual and aural cues, as well as the meaning of the spoken word, is a highly complex business. Facial expression, gesture, and even variations in pitch and tone of the voice all play their part in the way human beings interact. We use and respond to such subtle elements of human communication in our day-to-day lives almost without being aware of it, since our training in such communication develops from birth.
The challenge for the researchers is how to get a machine to interpret and reproduce such communication subtleties. Linguists have for many years reckoned the task to be near impossible, given the number of channels and the complexity of the signals involved. However, PF-STAR’s work has provided a promising foundation on which future research can develop.
Virtual agents for intelligent interaction
The project partners in PF-STAR have built on several years of research within a variety of national and international projects, most notably NESPOLE!, C-STAR, Verbmobil and SmartKom. In PF-STAR, work focused on three key technological areas: speech-to-speech translation, the detection and expression of emotional states in both verbal and non-verbal channels, and core speech technologies for children. The partners also worked in five languages: English, German, Italian, Spanish and Swedish.
Two project partners, the Royal Institute of Technology (KTH) in Stockholm and the Istituto Trentino di Cultura, hired professional actors at the start of the project so that researchers could study how speech tone and facial expressions change when emotions are expressed. This data was then fed into the project databases, which led to the development of a series of on-screen facial images, or ‘talking heads’, that offered a machine-based visual alternative to the human face.
These on-screen talking heads, which could be either 2D or 3D facial images, are designed to act as ‘virtual agents’ that can interact intelligently with human beings, other agents or, depending on their level of autonomy, the environment around them. Such virtual agents are believed to have a huge potential for future man/machine communication, in applications from teaching through helpdesks to entertainment.
The project has also allowed for variations in facial expression resulting from cultural differences, says Pianesi. “We should not forget that the expression of emotion is culturally dependent. We had to adapt the expressions on the talking heads to the language concerned, to see how our hypotheses work in the different countries.”
Speech technologies for children were a key area of research for the participants. Error rates for machine-based translation of children’s speech are believed to be some 100 per cent greater than for adults, in other words roughly double. To help improve recognition of children’s speech, the partners used on-screen virtual agents based on children’s faces rather than on those of adults.
Strong foundation for future research
PF-STAR has laid strong foundations for further research into MMC, says Pianesi. “Two years ago there were no real databases available covering children’s speech, for example. Now we have such speech databases, as well as visual and gesture databases, that we are making available to partners and others.”
The project has also produced several new approaches to machine-based communication. The virtual agents, for example, are capable of reproducing the emotions expressed, either verbally or as facial expressions, along with the semantics of the message. They can be set to use both channels (verbal and non-verbal) or only one.
And the results are more than just data, stresses Pianesi. Since August 2004 the project has made available the databases, the platform and the software for constructing virtual agents, as well as the code to enable further development to be carried out.
Development continues
While PF-STAR is now complete, the project partners are maintaining their development work in the basic technology of machine-based translation. As well as further improving the virtual agents, they are continuing to distribute the technology to client organisations to gain vital feedback on its use. Some of the partners have also launched TC-STAR within the Sixth Framework Programme (FP6), a six-year project focused on exploring and evaluating new approaches to machine-based translation and on creating the infrastructure needed to accelerate progress in the field.
The area of children’s speech remains of particular interest, says Pianesi. “How can we develop interfaces for instruction, for entertainment and so on, that are suitable for children? How can we produce suitable outputs for children?” Certain partners have come together within another FP6 project, CHIL, to further research children’s communication in schools.
More Information:
http://istresults.cordis.lu/