Multimodal interaction: Humanizing the human-computer interface

Kouichi Katsurada is an associate professor at Toyohashi Tech’s Graduate School of Engineering with a mission to ‘humanize’ the computer interface. Katsurada’s research centers on the expansion of human-computer communication by means of a web-based multimodal interactive (MMI) approach employing speech, gesture and facial expressions, as well as the traditional keyboard and mouse.

“Although many MMI systems have been tried, few are widely used,” says Katsurada. “Some reasons for this lack of use are their complexity of installation and compilation, and their general inaccessibility for ordinary computer users. To resolve these issues we have designed a web browser-based MMI system that only uses open source software and de facto standards.”

This openness means the system can run in any web browser that handles JavaScript, Java applets and Flash, and it can be used not only on a PC but also on mobile devices such as smartphones and tablet computers.

The user can interact with the system by speaking directly with an anthropomorphic agent that employs speech recognition, speech synthesis and facial image synthesis.

For example, a user can recite a telephone number. The computer records the speech, and the browser sends the data to a session manager on the server that houses the MMI system. The speech recognition software processes the data and passes the result to a scenario interpreter, which uses XISL (Extensible Interaction Scenario Language) to manage the human-computer dialogue.
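The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `SessionManager`, `StubRecognizer` and `ScenarioInterpreter`, the digit-word lookup that stands in for real speech recognition, and the reply wording are all hypothetical and are not taken from Katsurada's system.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the speech recognition software: the "audio" is
# already a text transcript, and spoken digit words are mapped to digits.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}


class StubRecognizer:
    """Hypothetical recognizer: keeps only the words it knows as digits."""

    def recognize(self, utterance: str) -> str:
        words = utterance.lower().split()
        return "".join(DIGIT_WORDS[w] for w in words if w in DIGIT_WORDS)


class ScenarioInterpreter:
    """Hypothetical interpreter: picks the next system turn from a result."""

    def next_turn(self, recognized: str) -> str:
        if not recognized:
            return "Sorry, I did not catch a number."
        return f"You said {recognized}. Is that correct?"


@dataclass
class SessionManager:
    """Hypothetical server-side entry point that receives browser input."""

    recognizer: StubRecognizer
    interpreter: ScenarioInterpreter
    history: list = field(default_factory=list)

    def handle_input(self, utterance: str) -> str:
        # 1. Hand the recorded input to the recognizer.
        recognized = self.recognizer.recognize(utterance)
        # 2. Let the scenario interpreter decide how the dialogue continues.
        reply = self.interpreter.next_turn(recognized)
        # 3. Keep a record of the exchange for this session.
        self.history.append((utterance, recognized, reply))
        return reply


if __name__ == "__main__":
    manager = SessionManager(StubRecognizer(), ScenarioInterpreter())
    print(manager.handle_input("five five five one two three four"))
```

In a real deployment the recognizer and interpreter would be separate server components; the sketch only shows the order in which the data passes through them.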

“XISL is a multimodal interaction description language based on the XML markup language,” says Katsurada. “Its advantage over other MMI description languages is that it has sufficient modal extensibility to deal with various modes of communication without having to change its specifications. Another advantage is that it inherits features from VoiceXML, as well as SMIL used for authoring interactive audio-video presentations.”
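One way to picture this kind of modal extensibility is an interpreter that looks up a handler by modality name, so that a new mode of communication needs a new handler but no change to the document format. The Python sketch below is an assumption-laden illustration: the XML tag and attribute names (`scenario`, `input`, `modality`, `match`) are invented for the example and are not actual XISL syntax, and the handler registry is a hypothetical design, not the project's implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML scenario. The tag and attribute names are illustrative
# only and are NOT actual XISL syntax.
SCENARIO = """
<scenario>
  <input modality="speech" match="phone_number"/>
  <input modality="keyboard" match="phone_number"/>
</scenario>
"""

# Registry mapping a modality name to the function that handles it.
INPUT_HANDLERS = {}


def input_handler(modality):
    """Decorator that registers a handler for one modality name."""
    def register(func):
        INPUT_HANDLERS[modality] = func
        return func
    return register


@input_handler("speech")
def handle_speech(element):
    return f"listen for {element.get('match')}"


@input_handler("keyboard")
def handle_keyboard(element):
    return f"read typed {element.get('match')}"


def plan_inputs(scenario_xml):
    """Return one planned action per <input> element in the scenario."""
    root = ET.fromstring(scenario_xml)
    plan = []
    for element in root.iter("input"):
        modality = element.get("modality")
        handler = INPUT_HANDLERS.get(modality)
        if handler is None:
            # Unknown modalities are reported rather than treated as errors.
            plan.append(f"skip unsupported modality {modality!r}")
        else:
            plan.append(handler(element))
    return plan


if __name__ == "__main__":
    for step in plan_inputs(SCENARIO):
        print(step)
```

Supporting a further modality in this sketch, such as gesture, would mean registering one more handler function; the parsing code and the scenario format stay the same.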

On the downside, XISL requires authors to use a large number of parameters for describing individual input and output tags, making it a cumbersome language to use. “In order to solve this problem, we will provide a GUI-prototyping tool that will make it easier to write XISL documents,” says Katsurada.

“Currently, we can use some voice commands and the keyboard with the system, and in the future we will add both touch and gestures for devices equipped with touch displays and cameras,” says Katsurada. “In other words, our aim is to make interaction with the computer as natural as possible.”
