Disaggregate Consulting
To verify someone's identity, experts recommend a three-fold test: "something you know, something you have, and something you are." Most computer applications, particularly those that work over the Internet, can use only "something you know": a password.
In the November 2002 issue of Dr. Dobb's Journal (registration required), I show how to use VoiceXML to provide authentication through "biometrics," the measurement of "something you are."
The demonstration combines VoiceXML with standard web-based services. In conjunction with a telephone call to your cell phone ("something you have") and a request for a password ("something you know"), voice biometrics can complete the security triad.
Download the source code (report broken links), which shows how to use VoiceXML for voice biometrics. This package uses publicly accessible VoiceXML servers, publicly accessible telephony servers to control the telephone network from the Internet, and publicly accessible voice biometrics; you do not need any special hardware or software. At present, the package does not include documentation; see the Dr. Dobb's article.
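The way the three factors combine can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the code from the article or the download: the names `FactorResults` and `is_authenticated`, the 0.0 to 1.0 voice score, and the 0.8 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class FactorResults:
    """Outcome of each identity check (all field names are hypothetical)."""
    knows_password: bool   # "something you know": the password was correct
    answered_phone: bool   # "something you have": the call to the cell phone was answered
    voice_score: float     # "something you are": assumed match score from 0.0 to 1.0


def is_authenticated(results: FactorResults, voice_threshold: float = 0.8) -> bool:
    """Accept the caller only when all three factors pass.

    The threshold is an assumed value; a real voice biometrics service
    would define its own scoring scale and recommended cutoff.
    """
    return (
        results.knows_password
        and results.answered_phone
        and results.voice_score >= voice_threshold
    )


if __name__ == "__main__":
    caller = FactorResults(knows_password=True, answered_phone=True, voice_score=0.91)
    print("authenticated" if is_authenticated(caller) else "rejected")
```

Requiring every factor to pass, rather than any one of them, is what makes the test a triad: a stolen password alone or a stolen phone alone is not enough.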
Voice2IM is a package that ties VoiceXML to Instant Messaging (IM) to produce a multimodal user interface: a user interface that lets the user choose among different modes. The input modes are speech and text, and the output modes are voice and text.
Example: In this demonstration package, the user is a business traveler who calls a call center to change travel plans; the user also has a wireless device connected to the Internet (e.g., a Palm/cellphone combination). The automated call center sends voice and text to the user, and the user can either speak or write his choice: an integrated, multimodal experience.
Multimodal interfaces are particularly useful for call centers. Agents (whether human or automated) are spared the task of reading long lists of information to callers, the user experience improves as the user receives complicated information in written form, transactions are more accurate, and the overall cost of the transaction drops.
This package uses publicly accessible VoiceXML servers, publicly accessible telephony servers to control the telephone network from the Internet, and publicly accessible Instant Messaging servers; you do not need your own servers (i.e., you do not need specialized hardware or software).
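To make the idea concrete, here is a minimal Python sketch of one multimodal prompt. It is not taken from Voice2IM; the class `MultimodalPrompt` and its methods are hypothetical names, and the sketch assumes the speech recognizer and the IM channel both deliver the user's reply as a plain string.

```python
from typing import List, Optional


class MultimodalPrompt:
    """One question offered over two output modes (voice and text)."""

    def __init__(self, question: str, choices: List[str]) -> None:
        self.question = question
        self.choices = choices

    def render_voice(self) -> str:
        # The spoken form stays short; the full list goes out as text.
        return f"{self.question} I have sent the choices to your device."

    def render_text(self) -> str:
        # The written form carries the complete numbered list.
        lines = [self.question]
        lines += [f"{i}. {choice}" for i, choice in enumerate(self.choices, start=1)]
        return "\n".join(lines)

    def resolve(self, reply: str) -> Optional[str]:
        """Map a spoken or typed reply to one of the choices, or None."""
        cleaned = reply.strip().lower()
        if cleaned.isdecimal():
            index = int(cleaned) - 1
            return self.choices[index] if 0 <= index < len(self.choices) else None
        for choice in self.choices:
            if choice.lower() == cleaned:
                return choice
        return None


if __name__ == "__main__":
    prompt = MultimodalPrompt("Which flight would you like?", ["Morning", "Evening"])
    print(prompt.render_voice())
    print(prompt.render_text())
    print(prompt.resolve("2"))        # typed reply by number
    print(prompt.resolve("morning"))  # recognized speech by name
```

Because `resolve` treats both input modes the same way, the caller can answer by voice or by text at any step without the application caring which one was used.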
Dr. Dobb's Journal published my article about this technology in the January, 2004 edition. This unpublished sidebar explains a bit more about multimodal systems and provides some brief hints about successful speech user interfaces.
Download Voice2IM (report broken links).
For the pedantic:
This is a demonstration of the multimodal user interface, built around the concept of an automated travel agency. The demonstration is built on the Voice2IM package that is described earlier on this page. The demonstration shows how to combine different modes of input and output: speech recognition, text-to-speech, and text.
You can try this demonstration, but first, a brief reminder. The service is hosted on remote servers, and every once in a while remote servers can be offline, or even reconfigured without warning. If the demo fails, please contact me so I can repair the damage!
Because of some connection problems with the Jabber server, I now test the connection every 10 minutes.
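A periodic check of this kind might look like the following Python sketch. It is an assumption about how such a test could work, not the script used on this site: it only confirms that a TCP connection can be opened, the function names are hypothetical, and the default port 5222 is the standard Jabber/XMPP client port.

```python
import socket
import time
from typing import List


def server_is_reachable(host: str, port: int = 5222, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def monitor(host: str, port: int = 5222, interval_seconds: float = 600.0,
            checks: int = 3) -> List[bool]:
    """Run a fixed number of checks, pausing interval_seconds between them.

    The default interval of 600 seconds matches the 10-minute schedule.
    """
    results = []
    for attempt in range(checks):
        results.append(server_is_reachable(host, port))
        if attempt < checks - 1:
            time.sleep(interval_seconds)
    return results


if __name__ == "__main__":
    # "jabber.example.org" is a placeholder host, not the server used by the demo.
    status = "up" if server_is_reachable("jabber.example.org") else "down"
    print(f"In the latest test, the Jabber server is {status}.")
```

A successful TCP connection does not prove that the server will accept a login, so a stricter test would go on to open an actual Jabber session.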
To proceed with the demo, please click here.
Site and contents © 2001, 2002 Moshe Yudkowsky
Last updated 2002-06-25