A dictation application requires certain hardware and software on the user's computer. Not all computers have the memory, speed, microphone, or speakers required to support speech, so it is a good idea to design the application so that speech is optional.
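As a minimal sketch of keeping speech optional, an application might check what the machine offers before enabling the speech interface. The `SystemCapabilities` type, the `MIN_MEMORY_MB` threshold, and the function names below are hypothetical illustrations, not part of any speech API; a real engine documents its own requirements.

```python
from dataclasses import dataclass


@dataclass
class SystemCapabilities:
    """Hypothetical summary of what the user's computer provides."""
    has_microphone: bool
    has_speakers: bool
    has_speech_engine: bool
    memory_mb: int


# Assumed threshold for illustration only; consult the engine's documentation.
MIN_MEMORY_MB = 64


def speech_available(caps: SystemCapabilities) -> bool:
    """Return True only if every requirement for speech is met."""
    return (
        caps.has_microphone
        and caps.has_speakers
        and caps.has_speech_engine
        and caps.memory_mb >= MIN_MEMORY_MB
    )


def choose_input_modes(caps: SystemCapabilities) -> list:
    """Keyboard and mouse always work; speech is added only when supported."""
    modes = ["keyboard", "mouse"]
    if speech_available(caps):
        modes.append("speech")
    return modes
```

With this shape, the application still works fully on a machine without a microphone; speech is simply left out of the available input modes.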
These hardware and software requirements should be considered when designing a speech application:
Even the most sophisticated speech recognition engine has limitations that affect what it can recognize and how accurate the recognition will be. The following list illustrates many of the limitations found today. The limitations do pose some problems, but they do not prevent the design and development of savvy applications that use dictation.
Microphones and sound cards
The microphone is the largest problem that speech recognition encounters. Microphones inherently have the following problems:
Most applications can do little about the microphone. One way that vendors can deal with this is to test and verify the user's microphone setup as part of the installation of any speech component software. Software to test a user's microphone can be delivered along with other components to ensure that the user can periodically test and adjust the microphone and configuration.
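A microphone test of the kind described above could, for example, record a short sample and check its signal level. The sketch below assumes the samples are already available as signed 16-bit PCM integers; the function name and every threshold are illustrative assumptions, not values taken from any speech engine.

```python
import math


def check_microphone_levels(samples, too_quiet_rms=500.0,
                            clip_level=32000, max_clip_fraction=0.01):
    """Classify a recording as 'no-signal', 'too-loud', 'too-quiet', or 'ok'.

    samples: sequence of signed 16-bit PCM values (assumed format).
    All thresholds are hypothetical defaults chosen for illustration.
    """
    if not samples:
        return "no-signal"

    # Root-mean-square amplitude approximates the overall loudness.
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))

    # Samples near the 16-bit limit indicate clipping (input gain too high).
    clipped = sum(1 for s in samples if abs(s) >= clip_level)
    if clipped / len(samples) > max_clip_fraction:
        return "too-loud"

    if rms < too_quiet_rms:
        return "too-quiet"
    return "ok"
```

An installer could run a check like this once, and the same routine could be offered later from a settings screen so the user can retest after moving or replacing the microphone.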
Most users of dictation will wear close-talk microphones for maximum accuracy. Close-talk microphones have the best characteristics for speech recognition, and they alleviate a number of the problems that weaker user microphones cause in Command and Control recognition.
Speech Recognizers Make Mistakes
Speech recognizers make mistakes, and always will. What is changing is the rate: every two years, recognizers make half as many mistakes as they did before. But no matter how good a recognizer is, it will still make mistakes.
To make matters worse, dictation engines produce misrecognitions that are correctly spelled and often grammatically correct, but mean nothing. Sometimes the misrecognitions mean something completely different from what the user intended. These errors illustrate some of the complexity of speech communication, particularly because people are not accustomed to attributing strange wording to speech errors.
To minimize some of the misrecognitions, an application can:
Is it a Command?
When speech recognition is listening for dictation, users will often want to interject commands such as "cross-out" to delete the previous word or "capitalize-that". Applications should make sure that:
Finite Number of Words
Speech recognizers listen for a vocabulary of 20,000 to 100,000 words. As a result, one out of every fifty words a user speaks is not recognized because it falls outside the vocabulary supported by the engine.
Applications can reduce the error rate of an engine by telling the engine which words to expect.
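To make the arithmetic concrete: one word in fifty is a 2% out-of-vocabulary rate. The sketch below measures that rate for a piece of text and shows how it drops once application-specific words are added. The tiny vocabulary, the sample sentence, and the function name are invented for illustration, and the set union merely stands in for whatever mechanism a real engine offers for registering expected words.

```python
def out_of_vocabulary_rate(words, vocabulary):
    """Fraction of spoken words that are missing from the vocabulary."""
    if not words:
        return 0.0
    missing = sum(1 for w in words if w.lower() not in vocabulary)
    return missing / len(words)


# Toy engine vocabulary (a real one holds 20,000 to 100,000 words).
engine_vocabulary = {"please", "send", "the", "report", "to"}

# Hypothetical dictated sentence containing a name the engine does not know.
dictated = "Please send the report to Contoso".split()

before = out_of_vocabulary_rate(dictated, engine_vocabulary)

# The application supplies the words it expects the user to say.
app_words = {"contoso"}
after = out_of_vocabulary_rate(dictated, engine_vocabulary | app_words)

print(f"before: {before:.2%}, after: {after:.2%}")
```

Names, product terms, and field-specific jargon are the usual candidates for such a list, since the application often knows them before the user speaks.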
Other Problems
Some other problems crop up:
Here are some design considerations for applications using command and control speech recognition.
Design Speech Recognition in From the Start
Don't make the mistake of adding speech recognition to your application as an afterthought. Applications designed only for the keyboard and mouse get little benefit from speech recognition. The speech interface is at a point similar to where the mouse interface was when applications were designed for keyboard input only: not until applications were deliberately designed for the mouse did it prove generally effective for user input.
Do Not Replace the Keyboard and Mouse
Most dictation systems provide discrete dictation, allowing users to speak up to 50 words per minute. While this is faster than hunt-and-peck typing, touch typists can type at least 70 words per minute, so touch typists will not use discrete dictation. Continuous dictation allows up to 120 words per minute.
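A quick calculation shows what these rates mean for a document of a given length. The 1,000-word document is an assumed example, and the figures treat the rates quoted above as sustained speeds, ignoring any time spent correcting misrecognitions.

```python
def minutes_to_enter(word_count, words_per_minute):
    """Time in minutes to enter a document at a steady input rate."""
    return word_count / words_per_minute


# Rates taken from the text above, in words per minute.
rates = {
    "discrete dictation": 50,
    "touch typing": 70,
    "continuous dictation": 120,
}

document_words = 1000  # assumed document length for illustration

for method, wpm in rates.items():
    print(f"{method}: {minutes_to_enter(document_words, wpm):.1f} minutes")
```

Because recognition errors must be corrected by hand or by voice, the effective dictation rates in practice will be lower than these best-case numbers.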
Communicate Speech Awareness
Since most applications today do not include speech recognition, users will find speech recognition a new technology. They probably won't assume that your application has it, and won't know how to use it.
When you design a speech recognition application, it is important to communicate to the user that your application is speech-aware and to provide him or her with the commands it understands. It is also important to provide command sets that are consistent and complete.
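One way to provide users with the commands an application understands is to keep them in a single table and generate the help text from it, so the displayed list cannot drift from what is actually recognized. The sketch below is hypothetical: the two phrases come from the examples earlier in this document, while the description given for "capitalize-that" and the function name are assumptions.

```python
# Single source of truth for spoken commands and what they do.
COMMANDS = {
    "cross-out": "Delete the previous word",
    # Assumed meaning; the document names the command without defining it.
    "capitalize-that": "Capitalize the previous word",
}


def format_command_help(commands):
    """Return one help line per command, sorted by spoken phrase."""
    return [f'"{phrase}": {description}'
            for phrase, description in sorted(commands.items())]


for line in format_command_help(COMMANDS):
    print(line)
```

The same table can feed both the recognizer's command list and an on-screen list of available phrases, which helps keep the command set consistent and complete.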
Manage User Expectations
Users will often have the expectation that speech-enabled applications will provide a level of comprehension and interaction comparable to the futuristic speech-enabled computers of Star Trek and 2001: A Space Odyssey. Some users will expect the computer to correctly transcribe every word that they speak, understand it, and then act upon it in an intelligent manner.
You should convey as clearly as possible exactly what an application can and cannot do and emphasize that the user should speak clearly, using words the application understands.
Where the Engine Comes From
If an application implements speech recognition, it can work on an end user's PC only if the system has a speech recognition engine installed on it. The application has two choices: