How to Configure Microsoft Speech Engine for Speech-To-Text


How to use the Speech-to-Text feature in Camtasia.


Camtasia's Speech-to-Text feature uses the Microsoft Speech Engine to convert the audio in a presentation into captions. Follow the process below to configure the feature.


The Microsoft Speech Engine is already installed in Windows 7, 8, and 10, so there is no need to install it again. After you install Camtasia, the speech recognition features are ready to use. To find Speech-to-Text, open Captions and select the gear icon. The option is only available if there is audio on the timeline.

Available Languages

  • US English
  • UK English
  • German
  • French
  • Spanish
  • Japanese
  • Traditional Chinese
  • Simplified Chinese

Complete Voice Training in Camtasia

Before using the Speech-to-Text feature, complete the following training so that speech recognition is successful.

  • Train your computer to understand your voice
  • Set up your microphone
  • Add words to the speech recognition dictionary

Complete all of the steps that are necessary. Once the training is complete, you do not need to train again. You may export and then import the profile to reuse the training information on different logins or computers. Users can have more than one profile for each login.

Tips to Improve the Accuracy of the Speech Engine

  • In Settings > Time & Language > Speech, you may also find these methods useful:
    • Accuracy improves with training and audio quality. The best accuracy requires 4-5 hours of training. The more you train your computer, the better the results.
    • The speech engine has no acoustic model or audio quality settings. However, on a Windows XP machine, you can set the balance between recognition quality and recognition speed.
    • Use a decent quality microphone and configure the microphone properly.
    • Choose a speech recognizer that best matches your accent (e.g. US vs. UK for English).
  • Use the best speech recognizer available. For example, on Windows XP, you may install Speech Recognizer 6.1 instead of the default public domain version, Speech Recognizer 5.1.
  • Custom words can be added to a user’s dictionary by entering the word as text and then speaking it (e.g. you can explicitly teach the system how you pronounce the word “Camtasia”).
  • Use the correct training profile when running speech recognition.
  • Record or dictate in a quiet environment and speak at your normal pace.
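
For readers who script their setup, the tip about choosing a recognizer that matches your accent can be sketched in Python. This is only an illustration: `pick_recognizer` is a hypothetical helper, not part of Camtasia or the Microsoft Speech API, and the locale tags paired with the languages listed earlier are assumptions (Spanish, for example, may be installed under a different regional tag on your machine).

```python
# Hypothetical sketch: choose the installed recognizer closest to a preferred locale.
# The locale tags below are assumed pairings for the languages listed in this
# article; they are not taken from Camtasia or the Microsoft Speech API.
SUPPORTED_LOCALES = {
    "en-US": "US English",
    "en-GB": "UK English",
    "de-DE": "German",
    "fr-FR": "French",
    "es-ES": "Spanish",
    "ja-JP": "Japanese",
    "zh-TW": "Traditional Chinese",
    "zh-CN": "Simplified Chinese",
}

# Languages where falling back to another region is a poor substitute
# (Traditional and Simplified Chinese use different scripts).
NO_REGION_FALLBACK = {"zh"}


def pick_recognizer(preferred, installed):
    """Return the installed locale closest to `preferred`, or None if none fits."""
    by_lower = {loc.lower(): loc for loc in installed}

    # 1. Prefer an exact match on language and region (case-insensitive).
    exact = by_lower.get(preferred.lower())
    if exact is not None:
        return exact

    # 2. Otherwise accept the first recognizer for the same language,
    #    for example US English when UK English is not installed.
    language = preferred.split("-")[0].lower()
    if language in NO_REGION_FALLBACK:
        return None
    for loc in installed:
        if loc.split("-")[0].lower() == language:
            return loc
    return None


if __name__ == "__main__":
    choice = pick_recognizer("en-GB", ["de-DE", "en-US"])
    print(choice, "->", SUPPORTED_LOCALES.get(choice, "unknown"))
```

The exact match is tried first because a recognizer for your own region generally handles your accent best. The same-language fallback is a compromise for when that recognizer is not installed.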

You may also install Microsoft language packs to obtain speech engines for other languages.

For more information on the Microsoft Speech API, see this article.