How to use Speech-To-Text in Camtasia

Updated 9 months ago

Problem

How to use the Speech-to-Text feature in Camtasia.

Solution

Camtasia 2024 and later utilize our new captioning functionality powered by OpenAI's Whisper technology (TechSmith AI policy). To use this new method, go to Captions > Speech-to-Text and Camtasia will automatically transcribe your content.

Microsoft Speech Engine method (discontinued)

Legacy versions of Camtasia have a feature called Speech-To-Text which utilizes Microsoft Speech Engine to convert the audio in the presentation into captions. Follow the process below to configure the feature.

Installation

Microsoft Speech Engine is already installed in Windows 7, 8, 10, and 11, so there is no need to install the engine again. After installing Camtasia, the speech recognition features will be ready to use. The Speech-to-Text option can be found within Captions by selecting the gear icon. It is only available if there is audio on the timeline.

Available Languages

US English
UK English
German
French
Spanish
Japanese
Traditional Chinese
Simplified Chinese

Complete Voice Training in Camtasia

Before using the Speech-to-Text feature, complete the following training so that speech recognition can work successfully.

Train your computer to understand your voice
Set up your microphone
Add words to the speech recognition dictionary

Complete all the steps that are necessary. Once the training is complete, you do not need to train again. You may export and then import the profile to reuse the training data on different logins or computers. Each login can have more than one profile.

Tips to Improve the Accuracy of the Speech Engine

The following tips may also help. The Windows speech settings are found under Settings > Time & Language > Speech.
- Accuracy improves with training and audio quality. Best accuracy requires 4-5 hours of training. The more you train your computer, the better the results.
- The speech engine has no acoustic model or audio quality settings. On a Windows XP machine, however, you can set the balance between recognition quality and recognition speed.
- Use a decent quality microphone and configure the microphone properly.
- Choose a speech recognizer that best matches your accent (e.g. US vs. UK for English).
- Use the best speech recognizer available. For example, on XP, you can install Speech Recognizer 6.1 instead of the default public domain version, Speech Recognizer 5.1.
- Custom words can be added to a user’s dictionary by giving the system the written word and then speaking it (e.g. you can explicitly teach the system how you pronounce the word “Camtasia”).
- Use the correct training profile when running speech recognition.
- Record or dictate in a quiet environment and speak at your normal speed.

You may also install Microsoft language packs to obtain speech engines for other languages.

For more information on the Microsoft Speech API, see this article.