Posted on Dec 1, 2020
On Google Translate, you can not only translate text from one language to another but also listen to the translation. For example, this English-to-Chinese translation lets you hear the pronunciation of the Chinese text.
Google’s Cloud Text-to-Speech API lets you programmatically generate MP3s of any text. Below are the steps to do it on Windows using PHP.
1. Follow These Instructions
2. Set Environment Variable
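When you follow the steps above, you will download a JSON file containing your credentials. You need to point the GOOGLE_APPLICATION_CREDENTIALS environment variable at that file by opening a command prompt and entering set GOOGLE_APPLICATION_CREDENTIALS= followed by the full path to the JSON file.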
You can then verify it is set by typing “set”.
That environment variable is temporary and lasts only for the current command prompt session. To set the environment variable permanently, follow these steps.
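Before going further, it can help to confirm that PHP itself sees the variable and that the file it points to is readable. The following is a minimal sketch, not part of Google's instructions; describeCredentials is a hypothetical helper name chosen for illustration.

```php
<?php
// Hypothetical helper: reports the state of the credentials variable.
// $value is whatever getenv() returned (a string, or false when unset).
function describeCredentials($value): string
{
    if ($value === false || $value === '') {
        return 'not set';
    }
    if (!is_file($value) || !is_readable($value)) {
        return 'set, but file not found or unreadable';
    }
    // The key file is expected to be JSON, so a failed decode is a warning sign.
    $decoded = json_decode(file_get_contents($value), true);
    return is_array($decoded) ? 'ok' : 'set, but file is not valid JSON';
}

echo 'GOOGLE_APPLICATION_CREDENTIALS: '
    . describeCredentials(getenv('GOOGLE_APPLICATION_CREDENTIALS'))
    . PHP_EOL;
```

Run it from the same command prompt in which you set the variable. A new prompt will report "not set" unless the variable was made permanent.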
3. Install PHP Composer
Composer needs a php.ini file. If one doesn’t exist, the Composer installer will create one.
4. Update php.ini
To ensure your SSL certificates are up-to-date, download the latest cacert.pem from https://curl.haxx.se/ca/cacert.pem. Then, edit php.ini as follows:
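To check that the edit took effect, you can ask PHP which php.ini it loaded and which CA bundle it is configured to use. The sketch below assumes the relevant directives are curl.cainfo and openssl.cafile, which are PHP's standard settings for a CA bundle path. Whether your setup needs one or both is an assumption to verify against your own php.ini. caBundleSettings is a hypothetical helper name.

```php
<?php
// Returns the CA-bundle settings PHP currently sees (null when unset).
function caBundleSettings(): array
{
    $settings = [];
    foreach (['curl.cainfo', 'openssl.cafile'] as $directive) {
        $value = ini_get($directive);
        // ini_get() returns false for an unknown directive,
        // for example when the matching extension is not loaded.
        $settings[$directive] = ($value === false || $value === '') ? null : $value;
    }
    return $settings;
}

echo 'Loaded php.ini: ' . (php_ini_loaded_file() ?: 'none') . PHP_EOL;
foreach (caBundleSettings() as $directive => $path) {
    if ($path === null) {
        echo $directive . ' is not set' . PHP_EOL;
    } else {
        $state = is_file($path) ? 'file exists' : 'file missing';
        echo $directive . ' = ' . $path . ' (' . $state . ')' . PHP_EOL;
    }
}
```

If the reported php.ini is not the one you edited, PHP is reading its configuration from a different location.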
5. Create PHP Script
Copy and paste the example code from the instructions in step 1. This is a PHP script, so wrap the code in <?php … ?>. Save it as test-text-to-speech.php in a folder of your choice.
6. Run PHP Script
At the command prompt, verify that the Google environment variable is set, then run the PHP script. If PHP is in your path, you can run, for example, php test-text-to-speech.php from the folder where you saved the script.
This will output an audio file (output.mp3) in the same folder. By default, the text is “Hello, world!” and the language code is en-US. You can change the text to Chinese, for example: 这是一个测试 and change the language code accordingly to cmn-CN. Then, you’ll get the same speech as what you hear in Google Translate.
There are two types of voice synthesis models: standard (parametric) and WaveNet. WaveNet is more expensive but sounds more natural.
Google Text-to-Speech pricing is based on the number of characters processed.
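Because billing is per character, a rough estimate only needs a character count and a rate. The sketch below is an approximation: it counts Unicode characters rather than bytes, and the rate is passed in as a parameter instead of being hard-coded. The value 4.0 in the usage lines is a placeholder for illustration, not a quoted price, and countCharacters and estimateCost are hypothetical helper names. The official pricing page remains the authority on rates and on exactly what is counted.

```php
<?php
// Counts Unicode characters (code points) in a UTF-8 string.
// Uses preg_match_all so the mbstring extension is not required.
function countCharacters(string $text): int
{
    $count = preg_match_all('/./su', $text);
    if ($count === false) {
        throw new InvalidArgumentException('Text is not valid UTF-8.');
    }
    return $count;
}

// Estimates the cost of synthesizing $text.
// $ratePerMillionChars is an assumption supplied by the caller.
function estimateCost(string $text, float $ratePerMillionChars): float
{
    return countCharacters($text) / 1000000 * $ratePerMillionChars;
}

$text = '这是一个测试';
echo countCharacters($text) . ' characters' . PHP_EOL;
echo 'Estimated cost at the placeholder rate: ' . estimateCost($text, 4.0) . PHP_EOL;
```

Counting characters instead of bytes matters for Chinese text, where each character occupies several bytes in UTF-8.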
For additional customization of text to speech, use Speech Synthesis Markup Language (SSML).
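As an illustration of what SSML input looks like, the sketch below joins a few sentences into one SSML document with a pause between them. It uses only the standard speak and break elements. buildSsml is a hypothetical helper, and the 500 ms default pause is an arbitrary choice. The resulting string would be passed to the API as SSML input in place of plain text.

```php
<?php
// Hypothetical helper: wraps sentences in <speak> and separates them
// with a <break> of $pauseMs milliseconds.
function buildSsml(array $sentences, int $pauseMs = 500): string
{
    $parts = [];
    foreach ($sentences as $sentence) {
        // Escape &, <, > and quotes so the text cannot break the XML markup.
        $parts[] = htmlspecialchars($sentence, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    }
    $break = '<break time="' . $pauseMs . 'ms"/>';
    return '<speak>' . implode($break, $parts) . '</speak>';
}

echo buildSsml(['Hello, world!', 'Tom & Jerry']) . PHP_EOL;
// Prints: <speak>Hello, world!<break time="500ms"/>Tom &amp; Jerry</speak>
```

Escaping matters because SSML is XML: an unescaped ampersand or angle bracket in the text would make the document invalid.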