public class TextToSpeech
extends com.ibm.cloud.sdk.core.service.BaseService
For speech synthesis, the service supports a synchronous HTTP Representational State Transfer (REST) interface. It also supports a WebSocket interface that accepts both plain text and SSML input, including the SSML `<mark>` element and word timings. SSML is an XML-based markup language that provides text annotation for speech-synthesis applications.
The service also offers a customization interface. You can use the interface to define sounds-like or phonetic translations for words. A sounds-like translation consists of one or more words that, when combined, sound like the word. A phonetic translation is based on the SSML phoneme format for representing a word. You can specify a phonetic translation in standard International Phonetic Alphabet (IPA) representation or in the proprietary IBM Symbolic Phonetic Representation (SPR). The Arabic, Chinese, Dutch, and Korean languages support only IPA.
Constructor | Description |
---|---|
`TextToSpeech()` | Constructs a new `TextToSpeech` client using the DEFAULT_SERVICE_NAME. |
`TextToSpeech(com.ibm.cloud.sdk.core.security.Authenticator authenticator)` | Constructs a new `TextToSpeech` client with the DEFAULT_SERVICE_NAME and the specified Authenticator. |
`TextToSpeech(java.lang.String serviceName)` | Constructs a new `TextToSpeech` client with the specified serviceName. |
`TextToSpeech(java.lang.String serviceName, com.ibm.cloud.sdk.core.security.Authenticator authenticator)` | Constructs a new `TextToSpeech` client with the specified Authenticator and serviceName. |
Modifier and Type | Method | Description |
---|---|---|
`com.ibm.cloud.sdk.core.http.ServiceCall<java.lang.Void>` | `addWord(AddWordOptions addWordOptions)` | Add a custom word. |
`com.ibm.cloud.sdk.core.http.ServiceCall<java.lang.Void>` | `addWords(AddWordsOptions addWordsOptions)` | Add custom words. |
`com.ibm.cloud.sdk.core.http.ServiceCall<VoiceModel>` | `createVoiceModel(CreateVoiceModelOptions createVoiceModelOptions)` | Create a custom model. |
`com.ibm.cloud.sdk.core.http.ServiceCall<java.lang.Void>` | `deleteUserData(DeleteUserDataOptions deleteUserDataOptions)` | Delete labeled data. |
`com.ibm.cloud.sdk.core.http.ServiceCall<java.lang.Void>` | `deleteVoiceModel(DeleteVoiceModelOptions deleteVoiceModelOptions)` | Delete a custom model. |
`com.ibm.cloud.sdk.core.http.ServiceCall<java.lang.Void>` | `deleteWord(DeleteWordOptions deleteWordOptions)` | Delete a custom word. |
`com.ibm.cloud.sdk.core.http.ServiceCall<Pronunciation>` | `getPronunciation(GetPronunciationOptions getPronunciationOptions)` | Get pronunciation. |
`com.ibm.cloud.sdk.core.http.ServiceCall<Voice>` | `getVoice(GetVoiceOptions getVoiceOptions)` | Get a voice. |
`com.ibm.cloud.sdk.core.http.ServiceCall<VoiceModel>` | `getVoiceModel(GetVoiceModelOptions getVoiceModelOptions)` | Get a custom model. |
`com.ibm.cloud.sdk.core.http.ServiceCall<Translation>` | `getWord(GetWordOptions getWordOptions)` | Get a custom word. |
`com.ibm.cloud.sdk.core.http.ServiceCall<VoiceModels>` | `listVoiceModels()` | List custom models. |
`com.ibm.cloud.sdk.core.http.ServiceCall<VoiceModels>` | `listVoiceModels(ListVoiceModelsOptions listVoiceModelsOptions)` | List custom models. |
`com.ibm.cloud.sdk.core.http.ServiceCall<Voices>` | `listVoices()` | List voices. |
`com.ibm.cloud.sdk.core.http.ServiceCall<Voices>` | `listVoices(ListVoicesOptions listVoicesOptions)` | List voices. |
`com.ibm.cloud.sdk.core.http.ServiceCall<Words>` | `listWords(ListWordsOptions listWordsOptions)` | List custom words. |
`com.ibm.cloud.sdk.core.http.ServiceCall<java.io.InputStream>` | `synthesize(SynthesizeOptions synthesizeOptions)` | Synthesize audio. |
`okhttp3.WebSocket` | `synthesizeUsingWebSocket(SynthesizeOptions synthesizeOptions, SynthesizeCallback callback)` | Synthesize audio. |
`com.ibm.cloud.sdk.core.http.ServiceCall<java.lang.Void>` | `updateVoiceModel(UpdateVoiceModelOptions updateVoiceModelOptions)` | Update a custom model. |
Methods inherited from class com.ibm.cloud.sdk.core.service.BaseService: configureClient, configureHttpClient, configureService, createServiceCall, getAuthenticator, getClient, getEndPoint, getName, getServiceUrl, isJsonMimeType, isJsonPatchMimeType, processServiceCall, setAuthentication, setClient, setDefaultHeaders, setDefaultHeaders, setEndPoint, setServiceUrl, toString
public TextToSpeech()
public TextToSpeech(com.ibm.cloud.sdk.core.security.Authenticator authenticator)
Parameters:
authenticator - the Authenticator instance to be configured for this service
public TextToSpeech(java.lang.String serviceName)
Parameters:
serviceName - the name of the service to configure
public TextToSpeech(java.lang.String serviceName, com.ibm.cloud.sdk.core.security.Authenticator authenticator)
Parameters:
serviceName - the name of the service to configure
authenticator - the Authenticator instance to be configured for this service
public com.ibm.cloud.sdk.core.http.ServiceCall<Voices> listVoices(ListVoicesOptions listVoicesOptions)
Lists all voices available for use with the service. The information includes the name, language, gender, and other details about the voice. To see information about a specific voice, use the **Get a voice** method.
**See also:** [Listing all available voices](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices#listVoices).
Parameters:
listVoicesOptions - the ListVoicesOptions containing the options for the call
Returns:
a ServiceCall with a response type of Voices
public com.ibm.cloud.sdk.core.http.ServiceCall<Voices> listVoices()
Lists all voices available for use with the service. The information includes the name, language, gender, and other details about the voice. To see information about a specific voice, use the **Get a voice** method.
**See also:** [Listing all available voices](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices#listVoices).
Returns:
a ServiceCall with a response type of Voices
public com.ibm.cloud.sdk.core.http.ServiceCall<Voice> getVoice(GetVoiceOptions getVoiceOptions)
Gets information about the specified voice. The information includes the name, language, gender, and other details about the voice. Specify a customization ID to obtain information for a custom voice model that is defined for the language of the specified voice. To list information about all available voices, use the **List voices** method.
**See also:** [Listing a specific voice](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices#listVoice).
Parameters:
getVoiceOptions - the GetVoiceOptions containing the options for the call
Returns:
a ServiceCall with a response type of Voice
public com.ibm.cloud.sdk.core.http.ServiceCall<java.io.InputStream> synthesize(SynthesizeOptions synthesizeOptions)
Synthesizes text to audio that is spoken in the specified voice. The service bases its understanding of the language for the input text on the specified voice. Use a voice that matches the language of the input text.
The method accepts a maximum of 5 KB of input text in the body of the request, and 8 KB for the URL and headers. The 5 KB limit includes any SSML tags that you specify. The service returns the synthesized audio stream as an array of bytes.
**See also:** [The HTTP interface](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-usingHTTP#usingHTTP).
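The 5 KB body limit can be checked on the client before a request is sent. The following is a minimal sketch, not part of the SDK: `SynthesizeInputLimit` and its methods are hypothetical helpers, and the code assumes that "5 KB" means 5 × 1024 bytes of UTF-8 encoded text, SSML tags included.

```java
import java.nio.charset.StandardCharsets;

final class SynthesizeInputLimit {
    // Assumption: the documented 5 KB limit is 5 * 1024 bytes of UTF-8 text.
    static final int MAX_BODY_BYTES = 5 * 1024;

    /** Size of the text in bytes once encoded as UTF-8. */
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    /** True if the text, including any SSML tags, is within the assumed limit. */
    static boolean fitsBodyLimit(String text) {
        return utf8Length(text) <= MAX_BODY_BYTES;
    }

    public static void main(String[] args) {
        String ssml = "<speak>Hello <mark name=\"here\"/> world</speak>";
        System.out.println(utf8Length(ssml) + " bytes, fits: " + fitsBodyLimit(ssml));
    }
}
```

Counting bytes rather than characters matters for non-ASCII input, where one character can occupy several bytes.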
### Audio formats (accept types)
The service can return audio in the following formats (MIME types).
* Where indicated, you can optionally specify the sampling rate (`rate`) of the audio. You must specify a sampling rate for the `audio/l16` and `audio/mulaw` formats. A specified sampling rate must lie in the range of 8 kHz to 192 kHz. Some formats restrict the sampling rate to certain values, as noted.
* For the `audio/l16` format, you can optionally specify the endianness (`endianness`) of the audio: `endianness=big-endian` or `endianness=little-endian`.
Use the `Accept` header or the `accept` parameter to specify the requested format of the response audio. If you omit an audio format altogether, the service returns the audio in Ogg format with the Opus codec (`audio/ogg;codecs=opus`). The service always returns single-channel audio.
* `audio/basic` - The service returns audio with a sampling rate of 8000 Hz.
* `audio/flac` - You can optionally specify the `rate` of the audio. The default sampling rate is 22,050 Hz.
* `audio/l16` - You must specify the `rate` of the audio. You can optionally specify the `endianness` of the audio. The default endianness is `little-endian`.
* `audio/mp3` - You can optionally specify the `rate` of the audio. The default sampling rate is 22,050 Hz.
* `audio/mpeg` - You can optionally specify the `rate` of the audio. The default sampling rate is 22,050 Hz.
* `audio/mulaw` - You must specify the `rate` of the audio.
* `audio/ogg` - The service returns the audio in the `vorbis` codec. You can optionally specify the `rate` of the audio. The default sampling rate is 22,050 Hz.
* `audio/ogg;codecs=opus` - You can optionally specify the `rate` of the audio. Only the following values are valid sampling rates: `48000`, `24000`, `16000`, `12000`, or `8000`. If you specify a value other than one of these, the service returns an error. The default sampling rate is 48,000 Hz.
* `audio/ogg;codecs=vorbis` - You can optionally specify the `rate` of the audio. The default sampling rate is 22,050 Hz.
* `audio/wav` - You can optionally specify the `rate` of the audio. The default sampling rate is 22,050 Hz.
* `audio/webm` - The service returns the audio in the `opus` codec. The service returns audio with a sampling rate of 48,000 Hz.
* `audio/webm;codecs=opus` - The service returns audio with a sampling rate of 48,000 Hz.
* `audio/webm;codecs=vorbis` - You can optionally specify the `rate` of the audio. The default sampling rate is 22,050 Hz.
For more information about specifying an audio format, including additional details about some of the formats, see [Audio formats](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-audioFormats#audioFormats).
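The rate and endianness rules above can be encoded in a small helper that builds the accept value. This is a sketch under stated assumptions: `AcceptTypes` is a hypothetical class, not an SDK type, and it assumes that parameters are appended in the `;name=value` form that the `audio/ogg;codecs=opus` and `endianness=big-endian` examples suggest.

```java
import java.util.Set;

final class AcceptTypes {
    // Documented range for a specified sampling rate: 8 kHz to 192 kHz.
    static final int MIN_RATE_HZ = 8_000;
    static final int MAX_RATE_HZ = 192_000;

    // Formats that the text describes as having a fixed sampling rate.
    static final Set<String> FIXED_RATE =
            Set.of("audio/basic", "audio/webm", "audio/webm;codecs=opus");

    // The only rates listed as valid for Ogg with the Opus codec.
    static final Set<Integer> OPUS_RATES = Set.of(48_000, 24_000, 16_000, 12_000, 8_000);

    /** Appends a validated rate parameter to a MIME type. */
    static String withRate(String mimeType, int rateHz) {
        if (FIXED_RATE.contains(mimeType)) {
            throw new IllegalArgumentException(mimeType + " has a fixed sampling rate");
        }
        if (rateHz < MIN_RATE_HZ || rateHz > MAX_RATE_HZ) {
            throw new IllegalArgumentException("rate must be 8000 to 192000 Hz: " + rateHz);
        }
        if (mimeType.equals("audio/ogg;codecs=opus") && !OPUS_RATES.contains(rateHz)) {
            throw new IllegalArgumentException("invalid Opus rate: " + rateHz);
        }
        return mimeType + ";rate=" + rateHz;
    }

    /** Builds an audio/l16 value; the rate is mandatory for this format. */
    static String l16(int rateHz, boolean bigEndian) {
        String endianness = bigEndian ? "big-endian" : "little-endian";
        return withRate("audio/l16", rateHz) + ";endianness=" + endianness;
    }

    public static void main(String[] args) {
        System.out.println(withRate("audio/wav", 44_100));
        System.out.println(l16(16_000, false));
    }
}
```

Validating on the client only mirrors the documented rules; the service remains the authority on which values it accepts.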
### Warning messages
If a request includes invalid query parameters, the service returns a `Warnings` response header that provides messages about the invalid parameters. The warning includes a descriptive message and a list of invalid argument strings. For example, a message such as `"Unknown arguments:"` or `"Unknown url query arguments:"` followed by a list of the form `"{invalid_arg_1}, {invalid_arg_2}."` The request succeeds despite the warnings.
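A client that wants to log or surface these warnings can split the message into its argument names. The sketch below is illustrative only: `WarningsHeader` is a hypothetical helper, and it assumes the message arrives unquoted in exactly the form described above (a label ending in a colon, then a comma-separated list that ends with a period).

```java
import java.util.ArrayList;
import java.util.List;

final class WarningsHeader {
    /** Extracts the invalid argument names from a single warning message. */
    static List<String> invalidArguments(String message) {
        int colon = message.indexOf(':');
        if (colon < 0) {
            return List.of(); // not in the assumed "label: a, b." form
        }
        String list = message.substring(colon + 1).trim();
        if (list.endsWith(".")) {
            list = list.substring(0, list.length() - 1); // drop the trailing period
        }
        List<String> names = new ArrayList<>();
        for (String part : list.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(invalidArguments("Unknown arguments: foo, bar."));
    }
}
```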
Parameters:
synthesizeOptions - the SynthesizeOptions containing the options for the call
Returns:
a ServiceCall with a response type of InputStream
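Because the result is an `InputStream`, it can be written to a file with the JDK alone. A minimal sketch, assuming the stream has already been obtained from the service call: `AudioSaver` and `save` are hypothetical names, and the example substitutes a `ByteArrayInputStream` for real audio.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class AudioSaver {
    /** Copies the audio stream to the target file and returns the byte count. */
    static long save(InputStream audio, Path target) throws IOException {
        // try-with-resources closes the stream even if the copy fails.
        try (InputStream in = audio) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder bytes standing in for synthesized audio.
        InputStream fake = new ByteArrayInputStream(new byte[] {1, 2, 3});
        Path out = Files.createTempFile("speech", ".ogg");
        System.out.println(save(fake, out) + " bytes written to " + out);
    }
}
```

Streaming straight to the file avoids holding the whole audio response in memory.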
public okhttp3.WebSocket synthesizeUsingWebSocket(SynthesizeOptions synthesizeOptions, SynthesizeCallback callback)
Synthesizes text to audio that is spoken in the specified voice. The service bases its understanding of the language for the input text on the specified voice. Use a voice that matches the language of the input text.
The method accepts a maximum of 5 KB of input text in the body of the request, and 8 KB for the URL and headers. The 5 KB limit includes any SSML tags that you specify. The service returns the synthesized audio stream as an array of bytes.
### Audio formats (accept types)
For more information about specifying an audio format, including additional details about some of the formats, see [Audio formats](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-audioFormats#audioFormats).
Parameters:
synthesizeOptions - the SynthesizeOptions containing the options for the call
callback - the SynthesizeCallback callback
Returns:
the WebSocket instance
public com.ibm.cloud.sdk.core.http.ServiceCall<Pronunciation> getPronunciation(GetPronunciationOptions getPronunciationOptions)
Gets the phonetic pronunciation for the specified word. You can request the pronunciation for a specific format. You can also request the pronunciation for a specific voice to see the default translation for the language of that voice or for a specific custom voice model to see the translation for that voice model.
**Note:** This method is currently a beta release.
**See also:** [Querying a word from a language](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordsQueryLanguage).
Parameters:
getPronunciationOptions - the GetPronunciationOptions containing the options for the call
Returns:
a ServiceCall with a response type of Pronunciation
public com.ibm.cloud.sdk.core.http.ServiceCall<VoiceModel> createVoiceModel(CreateVoiceModelOptions createVoiceModelOptions)
Creates a new empty custom voice model. You must specify a name for the new custom model. You can optionally specify the language and a description for the new model. The model is owned by the instance of the service whose credentials are used to create it.
**Note:** This method is currently a beta release.
**See also:** [Creating a custom model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsCreate).
Parameters:
createVoiceModelOptions - the CreateVoiceModelOptions containing the options for the call
Returns:
a ServiceCall with a response type of VoiceModel
public com.ibm.cloud.sdk.core.http.ServiceCall<VoiceModels> listVoiceModels(ListVoiceModelsOptions listVoiceModelsOptions)
Lists metadata such as the name and description for all custom voice models that are owned by an instance of the service. Specify a language to list the voice models for that language only. To see the words in addition to the metadata for a specific voice model, use the **Get a custom model** method. You must use credentials for the instance of the service that owns a model to list information about it.
**Note:** This method is currently a beta release.
**See also:** [Querying all custom models](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsQueryAll).
Parameters:
listVoiceModelsOptions - the ListVoiceModelsOptions containing the options for the call
Returns:
a ServiceCall with a response type of VoiceModels
public com.ibm.cloud.sdk.core.http.ServiceCall<VoiceModels> listVoiceModels()
Lists metadata such as the name and description for all custom voice models that are owned by an instance of the service. Specify a language to list the voice models for that language only. To see the words in addition to the metadata for a specific voice model, use the **Get a custom model** method. You must use credentials for the instance of the service that owns a model to list information about it.
**Note:** This method is currently a beta release.
**See also:** [Querying all custom models](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsQueryAll).
Returns:
a ServiceCall with a response type of VoiceModels
public com.ibm.cloud.sdk.core.http.ServiceCall<java.lang.Void> updateVoiceModel(UpdateVoiceModelOptions updateVoiceModelOptions)
Updates information for the specified custom voice model. You can update metadata such as the name and description of the voice model. You can also update the words in the model and their translations. Adding a new translation for a word that already exists in a custom model overwrites the word's existing translation. A custom model can contain no more than 20,000 entries. You must use credentials for the instance of the service that owns a model to update it.
You can define sounds-like or phonetic translations for words. A sounds-like translation consists of one or more words that, when combined, sound like the word. Phonetic translations are based on the SSML phoneme format for representing a word. You can specify them in standard International Phonetic Alphabet (IPA) representation, `<phoneme alphabet="ipa" ph="təmˈɑto"></phoneme>`, or in the proprietary IBM Symbolic Phonetic Representation (SPR), `<phoneme alphabet="ibm" ph="1gAstroEntxrYFXs"></phoneme>`.
**Note:** This method is currently a beta release.
**See also:**
* [Updating a custom model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsUpdate)
* [Adding words to a Japanese custom model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuJapaneseAdd)
* [Understanding customization](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customIntro#customIntro)
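A phonetic translation string in the `<phoneme>` form shown above can be assembled with a small helper. This is a sketch, not SDK code: `PhonemeTranslation` is a hypothetical class, and the XML attribute escaping is an added precaution that the source text does not mention.

```java
final class PhonemeTranslation {
    /** Escapes characters that are unsafe inside a double-quoted XML attribute. */
    static String escapeAttribute(String value) {
        return value.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Builds a phoneme element for the "ipa" or "ibm" (SPR) alphabet. */
    static String phoneme(String alphabet, String ph) {
        if (!alphabet.equals("ipa") && !alphabet.equals("ibm")) {
            throw new IllegalArgumentException("alphabet must be ipa or ibm: " + alphabet);
        }
        return "<phoneme alphabet=\"" + alphabet + "\" ph=\"" + escapeAttribute(ph) + "\"></phoneme>";
    }

    public static void main(String[] args) {
        // Unicode escapes spell the IPA example from the text above.
        System.out.println(phoneme("ipa", "t\u0259m\u02C8\u0251to"));
        System.out.println(phoneme("ibm", "1gAstroEntxrYFXs"));
    }
}
```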
Parameters:
updateVoiceModelOptions - the UpdateVoiceModelOptions containing the options for the call
Returns:
a ServiceCall with a response type of Void
public com.ibm.cloud.sdk.core.http.ServiceCall<VoiceModel> getVoiceModel(GetVoiceModelOptions getVoiceModelOptions)
Gets all information about a specified custom voice model. In addition to metadata such as the name and description of the voice model, the output includes the words and their translations as defined in the model. To see just the metadata for a voice model, use the **List custom models** method.
**Note:** This method is currently a beta release.
**See also:** [Querying a custom model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsQuery).
Parameters:
getVoiceModelOptions - the GetVoiceModelOptions containing the options for the call
Returns:
a ServiceCall with a response type of VoiceModel
public com.ibm.cloud.sdk.core.http.ServiceCall<java.lang.Void> deleteVoiceModel(DeleteVoiceModelOptions deleteVoiceModelOptions)
Deletes the specified custom voice model. You must use credentials for the instance of the service that owns a model to delete it.
**Note:** This method is currently a beta release.
**See also:** [Deleting a custom model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsDelete).
Parameters:
deleteVoiceModelOptions - the DeleteVoiceModelOptions containing the options for the call
Returns:
a ServiceCall with a response type of Void
public com.ibm.cloud.sdk.core.http.ServiceCall<java.lang.Void> addWords(AddWordsOptions addWordsOptions)
Adds one or more words and their translations to the specified custom voice model. Adding a new translation for a word that already exists in a custom model overwrites the word's existing translation. A custom model can contain no more than 20,000 entries. You must use credentials for the instance of the service that owns a model to add words to it.
You can define sounds-like or phonetic translations for words. A sounds-like translation consists of one or more words that, when combined, sound like the word. Phonetic translations are based on the SSML phoneme format for representing a word. You can specify them in standard International Phonetic Alphabet (IPA) representation, `<phoneme alphabet="ipa" ph="təmˈɑto"></phoneme>`, or in the proprietary IBM Symbolic Phonetic Representation (SPR), `<phoneme alphabet="ibm" ph="1gAstroEntxrYFXs"></phoneme>`.
**Note:** This method is currently a beta release.
**See also:**
* [Adding multiple words to a custom model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordsAdd)
* [Adding words to a Japanese custom model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuJapaneseAdd)
* [Understanding customization](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customIntro#customIntro)
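The overwrite rule and the 20,000-entry cap can be modeled locally to preview the effect of a batch before it is sent. This is only a client-side sketch: `CustomWordBatch` is a hypothetical class that never calls the service, and it assumes that each distinct word counts as one entry. How the service itself responds when the cap is exceeded is not stated here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CustomWordBatch {
    // Documented cap on the number of entries in a custom model.
    static final int MAX_ENTRIES = 20_000;

    /**
     * Returns the word-to-translation map that would result from adding the
     * batch: a new translation replaces the existing one for the same word.
     */
    static Map<String, String> merge(Map<String, String> existing, Map<String, String> additions) {
        Map<String, String> merged = new LinkedHashMap<>(existing);
        merged.putAll(additions); // same key: the new translation wins
        if (merged.size() > MAX_ENTRIES) {
            throw new IllegalStateException("model would hold " + merged.size() + " entries");
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> existing = Map.of("IEEE", "I triple E");
        Map<String, String> additions = Map.of("IEEE", "eye triple e", "NCAA", "N C double A");
        System.out.println(merge(existing, additions).get("IEEE"));
    }
}
```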
Parameters:
addWordsOptions - the AddWordsOptions containing the options for the call
Returns:
a ServiceCall with a response type of Void
public com.ibm.cloud.sdk.core.http.ServiceCall<Words> listWords(ListWordsOptions listWordsOptions)
Lists all of the words and their translations for the specified custom voice model. The output shows the translations as they are defined in the model. You must use credentials for the instance of the service that owns a model to list its words.
**Note:** This method is currently a beta release.
**See also:** [Querying all words from a custom model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordsQueryModel).
Parameters:
listWordsOptions - the ListWordsOptions containing the options for the call
Returns:
a ServiceCall with a response type of Words
public com.ibm.cloud.sdk.core.http.ServiceCall<java.lang.Void> addWord(AddWordOptions addWordOptions)
Adds a single word and its translation to the specified custom voice model. Adding a new translation for a word that already exists in a custom model overwrites the word's existing translation. A custom model can contain no more than 20,000 entries. You must use credentials for the instance of the service that owns a model to add a word to it.
You can define sounds-like or phonetic translations for words. A sounds-like translation consists of one or more words that, when combined, sound like the word. Phonetic translations are based on the SSML phoneme format for representing a word. You can specify them in standard International Phonetic Alphabet (IPA) representation, `<phoneme alphabet="ipa" ph="təmˈɑto"></phoneme>`, or in the proprietary IBM Symbolic Phonetic Representation (SPR), `<phoneme alphabet="ibm" ph="1gAstroEntxrYFXs"></phoneme>`.
**Note:** This method is currently a beta release.
**See also:**
* [Adding a single word to a custom model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordAdd)
* [Adding words to a Japanese custom model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuJapaneseAdd)
* [Understanding customization](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customIntro#customIntro)
Parameters:
addWordOptions - the AddWordOptions containing the options for the call
Returns:
a ServiceCall with a response type of Void
public com.ibm.cloud.sdk.core.http.ServiceCall<Translation> getWord(GetWordOptions getWordOptions)
Gets the translation for a single word from the specified custom model. The output shows the translation as it is defined in the model. You must use credentials for the instance of the service that owns a model to list its words.
**Note:** This method is currently a beta release.
**See also:** [Querying a single word from a custom model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordQueryModel).
Parameters:
getWordOptions - the GetWordOptions containing the options for the call
Returns:
a ServiceCall with a response type of Translation
public com.ibm.cloud.sdk.core.http.ServiceCall<java.lang.Void> deleteWord(DeleteWordOptions deleteWordOptions)
Deletes a single word from the specified custom voice model. You must use credentials for the instance of the service that owns a model to delete its words.
**Note:** This method is currently a beta release.
**See also:** [Deleting a word from a custom model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordDelete).
Parameters:
deleteWordOptions - the DeleteWordOptions containing the options for the call
Returns:
a ServiceCall with a response type of Void
public com.ibm.cloud.sdk.core.http.ServiceCall<java.lang.Void> deleteUserData(DeleteUserDataOptions deleteUserDataOptions)
Deletes all data that is associated with a specified customer ID. The method deletes all data for the customer ID, regardless of the method by which the information was added. The method has no effect if no data is associated with the customer ID. You must issue the request with credentials for the same instance of the service that was used to associate the customer ID with the data.
You associate a customer ID with data by passing the `X-Watson-Metadata` header with a request that passes the data.
**See also:** [Information security](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-information-security#information-security).
Parameters:
deleteUserDataOptions - the DeleteUserDataOptions containing the options for the call
Returns:
a ServiceCall with a response type of Void