watson_developer_cloud.text_to_speech_v1 module

The IBM Watson Text to Speech service provides an API that uses IBM’s speech-synthesis capabilities to synthesize text into natural-sounding speech in a variety of languages, dialects, and voices. The service supports at least one male or female voice, sometimes both, for each language. The audio is streamed back to the client with minimal delay.

For more information about the service and its various interfaces, see [About Text to Speech](https://console.bluemix.net/docs/services/text-to-speech/index.html).

class TextToSpeechV1(url='https://stream.watsonplatform.net/text-to-speech/api', username=None, password=None)

Bases: watson_developer_cloud.watson_service.WatsonService

The Text to Speech V1 service.

default_url = 'https://stream.watsonplatform.net/text-to-speech/api'

get_voice(voice, customization_id=None, **kwargs)

Retrieves a specific voice available for speech synthesis.

Parameters:
	voice (str) – The voice for which information is to be returned.
	customization_id (str) – The GUID of a custom voice model for which information is to be returned. You must make the request with service credentials created for the instance of the service that owns the custom model. Omit the parameter to see information about the specified voice with no customization.
	headers (dict) – A dict containing the request headers
Returns:	A dict containing the Voice response.
Return type:	dict

list_voices(**kwargs)

Retrieves a list of all voices available for use with the service. The information includes the name, language, gender, and other details about the voice.

Parameters:	headers (dict) – A dict containing the request headers
Returns:	A dict containing the Voices response.
Return type:	dict
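
As a minimal sketch of how the Voices response might be consumed, the helper below filters voice names by language. It assumes `tts` is an already configured TextToSpeechV1 instance and that the returned dict mirrors the Voices and Voice classes documented on this page (a `voices` list whose entries carry `name` and `language` keys). The helper name `voice_names_for_language` is hypothetical.

```python
def voice_names_for_language(tts, language):
    """Return the names of all voices whose language matches, e.g. 'en-US'."""
    # Assumption: the dict has a "voices" list of dicts with "name" and "language".
    response = tts.list_voices()
    return [v["name"] for v in response["voices"] if v["language"] == language]


# Typical setup with placeholder credentials (not executed here):
# from watson_developer_cloud import TextToSpeechV1
# tts = TextToSpeechV1(username="YOUR_USERNAME", password="YOUR_PASSWORD")
# print(voice_names_for_language(tts, "en-US"))
```

Passing the client in as an argument keeps the helper easy to exercise with a stand-in object when no credentials are available.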

voices(**kwargs)

synthesize(text, accept=None, voice=None, customization_id=None, **kwargs)

Streaming speech synthesis of the text in the text parameter. Synthesizes the text to spoken audio and returns the synthesized audio stream as an array of bytes.

Parameters:
	text (str) – The text to synthesize.
	accept (str) – The type of the response: audio/basic, audio/flac, audio/l16;rate=nnnn, audio/ogg, audio/ogg;codecs=opus, audio/ogg;codecs=vorbis, audio/mp3, audio/mpeg, audio/mulaw;rate=nnnn, audio/wav, audio/webm, audio/webm;codecs=opus, or audio/webm;codecs=vorbis.
	voice (str) – The voice to use for synthesis.
	customization_id (str) – The GUID of a custom voice model to use for the synthesis. If a custom voice model is specified, it is guaranteed to work only if it matches the language of the indicated voice. You must make the request with service credentials created for the instance of the service that owns the custom model. Omit the parameter to use the specified voice with no customization.
	headers (dict) – A dict containing the request headers
Returns:	A Response object representing the response.
Return type:	requests.models.Response
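
Because synthesize returns a requests Response rather than a dict, the audio bytes have to be read from the response body. The sketch below writes them to a file. It assumes `tts` is a configured TextToSpeechV1 instance; the helper name `synthesize_to_file` and the default `audio/wav` choice are illustrative, not part of the SDK.

```python
def synthesize_to_file(tts, text, path, accept="audio/wav", voice=None):
    """Synthesize text and save the audio to path; return the byte count."""
    response = tts.synthesize(text, accept=accept, voice=voice)
    audio = response.content  # raw bytes of a requests.models.Response body
    with open(path, "wb") as audio_file:
        audio_file.write(audio)
    return len(audio)


# Example call (requires real credentials, so it is left commented out):
# synthesize_to_file(tts, "Hello world", "hello.wav", voice="en-US_AllisonVoice")
```

The voice name in the commented call is only an example; list_voices reports the names that are actually available to an instance.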

get_pronunciation(text, voice=None, format=None, customization_id=None, **kwargs)

Gets the pronunciation for a word.

Returns the phonetic pronunciation for the word specified by the text parameter. You can request the pronunciation for a specific format. You can also request the pronunciation for a specific voice to see the default translation for the language of that voice or for a specific custom voice model to see the translation for that voice model. Note: This method is currently a beta release.

Parameters:
	text (str) – The word for which the pronunciation is requested.
	voice (str) – A voice that specifies the language in which the pronunciation is to be returned. All voices for the same language (for example, en-US) return the same translation.
	format (str) – The phoneme format in which to return the pronunciation. Omit the parameter to obtain the pronunciation in the default format.
	customization_id (str) – The GUID of a custom voice model for which the pronunciation is to be returned. The language of a specified custom model must match the language of the specified voice. If the word is not defined in the specified custom model, the service returns the default translation for the custom model’s language. You must make the request with service credentials created for the instance of the service that owns the custom model. Omit the parameter to see the translation for the specified voice with no customization.
	headers (dict) – A dict containing the request headers
Returns:	A dict containing the Pronunciation response.
Return type:	dict
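
The following sketch looks up several words in one pass. It assumes `tts` is a configured TextToSpeechV1 instance and that the returned dict exposes a `pronunciation` key, matching the Pronunciation class documented on this page. The helper name `pronunciations` is hypothetical, and the format argument is left at its default.

```python
def pronunciations(tts, words, voice=None):
    """Map each word to its pronunciation for the given voice."""
    result = {}
    for word in words:
        # Assumption: the response dict has a "pronunciation" key.
        response = tts.get_pronunciation(word, voice=voice)
        result[word] = response["pronunciation"]
    return result
```

Each word costs one request, so this is best suited to short lists.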

pronunciation(**kwargs)

create_voice_model(name, language=None, description=None, **kwargs)

Creates a new empty custom voice model. The model is owned by the instance of the service whose credentials are used to create it. Note: This method is currently a beta release.

Parameters:
	name (str) – The name of the new custom voice model.
	language (str) – The language of the new custom voice model. Omit the parameter to use the default language, en-US.
	description (str) – A description of the new custom voice model. Specifying a description is recommended.
	headers (dict) – A dict containing the request headers
Returns:	A dict containing the VoiceModel response.
Return type:	dict
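
A new model is usually referred to by its GUID in every later call, so it can be convenient to return just that value. This sketch assumes `tts` is a configured TextToSpeechV1 instance and that the response dict contains a `customization_id` key, as the VoiceModel class on this page suggests. The helper name `create_model` is hypothetical.

```python
def create_model(tts, name, description, language=None):
    """Create an empty custom voice model and return its customization ID."""
    response = tts.create_voice_model(
        name, language=language, description=description
    )
    # Assumption: only the GUID is needed from the VoiceModel response.
    return response["customization_id"]
```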

create_customization(**kwargs)

delete_voice_model(customization_id, **kwargs)

Deletes the custom voice model with the specified customization_id. You must use credentials for the instance of the service that owns a model to delete it. Note: This method is currently a beta release.

Parameters:
	customization_id (str) – The GUID of the custom voice model.
	headers (dict) – A dict containing the request headers
Return type:	None

delete_customization(**kwargs)

get_voice_model(customization_id, **kwargs)

Queries the contents of a custom voice model.

Lists all information about the custom voice model with the specified customization_id. In addition to metadata such as the name and description of the voice model, the output includes the words in the model and their translations as defined in the model. To see just the metadata for a voice model, use the GET /v1/customizations method (list_voice_models in this module). You must use credentials for the instance of the service that owns a model to list information about it. Note: This method is currently a beta release.

Parameters:
	customization_id (str) – The GUID of the custom voice model.
	headers (dict) – A dict containing the request headers
Returns:	A dict containing the VoiceModel response.
Return type:	dict

get_customization(**kwargs)

list_voice_models(language=None, **kwargs)

Lists metadata such as the name and description for the custom voice models that you own. Use the language parameter to list the voice models that you own for the specified language only. Omit the parameter to see all voice models that you own for all languages. To see the words in addition to the metadata for a specific voice model, use the GET /v1/customizations/{customization_id} method (get_voice_model in this module). You must use credentials for the instance of the service that owns a model to list information about it. Note: This method is currently a beta release.

Parameters:
	language (str) – The language for which custom voice models that are owned by the requesting service credentials are to be returned. Omit the parameter to see all custom voice models that are owned by the requester.
	headers (dict) – A dict containing the request headers
Returns:	A dict containing the VoiceModels response.
Return type:	dict
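
Since models are addressed by GUID but are usually remembered by name, a small lookup can help. The sketch assumes `tts` is a configured TextToSpeechV1 instance and that the response dict has a `customizations` list whose entries carry `customization_id` and an optional `name`, following the VoiceModels and VoiceModel classes on this page. The helper name `find_model_ids` is hypothetical.

```python
def find_model_ids(tts, name, language=None):
    """Return the customization IDs of all owned models with the given name."""
    response = tts.list_voice_models(language=language)
    return [
        model["customization_id"]
        for model in response["customizations"]
        # name is documented as optional, so use .get to tolerate its absence
        if model.get("name") == name
    ]
```

A list is returned rather than a single ID because nothing on this page states that model names are unique.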

customizations(**kwargs)

update_voice_model(customization_id, name=None, description=None, words=None, **kwargs)

Updates information and words for a custom voice model.

Updates information for the custom voice model with the specified customization_id. You can update the metadata such as the name and description of the voice model. You can also update the words in the model and their translations. Adding a new translation for a word that already exists in a custom model overwrites the word’s existing translation. A custom model can contain no more than 20,000 entries. You must use credentials for the instance of the service that owns a model to update it. Note: This method is currently a beta release.

Parameters:

customization_id (str) – The GUID of the custom voice model.
name (str) – A new name for the custom voice model.
description (str) – A new description for the custom voice model.
words (list[Word]) – An array of Word objects that provides the words and their translations that are to be added or updated for the custom voice model. Pass an empty array to make no additions or updates.
headers (dict) – A dict containing the request headers

Return type:

None

update_customization(**kwargs)

add_word(customization_id, word, translation, part_of_speech=None, **kwargs)

Adds a word to a custom voice model.

Adds a single word and its translation to the custom voice model with the specified customization_id. Adding a new translation for a word that already exists in a custom model overwrites the word’s existing translation. A custom model can contain no more than 20,000 entries. You must use credentials for the instance of the service that owns a model to add a word to it. Note: This method is currently a beta release.

Parameters:

customization_id (str) – The GUID of the custom voice model.
word (str) – The word that is to be added or updated for the custom voice model.
translation (str) – The phonetic or sounds-like translation for the word. A phonetic translation is based on the SSML format for representing the phonetic string of a word either as an IPA translation or as an IBM SPR translation. A sounds-like translation is one or more words that, when combined, sound like the word.
part_of_speech (str) – Japanese only. The part of speech for the word. The service uses the value to produce the correct intonation for the word. You can create only a single entry, with or without a single part of speech, for any word; you cannot create multiple entries with different parts of speech for the same word. For more information, see [Working with Japanese entries](https://console.bluemix.net/docs/services/text-to-speech/custom-rules.html#jaNotes).
headers (dict) – A dict containing the request headers

Return type:

None

set_customization_word(**kwargs)

add_words(customization_id, words, **kwargs)

Adds one or more words to a custom voice model.

Adds one or more words and their translations to the custom voice model with the specified customization_id. Adding a new translation for a word that already exists in a custom model overwrites the word’s existing translation. A custom model can contain no more than 20,000 entries. You must use credentials for the instance of the service that owns a model to add words to it. Note: This method is currently a beta release.

Parameters:

customization_id (str) – The GUID of the custom voice model.
words (list[Word]) – An array of Word objects that provides one or more words that are to be added or updated for the custom voice model and the translation for each specified word.
headers (dict) – A dict containing the request headers

Return type:

None
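
The words argument can be prepared ahead of the call. The sketch below builds it from (word, translation) pairs, letting a later pair replace an earlier one for the same word, which mirrors the overwrite behavior described above. Two points are assumptions rather than documented facts: that add_words accepts plain dicts with `word` and `translation` keys in place of Word instances, and that a local size check is useful (the 20,000 limit applies to the whole model, so this check only catches oversized batches). The names `build_words` and `MAX_ENTRIES` are hypothetical.

```python
MAX_ENTRIES = 20000  # per-model limit quoted in the method description


def build_words(pairs):
    """Turn (word, translation) pairs into a list of word entries."""
    merged = {}
    for word, translation in pairs:
        merged[word] = translation  # a repeated word keeps the last translation
    if len(merged) > MAX_ENTRIES:
        raise ValueError("batch exceeds %d entries" % MAX_ENTRIES)
    # Assumption: dict entries are acceptable where Word objects are expected.
    return [{"word": w, "translation": t} for w, t in merged.items()]


# Example call (requires a configured client, so it is left commented out):
# tts.add_words(customization_id, build_words([("IEEE", "I triple E")]))
```

If the installed SDK version insists on Word instances, the same pairs can be passed to the Word constructor documented further down this page.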

add_customization_words(**kwargs)

delete_word(customization_id, word, **kwargs)

Deletes a word from a custom voice model.

Deletes a single word from the custom voice model with the specified customization_id. You must use credentials for the instance of the service that owns a model to delete its words. Note: This method is currently a beta release.

Parameters:
	customization_id (str) – The GUID of the custom voice model.
	word (str) – The word that is to be deleted from the custom voice model.
	headers (dict) – A dict containing the request headers
Return type:	None

delete_customization_word(**kwargs)

get_word(customization_id, word, **kwargs)

Queries details about a word in a custom voice model.

Returns the translation for a single word from the custom model with the specified customization_id. The output shows the translation as it is defined in the model. You must use credentials for the instance of the service that owns a model to query information about its words. Note: This method is currently a beta release.

Parameters:
	customization_id (str) – The GUID of the custom voice model.
	word (str) – The word that is to be queried from the custom voice model.
	headers (dict) – A dict containing the request headers
Returns:	A dict containing the Translation response.
Return type:	dict
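
One way to confirm that a word was stored as intended is to add it and then read it back. The sketch assumes `tts` is a configured TextToSpeechV1 instance and that the get_word response dict has a `translation` key, in line with the Translation class on this page. The helper name `add_and_verify_word` is hypothetical.

```python
def add_and_verify_word(tts, customization_id, word, translation):
    """Add a word, then check that the model reports the same translation."""
    tts.add_word(customization_id, word, translation)
    stored = tts.get_word(customization_id, word)
    # Assumption: the service echoes the translation exactly as submitted.
    return stored["translation"] == translation
```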

get_customization_word(**kwargs)

list_words(customization_id, **kwargs)

Queries details about the words in a custom voice model.

Lists all of the words and their translations for the custom voice model with the specified customization_id. The output shows the translations as they are defined in the model. You must use credentials for the instance of the service that owns a model to query information about its words. Note: This method is currently a beta release.

Parameters:
	customization_id (str) – The GUID of the custom voice model.
	headers (dict) – A dict containing the request headers
Returns:	A dict containing the Words response.
Return type:	dict
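
For repeated lookups it may be simpler to fetch the word list once and index it locally. The sketch assumes `tts` is a configured TextToSpeechV1 instance and that the response dict has a `words` list whose entries carry `word` and `translation` keys, following the Words and Word classes on this page. The helper name `word_lookup` is hypothetical.

```python
def word_lookup(tts, customization_id):
    """Return a {word: translation} mapping for a custom voice model."""
    response = tts.list_words(customization_id)
    # Assumption: each entry is a dict with "word" and "translation" keys.
    return {entry["word"]: entry["translation"] for entry in response["words"]}
```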

get_customization_words(**kwargs)

class Pronunciation(pronunciation)

Bases: object

Pronunciation.

Attr str pronunciation:
	The pronunciation of the requested text in the specified voice and format.

class SupportedFeatures(custom_pronunciation, voice_transformation)

Bases: object

SupportedFeatures.

Attr bool custom_pronunciation:
	If true, the voice can be customized; if false, the voice cannot be customized. (Same as customizable.)
Attr bool voice_transformation:
	If true, the voice can be transformed by using the SSML <voice-transformation> element; if false, the voice cannot be transformed.

class Translation(translation, part_of_speech=None)

Bases: object

Translation.

Attr str translation:
	The phonetic or sounds-like translation for the word. A phonetic translation is based on the SSML format for representing the phonetic string of a word either as an IPA translation or as an IBM SPR translation. A sounds-like translation is one or more words that, when combined, sound like the word.
Attr str part_of_speech:
	(optional) Japanese only. The part of speech for the word. The service uses the value to produce the correct intonation for the word. You can create only a single entry, with or without a single part of speech, for any word; you cannot create multiple entries with different parts of speech for the same word. For more information, see [Working with Japanese entries](https://console.bluemix.net/docs/services/text-to-speech/custom-rules.html#jaNotes).

class Voice(url, gender, name, language, description, customizable, supported_features, customization=None)

Bases: object

Voice.

Attr str url:	The URI of the voice.
Attr str gender:
	The gender of the voice: male or female.
Attr str name:	The name of the voice. Use this as the voice identifier in all requests.
Attr str language:
	The language and region of the voice (for example, en-US).
Attr str description:
	A textual description of the voice.
Attr bool customizable:
	If true, the voice can be customized; if false, the voice cannot be customized. (Same as custom_pronunciation; maintained for backward compatibility.)
Attr SupportedFeatures supported_features:
	Describes the additional service features supported with the voice.
Attr VoiceModel customization:
	(optional) Returns information about a specified custom voice model. Note: This field is returned only when you list information about a specific voice and specify the GUID of a custom voice model that is based on that voice.
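
The feature flags can be used to narrow a voice list before choosing one. This sketch works on a response dict that is assumed to mirror the Voices, Voice, and SupportedFeatures classes (a `voices` list whose entries have `name` and a nested `supported_features` dict). The helper name `transformable_voices` is hypothetical.

```python
def transformable_voices(voices_response):
    """Return names of voices that report voice_transformation support."""
    names = []
    for voice in voices_response["voices"]:
        # Assumption: supported_features is a nested dict; treat a missing
        # entry as "feature not supported".
        features = voice.get("supported_features", {})
        if features.get("voice_transformation"):
            names.append(voice["name"])
    return names
```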

class VoiceModel(customization_id, name=None, language=None, owner=None, created=None, last_modified=None, description=None, words=None)

Bases: object

VoiceModel.

Attr str customization_id:
	The customization ID (GUID) of the custom voice model. Note: When you create a new custom voice model, the service returns only the GUID of the new custom model; it does not return the other fields of this object.
Attr str name:	(optional) The name of the custom voice model.
Attr str language:
	(optional) The language identifier of the custom voice model (for example, en-US).
Attr str owner:	(optional) The GUID of the service credentials for the instance of the service that owns the custom voice model.
Attr str created:
	(optional) The date and time in Coordinated Universal Time (UTC) at which the custom voice model was created. The value is provided in full ISO 8601 format (YYYY-MM-DDThh:mm:ss.sTZD).
Attr str last_modified:
	(optional) The date and time in Coordinated Universal Time (UTC) at which the custom voice model was last modified. Equals created when a new voice model is first added but has yet to be updated. The value is provided in full ISO 8601 format (YYYY-MM-DDThh:mm:ss.sTZD).
Attr str description:
	(optional) The description of the custom voice model.
Attr list[Word] words:
	(optional) An array of Word objects that lists the words and their translations from the custom voice model. The words are listed in alphabetical order, with uppercase letters listed before lowercase letters. The array is empty if the custom model contains no words. Note: This field is returned only when you list information about a specific custom voice model.
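
The created and last_modified values arrive as strings, so comparing them reliably means parsing them first. The sketch below uses the standard library and assumes Python 3.7 or later (where `%z` accepts a trailing `Z`), a fractional-seconds component in every timestamp, and a model dict with `created` and `last_modified` keys. The sample timestamp and the helper names `parse_timestamp` and `was_modified` are illustrative.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse a timestamp such as '2016-07-15T18:12:31.743Z' into a datetime."""
    # %f reads the fractional seconds; %z reads 'Z' or an offset like +00:00.
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")


def was_modified(model):
    """True if the model was updated after it was first created."""
    return parse_timestamp(model["last_modified"]) > parse_timestamp(model["created"])
```

Comparing parsed values rather than raw strings avoids surprises when the two timestamps use different offsets or fractional precision.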

class VoiceModels(customizations)

Bases: object

VoiceModels.

Attr list[VoiceModel] customizations:
	An array of VoiceModel objects that provides information about each available custom voice model. The array is empty if the requesting service credentials own no custom voice models (if no language is specified) or own no custom voice models for the specified language.

class Voices(voices)

Bases: object

Voices.

Attr list[Voice] voices:
	A list of available voices.

class Word(word, translation, part_of_speech=None)

Bases: object

Word.

Attr str word:	A word from the custom voice model.
Attr str translation:
	The phonetic or sounds-like translation for the word. A phonetic translation is based on the SSML format for representing the phonetic string of a word either as an IPA or IBM SPR translation. A sounds-like translation consists of one or more words that, when combined, sound like the word.
Attr str part_of_speech:
	(optional) Japanese only. The part of speech for the word. The service uses the value to produce the correct intonation for the word. You can create only a single entry, with or without a single part of speech, for any word; you cannot create multiple entries with different parts of speech for the same word. For more information, see [Working with Japanese entries](https://console.bluemix.net/docs/services/text-to-speech/custom-rules.html#jaNotes).

class Words(words)

Bases: object

Words.

Attr list[Word] words:
	When adding words to a custom voice model, an array of Word objects that provides one or more words that are to be added or updated for the custom voice model and the translation for each specified word. When listing words from a custom voice model, an array of Word objects that lists the words and their translations from the custom voice model. The words are listed in alphabetical order, with uppercase letters listed before lowercase letters. The array is empty if the custom model contains no words.
