TextToSpeechV1 | ibm-watson

The IBM Watson™ Text to Speech service provides APIs that use IBM's speech-synthesis capabilities to synthesize text into natural-sounding speech in a variety of languages, dialects, and voices. The service supports at least one male or female voice, sometimes both, for each language. The audio is streamed back to the client with minimal delay.

For speech synthesis, the service supports a synchronous HTTP Representational State Transfer (REST) interface and a WebSocket interface. Both interfaces support plain text and SSML input. SSML is an XML-based markup language that provides text annotation for speech-synthesis applications. The WebSocket interface also supports the SSML <mark> element and word timings.

The service offers a customization interface that you can use to define sounds-like or phonetic translations for words. A sounds-like translation consists of one or more words that, when combined, sound like the word. A phonetic translation is based on the SSML phoneme format for representing a word. You can specify a phonetic translation in standard International Phonetic Alphabet (IPA) representation or in the proprietary IBM Symbolic Phonetic Representation (SPR). The Arabic, Chinese, Dutch, and Korean languages support only IPA.

Hierarchy

BaseService
- TextToSpeechV1

Constructors

constructor

new TextToSpeechV1(options: UserOptions): TextToSpeechV1

Construct a TextToSpeechV1 object.

Parameters

Name	Type	Attribute	Description
`options`	UserOptions		Options for the service.

Returns TextToSpeechV1

Properties

Static DEFAULT_SERVICE_NAME

DEFAULT_SERVICE_NAME: string = "text_to_speech"

Static DEFAULT_SERVICE_URL

DEFAULT_SERVICE_URL: string = "https://api.us-south.text-to-speech.watson.cloud.ibm.com"

Methods

addWord

addWord(params: AddWordParams, callback?: Callback<Empty>): Promise<Response<Empty>>

Add a custom word.

Adds a single word and its translation to the specified custom voice model. Adding a new translation for a word that already exists in a custom model overwrites the word's existing translation. A custom model can contain no more than 20,000 entries. You must use credentials for the instance of the service that owns a model to add a word to it.

You can define sounds-like or phonetic translations for words. A sounds-like translation consists of one or more words that, when combined, sound like the word. Phonetic translations are based on the SSML phoneme format for representing a word. You can specify them in standard International Phonetic Alphabet (IPA) representation

<phoneme alphabet="ipa" ph="təmˈɑto"></phoneme>

or in the proprietary IBM Symbolic Phonetic Representation (SPR)

<phoneme alphabet="ibm" ph="1gAstroEntxrYFXs"></phoneme>

Note: This method is currently a beta release.

See also:

[Adding a single word to a custom model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordAdd)
[Adding words to a Japanese custom model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuJapaneseAdd)
[Understanding customization](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customIntro#customIntro).

Parameters

Name Type Attribute Description

params

AddWordParams

The parameters to send to the service.

Properties

Name	Type	Attributes	Description
`customizationId`	string		The customization ID (GUID) of the custom voice model. You must make the request with credentials for the instance of the service that owns the custom model.
`translation`	string		The phonetic or sounds-like translation for the word. A phonetic translation is based on the SSML format for representing the phonetic string of a word either as an IPA translation or as an IBM SPR translation. The Arabic, Chinese, Dutch, and Korean languages support only IPA. A sounds-like is one or more words that, when combined, sound like the word.
`word`	string		The word that is to be added or updated for the custom voice model.
`headers`	OutgoingHttpHeaders	Optional
`partOfSpeech`	PartOfSpeech \| string	Optional	**Japanese only.** The part of speech for the word. The service uses the value to produce the correct intonation for the word. You can create only a single entry, with or without a single part of speech, for any word; you cannot create multiple entries with different parts of speech for the same word. For more information, see [Working with Japanese entries](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-rules#jaNotes).

callback

Callback<Empty>

Optional

Returns Promise<Response<Empty>>
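
As a rough sketch of how the properties above fit together, the helper below wraps an IPA string in the SSML phoneme element shown in the description and assembles an object with `customizationId`, `word`, and `translation`. The names `AddWordInput`, `ipaTranslation`, and `buildAddWordParams` are hypothetical and not part of the SDK, and the customization ID is a placeholder.

```typescript
// Hypothetical local shape mirroring the documented AddWordParams properties.
interface AddWordInput {
  customizationId: string;
  word: string;
  translation: string;
  partOfSpeech?: string; // Japanese only, per the table above
}

// Wrap an IPA string in the SSML phoneme element used for phonetic translations.
function ipaTranslation(ipa: string): string {
  return `<phoneme alphabet="ipa" ph="${ipa}"></phoneme>`;
}

// Assemble the params object, rejecting empty required values early.
function buildAddWordParams(
  customizationId: string,
  word: string,
  translation: string,
): AddWordInput {
  if (!customizationId || !word || !translation) {
    throw new Error('customizationId, word, and translation are all required');
  }
  return { customizationId, word, translation };
}

const addWordParams = buildAddWordParams(
  'YOUR_CUSTOMIZATION_ID', // placeholder GUID
  'tomato',
  ipaTranslation('təmˈɑto'),
);
// With a configured client (assumed to be named textToSpeech):
//   await textToSpeech.addWord(addWordParams);
console.log(addWordParams.translation);
```

A sounds-like translation needs no wrapper; pass the plain words as `translation` instead.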

addWords

addWords(params: AddWordsParams, callback?: Callback<Empty>): Promise<Response<Empty>>

Add custom words.

Adds one or more words and their translations to the specified custom voice model. Adding a new translation for a word that already exists in a custom model overwrites the word's existing translation. A custom model can contain no more than 20,000 entries. You must use credentials for the instance of the service that owns a model to add words to it.

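You can define sounds-like or phonetic translations for words. A sounds-like translation consists of one or more words that, when combined, sound like the word. Phonetic translations are based on the SSML phoneme format for representing a word. You can specify them in standard International Phonetic Alphabet (IPA) representation

<phoneme alphabet="ipa" ph="təmˈɑto"></phoneme>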

or in the proprietary IBM Symbolic Phonetic Representation (SPR)

<phoneme alphabet="ibm" ph="1gAstroEntxrYFXs"></phoneme>

Note: This method is currently a beta release.

See also:

[Adding multiple words to a custom model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordsAdd)
[Adding words to a Japanese custom model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuJapaneseAdd)
[Understanding customization](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customIntro#customIntro).

Parameters

Name Type Attribute Description

params

AddWordsParams

The parameters to send to the service.

Properties

Name	Type	Attributes	Description
`customizationId`	string		The customization ID (GUID) of the custom voice model. You must make the request with credentials for the instance of the service that owns the custom model.
`words`	Word[]		The Add custom words method accepts an array of `Word` objects. Each object provides a word that is to be added or updated for the custom voice model and the word's translation.
`headers`	OutgoingHttpHeaders	Optional

callback

Callback<Empty>

Optional

Returns Promise<Response<Empty>>
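
The overwrite behavior and the 20,000-entry limit described above can be approximated on the client before a batch is sent. The following is a minimal sketch under stated assumptions: `WordEntry`, `dedupeWords`, and `buildAddWordsParams` are hypothetical helpers, and the limit check is conservative because it treats every word in the batch as new to the model.

```typescript
// Hypothetical local shape for the documented Word object (word + translation).
interface WordEntry {
  word: string;
  translation: string;
}

const MAX_MODEL_ENTRIES = 20000; // limit stated in the method description

// Keep one entry per word; a later translation replaces an earlier one,
// mirroring how the service overwrites an existing translation.
function dedupeWords(words: WordEntry[]): WordEntry[] {
  const byWord = new Map<string, WordEntry>();
  for (const entry of words) {
    byWord.set(entry.word, entry);
  }
  return Array.from(byWord.values());
}

// existingCount is the number of entries already in the model, if known.
function buildAddWordsParams(
  customizationId: string,
  words: WordEntry[],
  existingCount: number = 0,
): { customizationId: string; words: WordEntry[] } {
  const unique = dedupeWords(words);
  // Conservative check: assumes every entry in the batch is new to the model.
  if (existingCount + unique.length > MAX_MODEL_ENTRIES) {
    throw new Error(`custom model would exceed ${MAX_MODEL_ENTRIES} entries`);
  }
  return { customizationId, words: unique };
}

const addWordsParams = buildAddWordsParams('YOUR_CUSTOMIZATION_ID', [
  { word: 'NY', translation: 'New York' },
  { word: 'IEEE', translation: 'I triple E' },
  { word: 'NY', translation: 'N Y' }, // replaces the first NY entry
]);
console.log(addWordsParams.words);
```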

createVoiceModel

createVoiceModel(params: CreateVoiceModelParams, callback?: Callback<VoiceModel>): Promise<Response<VoiceModel>>

Create a custom model.

Creates a new empty custom voice model. You must specify a name for the new custom model. You can optionally specify the language and a description for the new model. The model is owned by the instance of the service whose credentials are used to create it.

Note: This method is currently a beta release.

See also: Creating a custom model.

Parameters

Name Type Attribute Description

params

CreateVoiceModelParams

The parameters to send to the service.

Properties

Name	Type	Attributes	Description
`name`	string		The name of the new custom voice model.
`description`	string	Optional	A description of the new custom voice model. Specifying a description is recommended.
`headers`	OutgoingHttpHeaders	Optional
`language`	Language \| string	Optional	The language of the new custom voice model. You create a custom voice model for a specific language, not for a specific voice. A custom model can be used with any voice, standard or neural, for its specified language. Omit the parameter to use the default language, `en-US`.

callback

Callback<VoiceModel>

Optional

Returns Promise<Response<VoiceModel>>

deleteUserData

deleteUserData(params: DeleteUserDataParams, callback?: Callback<Empty>): Promise<Response<Empty>>

Delete labeled data.

Deletes all data that is associated with a specified customer ID. The method deletes all data for the customer ID, regardless of the method by which the information was added. The method has no effect if no data is associated with the customer ID. You must issue the request with credentials for the same instance of the service that was used to associate the customer ID with the data. You associate a customer ID with data by passing the X-Watson-Metadata header with a request that passes the data.

Note: If you delete an instance of the service from the service console, all data associated with that service instance is automatically deleted. This includes all custom voice models and word/translation pairs, and all data related to speech synthesis requests.

See also: Information security.

Parameters

Name Type Attribute Description

params

DeleteUserDataParams

The parameters to send to the service.

Properties

Name	Type	Attributes	Description
`customerId`	string		The customer ID for which all data is to be deleted.
`headers`	OutgoingHttpHeaders	Optional

callback

Callback<Empty>

Optional

Returns Promise<Response<Empty>>
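
The two halves of this workflow, tagging data with a customer ID and later deleting it, might look like the sketch below. It assumes the `X-Watson-Metadata` header value takes the form `customer_id=<id>`; that format is not stated on this page, so check it against the information security documentation. `customerMetadataHeaders` and the sample ID are hypothetical.

```typescript
// Build the header that associates a customer ID with the data in a request.
// Assumption: the header value takes the form "customer_id=<id>".
function customerMetadataHeaders(customerId: string): { [name: string]: string } {
  if (customerId.length === 0) {
    throw new Error('customerId must not be empty');
  }
  return { 'X-Watson-Metadata': `customer_id=${customerId}` };
}

const customerId = 'customer-123'; // hypothetical ID

// 1. Tag a request's data with the customer ID (params for synthesize).
const taggedSynthesizeParams = {
  text: 'Hello world',
  headers: customerMetadataHeaders(customerId),
};

// 2. Later, delete everything associated with that ID (params for deleteUserData).
const deleteUserDataParams = { customerId };

console.log(taggedSynthesizeParams.headers, deleteUserDataParams);
```

Both requests need credentials for the same service instance, as the description notes.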

deleteVoiceModel

deleteVoiceModel(params: DeleteVoiceModelParams, callback?: Callback<Empty>): Promise<Response<Empty>>

Delete a custom model.

Deletes the specified custom voice model. You must use credentials for the instance of the service that owns a model to delete it.

Note: This method is currently a beta release.

See also: Deleting a custom model.

Parameters

Name Type Attribute Description

params

DeleteVoiceModelParams

The parameters to send to the service.

Properties

Name	Type	Attributes	Description
`customizationId`	string		The customization ID (GUID) of the custom voice model. You must make the request with credentials for the instance of the service that owns the custom model.
`headers`	OutgoingHttpHeaders	Optional

callback

Callback<Empty>

Optional

Returns Promise<Response<Empty>>

deleteWord

deleteWord(params: DeleteWordParams, callback?: Callback<Empty>): Promise<Response<Empty>>

Delete a custom word.

Deletes a single word from the specified custom voice model. You must use credentials for the instance of the service that owns a model to delete its words.

Note: This method is currently a beta release.

Parameters

Name Type Attribute Description

params

DeleteWordParams

The parameters to send to the service.

Properties

Name	Type	Attributes	Description
`customizationId`	string		The customization ID (GUID) of the custom voice model. You must make the request with credentials for the instance of the service that owns the custom model.
`word`	string		The word that is to be deleted from the custom voice model.
`headers`	OutgoingHttpHeaders	Optional

callback

Callback<Empty>

Optional

Returns Promise<Response<Empty>>

getAuthenticator

getAuthenticator(): any

Inherited from AuthorizationV1.getAuthenticator
Defined in node_modules/ibm-cloud-sdk-core/lib/base-service.d.ts:78

Get the instance of the authenticator set on the service.

Returns any

getPronunciation

getPronunciation(params: GetPronunciationParams, callback?: Callback<Pronunciation>): Promise<Response<Pronunciation>>

Get pronunciation.

Gets the phonetic pronunciation for the specified word. You can request the pronunciation for a specific format. You can also request the pronunciation for a specific voice to see the default translation for the language of that voice or for a specific custom voice model to see the translation for that voice model.

Note: This method is currently a beta release.

See also: Querying a word from a language.

Parameters

Name Type Attribute Description

params

GetPronunciationParams

The parameters to send to the service.

Properties

Name	Type	Attributes	Description
`text`	string		The word for which the pronunciation is requested.
`customizationId`	string	Optional	The customization ID (GUID) of a custom voice model for which the pronunciation is to be returned. The language of a specified custom model must match the language of the specified voice. If the word is not defined in the specified custom model, the service returns the default translation for the custom model's language. You must make the request with credentials for the instance of the service that owns the custom model. Omit the parameter to see the translation for the specified voice with no customization.
`format`	Format \| string	Optional	The phoneme format in which to return the pronunciation. The Arabic, Chinese, Dutch, and Korean languages support only IPA. Omit the parameter to obtain the pronunciation in the default format.
`headers`	OutgoingHttpHeaders	Optional
`voice`	Voice \| string	Optional	A voice that specifies the language in which the pronunciation is to be returned. All voices for the same language (for example, `en-US`) return the same translation.

callback

Callback<Pronunciation>

Optional

Returns Promise<Response<Pronunciation>>
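
Because some languages support only IPA, it can help to check the requested `format` against the voice before calling the method. The sketch below assumes that the format values are `ipa` and `ibm` (matching the `alphabet` attribute used in the phoneme examples) and that voice names begin with a language code, as in the example names `en-US_AllisonVoice` and `nl-NL_EmmaVoice`. All helper names are hypothetical.

```typescript
type PhonemeFormat = 'ipa' | 'ibm';

// Languages that support only IPA, per the `format` description above:
// Arabic, Chinese, Dutch, and Korean.
// Assumption: voice names start with a language code such as "nl-NL".
const IPA_ONLY_LANGUAGE_PREFIXES = ['ar', 'zh', 'nl', 'ko'];

// Return the phoneme formats that a voice's language is expected to accept.
function allowedFormats(voice: string): PhonemeFormat[] {
  const language = voice.split(/[-_]/)[0].toLowerCase();
  return IPA_ONLY_LANGUAGE_PREFIXES.indexOf(language) >= 0 ? ['ipa'] : ['ipa', 'ibm'];
}

// Assemble the params object, rejecting a format the language does not support.
function buildGetPronunciationParams(text: string, voice: string, format: PhonemeFormat) {
  if (allowedFormats(voice).indexOf(format) < 0) {
    throw new Error(`format "${format}" is not supported for voice ${voice}`);
  }
  return { text, voice, format };
}

const pronunciationParams = buildGetPronunciationParams('IEEE', 'en-US_AllisonVoice', 'ibm');
// With a configured client (assumed to be named textToSpeech):
//   await textToSpeech.getPronunciation(pronunciationParams);
console.log(pronunciationParams);
```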

getVoice

getVoice(params: GetVoiceParams, callback?: Callback<Voice>): Promise<Response<Voice>>

Get a voice.

Gets information about the specified voice. The information includes the name, language, gender, and other details about the voice. Specify a customization ID to obtain information for a custom voice model that is defined for the language of the specified voice. To list information about all available voices, use the List voices method.

See also: Listing a specific voice.

Parameters

Name Type Attribute Description

params

GetVoiceParams

The parameters to send to the service.

Properties

Name	Type	Attributes	Description
`voice`	Voice \| string		The voice for which information is to be returned.
`customizationId`	string	Optional	The customization ID (GUID) of a custom voice model for which information is to be returned. You must make the request with credentials for the instance of the service that owns the custom model. Omit the parameter to see information about the specified voice with no customization.
`headers`	OutgoingHttpHeaders	Optional

callback

Callback<Voice>

Optional

Returns Promise<Response<Voice>>

getVoiceModel

getVoiceModel(params: GetVoiceModelParams, callback?: Callback<VoiceModel>): Promise<Response<VoiceModel>>

Get a custom model.

Gets all information about a specified custom voice model. In addition to metadata such as the name and description of the voice model, the output includes the words and their translations as defined in the model. To see just the metadata for a voice model, use the List custom models method.

Note: This method is currently a beta release.

See also: Querying a custom model.

Parameters

Name Type Attribute Description

params

GetVoiceModelParams

The parameters to send to the service.

Properties

Name	Type	Attributes	Description
`customizationId`	string		The customization ID (GUID) of the custom voice model. You must make the request with credentials for the instance of the service that owns the custom model.
`headers`	OutgoingHttpHeaders	Optional

callback

Callback<VoiceModel>

Optional

Returns Promise<Response<VoiceModel>>

getWord

getWord(params: GetWordParams, callback?: Callback<Translation>): Promise<Response<Translation>>

Get a custom word.

Gets the translation for a single word from the specified custom model. The output shows the translation as it is defined in the model. You must use credentials for the instance of the service that owns a model to list its words.

Note: This method is currently a beta release.

Parameters

Name Type Attribute Description

params

GetWordParams

The parameters to send to the service.

Properties

Name	Type	Attributes	Description
`customizationId`	string		The customization ID (GUID) of the custom voice model. You must make the request with credentials for the instance of the service that owns the custom model.
`word`	string		The word that is to be queried from the custom voice model.
`headers`	OutgoingHttpHeaders	Optional

callback

Callback<Translation>

Optional

Returns Promise<Response<Translation>>

listVoiceModels

listVoiceModels(params?: ListVoiceModelsParams, callback?: Callback<VoiceModels>): Promise<Response<VoiceModels>>

List custom models.

Lists metadata such as the name and description for all custom voice models that are owned by an instance of the service. Specify a language to list the voice models for that language only. To see the words in addition to the metadata for a specific voice model, use the Get a custom model method. You must use credentials for the instance of the service that owns a model to list information about it.

Note: This method is currently a beta release.

See also: Querying all custom models.

Parameters

Name Type Attribute Description

params

ListVoiceModelsParams

Optional

Properties

Name	Type	Attributes	Description
`headers`	OutgoingHttpHeaders	Optional
`language`	Language \| string	Optional	The language for which custom voice models that are owned by the requesting credentials are to be returned. Omit the parameter to see all custom voice models that are owned by the requester.

callback

Callback<VoiceModels>

Optional

Returns Promise<Response<VoiceModels>>

listVoices

listVoices(params?: ListVoicesParams, callback?: Callback<Voices>): Promise<Response<Voices>>

List voices.

Lists all voices available for use with the service. The information includes the name, language, gender, and other details about the voice. The ordering of the list of voices can change from call to call; do not rely on an alphabetized or static list of voices. To see information about a specific voice, use the Get a voice method.

See also: Listing all available voices.

Parameters

Name Type Attribute Description

params

ListVoicesParams

Optional

Properties

Name	Type	Attributes	Description
`headers`	OutgoingHttpHeaders	Optional

callback

Callback<Voices>

Optional

Returns Promise<Response<Voices>>
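
Since the order of the returned voices can change between calls, client code is safer when it selects voices by field and sorts them itself. This sketch assumes each voice exposes `name`, `language`, and `gender` fields (the details the description mentions); `VoiceInfo`, `voicesForLanguage`, and the sample data are hypothetical stand-ins for a real response.

```typescript
// Hypothetical subset of the fields the description mentions for each voice.
interface VoiceInfo {
  name: string;
  language: string;
  gender: string;
}

// Return the voices for one language, sorted by name so that the result
// is stable even if the service returns voices in a different order.
function voicesForLanguage(voices: VoiceInfo[], language: string): VoiceInfo[] {
  return voices
    .filter((v) => v.language === language) // filter returns a new array
    .sort((a, b) => a.name.localeCompare(b.name));
}

// Sample data standing in for the list of voices in a listVoices response.
const sampleVoices: VoiceInfo[] = [
  { name: 'en-US_MichaelVoice', language: 'en-US', gender: 'male' },
  { name: 'de-DE_BirgitVoice', language: 'de-DE', gender: 'female' },
  { name: 'en-US_AllisonVoice', language: 'en-US', gender: 'female' },
];

console.log(voicesForLanguage(sampleVoices, 'en-US').map((v) => v.name));
```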

listWords

listWords(params: ListWordsParams, callback?: Callback<Words>): Promise<Response<Words>>

List custom words.

Lists all of the words and their translations for the specified custom voice model. The output shows the translations as they are defined in the model. You must use credentials for the instance of the service that owns a model to list its words.

Note: This method is currently a beta release.

Parameters

Name Type Attribute Description

params

ListWordsParams

The parameters to send to the service.

Properties

Name	Type	Attributes	Description
`customizationId`	string		The customization ID (GUID) of the custom voice model. You must make the request with credentials for the instance of the service that owns the custom model.
`headers`	OutgoingHttpHeaders	Optional

callback

Callback<Words>

Optional

Returns Promise<Response<Words>>

setServiceUrl

setServiceUrl(url: string): void

Set the service URL to send requests to.

Parameters

Name	Type	Attribute	Description
`url`	string		The base URL for the service.

Returns void

synthesize

synthesize(params: SynthesizeParams, callback?: Callback<ReadableStream | Buffer>): Promise<Response<ReadableStream | Buffer>>

Synthesize audio.

Synthesizes text to audio that is spoken in the specified voice. The service bases its understanding of the language for the input text on the specified voice. Use a voice that matches the language of the input text.

The method accepts a maximum of 5 KB of input text in the body of the request, and 8 KB for the URL and headers. The 5 KB limit includes any SSML tags that you specify. The service returns the synthesized audio stream as an array of bytes.

See also: The HTTP interface.

Audio formats (accept types)

The service can return audio in the following formats (MIME types).

Where indicated, you can optionally specify the sampling rate (rate) of the audio. You must specify a sampling rate for the audio/l16 and audio/mulaw formats. A specified sampling rate must lie in the range of 8 kHz to 192 kHz. Some formats restrict the sampling rate to certain values, as noted.
For the audio/l16 format, you can optionally specify the endianness (endianness) of the audio: endianness=big-endian or endianness=little-endian.

Use the Accept header or the accept parameter to specify the requested format of the response audio. If you omit an audio format altogether, the service returns the audio in Ogg format with the Opus codec (audio/ogg;codecs=opus). The service always returns single-channel audio.

audio/basic - The service returns audio with a sampling rate of 8000 Hz.
audio/flac - You can optionally specify the rate of the audio. The default sampling rate is 22,050 Hz.
audio/l16 - You must specify the rate of the audio. You can optionally specify the endianness of the audio. The default endianness is little-endian.
audio/mp3 - You can optionally specify the rate of the audio. The default sampling rate is 22,050 Hz.
audio/mpeg - You can optionally specify the rate of the audio. The default sampling rate is 22,050 Hz.
audio/mulaw - You must specify the rate of the audio.
audio/ogg - The service returns the audio in the vorbis codec. You can optionally specify the rate of the audio. The default sampling rate is 22,050 Hz.
audio/ogg;codecs=opus - You can optionally specify the rate of the audio. Only the following values are valid sampling rates: 48000, 24000, 16000, 12000, or 8000. If you specify a value other than one of these, the service returns an error. The default sampling rate is 48,000 Hz.
audio/ogg;codecs=vorbis - You can optionally specify the rate of the audio. The default sampling rate is 22,050 Hz.
audio/wav - You can optionally specify the rate of the audio. The default sampling rate is 22,050 Hz.
audio/webm - The service returns the audio in the opus codec. The service returns audio with a sampling rate of 48,000 Hz.
audio/webm;codecs=opus - The service returns audio with a sampling rate of 48,000 Hz.
audio/webm;codecs=vorbis - You can optionally specify the rate of the audio. The default sampling rate is 22,050 Hz.

For more information about specifying an audio format, including additional details about some of the formats, see Audio formats.

Warning messages

If a request includes invalid query parameters, the service returns a Warnings response header that provides messages about the invalid parameters. The warning includes a descriptive message and a list of invalid argument strings. For example, a message such as "Unknown arguments:" or "Unknown url query arguments:" followed by a list of the form "{invalid_arg_1}, {invalid_arg_2}." The request succeeds despite the warnings.

Parameters

Name Type Attribute Description

params

SynthesizeParams

The parameters to send to the service.

Properties

Name	Type	Attributes	Description
`text`	string		The text to synthesize.
`accept`	Accept \| string	Optional	The requested format (MIME type) of the audio. You can use the `Accept` header or the `accept` parameter to specify the audio format. For more information about specifying an audio format, see Audio formats (accept types) in the method description.
`customizationId`	string	Optional	The customization ID (GUID) of a custom voice model to use for the synthesis. If a custom voice model is specified, it works only if it matches the language of the indicated voice. You must make the request with credentials for the instance of the service that owns the custom model. Omit the parameter to use the specified voice with no customization.
`headers`	OutgoingHttpHeaders	Optional
`voice`	Voice \| string	Optional	The voice to use for synthesis.

callback

Callback<ReadableStream | Buffer>

Optional

Returns Promise<Response<ReadableStream | Buffer>>
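
The rate and endianness rules listed under Audio formats can be checked before a request is sent. The sketch below builds an `accept` value of the form `audio/l16;rate=16000;endianness=little-endian` and tests the input text against the 5 KB limit. It is illustrative only: `buildAccept` and `fitsTextLimit` are hypothetical helpers, "5 KB" is assumed to mean 5 * 1024 bytes of UTF-8 text, and formats with a fixed sampling rate are not treated specially.

```typescript
const OPUS_RATES = [48000, 24000, 16000, 12000, 8000];
const RATE_REQUIRED = ['audio/l16', 'audio/mulaw'];
// Assumption: "5 KB" is read here as 5 * 1024 bytes of UTF-8 text.
const MAX_TEXT_BYTES = 5 * 1024;

type Endianness = 'big-endian' | 'little-endian';

// Build an accept value such as "audio/l16;rate=16000;endianness=little-endian".
function buildAccept(format: string, rate?: number, endianness?: Endianness): string {
  if (rate === undefined && RATE_REQUIRED.indexOf(format) >= 0) {
    throw new Error(`${format} requires a sampling rate`);
  }
  if (rate !== undefined) {
    if (rate < 8000 || rate > 192000) {
      throw new Error('rate must be between 8000 and 192000 Hz');
    }
    if (format === 'audio/ogg;codecs=opus' && OPUS_RATES.indexOf(rate) < 0) {
      throw new Error(`rate ${rate} is not valid for the Opus codec`);
    }
  }
  if (endianness !== undefined && format !== 'audio/l16') {
    throw new Error('endianness applies only to audio/l16');
  }
  let accept = format;
  if (rate !== undefined) accept += `;rate=${rate}`;
  if (endianness !== undefined) accept += `;endianness=${endianness}`;
  return accept;
}

// Check the input text (including any SSML tags) against the 5 KB limit.
function fitsTextLimit(text: string): boolean {
  return new TextEncoder().encode(text).length <= MAX_TEXT_BYTES;
}

const synthesizeParams = {
  text: 'Hello world',
  voice: 'en-US_AllisonVoice', // example voice name, assumed to be available
  accept: buildAccept('audio/l16', 16000, 'little-endian'),
};
// With a configured client (assumed to be named textToSpeech):
//   const response = await textToSpeech.synthesize(synthesizeParams);
console.log(synthesizeParams.accept, fitsTextLimit(synthesizeParams.text));
```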

updateVoiceModel

updateVoiceModel(params: UpdateVoiceModelParams, callback?: Callback<Empty>): Promise<Response<Empty>>

Update a custom model.

Updates information for the specified custom voice model. You can update metadata such as the name and description of the voice model. You can also update the words in the model and their translations. Adding a new translation for a word that already exists in a custom model overwrites the word's existing translation. A custom model can contain no more than 20,000 entries. You must use credentials for the instance of the service that owns a model to update it.

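You can define sounds-like or phonetic translations for words. A sounds-like translation consists of one or more words that, when combined, sound like the word. Phonetic translations are based on the SSML phoneme format for representing a word. You can specify them in standard International Phonetic Alphabet (IPA) representation

<phoneme alphabet="ipa" ph="təmˈɑto"></phoneme>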

or in the proprietary IBM Symbolic Phonetic Representation (SPR)

<phoneme alphabet="ibm" ph="1gAstroEntxrYFXs"></phoneme>

Note: This method is currently a beta release.

See also:

[Updating a custom model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsUpdate)
[Adding words to a Japanese custom model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuJapaneseAdd)
[Understanding customization](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customIntro#customIntro).

Parameters

Name Type Attribute Description

params

UpdateVoiceModelParams

The parameters to send to the service.

Properties

Name	Type	Attributes	Description
`customizationId`	string		The customization ID (GUID) of the custom voice model. You must make the request with credentials for the instance of the service that owns the custom model.
`description`	string	Optional	A new description for the custom voice model.
`headers`	OutgoingHttpHeaders	Optional
`name`	string	Optional	A new name for the custom voice model.
`words`	Word[]	Optional	An array of `Word` objects that provides the words and their translations that are to be added or updated for the custom voice model. Pass an empty array to make no additions or updates.

callback

Callback<Empty>

Optional

Returns Promise<Response<Empty>>
