SpeechToTextSession
public class SpeechToTextSession
The IBM Watson Speech to Text service enables you to add speech transcription capabilities to your application. It uses machine intelligence to combine information about grammar and language structure to generate an accurate transcription. Transcriptions are supported for various audio formats and languages.
This class enables fine-tuned control of a WebSockets session with the Speech to Text service.
Although it is a more complex interface than the SpeechToText class, it provides more control and customizability of the session.
-
The URL that shall be used to stream audio for transcription.
Declaration
Swift
public var websocketsURL: String { get set }
-
The default HTTP headers for all requests to the service.
Declaration
Swift
public var defaultHeaders: [String : String]
-
The results of the most recent recognition request.
Declaration
Swift
public var results: SpeechRecognitionResults { get }
-
Invoked when the session connects to the Speech to Text service.
Declaration
Swift
public var onConnect: (() -> Void)? { get set }
-
Invoked with microphone audio when a recording audio queue buffer has been filled. If microphone audio is being compressed, then the audio data is in Opus format. If uncompressed, then the audio data is in 16-bit mono PCM format at 16 kHz.
Declaration
Swift
public var onMicrophoneData: ((Data) -> Void)?
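As a rough illustration of the uncompressed format described above, the following sketch estimates how much audio a buffer holds, assuming 16-bit mono PCM at 16 kHz. The helper name `pcmDuration(of:)` and the two constants are illustrative and are not part of the SDK.

```swift
import Foundation

// Format constants taken from the description above (uncompressed case only).
let sampleRate = 16_000.0   // samples per second
let bytesPerSample = 2.0    // 16-bit samples, one channel

/// Hypothetical helper: approximate duration, in seconds, of a PCM buffer.
func pcmDuration(of data: Data) -> TimeInterval {
    return Double(data.count) / (sampleRate * bytesPerSample)
}

// One second of 16-bit mono audio at 16 kHz occupies 32,000 bytes.
let oneSecond = Data(count: 32_000)
print(pcmDuration(of: oneSecond)) // 1.0
```

A helper like this could be called from an onMicrophoneData closure when compression is disabled. It does not apply to Opus data, whose size does not map linearly to duration.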
-
Invoked every 0.025 s while recording, with the average dB power of the microphone.
Declaration
Swift
public var onPowerData: ((Float32) -> Void)? { get set }
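For driving a level meter, a decibel reading is often converted to a linear value. The sketch below shows one way to do that, under the assumption (not stated on this page) that 0 dB represents full scale and quieter audio is reported as negative values. The helper name `linearLevel(fromDecibels:)` is hypothetical.

```swift
import Foundation

/// Hypothetical helper: map a decibel reading to a linear level in 0...1.
/// Assumes 0 dB is full scale and quieter audio is reported as negative values.
func linearLevel(fromDecibels decibels: Float32) -> Float32 {
    // Standard dB-to-amplitude conversion: ratio = 10^(dB / 20).
    let amplitude = Float32(pow(10.0, Double(decibels) / 20.0))
    // Clamp so that readings above full scale do not overflow the meter.
    return min(max(amplitude, 0), 1)
}

print(linearLevel(fromDecibels: 0))    // 1.0
print(linearLevel(fromDecibels: -20))  // roughly 0.1
```

Since the callback fires every 0.025 s (40 times per second), whatever runs inside an onPowerData closure should be inexpensive.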
-
Invoked when transcription results are received for a recognition request.
Declaration
Swift
public var onResults: ((SpeechRecognitionResults) -> Void)? { get set }
-
Invoked when an error or warning occurs.
Declaration
Swift
public var onError: ((WatsonError) -> Void)? { get set }
-
Invoked when the session disconnects from the Speech to Text service.
Declaration
Swift
public var onDisconnect: (() -> Void)?
-
init(authenticator:model:baseModelVersion:languageCustomizationID:acousticCustomizationID:learningOptOut:endOfPhraseSilenceTime:splitTranscriptAtPhraseEnd:customerID:)
Undocumented
Declaration
Swift
public init(
    authenticator: Authenticator,
    model: String? = nil,
    baseModelVersion: String? = nil,
    languageCustomizationID: String? = nil,
    acousticCustomizationID: String? = nil,
    learningOptOut: Bool? = nil,
    endOfPhraseSilenceTime: Double? = nil,
    splitTranscriptAtPhraseEnd: Bool? = nil,
    customerID: String? = nil)
-
Connect to the Speech to Text service.
If set, the onConnect() callback will be invoked after the session connects to the service.
Declaration
Swift
public func connect()
-
Start a recognition request.
Declaration
Swift
public func startRequest(settings: RecognitionSettings)
Parameters
settings
The configuration to use for this recognition request.
-
Send an audio file to transcribe.
Declaration
Swift
public func recognize(audio: URL)
Parameters
audio
The audio file to transcribe.
-
Send audio data to transcribe.
Declaration
Swift
public func recognize(audio: Data)
Parameters
audio
The audio data to transcribe.
-
Start streaming microphone audio data to transcribe.
By default, microphone audio data is compressed to Opus format to reduce latency and bandwidth. To disable Opus compression and send linear PCM data instead, set compress to false.
If compression is enabled, the recognition request’s contentType setting should be set to “audio/ogg;codecs=opus”. If compression is disabled, then the contentType setting should be set to “audio/l16;rate=16000;channels=1”.
This function may cause the system to automatically prompt the user for permission to access the microphone. Use AVAudioSession.requestRecordPermission(_:) if you would prefer to ask for the user’s permission in advance.
Declaration
Swift
public func startMicrophone(compress: Bool = true)
Parameters
compress
Should microphone audio be compressed to Opus format? (Opus compression reduces latency and bandwidth.)
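Because the contentType setting has to agree with the compress flag, it can help to derive both from a single value. The sketch below does this using the two strings quoted in the description above. The helper name `microphoneContentType(compress:)` is hypothetical and not part of the SDK.

```swift
/// Hypothetical helper: the contentType value that matches the `compress` flag,
/// using the two strings quoted in the description above.
func microphoneContentType(compress: Bool) -> String {
    return compress
        ? "audio/ogg;codecs=opus"              // Opus-compressed microphone audio
        : "audio/l16;rate=16000;channels=1"    // 16-bit mono linear PCM at 16 kHz
}

let compress = false
print(microphoneContentType(compress: compress)) // audio/l16;rate=16000;channels=1
```

The returned string would go into the contentType of the settings passed to startRequest(settings:), and the same Boolean would be passed to startMicrophone(compress:).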
-
Stop streaming microphone audio data to transcribe.
Declaration
Swift
public func stopMicrophone()
-
Stop the recognition request.
Declaration
Swift
public func stopRequest()
-
Wait for any queued recognition requests to complete, then disconnect from the service.
Declaration
Swift
public func disconnect()
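The methods on this page suggest a call order for a microphone session: connect, start a request, start the microphone, then stop the microphone, stop the request, and disconnect. The sketch below illustrates that order without importing the SDK. The `StreamingSession` protocol, the `beginStreaming` and `endStreaming` functions, and the `CallLog` stand-in are all hypothetical names introduced here. The sketch also assumes the calls can be issued back to back, which this page does not confirm.

```swift
/// Hypothetical protocol mirroring the lifecycle methods documented on this page,
/// so the call order can be shown without importing the SDK.
protocol StreamingSession {
    associatedtype Settings
    func connect()
    func startRequest(settings: Settings)
    func startMicrophone(compress: Bool)
    func stopMicrophone()
    func stopRequest()
    func disconnect()
}

/// Open the connection, start a request, then begin streaming microphone audio.
func beginStreaming<S: StreamingSession>(_ session: S, settings: S.Settings, compress: Bool = true) {
    session.connect()
    session.startRequest(settings: settings)
    session.startMicrophone(compress: compress)
}

/// Tear down in the reverse order: microphone, request, then connection.
func endStreaming<S: StreamingSession>(_ session: S) {
    session.stopMicrophone()
    session.stopRequest()
    session.disconnect()
}

/// Stand-in session that only records which methods were called.
final class CallLog: StreamingSession {
    private(set) var calls: [String] = []
    func connect() { calls.append("connect") }
    func startRequest(settings: String) { calls.append("startRequest") }
    func startMicrophone(compress: Bool) { calls.append("startMicrophone") }
    func stopMicrophone() { calls.append("stopMicrophone") }
    func stopRequest() { calls.append("stopRequest") }
    func disconnect() { calls.append("disconnect") }
}

let log = CallLog()
beginStreaming(log, settings: "placeholder settings")
endStreaming(log)
print(log.calls)
// ["connect", "startRequest", "startMicrophone", "stopMicrophone", "stopRequest", "disconnect"]
```

In an application, the same sequence would be called directly on a SpeechToTextSession instance, with a RecognitionSettings value in place of the placeholder string.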