Make the first API call

This section describes the order of SpeechToText (STT) API calls and gives example scenarios that can be implemented with the API. The SpeechToText SDK offers SERVER, ON_DEVICE, and HYBRID modes; if the app does not set a mode, HYBRID is the default.

API call order

Initialize SDK

To start the SpeechToText SDK, create an instance with the desired mode and then call the initialize API.


enum class Mode {
    HYBRID,    // If the network is not connected, the SDK connects to on-device speech recognition as needed.
    SERVER,    // Connects to speech recognition only when the network is available.
    ON_DEVICE  // Connects to on-device speech recognition.
}

val speechToText = SpeechToText(context, Mode.HYBRID) // Mode must be set

speechToText.initialize()

Add Listener

Before registering the SpeechToText app, call the addListener API to register the SpeechToText event callback.

// Keep a reference to the listener so the same instance can be passed to removeListener later.
val listener = object : ResultListener {
    override fun onUpdated(stt: String, completed: Boolean) {
        // stt: recognized text result
        // completed: true if recognition of the user's speech has been completed
    }

    override fun onUpdatedEpdData(on: Long, off: Long) {
        // on: buffer size, in bytes, at the start point of the user's voice
        // off: buffer size, in bytes, at the end point of the user's voice
    }

    override fun onStartedRecognition() {
        // Notifies that the user's voice has started.
    }

    override fun onEndedRecognition() {
        // Notifies that the user's voice recognition has ended.
    }

    override fun onError() {
        // Notifies that an error has occurred during voice recognition.
    }

    override fun onReady() {
        // Notifies that the user can speak through the microphone.
    }
}
speechToText.addListener(listener)
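
The on and off values are byte counts, so an app may want to turn them into time offsets. The following is a minimal sketch, not part of the SDK. It assumes that 3200 bytes of audio correspond to 100 ms, the ratio used by the chunked-send example on this page, and that on and off are byte offsets into that same audio stream. The names epdBytesToMillis, speechDurationMillis, and assumedBytesPer100Ms are hypothetical.

```kotlin
// Assumption: 3200 bytes of audio correspond to 100 ms, as in the chunked-send
// example on this page. Adjust this value if your audio format differs.
val assumedBytesPer100Ms = 3200L

// Converts a byte offset reported by onUpdatedEpdData into milliseconds.
fun epdBytesToMillis(bytes: Long, bytesPer100Ms: Long = assumedBytesPer100Ms): Long {
    require(bytes >= 0) { "byte offset must be non-negative" }
    require(bytesPer100Ms > 0) { "bytesPer100Ms must be positive" }
    return bytes * 100 / bytesPer100Ms
}

// Length of the detected speech segment in milliseconds (hypothetical helper).
fun speechDurationMillis(on: Long, off: Long): Long =
    epdBytesToMillis(off) - epdBytesToMillis(on)
```

Inside onUpdatedEpdData, calling speechDurationMillis(on, off) would then give the approximate length of the detected speech under these assumptions.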

Register App

Call the registerApp API to register the SpeechToText app with the SpeechToText server. In ON_DEVICE mode, this call is not required.

val clientID = "your_client_id"
val clientSecret = "your_client_secret"
speechToText.registerApp(clientID, clientSecret, object : OnServerConnectionListener {
    override fun onConnected() {
        // The app can receive events once registration with the Speech server is complete.
    }

    override fun onFailed(msg: String) {
        // If registration with the Speech server fails, an event with an error message is sent to the app.
    }
})

Remove Listener

Before exiting the SpeechToText app, release the previously registered event callback. To release the event callback, call the removeListener API.

// listener is the same ResultListener instance that was passed to addListener.
speechToText.removeListener(listener)

Release SDK

Call the release API to release the SpeechToText SDK resources.

speechToText.release()
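
The call order above can be sketched as a small wrapper that always cleans up. This is an illustration only: SttClient, SketchMode, withSttSession, and RecordingClient are hypothetical stand-ins written so the sketch compiles without the SDK, and registerApp is simplified to a synchronous call, whereas the SDK call reports its result through OnServerConnectionListener.

```kotlin
// Hypothetical stand-ins for the SDK types, so the sketch compiles on its own.
enum class SketchMode { HYBRID, SERVER, ON_DEVICE }

interface SttClient {
    fun initialize()
    fun addListener(listener: Any)
    fun registerApp(clientId: String, clientSecret: String)
    fun removeListener(listener: Any)
    fun release()
}

// Runs the documented call order around a block of work and always cleans up.
fun <T> withSttSession(
    client: SttClient,
    mode: SketchMode,
    listener: Any,
    clientId: String,
    clientSecret: String,
    block: () -> T
): T {
    client.initialize()
    client.addListener(listener)
    try {
        // Registration is skipped in ON_DEVICE mode, where it is not required.
        if (mode != SketchMode.ON_DEVICE) client.registerApp(clientId, clientSecret)
        return block()
    } finally {
        // Remove the same listener instance that was added, then release.
        client.removeListener(listener)
        client.release()
    }
}

// Fake client that records calls, used only to show the resulting order.
class RecordingClient : SttClient {
    val calls = mutableListOf<String>()
    override fun initialize() { calls.add("initialize") }
    override fun addListener(listener: Any) { calls.add("addListener") }
    override fun registerApp(clientId: String, clientSecret: String) { calls.add("registerApp") }
    override fun removeListener(listener: Any) { calls.add("removeListener") }
    override fun release() { calls.add("release") }
}
```

Putting removeListener and release in a finally block means the listener is removed and resources are released even if the work in between throws.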

Examples of API usage scenarios

1. Start/Stop SpeechToText voice input

The following example starts and stops SpeechToText voice input.

// Register the listener before requesting input so that no events are missed.
speechToText.addListener(object : ResultListener {
    override fun onUpdated(stt: String, completed: Boolean) {
        // e.g. no
        // e.g. hello
        // e.g. hello
        // e.g. hello
        // e.g. hello. complete(true)
    }

    override fun onUpdatedEpdData(on: Long, off: Long) {
        // e.g. on(0) off(3200)
    }

    override fun onStartedRecognition() {
    }

    override fun onEndedRecognition() {
    }

    override fun onError() {
    }

    override fun onReady() {
    }
})

// Start voice input.
speechToText.request()

// Stop voice input.
speechToText.stop()
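
The onUpdated callback can fire several times for one utterance, so an app usually keeps the latest partial text and stores it once completed is true. The following is a minimal sketch, assuming that each update carries the full current text for the utterance and not an incremental delta, which is how the example values above read. TranscriptCollector is a hypothetical helper, not an SDK class.

```kotlin
// Collects recognition updates. Assumes each update carries the full current
// hypothesis for the utterance, not an incremental delta.
class TranscriptCollector {
    private val finalized = mutableListOf<String>()

    // Latest partial hypothesis for the utterance still in progress.
    var partial: String = ""
        private set

    // Finalized utterances in arrival order (defensive copy).
    val utterances: List<String>
        get() = finalized.toList()

    // Call this from ResultListener.onUpdated(stt, completed).
    fun onUpdated(stt: String, completed: Boolean) {
        if (completed) {
            finalized.add(stt)
            partial = ""
        } else {
            partial = stt
        }
    }
}
```

A UI could show partial while the user is speaking and append to its transcript view whenever utterances grows.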

2. Sending audio data

The following example sends audio data. The data can be sent in a single call or in multiple chunks, with an interval of at least 100 ms between sends. When sending data in chunks, call the completeAudioSend API at the end of the transmission.

- To send the data in a single call
// use {} closes the input stream after the bytes have been read.
val pcm = context.assets.open("test_audio.pcm").use { it.readBytes() }
speechToText.sendAudio(pcm)

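- To send the data in multiple chunks
val BYTES_IN_100_MS = 3200 // chunk size used in this example for 100 ms of audio
val testPCM = assets.open("call_test_audio.pcm").use { it.readBytes() }
var offset = 0

while (offset < testPCM.size) {
    // The last chunk may be shorter than BYTES_IN_100_MS.
    val end = (offset + BYTES_IN_100_MS).coerceAtMost(testPCM.size)
    val chunk = testPCM.copyOfRange(offset, end)
    offset = end
    speechToText.sendAudio(chunk)
    Thread.sleep(100L) // wait 100 ms before sending the next chunk
}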

- After sending all data, call the completeAudioSend API.
speechToText.completeAudioSend()
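
The chunked-send steps above can also be factored into helpers that are testable without the SDK. This is a sketch under stated assumptions: splitIntoChunks and sendInChunks are hypothetical names, and the send, complete, and pause parameters are hooks that an app would wire to speechToText.sendAudio, speechToText.completeAudioSend, and Thread.sleep. The default chunk size of 3200 bytes and the 100 ms interval are taken from the example above.

```kotlin
// Splits PCM data into fixed-size chunks; the last chunk may be shorter.
fun splitIntoChunks(data: ByteArray, chunkSize: Int): List<ByteArray> {
    require(chunkSize > 0) { "chunkSize must be positive" }
    val chunks = mutableListOf<ByteArray>()
    var offset = 0
    while (offset < data.size) {
        val end = minOf(offset + chunkSize, data.size)
        chunks.add(data.copyOfRange(offset, end))
        offset = end
    }
    return chunks
}

// Sends every chunk, pausing after each send, then signals completion.
// send, complete, and pause are hypothetical hooks supplied by the caller.
fun sendInChunks(
    data: ByteArray,
    chunkSize: Int = 3200,
    intervalMs: Long = 100L,
    send: (ByteArray) -> Unit,
    complete: () -> Unit,
    pause: (Long) -> Unit = { Thread.sleep(it) }
) {
    for (chunk in splitIntoChunks(data, chunkSize)) {
        send(chunk)
        pause(intervalMs)
    }
    complete()
}
```

In an app, this might be called as sendInChunks(testPCM, send = { speechToText.sendAudio(it) }, complete = { speechToText.completeAudioSend() }), preferably off the main thread because the default pause blocks.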