Make the first API call

This section describes the order in which to call the SpeechToText (STT) APIs and gives examples of scenarios that can be implemented with them. The SpeechToText SDK offers three modes: SERVER, ON_DEVICE, and HYBRID. If the app does not set a mode, HYBRID is the default.

API call order

Initialize SDK
Add Listener
Register App
Remove Listener
Release SDK

Initialize SDK

To start using the SpeechToText SDK, create a SpeechToText instance with the mode the app needs, then call the initialize API.

Kotlin
Java

enum class Mode {
    HYBRID,   // If the network is not connected, the SDK will connect to OnDevice speech recognition as needed.
    SERVER,   // Connects to speech recognition only when the network is available.
    ON_DEVICE // Connects to OnDevice speech recognition.
}

val speechToText = SpeechToText(context, Mode.HYBRID) // Mode must be set

speechToText.initialize()

public enum Mode {
    HYBRID,   // If the network is not connected, the SDK will connect to OnDevice speech recognition as needed.
    SERVER,   // Connects to speech recognition only when the network is available.
    ON_DEVICE // Connects to OnDevice speech recognition.
}

SpeechToText speechToText = new SpeechToText(context, Mode.HYBRID);  // Mode must be set

speechToText.initialize();
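
The comments on the Mode values describe how each mode reacts to network availability. The following is a minimal sketch of that decision, not SDK code: the Mode enum is redeclared here only so the example compiles on its own, and Engine and ModeResolver are hypothetical names. It also assumes that HYBRID prefers server recognition whenever the network is available, which the comments imply but do not state.

```java
/** Stand-in for the SDK's Mode enum, redeclared so this sketch compiles on its own. */
enum Mode { HYBRID, SERVER, ON_DEVICE }

/** Hypothetical labels for the recognizer that ends up being used. */
enum Engine { SERVER, ON_DEVICE, UNAVAILABLE }

final class ModeResolver {
    private ModeResolver() {}

    /** Maps a configured mode and the current network state to a recognizer. */
    static Engine resolve(Mode mode, boolean networkAvailable) {
        switch (mode) {
            case SERVER:
                // Server recognition is only possible while the network is up.
                return networkAvailable ? Engine.SERVER : Engine.UNAVAILABLE;
            case ON_DEVICE:
                // On-device recognition does not depend on the network.
                return Engine.ON_DEVICE;
            case HYBRID:
            default:
                // Assumption: HYBRID uses the server when it can and falls back otherwise.
                return networkAvailable ? Engine.SERVER : Engine.ON_DEVICE;
        }
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + ": online=" + resolve(mode, true)
                    + ", offline=" + resolve(mode, false));
        }
    }
}
```

An app could use a check like this to decide whether to warn the user before starting recognition in SERVER mode without a connection.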

Add Listener

Before registering the app, call the addListener API to register a callback for SpeechToText events.

Kotlin
Java

speechToText.addListener(object: ResultListener { 
    override fun onUpdated(stt: String, completed: Boolean) {
        // stt : recognized text result
        // completed : True if the user's voice recognition has been completed.
    }
    
    override fun onUpdatedEpdData(on: Long, off: Long) {
        // on  : buffer size, in bytes, at the point where the user's speech starts
        // off : buffer size, in bytes, at the point where the user's speech ends
    }

    override fun onStartedRecognition() {
        // Notifies that recognition of the user's voice has started.
    }

    override fun onEndedRecognition() {
        // Notifies that the user's voice recognition has ended.
    }

    override fun onError() {
        // Notifies that an error has occurred during voice recognition.
    }

    override fun onReady() {
        // Notifies that the user can speak through the microphone.
    }
})

speechToText.addListener(new ResultListener() {
    @Override
    public void onUpdated(String stt, boolean completed) {
        // stt : recognized text result
        // completed : true if the user's voice recognition has been completed.
    }

    @Override
    public void onUpdatedEpdData(Long on, Long off) {
        // on  : buffer size, in bytes, at the point where the user's speech starts
        // off : buffer size, in bytes, at the point where the user's speech ends
    }

    @Override
    public void onStartedRecognition() {
        // Notifies that recognition of the user's voice has started.
    }

    @Override
    public void onEndedRecognition() {
        // Notifies that the user's voice recognition has ended.
    }

    @Override
    public void onError() {
        // Notifies that an error has occurred during voice recognition.
    }

    @Override
    public void onReady() {
        // Notifies that the user can speak through the microphone.
    }
});
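
The on and off values delivered to onUpdatedEpdData are byte counts, so an app that wants times has to convert them. The sketch below shows one way to do that. It assumes that 3200 bytes correspond to 100 ms of audio, the chunk size and interval used in the audio-sending example later in this section; check that this holds for your audio format. EpdOffsets and its methods are hypothetical helpers, not part of the SDK.

```java
/** Hypothetical helper that converts end-point detection byte counts to milliseconds. */
final class EpdOffsets {
    // Assumption: 3200 bytes per 100 ms of audio, i.e. 32 bytes per millisecond.
    static final long BYTES_PER_MS = 3200L / 100L;

    private EpdOffsets() {}

    /** Converts a byte count into whole milliseconds of audio (rounded down). */
    static long bytesToMillis(long byteCount) {
        if (byteCount < 0) {
            throw new IllegalArgumentException("byteCount must not be negative");
        }
        return byteCount / BYTES_PER_MS;
    }

    /** Length of the detected speech segment between the on and off points. */
    static long speechDurationMillis(long on, long off) {
        if (off < on) {
            throw new IllegalArgumentException("off must not be smaller than on");
        }
        return bytesToMillis(off - on);
    }

    public static void main(String[] args) {
        // Values from the example comment "on(0) off(3200)" in this section.
        System.out.println(speechDurationMillis(0L, 3200L) + " ms");
    }
}
```

Inside onUpdatedEpdData, the app would pass the received on and off values to speechDurationMillis.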

Register App

Call the registerApp API to register the app with the SpeechToText server. In ON_DEVICE mode, this call is not needed.

Kotlin
Java

val clientID = "your_client_id"
val clientSecret = "your_client_secret"
speechToText.registerApp(clientID, clientSecret, object : OnServerConnectionListener {
    override fun onConnected() {
        // The app can receive events once registration with the Speech server is complete.
    }

    override fun onFailed(msg: String) {
        // If registration to the Speech server fails, an event with an error message is sent to the app.
    }
})

String clientID = "your_client_id";
String clientSecret = "your_client_secret";

speechToText.registerApp(clientID, clientSecret, new OnServerConnectionListener() {
    @Override
    public void onConnected() {
        // The app can receive events once registration with the Speech server is complete.
    }

    @Override
    public void onFailed(String msg) {
        // If registration to the Speech server fails, an event with an error message is sent to the app.
    }
});

Remove Listener

Before the app exits, call the removeListener API to release the event callback that was registered with addListener.

Kotlin
Java

speechToText.removeListener(listener) // listener: the ResultListener instance passed to addListener

speechToText.removeListener(listener); // listener: the ResultListener instance passed to addListener

Release SDK

Call the release API to release the SpeechToText SDK resources.

Kotlin
Java

speechToText.release()

speechToText.release();

Examples of API usage scenarios

1. Start/Stop SpeechToText voice input

The following example starts and stops SpeechToText voice input.

Kotlin
Java

// Register the listener first so that no event is missed.
speechToText.addListener(object: ResultListener {
    override fun onUpdated(stt: String, completed: Boolean) {
        // Called repeatedly while the user speaks, for example:
        // e.g. no
        // e.g. hello
        // e.g. hello
        // e.g. hello
        // e.g. hello. complete(true)
    }

    override fun onUpdatedEpdData(on: Long, off: Long) {
        // e.g. on(0) off(3200)
    }

    override fun onStartedRecognition() {
    }

    override fun onEndedRecognition() {
    }

    override fun onError() {
    }

    override fun onReady() {
    }
})

// Start voice input.
speechToText.request()

// Stop voice input.
speechToText.stop()

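// Register the listener first so that no event is missed.
speechToText.addListener(new ResultListener() {
    @Override
    public void onUpdated(String stt, boolean completed) {
        // Called repeatedly while the user speaks, for example:
        // e.g. no
        // e.g. hello
        // e.g. hello
        // e.g. hello
        // e.g. hello. complete(true)
    }

    @Override
    public void onUpdatedEpdData(Long on, Long off) {
        // e.g. on(0) off(3200)
    }

    @Override
    public void onStartedRecognition() {
    }

    @Override
    public void onEndedRecognition() {
    }

    @Override
    public void onError() {
    }

    @Override
    public void onReady() {
    }
});

// Start voice input.
speechToText.request();

// Stop voice input.
speechToText.stop();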
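
The example comments show onUpdated being called several times with partial text before a final call with completed set to true. The following is a minimal sketch of how an app might keep track of that stream. TranscriptCollector is a hypothetical helper, not part of the SDK, and it assumes each call carries the full text of the current utterance so far, so a newer partial result replaces the previous one rather than being appended to it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that separates partial results from completed utterances. */
final class TranscriptCollector {
    private final List<String> finalized = new ArrayList<>();
    private String partial = "";

    /** Intended to be called from ResultListener.onUpdated(stt, completed). */
    void onUpdated(String stt, boolean completed) {
        if (completed) {
            // The utterance is done: store it and clear the in-progress text.
            finalized.add(stt);
            partial = "";
        } else {
            // Assumption: a partial result supersedes the previous partial result.
            partial = stt;
        }
    }

    /** Text of the utterance that is still being recognized, or "" if none. */
    String currentPartial() {
        return partial;
    }

    /** Copy of all utterances that were reported as completed, in order. */
    List<String> finalizedUtterances() {
        return new ArrayList<>(finalized);
    }

    public static void main(String[] args) {
        TranscriptCollector collector = new TranscriptCollector();
        collector.onUpdated("no", false);
        collector.onUpdated("hello", false);
        collector.onUpdated("hello.", true);
        System.out.println(collector.finalizedUtterances());
    }
}
```

A UI could show currentPartial() as live text and move it into a transcript view once the utterance is finalized.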

2. Sending audio data

The following example sends audio data. The data can be sent in a single call, or periodically at intervals of at least 100 ms. When sending data periodically, call the completeAudioSend API once the transmission is finished.

Kotlin
Java

- To send the data in a single call
context.assets.open("test_audio.pcm").use { input ->
    speechToText.sendAudio(input.readBytes())
}

- To send the data in multiple chunks
val BYTES_IN_100_MS = 3200 // chunk size used by this example
var offset = 0
val testPCM = context.assets.open("call_test_audio.pcm").use { it.readBytes() }

while (offset < testPCM.size) {
    // The last chunk may be shorter than BYTES_IN_100_MS.
    val end = (offset + BYTES_IN_100_MS).coerceAtMost(testPCM.size)
    val chunk = testPCM.copyOfRange(offset, end)
    offset += BYTES_IN_100_MS
    speechToText.sendAudio(chunk)
    Thread.sleep(100L) // wait 100 ms before sending the next chunk
}

- After sending all the data, call the completeAudioSend API.
speechToText.completeAudioSend()

- To send the data in a single call
// readAllBytes() requires Java 9 or later; open() can throw IOException.
byte[] data = context.getAssets().open("test_audio.pcm").readAllBytes();
speechToText.sendAudio(data);

- To send the data in multiple chunks
int offset = 0;
byte[] testPCM = context.getAssets().open("call_test_audio.pcm").readAllBytes();

while (offset < testPCM.length) {
    // The last chunk may be shorter than 3200 bytes.
    int end = Math.min(offset + 3200, testPCM.length);
    byte[] chunk = Arrays.copyOfRange(testPCM, offset, end); // needs import java.util.Arrays
    offset += 3200;
    speechToText.sendAudio(chunk);
    Thread.sleep(100L); // wait 100 ms before the next chunk; throws InterruptedException
}

- After sending all the data, call the completeAudioSend API.
speechToText.completeAudioSend();
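
The chunking step in the loop above can be checked separately from the SDK. The following sketch isolates it: PcmChunker is a hypothetical helper, not part of the SDK, and the 3200-byte chunk size is taken from the example above. Each returned chunk would then be passed to sendAudio, with a pause of 100 ms between calls and completeAudioSend at the end.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that splits a PCM buffer into fixed-size chunks. */
final class PcmChunker {
    // Chunk size used in the example above for one 100 ms send interval.
    static final int BYTES_IN_100_MS = 3200;

    private PcmChunker() {}

    /** Splits pcm into chunks of chunkSize bytes; the last chunk may be shorter. */
    static List<byte[]> split(byte[] pcm, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < pcm.length; offset += chunkSize) {
            // Clamp the end index so the final chunk does not run past the buffer.
            int end = Math.min(offset + chunkSize, pcm.length);
            chunks.add(Arrays.copyOfRange(pcm, offset, end));
        }
        return chunks;
    }

    public static void main(String[] args) {
        byte[] fakeAudio = new byte[7000]; // placeholder data instead of a real PCM file
        List<byte[]> chunks = split(fakeAudio, BYTES_IN_100_MS);
        System.out.println(chunks.size() + " chunks, last has "
                + chunks.get(chunks.size() - 1).length + " bytes");
    }
}
```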
