Developing Tizen .NET Applications with Speech-To-Text

By Annie Abraham | August 18, 2017 | Tizen .NET Application, Tizen .NET

This blog explains how to create a simple Speech-To-Text application using the Tizen Speech-To-Text (STT) APIs, and how to design the application UI using XAML. The application converts the speech input provided by the user to text and displays it on a label. The STT APIs used in this app are part of the Tizen C# Native APIs.

Prerequisites

This blog assumes that you understand the structure of a Tizen .NET application and know how to design the UI in a XAML file. If not, refer to the Drumpad application.

Steps

The following are the steps involved in creating the Speech-To-Text application:

Creating the Project
Creating UI for the Application
Creating the STT Instance
Preparing the STT Instance
Binding Button Click Events
Adding the Privilege
Running the Application
Viewing Logs

Creating the Project

First, you need to create a Tizen Xamarin.Forms Single application. To do so, perform the following steps:

On the Visual Studio menu, go to File > New > Project. The New Project screen is displayed.
Go to Templates > Visual C# > Tizen, and select Blank App (Tizen Xamarin.Forms Single).
Enter the Name, Location, and Solution name. For example, in the above screenshot, the application is named “STTDemo”.
Click OK to create the project.

Once the project is created, the general structure of the application is displayed as follows:

Creating UI for the Application

Now that you have created the project, use XAML to create the UI. For information on XAML, refer to eXtensible Application Markup Language (XAML).

To create the UI, perform the following:

In the Solution Explorer pane, right-click STTDemo.Tizen and click Add > New Item. The Add New Item – STTDemo.Tizen screen is displayed.
Select the required XAML template for content page as shown in the following screenshot.
Name the page as MainPage.xaml and click Add.

Replace the code under MainPage.xaml with the following code:

<?xml version="1.0" encoding="utf-8" ?>
<ContentPage xmlns="http://xamarin.com/schemas/2014/forms"
             xmlns:x="http://schemas.microsoft.com/winfx/2009/xaml"
             x:Class="STTDemo.Tizen.MainPage">
    <StackLayout>

        <Label x:Name="_label"
                Text=""
                HorizontalOptions="CenterAndExpand"
                VerticalOptions="CenterAndExpand"
                FontSize="Large"
                HeightRequest="500"
                WidthRequest="700"
                BackgroundColor="White"/>

        <Button x:Name="_playBtn"
                Text="Play"
                HorizontalOptions="CenterAndExpand"
                VerticalOptions="CenterAndExpand"
                BackgroundColor="Blue"/>

        <Button x:Name="_stopBtn"
                Text="Stop"
                HorizontalOptions="CenterAndExpand"
                VerticalOptions="CenterAndExpand"
                IsEnabled="False"
                IsVisible="True"
                BackgroundColor="Blue"/>

    </StackLayout>
</ContentPage>

The above code snippet creates the following:

Label to display the recognized text
Button “Play” to start the speech input
Button “Stop” to stop the speech input

MainPage.xaml has a corresponding MainPage.xaml.cs file, where you handle the events raised when the user clicks the Play and Stop buttons. To make MainPage the main page of the application, assign “MainPage = new MainPage()” in the App constructor in STTDemo.cs.
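As a minimal sketch of that assignment, the App class could look like the following. It assumes that the template generated an App class deriving from Xamarin.Forms.Application and that it lives in the same namespace as MainPage; the file generated for your project may use a different namespace or contain additional members.

```csharp
using Xamarin.Forms;

// Assumption: App shares the STTDemo.Tizen namespace with MainPage.
// Adjust the namespace (or add a using directive) to match your project.
namespace STTDemo.Tizen
{
    public class App : Application
    {
        public App()
        {
            // Use the XAML content page created above as the root page.
            MainPage = new MainPage();
        }
    }
}
```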

The following is the screenshot of the application:

Now that the UI is created, you need to implement the Speech-To-Text functionality.

Creating the STT Instance

The first step is to create an STT instance. Declare _sttInst as a private member of the MainPage class so that the instance is created along with the main page:

private SttClient _sttInst = new SttClient();

Preparing the STT Instance

The STT instance must be prepared before it can be used for speech-to-text conversion. You can do this in the constructor of the MainPage class as follows:

_sttInst.StateChanged += _sttInst_StateChanged;
_sttInst.RecognitionResult += _sttInst_RecognitionResult;
_sttInst.Prepare();

The first two lines register an event handler that reports state changes of the STT object, and another that receives the recognition output once the input speech is processed.

To start the speech recording, the STT object must be in the “Ready” state. Prepare() works asynchronously: the transition to the Ready state is reported through the StateChanged event. Because this transition usually completes quickly, the STT object is normally ready by the time the user provides the speech input.
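If you prefer not to rely on the transition being fast, one option is to drive the buttons from the StateChanged event. The following is a sketch of a variant of the state handler, not the handler used in the rest of this blog. It assumes the State enum of Tizen.Uix.Stt exposes Ready and Recording values, and it marshals the update to the UI thread in case the event is raised elsewhere.

```csharp
using Tizen.Uix.Stt;
using Xamarin.Forms;
// Alias avoids any ambiguity with other types named State.
using SttState = Tizen.Uix.Stt.State;

namespace STTDemo.Tizen
{
    public partial class MainPage : ContentPage
    {
        // Variant handler: Play is usable only while the client is Ready,
        // Stop only while it is Recording.
        private void _sttInst_StateChanged(object sender, StateChangedEventArgs e)
        {
            // Assumption: the event may arrive off the UI thread.
            Device.BeginInvokeOnMainThread(() =>
            {
                _playBtn.IsEnabled = e.Current == SttState.Ready;
                _stopBtn.IsEnabled = e.Current == SttState.Recording;
            });
        }
    }
}
```

With this variant, you would also set IsEnabled="False" on the Play button in MainPage.xaml so that it stays disabled until the first Ready notification arrives.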

Binding Button Click Events

In the MainPage.xaml.cs file, after the InitializeComponent() function, add event handlers for the Play and Stop buttons as shown in the following code example:

_playBtn.Clicked += (sender, e) =>
{
    try
    {
        _label.Text = "";
        _sttInst.Start("en_US", RecognitionType.Free);
        _playBtn.IsEnabled = false;
        _stopBtn.IsEnabled = true;
    }
    catch (Exception ex)
    {
        global::Tizen.Log.Error("STTDemo", ex.ToString());
    }
};

_stopBtn.Clicked += (sender, e) =>
{
    try
    {
        _sttInst.Stop();
        _stopBtn.IsEnabled = false;
        _playBtn.IsEnabled = true;
    }
    catch (Exception ex)
    {
        global::Tizen.Log.Error("STTDemo", ex.ToString());
    }
};

In the above code example, the recognition language is set to US English (“en_US”) and the recognition type to RecognitionType.Free.
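The language code is hard-coded in this demo. If you want to guard against an engine that does not offer “en_US”, a small helper such as the following sketch can pick a language before calling Start(). The SttLanguage class and its Choose method are hypothetical names introduced here for illustration; the commented usage assumes the GetSupportedLanguages() method and DefaultLanguage property of SttClient.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the Tizen API.
public static class SttLanguage
{
    // Returns 'preferred' if the engine lists it, otherwise 'fallback'.
    public static string Choose(IEnumerable<string> supported,
                                string preferred,
                                string fallback)
    {
        if (supported == null)
        {
            return fallback;
        }

        // Compare case-insensitively so "en_us" matches "en_US".
        return supported.Contains(preferred, StringComparer.OrdinalIgnoreCase)
            ? preferred
            : fallback;
    }
}

// Possible usage inside the Play handler (assumed SttClient members):
//   string lang = SttLanguage.Choose(_sttInst.GetSupportedLanguages(),
//                                    "en_US", _sttInst.DefaultLanguage);
//   _sttInst.Start(lang, RecognitionType.Free);
```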

The Play button functionality can be described as follows:

Start the speech input
Toggle the Play and Stop buttons

The Stop button functionality can be described as follows:

Stop the speech input
Toggle the Play and Stop buttons

The following code example provides the complete MainPage class in MainPage.xaml.cs:

using System;
using Tizen.Uix.Stt;
using Xamarin.Forms;
using Xamarin.Forms.Xaml;

namespace STTDemo.Tizen
{
    [XamlCompilation(XamlCompilationOptions.Compile)]
    public partial class MainPage : ContentPage
    {
        private SttClient _sttInst = new SttClient();
        public MainPage()
        {
            InitializeComponent();
            _sttInst.StateChanged += _sttInst_StateChanged;
            _sttInst.RecognitionResult += _sttInst_RecognitionResult;
            _sttInst.Prepare();

            _playBtn.Clicked += (sender, e) =>
            {
                try
                {
                    _label.Text = "";
                    _sttInst.Start("en_US", RecognitionType.Free);
                    _playBtn.IsEnabled = false;
                    _stopBtn.IsEnabled = true;
                }
                catch (Exception ex)
                {
                    global::Tizen.Log.Error("STTDemo", ex.ToString());
                }
            };

            _stopBtn.Clicked += (sender, e) =>
            {
                try
                {
                    _sttInst.Stop();
                    _stopBtn.IsEnabled = false;
                    _playBtn.IsEnabled = true;
                }
                catch (Exception ex)
                {
                    global::Tizen.Log.Error("STTDemo", ex.ToString());
                }
            };
        }

        private void _sttInst_RecognitionResult(object sender, RecognitionResultEventArgs e)
        {
            global::Tizen.Log.Info("STTDemo", " Result:" + e.Result + " Message:" + e.Message + " DataCount:" + e.DataCount);
            if(e.Result == ResultEvent.FinalResult)
            {
                if(e.DataCount > 0)
                {
                    _label.Text = String.Concat(e.Data);
                }
                else
                {
                    _label.Text = "No Recognized Text";
                }
            }
        }

        ~MainPage()
        {
            _sttInst.Unprepare();
            _sttInst.StateChanged -= _sttInst_StateChanged;
            _sttInst.RecognitionResult -= _sttInst_RecognitionResult;
        }

        private void _sttInst_StateChanged(object sender, StateChangedEventArgs e)
        {
            global::Tizen.Log.Info("STTDemo", " Previous:" + e.Previous + " Current:" + e.Current);
        }
    }
}
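In the RecognitionResult handler above, String.Concat joins the recognized segments without any separator. If the engine returns the result in several segments, you may prefer to join them with spaces. The following sketch shows one way to do that; RecognitionText and Format are hypothetical names introduced here, and whether the engine returns one or several segments is not something this blog verifies.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the Tizen API.
public static class RecognitionText
{
    public const string Fallback = "No Recognized Text";

    // Joins non-empty segments with single spaces; returns the
    // fallback message when nothing usable was recognized.
    public static string Format(IEnumerable<string> segments)
    {
        if (segments == null)
        {
            return Fallback;
        }

        var parts = segments
            .Where(s => !string.IsNullOrWhiteSpace(s))
            .Select(s => s.Trim());

        string text = string.Join(" ", parts);
        return text.Length > 0 ? text : Fallback;
    }
}
```

In the handler, the two branches that set _label.Text could then be replaced by a single assignment such as _label.Text = RecognitionText.Format(e.Data).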

Adding the Privilege

The STT APIs require the “http://tizen.org/privilege/recorder” privilege. This Tizen privilege enables the application to record video and audio. You can add this privilege to the tizen-manifest.xml file. To do so, perform the following steps:

1. In the Solution Explorer pane, go to STTDemo.Tizen project > tizen-manifest.xml.

2. Right-click the tizen-manifest.xml file and click Open.

3. Click Privileges > Add and select Custom Privileges.

4. In the Custom Privileges text box, enter http://tizen.org/privilege/recorder.

5. Click OK to add the privilege.

Running the Application

Press Ctrl+F5 to launch the application on the device or emulator.

Click the Play button and provide the speech input. When you have finished speaking, click the Stop button. The speech is converted to text only after you click the Stop button. The STT engine converts the speech to text and delivers the result through the RecognitionResult event. The recognized text is displayed on the label as shown in the following screenshot.