Speech Recognition


Use the following procedure to implement speech recognition in your application:

1. Identify an Audio Source (OPTIONAL): Use the PXC[M]AudioSource interface to enumerate and select an input audio device, as illustrated in Example 72.

C++ Example 72: Select an Audio Source

// session is a PXCSession instance.

PXCAudioSource *source=session->CreateAudioSource();

 

// Scan and Enumerate audio devices

source->ScanDevices();

PXCAudioSource::DeviceInfo dinfo;

for (int d=source->QueryDeviceNum()-1;d>=0;d--) {

   source->QueryDeviceInfo(d, &dinfo);

 

   // Select one and break out of the loop

   ...

}

 

// Set the active device

source->SetDevice(&dinfo);

C# Example 72: Select an Audio Source

// session is a PXCMSession instance.

PXCMAudioSource source=session.CreateAudioSource();

 

// Scan and Enumerate audio devices

source.ScanDevices();

PXCMAudioSource.DeviceInfo dinfo;

for (int d=source.QueryDeviceNum()-1;d>=0;d--) {

   source.QueryDeviceInfo(d, out dinfo);

 

   // Select one and break out of the loop

   ...

}

 

// Set the active device

source.SetDevice(dinfo);

Java Example 72: Select an Audio Source

// session is a PXCMSession instance.

PXCMAudioSource source=session.CreateAudioSource();

 

// Scan and Enumerate audio devices

source.ScanDevices();

PXCMAudioSource.DeviceInfo dinfo=new PXCMAudioSource.DeviceInfo();

for (int d=source.QueryDeviceNum()-1;d>=0;d--) {

   source.QueryDeviceInfo(d, dinfo);

 

   // Select one and break out of the loop

   ...

}

 

// Set the active device

source.SetDevice(dinfo);
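
The selection step inside the loop is left to the application. As one possible criterion, the sketch below picks the first device whose name contains a given substring. It is a self-contained illustration rather than SDK code: FindDeviceByName is a hypothetical helper, and the vector of strings stands in for the device names the application would collect while enumerating (this assumes the device information includes a human-readable name).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of the SDK.
// `names` stands in for the device names gathered during enumeration.
// Returns the index of the first name containing `wanted`, or -1 if none does.
int FindDeviceByName(const std::vector<std::wstring>& names,
                     const std::wstring& wanted) {
    // An empty pattern would match every device, which is most likely a caller bug.
    assert(!wanted.empty());
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (names[i].find(wanted) != std::wstring::npos) {
            return static_cast<int>(i);
        }
    }
    return -1;  // no matching device
}
```

The returned index could then be used to query the matching device information again before setting the active device. A result of -1 should be treated as "no suitable device found".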

The SDK audio source also supports reading from an audio file. The audio file can be in any system-supported audio format, such as WAV, MP3, or WMV. Example 73 shows how to set up the audio source to read from an audio file.

C++ Example 73: Select a File Audio Source

// session is a PXCSession instance.

PXCAudioSource *source=session->CreateAudioSource();

 

// Set the audio file

PXCAudioSource::DeviceInfo dinfo={};

wcscpy_s<sizeof(dinfo.did)/sizeof(pxcCHAR)>(dinfo.did,L"my_audio_file.wav");

 

// Set the active device

source->SetDevice(&dinfo);

C# Example 73: Select a File Audio Source

// session is a PXCMSession instance.

PXCMAudioSource source=session.CreateAudioSource();

 

// Set the audio file

PXCMAudioSource.DeviceInfo dinfo=new PXCMAudioSource.DeviceInfo();

dinfo.did="my_audio_file.wav";

 

// Set the active device

source.SetDevice(dinfo);

Java Example 73: Select a File Audio Source

// session is a PXCMSession instance.

PXCMAudioSource source=session.CreateAudioSource();

 

// Set the audio file

PXCMAudioSource.DeviceInfo dinfo=new PXCMAudioSource.DeviceInfo();

dinfo.did="my_audio_file.wav";

 

// Set the active device

source.SetDevice(dinfo);
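
The C++ variant of Example 73 copies the file name into a fixed-size character array, so a path that is too long needs to be handled somehow. The sketch below shows one portable way to do a bounded copy that reports failure instead of truncating. CopyDeviceId is a hypothetical helper, not an SDK function, and it takes the buffer size from the array type rather than assuming any particular length for the did field.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the SDK.
// Copies `path` into the fixed-size buffer `dest`, including the terminator.
// Returns false and leaves `dest` untouched if the path does not fit.
template <std::size_t N>
bool CopyDeviceId(wchar_t (&dest)[N], const std::wstring& path) {
    if (path.size() >= N) {
        return false;  // no room for the terminating null character
    }
    std::copy(path.begin(), path.end(), dest);
    dest[path.size()] = L'\0';
    return true;
}
```

An application could call this helper on the device identifier buffer and skip the SetDevice call when it returns false.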

2. Locate the Module Implementation: Use the CreateImpl function to create an instance of the PXC[M]SpeechRecognition interface, as illustrated in Example 74.

C++ Example 74: Create a Speech Recognition Instance

PXCSpeechRecognition *sr=0;

session->CreateImpl<PXCSpeechRecognition>(&sr);

C# Example 74: Create a Speech Recognition Instance

PXCMSpeechRecognition sr;

session.CreateImpl<PXCMSpeechRecognition>(out sr);

Java Example 74: Create a Speech Recognition Instance

PXCMSpeechRecognition sr=new PXCMSpeechRecognition();

session.CreateImpl(sr);

3. Configure the Module: Get the available configurations using the QueryProfile function and set the configuration using the SetProfile function. See Example 75.

C++ Example 75: Initialize the Speech Recognition Module

PXCSpeechRecognition::ProfileInfo pinfo;

sr->QueryProfile(0,&pinfo);

pinfo.language=PXCSpeechRecognition::LANGUAGE_US_ENGLISH;

sr->SetProfile(&pinfo);

C# Example 75: Initialize the Speech Recognition Module

PXCMSpeechRecognition.ProfileInfo pinfo;

sr.QueryProfile(0,out pinfo);

pinfo.language=PXCMSpeechRecognition.LanguageType.LANGUAGE_US_ENGLISH;

sr.SetProfile(pinfo);

Java Example 75: Initialize the Speech Recognition Module

PXCMSpeechRecognition.ProfileInfo pinfo=new PXCMSpeechRecognition.ProfileInfo();

sr.QueryProfile(0, pinfo);

pinfo.language=PXCMSpeechRecognition.LanguageType.LANGUAGE_US_ENGLISH;

sr.SetProfile(pinfo);

Always set the language explicitly. If it is not set, the default language is undetermined, because it depends on what is currently installed on the platform.

4. Set the Recognition Mode (OPTIONAL): See Command Control and Dictation for how to configure the speech recognition module to work in command and control mode or in dictation mode. By default, the module is in dictation mode.

C++ Example 76: Set the Dictation Mode

sr->SetDictation();

C# Example 76: Set the Dictation Mode

sr.SetDictation();

Java Example 76: Set the Dictation Mode

sr.SetDictation();

5. Execution Flow: Start speech recognition using the StartRec function and stop it with the StopRec function. While recognition is running, the application receives events for any recognition activity. See Handle Recognition Events for details on setting up event handlers.

C++ Example 77: Start/Stop Speech Recognition

// Start recognition

sr->StartRec(source, handler);

 

...

 

// Stop recognition

sr->StopRec();

C# Example 77: Start/Stop Speech Recognition

// Start recognition

sr.StartRec(source, handler);

 

...

 

// Stop recognition

sr.StopRec();

Java Example 77: Start/Stop Speech Recognition

// Start recognition

sr.StartRec(source, handler);

 

...

 

// Stop recognition

sr.StopRec();
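
Because every started recognition should eventually be stopped, a C++ application may want to tie the stop call to a scope so that it also runs on early returns or exceptions. The following is a minimal sketch of such a scope guard. RecognitionGuard is a hypothetical class, not part of the SDK, and it accepts any callable as the stop action so that it stays independent of the SDK headers.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical scope guard, not part of the SDK.
// Runs the supplied stop action exactly once, when the guard goes out of scope.
class RecognitionGuard {
public:
    explicit RecognitionGuard(std::function<void()> stop)
        : stop_(std::move(stop)) {
        assert(stop_ && "a stop action is required");
    }

    ~RecognitionGuard() {
        if (stop_) {
            stop_();  // for example, a call that stops recognition
        }
    }

    // Non-copyable, so the stop action cannot run twice.
    RecognitionGuard(const RecognitionGuard&) = delete;
    RecognitionGuard& operator=(const RecognitionGuard&) = delete;

private:
    std::function<void()> stop_;
};
```

With the SDK, such a guard might be created right after a successful start, for example with a lambda that calls StopRec on the recognition instance. The instance must outlive the guard for this to be safe.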