Note
Please see Azure Cognitive Services for Speech documentation for the latest supported speech solutions.
Create Grammars with Multiple Active Rules (Microsoft.Speech)
The root rule of a grammar is active and ready to use for recognition when the speech recognition engine loads the grammar. All the rules referenced by the root rule are also active when the grammar loads. However, a grammar may contain rules for separate actions that do not reference each other. Each rule is a separate and independent definition that uniquely identifies an action that the user can initiate by speaking.
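As a minimal sketch of these activation rules, consider the small grammar below. The rule names greeting, hello, and farewell are hypothetical and are not part of the music grammar built in this topic; the comments state which rules are reachable from the root rule when the grammar loads.

```xml
<grammar version="1.0" xml:lang="en-US" root="greeting"
 xmlns="http://www.w3.org/2001/06/grammar">

  <!-- Root rule (named by the root attribute): active when the grammar loads. -->
  <rule id="greeting">
    <ruleref uri="#hello" />
  </rule>

  <!-- Referenced by the root rule, so it is active as well. -->
  <rule id="hello">
    <item> hello </item>
  </rule>

  <!-- Not referenced by the root rule, directly or indirectly,
       so no path from the root rule reaches it. -->
  <rule id="farewell">
    <item> goodbye </item>
  </rule>

</grammar>
```

In this sketch, adding a reference to farewell inside greeting would be one way to bring the third rule into play.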
In this topic, we will create a grammar with rules that can be used to begin, pause, or stop playback of a music file. We will create a top-level root rule that contains a reference to the main rule element for each of the play, pause, and stop actions. As a result, every rule in the grammar can be reached by a valid path that begins at the root rule, and the spoken input determines which path is taken.
We will begin with the grammar that was created in the previous topic, Create Variations of User Commands (Microsoft.Speech). This grammar can only be used to recognize spoken input for beginning the playback of an audio file.
<grammar version="1.0" xml:lang="en-US" root="playCommands"
 xmlns="http://www.w3.org/2001/06/grammar">
  <rule id="playCommands">
    <ruleref uri="#playAction" />
    <item> the </item>
    <ruleref uri="#fileWords" />
  </rule>
  <rule id="playAction">
    <one-of>
      <item> play </item>
      <item> start </item>
      <item> begin </item>
    </one-of>
  </rule>
  <rule id="fileWords">
    <one-of>
      <item> song </item>
      <item> tune </item>
      <item> track </item>
      <item> item </item>
    </one-of>
  </rule>
</grammar>
We will add rules that recognize words for pausing and stopping the playback of a music file, using the list of actions and spoken input from the first topic, Create a List of User Commands (Microsoft.Speech). This next example contains the rules for recognizing pause commands.
<rule id="pauseCommands">
  <ruleref uri="#pauseAction" />
  <item> the </item>
  <ruleref uri="#fileWords" />
</rule>
<rule id="pauseAction">
  <one-of>
    <item> pause </item>
    <item> suspend </item>
    <item> hold </item>
  </one-of>
</rule>
Since the pause commands refer to the same files as the play commands, we can reuse the rule named fileWords. Now we will create the rules for the stop commands.
<rule id="stopCommands">
  <ruleref uri="#stopAction" />
  <item> the </item>
  <ruleref uri="#fileWords" />
</rule>
<rule id="stopAction">
  <one-of>
    <item> stop </item>
    <item> end </item>
    <item> quit </item>
  </one-of>
</rule>
The rule element named stopCommands also reuses the rule element named fileWords. We have added rules to pause and stop playback, but these rules are not active because they are not referenced by the root rule. We will create a rule element named topLevel that references each of the rule elements named playCommands, pauseCommands, and stopCommands. Because the user speaks only one command at a time, the references are listed as alternatives in a one-of element; references placed one after another would require all three commands to be spoken in sequence. We will make topLevel the root rule for the grammar. Here is the resulting grammar:
<grammar version="1.0" xml:lang="en-US" root="topLevel"
 xmlns="http://www.w3.org/2001/06/grammar">
  <rule id="topLevel">
    <one-of>
      <item> <ruleref uri="#playCommands" /> </item>
      <item> <ruleref uri="#pauseCommands" /> </item>
      <item> <ruleref uri="#stopCommands" /> </item>
    </one-of>
  </rule>
  <rule id="playCommands">
    <ruleref uri="#playAction" />
    <item> the </item>
    <ruleref uri="#fileWords" />
  </rule>
  <rule id="playAction">
    <one-of>
      <item> play </item>
      <item> start </item>
      <item> begin </item>
    </one-of>
  </rule>
  <rule id="pauseCommands">
    <ruleref uri="#pauseAction" />
    <item> the </item>
    <ruleref uri="#fileWords" />
  </rule>
  <rule id="pauseAction">
    <one-of>
      <item> pause </item>
      <item> suspend </item>
      <item> hold </item>
    </one-of>
  </rule>
  <rule id="stopCommands">
    <ruleref uri="#stopAction" />
    <item> the </item>
    <ruleref uri="#fileWords" />
  </rule>
  <rule id="stopAction">
    <one-of>
      <item> stop </item>
      <item> end </item>
      <item> quit </item>
    </one-of>
  </rule>
  <rule id="fileWords">
    <one-of>
      <item> song </item>
      <item> tune </item>
      <item> track </item>
      <item> item </item>
    </one-of>
  </rule>
</grammar>
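How the references are combined inside a rule determines what the user must say. The following sketch contrasts the two forms; the rule names anyCommand and allCommands are hypothetical and are used only for illustration. In SRGS, expansions written one after another form a sequence that must be spoken in order, while a one-of element accepts exactly one of its items.

```xml
<!-- Alternatives: matches any one of the three commands,
     for example "pause the song". -->
<rule id="anyCommand">
  <one-of>
    <item> <ruleref uri="#playCommands" /> </item>
    <item> <ruleref uri="#pauseCommands" /> </item>
    <item> <ruleref uri="#stopCommands" /> </item>
  </one-of>
</rule>

<!-- Sequence: matches only when all three commands are spoken in order,
     for example "play the song pause the song stop the song". -->
<rule id="allCommands">
  <ruleref uri="#playCommands" />
  <ruleref uri="#pauseCommands" />
  <ruleref uri="#stopCommands" />
</rule>
```

For a root rule that should respond to a single spoken command, the alternatives form is usually the one that fits.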
To view a grammar of greater complexity that incorporates semantic definitions, see Grammar Example: Solitaire (Microsoft.Speech).