How to create a dataset for Azure custom speech using spx (speechCLI)

Mikel Broström Zalba 20 Reputation points

I am using the following command to create a custom speech dataset in my Azure Speech service:


spx csr dataset create --api-version v3.1 --kind "Acoustic" --name "My Custom Speech" --description "My Acoustic Dataset Description" --project $project_id --content https://xyz.blob.core.windows.net/test-and-train-data --language "en-US"

The --content flag points to a specific container in my storage account where the data is stored. I tried the following layouts:


test-and-train-data
├── train.wav
└── trans.txt

and:

test-and-train-data
└── wav_n_txt.zip

and:


test-and-train-data
└── en-US
    ├── train.wav
    └── trans.txt

and:


test-and-train-data
└── en-US
    └── wav_n_txt.zip

I tried the en-US subfolder because the output of the spx csr dataset create command includes "locale": "en-US".

Every attempt ends in an error with no details, and I cannot find a single example of this online. I have read everything under the custom speech overview. Downloading the upload process report does not work either. What am I doing wrong?

VasaviLankipalle-MSFT 15,836 Reputation points

2024-06-28T18:58:45.79+00:00

Hello @Mikel Broström Zalba, thanks for using the Microsoft Q&A platform.

Could you please provide more details about the error message and the issue you are facing?

Mikel Broström Zalba 20 Reputation points

2024-06-29T08:31:47.27+00:00

I wish I had more details to provide, @VasaviLankipalle-MSFT, but I don't, and that is why I am here. The error provides zero details about what is going wrong, which makes this a blind debug.

My speech datasets:

The details for the one created:

VasaviLankipalle-MSFT 15,836 Reputation points

2024-07-02T21:42:19.08+00:00

Hello @Mikel Broström Zalba, sorry for the inconvenience this has caused. Please raise a support request in the Azure portal so the issue can be investigated in more depth.
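
For readers hitting the same error, one thing that may be worth checking before opening a support request is whether --content addresses a single file rather than a container. This is an assumption and is not confirmed anywhere in this thread. The sketch below builds a URL for the zip blob itself and prints the resulting command for review instead of running it. The sas_token variable is a hypothetical placeholder for a read-only SAS token, which would only be needed if the container is not publicly readable.

```shell
#!/bin/sh
# Sketch only: values come from the question or are labeled placeholders.
storage_account="xyz"              # storage account name from the question
container="test-and-train-data"    # container name from the question
blob_name="wav_n_txt.zip"          # zip archive holding train.wav and trans.txt
sas_token=""                       # hypothetical: read-only SAS token, if the container is private

# Assumption: the URL should address the zip blob itself, not the container.
content_url="https://${storage_account}.blob.core.windows.net/${container}/${blob_name}"
if [ -n "$sas_token" ]; then
  content_url="${content_url}?${sas_token}"
fi

# Print the command for review rather than executing it.
# \$project_id is left unexpanded on purpose; set it before running the printed command.
echo "spx csr dataset create --api-version v3.1 --kind \"Acoustic\" --name \"My Custom Speech\" --description \"My Acoustic Dataset Description\" --project \$project_id --content \"${content_url}\" --language \"en-US\""
```

If the printed command looks right, it can be pasted into a shell where project_id is set. All flags are the ones already used in the question; only the value passed to --content differs.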
