Is batch transcription available in Microsoft Azure Speech to text (Cognitive Services) through a Java SDK or a Java REST API?

Question

We have speech-to-text support for embedded (microphone) speech in Java, but I want to implement batch transcription with Microsoft Azure Cognitive Services in Java. Is batch transcription supported by the Java SDK or through a REST API that can be called from Java? Please share the GitHub link.

Thanks & Regards,

Ganesh P

Answer

Hello @Ganesh P, thanks for using the Microsoft Q&A platform.

As mentioned in the documentation, the Speech to text REST API and Speech CLI support batch transcription.

The GitHub samples available for batch transcription are Python, Node.js, and C# clients that call the batch transcription REST API. The Python sample, for example, is here: https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/batch/python

Unfortunately, no Java sample code is available for this.

Answer

package audioTranslation;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

import org.json.JSONArray;
import org.json.JSONObject;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ArrayNode;
import com.fasterxml.jackson.databind.node.ObjectNode;

public class MultipleAudioTranscription {

    private static final String SUBSCRIPTION_KEY = "xxxxx"; // your Azure Speech to text subscription key
    private static final String REGION = "southeastasia"; // region of the Speech resource

    // Publicly readable audio URLs; S3 was used here, but I think any cloud storage works
    private static final String AUDIO_URL1 = "audiofile1publicURLfromtheCloudStorageS3";
    private static final String AUDIO_URL2 = "audiofile2publicURLfromtheCloudStorageS3";

    private static final String TRANSCRIPTIONS_URL = String.format("https://%s.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions", REGION);

    // Directory the transcribed files are written to (backslashes must be escaped in a Java string)
    private static final String transcribedTextDownloadingPath = "E:\\GTA-Application-Docs\\AudioTranslation\\1\\transcribedOutput\\";

    private static final List<String> audioUrls = new ArrayList<>();

    public static void main(String[] args) throws Exception {
        audioUrls.add(AUDIO_URL1);
        audioUrls.add(AUDIO_URL2);

        HttpClient client = HttpClient.newHttpClient();
        ObjectMapper mapper = new ObjectMapper();

        // Build the JSON request body for the transcription job
        ObjectNode body = mapper.createObjectNode();
        ArrayNode contentUrlsArray = body.putArray("contentUrls");
        for (String url : audioUrls) {
            contentUrlsArray.add(url);
        }
        body.put("locale", "en-US");
        body.put("displayName", "Multiple audio transcription");
        ObjectNode properties = body.putObject("properties");
        properties.put("wordLevelTimestampsEnabled", false);
        properties.put("punctuationMode", "DictatedAndAutomatic");
        properties.put("profanityFilterMode", "Masked");

        // Step 1: Create the transcription job
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(TRANSCRIPTIONS_URL))
                .header("Ocp-Apim-Subscription-Key", SUBSCRIPTION_KEY)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body.toString()))
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 201 && response.statusCode() != 202) {
            throw new Exception("Failed to create transcription job: " + response.statusCode() + " - " + response.body());
        }

        // Prefer the Location header; fall back to the "self" link in the response body
        String statusUrl = response.headers().firstValue("Location").orElse(null);
        if (statusUrl == null) {
            JsonNode jsonResponse = mapper.readTree(response.body());
            statusUrl = jsonResponse.get("self").asText();
        }
        System.out.println("Transcription job created. Check status at: " + statusUrl);

        // Step 2: Periodically check the status of the transcription job
        while (true) {
            JsonNode statusResponse = checkTranscriptionStatus(client, statusUrl);
            if (statusResponse == null) {
                // The status request failed; stop with a clear message instead of a NullPointerException
                throw new IllegalStateException("Could not read transcription status from " + statusUrl);
            }
            String status = statusResponse.get("status").asText();
            System.out.println("Transcription status: " + status);

            if ("Succeeded".equals(status)) {
                System.out.println("Transcription succeeded!");
                List<String> transcribedTexts = getTranscribedTexts(statusUrl, SUBSCRIPTION_KEY);
                for (int i = 0; i < transcribedTexts.size(); i++) {
                    System.out.println("Transcription for Audio " + (i + 1) + ": " + transcribedTexts.get(i));
                    saveTranscriptionToFile(transcribedTexts.get(i), transcribedTextDownloadingPath + "transcription_" + (i + 1) + ".txt");
                }
                break;
            } else if ("Failed".equals(status)) {
                System.out.println("Transcription failed.");
                if (statusResponse.has("error")) {
                    System.out.println("Error details: " + statusResponse.get("error").toString());
                }
                break;
            }

            Thread.sleep(30000); // wait 30 seconds before the next status check
        }
    }

    // Fetches the job's status document; returns null when the request does not return HTTP 200
    private static JsonNode checkTranscriptionStatus(HttpClient client, String statusUrl) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(statusUrl))
                .header("Ocp-Apim-Subscription-Key", SUBSCRIPTION_KEY)
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() == 200) {
            ObjectMapper mapper = new ObjectMapper();
            return mapper.readTree(response.body());
        } else {
            // Callers must check for null before reading the status
            System.out.println("Error checking transcription status: " + response.statusCode());
            return null;
        }
    }

    private static List<String> getTranscribedTexts(String statusUrl, String subscriptionKey) throws Exception {
        List<String> transcribedTexts = new ArrayList<>();
        String filesUrl = statusUrl + "/files";

        // List the result files of the transcription job
        HttpURLConnection connection = (HttpURLConnection) new URL(filesUrl).openConnection();
        connection.setRequestMethod("GET");
        connection.setRequestProperty("Ocp-Apim-Subscription-Key", subscriptionKey);
        int responseCode = connection.getResponseCode();
        if (responseCode != 200) {
            System.out.println("Error retrieving transcription files: " + responseCode);
            return transcribedTexts;
        }

        JSONObject jsonResponse = new JSONObject(readBody(connection));
        JSONArray valuesArray = jsonResponse.getJSONArray("values");
        for (int i = 0; i < valuesArray.length(); i++) {
            JSONObject fileObject = valuesArray.getJSONObject(i);
            if (!"Transcription".equals(fileObject.getString("kind"))) {
                continue; // skip entries that are not transcription results
            }
            String transcriptionFileUrl = fileObject.getJSONObject("links").getString("contentUrl");

            // Download the transcription file from its content URL
            HttpURLConnection transcriptionConnection = (HttpURLConnection) new URL(transcriptionFileUrl).openConnection();
            transcriptionConnection.setRequestMethod("GET");
            if (transcriptionConnection.getResponseCode() != 200) {
                continue;
            }

            JSONObject transcriptionContent = new JSONObject(readBody(transcriptionConnection));
            if (transcriptionContent.has("combinedRecognizedPhrases")) {
                JSONArray combinedRecognizedPhrases = transcriptionContent.getJSONArray("combinedRecognizedPhrases");
                // Only the first entry is used; loop over the array to append the remaining entries as well
                if (combinedRecognizedPhrases.length() > 0) {
                    transcribedTexts.add(combinedRecognizedPhrases.getJSONObject(0).getString("display").trim());
                }
            }
        }
        return transcribedTexts;
    }

    // Reads the whole response body of a connection as UTF-8 text and closes the stream
    private static String readBody(HttpURLConnection connection) throws IOException {
        StringBuilder response = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                response.append(inputLine);
            }
        }
        return response.toString();
    }

    private static void saveTranscriptionToFile(String transcription, String fileName) {
        try {
            Path filePath = Paths.get(fileName);
            if (filePath.getParent() != null) {
                Files.createDirectories(filePath.getParent()); // make sure the output directory exists
            }
            Files.write(filePath, transcription.getBytes(StandardCharsets.UTF_8));
            System.out.println("Transcription saved to: " + filePath.toAbsolutePath());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

}
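
The polling loop in main waits a fixed 30 seconds between checks and never gives up. One possible refinement is to move the loop into a small helper that caps the number of attempts, backs off between checks, and treats a failed status request as a retry. The following is only a sketch under assumptions: the names StatusPoller and pollUntilTerminal are hypothetical, the status source can be any callable that returns the job's status string (for example a wrapper around checkTranscriptionStatus), and "Succeeded" and "Failed" are taken as the only terminal states, as in the code above.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of any Azure SDK
final class StatusPoller {

    // Terminal job states, matching the ones checked in main above
    private static final Set<String> TERMINAL = Set.of("Succeeded", "Failed");

    /**
     * Calls statusSource until it yields a terminal status or maxAttempts is reached.
     * A null status or an exception counts as a failed attempt and is retried.
     * The delay doubles after every attempt, capped at maxDelayMillis.
     */
    static String pollUntilTerminal(Callable<String> statusSource, int maxAttempts,
                                    long initialDelayMillis, long maxDelayMillis)
            throws InterruptedException {
        long delay = initialDelayMillis;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String status = null;
            try {
                status = statusSource.call();
            } catch (InterruptedException e) {
                throw e; // do not swallow interruption
            } catch (Exception e) {
                System.out.println("Status check failed (attempt " + attempt + "): " + e.getMessage());
            }
            if (status != null && TERMINAL.contains(status)) {
                return status;
            }
            if (attempt < maxAttempts) {
                Thread.sleep(delay);
                delay = Math.min(delay * 2, maxDelayMillis);
            }
        }
        throw new IllegalStateException("No terminal status after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) throws Exception {
        // Simulated status sequence standing in for real HTTP calls
        Iterator<String> statuses = List.of("NotStarted", "Running", "Succeeded").iterator();
        String result = pollUntilTerminal(statuses::next, 10, 10, 40);
        System.out.println("Final status: " + result);
    }
}
```

In the transcription program, the callable could read the "status" field from the JSON returned by checkTranscriptionStatus and return null when that method returns null. The attempt limit and delays are placeholders to tune for the length of the audio files.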

Hi team, I implemented batch transcription as shown above and it produced the expected result. If anybody knows a better or more optimized solution, please post it here so that we can all improve.

Thanks & Regards,

Ganesh
