How can I use the Speech SDK in a Python container?

Question

Hello, I am struggling to use the Speech SDK from within the Python container that I use to deploy my Django project. Here is my Python code:

import os

import azure.cognitiveservices.speech as speechsdk

# Method of the consumer class in api/consumers.py (class definition omitted).
async def text_to_speech(self, text):
    speech_config = speechsdk.SpeechConfig(
        subscription=os.getenv('SPEECH_KEY'),
        region=os.getenv('SPEECH_REGION'),
    )
    speech_config.speech_synthesis_voice_name = 'en-US-AndrewMultilingualNeural'

    # audio_config=None keeps the audio in memory instead of playing it on a device.
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=None)

    visemes = []

    def viseme_received(evt):
        # audio_offset is in 100-nanosecond ticks; dividing by 10000 gives milliseconds.
        visemes.append({"offset": evt.audio_offset / 10000, "visemeId": evt.viseme_id})

    synthesizer.viseme_received.connect(viseme_received)

    # .get() blocks until synthesis has finished.
    result = synthesizer.speak_text_async(text).get()

    if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        audio_data = result.audio_data  # The synthesized audio as bytes
        return audio_data, visemes

    print(f"Speech synthesis canceled, reason: {result.reason}")
    if result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        print(f"Error details: {cancellation_details.error_details}")
    return None, None

To containerize my project I use this Dockerfile and a compose.yml (the compose file also contains other services, which are irrelevant here):

FROM python:3.12

ARG DB_NAME
ARG DB_USER
ARG DB_PASSWORD
ARG DB_HOST
ARG DB_PORT
ARG AZURE_OPENAI_ENDPOINT
ARG AZURE_OPENAI_API_KEY
ARG SPEECH_KEY
ARG SPEECH_REGION

WORKDIR /backend 

RUN pip install --upgrade pip
COPY ./requirements.txt .
RUN pip install -r requirements.txt

ENV DB_NAME=$DB_NAME
ENV DB_USER=$DB_USER
ENV DB_PASSWORD=$DB_PASSWORD
ENV DB_HOST=$DB_HOST
ENV DB_PORT=$DB_PORT
ENV AZURE_OPENAI_ENDPOINT=$AZURE_OPENAI_ENDPOINT
ENV AZURE_OPENAI_API_KEY=$AZURE_OPENAI_API_KEY
ENV SPEECH_KEY=$SPEECH_KEY
ENV SPEECH_REGION=$SPEECH_REGION

COPY . .

When I start the Django project locally, everything works fine. I also verified that the environment variables are set correctly inside the container, so that is not the problem. I even wrote a script, executed from within the container, that called the Azure text-to-speech REST endpoint, and it worked (but unfortunately I need viseme support). Whenever the function I provided is executed, I get the following logs from the container:

Exception inside application: Exception with error code: 
[CALL STACK BEGIN]

/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1e3f11) [0x7f1fbd9e3f11]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x20d2e7) [0x7f1fbda0d2e7]
/lib/x86_64-linux-gnu/libc.so.6(+0x8df97) [0x7f1fc5443f97]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x20e41c) [0x7f1fbda0e41c]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1b445f) [0x7f1fbd9b445f]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1b3d15) [0x7f1fbd9b3d15]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1eaca5) [0x7f1fbd9eaca5]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1dbdc0) [0x7f1fbd9dbdc0]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1d5d8e) [0x7f1fbd9d5d8e]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0xdc8bd) [0x7f1fbd8dc8bd]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1eaca5) [0x7f1fbd9eaca5]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1ccb4f) [0x7f1fbd9ccb4f]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x21b9a6) [0x7f1fbda1b9a6]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(synthesizer_create_speech_synthesizer_from_config+0xf3) [0x7f1fbd8bf169]
/lib/x86_64-linux-gnu/libffi.so.8(+0x6f7a) [0x7f1fc2abef7a]
/lib/x86_64-linux-gnu/libffi.so.8(+0x640e) [0x7f1fc2abe40e]
/lib/x86_64-linux-gnu/libffi.so.8(ffi_call+0xcd) [0x7f1fc2abeb0d]
[CALL STACK END]

Runtime error: Failed to initialize platform (azure-c-shared). Error: 2176
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/channels/routing.py", line 71, in __call__
    return await application(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/channels/sessions.py", line 47, in __call__
    return await self.inner(dict(scope, cookies=cookies), receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/channels/sessions.py", line 263, in __call__
    return await self.inner(wrapper.scope, receive, wrapper.send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/channels/auth.py", line 185, in __call__
    return await super().__call__(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/channels/middleware.py", line 26, in __call__
    return await self.inner(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/channels/routing.py", line 150, in __call__
    return await application(
           ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/channels/consumer.py", line 94, in app
    return await consumer(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/channels/consumer.py", line 62, in __call__
    await await_many_dispatch([receive], self.dispatch)
  File "/usr/local/lib/python3.12/site-packages/channels/utils.py", line 51, in await_many_dispatch
    await dispatch(result)
  File "/usr/local/lib/python3.12/site-packages/channels/consumer.py", line 73, in dispatch
    await handler(message)
  File "/usr/local/lib/python3.12/site-packages/channels/generic/websocket.py", line 173, in websocket_connect
    await self.connect()
  File "/backend/api/consumers.py", line 59, in connect
    audio_data, visemes = await self.text_to_speech(text)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/backend/api/consumers.py", line 88, in text_to_speech
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=None)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/speech.py", line 2223, in __init__
    _call_hr_fn(fn=_sdk_lib.synthesizer_create_speech_synthesizer_from_config, *[
  File "/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/interop.py", line 62, in _call_hr_fn
    _raise_if_failed(hr)
  File "/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/interop.py", line 55, in _raise_if_failed
    __try_get_error(_spx_handle(hr))
  File "/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/interop.py", line 50, in __try_get_error
    raise RuntimeError(message)
RuntimeError: Exception with error code: 
[CALL STACK BEGIN]

/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1e3f11) [0x7f1fbd9e3f11]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x20d2e7) [0x7f1fbda0d2e7]
/lib/x86_64-linux-gnu/libc.so.6(+0x8df97) [0x7f1fc5443f97]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x20e41c) [0x7f1fbda0e41c]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1b445f) [0x7f1fbd9b445f]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1b3d15) [0x7f1fbd9b3d15]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1eaca5) [0x7f1fbd9eaca5]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1dbdc0) [0x7f1fbd9dbdc0]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1d5d8e) [0x7f1fbd9d5d8e]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0xdc8bd) [0x7f1fbd8dc8bd]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1eaca5) [0x7f1fbd9eaca5]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1ccb4f) [0x7f1fbd9ccb4f]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x21b9a6) [0x7f1fbda1b9a6]
/usr/local/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(synthesizer_create_speech_synthesizer_from_config+0xf3) [0x7f1fbd8bf169]
/lib/x86_64-linux-gnu/libffi.so.8(+0x6f7a) [0x7f1fc2abef7a]
/lib/x86_64-linux-gnu/libffi.so.8(+0x640e) [0x7f1fc2abe40e]
/lib/x86_64-linux-gnu/libffi.so.8(ffi_call+0xcd) [0x7f1fc2abeb0d]
[CALL STACK END]

Runtime error: Failed to initialize platform (azure-c-shared). Error: 2176
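
The failure above is raised inside the SDK's native library while the synthesizer is being created, before any text is sent. One thing that may be worth ruling out (an assumption, not a confirmed diagnosis) is whether the container provides the system libraries that the Speech SDK's Linux setup instructions list, such as the OpenSSL and ALSA (libasound2) shared libraries. The following is a minimal diagnostic sketch using only the Python standard library; the function name `check_environment` and the lists `REQUIRED_ENV` and `REQUIRED_LIBS` are hypothetical names chosen for illustration.

```python
import ctypes.util
import os
import ssl

# Assumption: these are the variables and shared libraries worth checking.
REQUIRED_ENV = ("SPEECH_KEY", "SPEECH_REGION")
REQUIRED_LIBS = ("ssl", "crypto", "asound")


def check_environment(env=os.environ, find_library=ctypes.util.find_library):
    """Report missing environment variables and shared libraries."""
    return {
        # Variables that are unset or empty.
        "missing_env": [name for name in REQUIRED_ENV if not env.get(name)],
        # Libraries the dynamic loader cannot locate.
        "missing_libs": [lib for lib in REQUIRED_LIBS if find_library(lib) is None],
        # OpenSSL version Python itself is linked against; the SDK's native
        # library may load a different copy, so treat this only as a hint.
        "python_openssl": ssl.OPENSSL_VERSION,
    }
```

Calling `check_environment()` from a Python shell inside the running container and printing the result would show whether anything is reported as missing. An empty `missing_libs` list does not prove the SDK can initialize; it only narrows down where to look.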

I tried mapping ports in the compose service (both 5000:5000 and 443:443), but that did not help. What am I doing wrong? Thank you for your help :)
