@Aman Verma To get phonemes from the speech to text API, you need to use the pronunciation assessment feature together with the speech config. In the pronunciation assessment config, the granularity has to be set to the phoneme level.
import azure.cognitiveservices.speech as speechsdk

# speech_config and audio_config are assumed to have been created earlier
pronunciation_assessment_config = speechsdk.PronunciationAssessmentConfig(
    reference_text='reference text',
    grading_system=speechsdk.PronunciationAssessmentGradingSystem.HundredMark,
    granularity=speechsdk.PronunciationAssessmentGranularity.Phoneme)
speech_recognizer = speechsdk.SpeechRecognizer(
    speech_config=speech_config,
    audio_config=audio_config)
# apply the pronunciation assessment configuration to the speech recognizer
pronunciation_assessment_config.apply_to(speech_recognizer)
result = speech_recognizer.recognize_once()
# wrap the recognition result to read the assessment scores
pronunciation_assessment_result = speechsdk.PronunciationAssessmentResult(result)
pronunciation_score = pronunciation_assessment_result.pronunciation_score
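The snippet above only reads the overall score. To get at the phonemes themselves, one option is to parse the detailed JSON that comes back with the result. Below is a minimal sketch of that, under some assumptions: the helper name `extract_phonemes` is made up for illustration, and the field names (`NBest`, `Words`, `Phonemes`, `Phoneme`, `PronunciationAssessment`, `AccuracyScore`) follow the layout I would expect from the service's detailed output, so please verify them against a real response. `SAMPLE_JSON` is hand-written, not actual service output.

```python
import json


def extract_phonemes(json_result):
    """Return (word, phoneme, accuracy_score) tuples from a detailed result JSON string.

    Hypothetical helper: the field names used here are an assumed layout
    and should be checked against a real response.
    """
    data = json.loads(json_result)
    rows = []
    n_best = data.get("NBest") or []
    if not n_best:
        return rows
    # Use the top recognition hypothesis only.
    for word in n_best[0].get("Words", []):
        for phoneme in word.get("Phonemes", []):
            score = phoneme.get("PronunciationAssessment", {}).get("AccuracyScore")
            rows.append((word.get("Word"), phoneme.get("Phoneme"), score))
    return rows


# Hand-written sample in the assumed layout, NOT real service output.
SAMPLE_JSON = json.dumps({
    "NBest": [{
        "Words": [{
            "Word": "hi",
            "Phonemes": [
                {"Phoneme": "h", "PronunciationAssessment": {"AccuracyScore": 100.0}},
                {"Phoneme": "ay", "PronunciationAssessment": {"AccuracyScore": 92.0}},
            ],
        }],
    }],
})

for word, phoneme, score in extract_phonemes(SAMPLE_JSON):
    print(word, phoneme, score)
```

With a real recognition result, the JSON string should be available through `result.properties.get(speechsdk.PropertyId.SpeechServiceResponse_JsonResult)`, which you can pass to the helper in place of `SAMPLE_JSON`.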
You can look up this page in the documentation for more details on using this functionality.
If an answer is helpful, please accept it or upvote it, which might help other community members reading this thread.