@princeofpluto, apologies for the delay in responding here!
You can use Azure Media Services (AMS) for this requirement.
Video Indexer (VI) (https://www.videoindexer.ai), which is partly built on top of AMS, supports emotion detection.
Emotion detection: identifies emotions based on speech content (what's being said) and voice tonality (how it's being said). The detected emotion can be joy, sadness, anger, or fear.
For more information, visit VI’s portal or the VI developer portal, and test this capability.
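Once a video has been indexed, the emotion results can be post-processed from the index JSON. Below is a minimal Python sketch that totals how long each emotion was detected. Note that the sample data is made up, and the field names ("videos", "insights", "emotions", "type", "instances", "start", "end") and the "h:mm:ss" timestamp format are assumptions about the output shape, so please verify them against the JSON you actually get back from VI.

```python
from collections import defaultdict

# Hypothetical excerpt of an index result. The structure and field names
# are assumptions for illustration, not confirmed Video Indexer output.
SAMPLE_INDEX = {
    "videos": [
        {
            "insights": {
                "emotions": [
                    {"type": "Joy", "instances": [
                        {"start": "0:00:05", "end": "0:00:12.5"},
                        {"start": "0:01:00", "end": "0:01:04"},
                    ]},
                    {"type": "Anger", "instances": [
                        {"start": "0:00:30", "end": "0:00:33"},
                    ]},
                ]
            }
        }
    ]
}


def to_seconds(timestamp):
    """Convert an 'h:mm:ss[.fff]' string to a number of seconds."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def emotion_durations(index):
    """Return total detected seconds per emotion type across all videos."""
    totals = defaultdict(float)
    for video in index.get("videos", []):
        for emotion in video.get("insights", {}).get("emotions", []):
            for instance in emotion.get("instances", []):
                length = to_seconds(instance["end"]) - to_seconds(instance["start"])
                totals[emotion["type"]] += length
    return dict(totals)


if __name__ == "__main__":
    for name, seconds in sorted(emotion_durations(SAMPLE_INDEX).items()):
        print(f"{name}: {seconds:.1f}s")
```

To use this with your own video, replace SAMPLE_INDEX with the parsed JSON of your index result and adjust the field names if they differ.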
You can also browse sample videos that have been indexed for emotional content: sample 1, sample 2, and sample 3.
Kindly see this doc for more details on this topic:
Cross-channel emotion analysis in Microsoft Video Indexer