Sentiment Analysis (U-SQL)

Summary

The SentimentAnalyzer cognitive function evaluates the sentiment of text. It returns a numeric score between 0 and 1 along with the sentiment string for the text. Scores close to 1 indicate positive sentiment, and scores close to 0 indicate negative sentiment. The score is generated using classification techniques; the classifier's input features include n-grams, features generated from part-of-speech tags, and word embeddings. English text is supported.

Arguments TBD

SentimentAnalyzer(
string TBD = "TBD", string TBD = "Sentiment", string TBD = "Conf")

Examples

Books
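The examples use two books, War and Peace and Peter Pan, and read both files from /Samples/Books/. Copy the files to that folder or adjust the FROM paths in the scripts to match your store.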

  • war_and_peace.csv is installed automatically when you install the cognitive assemblies. The file is located at /usqlext/samples/cognition/war_and_peace.csv.
  • PeterPan.txt was obtained from https://www.gutenberg.org/files/16/16-0.txt.

Extract Text

// War and Peace
@book =
    EXTRACT No int,
            Year string,
            Book string,
            Chapter string,
            Text string
    FROM @"/Samples/Books/war_and_peace.csv"
    USING Extractors.Csv();

// Peter Pan
@otherBook =
    EXTRACT Text string
    FROM @"/Samples/Books/PeterPan.txt"
    USING Extractors.Text(silent: true, delimiter: '`');

Extract Sentiment

REFERENCE ASSEMBLY [TextSentiment];

// War and Peace
@sentiment =
    PROCESS @book
    PRODUCE No,
            Year,
            Book,
            Chapter,
            Text,
            Sentiment string,
            Conf double
    READONLY No,
            Year,
            Book,
            Chapter,
            Text
    USING new Cognition.Text.SentimentAnalyzer(true); // true returns confidence

// Peter Pan
@otherSentiment =
    PROCESS @otherBook
    PRODUCE Text,
            Sentiment string,
            Conf double
    READONLY Text
    USING new Cognition.Text.SentimentAnalyzer(true); // true returns confidence

OUTPUT @sentiment
TO "/ReferenceGuide/Cognition/Text/SentimentAnalyzer1A.txt"
USING Outputters.Tsv();

OUTPUT @otherSentiment
TO "/ReferenceGuide/Cognition/Text/SentimentAnalyzer1B.txt"
USING Outputters.Tsv();
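
As a quick check on the extracted results, the rows can be counted per sentiment string. The following is a minimal sketch, assuming it runs in the same script as the statements above so that @sentiment is available; the rowset name @labelCounts, the column name NumRows, and the output path are hypothetical and not part of the original samples.

```usql
// Count how many text rows received each sentiment string (War and Peace).
// @labelCounts, NumRows, and the output path are illustrative names only.
@labelCounts =
    SELECT Sentiment,
           COUNT(*) AS NumRows
    FROM @sentiment
    GROUP BY Sentiment;

OUTPUT @labelCounts
TO "/ReferenceGuide/Cognition/Text/SentimentAnalyzer1C.txt"
USING Outputters.Tsv();
```

Substituting @otherSentiment for @sentiment should give the same breakdown for Peter Pan.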

Calculate average sentiment

// War and Peace
@grouped =
    SELECT Year,
           Book,
           Chapter,
           AVG(Conf) AS Sentiment
    FROM @sentiment
    GROUP BY Year,
             Book,
             Chapter;

// Peter Pan (single average over the whole book)
@otherGrouped =
    SELECT AVG(Conf) AS Sentiment
    FROM @otherSentiment;

OUTPUT @grouped
TO "/ReferenceGuide/Cognition/Text/SentimentAnalyzer2A.txt"
USING Outputters.Tsv();

OUTPUT @otherGrouped
TO "/ReferenceGuide/Cognition/Text/SentimentAnalyzer2B.txt"
USING Outputters.Tsv();
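
The per-chapter averages can also be ranked, for example to list the chapters with the highest average value. This is a sketch only, assuming @grouped from the statements above is in scope; the rowset name @topChapters, the cutoff of five rows, and the output path are hypothetical choices.

```usql
// Keep the five chapters with the highest average Conf value (War and Peace).
// @topChapters, the row limit, and the output path are illustrative only.
@topChapters =
    SELECT Year,
           Book,
           Chapter,
           Sentiment
    FROM @grouped
    ORDER BY Sentiment DESC
    FETCH 5 ROWS;

// ORDER BY on OUTPUT controls the row order in the written file.
OUTPUT @topChapters
TO "/ReferenceGuide/Cognition/Text/SentimentAnalyzer2C.txt"
ORDER BY Sentiment DESC
USING Outputters.Tsv();
```

Changing DESC to ASC in both places would list the lowest-scoring chapters instead.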

Combine the key phrases and chapter sentiment analysis

REFERENCE ASSEMBLY [TextKeyPhrase];

// First capture key phrases
@keyphrase =
    PROCESS @book
    PRODUCE No,
            Year,
            Book,
            Chapter,
            Text,
            KeyPhrase string
    READONLY No,
            Year,
            Book,
            Chapter,
            Text
    USING new Cognition.Text.KeyPhraseExtractor();

@KPsplits =
    SELECT No,
           Year,
           Book,
           Chapter,
           Text,
           T.KeyPhrase
    FROM @keyphrase
         CROSS APPLY EXPLODE (KeyPhrase.Split(';')) AS T(KeyPhrase);

// Then combine with sentiment extracted from earlier example
@all =
    SELECT K.Year,
           K.Book,
           K.Chapter,
           K.Text,
           K.KeyPhrase,
           S.Conf AS Sentiment
    FROM @KPsplits AS K
         INNER JOIN
             @sentiment AS S
         ON K.No == S.No;

OUTPUT @all
TO "/ReferenceGuide/Cognition/Text/SentimentAnalyzer3A.txt"
USING Outputters.Tsv();
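
Once key phrases and sentiment values are joined, they can be summarized per key phrase. The sketch below assumes @all from the statements above is in scope; the rowset name @phraseSentiment, the column names AvgSentiment and Mentions, and the output path are hypothetical.

```usql
// Average the joined sentiment value for each key phrase and count
// how many text rows mention it.
// @phraseSentiment, AvgSentiment, Mentions, and the path are illustrative only.
@phraseSentiment =
    SELECT KeyPhrase,
           AVG(Sentiment) AS AvgSentiment,
           COUNT(*) AS Mentions
    FROM @all
    GROUP BY KeyPhrase;

OUTPUT @phraseSentiment
TO "/ReferenceGuide/Cognition/Text/SentimentAnalyzer3B.txt"
ORDER BY Mentions DESC
USING Outputters.Tsv();
```

Sorting by Mentions puts the most frequent key phrases first, which makes it easier to disregard averages based on only one or two rows.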

See Also