U-SQL Advanced Analytics: Introducing Cognitive scenarios for Text and Imaging

Yesterday we introduced you to U-SQL Advanced Analytics and showed how Python can be used with U-SQL. Today, we'll show U-SQL's built-in support for Cognitive scenarios for images and text.

Currently, U-SQL supports these cognitive scenarios:

  • Detecting Objects in Images (Tagging)
  • Detecting Emotion in Faces in Images
  • Detecting Text in Images (OCR)
  • Text Key Phrase Extraction
  • Text Sentiment Analysis

Over time, we'll add more support and enhance the integration in many ways.

Here's an example of how U-SQL can be used to detect objects in images:

REFERENCE ASSEMBLY ImageCommon;
REFERENCE ASSEMBLY FaceSdk;
REFERENCE ASSEMBLY ImageEmotion;
REFERENCE ASSEMBLY ImageTagging;
REFERENCE ASSEMBLY ImageOcr;

@imgs =
    EXTRACT FileName string, ImgData byte[]
    FROM @"/images/{FileName}.jpg"
    USING new Cognition.Vision.ImageExtractor();

// Extract the number of objects on each image and tag them 
@objects =
    PROCESS @imgs 
    PRODUCE FileName,
            NumObjects int,
            Tags string
    READONLY FileName
    USING new Cognition.Vision.ImageTagger();

OUTPUT @objects 
    TO "/objects.tsv"
    USING Outputters.Tsv();

In this sample, the Cognition.Vision.ImageTagger() processor detects the objects in each image, writing their count to the NumObjects column and a text description of them to the Tags column.
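
Once the tagger output exists, it can be queried like any other rowset. The following is a minimal sketch rather than part of the original sample: it assumes the script above has already run and written /objects.tsv, and the rowset name @tagged and the output path /objects_nonempty.tsv are hypothetical names chosen for illustration.

```usql
// Assumption: the tagging script above has already produced /objects.tsv
// with the columns FileName, NumObjects and Tags.
@objects =
    EXTRACT FileName string,
            NumObjects int,
            Tags string
    FROM "/objects.tsv"
    USING Extractors.Tsv();

// Keep only the images in which at least one object was detected.
@tagged =
    SELECT FileName,
           NumObjects,
           Tags
    FROM @objects
    WHERE NumObjects > 0;

// Hypothetical output path; images with the most objects come first.
OUTPUT @tagged
    TO "/objects_nonempty.tsv"
    ORDER BY NumObjects DESC
    USING Outputters.Tsv();
```

The same SELECT could instead be appended to the tagging script and run directly against @objects, which avoids the extra read of the TSV file.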

Here's an example of how U-SQL can be used to understand text:

REFERENCE ASSEMBLY [TextCommon];
REFERENCE ASSEMBLY [TextSentiment];
REFERENCE ASSEMBLY [TextKeyPhrase];

@WarAndPeace =
    EXTRACT No int,
            Year string,
            Book string,
            Chapter string,
            Text string
    FROM @"/usqlext/samples/cognition/war_and_peace.csv"
    USING Extractors.Csv();

@sentiment =
    PROCESS @WarAndPeace
    PRODUCE No,
            Year,
            Book,
            Chapter,
            Text,
            Sentiment string,
            Conf double
    READONLY No,
             Year,
             Book,
             Chapter,
             Text
    USING new Cognition.Text.SentimentAnalyzer(true);

OUTPUT @sentiment 
    TO "/sentiment.tsv"
    USING Outputters.Tsv();

In this sample, the Cognition.Text.SentimentAnalyzer() processor analyzes the sentiment of each row's Text value, placing the detected sentiment in the Sentiment column and a confidence score in the Conf column.
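
The Sentiment and Conf columns can then be summarized with ordinary U-SQL. The sketch below is illustrative only: it assumes the script above has already written /sentiment.tsv, and the names @summary, NumRows, AvgConf, and /sentiment_by_book.tsv are hypothetical, not part of the original sample.

```usql
// Assumption: the sentiment script above has already produced /sentiment.tsv
// with the same columns it declares in its PRODUCE clause.
@sentiment =
    EXTRACT No int,
            Year string,
            Book string,
            Chapter string,
            Text string,
            Sentiment string,
            Conf double
    FROM "/sentiment.tsv"
    USING Extractors.Tsv();

// Count the rows per book and sentiment label, and average the Conf values.
@summary =
    SELECT Book,
           Sentiment,
           COUNT(*) AS NumRows,
           AVG(Conf) AS AvgConf
    FROM @sentiment
    GROUP BY Book, Sentiment;

// Hypothetical output path.
OUTPUT @summary
    TO "/sentiment_by_book.tsv"
    ORDER BY Book, Sentiment
    USING Outputters.Tsv();
```

Grouping by both Book and Sentiment gives one row per label per book, which makes it easy to compare how the tone shifts across the novel.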

To learn more about our support for U-SQL Advanced Analytics and how to enable it in your Data Lake Analytics accounts, see our Getting Started guide.

Comments

  • Anonymous
    November 28, 2016
    This is awesome, really amazing stuff you guys have done with Data Lake overall and stuff like this just brings it to the next level. Looking forward to what's coming next!
  • Anonymous
    December 11, 2016
    The comment has been removed
    • Anonymous
      December 11, 2016
      The problem got resolved with the latest U-SQL in this post. Thanks a lot, Saveen.
  • Anonymous
    March 19, 2017
    I have a JPEG file that contains text, but the output .csv file is completely blank. The DLL works with the sample images but doesn't with mine. Is there a specification for the JPEG files that OCR can read/detect?
    • Anonymous
      March 22, 2017
      Rakesh, can you please describe your scenario in more detail? What exactly are you trying to do?