Protected material detection

The Protected material text API flags known text content (for example, song lyrics, articles, recipes, and selected web content) that might be output by large language models. This guide describes the kinds of content that the Protected material text API detects.
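As a rough illustration of how an application might submit model output to this API, the following Python sketch builds the HTTP request without sending it. The route `text:detectProtectedMaterial`, the `api-version` value, the `Ocp-Apim-Subscription-Key` header, and the `{"text": ...}` body are assumptions that this guide does not state; check them against the quickstart before relying on them. The endpoint and key are placeholders, and the function name is hypothetical.

```python
import json
import urllib.request

# Assumed values: none of these appear in this guide. Verify them in the quickstart.
ASSUMED_ROUTE = "/contentsafety/text:detectProtectedMaterial"
ASSUMED_API_VERSION = "2024-09-01"


def build_protected_material_request(endpoint: str, key: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the text to check."""
    url = f"{endpoint.rstrip('/')}{ASSUMED_ROUTE}?api-version={ASSUMED_API_VERSION}"
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": key,  # assumed header name
        },
    )


if __name__ == "__main__":
    request = build_protected_material_request(
        "https://example-resource.cognitiveservices.azure.com",  # placeholder endpoint
        "your-key-here",  # placeholder key
        "Text produced by a language model goes here.",
    )
    print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it. This guide does not describe the response body, so consult the quickstart for its shape before parsing it.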

Protected material examples

Refer to this table for details of the major categories of protected material text detection. All four categories are applied when you call the API.

Category Scope Considered acceptable Considered harmful
Category: Recipes
Scope: Copyrighted content related to Recipes. Other harmful or sensitive text is out of scope for this task, unless it intersects with Recipes IP copyright harm.
Considered acceptable:
  • Links to web pages that contain information about recipes
  • Any content from recipes that have no or low IP/Copyright protections:
    • Lists of ingredients
    • Basic instructions for combining and cooking ingredients
  • Rejection or refusal to provide copyrighted content:
    • Changing a topic to avoid sharing copyrighted content
    • Refusal to share copyrighted content
    • Providing nonresponsive information
Considered harmful:
  • Other literary content in a recipe:
    • Matching anecdotes, stories, or personal commentary about the recipe (40 characters or more)
    • Creative names for the recipe that are not limited to the well-known name of the dish, or a plain descriptive summary of the dish indicating what the primary ingredient is (40 characters or more)
    • Creative descriptions of the ingredients or steps for combining or cooking ingredients, including descriptions that contain more information than needed to create the dish, rely on imprecise wording, or contain profanity (40 characters or more)
  • Methods to access copyrighted content:
    • Ways to bypass paywalls to access recipes

Category: Web Content
Scope: All websites that have webmd.com as their URL domain name. Only focuses on issues of copyrighted content around Selected Web Content. Other harmful or sensitive text is out of scope for this task, unless it intersects with Selected Web Content harm.
Considered acceptable:
  • Links to web pages
  • Short excerpts or snippets of Selected Web Content as long as:
    • They are relevant to the user's query
    • They are fewer than 200 characters
Considered harmful:
  • Substantial content of Selected Web Content:
    • Response sections longer than 200 characters that bear substantial similarity to a block of text from the Selected Web Content
    • Excerpts from Selected Web Content that are longer than 200 characters
    • Quotes from Selected Web Content that are longer than 200 characters
  • Methods to access copyrighted content:
    • Ways to bypass paywalls or DRM protections to access copyrighted Selected Web Content

Category: News
Scope: Only focuses on issues of copyrighted content around News. Other harmful or sensitive text is out of scope for this task, unless it intersects with News IP Copyright harm.
Considered acceptable:
  • Links to web pages that host news or information about news, magazines, or blog articles as long as:
    • They have legitimate permissions
    • They have licensed news coverage
    • They are authorized platforms
  • Links to authorized web pages that contain embedded audio/video players as long as:
    • They have legitimate permissions
    • They have licensed news coverage
    • They are authorized streaming platforms
    • They are official YouTube channels
  • Short excerpts/snippets like headlines or captions from news articles as long as:
    • They are relevant to the user's query
    • They are not a substantial part of the article
    • They are not the entire article
  • Summary of news articles as long as:
    • It is relevant to the user's query
    • It is brief and factual
    • It does not copy/paraphrase a substantial part of the article
    • It is clearly and visibly cited as a summary
  • Analysis/Critique/Review of news articles as long as:
    • It is relevant to the user's query
    • It is brief and factual
    • It does not copy/paraphrase a substantial part of the article
    • It is clearly and visibly cited as an analysis/critique/review
  • Any news content that has no IP/Copyright protections:
    • News/Magazines/Blogs that are in the public domain
    • News/Magazines/Blogs for which Copyright protection has elapsed, been surrendered, or never existed
  • Rejection or refusal to provide copyrighted content:
    • Changing topic to avoid sharing copyrighted content
    • Refusal to share copyrighted content
    • Providing nonresponsive information
Considered harmful:
  • Links to PDFs or any other files containing the full text of news/magazine/blog articles, unless:
    • They are sourced from authorized platforms with legitimate permissions and licenses
  • News content:
    • More than 200 characters taken verbatim from any news article
    • More than 200 characters substantially similar to a block of text from any news article
    • Direct access to news/magazine/blog articles that are behind paywalls
  • Methods to access copyrighted content:
    • Steps to download news from an unauthorized website
    • Ways to bypass paywalls or DRM protections to access copyrighted news or videos

Category: Lyrics
Scope: Only focuses on issues of copyrighted content around Songs. Other harmful or sensitive text is out of scope for this task, unless it intersects Songs IP Copyright harm.
Considered acceptable:
  • Links to web pages that contain information about songs such as:
    • Lyrics of the songs
    • Chords or tabs of the associated music
    • Analysis or reviews of the song/music
  • Links to authorized web pages that contain embedded audio/video players as long as:
    • They have legitimate permissions
    • They have licensed music
    • They are authorized streaming platforms
    • They are official YouTube channels
  • Short excerpts or snippets from lyrics of the songs as long as:
    • They are relevant to the user's query
    • They are not a substantial part of the lyrics
    • They are not the entire lyrics
    • They are not more than 11 words long
  • Short excerpts or snippets from chords/tabs of the songs as long as:
    • They are relevant to the user's query
    • They are not a substantial part of the chords/tabs
    • They are not the entire chords/tabs
  • Any content from songs that have no IP/Copyright protections:
    • Songs/Lyrics/Chords/Tabs that are in the public domain
    • Songs/Lyrics/Chords/Tabs for which Copyright protection has elapsed, been surrendered, or never existed
  • Rejection or refusal to provide copyrighted content:
    • Changing topic to avoid sharing copyrighted content
    • Refusal to share copyrighted content
    • Providing nonresponsive information
Considered harmful:
  • Lyrics of a song:
    • Entire lyrics
    • Substantial part of the lyrics
    • Part of lyrics that contain more than 11 words
  • Chords or Tabs of a song:
    • Entire chords/tabs
    • Substantial part of the chords/tabs
  • Links to webpages that contain embedded audio/video players that:
    • Do not have legitimate permissions
    • Do not have licensed music
    • Are not authorized streaming platforms
    • Are not official YouTube channels
  • Methods to access copyrighted content:
    • Steps to download songs from an unauthorized website
    • Ways to bypass paywalls or DRM protections to access copyrighted songs or videos
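
The length thresholds quoted in the table above can be collected in one place. The following Python sketch only illustrates those documented limits; it is not how the service detects protected material, and the names `LIMITS` and `exceeds_documented_limit` are hypothetical. It checks length alone and says nothing about whether the text actually matches protected content. Counting words by splitting on whitespace is also an assumption.

```python
# Length thresholds quoted in the table above, as (unit, limit, inclusive).
# inclusive=True means "limit or more"; inclusive=False means "more than limit".
LIMITS = {
    "recipes": ("chars", 40, True),         # "40 characters or more"
    "web_content": ("chars", 200, False),   # "longer than 200 characters"
    "news": ("chars", 200, False),          # "more than 200 characters"
    "lyrics": ("words", 11, False),         # "more than 11 words"
}


def exceeds_documented_limit(category: str, excerpt: str) -> bool:
    """Return True if the excerpt is long enough to cross the category's quoted threshold."""
    unit, limit, inclusive = LIMITS[category]
    # Assumption: words are whitespace-separated tokens.
    size = len(excerpt.split()) if unit == "words" else len(excerpt)
    return size >= limit if inclusive else size > limit


if __name__ == "__main__":
    # Illustrative placeholder text, not real lyrics.
    snippet = "one two three four five six seven eight nine ten eleven twelve"
    print(exceeds_documented_limit("lyrics", snippet))
```

The table leaves an excerpt of exactly 200 characters of Selected Web Content unspecified (acceptable is "fewer than 200", harmful is "longer than 200"); this sketch treats it as not exceeding the limit.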

Next steps

Follow the quickstart to get started using Azure AI Content Safety to detect protected material.