PIIDetectionSkill Class

Using the Text Analytics API, extracts personal information from an input text and gives you the option of masking it.

All required parameters must be populated before the skill can be sent to the server.

Inheritance
azure.search.documents.indexes._generated.models._models_py3.SearchIndexerSkill
PIIDetectionSkill

Constructor

PIIDetectionSkill(*, inputs: List[_models.InputFieldMappingEntry], outputs: List[_models.OutputFieldMappingEntry], name: str | None = None, description: str | None = None, context: str | None = None, default_language_code: str | None = None, minimum_precision: float | None = None, masking_mode: str | _models.PIIDetectionSkillMaskingMode | None = None, mask: str | None = None, model_version: str | None = None, pii_categories: List[str] | None = None, domain: str | None = None, **kwargs: Any)

Keyword-Only Parameters

Name Description
name
str

The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'.

description
str

The description of the skill which describes the inputs, outputs, and usage of the skill.

context
str

Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document.

inputs
list[InputFieldMappingEntry]

Inputs of the skill can be a column in the source data set, or the output of an upstream skill. Required.

outputs
list[OutputFieldMappingEntry]

The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. Required.

default_language_code
str

A value indicating which language code to use. Default is en.

minimum_precision
float

A value between 0 and 1 that is used to include only entities whose confidence score is greater than the value specified. If not set (default), or if explicitly set to None, all entities are included.

masking_mode
str or PIIDetectionSkillMaskingMode

A parameter that provides various ways to mask the personal information detected in the input text. Default is 'none'. Known values are: "none" and "replace".

mask
str

The character used to mask the text if the masking_mode parameter is set to 'replace'. Default is '*'.

model_version
str

The version of the model to use when calling the Text Analytics service. It will default to the latest available when not specified. We recommend you do not specify this value unless absolutely necessary.

pii_categories
list[str]

A list of PII entity categories that should be extracted and masked.

domain
str

If specified, will set the PII domain to include only a subset of the entity categories. Possible values include: 'phi', 'none'. Default is 'none'.
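
To make the interaction between minimum_precision, masking_mode, and mask concrete, here is a minimal standalone sketch. It does not call the service or this SDK. The entity dicts (text, offset, length, score) and the helper names filter_entities and mask_text are assumptions made for illustration, as is the choice to emit one mask character per original character.

```python
from typing import Any, Dict, List, Optional

# Hypothetical shape for a detected entity; the real service returns richer objects.
Entity = Dict[str, Any]


def filter_entities(entities: List[Entity], minimum_precision: Optional[float]) -> List[Entity]:
    """Keep entities whose score is greater than minimum_precision (keep all if None)."""
    if minimum_precision is None:
        return list(entities)
    return [e for e in entities if e["score"] > minimum_precision]


def mask_text(text: str, entities: List[Entity], masking_mode: str = "none", mask: str = "*") -> str:
    """Overwrite each detected span with the mask character when masking_mode is 'replace'."""
    if masking_mode != "replace":
        return text
    chars = list(text)
    for e in entities:
        start = e["offset"]
        end = start + e["length"]
        # Assumption: one mask character per masked character.
        chars[start:end] = mask * (end - start)
    return "".join(chars)


text = "Call Ana at 555-0100."
entities = [
    {"text": "Ana", "offset": 5, "length": 3, "score": 0.4},
    {"text": "555-0100", "offset": 12, "length": 8, "score": 0.9},
]

kept = filter_entities(entities, minimum_precision=0.5)  # drops the low-score "Ana"
masked = mask_text(text, kept, masking_mode="replace", mask="*")
print(masked)
```

With masking_mode left at 'none', mask_text returns the input unchanged, which mirrors the documented default.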

Variables

Name Description
odata_type
str

A URI fragment specifying the type of skill. Required.

name
str

The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'.

description
str

The description of the skill which describes the inputs, outputs, and usage of the skill.

context
str

Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document.

inputs
list[InputFieldMappingEntry]

Inputs of the skill can be a column in the source data set, or the output of an upstream skill. Required.

outputs
list[OutputFieldMappingEntry]

The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. Required.

default_language_code
str

A value indicating which language code to use. Default is en.

minimum_precision
float

A value between 0 and 1 that is used to include only entities whose confidence score is greater than the value specified. If not set (default), or if explicitly set to None, all entities are included.

masking_mode
str or PIIDetectionSkillMaskingMode

A parameter that provides various ways to mask the personal information detected in the input text. Default is 'none'. Known values are: "none" and "replace".

mask
str

The character used to mask the text if the masking_mode parameter is set to 'replace'. Default is '*'.

model_version
str

The version of the model to use when calling the Text Analytics service. It will default to the latest available when not specified. We recommend you do not specify this value unless absolutely necessary.

pii_categories
list[str]

A list of PII entity categories that should be extracted and masked.

domain
str

If specified, will set the PII domain to include only a subset of the entity categories. Possible values include: 'phi', 'none'. Default is 'none'.
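
The default-naming rule described for the name attribute can be illustrated with a small standalone helper. effective_skill_names is a hypothetical function, not part of the SDK. It only mirrors the rule as stated above, and it assumes an empty string counts as "no name defined".

```python
from typing import List, Optional


def effective_skill_names(names: List[Optional[str]]) -> List[str]:
    """Return each skill's name, falling back to '#<1-based index>' when unset."""
    # Assumption: None and the empty string both mean "no name defined".
    return [name if name else f"#{index}" for index, name in enumerate(names, start=1)]


# The second skill has no name, so it falls back to its 1-based position.
print(effective_skill_names(["detect-language", None, "pii-mask"]))
```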

Methods

as_dict

Return a dict that can be serialized using json.dump.

Advanced usage can pass a callback as the key_transformer parameter. The callback receives three arguments: key, attr_desc, and value.

key is the attribute name used in Python. attr_desc is a dict of metadata; it currently contains 'type' with the msrest type and 'key' with the RestAPI-encoded key. value is the current value in this object.

The string returned by the callback is used as the serialized key. If the callback returns a list, the result is treated as a hierarchical dict.

See the three example transformers:

  • attribute_transformer

  • full_restapi_key_transformer

  • last_restapi_key_transformer

If you want XML serialization, pass the keyword argument is_xml=True.

as_dict(keep_readonly: bool = True, key_transformer: Callable[[str, Dict[str, Any], Any], Any] = attribute_transformer, **kwargs: Any) -> MutableMapping[str, Any]

Parameters

Name Description
key_transformer
function

A key transformer function.

Default value: attribute_transformer

keep_readonly
bool

Whether to include the read-only attributes in the result.

Default value: True

Returns

Type Description

A JSON-compatible dict object
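
The key transformer callback described in this section can be sketched without the SDK. The example below is a simplified stand-in, not the library's implementation: ATTRIBUTE_MAP, to_dict, python_name_transformer, and rest_key_transformer are hypothetical names, and the REST keys shown are assumed for illustration. It covers only callbacks that return a string, not the hierarchical list case.

```python
from typing import Any, Callable, Dict

# Hypothetical attribute metadata in the shape described above:
# 'key' is the REST-encoded key, 'type' is the serialized type name.
ATTRIBUTE_MAP: Dict[str, Dict[str, str]] = {
    "default_language_code": {"key": "defaultLanguageCode", "type": "str"},
    "minimum_precision": {"key": "minimumPrecision", "type": "float"},
    "masking_mode": {"key": "maskingMode", "type": "str"},
}

KeyTransformer = Callable[[str, Dict[str, Any], Any], Any]


def python_name_transformer(key: str, attr_desc: Dict[str, Any], value: Any) -> str:
    """Use the Python attribute name as the output key."""
    return key


def rest_key_transformer(key: str, attr_desc: Dict[str, Any], value: Any) -> str:
    """Use the REST-encoded key from the attribute metadata as the output key."""
    return attr_desc["key"]


def to_dict(values: Dict[str, Any], key_transformer: KeyTransformer = python_name_transformer) -> Dict[str, Any]:
    """Build a dict, asking the callback how each key should be written."""
    return {key_transformer(name, ATTRIBUTE_MAP[name], value): value for name, value in values.items()}


values = {"default_language_code": "en", "minimum_precision": 0.5, "masking_mode": "replace"}
print(to_dict(values))                        # Python attribute names as keys
print(to_dict(values, rest_key_transformer))  # REST-style keys
```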

deserialize

Parse a str using the RestAPI syntax and return a model.

deserialize(data: Any, content_type: str | None = None) -> ModelType

Parameters

Name Description
data
Required
str

A str using RestAPI structure. JSON by default.

content_type
str

JSON by default, set application/xml if XML.

Default value: None

Returns

Type Description

An instance of this model

Exceptions

Type Description
DeserializationError

If something went wrong

enable_additional_properties_sending

enable_additional_properties_sending() -> None

from_dict

Parse a dict using the given key extractor and return a model.

By default, the following key extractors are considered: rest_key_case_insensitive_extractor, attribute_key_case_insensitive_extractor, and last_rest_key_case_insensitive_extractor.

from_dict(data: Any, key_extractors: Callable[[str, Dict[str, Any], Any], Any] | None = None, content_type: str | None = None) -> ModelType

Parameters

Name Description
data
Required
dict

A dict using RestAPI structure

content_type
str

JSON by default, set application/xml if XML.

Default value: None

key_extractors
function

A key extractor function.

Default value: None

Returns

Type Description

An instance of this model

Exceptions

Type Description
DeserializationError

If something went wrong
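
As a rough illustration of case-insensitive key extraction, the sketch below tries a REST-style key first and then the Python attribute name. It is a simplified model of the idea, not the library's code: ATTRIBUTE_TO_REST_KEY, by_rest_key, by_attribute_name, and from_dict_sketch are hypothetical names, the REST keys are assumed, and the "first extractor that finds a value wins" rule is a simplification chosen for this example.

```python
from typing import Any, Callable, Dict, Mapping, Optional, Sequence

# Assumed mapping from Python attribute names to REST-style keys (illustrative subset).
ATTRIBUTE_TO_REST_KEY: Dict[str, str] = {
    "default_language_code": "defaultLanguageCode",
    "masking_mode": "maskingMode",
}

Extractor = Callable[[str, str, Mapping[str, Any]], Optional[Any]]


def _lookup_case_insensitive(wanted: str, data: Mapping[str, Any]) -> Optional[Any]:
    """Return the value whose key matches `wanted`, ignoring case, or None."""
    wanted_lower = wanted.lower()
    for key, value in data.items():
        if key.lower() == wanted_lower:
            return value
    return None


def by_rest_key(attr: str, rest_key: str, data: Mapping[str, Any]) -> Optional[Any]:
    """Look the value up by its REST-style key."""
    return _lookup_case_insensitive(rest_key, data)


def by_attribute_name(attr: str, rest_key: str, data: Mapping[str, Any]) -> Optional[Any]:
    """Look the value up by its Python attribute name."""
    return _lookup_case_insensitive(attr, data)


def from_dict_sketch(
    data: Mapping[str, Any],
    extractors: Sequence[Extractor] = (by_rest_key, by_attribute_name),
) -> Dict[str, Any]:
    """Collect attribute values; the first extractor that finds a value wins."""
    result: Dict[str, Any] = {}
    for attr, rest_key in ATTRIBUTE_TO_REST_KEY.items():
        for extractor in extractors:
            value = extractor(attr, rest_key, data)
            if value is not None:
                result[attr] = value
                break
    return result


# Mixed input: one REST-style key with unusual casing, one Python-style key.
parsed = from_dict_sketch({"DefaultLanguageCode": "en", "masking_mode": "replace"})
print(parsed)
```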

is_xml_model

is_xml_model() -> bool

serialize

Return the JSON that would be sent to server from this model.

This is an alias to as_dict(full_restapi_key_transformer, keep_readonly=False).

If you want XML serialization, pass the keyword argument is_xml=True.

serialize(keep_readonly: bool = False, **kwargs: Any) -> MutableMapping[str, Any]

Parameters

Name Description
keep_readonly
bool

Whether to serialize the read-only attributes.

Default value: False

Returns

Type Description

A JSON-compatible dict object