Privacera Platform master publication

Models

Models detect specific data elements in your data resources. The detection is done with various algorithms and heuristics.

Types of models

Privacera supports different types of models. You can filter the list of models using the search model option. This tab also displays the current record count.

Generic models

Generic models accept the following general parameters, which you can use to tailor how data is matched.

Parameter

Data Type

Default

Description

INCLUDE_PATTERN_<#>

String

None

Patterns to be matched.

To specify more than one pattern, change the value of the <#> variable. For example: INCLUDE_PATTERN_1, INCLUDE_PATTERN_2, INCLUDE_PATTERN_3.

EXCLUDE_PATTERN_<#>

String

None

Patterns to be excluded from matching.

To specify more than one pattern, change the value of the <#> variable. For example: EXCLUDE_PATTERN_1, EXCLUDE_PATTERN_2, EXCLUDE_PATTERN_3.

ONLY_DIGITS

Boolean

FALSE

Indicates whether matching should use only digits. Setting this parameter to TRUE removes all non-numeric characters from the string before matching. For example, 1234-5 is treated as 12345.

CHECK_DIGIT_CODE_VALIDATE

String

None

Indicates whether to evaluate a checksum digit based on the last digit. Valid values:

  • LUHN

  • ABA

  • CUSIP

  • DIHEDRAL

  • IBAN

  • UK_NHS

  • MOD11

  • ISBN10

DO_LOOKUP

Boolean

FALSE

Indicates whether to use patterns specified by the LOOKUP_PATTERN parameter. If this parameter is set to TRUE, the patterns specified in LOOKUP_PATTERN are used.

LOOKUP_DICT

String

None

A dictionary name or key. See Dictionaries.

LOOKUP_PATTERN

String

None

Pattern for matching. See Patterns.

Note

See Embed Patterns in Dictionaries.

ISO3166_CC_VALIDATE_FLAG

Boolean

FALSE

Indicates whether to use Privacera-defined matching to validate an ISO two-character country code. If this parameter is set to TRUE, ISO3166_CC_PATTERN is used.

ISO3166_CC_PATTERN

None

A valid pattern for matching country codes. See Patterns.

Note

See Embed Patterns in Dictionaries.

ISO3166_CC_LOOKUP_KEY

None

Name of a defined dictionary. See Dictionaries.
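
The interaction between INCLUDE_PATTERN_<#>, EXCLUDE_PATTERN_<#>, and ONLY_DIGITS can be illustrated with a minimal Python sketch. The function name generic_match is hypothetical, and two behaviors are assumptions rather than documented Privacera behavior: patterns are applied to the whole value, and an exclude match takes precedence over an include match.

```python
import re

def generic_match(value, include_patterns, exclude_patterns=(), only_digits=False):
    """Return True if value matches an include pattern and no exclude pattern."""
    if only_digits:
        # ONLY_DIGITS=TRUE: remove every non-numeric character before matching.
        value = re.sub(r"\D", "", value)
    # Assumption: any exclude pattern that matches rejects the value.
    if any(re.fullmatch(p, value) for p in exclude_patterns):
        return False
    return any(re.fullmatch(p, value) for p in include_patterns)
```

With only_digits=True, the value 1234-5 is reduced to 12345 and matches the pattern \d{5}; without it, the hyphen prevents the match.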

Credit card model

The credit card model detects credit card numbers. It validates numbers based on the issuing network, length, and Luhn checksum.

Parameter

Type

Default

Meaning

CC_PATTERN

String

Privacera-supplied pattern for credit card numbers: a range of digits, separated by spaces or hyphens.

Credit card pattern, if you want to override the supplied pattern.

DEFAULT_TYPES

Boolean

True

Validate against known issuing network prefixes.

LUHN_CHECK

Boolean

True

Validate the Luhn checksum on the credit card number.
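
As an illustration of what LUHN_CHECK verifies, here is a minimal Python sketch of the standard Luhn algorithm. The function name luhn_valid is hypothetical, and ignoring separators such as spaces and hyphens is an assumption; this is not Privacera's implementation.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in number pass the Luhn checksum."""
    # Keep ASCII digits only, so spaces and hyphens are ignored (assumption).
    digits = [int(ch) for ch in number if "0" <= ch <= "9"]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0
```

For example, the widely used test number 4111 1111 1111 1111 passes this check, while changing its last digit to 2 makes it fail.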

Supported credit card types

Credit Card Type

Conditions

Examples

American Express (AMEX) Card

Credit card starting with 34 or 37 and having 15 digits.

34xxxxxxxxxxxxx

37xxxxxxxxxxxxx

Master Card

  • Credit card starting with 51 to 55 and having 14 digits

  • Credit card starting with 2221 and having 12 digits

  • Credit card starting with 27 and having 13 digits.

51xxxxxxxxxxxx

2221xxxxxxxx

27xxxxxxxxxxx

Visa Card

Credit card starting with 4 and having 13 or 16 digits.

4xxxxxxxxxxxx

4xxxxxxxxxxxxxxx

Diners Club Card

Credit card starting with 300 to 305 or 3095 or 36 or 38 or 39 and having 14 digits.

300xxxxxxxxxxx

3095xxxxxxxxxx

VPay (Visa) Card

Credit card starting with 4 and having 13 or 19 digits.

4xxxxxxxxxxxx

4xxxxxxxxxxxxxxxxxx
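
The prefix and length conditions in the table above can be sketched in Python as follows. For brevity, the sketch covers only the American Express, Visa, and Diners Club rows; the function name card_type is hypothetical, and stripping separators before checking is an assumption.

```python
def card_type(number: str):
    """Return the card type for the three rows sketched here, or None."""
    # Keep ASCII digits only, so spaces and hyphens are ignored (assumption).
    digits = "".join(ch for ch in number if "0" <= ch <= "9")
    length = len(digits)
    # American Express: starts with 34 or 37, 15 digits.
    if digits.startswith(("34", "37")) and length == 15:
        return "American Express"
    # Visa: starts with 4, 13 or 16 digits.
    if digits.startswith("4") and length in (13, 16):
        return "Visa"
    # Diners Club: starts with 300 to 305, 3095, 36, 38, or 39; 14 digits.
    diners = ("300", "301", "302", "303", "304", "305", "3095", "36", "38", "39")
    if digits.startswith(diners) and length == 14:
        return "Diners Club"
    return None
```

A prefix check of this kind corresponds to the DEFAULT_TYPES parameter and would typically be combined with the Luhn checksum rather than used alone.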

Date of birth model

The Date of Birth model detects various date formats.

Parameter

Type

Default

Meaning

MIN_AGE_YEARS

Integer

5

Age lower threshold.

MAX_AGE_YEARS

Integer

100

Age upper threshold.

USE_ALGO

Boolean

True

Tagging is based on an algorithm that detects a random distribution of dates.

DATE_REGEX_var1

String

Pattern that matches a custom date format var1.

DATE_FORMAT_var1

String

Date format that matches the pattern for var1.

Pre-configured date formats are:

  • International YMD format with 4-digit year

  • US MDY with 4-digit or 2-digit year

  • Month-abbreviated MDY

Additional formats can be configured. For example, configure a regex and a Java date format:

Parameter

Value

DATE_REGEX_1

\d{4} \d{2} \d{2}

DATE_FORMAT_1

yyyy MM dd
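
A minimal Python sketch of how a custom regex and date format could combine with the age thresholds follows. It uses the Python format string %Y %m %d as the equivalent of the Java format yyyy MM dd. The function name is_plausible_dob, the explicit today argument, and the inclusive age bounds are assumptions; the USE_ALGO distribution check is not modeled.

```python
import re
from datetime import date, datetime

DATE_REGEX_1 = r"\d{4} \d{2} \d{2}"
DATE_FORMAT_1 = "%Y %m %d"  # Python equivalent of the Java format "yyyy MM dd"

def is_plausible_dob(text, today, min_age_years=5, max_age_years=100):
    """Return True if text is a valid date whose age falls within the thresholds."""
    if not re.fullmatch(DATE_REGEX_1, text):
        return False
    try:
        born = datetime.strptime(text, DATE_FORMAT_1).date()
    except ValueError:
        # The regex matched, but the value is not a real calendar date.
        return False
    # Age in whole years: subtract one if the birthday has not occurred yet.
    age = today.year - born.year - ((today.month, today.day) < (born.month, born.day))
    # Assumption: both thresholds are inclusive.
    return min_age_years <= age <= max_age_years
```

The regex alone accepts strings such as 1990 13 40; parsing with the date format is what rejects them.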

EIN model

The EIN model detects Employer Identification Numbers (EINs) using patterns and digit validation.

Parameter

Type

Default

Meaning

EIN_PATTERN

String

Default

EIN digit pattern if you want to override the default pattern.

VALIDATIONS

Boolean

True

Validate the digits of the EIN.

STRICT_PATTERN

Boolean

True

Allow match only if EIN has exact format.

Geo latitude and longitude model

The geo model detects latitude and longitude coordinates. It can validate these values based on a geographical area.

Parameter

Type

Default

Meaning

MIN_LAT

Double

US min latitude

Lower limit (southern) on latitude.

MAX_LAT

Double

US max latitude

Upper limit (northern) on latitude.

MIN_LONG

Double

US min longitude

Lower limit (west) on longitude.

MAX_LONG

Double

US max longitude

Upper limit (east) on longitude.

MIN_FRACTIONAL_DIGITS

Integer

3

Minimum number of digits after the decimal point.
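
The following Python sketch shows one way the bounding box and MIN_FRACTIONAL_DIGITS could be applied together. The numeric bounds are hypothetical values roughly covering the contiguous United States, not Privacera's actual defaults, and the function names are illustrative.

```python
# Hypothetical bounding box; the actual Privacera defaults are not listed here.
MIN_LAT, MAX_LAT = 24.0, 50.0
MIN_LONG, MAX_LONG = -125.0, -66.0
MIN_FRACTIONAL_DIGITS = 3

def fractional_digits(text):
    """Count the digits after the decimal point in a numeric string."""
    _, _, frac = text.strip().partition(".")
    return len(frac) if frac.isdigit() else 0

def is_lat_long(lat_text, long_text):
    """Return True if both values are precise enough and inside the bounding box."""
    try:
        lat, lon = float(lat_text), float(long_text)
    except ValueError:
        return False
    # Reject coarse values such as 37.7, which have too few fractional digits.
    if min(fractional_digits(lat_text), fractional_digits(long_text)) < MIN_FRACTIONAL_DIGITS:
        return False
    return MIN_LAT <= lat <= MAX_LAT and MIN_LONG <= lon <= MAX_LONG
```

Requiring a minimum number of fractional digits helps avoid tagging ordinary decimal numbers that happen to fall inside the latitude and longitude ranges.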

IMEI model

The IMEI model detects International Mobile Equipment Identity numbers that are used to identify mobile phones. It validates the Luhn checksum and the length of the IMEI.

ITIN model

The ITIN model detects Individual Taxpayer Identification Numbers (identifiers of individual taxpayers). It validates the format and digits of the ITIN.

Parameter

Type

Default

Meaning

ITIN_PATTERN

String

Default

ITIN digit pattern if you want to override the default pattern.

STRICT_PATTERN

Boolean

True

Allow match only if ITIN has exact format.

MIME model

The MIME model detects a file based on its Multipurpose Internet Mail Extensions type. The MIME type is detected using a combination of file extension and magic bytes in the header of the file. The detected MIME type is then looked up in a dictionary of MIME types.

Parameter

Type

Default

Meaning

LOOKUP_DICT

String

Identifier of dictionary of MIME types.

There are two pre-configured MIME model instances.

  • For detecting executable files: LOOKUP_DICT=EXEC_MIME_KEYWORD.

  • For detecting image files: LOOKUP_DICT=IMAGE_MIME_KEYWORD.
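
The detection flow described above (magic bytes plus file extension, followed by a dictionary lookup) can be sketched in Python. The dictionary contents, the short signature table, and the choice to check magic bytes before the extension are all assumptions made for illustration.

```python
import mimetypes

# Hypothetical contents for the dictionary referenced by LOOKUP_DICT.
IMAGE_MIME_KEYWORD = {"image/png", "image/jpeg", "image/gif"}

# A few well-known magic byte signatures and their MIME types.
MAGIC_BYTES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"GIF8": "image/gif",
}

def detect_mime(filename, header):
    """Detect a MIME type from the file header, falling back to the extension."""
    for signature, mime in MAGIC_BYTES.items():
        if header.startswith(signature):
            return mime
    guessed, _ = mimetypes.guess_type(filename)
    return guessed

def in_dictionary(filename, header, dictionary):
    """Return True if the detected MIME type is listed in the dictionary."""
    return detect_mime(filename, header) in dictionary
```

Checking the header first means that a PNG file renamed to photo.bin is still detected as image/png.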

Phone number model

The Phone Number model detects phone numbers. It validates the format of the phone numbers based on the country for which it is configured.

Parameter

Type

Default

Meaning

COUNTRY_CODE

String

US

Two-character country code.

SSN model

The SSN model detects US Social Security Numbers (SSNs). It validates the format and checks against a blacklist of SSNs.

Parameter

Type

Default

Meaning

SSN_PATTERN

String

Default

Override the default SSN pattern.

VALIDATIONS

Boolean

True

Validate against known blacklist of SSNs.

STRICT_PATTERN

Boolean

False

Allow match only if SSN has exact format.

USE_9_DIGIT_PATTERN

Boolean

False

Match against any nine digit number without format.

USE_4_DIGIT_PATTERN

Boolean

False

Match against any four-digit number without format. Disables validation against the SSN blacklist.

STRICT_EXT_PATTERN

Boolean

True

Allow match only if SSN has exact format that is hyphen-, dot-, or space-separated.

Examples of Invalid SSNs

The SSN model treats the following SSNs as invalid:

  • SSN starting with 9 or 666 or 000 or 98765432.

  • SSN with 00 as the fourth and fifth digits.

  • SSN with 0000 as the sixth through ninth digits.

  • Any SSN like these:

    • 123456789

    • 111111111

    • 222222222

    • 333333333

    • 444444444

    • 555555555

    • 666666666

    • 777777777

    • 888888888

    • 999999999
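
The rules listed above can be expressed as a short Python sketch. The function name is_invalid_ssn is hypothetical, and treating any value that does not contain exactly nine digits as invalid is an assumption not stated in the list.

```python
def is_invalid_ssn(ssn: str) -> bool:
    """Return True if the SSN matches one of the invalid cases listed above."""
    # Keep ASCII digits only, so hyphens, dots, and spaces are ignored.
    digits = "".join(ch for ch in ssn if "0" <= ch <= "9")
    if len(digits) != 9:
        return True  # assumption: anything other than nine digits is rejected
    # Starts with 9, 666, 000, or 98765432.
    if digits.startswith(("9", "666", "000", "98765432")):
        return True
    # 00 as the fourth and fifth digits.
    if digits[3:5] == "00":
        return True
    # 0000 as the sixth through ninth digits.
    if digits[5:9] == "0000":
        return True
    # Sequential or single repeated digit values such as 123456789 or 111111111.
    if digits == "123456789" or len(set(digits)) == 1:
        return True
    return False
```

For example, 666-12-3456 and 123-00-4567 are flagged as invalid, while 234-56-7890 passes all of these checks.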

VIN model

The VIN model detects Vehicle Identification Numbers. It validates the length and the VIN checksum.
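
As an illustration of a length and checksum validation, the following Python sketch implements the check digit scheme commonly used for North American VINs (17 characters, check digit in the ninth position). Whether Privacera uses exactly this scheme is an assumption, and the function name vin_valid is hypothetical.

```python
# Letter-to-number transliteration; I, O, and Q are not allowed in a VIN.
TRANSLITERATION = {
    **{str(d): d for d in range(10)},
    "A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 7, "H": 8,
    "J": 1, "K": 2, "L": 3, "M": 4, "N": 5, "P": 7, "R": 9,
    "S": 2, "T": 3, "U": 4, "V": 5, "W": 6, "X": 7, "Y": 8, "Z": 9,
}
# Positional weights; the ninth position (the check digit) has weight 0.
WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2]

def vin_valid(vin: str) -> bool:
    """Return True if vin has 17 valid characters and a correct check digit."""
    vin = vin.strip().upper()
    if len(vin) != 17 or any(ch not in TRANSLITERATION for ch in vin):
        return False
    total = sum(TRANSLITERATION[ch] * w for ch, w in zip(vin, WEIGHTS))
    remainder = total % 11
    # A remainder of 10 is written as the letter X.
    expected = "X" if remainder == 10 else str(remainder)
    return vin[8] == expected
```

The commonly cited sample VIN 1M8GDM9AXKP042788 has a weighted sum of 351, which leaves remainder 10 when divided by 11, so its check digit is X.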

Zip model

The Zip model detects US Zip codes. It detects both 5 digit and 5+4 digit variations and validates against a dictionary of US Zip codes.

Parameter

Type

Default

Meaning

ZIP_DICT_KEY

String

US_ZIP_LOOKUP

Key of the US Zip dictionary.

ZIP_PATTERN

String

Default

Regular expression used to validate content against the list of Zip codes.

STRICT_PATTERN

Boolean

False

Allow match only if the Zip code has the exact format. If set to true, only nine-digit values that start with five digits and contain a hyphen ('-') are considered Zip codes.
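
The combination of pattern matching, STRICT_PATTERN, and dictionary lookup can be sketched in Python as follows. The regular expressions and the three-entry dictionary are hypothetical stand-ins for the Privacera default pattern and the US_ZIP_LOOKUP dictionary, and looking up only the five-digit part is an assumption.

```python
import re

# Hypothetical stand-in for the dictionary referenced by ZIP_DICT_KEY.
US_ZIP_LOOKUP = {"10001", "60601", "94105"}

# Assumed patterns: 5 digits with an optional -4 suffix, or strictly 5+4.
ZIP_PATTERN = re.compile(r"(\d{5})(?:-(\d{4}))?")
STRICT_ZIP_PATTERN = re.compile(r"(\d{5})-(\d{4})")

def is_zip(text, strict=False):
    """Return True if text has a Zip format and its 5-digit part is in the dictionary."""
    pattern = STRICT_ZIP_PATTERN if strict else ZIP_PATTERN
    match = pattern.fullmatch(text.strip())
    # Assumption: only the five-digit part is looked up in the dictionary.
    return bool(match) and match.group(1) in US_ZIP_LOOKUP
```

With strict=True, a bare five-digit value such as 94105 is rejected even though it is present in the dictionary.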

Create a model

To create a model, follow these steps:

  1. From the navigation menu, select Discovery > Models.

  2. Click Add Model.

    The Add Model dialog is displayed.

  3. In the Name field, enter a name for the model.

  4. In the Description field, enter a description of the model.

  5. In the Key field, enter a model key.

  6. From the Type dropdown menu, select a model type.

    Note

    See Types of Models for more information.

  7. From the Apply For dropdown menu, select File content.

    Note

    File content is resource content.

  8. Enable or disable the model using the Model Status toggle.

  9. Add model properties by clicking +.

  10. Enter a key and value in the Key and Value fields. For example: Key: MIN_FRACTIONAL_DIGITS, Value: 2. You can add multiple model properties.

  11. Click Save.

    The model is created.

Edit a model

You can edit a model by clicking the Edit icon in the Actions column.

To edit a model, follow these steps:

  1. Click the Edit icon in the Actions column.

    The Edit Model dialog is displayed.

  2. Make your desired changes.

  3. Click Save.

    The model is updated.

Delete a model

You can delete a model by clicking the Delete icon in the Actions column.

To delete a model, follow these steps:

  1. Click the Delete icon in the Actions column.

    The Confirm Delete dialog is displayed.

  2. Select Delete to confirm the deletion.

    The model is deleted.

Import a model

To import a model file in JSON format, follow these steps:

  1. On the Models page, click Import.

    The Import dialog is displayed.

  2. Browse and select the JSON file and click Import.

The model file is imported.

Export a model

To export a model file in JSON format, follow these steps:

  1. On the Models page, click Export.

  2. From the drop-down menu, select one of the following options:

    • All Records: Export the entire set of models.

    • Select Records: Select the specific model to export. You can select multiple models.

  3. Click Export.

    The JSON file is exported.

List of Privacera-supplied models

The following is a list of the Privacera-supplied models. For precise details, look at the model itself in the Platform UI.

  • DOB_ML_MODEL

  • CC_ML_MODEL

  • ZIP_ML_MODEL

  • IMEI_ML_MODEL

  • SSN_ML_MODEL

  • EXEC_ML_MODEL

  • MIME_ML_MODEL

  • PHONE_NUMBER_ML_MODEL

  • GEO_LAT_LONG_ML_MODEL

  • CC_ML_MODEL_PROTECTED

  • EIN_ML_MODEL

  • ITIN_ML_MODEL

  • VIN_ML_MODEL

  • SSN_9_DIGIT_ML_MODEL

  • SSN_4_DIGIT_ML_MODEL

  • IMAGE_FILE_ML_MODEL

  • IMAGE_ML_MODEL