Privacera Documentation

Supported file formats for Discovery Scans

Privacera Discovery can scan the following file formats:

  • Structured data with taggable content and metadata:

    • .avro

    • .avro (nested)

    • .csv

    • .html

    • .json

    • .json (nested)

    • .orc

    • .parquet

    • .parquet (nested)

    • .sas

    • .tsv

    • .xls

    • .xlsx

    • .xml

  • Compressed/archive data with taggable content and metadata:

    • .gzip (single or multiple files)

    • .gz (single or multiple files)

    • .lzo/.lzop

    • .jar (single or multiple files)

    • .tar.gz (single or multiple files)

    • .snappy.parquet

    • .snappy.orc

    • .snappy.avro

    • .zip (single or multiple files)

    • .zlib.orc

    • .zlib.parquet

    • .zlib.avro

  • Unstructured data with taggable content and metadata:

    • .dat

    • .doc

    • .docx

    • .pdf

    • .txt

  • Media data with taggable metadata:

    Note: For the following file formats, Discovery supports metadata extraction only.

    • .jpeg

    • .mp4

    • .mpeg
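
The list above can be used to pre-check file names before submitting a location for scanning. The following is a minimal sketch, not part of Privacera or its API: the function name `classify` and the category labels are hypothetical, and the check looks only at the file extension, so it says nothing about whether the file content is valid or nested.

```python
from pathlib import PurePosixPath
from typing import Optional

# Multi-part suffixes from the compressed/archive group. These are checked
# first so that "x.snappy.parquet" is not mistaken for plain ".parquet".
COMPOUND_COMPRESSED = (
    ".tar.gz",
    ".snappy.parquet", ".snappy.orc", ".snappy.avro",
    ".zlib.orc", ".zlib.parquet", ".zlib.avro",
)

# Single-part extensions, transcribed from the list above.
# The category labels are illustrative assumptions, not Privacera terms.
SINGLE = {
    "structured": {".avro", ".csv", ".html", ".json", ".orc", ".parquet",
                   ".sas", ".tsv", ".xls", ".xlsx", ".xml"},
    "compressed": {".gzip", ".gz", ".lzo", ".lzop", ".jar", ".zip"},
    "unstructured": {".dat", ".doc", ".docx", ".pdf", ".txt"},
    "media": {".jpeg", ".mp4", ".mpeg"},
}


def classify(filename: str) -> Optional[str]:
    """Return the category for a file name, or None if the extension is not listed."""
    name = filename.lower()
    if name.endswith(COMPOUND_COMPRESSED):
        return "compressed"
    ext = PurePosixPath(name).suffix  # last extension only, e.g. ".csv"
    for category, extensions in SINGLE.items():
        if ext in extensions:
            return category
    return None


if __name__ == "__main__":
    for sample in ("events.snappy.parquet", "events.parquet", "photo.JPEG", "notes.md"):
        print(sample, "->", classify(sample))
```

Files classified as `media` would have only their metadata extracted, per the note above; a result of `None` means the extension does not appear in the list.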