Privacera Documentation

Set up Databricks encryption and masking

This topic describes how to install and configure the Privacera encryption JAR file in Databricks so that you can create user-defined functions (UDFs) for encryption and masking and create policies for users and groups.

The overall approach is as follows:

  1. Install the Privacera Manager Encryption JAR in Databricks with the Databricks CLI or UI.

  2. Upload Privacera Manager configuration files to Databricks.

  3. Define UDFs in Databricks to call the Privacera Manager encryption protect and unprotect methods.

Prerequisites
  • In Databricks, make sure that the users who will use the UDFs have sufficient access to write to the pertinent tables.

  • In Privacera Manager, make sure to configure the Databricks datasource: Databricks Spark Plugin (Python/SQL) on AWS, Azure, or GCP.

  • In Privacera Manager, make sure that Privacera Encryption has been enabled.

  • In Privacera Manager, make sure that the users who will use the UDFs in Databricks have been given permission to access the scheme policies that are part of the UDF syntax. See Create scheme policies.

  • In Privacera Manager, make sure that these same users have been given access to Ranger KMS. See Provide user access to Ranger KMS.

Methods for installing the Privacera encryption JAR

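You can install the Privacera encryption JAR file in either of the following ways: with the Databricks CLI or with the Databricks UI. Both methods are described in the sections that follow.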

After you install the JAR file, you need to define some configuration properties and User-Defined Functions (UDFs) to call the Privacera encryption /protect and /unprotect API endpoints.

Install the Privacera encryption JAR using Databricks CLI

To install the Privacera encryption JAR using the Databricks CLI, follow these steps:

  1. Download the JAR to a local machine.

    The variable PRIVACERA_BASE_DOWNLOAD_URL depends on the version of the Privacera software you want. See Install Privacera Manager on Privacera Platform.

    # Replace <PRIVACERA_BASE_DOWNLOAD_URL> with the download URL for your Privacera version.
    export PRIVACERA_BASE_DOWNLOAD_URL=<PRIVACERA_BASE_DOWNLOAD_URL>
    wget ${PRIVACERA_BASE_DOWNLOAD_URL}/privacera-crypto-jar-with-dependencies.jar -O privacera-crypto-jar-with-dependencies.jar
    
  2. Upload the JAR file to DBFS or to an S3 location that the Databricks cluster can access.

  3. To upload the JAR file to DBFS, use the Databricks CLI:

    databricks fs ls
    databricks fs mkdirs dbfs:/privacera/crypto/jars
    databricks fs cp privacera-crypto-jar-with-dependencies.jar dbfs:/privacera/crypto/jars/privacera-crypto-jar-with-dependencies.jar
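
The upload commands above can be wrapped in a small script that first checks that the download succeeded. The following is a minimal sketch, assuming the Databricks CLI is already installed and authenticated against your workspace. The function name upload_crypto_jar and the pre-flight check are illustrative and are not part of the Privacera or Databricks tooling; the JAR name and DBFS path are the ones used in the steps above.

```shell
#!/bin/sh
# Sketch only: upload_crypto_jar is a hypothetical helper, not a Privacera
# or Databricks command. It assumes the Databricks CLI is installed and
# configured with credentials for your workspace.

JAR_NAME="privacera-crypto-jar-with-dependencies.jar"
DBFS_JAR_DIR="dbfs:/privacera/crypto/jars"

upload_crypto_jar() {
    # $1: local directory that contains the downloaded JAR (default: current directory)
    jar_path="${1:-.}/${JAR_NAME}"

    # Fail early if the download step did not produce a non-empty file.
    if [ ! -s "${jar_path}" ]; then
        echo "JAR not found or empty: ${jar_path}" >&2
        return 1
    fi

    # Same commands as in the steps above: create the target directory,
    # then copy the JAR into it.
    databricks fs mkdirs "${DBFS_JAR_DIR}" || return 1
    databricks fs cp "${jar_path}" "${DBFS_JAR_DIR}/${JAR_NAME}" || return 1

    # List the directory so that you can confirm the JAR is present.
    databricks fs ls "${DBFS_JAR_DIR}"
}
```

To use the sketch, source the script and run upload_crypto_jar from the directory where you downloaded the JAR, or pass that directory as the first argument.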

Install the Privacera encryption JAR using Databricks UI

To install the Privacera encryption JAR using the Databricks UI, follow these steps:

  1. Navigate to the Databricks cluster details page by selecting Clusters > cluster name > Libraries.

  2. Click Install > New.

  3. Drop or upload the JAR file. If you already uploaded the JAR file to DBFS, specify its path instead:

    dbfs:/privacera/crypto/jars/privacera-crypto-jar-with-dependencies.jar

  4. Wait until the JAR file is installed.

Create and upload encryption configuration files

The steps here rely on the default location of the Privacera crypto properties file. However, you can change this location to a directory of your choice. Follow the steps here and then see Create a custom path to the crypto properties file in Databricks.

To create and upload the encryption configuration files, do the following:

  1. Create the configuration file on your local machine. You upload this file to DBFS in the next step.

    mkdir -p privacera/crypto/configs
    cd privacera/crypto/configs
    # Create crypto_default.properties and add the properties shown below.
    vi crypto_default.properties

    # Contents of crypto_default.properties:
    privacera.portal.base.url=http://<APP_HOSTNAME>:6868
    privacera.portal.username=<SOME_USERNAME>
    privacera.portal.password=<SOME_PASSWORD>
    # Mode of encryption/decryption: rpc or native
    privacera.crypto.mode=native
    
  2. Upload the configuration file to DBFS.

    databricks fs ls
    databricks fs mkdirs dbfs:/privacera/crypto/configs
    databricks fs cp crypto_default.properties dbfs:/privacera/crypto/configs/crypto_default.properties
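
If you prefer not to edit the properties file by hand, step 1 above can be scripted. The following is a minimal sketch: the function name write_crypto_properties and its argument order are hypothetical, while the property keys, the port 6868, and the allowed modes (rpc or native) are taken from the steps above. The script rejects any other mode and restricts the file permissions because the file contains a password.

```shell
#!/bin/sh
# Sketch only: write_crypto_properties is a hypothetical helper, not part
# of the Privacera tooling.

write_crypto_properties() {
    # $1: Privacera Portal hostname
    # $2: Portal username
    # $3: Portal password
    # $4: crypto mode, rpc or native (default: native, as in the steps above)
    host="$1"
    user="$2"
    password="$3"
    mode="${4:-native}"

    if [ -z "${host}" ] || [ -z "${user}" ] || [ -z "${password}" ]; then
        echo "usage: write_crypto_properties HOST USER PASSWORD [rpc|native]" >&2
        return 1
    fi

    # Only the two modes named in the properties file comment are accepted.
    case "${mode}" in
        rpc|native) ;;
        *)
            echo "privacera.crypto.mode must be rpc or native, got: ${mode}" >&2
            return 1
            ;;
    esac

    out_dir="privacera/crypto/configs"
    out_file="${out_dir}/crypto_default.properties"
    mkdir -p "${out_dir}" || return 1

    # Write the four properties used in the steps above.
    cat > "${out_file}" <<EOF
privacera.portal.base.url=http://${host}:6868
privacera.portal.username=${user}
privacera.portal.password=${password}
privacera.crypto.mode=${mode}
EOF

    # The file holds a password, so make it readable by the owner only.
    chmod 600 "${out_file}"
}
```

After running the function, change to the privacera/crypto/configs directory and upload the generated file with the databricks fs cp command from step 2.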