PrivaceraCloud Documentation

Use a custom policy repository with Databricks

You can use a custom policy repository with Databricks if you use the fine-grained access control (FGAC) plug-in for access management. A custom policy repository uses a unique prefix, such as dev, which you specify as part of your Databricks cluster configuration.

The prefix string is injected into the Apache Ranger access control plug-in through an environment variable named SERVICE_NAME_PREFIX.

When configured, the FGAC plug-in can use access policies in custom policy repositories for the following service types:

  • Hive

  • S3

  • Files

  • ADLS

Complete one of the following procedures to use Databricks with custom policy repositories:

Configure a custom policy repository for all Databricks clusters with cluster policy

You can define a Databricks cluster policy that sets the SERVICE_NAME_PREFIX environment variable for all clusters that use the policy.

Procedure
  1. Log in to your Databricks account.

  2. Define a cluster policy and specify the following JSON configuration:

    {
      "spark_env_vars.SERVICE_NAME_PREFIX": {
        "type": "fixed",
        "value": "<SERVICE_NAME_PREFIX>"
      }
    }

    Where:

    • <SERVICE_NAME_PREFIX>: Specifies the policy repository prefix, such as qa.

  3. To apply the new cluster policy, restart each Databricks cluster.
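
The policy definition in step 2 can also be generated programmatically, which may help when several workspaces use different prefixes. The following is a minimal Python sketch, not a Privacera-provided tool. The function name `build_prefix_policy` and the validation rule (letters, digits, underscores, and hyphens only) are assumptions made for illustration, so check which characters your policy repository names actually allow.

```python
import json
import re

# Hypothetical validation rule: the set of characters allowed in a prefix
# is an assumption, not documented Privacera behavior.
_PREFIX_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def build_prefix_policy(prefix: str) -> str:
    """Return a cluster policy definition that pins SERVICE_NAME_PREFIX."""
    if not _PREFIX_PATTERN.fullmatch(prefix):
        raise ValueError(f"unexpected characters in prefix: {prefix!r}")
    definition = {
        "spark_env_vars.SERVICE_NAME_PREFIX": {
            "type": "fixed",
            "value": prefix,
        }
    }
    return json.dumps(definition, indent=2)


if __name__ == "__main__":
    # Prints the JSON to paste into the cluster policy definition.
    print(build_prefix_policy("qa"))
```

The output has the same structure as the JSON configuration shown in step 2, with the placeholder replaced by the prefix you pass in.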

Configure a custom policy repository for a single Databricks cluster with an environment variable

You can set the SERVICE_NAME_PREFIX environment variable directly in the configuration of a specific Databricks cluster.

Procedure
  1. Log in to your Databricks account.

  2. From your list of Databricks clusters, edit the cluster you want to use with a custom policy repository.

  3. Update the cluster environment variables to include the following value:

    SERVICE_NAME_PREFIX=<SERVICE_NAME_PREFIX>

    Where:

    • <SERVICE_NAME_PREFIX>: Specifies the policy repository prefix, such as qa.

  4. To apply the new environment variable, restart the Databricks cluster.
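
After the restart, you may want to confirm that the variable is visible on the cluster. The following is a minimal sketch, assuming that variables set in the cluster configuration are present in the environment of a Python notebook process on the driver. The helper `read_service_name_prefix` is hypothetical; it only reads the value and does not verify that the FGAC plug-in picked it up.

```python
import os
from typing import Mapping, Optional


def read_service_name_prefix(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the configured prefix, or None if it is unset or blank."""
    # Default to the process environment; a mapping can be passed for testing.
    source = os.environ if env is None else env
    value = source.get("SERVICE_NAME_PREFIX", "").strip()
    return value or None


if __name__ == "__main__":
    prefix = read_service_name_prefix()
    if prefix is None:
        print("SERVICE_NAME_PREFIX is not set")
    else:
        print(f"SERVICE_NAME_PREFIX={prefix}")
```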

Configure a custom policy repository for a single Databricks cluster with an Init Script

You can edit the FGAC plug-in Init Script that your Databricks cluster runs so that it sets the SERVICE_NAME_PREFIX environment variable.

Procedure
  1. Log in to the system where you installed Privacera Manager.

  2. Locate the privacera_databricks.sh script and open the script with an editor.

  3. In the script, set the following variable:

    SERVICE_NAME_PREFIX=<SERVICE_NAME_PREFIX>

    Where:

    • <SERVICE_NAME_PREFIX>: Specifies the policy repository prefix, such as qa.

  4. To upload the modified script to your Databricks DBFS, enter the following command:

    dbfs cp privacera_databricks.sh dbfs:/<PATH>/privacera_databricks.sh

    Where:

    • <PATH>: Specifies the DBFS path to copy the updated script to.

  5. To apply the updated Init Script, restart the Databricks cluster.
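
The edit in steps 2 and 3 can be scripted so that repeated runs do not add duplicate assignments. The following is a minimal sketch, assuming the variable is written as a plain `SERVICE_NAME_PREFIX=...` line and that placing a new assignment directly after a leading shebang line is acceptable. Where the assignment must appear in privacera_databricks.sh is not stated here, so review the result before uploading. The function name `set_service_name_prefix` is hypothetical.

```python
import re

# Matches an existing assignment at the start of a line, with optional "export".
_ASSIGNMENT = re.compile(r"^(export[ \t]+)?SERVICE_NAME_PREFIX=.*$", re.MULTILINE)


def set_service_name_prefix(script_text: str, prefix: str) -> str:
    """Replace an existing SERVICE_NAME_PREFIX assignment or insert a new one."""
    new_line = f"SERVICE_NAME_PREFIX={prefix}"
    if _ASSIGNMENT.search(script_text):
        # Keep a leading "export" if the script already used one.
        return _ASSIGNMENT.sub(lambda m: (m.group(1) or "") + new_line, script_text)
    lines = script_text.splitlines(keepends=True)
    # Assumption: insert after a shebang line so that it stays on top.
    insert_at = 1 if lines and lines[0].startswith("#!") else 0
    if insert_at == 1 and not lines[0].endswith("\n"):
        lines[0] += "\n"
    lines.insert(insert_at, new_line + "\n")
    return "".join(lines)


if __name__ == "__main__":
    sample = "#!/bin/bash\necho start\n"
    print(set_service_name_prefix(sample, "qa"), end="")
```

To use it on the real file, read privacera_databricks.sh, pass its contents through the function, write the result back, and then upload the file with the command in step 4.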