

System-level settings for Spark plugin on Privacera Platform

Each entry below lists the property name, its description, and example values.

DATABRICKS_VERSION

Set this property to select which version of the Spark config to use.

From release 5.0 onwards, two versions of the Spark config are available: V1 and V2. V1 is used by default; V2 is for preview purposes only.

If your Databricks version is 7.6 or later, use V2 for the Spark config. For lower versions, use V1.

Example: V2
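As a sketch, this property is set as a top-level key in the Privacera Manager Databricks vars YAML file (the file name and surrounding properties vary by deployment):

```yaml
# Use Spark Config V2 when the Databricks version is 7.6 or later; otherwise V1.
DATABRICKS_VERSION: "V2"
```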

DATABRICKS_HOST_URL

Enter the URL where the Databricks environment is hosted.

For Azure Databricks:

DATABRICKS_HOST_URL: "https://xdx-66506xxxxxxxx.2.azuredatabricks.net/?o=665066931xxxxxxx"

For AWS Databricks:

DATABRICKS_HOST_URL: "https://xxx-7xxxfaxx-xxxx.cloud.databricks.com"

DATABRICKS_TOKEN

Enter the Databricks access token.

To generate the token:

  1. Log in to your Databricks account.

  2. Click the user profile icon.

  3. Click User Settings.

  4. Click Generate New Token.

  5. (Optional) Enter a description (comment) and expiration period.

  6. Click Generate.

  7. Copy the generated token.

DATABRICKS_TOKEN: "xapid40xxxf65xxxxxxe1470eayyyyycdc06"
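Taken together, the host URL and token appear as adjacent properties in the same vars YAML file; a sketch with placeholder values (replace them with your own workspace URL and generated token):

```yaml
# AWS Databricks workspace URL; for Azure, include the ?o=<workspace-id> suffix.
DATABRICKS_HOST_URL: "https://<deployment-name>.cloud.databricks.com"
# Personal access token generated via User Settings > Generate New Token.
DATABRICKS_TOKEN: "<your-databricks-token>"
```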

DATABRICKS_WORKSPACES_LIST

Add multiple Databricks workspaces to connect to Ranger.

  1. To add a single workspace, add the following default JSON in the text area to define the host URL and token of the Databricks workspace. Do not leave the text area empty; it must contain at least the default JSON.

    [
      {
        "alias": "DEFAULT",
        "databricks_host_url": "{{DATABRICKS_HOST_URL}}",
        "token": "{{DATABRICKS_TOKEN}}"
      }
    ]

    Note

    Do not edit any of the values in the default JSON.

  2. To add two workspaces, use the following JSON.

    [
      {
        "alias": "DEFAULT",
        "databricks_host_url": "{{DATABRICKS_HOST_URL}}",
        "token": "{{DATABRICKS_TOKEN}}"
      },
      {
        "alias": "<workspace-2-alias>",
        "databricks_host_url": "<workspace-2-url>",
        "token": "<dbx-token-for-workspace-2>"
      }
    ]

Note

{{var}} is an Ansible variable; it re-uses the value of a predefined variable. Hence, do not edit the databricks_host_url and token properties of the DEFAULT alias, as they are set from DATABRICKS_HOST_URL and DATABRICKS_TOKEN respectively.

DATABRICKS_ENABLE

If set to 'true', Privacera Manager will create the Databricks cluster init script ranger_enable.sh at the following location:

~/privacera/privacera-manager/output/databricks/ranger_enable.sh

Examples: "true", "false"

DATABRICKS_MANAGE_INIT_SCRIPT

If set to 'true', Privacera Manager will upload the init script (ranger_enable.sh) to the identified Databricks host.

If set to 'false', upload the following two files to the DBFS location yourself. The files are located at ~/privacera/privacera-manager/output/databricks.

  • privacera_spark_plugin_job.conf

  • privacera_spark_plugin.conf

Examples: "true", "false"

DATABRICKS_SPARK_PLUGIN_AGENT_JAR

A string of extra JVM options to pass to the Spark driver; used to attach the Privacera Java agent.

Example: -javaagent:/databricks/jars/privacera-agent.jar
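For context, a -javaagent flag of this kind is typically carried to the driver JVM through Spark's extraJavaOptions setting; a hedged sketch of how the resulting Spark configuration line might look (the exact wiring is handled by the Privacera-generated init script):

```conf
# Illustrative only: attaches the Privacera agent to the Spark driver JVM.
spark.driver.extraJavaOptions -javaagent:/databricks/jars/privacera-agent.jar
```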

DATABRICKS_SPARK_PRIVACERA_CUSTOM_CURRENT_USER_UDF_NAME

Map the logged-in user to the Ranger user for row-filter policies.

Example: current_user()

DATABRICKS_SPARK_PRIVACERA_VIEW_LEVEL_MASKING_ROWFILTER_EXTENSION_ENABLE

Property to enable masking, row filtering, and data_admin access on views.

Example: false

DATABRICKS_JWT_OAUTH_ENABLE

Enable JWT authentication in the Databricks plugin and Databricks Signed URL.

Example: TRUE

DATABRICKS_JWT_PUBLIC_KEY_FILE_NAME

Enter the filename of the public key. Ensure the name does not contain any spaces.

Note

Copy the public key to the config/custom-properties folder.

Example: jwttoken.pub

DATABRICKS_JWT_ISSUER

Enter the URL of the identity provider. Get it from the Prerequisites section.

Example: https://your-idp-domain.com

DATABRICKS_JWT_SUBJECT

Subject of the JWT (the user).

Example: api-token

DATABRICKS_JWT_SECRET

Property for the JWT secret. If the JWT has been signed with a secret, use this property to set that secret.

DATABRICKS_JWT_USERKEY

Define a unique user key.

Example: client_id

DATABRICKS_JWT_GROUPKEY

Define a unique group key.

Example: scope

DATABRICKS_JWT_PARSER_TYPE

Assign one of the following values:

  • PING_IDENTITY

  • KEYCLOAKS

Example: PING_IDENTITY
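The JWT properties above map onto claims in the token payload. A hypothetical decoded payload, assuming the example values DATABRICKS_JWT_SUBJECT=api-token, DATABRICKS_JWT_USERKEY=client_id, and DATABRICKS_JWT_GROUPKEY=scope (the claim values shown are illustrative, not prescribed by the product):

```json
{
  "iss": "https://your-idp-domain.com",
  "sub": "api-token",
  "client_id": "emily",
  "scope": "sales analysts"
}
```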

DATABRICKS_SQL_CLUSTER_POLICY_SPARK_CONF

Configure Databricks Cluster policy.

Add the following JSON in the text area:

[
  {
    "Note": "First spark conf",
    "key": "spark.hadoop.first.spark.test",
    "value": "test1"
  },
  {
    "Note": "Second spark conf",
    "key": "spark.hadoop.second.spark.test",
    "value": "test2"
  }
]
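For context, Databricks cluster policies express fixed Spark configuration values as spark_conf.* keys. A hedged sketch of the kind of policy rule a single entry above could translate into (the exact shape of the generated policy is determined by Privacera Manager; key and value here are the illustrative ones from the JSON above):

```json
{
  "spark_conf.spark.hadoop.first.spark.test": {
    "type": "fixed",
    "value": "test1"
  }
}
```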

DATABRICKS_CUSTOM_SPARK_CONFIG_FILE

Using this property, you can pass custom properties to the Spark configuration.

  1. Create a file with the filename databricks-spark.conf.

  2. Add all the custom properties you want to pass. For example, you can add "spark.databricks.delta.formatCheck.enabled"="false" to the file.

  3. Browse and select the Spark custom file where you have defined all the custom properties.
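As a sketch, a databricks-spark.conf file carrying the example property from step 2 would simply list one property per line:

```conf
# Custom Spark properties passed through to the cluster's Spark configuration.
"spark.databricks.delta.formatCheck.enabled"="false"
```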

DATABRICKS_POST_PLUGIN_COMMAND_LIST

Note

This property is not part of the default YAML file, but can be added if required.

Use this property if you want to run a specific set of commands in the Databricks init script.

The following example will be added to the cluster init script to allow Athena JDBC via data access server.

DATABRICKS_POST_PLUGIN_COMMAND_LIST:
  - sudo iptables -I OUTPUT 1 -p tcp -m tcp --dport 8181 -j ACCEPT
  - sudo curl -k -u user:password {{PORTAL_URL}}/api/dataserver/cert?type=dataserver_jks -o /etc/ssl/certs/dataserver.jks
  - sudo chmod 755 /etc/ssl/certs/dataserver.jks

DATABRICKS_RANGER_IS_FALLBACK_SUPPORTED

Use this property to enable or disable fallback to the privacera_files and privacera_hive services, which determine whether the user should be allowed or denied access to the resource files.

To enable the fallback, set to true; to disable, set to false.

Example: true

DATABRICKS_SERVICE_NAME_PREFIX

The service name prefix. By default, this value is commented out and the default privacera prefix is used. If you use a custom value, you must manually create the service repositories from Privacera Portal; see Configure service name for Databricks Spark plugin on Privacera Platform.

Note

This property is applicable only for Databricks FGAC plugin.

Example: dev