Skip to main content

Privacera Platform master publication

Databricks SQL
:

This topic shows how to configure access control in Databricks SQL.

Prerequisites

Ensure the following prerequisite is met:

  • Create an endpoint in Databricks SQL with a user having admin privileges. For more information, refer to Create an endpoint in Databricks SQL.

  • As you configure the endpoint using the link provided above, get the following values:

    • Host URL

    • JDBC URL

    • SQL endpoint token

    • Database List

CLI configuration
  1. SSH to the instance where Privacera is installed.

  2. Run the following commands.

    cd ~/privacera/privacera-manager/config
    cp sample-vars/vars.policysync.databricks.sql.analytics.yml custom-vars/
    vi config/custom-vars/vars.policysync.databricks.sql.analytics.yml
    
  3. Set the properties for your specific installation. For property details and description, see the Configuration Properties section that follows.

    Note

    Along with the above properties, you can add custom properties that are not included by default. For more information about these properties, see Databricks SQL Connector.

  4. Run the following commands.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
    
Configuration properties

JDBC configuration properties

Table 36. JDBC configuration properties

Name

Type

Default

Required

Description

DATABRICKS_SQL_ANALYTICS_JDBC_URL

string

Yes

Specifies the JDBC URL for the Databricks SQL connector.

Use the following format for the JDBC URL:

jdbc:spark://<WORKSPACE_URL>:443/<DATABASE>;transportMode=http;ssl=1;AuthMech=3;httpPath=/sql/1.0/endpoints/1234567890

The workspace URL and the database name are derived from your Databricks SQL configuration.

DATABRICKS_SQL_ANALYTICS_JDBC_USERNAME

string

Yes

Specifies the JDBC username to use.

DATABRICKS_SQL_ANALYTICS_JDBC_PASSWORD

string

Yes

Specifies the access token of the SQL endpoint to use.

DATABRICKS_SQL_ANALYTICS_JDBC_DB

string

Yes

Specifies the name of the JDBC database to use.

DATABRICKS_SQL_ANALYTICS_OWNER_ROLE

string

No

Specifies the role that owns the resources managed by PolicySync. You must ensure that this user exists as PolicySync does not create this user.

  • If a value is not specified, resources are owned by the creating user. In this case, the owner of the resource will have all access to the resource.

  • If a value is specified, the owner of the resource will be changed to the specified value.

The following resource types are supported:

  • Database

  • Schemas

  • Tables

  • Views

DATABRICKS_SQL_ANALYTICS_HOST_URL

string

Yes

Specifies the base URL for the Databricks SQL instance.



Load keys and intervals

Table 37. Load keys and intervals

Name

Type

Default

Required

Description

DATABRICKS_SQL_ANALYTICS_LOAD_RESOURCES_KEY

string

load_like

No

Specifies how PolicySync loads resources from Databricks SQL. The following values are allowed:

  • load_like: Default value for loading resources, to be used in production.

  • load: Load resources from Databricks SQL with a top-down resources approach, that is, it first loads the database and then the schemas followed by tables and its columns. This mode is only for development purposes.

DATABRICKS_SQL_ANALYTICS_RESOURCE_SYNC_INTERVAL

integer

60

No

Specifies the interval in seconds for PolicySync to wait before checking for new resources or changes to existing resources.

DATABRICKS_SQL_ANALYTICS_PRINCIPAL_SYNC_INTERVAL

integer

420

No

Specifies the interval in seconds for PolicySync to wait before reconciling principals with those in the data source, such as users, groups, and roles. When differences are detected, PolicySync updates the principals in the data source accordingly.

DATABRICKS_SQL_ANALYTICS_PERMISSION_SYNC_INTERVAL

integer

540

No

Specifies the interval in seconds for PolicySync to wait before reconciling Apache Ranger access control policies with those in the data source. When differences are detected, PolicySync updates the access control permissions on data source accordingly.

DATABRICKS_SQL_ANALYTICS_AUDIT_SYNC_INTERVAL

integer

30

No

Specifies the interval in seconds to elapse before PolicySync retrieves access audits and saves the data in Privacera.



Resources management

Table 38. Resources management

Name

Type

Default

Required

Description

DATABRICKS_SQL_ANALYTICS_MANAGE_DATABASE_LIST

string

No

Specifies a comma-separated list of database names for which PolicySync manages access control. If unset, access control is managed for all databases. If specified, use the following format. You can use wildcards. Names are case-sensitive.

An example list of databases might resemble the following: testdb1,testdb2,sales db*.

If specified, DATABRICKS_SQL_ANALYTICS_IGNORE_DATABASE_LIST takes precedence over this setting.

DATABRICKS_SQL_ANALYTICS_MANAGE_TABLE_LIST

string

No

Specifies a comma-separated list of table names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

Use the following format when specifying a table:

<DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>

If specified, DATABRICKS_SQL_ANALYTICS_IGNORE_TABLE_LIST takes precedence over this setting.

If you specify a wildcard, such as in the following example, all matched tables are managed:

<DATABASE_NAME>.<SCHEMA_NAME>.*

The specified value, if any, is interpreted in the following ways:

  • If unset, access control is managed for all tables.

  • If set to none no tables are managed.

DATABRICKS_SQL_ANALYTICS_IGNORE_DATABASE_LIST

string

No

Specifies a comma-separated list of database names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all databases are subject to access control.

For example:

testdb1,testdb2,sales_db*

This setting supersedes any values specified by DATABRICKS_SQL_ANALYTICS_MANAGE_DATABASE_LIST.

DATABRICKS_SQL_ANALYTICS_IGNORE_TABLE_LIST

string

No

Specifies a comma-separated list of table names that PolicySync does not provide access control for. You can specify wildcards. If not specified, all tables are subject to access control. Names are case-sensitive. Specify tables using the following format:

<DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME>

This setting supersedes any values specified by DATABRICKS_SQL_ANALYTICS_MANAGE_TABLE_LIST.



Users/Groups/Roles management

Table 39. Users/Groups/Roles management

Name

Type

Default

Required

Description

DATABRICKS_SQL_ANALYTICS_NAME_REPLACE_FROM_REGEX

string

[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

No

Specifies a regular expression to apply to a user name and replaces each matching character with the value specified by the DATABRICKS_SQL_ANALYTICS_NAME_REPLACE_TO_STRING setting.

If not specified, no find and replace operation is performed.

DATABRICKS_SQL_ANALYTICS_NAME_REPLACE_TO_STRING

string

_

No

Specifies a string to replace the characters matched by the regex specified by the DATABRICKS_SQL_ANALYTICS_NAME_REPLACE_FROM_REGEX setting.

If not specified, no find and replace operation is performed.

DATABRICKS_SQL_ANALYTICS_USER_NAME_REPLACE_FROM_REGEX

string

[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

No

Specifies a regular expression to apply to a username and replaces each matching character with the value specified by the DATABRICKS_SQL_ANALYTICS_USER_NAME_REPLACE_TO_STRING setting.

If not specified, no find and replace operation is performed.

DATABRICKS_SQL_ANALYTICS_USER_NAME_REPLACE_TO_STRING

string

_

No

Specifies a string to replace the characters matched by the regex specified by the DATABRICKS_SQL_ANALYTICS_USER_NAME_REPLACE_FROM_REGEX setting.

If not specified, no find and replace operation is performed.

DATABRICKS_SQL_ANALYTICS_GROUP_NAME_REPLACE_FROM_REGEX

string

[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

No

Specifies a regular expression to apply to a group and replaces each matching character with the value specified by the DATABRICKS_SQL_ANALYTICS_GROUP_NAME_REPLACE_TO_STRING setting.

If not specified, no find and replace operation is performed.

DATABRICKS_SQL_ANALYTICS_GROUP_NAME_REPLACE_TO_STRING

string

_

No

Specifies a string to replace the characters matched by the regex specified by the DATABRICKS_SQL_ANALYTICS_GROUP_NAME_REPLACE_FROM_REGEX setting.

If not specified, no find and replace operation is performed.

DATABRICKS_SQL_ANALYTICS_ROLE_NAME_REPLACE_FROM_REGEX

string

[~`$&+:;=?@#|'<>.^*()_%\\\\[\\\\]!\\\\-\\\\/\\\\\\\\{}]

No

Specifies a regular expression to apply to a role name and replaces each matching character with the value specified by the DATABRICKS_SQL_ANALYTICS_ROLE_NAME_REPLACE_TO_STRING setting.

If not specified, no find and replace operation is performed.

DATABRICKS_SQL_ANALYTICS_ROLE_NAME_REPLACE_TO_STRING

string

_

No

Specifies a string to replace the characters matched by the regex specified by the DATABRICKS_SQL_ANALYTICS_ROLE_NAME_REPLACE_FROM_REGEX setting.

If not specified, no find and replace operation is performed.

DATABRICKS_SQL_ANALYTICS_USER_NAME_PERSIST_CASE_SENSITIVITY

boolean

false

No

Specifies whether PolicySync converts user names to lowercase when creating local users. If set to true, case sensitivity is preserved.

DATABRICKS_SQL_ANALYTICS_GROUP_NAME_PERSIST_CASE_SENSITIVITY

boolean

false

No

Specifies whether PolicySync converts group names to lowercase when creating local groups. If set to true, case sensitivity is preserved.

DATABRICKS_SQL_ANALYTICS_ROLE_NAME_PERSIST_CASE_SENSITIVITY

boolean

false

No

Specifies whether PolicySync converts role names to lowercase when creating local roles. If set to true, case sensitivity is preserved.

DATABRICKS_SQL_ANALYTICS_USER_NAME_CASE_CONVERSION

string

lower

No

Specifies how user name conversions are performed. The following options are valid:

  • lower: Convert to lowercase

  • upper: Convert to uppercase

  • none: Preserve case

This setting applies only if DATABRICKS_SQL_ANALYTICS_USER_NAME_PERSIST_CASE_SENSITIVITY is set to true.

DATABRICKS_SQL_ANALYTICS_GROUP_NAME_CASE_CONVERSION

string

lower

No

Specifies how group name conversions are performed. The following options are valid:

  • lower: Convert to lowercase

  • upper: Convert to uppercase

  • none: Preserve case

This setting applies only if DATABRICKS_SQL_ANALYTICS_GROUP_NAME_PERSIST_CASE_SENSITIVITY is set to true.

DATABRICKS_SQL_ANALYTICS_ROLE_NAME_CASE_CONVERSION

string

lower

No

Specifies how role name conversions are performed. The following options are valid:

  • lower: Convert to lowercase

  • upper: Convert to uppercase

  • none: Preserve case

This setting applies only if DATABRICKS_SQL_ANALYTICS_ROLE_NAME_PERSIST_CASE_SENSITIVITY is set to true.

DATABRICKS_SQL_ANALYTICS_CREATE_USER

boolean

true

No

Specifies whether PolicySync creates local users for each user in Privacera.

DATABRICKS_SQL_ANALYTICS_MANAGE_USERS

boolean

true

No

Specifies whether PolicySync maintains user membership in roles in the Databricks SQL data source.

DATABRICKS_SQL_ANALYTICS_MANAGE_GROUPS

boolean

true

No

Specifies whether PolicySync creates groups from Privacera in the Databricks SQL data source.

DATABRICKS_SQL_ANALYTICS_MANAGE_ROLES

boolean

true

No

Specifies whether PolicySync creates roles from Privacera in the Databricks SQL data source.

DATABRICKS_SQL_ANALYTICS_MANAGE_USER_LIST

string

No

Specifies a comma-separated list of user names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive.

If not specified, PolicySync manages access control for all users.

If specified, DATABRICKS_SQL_ANALYTICS_IGNORE_USER_LIST takes precedence over this setting.

An example user list might resemble the following: user1,user2,dev_user*.

DATABRICKS_SQL_ANALYTICS_MANAGE_GROUP_LIST

string

No

Specifies a comma-separated list of group names for which PolicySync manages access control. If unset, access control is managed for all groups. If specified, use the following format. You can use wildcards. Names are case-sensitive.

An example list of projects might resemble the following: group1,group2,dev_group*.

If specified, DATABRICKS_SQL_ANALYTICS_IGNORE_GROUP_LIST takes precedence over this setting.

DATABRICKS_SQL_ANALYTICS_MANAGE_ROLE_LIST

string

No

Specifies a comma-separated list of role names for which PolicySync manages access control. If unset, access control is managed for all roles. If specified, use the following format. You can use wildcards. Names are case-sensitive.

An example list of projects might resemble the following: role1,role2,dev_role*.

If specified, DATABRICKS_SQL_ANALYTICS_IGNORE_ROLE_LIST takes precedence over this setting.

DATABRICKS_SQL_ANALYTICS_IGNORE_USER_LIST

string

No

Specifies a comma-separated list of user names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all users are subject to access control.

This setting supersedes any values specified by DATABRICKS_SQL_ANALYTICS_MANAGE_USER_LIST.

DATABRICKS_SQL_ANALYTICS_IGNORE_GROUP_LIST

string

No

Specifies a comma-separated list of group names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all groups are subject to access control.

This setting supersedes any values specified by DATABRICKS_SQL_ANALYTICS_MANAGE_GROUP_LIST.

DATABRICKS_SQL_ANALYTICS_IGNORE_ROLE_LIST

string

No

Specifies a comma-separated list of role names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all roles are subject to access control.

This setting supersedes any values specified by DATABRICKS_SQL_ANALYTICS_MANAGE_ROLE_LIST.

DATABRICKS_SQL_ANALYTICS_GROUP_ROLE_PREFIX

string

priv_group_

No

Specifies the prefix that PolicySync uses when creating local roles. For example, if you have a group named etl_users defined in Privacera and the role prefix is prefix_, the local role is named prefix_etl_users.

DATABRICKS_SQL_ANALYTICS_ROLE_ROLE_PREFIX

string

priv_role_

No

Specifies the prefix that PolicySync uses when creating roles from Privacera in the Databricks SQL data source.

For example, if you have a role in Privacera named finance defined in Privacera and the role prefix is role_prefix_, the local role is named role_prefix_finance.

DATABRICKS_SQL_ANALYTICS_USE_NATIVE_PUBLIC_GROUP

boolean

true

No

Specifies whether PolicySync uses the Databricks SQL native public group for access grants whenever a policy refers to a public group. The default value is true.

DATABRICKS_SQL_ANALYTICS_MANAGE_USER_FILTERBY_GROUP

boolean

false

No

Specifies whether to manage only the users that are members of groups specified by DATABRICKS_SQL_ANALYTICS_MANAGE_GROUP_LIST. The default value is false.

DATABRICKS_SQL_ANALYTICS_MANAGE_USER_FILTERBY_ROLE

boolean

false

No

Specifies whether to manage only users that are members of the roles specified by DATABRICKS_SQL_ANALYTICS_MANAGE_ROLE_LIST. The default value is false.

DATABRICKS_SQL_ANALYTICS_USER_USE_EMAIL_AS_SERVICE_NAME

boolean

true

No

This Property is used to map the username to the email address when granting/revoking access.



Access control management

Table 40. Access control management

Name

Type

Default

Required

Description

DATABRICKS_SQL_ANALYTICS_ENABLE_VIEW_BASED_MASKING

boolean

true

No

Specifies whether to use secure view based masking. The default value is true.

DATABRICKS_SQL_ANALYTICS_ENABLE_VIEW_BASED_ROW_FILTER

boolean

true

No

Specifies whether to use secure view based row filtering. The default value is true.

While Databricks SQL supports native filtering, PolicySync provides additional functionality that is not available natively. Enabling this setting is recommended.

DATABRICKS_SQL_ANALYTICS_SECURE_VIEW_CREATE_FOR_ALL

boolean

true

No

Specifies whether to create secure views for all tables and views that are created by users. If enabled, PolicySync creates secure views for resources regardless of whether masking or filtering policies are enabled.

DATABRICKS_SQL_ANALYTICS_MASKED_NUMBER_VALUE

integer

0

No

Specifies the default masking value for numeric column types.

DATABRICKS_SQL_ANALYTICS_MASKED_TEXT_VALUE

string

<MASKED>

No

Specifies the default masking value for text and string column types.

DATABRICKS_SQL_ANALYTICS_SECURE_VIEW_NAME_PREFIX

string

No

Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.

If you want to change the secure view schema name prefix, specify a value for this setting. For example, if the prefix is dev_, then the secure view name for a table named example1 is dev_example1.

DATABRICKS_SQL_ANALYTICS_SECURE_VIEW_NAME_POSTFIX

string

No

Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same schema name as the table schema name.

If you want to change the secure view schema name postfix, specify a value for this setting. For example, if the postfix is _dev, then the secure view name for a table named example1 is example1_dev.

DATABRICKS_SQL_ANALYTICS_SECURE_VIEW_DATABASE_NAME_PREFIX

string

No

Specifies a prefix string for secure views. By default view-based row filter and masking-related secure views have the same name as the table database name.

For example, if the prefix is priv_, then the secure view name for a database named example1 is priv_example1.

DATABRICKS_SQL_ANALYTICS_SECURE_VIEW_DATABASE_NAME_POSTFIX

string

_secure

No

Specifies a postfix string for secure views. By default view-based row filter and masking-related secure views have the same name as the table database name.

For example, if the postfix is _sec, then the secure view name for a database named example1 is example1_sec.

DATABRICKS_SQL_ANALYTICS_SECURE_VIEW_NAME_REMOVE_SUFFIX_LIST

string

No

Specifies a suffix to remove from a table or view name. For example, if the table is named example_suffix you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied.

You can specify a single suffix or a comma separated list of suffixes.

DATABRICKS_SQL_ANALYTICS_SECURE_VIEW_DATABASE_NAME_REMOVE_SUFFIX_LIST

string

No

Specifies a suffix to remove from a database name. For example, if the database is named example_suffix you can remove the _suffix string. This transformation is applied before any custom prefix or postfix is applied.

You can specify a single suffix or a comma separated list of suffixes.

DATABRICKS_SQL_ANALYTICS_GRANT_UPDATES

boolean

true

Yes

Specifies whether PolicySync performs grants and revokes for access control and creates, updates, and deletes queries for users, groups, and roles. The default value is true.

DATABRICKS_SQL_ANALYTICS_ENABLE_DATA_ADMIN

boolean

true

No

This property is used to enable the data admin feature. With this feature enabled you can create all the policies on native tables/views, and respective grants will be made on the secure views of those native tables/views. These secure views will have row filter and masking capability. In case you need to grant permission on the native tables/views then you can select the permission you want plus data admin in the policy. Then those permissions will be granted on both the native table/view as well as its secure view.

DATABRICKS_SQL_ANALYTICS_USE_HIVE_ACCESS_POLICIES

boolean

false

No

Uncomment and set this property to true to use the privacera_hive service instead of privacera_databricks_sql_analytics. After setting the Privacera Manager property, PolicySync starts syncing the policies from the privacera_hive service. Be aware that deny and exclude polices from privacera_hive will not work, as only allow policies from Hive are supported this way.



Access audits management

Table 41. Access audits management

Name

Type

Default

Required

Description

DATABRICKS_SQL_ANALYTICS_AUDIT_ENABLE

boolean

true

Yes

Specifies whether Privacera fetches access audit data from the data source.

DATABRICKS_SQL_ANALYTICS_AUDIT_INITIAL_PULL_MINUTES

integer

30

No

Specifies the initial delay, in minutes, before PolicySync retrieves access audits from Databricks SQL.

DATABRICKS_SQL_ANALYTICS_AUDIT_EXCLUDED_USERS

string

{{DATABRICKS_SQL_ANALYTICS_JDBC_USERNAME}}

No

Specifies a comma separated list of users to exclude when fetching access audits. For example: "user1,user2,user3".