Google BigQuery for PolicySync
This topic describes how to connect a BigQuery application to PrivaceraCloud.
Connect BigQuery Application
1. Go to Settings -> Applications.
2. On the Applications screen, select BigQuery.
3. Enter the application Name and Description, and then click SAVE.
4. Click the toggle button to enable Access Management for BigQuery.
5. In the BASIC tab, enter values in the required (*) fields and click SAVE.
6. In the ADVANCED tab, you can add custom properties.

   Caution: Advanced properties should be modified in consultation with Privacera.
7. Click the IMPORT PROPERTIES link to browse and import application properties.
Connector Properties
Basic fields
| Field name | Type | Default | Required | Description |
|---|---|---|---|---|
| BigQuery project location | | | Yes | Specifies the geographical region where the taxonomy for PolicySync should be created. |
| BigQuery project id | | | Yes | Specifies the Google project ID where your Google BigQuery data source resides. For example: |
| Service account email | | | Yes | Specifies the service account email address that PolicySync uses. You must specify this value if you are not using a Google Cloud Platform (GCP) virtual machine attached service account. |
| BigQuery private key content | | | No | Specifies the Google Cloud Platform (GCP) account credential key JSON content. PolicySync uses this data to connect to Google BigQuery. |
| Projects to set access control policies | | | Yes | Specifies a comma-separated list of project names for which PolicySync manages access control. If unset, PolicySync manages all projects. If specified, use the following format. You can use wildcards. Names are case-sensitive. The list of projects to ignore takes precedence over any projects specified by this setting. An example list of projects might resemble the following: |
| Native public group identity name | | | Yes | Set this property to your preferred value. PolicySync uses this native public group for access grants whenever a policy refers to the public group. The following values are allowed: |
| Enable audit | | | Yes | Specifies whether Privacera fetches access audit data from the data source. |
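
The include and ignore lists described for projects (and, in the advanced fields, for datasets, tables, users, and groups) share the same documented semantics: comma-separated patterns, wildcards allowed, case-sensitive names, and the ignore list taking precedence. The following is a minimal sketch of those semantics, not PolicySync's actual implementation. The function names `parse_list` and `is_managed` are hypothetical, and the use of shell-style wildcards via Python's `fnmatch` module is an assumption about the wildcard syntax.

```python
from fnmatch import fnmatchcase


def parse_list(value):
    """Split a comma-separated setting into trimmed, non-empty patterns."""
    return [p.strip() for p in value.split(",") if p.strip()]


def is_managed(name, include="", ignore=""):
    """Return True if `name` would be managed under the documented rules.

    Assumption: wildcards behave like shell-style patterns (fnmatchcase),
    which also keeps matching case-sensitive.
    """
    # The ignore list takes precedence over the include list.
    if any(fnmatchcase(name, p) for p in parse_list(ignore)):
        return False
    patterns = parse_list(include)
    # An unset include list means everything is managed.
    if not patterns:
        return True
    return any(fnmatchcase(name, p) for p in patterns)


if __name__ == "__main__":
    # Hypothetical project names, for illustration only.
    print(is_managed("sales_project1", include="sales_project*"))
    print(is_managed("sales_project1", include="sales_project*",
                     ignore="sales_project1"))
```

With these hypothetical names, the first call reports the project as managed and the second reports it as ignored, because the ignore pattern matches even though the include pattern does too.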
Advanced fields
| Field name | Type | Default | Required | Description |
|---|---|---|---|---|
| Create custom iam roles in gcp | | | No | Specifies whether PolicySync automatically creates custom IAM roles in your Google Cloud Platform project or organization for fine-grained access control (FGAC). If set to |
| GCP custom iam roles scope | | | No | Specifies whether PolicySync creates and uses custom IAM roles at the project or organizational level in Google Cloud Platform (GCP). The following values are allowed: |
| GCP organization id | | | No | Specifies the Google Cloud Platform (GCP) organizational ID. Specify this only if you configured PolicySync to use custom IAM roles at the organizational level. |
| Datasets to set access control policies | | | Yes | Specifies a comma-separated list of datasets for which PolicySync manages access control. You can use wildcards in the value. Names are case-sensitive. If you want to manage all datasets, do not set a value. For example: `testproject1.dataset1,testproject2.dataset2,sales_project*.sales*` You can configure the postfix by specifying Secure view dataset name postfix. If specified, the Datasets to ignore while setting access control policies setting takes precedence over this setting. |
| Tables to set access control policies | | | No | Specifies a comma-separated list of table names for which PolicySync manages access control. You can use wildcards. Use the following format when specifying a table: `<PROJECT_NAME>.<DATASET_NAME>.<TABLE_NAME>` If specified, Tables to ignore while setting access control policies takes precedence over this setting. If you specify a wildcard, such as in the following example, all matched tables are managed: The specified value, if any, is interpreted in the following ways: |
| Projects to ignore while setting access control policies | | | No | Specifies a comma-separated list of project names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all projects are subject to access control. For example: This setting supersedes any values specified by Projects to set access control policies. |
| Datasets to ignore while setting access control policies | | | No | Specifies a comma-separated list of dataset names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all datasets are subject to access control. For example: This setting supersedes any values specified by Datasets to set access control policies. |
| Tables to ignore while setting access control policies | | | No | Specifies a comma-separated list of table names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all tables are subject to access control. Specify tables using the following format: `<PROJECT_NAME>.<DATASET_NAME>.<TABLE_NAME>` This setting supersedes any values specified by Tables to set access control policies. |
| Users to set access control policies | | | No | Specifies a comma-separated list of user names for which PolicySync manages access control. You can use wildcards. Names are case-sensitive. If not specified, PolicySync manages access control for all users. If specified, Users to be ignored by access control policies takes precedence over this setting. An example user list might resemble the following: |
| Groups to set access control policies | | | No | Specifies a comma-separated list of group names for which PolicySync manages access control. If unset, access control is managed for all groups. If specified, use the following format. You can use wildcards. Names are case-sensitive. An example list of groups might resemble the following: If specified, Groups to be ignored by access control policies takes precedence over this setting. |
| Users to be ignored by access control policies | | | No | Specifies a comma-separated list of user names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all users are subject to access control. This setting supersedes any values specified by Users to set access control policies. |
| Groups to be ignored by access control policies | | | No | Specifies a comma-separated list of group names that PolicySync does not provide access control for. You can specify wildcards. Names are case-sensitive. If not specified, all groups are subject to access control. This setting supersedes any values specified by Groups to set access control policies. |
| Set access control policies only on the users from managed groups | | | No | Specifies whether to manage only the users that are members of groups specified by Groups to set access control policies. The default value is false. |
| Enforce bigquery native row filter | | | No | Specifies whether to use the data source native row filter functionality. This setting is disabled by default. When enabled, you can create row filters only on tables, but not on views. |
| Enforce masking policies using secure views | | | No | Specifies whether to use secure view based masking. The default value is |
| Enforce row filter policies using secure views | | | No | Specifies whether to use secure view based row filtering. The default value is While Google BigQuery supports native filtering, PolicySync provides additional functionality that is not available natively. Enabling this setting is recommended. |
| Create secure view for all tables/views | | | No | Specifies whether to create secure views for all tables and views that are created by users. If enabled, PolicySync creates secure views for resources regardless of whether masking or filtering policies are enabled. |
| Default masking value for numeric datatype | | | No | Specifies the masking value used for numeric data types. |
| Default masking value for text/string datatype | | | No | Specifies the masking value used for text or string data types. |
| Secure view name prefix | | | No | Specifies a prefix string for secure views. By default, view-based row filter and masking-related secure views have the same dataset name as the table dataset name. If you want to change the secure view dataset name prefix, specify a value for this setting. For example, if the prefix is |
| Secure view name postfix | | | No | Specifies a postfix string for secure views. By default, view-based row filter and masking-related secure views have the same dataset name as the table dataset name. If you want to change the secure view dataset name postfix, specify a value for this setting. For example, if the postfix is |
| Secure view dataset name prefix | | | No | Specifies a prefix string for secure views. By default, view-based row filter and masking-related secure views have the same dataset name as the table dataset name. If you want to change the secure view dataset name prefix, specify a value for this setting. For example, if the prefix is |
| Secure view dataset name postfix | | | No | Specifies a postfix string for secure views. By default, view-based row filter and masking-related secure views have the same dataset name as the table dataset name. If you want to change the secure view dataset name postfix, specify a value for this setting. For example, if the postfix is |
| Enable this for policy enforcements and user/group/role management. | | | Yes | Specifies whether PolicySync performs grants and revokes for access control and runs create, update, and delete queries for users, groups, and roles. The default value is |
| Enable to use data admin functionality. | | | No | Enables the data admin feature. With this feature enabled, you can create all policies on native tables and views, and the corresponding grants are made on the secure views of those native tables and views. These secure views have row filter and masking capability. If you need to grant a permission on a native table or view, select that permission plus data admin in the policy. The permission is then granted on both the native table or view and its secure view. |
| ignore audit for users | | | No | Specifies a comma-separated list of users to exclude when fetching access audits. For example: |
| project id used to fetch BigQuery audits | | | No | Specifies the project ID where Google BigQuery stores audit log data. |
| dataset used to fetch BigQuery audits | | | No | Specifies the name of the dataset where Google BigQuery logs audit data. Privacera uses this data for running audit queries. |
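
The four secure view prefix and postfix settings can be easier to reason about with a concrete naming sketch. The example below assumes that the dataset prefix and postfix wrap the dataset name and that the view prefix and postfix wrap the table name, which is the natural reading of the descriptions above but is not confirmed by this page. The function `secure_view_name` and the sample values (`_sv`, `v_`) are hypothetical and are not Privacera defaults.

```python
def secure_view_name(table_ref, view_prefix="", view_postfix="",
                     dataset_prefix="", dataset_postfix=""):
    """Derive a secure view reference from PROJECT.DATASET.TABLE.

    Assumption: each prefix/postfix is simply concatenated around the
    dataset or table name; PolicySync's real naming logic may differ.
    """
    parts = table_ref.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            "expected <PROJECT_NAME>.<DATASET_NAME>.<TABLE_NAME>, got %r"
            % table_ref
        )
    project, dataset, table = parts
    secure_dataset = f"{dataset_prefix}{dataset}{dataset_postfix}"
    secure_view = f"{view_prefix}{table}{view_postfix}"
    return f"{project}.{secure_dataset}.{secure_view}"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(secure_view_name("testproject1.dataset1.orders",
                           dataset_postfix="_sv"))
    print(secure_view_name("testproject1.dataset1.orders",
                           view_prefix="v_"))
```

In this sketch, leaving all four settings empty would give the secure view the same reference as the source table, so a real configuration presumably sets at least one of them.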
Custom fields
| Canonical name | Type | Default | Description |
|---|---|---|---|
| | | | Specifies whether PolicySync uses the service account attached to your virtual machine for the credentials to connect to the data source. |
| | | | Specifies a list of mappings between PolicySync custom IAM role names and your custom role names. Use the following format when specifying your custom role names: `<PRIVACERA_DEFAULT_ROLE_NAME_1>:<CUSTOM_ROLE_NAME_1>` `<PRIVACERA_DEFAULT_ROLE_NAME_2>:<CUSTOM_ROLE_NAME_2>` The following is a list of the default custom role names: |
| | | | Specifies how PolicySync loads resources from Google BigQuery. The following values are allowed: |
| | | | Specifies the interval in seconds for PolicySync to wait before checking for new resources or changes to existing resources. |
| | | | Specifies the interval in seconds for PolicySync to wait before reconciling principals with those in the data source, such as users, groups, and roles. When differences are detected, PolicySync updates the principals in the data source accordingly. |
| | | | Specifies the interval in seconds for PolicySync to wait before reconciling Apache Ranger access control policies with those in the data source. When differences are detected, PolicySync updates the access control permissions on the data source accordingly. |
| | | | Specifies the interval in seconds to elapse before PolicySync retrieves access audits and saves the data in Privacera. |
| | | | Specifies a regular expression to apply to a username and replaces each matching character with the value specified by the If not specified, no find and replace operation is performed. |
| | | | Specifies a string to replace the characters matched by the regex specified by the If not specified, no find and replace operation is performed. |
| | | | Specifies a regular expression to apply to a group and replaces each matching character with the value specified by the If not specified, no find and replace operation is performed. |
| | | | Specifies a string to replace the characters matched by the regex specified by the If not specified, no find and replace operation is performed. |
| | | | Specifies how PolicySync manages column-level access control. The following values are allowed: |
| | | | Specifies a string to use as part of the name of native row filter and masking policies. |
| | | | Specifies a template for the name that PolicySync uses when creating a row filter policy. For example, given a table `proj_priv_ds_priv_data_<ROW_FILTER_ITEM_NUMBER>` |
| | | | Specifies the name of the dataset where PolicySync creates custom masking functions. |
| | | | Specifies a suffix to remove from a table or view name. For example, if the table is named You can specify a single suffix or a comma-separated list of suffixes. |
| | | | Specifies a suffix to remove from a secure view dataset name. For example, if the dataset is named You can specify a single suffix or a comma-separated list of suffixes, such as |
| | | | Specifies the interval at which the authorized view ACLs updater thread updates the permissions in the dataset if any permission updates are pending. |
| | | | Specifies the maximum number of attempts that PolicySync makes to execute a grant query if it is unable to do so successfully. The default value is |
| | | | Specifies whether PolicySync applies grants and revokes in batches. If enabled, this behavior improves overall performance of applying permission changes. |
| | | | Specifies the maximum interval, in minutes, of the time window that SQL queries use to retrieve access audit information. If there are a large number of audit records, narrowing the window interval improves performance. For example, if the interval is set to SELECT * FROM audits where time_from=00:01 and time_to=00:30; SELECT * FROM audits where time_from=00:31 and time_to=01:00; SELECT * FROM audits where time_from=01:01 and time_to=01:30; |
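
The audit time-window setting in the last row effectively splits one long audit fetch into several shorter queries. The sketch below illustrates that idea only; it is not PolicySync's code. The function name `audit_windows` is hypothetical, the 30-minute interval is inferred from the sample queries rather than stated in the source, and the windows here are half-open (`start <= t < end`) rather than the minute-inclusive bounds shown in those queries.

```python
from datetime import datetime, timedelta


def audit_windows(start, end, interval_minutes):
    """Yield (window_start, window_end) pairs covering [start, end).

    Each window is at most `interval_minutes` long; the final window is
    shortened so that it never extends past `end`.
    """
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    step = timedelta(minutes=interval_minutes)
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        yield cursor, upper
        cursor = upper


if __name__ == "__main__":
    # Hypothetical 90-minute fetch, split with an assumed 30-minute interval.
    begin = datetime(2024, 1, 1, 0, 0)
    finish = datetime(2024, 1, 1, 1, 30)
    for lo, hi in audit_windows(begin, finish, 30):
        print(f"fetch audits from {lo:%H:%M} to {hi:%H:%M}")
```

Each yielded pair would correspond to one audit query, so a smaller interval means more queries that each scan fewer records.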