
Privacera Documentation

Get started with AWS Lake Formation

AWS Lake Formation is a fully managed service that makes it easy to build, secure, and manage data lakes. AWS Lake Formation provides its own permissions model that augments the IAM permissions model. This centrally defined permissions model enables fine-grained access to data stored in data lakes through a simple grant or revoke mechanism, much like a relational database management system (RDBMS). AWS Lake Formation permissions are enforced using granular controls at the column, row, and cell levels across AWS services, including Amazon Athena, Amazon EMR, and Amazon Redshift.

With this grant/revoke permissions model, you define granular data access policies for both the metadata and the underlying data.
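To make the grant/revoke mechanism concrete, here is a minimal sketch of a column-level permission request in Python. It assumes the boto3 `lakeformation` client, whose `grant_permissions` and `revoke_permissions` calls accept the same `Principal`, `Resource`, and `Permissions` arguments. The role ARN, database, table, and column names are hypothetical, and the helper functions are illustrative only.

```python
from typing import Any, Dict, List


def build_permission_request(
    principal_arn: str,
    database: str,
    table: str,
    columns: List[str],
    permissions: List[str],
) -> Dict[str, Any]:
    """Build the keyword arguments shared by grant_permissions and revoke_permissions."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": columns,
            }
        },
        "Permissions": permissions,
    }


def grant(client: Any, request: Dict[str, Any]) -> None:
    """Grant the permissions described by the request."""
    client.grant_permissions(**request)


def revoke(client: Any, request: Dict[str, Any]) -> None:
    """Revoke the permissions described by the same request."""
    client.revoke_permissions(**request)


# Hypothetical principal, database, table, and columns (assumptions for illustration).
request = build_permission_request(
    principal_arn="arn:aws:iam::111122223333:role/analyst",
    database="sales_db",
    table="orders",
    columns=["order_id", "order_date"],
    permissions=["SELECT"],
)

# With AWS credentials configured, the request could be applied like this:
#   import boto3
#   client = boto3.client("lakeformation")
#   grant(client, request)   # allow SELECT on the two columns
#   revoke(client, request)  # take the same permission away again
```

Building the request separately from sending it keeps the grant and the matching revoke symmetric, which mirrors the RDBMS-style model described above.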

Why AWS Lake Formation with Privacera?

Privacera offers a data access governance platform that allows for secure data sharing across hybrid environments and cloud services. When Privacera is configured with an AWS Lake Formation connector, it syncs policies from Lake Formation for databases in the AWS Glue Data Catalog and can enforce those policies on Databricks, Databricks SQL, Trino, Starburst, and Dremio. Similarly, these policies can be enforced on other data sources, such as Amazon Redshift, Amazon RDS for PostgreSQL, Amazon Aurora, Snowflake, and Amazon RDS. In this configuration, Lake Formation is the source of truth for access control policies, and whatever policies are defined in AWS Lake Formation can be enforced on a variety of other data sources.

Using AWS Lake Formation with Privacera gives you access control over a wide range of cloud services and extends the access control defined in AWS Lake Formation to those services.

Advantages of using AWS Lake Formation with Privacera

  • Centralizes fine-grained data access control policies over a wide range of cloud services and various types of data sources.

  • Provides centralized, detailed audit information about data access.

  • Customizes reports and dashboards to support compliance, audit, and governance.

Connector configuration modes

The following two modes are available for configuring the AWS Lake Formation connector with Privacera:

Push mode

In this mode, Privacera is the source of truth for access control policies. The policies are defined in Privacera and then pushed to AWS Lake Formation. From there, they are enforced for the services that AWS Lake Formation supports, such as Amazon Redshift Spectrum, Amazon EMR, and Amazon Athena. Refer to Figure 1, “Push Mode Configuration for AWS Lake Formation Connector.”

As Figure 1 shows, in Push mode all policies are stored and managed by Privacera. For databases that are stored in Amazon S3 and managed by the AWS Glue Data Catalog, Privacera pushes the policies to AWS Lake Formation using the Lake Formation APIs, and AWS Lake Formation enforces them natively. For the remaining data sources, Privacera enforces the policies through its connector architecture.
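The push flow can be illustrated with a short sketch that turns one centrally defined policy into Lake Formation grant requests. The `AccessPolicy` record below is a hypothetical, simplified stand-in and not Privacera's actual policy model or connector code. The only real API assumed is the boto3 `lakeformation` client's `grant_permissions` call with its `Table` and `TableWithColumns` resource shapes.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class AccessPolicy:
    """Hypothetical, simplified policy record (not Privacera's real policy model)."""

    database: str
    table: str
    principals: List[str]
    permissions: List[str]
    columns: List[str] = field(default_factory=list)  # empty list means the whole table


def to_grant_requests(policy: AccessPolicy) -> List[Dict[str, Any]]:
    """Translate one policy into one grant_permissions request per principal."""
    if policy.columns:
        resource: Dict[str, Any] = {
            "TableWithColumns": {
                "DatabaseName": policy.database,
                "Name": policy.table,
                "ColumnNames": list(policy.columns),
            }
        }
    else:
        resource = {"Table": {"DatabaseName": policy.database, "Name": policy.table}}
    return [
        {
            "Principal": {"DataLakePrincipalIdentifier": principal},
            "Resource": resource,
            "Permissions": list(policy.permissions),
        }
        for principal in policy.principals
    ]


def push_policy(client: Any, policy: AccessPolicy) -> int:
    """Send every grant for the policy and return how many requests were sent."""
    requests = to_grant_requests(policy)
    for request in requests:
        client.grant_permissions(**request)
    return len(requests)


# Hypothetical policy: two roles may read two columns of one Glue table.
policy = AccessPolicy(
    database="sales_db",
    table="orders",
    principals=[
        "arn:aws:iam::111122223333:role/analyst",
        "arn:aws:iam::111122223333:role/auditor",
    ],
    permissions=["SELECT"],
    columns=["order_id", "order_date"],
)
grant_requests = to_grant_requests(policy)
```

In a real deployment the client would be created with `boto3.client("lakeformation")`. A complete connector would also need to revoke grants that no longer match any policy, which this sketch leaves out.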

Figure 1. Push Mode Configuration for AWS Lake Formation Connector


Pull mode

In this mode, AWS Lake Formation is the source of truth for access control. Privacera pulls the access control policies from Lake Formation at specific time intervals and then enforces them on the data sources defined in the connector configuration. Refer to Figure 2, “Pull Mode Configuration for AWS Lake Formation Connector.”

As Figure 2 shows, in Pull mode AWS Lake Formation is the primary store for datasets in Amazon S3 that are managed by the AWS Glue Data Catalog. Because AWS Lake Formation is the primary store, administrators manage these policies directly in the AWS Lake Formation console or through its APIs. Because the same databases and tables defined in AWS Glue can also be used by third-party tools, such as Databricks and Trino, it is essential that those tools enforce the same policies consistently.

Figure 2. Pull Mode Configuration for AWS Lake Formation Connector


Privacera has native integrations with most of the other tools that use AWS Glue, and it can assist in enforcing these policies. It does so by pulling the policies and tags from AWS Lake Formation into Privacera. Once the policies and tags are in Privacera, Privacera enforces them in Databricks and/or Trino by applying the same policies originally defined in AWS Lake Formation.
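As a rough illustration of the pull step, the sketch below reads all permissions from Lake Formation in pages and reduces each entry to a small summary. It assumes the boto3 `lakeformation` client's `list_permissions` call, which returns entries under `PrincipalResourcePermissions` along with an optional `NextToken`. The function names are hypothetical, and this is not Privacera's connector implementation.

```python
from typing import Any, Dict, Iterator, List, Tuple

PermissionSummary = Tuple[str, str, Tuple[str, ...]]


def iter_permissions(client: Any, page_size: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield every permission entry, following NextToken until the last page."""
    kwargs: Dict[str, Any] = {"MaxResults": page_size}
    while True:
        page = client.list_permissions(**kwargs)
        yield from page.get("PrincipalResourcePermissions", [])
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token


def summarize(entry: Dict[str, Any]) -> PermissionSummary:
    """Reduce an entry to (principal, resource type, sorted permissions)."""
    principal = entry["Principal"]["DataLakePrincipalIdentifier"]
    # The Resource dict is keyed by its type, for example "Database" or "Table".
    resource_type = next(iter(entry["Resource"]), "Unknown")
    return principal, resource_type, tuple(sorted(entry["Permissions"]))


def pull_once(client: Any) -> List[PermissionSummary]:
    """Run one pull cycle and return a summary of every permission found."""
    return [summarize(entry) for entry in iter_permissions(client)]


# With AWS credentials configured, one pull cycle could look like this:
#   import boto3
#   client = boto3.client("lakeformation")
#   for principal, resource_type, permissions in pull_once(client):
#       print(principal, resource_type, permissions)
```

A scheduler would call `pull_once` at the configured interval and compare the result with the previous cycle to find policies that were added or removed. How the pulled policies are then mapped to Databricks or Trino is outside the scope of this sketch.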

Figure 3. AWS reference architecture to govern Databricks access