Privacera Documentation

Access AWS S3 buckets from multiple AWS accounts on PrivaceraCloud

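PrivaceraCloud can access AWS S3 buckets that belong to a different AWS account from the one that owns the IAM role it uses. In the steps below, Account A owns the IAM role and Account B owns the bucket: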

  1. Create or use two AWS accounts, Account A and Account B.

  2. Log in to the AWS console as Account A and create an IAM role. The examples in this topic use the following role ARN (the account ID is a placeholder):

    arn:aws:iam::12345678:role/DataServer_Role
  3. Update this IAM role in Account A so that it has full access to the data resources that will be connected to your PrivaceraCloud account.

  4. Establish an IAM role trust relationship between this role and the PrivaceraCloud AWS access role. For more information, see AWS Access with IAM role on PrivaceraCloud.

  5. Log in to the AWS console as Account B, which owns the bucket (Bucket B) that will be accessed through the Dataserver.

  6. Select Bucket B and update the bucket policy as shown in the following sample JSON:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": [
                        "arn:aws:iam::12345678:role/DataServer_Role"
                    ]
                },
                "Action": "s3:*",
                "Resource": [
                    "arn:aws:s3:::bucketB/*",
                    "arn:aws:s3:::bucketB"
                ]
            }
        ]
    }

    Note

    The principal in this policy, ‘DataServer_Role’, is the IAM role created in Account A.
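The bucket policy in step 6 differs between deployments only in the role ARN and the bucket name, so it can be generated rather than edited by hand. The following is a minimal sketch, not part of PrivaceraCloud: the function name `build_bucket_policy` is hypothetical, and the ARN and bucket name are the placeholder values used in this topic.

```python
import json


def build_bucket_policy(role_arn: str, bucket_name: str) -> dict:
    """Return a bucket policy that grants role_arn full S3 access to bucket_name.

    Hypothetical helper: it mirrors the sample JSON in step 6 and is not a
    PrivaceraCloud or AWS API.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The role from Account A that PrivaceraCloud uses.
                "Principal": {"AWS": [role_arn]},
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}/*",  # objects in the bucket
                    f"arn:aws:s3:::{bucket_name}",    # the bucket itself
                ],
            }
        ],
    }


# Placeholder values taken from this topic; replace them with your own.
policy = build_bucket_policy(
    role_arn="arn:aws:iam::12345678:role/DataServer_Role",
    bucket_name="bucketB",
)
print(json.dumps(policy, indent=4))
```

The printed JSON can be pasted into the bucket policy editor for Bucket B in Account B.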

  7. On PrivaceraCloud, add or update the IAM role arn:aws:iam::12345678:role/DataServer_Role in the S3 application.

  8. Once the S3 application is configured, you can access Bucket B in Account B through the AWS CLI. For more information, see Scripts for AWS CLI or Azure CLI for managing connected applications.

    aws s3 ls s3://bucketB
  9. You can also access Bucket B from a Databricks cluster. For more information, see Connect Databricks to PrivaceraCloud.

    readFilePath = "s3a://bucketB/sample_sf.csv"
    df = spark.read.csv(readFilePath, inferSchema=True, header=True)
    df.count()
    df.show(2)
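If the commands in steps 8 or 9 are denied, one thing worth checking is whether the bucket policy on Bucket B names the Account A role and lists both the bucket ARN and the object ARN. The following is a minimal sketch of such a check under stated assumptions: `role_has_bucket_access` is a hypothetical helper, it compares strings literally, and it does not evaluate wildcards, `Action`, `Condition`, or `Deny` statements the way AWS does.

```python
def _as_list(value):
    """Policy fields may hold a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def role_has_bucket_access(policy: dict, role_arn: str, bucket_name: str) -> bool:
    """Return True if Allow statements naming role_arn cover the bucket and its objects.

    Hypothetical, literal check only; it is not an AWS policy evaluator.
    """
    needed = {f"arn:aws:s3:::{bucket_name}", f"arn:aws:s3:::{bucket_name}/*"}
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement is also valid JSON policy
        statements = [statements]

    covered = set()
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        if not isinstance(principal, dict):  # e.g. "*", which this sketch ignores
            continue
        if role_arn in _as_list(principal.get("AWS", [])):
            covered.update(_as_list(stmt.get("Resource", [])))
    return needed <= covered


# Sample policy copied from step 6 (placeholder account ID and bucket name).
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": ["arn:aws:iam::12345678:role/DataServer_Role"]},
            "Action": "s3:*",
            "Resource": ["arn:aws:s3:::bucketB/*", "arn:aws:s3:::bucketB"],
        }
    ],
}

print(role_has_bucket_access(
    sample_policy, "arn:aws:iam::12345678:role/DataServer_Role", "bucketB"
))
```

The check is deliberately strict about the two resource ARNs because `aws s3 ls` operates on the bucket itself, while reading a file such as `sample_sf.csv` operates on an object inside it.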