Skip to main content

Privacera Platform master publication

Discovery in AWS
:
Discovery

This topic allows you to set up the AWS configuration for installing Privacera Discovery in a Docker and Kubernetes (EKS) environment.

IAM policies

To use the Privacera Discovery service, ensure the following IAM policies are attached to the Privacera_PM_Role role to access the AWS services.

Policy to create AWS resources

Policy to create AWS resources is required only during installation or when Discovery is updated through Privacera Manager. This policy gives permissions to Privacera Manager to create AWS resources like DynamoDB, Kinesis, SQS, and S3 using terraform.

  • ${AWS_REGION}: AWS region where the resources will get created.

     {
    "Version":"2012-10-17",
    "Statement":[
        {
            "Sid":"CreateDynamodb",
            "Effect":"Allow",
            "Action":[
                "dynamodb:CreateTable",
                "dynamodb:DescribeTable",
                "dynamodb:ListTables",
                "dynamodb:TagResource",
                "dynamodb:UntagResource",
                "dynamodb:UpdateTable",
                "dynamodb:UpdateTableReplicaAutoScaling",
                "dynamodb:UpdateTimeToLive",
                "dynamodb:DescribeTimeToLive",
                "dynamodb:ListTagsOfResource",
                "dynamodb:DescribeContinuousBackups"
            ],
            "Resource":"arn:aws:dynamodb:${AWS_REGION}:*:table/privacera*"
        },
        {
            "Sid":"CreateKinesis",
            "Effect":"Allow",
            "Action":[
                "kinesis:CreateStream",
                "kinesis:ListStreams",
                "kinesis:UpdateShardCount"
            ],
            "Resource":"arn:aws:kinesis:${AWS_REGION}:*:stream/privacera*"
        },
        {
            "Sid":"CreateS3Bucket",
            "Effect":"Allow",
            "Action":[
                "s3:CreateBucket",
                "s3:ListAllMyBuckets",
                "s3:GetBucketLocation"
                
            ],
            "Resource":[
                "arn:aws:s3:::*"
            ]
        },
        {
            "Sid":"CreateSQSMessages",
            "Effect":"Allow",
            "Action":[
                "sqs:CreateQueue",
                "sqs:ListQueues"
            ],
            "Resource":[
                "arn:aws:sqs:${AWS_REGION}:${ACCOUNNT_ID}:privacera*"
            ]
        }
    ]
    }
 
CLI configuration
  1. SSH to the instance where Privacera is installed.

  2. Configure your environment.

    • Configure Discovery for a Kubernetes environment. You need to set the Kubernetes cluster name. For more information, see Discovery (Kubernetes Mode)

    • For a Docker environment, you can skip this step.

  3. Run the following commands.

    cd ~/privacera/privacera-manager
    cp config/sample-vars/vars.discovery.aws.yml config/custom-vars/
    vi config/custom-vars/vars.discovery.aws.yml
    
  4. Edit the following properties. For property details and description, refer to the Configuration Properties below.

    DISCOVERY_BUCKET_NAME: "<PLEASE_CHANGE>"
    

    To configure a bucket, add the property as follows, where bucket-1 is the name of the bucket:

    DISCOVERY_BUCKET_NAME: "bucket-1"
    

    To configure a bucket containing a folder, add the property as follows:

    DISCOVERY_BUCKET_NAME: "bucket-1/folder1"
    
  5. Uncomment/Add the following variable to enable Autoscalability of Executor pods:

    DISCOVERY_K8S_SPARK_DYNAMIC_ALLOCATION_ENABLED: "true"
    
  6. (Optional) If you want to customize Discovery configuration further, you can add custom Discovery properties. For more information, refer to Discovery Custom Properties.

    For example, by default, the username and password for the Discovery service is padmin/padmin. If you choose to change it, refer to Add Custom Properties.

  7. Run the following commands.

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
    
Configuration properties

Property

Description

Example

DISCOVERY_BUCKET_NAME

Set the bucket name where Discovery will store its metadata files

container1

[Properties of Topic and Table names](../pm-ig/customize_topic_and_tables_names.md)

Topic and Table names are assigned by default in Privacera Discovery. To customize any topic or table name, refer to the link.

Enable realtime scan

An AWS SQS queue is required, if you want to enable realtime scan on the S3 bucket.

After running the PM update command, an SQS queue will be created for you automatically with the name, privacera_bucket_sqs_{{DEPLOYMENT_ENV_NAME}}, where {{DEPLOYMENT_ENV_NAME}} is the environment name you set in the vars.privacera.yml file. This queue name will appear in the list of queues of your AWS SQS account.

If you have an SQS queue which you want to use, add the DISCOVERY_BUCKET_SQS_NAME property in the vars.discovery.aws.yml file and assign your SQS queue name.

If you want to enable realtime scan on the bucket, click here.