Privacera Documentation

Connect Open Source Apache Spark to PrivaceraCloud

First obtain an account-specific script from your PrivaceraCloud account, then add a startup step to open source Spark.

Three configurations are available depending on your requirement. Fine-Grained Access Control (FGAC) and Object-Level Access Control (OLAC) are supported in each of the configurations:

Obtain installation script

Obtain the account-specific <privacera-plugin-script-download-url>. You run this script, along with other commands, in your Spark command shell to complete the PrivaceraCloud installation.

  1. Go to Settings > API Key.

  2. Use an existing active API Key or generate a new one.

    Note

    Make sure the Expiry column is set to Never Expires.

  3. Click the i icon to get the scripts.

  4. On the Plugins Setup Script, click the COPY URL button. Save this value on your Spark server. It is needed as the <privacera-plugin-script-download-url> in the next step.

OLAC Setup
  1. OLAC is supported only with JWT token authentication. See PrivaceraCloud data access methods.

  2. Add the following properties in your Dataserver application to enable JWT authorization. In the following code block, 0 is the index. By increasing the index, you can add multiple JWT properties.

    privacera.jwt.oauth.enable=true
    privacera.jwt.0.token.issuer=<PLEASE_CHANGE>
    privacera.jwt.0.token.subject=<PLEASE_CHANGE>
    privacera.jwt.0.token.secret=<PLEASE_CHANGE>
    privacera.jwt.0.token.publickey=<PLEASE_CHANGE>
    privacera.jwt.0.token.userKey=<PLEASE_CHANGE>
    privacera.jwt.0.token.groupKey=<PLEASE_CHANGE>
    privacera.jwt.0.token.parserType=<PLEASE_CHANGE>

    | Property | Description | Example |
    | --- | --- | --- |
    | privacera.jwt.oauth.enable | Enables JWT authentication in Privacera services. | true |
    | privacera.jwt.{index}.token.issuer | URL of the identity provider. | https://your-idp-domain.com |
    | privacera.jwt.{index}.token.publickey | The JWT public key as a string (delete all newlines). | -----BEGIN PUBLIC KEY-----MIIBIjANB-----END PUBLIC KEY----- |
    | privacera.jwt.{index}.token.secret | [Optional] If the JWT token has been encrypted using a secret, set the secret here. | privacera-api |
    | privacera.jwt.{index}.token.subject | [Optional] Set this if the JWT token has a subject. | api-token |
    | privacera.jwt.{index}.token.userKey | Defines a unique userKey whose value is used as the user in Ranger policies. | client-id |
    | privacera.jwt.{index}.token.groupKey | Defines a unique groupKey whose value is used as the group in Ranger policies. | scope |
    | privacera.jwt.{index}.token.parser.type | JWT parser type, either PING_IDENTITY or KEYCLOAK. Use PING_IDENTITY when the groupKey value is an array, and KEYCLOAK when the groupKey value is space-separated. | KEYCLOAK |

  3. Run the Dataserver.

  4. SSH to the instance where Spark is installed and where you want to install the Privacera Plugin.

  5. Create a directory ~/privacera and download the script. Replace <privacera-plugin-script-download-url> with the Privacera Plugin download URL.

    mkdir -p ~/privacera/spark-plugin-install
    cd ~/privacera/spark-plugin-install
    wget <privacera-plugin-script-download-url> -O privacera_plugin.sh
  6. Create a file privacera_env.sh that contains the parameters required for your plugin installation:

    vi privacera_env.sh

    | Property | Description |
    | --- | --- |
    | PLUGIN_TYPE | Type of Privacera Plugin that you want to install. |
    | SPARK_PLUGIN_TYPE | Spark Plugin type OLAC. JWT authentication is enabled by default. |
    | SPARK_HOME | Home directory of your Spark installation, for example /home/user/spark. |
    | SPARK_CLUSTER_NAME | Cluster name that shows up on the Privacera Ranger Audits page. |

  7. Add the following properties:

    PLUGIN_TYPE="spark"
    SPARK_PLUGIN_TYPE="OLAC"
    SPARK_HOME="<PLEASE_CHANGE>"
    SPARK_CLUSTER_NAME="privacera-spark"
  8. Run the script.

    chmod +x privacera_plugin.sh
    ./privacera_plugin.sh

    The script sets up the Privacera Plugin in the OLAC mode.
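
Steps 6 and 7 can also be done without opening an editor. The following is a minimal sketch and not part of the Privacera script: it writes privacera_env.sh with a here-document and then sources the file to confirm that it parses as shell assignments. The WORK_DIR variable and the /home/user/spark value are assumptions for illustration, and a temporary directory stands in for ~/privacera/spark-plugin-install.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: a scratch directory stands in for ~/privacera/spark-plugin-install.
WORK_DIR="$(mktemp -d)"

# Write the four OLAC properties from step 7. The quoted 'EOF' keeps the shell
# from expanding anything inside the here-document.
cat > "${WORK_DIR}/privacera_env.sh" <<'EOF'
PLUGIN_TYPE="spark"
SPARK_PLUGIN_TYPE="OLAC"
SPARK_HOME="/home/user/spark"
SPARK_CLUSTER_NAME="privacera-spark"
EOF

# Source the file inside the command substitution (a subshell) to confirm that
# it parses, and collect the values on one line.
summary="$(
  . "${WORK_DIR}/privacera_env.sh"
  echo "${PLUGIN_TYPE} ${SPARK_PLUGIN_TYPE} ${SPARK_HOME} ${SPARK_CLUSTER_NAME}"
)"
echo "${summary}"
```

In a real installation, point WORK_DIR at the directory that holds privacera_plugin.sh and replace the SPARK_HOME value with your own path.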

FGAC Setup
  1. Using FGAC with JWT authentication enabled is recommended.

    Note

    If JWT authentication is disabled, access control will fail on the system user or proxy user.

  2. SSH to the instance where Spark is installed and where you want to install the Privacera Plugin.

  3. Create a directory ~/privacera and download the script. Replace <privacera-plugin-script-download-url> with the Privacera Plugin download URL.

    mkdir -p ~/privacera/spark-plugin-install
    cd ~/privacera/spark-plugin-install
    wget <privacera-plugin-script-download-url> -O privacera_plugin.sh
  4. Create a file privacera_env.sh which will contain the parameters required for your plugin installation.

    vi privacera_env.sh
  5. Add the following properties:

    PLUGIN_TYPE="spark"
    SPARK_PLUGIN_TYPE="FGAC"
    SPARK_HOME="<PLEASE_CHANGE>"
    SPARK_CLUSTER_NAME="privacera-spark"

    | Property | Description |
    | --- | --- |
    | PLUGIN_TYPE | Type of Privacera Plugin that you want to install. |
    | SPARK_PLUGIN_TYPE | Spark Plugin type FGAC. |
    | SPARK_HOME | Home directory of your Spark installation, for example /home/user/spark. |
    | SPARK_CLUSTER_NAME | Cluster name that shows up on the Privacera Ranger Audits page. |

  6. Add the following properties when JWT auth is enabled:

    JWT_OAUTH_ENABLE="true"
    JWT_ISSUER="<PLEASE_CHANGE>"
    JWT_PUBLIC_KEY="<PLEASE_CHANGE>"
    #JWT_SECRET="<PLEASE_CHANGE>"
    #JWT_SUBJECT="<PLEASE_CHANGE>"
    JWT_USERKEY="<PLEASE_CHANGE>"
    JWT_GROUPKEY="<PLEASE_CHANGE>"
    JWT_PARSER_TYPE="<PLEASE_CHANGE>"

    Note

    To configure multiple JWTs, refer to FGAC with multiple JWT configurations below.

    | Property | Description | Example |
    | --- | --- | --- |
    | JWT_OAUTH_ENABLE | Enables JWT authentication. | JWT_OAUTH_ENABLE="true" |
    | JWT_ISSUER | The URL of the identity provider. | JWT_ISSUER="https://your-idp-domain.com" |
    | JWT_PUBLIC_KEY | The JWT public key in string format. | |
    | JWT_SECRET | Uncomment and add a value if the JWT token has been encrypted using a secret. | JWT_SECRET="privacera-secret" |
    | JWT_SUBJECT | Uncomment and add a value if the JWT token has a subject. | JWT_SUBJECT="api-token" |
    | JWT_USERKEY | Defines a unique userKey whose value is used as the user in Ranger policies. | JWT_USERKEY="client_id" |
    | JWT_GROUPKEY | Defines a unique groupKey whose value is used as the group in Ranger policies. | JWT_GROUPKEY="scope" |
    | JWT_PARSER_TYPE | JWT parser type, either PING_IDENTITY or KEYCLOAK. | JWT_PARSER_TYPE="KEYCLOAK" |

  7. Run the script.

    chmod +x privacera_plugin.sh
    ./privacera_plugin.sh

    The script will set up the Privacera Plugin in the FGAC mode.
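
The public key examples in this document are single-line strings. Assuming your key starts out as a multi-line PEM file, the following sketch shows one way to flatten it into the form used for JWT_PUBLIC_KEY. The temporary file and the dummy key body are hypothetical stand-ins for your real key file.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: a throwaway PEM file with a dummy body stands in for your real public key.
PEM_FILE="$(mktemp)"
printf '%s\n' \
  '-----BEGIN PUBLIC KEY-----' \
  'MIIBIjANXXXXX' \
  'DAQAB' \
  '-----END PUBLIC KEY-----' > "${PEM_FILE}"

# Delete every newline and carriage return so that the key becomes one line.
flat_key="$(tr -d '\r\n' < "${PEM_FILE}")"

# Print the line in the form used in privacera_env.sh.
printf 'JWT_PUBLIC_KEY="%s"\n' "${flat_key}"
rm -f "${PEM_FILE}"
```

Copy the printed line into privacera_env.sh in place of the JWT_PUBLIC_KEY placeholder.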

FGAC with multiple JWT configurations

To set up multiple JWT configurations, add the following index-based properties to the privacera_env.sh file, where {index} runs from 0 to n.

JWT_OAUTH_ENABLE="true"

JWT_{index}_ISSUER="<PLEASE_CHANGE>"
JWT_{index}_PUBLICKEY="<PLEASE_CHANGE>"
JWT_{index}_SUBJECT="<PLEASE_CHANGE>"
JWT_{index}_SECRET="<PLEASE_CHANGE>"
JWT_{index}_USERKEY="<PLEASE_CHANGE>"
JWT_{index}_GROUPKEY="<PLEASE_CHANGE>"
JWT_{index}_PARSER_TYPE="<PLEASE_CHANGE>"

For example, two configurations use the indexes 0 and 1:

JWT_OAUTH_ENABLE="true"

JWT_0_ISSUER="https://mydomain.com/issuer"
JWT_0_PUBLICKEY="-----BEGIN PUBLIC KEY-----MIIBIjANXXXXXDAQAB-----END PUBLIC KEY-----"
JWT_0_SUBJECT="principal1"
JWT_0_SECRET="shkl-XXXX-XXXX-XXXX"
JWT_0_USERKEY="client_id"
JWT_0_GROUPKEY="scope"
JWT_0_PARSER_TYPE="PING_IDENTITY"

JWT_1_ISSUER="https://mydomain.com/issuer"
JWT_1_PUBLICKEY="-----BEGIN PUBLIC KEY-----MIIBIjANXXXXXDAQAB-----END PUBLIC KEY-----"
JWT_1_SUBJECT="principal2"
JWT_1_SECRET="suhjk-XXXX-XXXX-XXXX"
JWT_1_USERKEY="client_id"
JWT_1_GROUPKEY="scope"
JWT_1_PARSER_TYPE="KEYCLOAK"
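
When there are more than a couple of configurations, typing the indexed block by hand is error-prone. The sketch below generates the same kind of block from shell arrays, using the sample values from the example above. The array names and the jwt_block variable are hypothetical, and the sketch assumes that every configuration shares one issuer, public key, user key, and group key.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Values shared by all configurations in this sketch (sample values only).
ISSUER="https://mydomain.com/issuer"
PUBLIC_KEY="-----BEGIN PUBLIC KEY-----MIIBIjANXXXXXDAQAB-----END PUBLIC KEY-----"

# Per-configuration values: position 0 becomes JWT_0_*, position 1 becomes JWT_1_*.
SUBJECTS=("principal1" "principal2")
SECRETS=("shkl-XXXX-XXXX-XXXX" "suhjk-XXXX-XXXX-XXXX")
PARSER_TYPES=("PING_IDENTITY" "KEYCLOAK")

# Build the block: one enable flag, then seven properties per index.
jwt_block="$(
  printf 'JWT_OAUTH_ENABLE="true"\n'
  for index in "${!SUBJECTS[@]}"; do
    printf 'JWT_%s_ISSUER="%s"\n' "${index}" "${ISSUER}"
    printf 'JWT_%s_PUBLICKEY="%s"\n' "${index}" "${PUBLIC_KEY}"
    printf 'JWT_%s_SUBJECT="%s"\n' "${index}" "${SUBJECTS[index]}"
    printf 'JWT_%s_SECRET="%s"\n' "${index}" "${SECRETS[index]}"
    printf 'JWT_%s_USERKEY="client_id"\n' "${index}"
    printf 'JWT_%s_GROUPKEY="scope"\n' "${index}"
    printf 'JWT_%s_PARSER_TYPE="%s"\n' "${index}" "${PARSER_TYPES[index]}"
  done
)"
echo "${jwt_block}"
```

To use the result, review the printed block and append it to privacera_env.sh.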

If you have an existing Open Source Spark setup running on Kubernetes, you can add steps that install the Privacera Plugin to the Docker file you already use to create the Spark image.

OLAC Setup

OLAC is supported only with JWT token authentication. Your Dataserver application should be configured with JWT Token support. Create a new Dataserver, if it does not exist. See PrivaceraCloud data access methods.

  1. Add the following properties in your Dataserver application to enable JWT authorization. In the following code block, 0 is the index. By increasing the index, you can add multiple JWT properties.

    privacera.jwt.oauth.enable=true
    privacera.jwt.0.token.issuer=<PLEASE_CHANGE>
    privacera.jwt.0.token.subject=<PLEASE_CHANGE>
    privacera.jwt.0.token.secret=<PLEASE_CHANGE>
    privacera.jwt.0.token.publickey=<PLEASE_CHANGE>
    privacera.jwt.0.token.userKey=<PLEASE_CHANGE>
    privacera.jwt.0.token.groupKey=<PLEASE_CHANGE>
    privacera.jwt.0.token.parserType=<PLEASE_CHANGE>

    | Property | Description | Example |
    | --- | --- | --- |
    | privacera.jwt.oauth.enable | Enables JWT authentication in Privacera services. | true |
    | privacera.jwt.{index}.token.issuer | URL of the identity provider. | https://your-idp-domain.com |
    | privacera.jwt.{index}.token.publickey | The JWT public key as a string (delete all newlines). | -----BEGIN PUBLIC KEY-----MIIBIjANB-----END PUBLIC KEY----- |
    | privacera.jwt.{index}.token.secret | [Optional] If the JWT token has been encrypted using a secret, set the secret here. | privacera-api |
    | privacera.jwt.{index}.token.subject | [Optional] Set this if the JWT token has a subject. | api-token |
    | privacera.jwt.{index}.token.userKey | Defines a unique userKey whose value is used as the user in Ranger policies. | client-id |
    | privacera.jwt.{index}.token.groupKey | Defines a unique groupKey whose value is used as the group in Ranger policies. | scope |
    | privacera.jwt.{index}.token.parser.type | JWT parser type, either PING_IDENTITY or KEYCLOAK. Use PING_IDENTITY when the groupKey value is an array, and KEYCLOAK when the groupKey value is space-separated. | KEYCLOAK |

  2. Run the Dataserver.

  3. SSH to the instance where Spark is installed and where you want to install the Privacera Plugin.

  4. Copy the following to your Docker file. Set the PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL property.

    ######## Install Privacera Spark Plugin Start ###########
    
    # ENV SPARK_HOME /opt/apache/spark
    RUN apt-get update && apt-get -y install zip unzip wget
    ENV PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL="<PLEASE_CHANGE>"
    ENV PLUGIN_TYPE="spark"
    ENV SPARK_PLUGIN_TYPE="OLAC"
    ENV SPARK_CLUSTER_NAME="privacera-spark"
    RUN echo "Downloading Script from $PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL"
    RUN wget ${PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL} -O privacera_plugin.sh
    RUN chmod +x privacera_plugin.sh
    RUN ./privacera_plugin.sh
    
    ######## Install Privacera Spark Plugin End ###########
  5. Save the Docker file and build the image. You will now have a Docker image for Open Source Spark With Privacera Plugin enabled.
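
Before building the image, it can help to confirm that no <PLEASE_CHANGE> placeholder is left in the Docker file. The following is a sketch under stated assumptions: check_placeholders is a hypothetical helper, not part of the Privacera tooling, and a temporary file stands in for your Docker file.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeed only if the file has no <PLEASE_CHANGE> placeholder left.
# grep -n prints the matching lines with their line numbers; -F matches the text literally.
check_placeholders() {
  local file="$1"
  if grep -nF '<PLEASE_CHANGE>' "${file}"; then
    echo "Unresolved placeholders in ${file}" >&2
    return 1
  fi
  echo "No placeholders left in ${file}"
}

# Demo on a throwaway file that stands in for your Docker file.
DEMO_FILE="$(mktemp)"
printf '%s\n' \
  'ENV PLUGIN_TYPE="spark"' \
  'ENV PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL="<PLEASE_CHANGE>"' > "${DEMO_FILE}"

check_placeholders "${DEMO_FILE}" || echo "Fix the lines listed above before building the image."
```

With a real Docker file, the check can gate the build, for example `check_placeholders Dockerfile && docker build -t <image-name> .`, where <image-name> is whatever tag you choose.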

FGAC Setup
  1. Using FGAC with JWT authentication enabled is recommended.

    Note

    If JWT authentication is disabled, access control will fail on the system user or proxy user.

  2. SSH to the instance where Spark is installed and where you want to install the Privacera Plugin.

  3. Copy the following to your Docker file. Set the PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL property. For the JWT properties, refer to the table below.

    ######## Install Privacera Spark Plugin Start ###########
    
    # ENV SPARK_HOME /opt/apache/spark
    RUN apt-get update && apt-get -y install zip unzip wget
    ENV PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL="<PLEASE_CHANGE>"
    ENV PLUGIN_TYPE="spark"
    ENV SPARK_PLUGIN_TYPE="FGAC"
    ENV SPARK_CLUSTER_NAME="privacera-spark"
    ENV JWT_OAUTH_ENABLE="true"
    ENV JWT_ISSUER="<PLEASE_CHANGE>"
    ENV JWT_PUBLIC_KEY="<PLEASE_CHANGE>"
    ENV JWT_SECRET="<PLEASE_CHANGE>"
    ENV JWT_SUBJECT="<PLEASE_CHANGE>"
    ENV JWT_USERKEY="<PLEASE_CHANGE>"
    ENV JWT_GROUPKEY="<PLEASE_CHANGE>"
    ENV JWT_PARSER_TYPE="<PLEASE_CHANGE>"
    RUN echo "Downloading Script from $PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL"
    RUN wget ${PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL} -O privacera_plugin.sh
    RUN chmod +x privacera_plugin.sh
    RUN ./privacera_plugin.sh
    
    ######## Install Privacera Spark Plugin End ###########

    Note

    To configure multiple JWTs, refer to FGAC with Multiple JWT Configuration in an Existing Docker File below.

    | Property | Description | Example |
    | --- | --- | --- |
    | JWT_OAUTH_ENABLE | Enables JWT authentication. | JWT_OAUTH_ENABLE="true" |
    | JWT_ISSUER | The URL of the identity provider. | JWT_ISSUER="https://your-idp-domain.com" |
    | JWT_PUBLIC_KEY | The JWT public key in string format. | |
    | JWT_SECRET | Uncomment and add a value if the JWT token has been encrypted using a secret. | JWT_SECRET="privacera-secret" |
    | JWT_SUBJECT | Uncomment and add a value if the JWT token has a subject. | JWT_SUBJECT="api-token" |
    | JWT_USERKEY | Defines a unique userKey whose value is used as the user in Ranger policies. | JWT_USERKEY="client_id" |
    | JWT_GROUPKEY | Defines a unique groupKey whose value is used as the group in Ranger policies. | JWT_GROUPKEY="scope" |
    | JWT_PARSER_TYPE | JWT parser type, either PING_IDENTITY or KEYCLOAK. | JWT_PARSER_TYPE="KEYCLOAK" |

  4. Save the Docker file and build the image. You will now have a Docker image for Open Source Spark With Privacera Plugin enabled.

FGAC with Multiple JWT Configuration in an Existing Docker File

To set up multiple JWT configurations, add the following index-based environment variables to the Docker file, where {index} runs from 0 to n.

ENV JWT_OAUTH_ENABLE="true"
ENV JWT_{index}_ISSUER="<PLEASE_CHANGE>"
ENV JWT_{index}_PUBLICKEY="<PLEASE_CHANGE>"
ENV JWT_{index}_SUBJECT="<PLEASE_CHANGE>"
ENV JWT_{index}_SECRET="<PLEASE_CHANGE>"
ENV JWT_{index}_USERKEY="<PLEASE_CHANGE>"
ENV JWT_{index}_GROUPKEY="<PLEASE_CHANGE>"
ENV JWT_{index}_PARSER_TYPE="<PLEASE_CHANGE>"

For example, two configurations use the indexes 0 and 1:

######## Install Privacera Spark Plugin Start ############ 
ENV SPARK_HOME /opt/apache/spark
RUN apt-get update && apt-get -y install zip unzip wget
ENV PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL="<PLEASE_CHANGE>"
ENV PLUGIN_TYPE="spark"
ENV SPARK_PLUGIN_TYPE="FGAC"
ENV SPARK_CLUSTER_NAME="privacera-spark"

ENV JWT_OAUTH_ENABLE="true"
ENV JWT_0_ISSUER="https://mydomain.com/issuer"
ENV JWT_0_PUBLICKEY="-----BEGIN PUBLIC KEY-----MIIBIjANXXXXXDAQAB-----END PUBLIC KEY-----"
ENV JWT_0_SUBJECT="principal1"
ENV JWT_0_SECRET="shkl-XXXX-XXXX-XXXX"
ENV JWT_0_USERKEY="client_id"
ENV JWT_0_GROUPKEY="scope"
ENV JWT_0_PARSER_TYPE="PING_IDENTITY"

ENV JWT_1_ISSUER="https://mydomain.com/issuer"
ENV JWT_1_PUBLICKEY="-----BEGIN PUBLIC KEY-----MIIBIjANXXXXXDAQAB-----END PUBLIC KEY-----"
ENV JWT_1_SUBJECT="principal2"
ENV JWT_1_SECRET="suhjk-XXXX-XXXX-XXXX"
ENV JWT_1_USERKEY="client_id"
ENV JWT_1_GROUPKEY="scope"
ENV JWT_1_PARSER_TYPE="KEYCLOAK"
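
Typographic quotes (“ and ”) pasted from a web page or word processor are not quoting characters for the shell or for Docker, so they end up inside the value. The sketch below shows one way to detect them before a build. The find_curly_quotes helper is hypothetical, and a temporary file stands in for your Docker file or privacera_env.sh.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: list lines that contain typographic double quotes.
# -F treats the patterns as fixed strings and -n prefixes each match with its line number.
find_curly_quotes() {
  grep -nF -e '“' -e '”' "$1"
}

# Demo on a throwaway file: line 1 is fine, line 2 uses typographic quotes.
DEMO_FILE="$(mktemp)"
printf '%s\n' \
  'ENV JWT_0_USERKEY="client_id"' \
  'ENV JWT_0_SUBJECT=”principal1”' > "${DEMO_FILE}"

if curly_lines="$(find_curly_quotes "${DEMO_FILE}")"; then
  echo "Replace the typographic quotes on these lines:"
  echo "${curly_lines}"
else
  echo "No typographic quotes found."
fi
```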

The scripts help you create an Open Source Spark image with the Privacera Plugin and push it to the Docker hub that you specify. You can then use the image to run Spark with Privacera.

OLAC Setup

OLAC is supported only with JWT token authentication. Your Dataserver application should be configured with JWT Token support. Create a new Dataserver, if it does not exist. See PrivaceraCloud data access methods.

  1. Add the following properties in your Dataserver application to enable JWT authorization. In the following code block, 0 is the index. By increasing the index, you can add multiple JWT properties.

    privacera.jwt.oauth.enable=true
    privacera.jwt.0.token.issuer=<PLEASE_CHANGE>
    privacera.jwt.0.token.subject=<PLEASE_CHANGE>
    privacera.jwt.0.token.secret=<PLEASE_CHANGE>
    privacera.jwt.0.token.publickey=<PLEASE_CHANGE>
    privacera.jwt.0.token.userKey=<PLEASE_CHANGE>
    privacera.jwt.0.token.groupKey=<PLEASE_CHANGE>
    privacera.jwt.0.token.parserType=<PLEASE_CHANGE>
    

    | Property | Description | Example |
    | --- | --- | --- |
    | privacera.jwt.oauth.enable | Enables JWT authentication in Privacera services. | true |
    | privacera.jwt.{index}.token.issuer | URL of the identity provider. | https://your-idp-domain.com |
    | privacera.jwt.{index}.token.publickey | The JWT public key as a string (delete all newlines). | -----BEGIN PUBLIC KEY-----MIIBIjANB-----END PUBLIC KEY----- |
    | privacera.jwt.{index}.token.secret | [Optional] If the JWT token has been encrypted using a secret, set the secret here. | privacera-api |
    | privacera.jwt.{index}.token.subject | [Optional] Set this if the JWT token has a subject. | api-token |
    | privacera.jwt.{index}.token.userKey | Defines a unique userKey whose value is used as the user in Ranger policies. | client-id |
    | privacera.jwt.{index}.token.groupKey | Defines a unique groupKey whose value is used as the group in Ranger policies. | scope |
    | privacera.jwt.{index}.token.parser.type | JWT parser type, either PING_IDENTITY or KEYCLOAK. Use PING_IDENTITY when the groupKey value is an array, and KEYCLOAK when the groupKey value is space-separated. | KEYCLOAK |

  2. Run the Dataserver.

  3. SSH to the instance where you want to install Privacera Plugin.

  4. Create a directory ~/privacera and download the script. Replace <privacera-plugin-script-download-url> with the Privacera Plugin download URL.

    mkdir -p ~/privacera/spark-plugin-install
    cd ~/privacera/spark-plugin-install
    wget <privacera-plugin-script-download-url> -O privacera_plugin.sh
  5. Create a file privacera_env.sh which will contain the parameters required for your plugin installation.

    vi privacera_env.sh
  6. Add the following properties:

    PLUGIN_TYPE="spark_k8s"
    export SPARK_VERSION="3.3.0"
    SPARK_HOME="/opt/privacera/spark"
    SPARK_PLUGIN_TYPE="OLAC"
    HUB="<PLEASE_CHANGE>"
    HUB_USERNAME="<PLEASE_CHANGE>"
    HUB_PASSWORD="<PLEASE_CHANGE>"
    ENV_TAG="<PLEASE_CHANGE>"

    | Property | Description |
    | --- | --- |
    | PLUGIN_TYPE | Type of Privacera Plugin that you want to install. |
    | SPARK_PLUGIN_TYPE | Spark Plugin type OLAC. JWT authentication is enabled by default. |
    | SPARK_VERSION | Version of Apache Spark. Must be one of the following versions: 3.1.2, 3.2.2, or 3.3.0. |
    | SPARK_HOME | Home directory of your Spark installation, for example /opt/privacera/spark. |
    | HUB | The Docker hub URL where you want the image to be pushed. |
    | HUB_USERNAME | Docker hub username. |
    | HUB_PASSWORD | Docker hub password. |
    | ENV_TAG | Docker image tag. |

  7. Run the script.

    chmod +x privacera_plugin.sh
    ./privacera_plugin.sh

    The script will build the Spark image with Privacera Spark plugin and publish it to the Docker hub.
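
Because SPARK_VERSION must be one of the versions listed above, a quick check before running the script can catch a typo early. This is a minimal sketch: is_supported_spark_version is a hypothetical helper, and the version list is copied from the SPARK_VERSION description in this section.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeed only for the Spark versions listed for SPARK_VERSION.
is_supported_spark_version() {
  case "$1" in
    3.1.2|3.2.2|3.3.0) return 0 ;;
    *) return 1 ;;
  esac
}

# Same value as in the properties above.
SPARK_VERSION="3.3.0"

if is_supported_spark_version "${SPARK_VERSION}"; then
  version_status="supported"
else
  version_status="unsupported"
fi
echo "SPARK_VERSION ${SPARK_VERSION} is ${version_status}"
```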

FGAC Setup

Using FGAC with JWT authentication enabled is recommended.

Note

If JWT authentication is disabled, access control will fail on the system user or proxy user.

  1. SSH to the instance where you want to install Privacera Plugin.

  2. Create a directory ~/privacera and download the script. Replace <privacera-plugin-script-download-url> with the Privacera Plugin download URL.

    mkdir -p ~/privacera/spark-plugin-install
    cd ~/privacera/spark-plugin-install
    wget <privacera-plugin-script-download-url> -O privacera_plugin.sh
  3. Create a file privacera_env.sh which will contain the parameters required for your plugin installation.

    vi privacera_env.sh
  4. Add the following properties:

    PLUGIN_TYPE="spark_k8s"
    export SPARK_VERSION="3.3.0"
    SPARK_HOME="/opt/privacera/spark"
    SPARK_PLUGIN_TYPE="FGAC"
    SPARK_CLUSTER_NAME="privacera-spark"

    | Property | Description |
    | --- | --- |
    | PLUGIN_TYPE | Type of Privacera Plugin that you want to install. |
    | SPARK_PLUGIN_TYPE | Spark Plugin type FGAC. |
    | SPARK_VERSION | Version of Apache Spark. Must be one of the following versions: 3.1.2, 3.2.2, or 3.3.0. |
    | SPARK_HOME | Home directory of your Spark installation, for example /opt/privacera/spark. |
    | SPARK_CLUSTER_NAME | Cluster name that shows up on the Privacera Ranger Audits page. |

  5. Add the following properties when JWT auth is enabled:

    JWT_OAUTH_ENABLE="true"
    JWT_ISSUER="<PLEASE_CHANGE>"
    JWT_PUBLIC_KEY="<PLEASE_CHANGE>"
    #JWT_SECRET="<PLEASE_CHANGE>"
    #JWT_SUBJECT="<PLEASE_CHANGE>"
    JWT_USERKEY="<PLEASE_CHANGE>"
    JWT_GROUPKEY="<PLEASE_CHANGE>"
    JWT_PARSER_TYPE="<PLEASE_CHANGE>"

    | Property | Description | Example |
    | --- | --- | --- |
    | JWT_OAUTH_ENABLE | Enables JWT authentication. | JWT_OAUTH_ENABLE="true" |
    | JWT_ISSUER | The URL of the identity provider. | JWT_ISSUER="https://your-idp-domain.com" |
    | JWT_PUBLIC_KEY | The JWT public key in string format. | |
    | JWT_SECRET | Uncomment and add a value if the JWT token has been encrypted using a secret. | JWT_SECRET="privacera-secret" |
    | JWT_SUBJECT | Uncomment and add a value if the JWT token has a subject. | JWT_SUBJECT="api-token" |
    | JWT_USERKEY | Defines a unique userKey whose value is used as the user in Ranger policies. | JWT_USERKEY="client_id" |
    | JWT_GROUPKEY | Defines a unique groupKey whose value is used as the group in Ranger policies. | JWT_GROUPKEY="scope" |
    | JWT_PARSER_TYPE | JWT parser type, either PING_IDENTITY or KEYCLOAK. | JWT_PARSER_TYPE="KEYCLOAK" |

  6. Add the following Docker Hub properties:

    HUB="<PLEASE_CHANGE>"
    HUB_USERNAME="<PLEASE_CHANGE>"
    HUB_PASSWORD="<PLEASE_CHANGE>"
    ENV_TAG="<PLEASE_CHANGE>"

    | Property | Description |
    | --- | --- |
    | HUB | The Docker hub URL where you want the image to be pushed. |
    | HUB_USERNAME | Docker hub username. |
    | HUB_PASSWORD | Docker hub password. |
    | ENV_TAG | Docker image tag. |

  7. Run the script.

    chmod +x privacera_plugin.sh
    ./privacera_plugin.sh

    The script will build the Spark image with Privacera Spark plugin and publish it to the Docker hub.
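
With properties spread over several steps, it is easy to leave one unset. The sketch below lists the properties from steps 4 and 6 that are missing, empty, or still set to the <PLEASE_CHANGE> placeholder. The list_unset_properties helper and the demo values are hypothetical, and a temporary file stands in for privacera_env.sh. The JWT properties are left out because they apply only when JWT authentication is enabled.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Property names taken from steps 4 and 6 above.
EXPECTED_VARS=(PLUGIN_TYPE SPARK_VERSION SPARK_HOME SPARK_PLUGIN_TYPE
               SPARK_CLUSTER_NAME HUB HUB_USERNAME HUB_PASSWORD ENV_TAG)

# Hypothetical helper: source an env file in a subshell and print every expected
# property that is unset, empty, or still set to the <PLEASE_CHANGE> placeholder.
list_unset_properties() {
  (
    . "$1"
    for name in "${EXPECTED_VARS[@]}"; do
      value="${!name:-}"   # indirect expansion: the value of the variable named by $name
      if [ -z "${value}" ] || [ "${value}" = "<PLEASE_CHANGE>" ]; then
        echo "${name}"
      fi
    done
  )
}

# Demo on a throwaway file: HUB is still a placeholder and HUB_PASSWORD is missing.
DEMO_ENV="$(mktemp)"
cat > "${DEMO_ENV}" <<'EOF'
PLUGIN_TYPE="spark_k8s"
export SPARK_VERSION="3.3.0"
SPARK_HOME="/opt/privacera/spark"
SPARK_PLUGIN_TYPE="FGAC"
SPARK_CLUSTER_NAME="privacera-spark"
HUB="<PLEASE_CHANGE>"
HUB_USERNAME="demo-user"
ENV_TAG="demo-tag"
EOF

unset_properties="$(list_unset_properties "${DEMO_ENV}")"
echo "Still to fill in: ${unset_properties//$'\n'/ }"
```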

  1. SSH to the instance where you want to deploy Spark on the EKS cluster.

  2. To obtain PRIVACERA_DOWNLOAD_URL:

    1. Go to Settings -> API Key.

    2. Use an existing active API Key or generate a new one.

      Note

      Make sure the expiry column is set to Never Expires.

    3. Click the information icon and copy Ranger Admin URL.

  3. Export the download URL. Replace RANGER_ADMIN_URL with the Ranger Admin URL that you copied in the previous step:

    export PRIVACERA_DOWNLOAD_URL="RANGER_ADMIN_URL"
  4. Create spark-k8s-artifacts folder.

    mkdir -p ~/privacera/spark-k8s-artifacts
    cd ~/privacera/spark-k8s-artifacts
  5. Download and extract packages.

    wget ${PRIVACERA_DOWNLOAD_URL}/plugin/spark/k8s-spark-deploy.tar.gz -O k8s-spark-deploy.tar.gz
    tar xzf k8s-spark-deploy.tar.gz
    rm -r k8s-spark-deploy.tar.gz
    cd k8s-spark-deploy/
  6. Open the penv.sh file and set the values of the following properties, as described in the table below:

    | Property | Description | Example |
    | --- | --- | --- |
    | SPARK_NAME_SPACE | Kubernetes namespace | privacera-spark-plugin-test |
    | SPARK_PLUGIN_IMAGE | Docker image with hub | ${HUB}/privacera-spark-plugin:${ENV_TAG} |
    | SPARK_DOCKER_PULL_SECRET | Secret for docker-registry | spark-plugin-docker-hub |
    | SPARK_PLUGIN_ROLE_BINDING | Spark role binding | privacera-sa-spark-plugin-role-binding |
    | SPARK_PLUGIN_SERVICE_ACCOUNT | Spark service account | privacera-sa-spark-plugin |
    | SPARK_PLUGN_ROLE | Spark service account role | privacera-sa-spark-plugin-role |
    | SPARK_PLUGIN_APP_NAME | Spark plugin application name | privacera-spark-examples |

  7. Run the following command to replace the property values in EKS deployment YAML file.

    mkdir -p backup
    cp *.yml backup/
    ./replace.sh
  8. Run the following command to create EKS resources.

    kubectl apply -f namespace.yml 
    kubectl apply -f service-account.yml 
    kubectl apply -f role.yml
    kubectl apply -f role-binding.yml
  9. Run the following command to create secret for docker-registry.

    kubectl create secret docker-registry spark-plugin-docker-hub --docker-server=<PLEASE_CHANGE> --docker-username=<PLEASE_CHANGE>  --docker-password='<PLEASE_CHANGE>' --namespace=<PLEASE_CHANGE>
  10. Run the following command to deploy a sample Spark application. Replace ${SPARK_NAME_SPACE} with the Kubernetes namespace.

    kubectl apply -f privacera-spark-examples.yml -n ${SPARK_NAME_SPACE}

    Note

    This is a sample file used for deployment. As per your use case, you can create a Spark deployment file and deploy a Docker image.

    This will deploy a Spark application in EKS pod with Privacera plugin and it will keep the pod running, so that you can use it in interactive mode.
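
The four `kubectl apply` commands in step 8 can be wrapped in a small loop so that the order stays fixed and the run stops at the first failure. This is a sketch, not part of the downloaded package: apply_manifests is a hypothetical helper, and passing `echo` as the runner only prints the commands, which makes it possible to preview them without touching the cluster.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Manifests from step 8, in the order the steps apply them.
MANIFESTS=(namespace.yml service-account.yml role.yml role-binding.yml)

# Hypothetical helper: run "<runner> apply -f <file>" for each manifest.
# The runner defaults to kubectl; pass "echo" to print the commands instead.
apply_manifests() {
  local runner="${1:-kubectl}" manifest
  for manifest in "${MANIFESTS[@]}"; do
    "${runner}" apply -f "${manifest}"
  done
}

# Preview only: collects four "apply -f ..." lines and changes nothing in the cluster.
preview="$(apply_manifests echo)"
echo "${preview}"
```

Calling `apply_manifests` with no argument from the k8s-spark-deploy directory runs the real `kubectl apply` commands.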