Open Source Spark#
First obtain an account-specific script from your PrivaceraCloud account, then add a startup step to open source Spark.
Three configurations are available depending on your requirement. Fine-Grained Access Control [FGAC] and Object-Level Access Control [OLAC] are supported in each of the configurations:
- Configure Privacera Plugin on Local/Virtual Machine
- Configure Privacera Plugin in an Existing Docker File of an EKS Cluster
- Configure Privacera Plugin using Privacera Scripts in an EKS Cluster
Obtain Installation Script#
Obtain the account-unique <privacera-plugin-script-download-url>.
You run this script and other commands in your Spark command shell to complete the PrivaceraCloud installation.
Steps:
-
Go to Settings > API Key.
-
Use an existing active API Key or generate a new one.
Note
Make sure the Expiry column is set to "Never Expires".
-
Click the i icon to get the scripts.
-
On the Plugins Setup Script, click the COPY URL button. Save this value on your Spark server. It is needed as the <privacera-plugin-script-download-url> in the next step.
Configure Privacera Plugin on Local/Virtual Machine#
OLAC Setup#
-
OLAC is supported only with JWT token authentication.
Your Dataserver application should be configured with JWT Token support. Create a new Dataserver, if it does not exist. See Data Access Server.
-
Add the following properties in your Dataserver application to enable JWT authorization.
privacera.jwt.oauth.enable=true
privacera.jwt.token.issuer=<PLEASE_CHANGE>
privacera.jwt.token.publickey=<PLEASE_CHANGE>
privacera.jwt.token.secret=<PLEASE_CHANGE>
privacera.jwt.token.subject=<PLEASE_CHANGE>
privacera.jwt.token.userKey=<PLEASE_CHANGE>
privacera.jwt.token.groupKey=<PLEASE_CHANGE>
privacera.jwt.token.parser.type=<PLEASE_CHANGE>
| Property | Description | Example |
| --- | --- | --- |
| privacera.jwt.oauth.enable | Enables JWT auth in Privacera services. | privacera.jwt.oauth.enable=true |
| privacera.jwt.token.issuer | URL of the identity provider. | privacera.jwt.token.issuer=https://your-idp-domain.com |
| privacera.jwt.token.publickey | The JWT token public key as a single-line string (delete all newlines). | -----BEGIN PUBLIC KEY-----MIIBIjANB-----END PUBLIC KEY----- |
| privacera.jwt.token.secret | [Optional] If the JWT token has been encrypted using a secret, use this property to set the secret. | privacera.jwt.token.secret=privacera-api |
| privacera.jwt.token.subject | [Optional] Add this if the JWT token has a subject. | privacera.jwt.token.subject=api-token |
| privacera.jwt.token.userKey | Defines a unique userKey whose value is used as the user in Ranger policies. | client-id |
| privacera.jwt.token.groupKey | Defines a unique groupKey whose value is used as the group in Ranger policies. | scope |
| privacera.jwt.token.parser.type | JWT parser type. Values can be PING_IDENTITY (when groupKey is an array) or KEYCLOAKS (when groupKey is space-separated). | privacera.jwt.token.parser.type=KEYCLOAKS |
After adding the properties, run the Dataserver, and then proceed to the next step.
-
SSH to the instance where Spark is installed and you want to install Privacera Plugin.
-
Create a directory under ~/privacera and download the script. Replace <privacera-plugin-script-download-url> with the Privacera Plugin download URL.
mkdir -p ~/privacera/spark-plugin-install
cd ~/privacera/spark-plugin-install
wget <privacera-plugin-script-download-url> -O privacera_plugin.sh
-
Create a file privacera_env.sh which will contain the parameters required for your plugin installation.
vi privacera_env.sh
Add the following properties:
PLUGIN_TYPE="spark"
SPARK_PLUGIN_TYPE="OLAC"
SPARK_HOME="<PLEASE_CHANGE>"
SPARK_CLUSTER_NAME="privacera-spark"
| Property | Description |
| --- | --- |
| PLUGIN_TYPE | Type of Privacera Plugin which you want to install. |
| SPARK_PLUGIN_TYPE | Spark Plugin type OLAC. JWT Authentication will be enabled by default. |
| SPARK_HOME | This is the home directory of your Spark installation. For example, the directory path can be /home/user/spark. |
| SPARK_CLUSTER_NAME | Cluster Name which will show up in the Privacera Ranger Audits page. |
-
Run the script.
chmod +x privacera_plugin.sh
./privacera_plugin.sh
The script will set up the Privacera Plugin in the OLAC mode.
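The Dataserver property table above asks for the public key as a string with all newlines deleted. A minimal sketch of one way to produce that string from a PEM file, assuming a shell with `tr` available; the function name `flatten_pem` and the file name `jwt_public_key.pem` are hypothetical and not part of the Privacera scripts:

```shell
# Print a PEM file as one line by deleting every newline and carriage return.
flatten_pem() {
  tr -d '\r\n' < "$1"
}

# Hypothetical usage: jwt_public_key.pem is an assumed file name.
# echo "privacera.jwt.token.publickey=$(flatten_pem jwt_public_key.pem)"
```

The BEGIN and END marker lines are kept, which matches the example value shown in the table.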
FGAC Setup#
-
We recommend using FGAC with JWT authentication enabled.
Note
If JWT authentication is disabled, access control is applied to the system user or proxy user.
-
SSH to the instance where Spark is installed and you want to install Privacera Plugin.
-
Create a directory under ~/privacera and download the script. Replace <privacera-plugin-script-download-url> with the Privacera Plugin download URL.
mkdir -p ~/privacera/spark-plugin-install
cd ~/privacera/spark-plugin-install
wget <privacera-plugin-script-download-url> -O privacera_plugin.sh
-
Create a file privacera_env.sh that contains the parameters required for your plugin installation.
vi privacera_env.sh
Add the following properties:
PLUGIN_TYPE="spark"
SPARK_PLUGIN_TYPE="FGAC"
SPARK_HOME="<PLEASE_CHANGE>"
SPARK_CLUSTER_NAME="privacera-spark"
| Property | Description |
| --- | --- |
| PLUGIN_TYPE | Type of Privacera Plugin which you want to install. |
| SPARK_PLUGIN_TYPE | Spark Plugin type FGAC. |
| SPARK_HOME | The home directory of your Spark installation. For example, the directory path can be /home/user/spark. |
| SPARK_CLUSTER_NAME | Cluster name which will show up in the Privacera Ranger Audits page. |
Add the following properties when JWT auth is enabled:
JWT_OAUTH_ENABLE="true"
JWT_ISSUER="<PLEASE_CHANGE>"
JWT_PUBLIC_KEY="<PLEASE_CHANGE>"
#JWT_SECRET="<PLEASE_CHANGE>"
#JWT_SUBJECT="<PLEASE_CHANGE>"
JWT_USERKEY="<PLEASE_CHANGE>"
JWT_GROUPKEY="<PLEASE_CHANGE>"
JWT_PARSER_TYPE="<PLEASE_CHANGE>"
| Property | Description | Example |
| --- | --- | --- |
| JWT_OAUTH_ENABLE | Enables JWT authentication. | JWT_OAUTH_ENABLE="true" |
| JWT_ISSUER | The URL of the identity provider. | JWT_ISSUER="https://your-idp-domain.com" |
| JWT_PUBLIC_KEY | The JWT token public key in string format. | |
| JWT_SECRET | Uncomment and add a value if the JWT token has been encrypted using a secret. | JWT_SECRET="privacera-secret" |
| JWT_SUBJECT | Uncomment and add a value if the JWT token has a subject. | JWT_SUBJECT="api-token" |
| JWT_USERKEY | Defines a unique userKey whose value is used as the user in Ranger policies. | JWT_USERKEY="client_id" |
| JWT_GROUPKEY | Defines a unique groupKey whose value is used as the group in Ranger policies. | JWT_GROUPKEY="scope" |
| JWT_PARSER_TYPE | JWT parser type. Values can be PING_IDENTITY or KEYCLOAKS. | JWT_PARSER_TYPE="KEYCLOAKS" |
-
Run the script.
chmod +x privacera_plugin.sh
./privacera_plugin.sh
The script will set up the Privacera Plugin in the FGAC mode.
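When choosing values for JWT_USERKEY, JWT_GROUPKEY, and JWT_PARSER_TYPE, it can help to look at the claims in a token issued by your identity provider. The sketch below decodes the payload segment of a JWT without verifying its signature. It assumes GNU coreutils `base64`; the function name `jwt_payload` and the `TOKEN` variable are illustrative and not part of the Privacera tooling:

```shell
# Decode the payload (second dot-separated segment) of a JWT and print it.
# The signature is NOT verified; this is for inspection only.
jwt_payload() {
  local payload
  # Convert the base64url alphabet to the standard base64 alphabet.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url omits padding; restore it so base64 accepts the input.
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do
    payload="${payload}="
  done
  printf '%s' "$payload" | base64 -d
}

# Hypothetical usage: TOKEN holds a token obtained from your identity provider.
# jwt_payload "$TOKEN"
```

The claim names that carry the user and the groups in the printed JSON are the candidates for JWT_USERKEY and JWT_GROUPKEY. Whether the group claim is an array or a single space-separated string is the distinction the PING_IDENTITY and KEYCLOAKS parser types are described as handling.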
Configure Privacera Plugin in an Existing Docker File#
If you have an existing Open Source Spark setup running on Kubernetes, you can add the Privacera Plugin installation steps to the Docker file that you use to build the Spark image.
OLAC Setup#
-
OLAC is supported only with JWT token authentication.
Your Dataserver application should be configured with JWT Token support. Create a new Dataserver, if it does not exist. See Data Access Server.
-
Add the following properties in your Dataserver application to enable JWT authorization.
privacera.jwt.oauth.enable=true
privacera.jwt.token.issuer=<PLEASE_CHANGE>
privacera.jwt.token.publickey=<PLEASE_CHANGE>
privacera.jwt.token.secret=<PLEASE_CHANGE>
privacera.jwt.token.subject=<PLEASE_CHANGE>
privacera.jwt.token.userKey=<PLEASE_CHANGE>
privacera.jwt.token.groupKey=<PLEASE_CHANGE>
privacera.jwt.token.parser.type=<PLEASE_CHANGE>
| Property | Description | Example |
| --- | --- | --- |
| privacera.jwt.oauth.enable | Enables JWT auth in Privacera services. | privacera.jwt.oauth.enable=true |
| privacera.jwt.token.issuer | URL of the identity provider. | privacera.jwt.token.issuer=https://your-idp-domain.com |
| privacera.jwt.token.publickey | The JWT token public key as a single-line string (delete all newlines). | -----BEGIN PUBLIC KEY-----MIIBIjANB-----END PUBLIC KEY----- |
| privacera.jwt.token.secret | [Optional] If the JWT token has been encrypted using a secret, use this property to set the secret. | privacera.jwt.token.secret=privacera-api |
| privacera.jwt.token.subject | [Optional] Add this if the JWT token has a subject. | privacera.jwt.token.subject=api-token |
| privacera.jwt.token.userKey | Defines a unique userKey whose value is used as the user in Ranger policies. | client-id |
| privacera.jwt.token.groupKey | Defines a unique groupKey whose value is used as the group in Ranger policies. | scope |
| privacera.jwt.token.parser.type | JWT parser type. Values can be PING_IDENTITY (when groupKey is an array) or KEYCLOAKS (when groupKey is space-separated). | privacera.jwt.token.parser.type=KEYCLOAKS |
After adding the properties, run the Dataserver, and then proceed to the next step.
-
SSH to the instance where Spark is installed and you want to install Privacera Plugin.
-
Copy the following to your Docker file. Set the PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL property. To get the Privacera Plugin download URL, see Obtain Installation Script.
######## Install Privacera Spark Plugin Start ###########
# ENV SPARK_HOME /opt/apache/spark
RUN apt-get -y install zip unzip wget
ENV PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL="<PLEASE_CHANGE>"
ENV PLUGIN_TYPE="spark"
ENV SPARK_PLUGIN_TYPE="OLAC"
ENV SPARK_CLUSTER_NAME="privacera-spark"
RUN echo "Downloading Script from $PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL"
RUN wget ${PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL} -O privacera_plugin.sh
RUN chmod +x privacera_plugin.sh
RUN ./privacera_plugin.sh
######## Install Privacera Spark Plugin End ###########
-
Save the Docker file and build the image. You will now have a Docker image for Open Source Spark with the Privacera Plugin enabled.
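The build step can be sketched as follows, assuming the Docker CLI is installed. The function name `build_spark_image`, the Docker file path, and the image tag in the usage line are placeholders chosen for illustration. The guard refuses to build while the download URL is still the <PLEASE_CHANGE> placeholder:

```shell
# Build the Spark image only after the plugin download URL has been filled in.
build_spark_image() {
  local dockerfile="$1" image="$2"
  if grep -q 'PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL="<PLEASE_CHANGE>"' "$dockerfile"; then
    echo "Set PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL in $dockerfile before building" >&2
    return 1
  fi
  # Use the directory that holds the Docker file as the build context.
  docker build -f "$dockerfile" -t "$image" "$(dirname "$dockerfile")"
}

# Hypothetical usage (image name is an assumption):
# build_spark_image ./Dockerfile my-registry/spark-privacera:latest
```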
FGAC Setup#
-
We recommend using FGAC with JWT authentication enabled.
Note
If JWT authentication is disabled, access control is applied to the system user or proxy user.
-
SSH to the instance where Spark is installed and you want to install Privacera Plugin.
-
Copy the following to your Docker file. Set the PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL property. To get the Privacera Plugin download URL, see Obtain Installation Script. For the JWT properties, refer to the table below.
######## Install Privacera Spark Plugin Start ###########
# ENV SPARK_HOME /opt/apache/spark
RUN apt-get -y install zip unzip wget
ENV PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL="<PLEASE_CHANGE>"
ENV PLUGIN_TYPE="spark"
ENV SPARK_PLUGIN_TYPE="FGAC"
ENV SPARK_CLUSTER_NAME="privacera-spark"
ENV JWT_OAUTH_ENABLE="true"
ENV JWT_ISSUER=<PLEASE_CHANGE>
ENV JWT_PUBLIC_KEY=<PLEASE_CHANGE>
ENV JWT_SECRET=<PLEASE_CHANGE>
ENV JWT_SUBJECT=<PLEASE_CHANGE>
ENV JWT_USERKEY=<PLEASE_CHANGE>
ENV JWT_GROUPKEY=<PLEASE_CHANGE>
ENV JWT_PARSER_TYPE=<PLEASE_CHANGE>
RUN echo "Downloading Script from $PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL"
RUN wget ${PCLOUD_PLUGIN_SCRIPT_DOWNLOAD_URL} -O privacera_plugin.sh
RUN chmod +x privacera_plugin.sh
RUN ./privacera_plugin.sh
######## Install Privacera Spark Plugin End ###########
| Property | Description | Example |
| --- | --- | --- |
| JWT_OAUTH_ENABLE | Enables JWT authentication. | JWT_OAUTH_ENABLE="true" |
| JWT_ISSUER | The URL of the identity provider. | JWT_ISSUER="https://your-idp-domain.com" |
| JWT_PUBLIC_KEY | The JWT token public key in string format. | |
| JWT_SECRET | Set a value if the JWT token has been encrypted using a secret. | JWT_SECRET="privacera-secret" |
| JWT_SUBJECT | Set a value if the JWT token has a subject. | JWT_SUBJECT="api-token" |
| JWT_USERKEY | Defines a unique userKey whose value is used as the user in Ranger policies. | JWT_USERKEY="client_id" |
| JWT_GROUPKEY | Defines a unique groupKey whose value is used as the group in Ranger policies. | JWT_GROUPKEY="scope" |
| JWT_PARSER_TYPE | JWT parser type. Values can be PING_IDENTITY or KEYCLOAKS. | JWT_PARSER_TYPE="KEYCLOAKS" |
-
Save the Docker file and build the image. You will now have a Docker image for Open Source Spark with the Privacera Plugin enabled.
Configure Privacera Plugin using Privacera Scripts#
The scripts create an Open Source Spark image with the Privacera Plugin and push it to the Docker hub you specify. You can then use that image to run Spark with Privacera.
OLAC Setup#
-
OLAC is supported only with JWT token authentication.
Your Dataserver application should be configured with JWT Token support. Create a new Dataserver, if it does not exist. See Data Access Server.
-
Add the following properties in your Dataserver application to enable JWT authorization.
privacera.jwt.oauth.enable=true
privacera.jwt.token.issuer=<PLEASE_CHANGE>
privacera.jwt.token.publickey=<PLEASE_CHANGE>
privacera.jwt.token.secret=<PLEASE_CHANGE>
privacera.jwt.token.subject=<PLEASE_CHANGE>
privacera.jwt.token.userKey=<PLEASE_CHANGE>
privacera.jwt.token.groupKey=<PLEASE_CHANGE>
privacera.jwt.token.parser.type=<PLEASE_CHANGE>
| Property | Description | Example |
| --- | --- | --- |
| privacera.jwt.oauth.enable | Enables JWT auth in Privacera services. | privacera.jwt.oauth.enable=true |
| privacera.jwt.token.issuer | URL of the identity provider. | privacera.jwt.token.issuer=https://your-idp-domain.com |
| privacera.jwt.token.publickey | The JWT token public key as a single-line string (delete all newlines). | -----BEGIN PUBLIC KEY-----MIIBIjANB-----END PUBLIC KEY----- |
| privacera.jwt.token.secret | [Optional] If the JWT token has been encrypted using a secret, use this property to set the secret. | privacera.jwt.token.secret=privacera-api |
| privacera.jwt.token.subject | [Optional] Add this if the JWT token has a subject. | privacera.jwt.token.subject=api-token |
| privacera.jwt.token.userKey | Defines a unique userKey whose value is used as the user in Ranger policies. | client-id |
| privacera.jwt.token.groupKey | Defines a unique groupKey whose value is used as the group in Ranger policies. | scope |
| privacera.jwt.token.parser.type | JWT parser type. Values can be PING_IDENTITY (when groupKey is an array) or KEYCLOAKS (when groupKey is space-separated). | privacera.jwt.token.parser.type=KEYCLOAKS |
After adding the properties, run the Dataserver, and then proceed to the next step.
-
SSH to the instance where you want to install Privacera Plugin.
-
Create a directory under ~/privacera and download the script. Replace <privacera-plugin-script-download-url> with the Privacera Plugin download URL.
mkdir -p ~/privacera/spark-plugin-install
cd ~/privacera/spark-plugin-install
wget <privacera-plugin-script-download-url> -O privacera_plugin.sh
-
Create a file privacera_env.sh which will contain the parameters required for your plugin installation.
vi privacera_env.sh
Add the following properties:
PLUGIN_TYPE="spark_k8s"
SPARK_PLUGIN_TYPE="OLAC"
HUB="<PLEASE_CHANGE>"
HUB_USERNAME="<PLEASE_CHANGE>"
HUB_PASSWORD="<PLEASE_CHANGE>"
ENV_TAG="<PLEASE_CHANGE>"
| Property | Description |
| --- | --- |
| PLUGIN_TYPE | Type of Privacera Plugin which you want to install. |
| SPARK_PLUGIN_TYPE | Spark Plugin type OLAC. JWT Authentication will be enabled by default. |
| HUB | The Docker hub URL where you want the image to be pushed. |
| HUB_USERNAME | Docker hub username. |
| HUB_PASSWORD | Docker hub password. |
| ENV_TAG | Docker image tag. |
-
Run the script.
chmod +x privacera_plugin.sh
./privacera_plugin.sh
The script will build the Spark image with Privacera Spark plugin and publish it to the Docker hub.
FGAC Setup#
-
We recommend using FGAC with JWT authentication enabled.
Note
If JWT authentication is disabled, access control is applied to the system user or proxy user.
-
SSH to the instance where you want to install Privacera Plugin.
-
Create a directory under ~/privacera and download the script. Replace <privacera-plugin-script-download-url> with the Privacera Plugin download URL.
mkdir -p ~/privacera/spark-plugin-install
cd ~/privacera/spark-plugin-install
wget <privacera-plugin-script-download-url> -O privacera_plugin.sh
-
Create a file privacera_env.sh that contains the parameters required for your plugin installation.
vi privacera_env.sh
Add the following properties:
PLUGIN_TYPE="spark_k8s"
SPARK_PLUGIN_TYPE="FGAC"
SPARK_HOME="<PLEASE_CHANGE>"
SPARK_CLUSTER_NAME="privacera-spark"
| Property | Description |
| --- | --- |
| PLUGIN_TYPE | Type of Privacera Plugin which you want to install. |
| SPARK_PLUGIN_TYPE | Spark Plugin type FGAC. |
| SPARK_HOME | The home directory of your Spark installation. For example, the directory path can be /home/user/spark. |
| SPARK_CLUSTER_NAME | Cluster name which will show up in the Privacera Ranger Audits page. |
Add the following properties when JWT auth is enabled:
JWT_OAUTH_ENABLE="true"
JWT_ISSUER="<PLEASE_CHANGE>"
JWT_PUBLIC_KEY="<PLEASE_CHANGE>"
#JWT_SECRET="<PLEASE_CHANGE>"
#JWT_SUBJECT="<PLEASE_CHANGE>"
JWT_USERKEY="<PLEASE_CHANGE>"
JWT_GROUPKEY="<PLEASE_CHANGE>"
JWT_PARSER_TYPE="<PLEASE_CHANGE>"
| Property | Description | Example |
| --- | --- | --- |
| JWT_OAUTH_ENABLE | Enables JWT authentication. | JWT_OAUTH_ENABLE="true" |
| JWT_ISSUER | The URL of the identity provider. | JWT_ISSUER="https://your-idp-domain.com" |
| JWT_PUBLIC_KEY | The JWT token public key in string format. | |
| JWT_SECRET | Uncomment and add a value if the JWT token has been encrypted using a secret. | JWT_SECRET="privacera-secret" |
| JWT_SUBJECT | Uncomment and add a value if the JWT token has a subject. | JWT_SUBJECT="api-token" |
| JWT_USERKEY | Defines a unique userKey whose value is used as the user in Ranger policies. | JWT_USERKEY="client_id" |
| JWT_GROUPKEY | Defines a unique groupKey whose value is used as the group in Ranger policies. | JWT_GROUPKEY="scope" |
| JWT_PARSER_TYPE | JWT parser type. Values can be PING_IDENTITY or KEYCLOAKS. | JWT_PARSER_TYPE="KEYCLOAKS" |
Add the following Docker Hub properties:
HUB="<PLEASE_CHANGE>"
HUB_USERNAME="<PLEASE_CHANGE>"
HUB_PASSWORD="<PLEASE_CHANGE>"
ENV_TAG="<PLEASE_CHANGE>"
| Property | Description |
| --- | --- |
| HUB | The Docker hub URL where you want the image to be pushed. |
| HUB_USERNAME | Docker hub username. |
| HUB_PASSWORD | Docker hub password. |
| ENV_TAG | Docker image tag. |
-
Run the script.
chmod +x privacera_plugin.sh
./privacera_plugin.sh
The script will build the Spark image with Privacera Spark plugin and publish it to the Docker hub.
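Because privacera_env.sh carries several <PLEASE_CHANGE> placeholders, a quick check before running the script can catch a value that was left unedited. This is a minimal sketch, assuming `awk` is available; the function name `env_file_ready` is hypothetical and not part of the Privacera scripts. Commented-out lines such as #JWT_SECRET are ignored:

```shell
# Succeed only if no active (non-comment) line still contains <PLEASE_CHANGE>.
env_file_ready() {
  awk '!/^[ \t]*#/ && /<PLEASE_CHANGE>/ { found = 1 } END { exit found }' "$1"
}

# Hypothetical usage: run the installer only when the file is fully edited.
# env_file_ready privacera_env.sh && ./privacera_plugin.sh
```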
Deploy Spark on EKS Cluster#
-
SSH to the instance where you want to deploy Spark on the EKS cluster.
-
Get the Privacera Plugin download URL and set it in the following property. See Obtain Installation Script.
export PRIVACERA_DOWNLOAD_URL="<PLEASE_CHANGE>"
-
Create the spark-k8s-artifacts folder.
mkdir -p ~/privacera/spark-k8s-artifacts
cd ~/privacera/spark-k8s-artifacts
-
Download and extract packages.
wget ${PRIVACERA_DOWNLOAD_URL}/plugin/spark/k8s-spark-deploy.tar.gz -O k8s-spark-deploy.tar.gz
tar xzf k8s-spark-deploy.tar.gz
rm -r k8s-spark-deploy.tar.gz
cd k8s-spark-deploy/
-
Open the penv.sh file and substitute the values of the following properties. Refer to the table below:
| Property | Description | Example |
| --- | --- | --- |
| SPARK_NAME_SPACE | Kubernetes namespace | privacera-spark-plugin-test |
| SPARK_PLUGIN_IMAGE | Docker image with hub | ${HUB}/privacera-spark-plugin:${ENV_TAG} |
| SPARK_DOCKER_PULL_SECRET | Secret for docker-registry | spark-plugin-docker-hub |
| SPARK_PLUGIN_ROLE_BINDING | Spark role binding | privacera-sa-spark-plugin-role-binding |
| SPARK_PLUGIN_SERVICE_ACCOUNT | Spark service account | privacera-sa-spark-plugin |
| SPARK_PLUGN_ROLE | Spark service account role | privacera-sa-spark-plugin-role |
| SPARK_PLUGIN_APP_NAME | Spark plugin application name | privacera-spark-examples |
-
Run the following command to replace the property values in EKS deployment YAML file.
mkdir -p backup
cp *.yml backup/
./replace.sh
-
Run the following command to create EKS resources.
kubectl apply -f namespace.yml
kubectl apply -f service-account.yml
kubectl apply -f role.yml
kubectl apply -f role-binding.yml
-
Run the following command to create the secret for docker-registry.
kubectl create secret docker-registry spark-plugin-docker-hub --docker-server=<PLEASE_CHANGE> --docker-username=<PLEASE_CHANGE> --docker-password='<PLEASE_CHANGE>' --namespace=<PLEASE_CHANGE>
-
Run the following command to deploy a sample Spark application. Replace ${SPARK_NAME_SPACE} with the Kubernetes namespace.
kubectl apply -f privacera-spark-examples.yml -n ${SPARK_NAME_SPACE}
Note
This is a sample file used for deployment. As per your use case, you can create a Spark deployment file and deploy a Docker image.
This deploys a Spark application with the Privacera Plugin in an EKS pod and keeps the pod running, so that you can use it in interactive mode.
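If you keep ${SPARK_NAME_SPACE} as a shell variable in the deploy command instead of replacing it by hand, an unset variable expands to an empty string and the `-n` option receives no namespace. A minimal guard, assuming the variable is exported in the current shell; the function name `require_vars` is hypothetical:

```shell
# Fail if any of the named shell variables is unset or empty.
require_vars() {
  local name value missing=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Variable $name is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Hypothetical usage before deploying the sample application:
# require_vars SPARK_NAME_SPACE && kubectl apply -f privacera-spark-examples.yml -n "${SPARK_NAME_SPACE}"
```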
Validation#
To validate your Spark deployment, refer to Privacera Plugin in Spark on EKS - Validation