Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DCOS-40151] Added Sentry to Kerberized Hive image #397

Open
wants to merge 11 commits into
base: sh-dcos-39050
Choose a base branch
from

Conversation

susanxhuynh
Copy link
Contributor

@susanxhuynh susanxhuynh commented Aug 28, 2018

Added Sentry for Hive authorization in the Kerberized Hive image.

  • Download and start the Sentry service
  • Sentry config file
  • Hive configuration to use Sentry for authorization. This is in two files: hive-site.xml and a client version of sentry-site.xml (Hive is a client of Sentry).

Testing: manually ran a Spark Hive job against the Hive image as "alice" with and without CREATETABLE privileges.

@samvantran
Copy link
Contributor

Hi @elezar and gang, this PR builds off #392 by adding Sentry-related items to the Hive Dockerfile. Ultimately, the image built in this PR gets tested in the Hive integration test PR #399

PR #392 has already been approved so I thought we should merge this PR into that one (once you give the 👍), then merge 392 into master. The final step would be to get PR #399 working and approved which will be enough to consider the story done and we can move forward with soaking/releasing a new Spark package.

Copy link
Contributor

@elezar elezar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions and comments from me. Don't see any explicit blockers though.

if [ -f "$SENTRY_CONF_TEMPLATE" ]; then
sed s/{{HOSTNAME}}/$HOSTNAME/ $SENTRY_HOME/conf/sentry-site.xml.template > $SENTRY_HOME/conf/sentry-site.xml
sed s/{{HOSTNAME}}/$HOSTNAME/ $HIVE_CONF/sentry-site.xml.template > $HIVE_CONF/sentry-site.xml
$SENTRY_HOME/bin/sentry --command schema-tool --conffile $SENTRY_CONF_FILE --dbType derby --initSchema
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be in the if block?

@@ -1,5 +1,14 @@
FROM cdh5-hive

ENV SENTRY_VERSION 1.5.1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question. Do we ever want to use the non kerberized Hive? If not, we could just drop this all into a single Dockerfile.

@@ -33,7 +33,7 @@
[
"hostname",
"IS",
"10.0.0.114"
"1.2.3.4"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used at all?

GRANT ROLE test_role to GROUP root;
GRANT ALL on DATABASE default to ROLE test_role WITH GRANT OPTION;
EOF
/usr/local/hive/bin/beeline -u "jdbc:hive2://localhost:10000/default;principal=hive/${HOSTNAME}@LOCAL" -f grant_alice.sql
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this port (1000) configurable elsewhere?

# Grant permissions to user “alice”
echo "Grant permissions to user alice ..."
kdestroy
kinit -k -t /usr/local/hadoop/etc/hadoop/hdfs.keytab hive/${HOSTNAME}@LOCAL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this have to be the hive user? could one use the hdfs user here too?

# download sentry
RUN curl -L http://archive.cloudera.com/cdh${CDH_VERSION}/cdh/${CDH_VERSION}/sentry-${SENTRY_VERSION}-cdh${CDH_EXACT_VERSION}.tar.gz \
| tar -xzC /usr/local && \
cd /usr/local && \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be ${SENTRY_HOME}?

</property>
<property>
<name>sentry.hive.server</name>
<value>server1</value>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this name specified?

</property>
<property>
<name>sentry.service.client.server.rpc-port</name>
<value>8038</value>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this port mapped to the Marathon JSON requirements somewhere too?

<configuration>
<property>
<name>sentry.hive.server</name>
<value>server1</value>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is server1 specific to something at all?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants