一、Introduction to Ranger
Apache Ranger provides a centralized security management framework that handles authorization and auditing. It enables fine-grained data access control for Hadoop-ecosystem components such as HDFS, YARN, Hive, and HBase. Through the Ranger console, administrators can control user access simply by configuring policies.
1、Component list
# | Service Name | Listen Port | Core Ranger Service |
---|---|---|---|
1 | ranger | 6080/tcp | Y (ranger engine - 3.0.0-SNAPSHOT version) |
2 | ranger-postgres | 5432/tcp | Y (ranger datastore) |
3 | ranger-solr | 8983/tcp | Y (audit store) |
4 | ranger-zk | 2181/tcp | Y (used by solr) |
5 | ranger-usersync | - | Y (user/group synchronization from Local Linux/Mac) |
6 | ranger-kms | 9292/tcp | N (needed only for Encrypted Storage / TDE) |
7 | ranger-tagsync | - | N (needed only for Tag Based Policies to be sync from ATLAS) |
2、Supported data engine services
# | Service Name | Listen Port | Service Description |
---|---|---|---|
1 | Hadoop | 8088/tcp | Apache Hadoop 3.3.0 |
2 | HBase | 16000/tcp 16010/tcp 16020/tcp 16030/tcp | Apache HBase 2.4.6 Protected by Apache Ranger's HBase Plugin |
3 | Hive | 10000/tcp | Apache Hive 3.1.2 Protected by Apache Ranger's Hive Plugin |
4 | Kafka | 6667/tcp | Apache Kafka 2.8.1 Protected by Apache Ranger's Kafka Plugin |
5 | Knox | 8443/tcp | Apache Knox 1.4.0 Protected by Apache Ranger's Knox Plugin |
3、Architecture
Notes:
Ranger Admin: the core module of Ranger. It has a built-in web console; users define security policies through this console or via the REST API;
Agent Plugin: a plugin embedded in each Hadoop-ecosystem component. It periodically pulls policies from Ranger Admin and enforces them, recording every operation for auditing;
User Sync: synchronizes user/group data from the operating system into the Ranger database.
4、Workflow
Ranger Admin is the main interface between Apache Ranger and its users. After logging in to Ranger Admin, a user can define separate security policies for each Hadoop component. Once a policy is saved, the component's Agent Plugin periodically pulls all policies configured for that component from Ranger Admin and caches them locally. When a request arrives at the component, the Agent Plugin performs the authorization check and returns the decision to the component; this is how access control over the data service is enforced. When a policy is modified in Ranger Admin, the Agent Plugin pulls and applies the updated version; when a policy is deleted, the plugin can no longer authorize requests against it.
Taking Hive as an example, the flow is as follows:
二、Cluster planning
This test uses three virtual machines running CentOS 7.6.
The Ranger version is 2.4.0. The official site provides no binary package, so Ranger must be built from source; the process is described in:
Centos7.6 + Apache Ranger 2.4.0编译(docker方式)_snipercai的博客-CSDN博客
The MySQL version is 5.7.43, the KDC version is 1.15.1, and the Hadoop version is 3.3.4.
IP Address | Hostname | Hadoop | Ranger |
---|---|---|---|
192.168.121.101 | node101.cc.local | NN1 DN | Ranger Admin |
192.168.121.102 | node102.cc.local | NN2 DN | |
192.168.121.103 | node103.cc.local | DN | Ranger Admin MYSQL |
三、Installing MySQL
1、Download and install MySQL
2、Configure /etc/my.cnf
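The article does not reproduce its my.cnf; a minimal configuration matching the stock MySQL 5.7 RPM layout might look like the sketch below. The paths are the package defaults, and the UTF-8 server character set is a common recommendation for the Ranger database rather than something taken from the original file.

```ini
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
# UTF-8 is commonly recommended for the Ranger schema
character-set-server=utf8
```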
3、Start the service
● mysqld.service - MySQL Server
Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2023-08-11 15:16:50 CST; 5s ago
Docs: man:mysqld(8)
http://dev.mysql.com/doc/refman/en/using-systemd.html
Process: 8875 ExecStart=/usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid $MYSQLD_OPTS (code=exited, status=0/SUCCESS)
Process: 8822 ExecStartPre=/usr/bin/mysqld_pre_systemd (code=exited, status=0/SUCCESS)
Main PID: 8878 (mysqld)
CGroup: /system.slice/mysqld.service
└─8878 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid
Aug 11 15:16:46 node103.cc.local systemd[1]: Starting MySQL Server...
Aug 11 15:16:50 node103.cc.local systemd[1]: Started MySQL Server.
4、Retrieve the temporary password
2023-08-11T07:16:48.014862Z 1 [Note] A temporary password is generated for root@localhost: x5/dgr.?sasI
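The password can also be pulled out of the log non-interactively; the helper below is a sketch that assumes the default log-error location /var/log/mysqld.log and the log-line format shown above.

```shell
#!/usr/bin/env bash
# Print the temporary root password from a mysqld error log.
extract_temp_password() {
  grep 'temporary password' "$1" | awk '{print $NF}'
}
# On the install host:
#   extract_temp_password /var/log/mysqld.log
```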
5、Change the root password and flush privileges
Securing the MySQL server deployment.
Enter password for user root: x5/dgr.?sasI
The existing password for the user account root has expired. Please set a new password.
New password: Mysql@103!
Re-enter new password: Mysql@103!
The 'validate_password' plugin is installed on the server.
The subsequent steps will run with the existing configuration of the plugin.
Using existing password for root.
Estimated strength of the password: 100
Change the password for root ? ((Press y|Y for Yes, any other key for No) : y
New password: Mysql@103!
Re-enter new password: Mysql@103!
Estimated strength of the password: 100
Do you wish to continue with the password provided?(Press y|Y for Yes, any other key for No) : y
By default, a MySQL installation has an anonymous user, allowing anyone to log into MySQL without having to have a user account created for them. This is intended only for testing, and to make the installation go a bit smoother. You should remove them before moving into a production environment.
Remove anonymous users? (Press y|Y for Yes, any other key for No) : y
Success.
Normally, root should only be allowed to connect from 'localhost'. This ensures that someone cannot guess at the root password from the network.
Disallow root login remotely? (Press y|Y for Yes, any other key for No) : n
Success.
By default, MySQL comes with a database named 'test' that anyone can access. This is also intended only for testing, and should be removed before moving into a production environment.
Remove test database and access to it? (Press y|Y for Yes, any other key for No) : y
- Dropping test database...
Success.
- Removing privileges on test database...
Success.
Reloading the privilege tables will ensure that all changes made so far will take effect immediately.
Reload privilege tables now? (Press y|Y for Yes, any other key for No) : y
Success.
All done!
6、Test the login
Enter password: Mysql@103!
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.7.43 MySQL Community Server (GPL)
Copyright (c) 2000, 2023, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>
四、Installing Ranger Admin
1、Extract the compiled package
2、Create the ranger user
Configure install.properties
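The article does not list the values it used; the fragment below shows the keys that matter for this topology. The key names are standard Ranger Admin install.properties entries, but every value here is a placeholder to adapt.

```properties
DB_FLAVOR=MYSQL
SQL_CONNECTOR_JAR=/usr/share/java/mysql-connector-java.jar
db_root_user=root
db_root_password=Mysql@103!
db_host=node103.cc.local
db_name=ranger
db_user=rangeradmin
db_password=Ranger@103!
# password for the built-in admin user of the web console
rangerAdmin_password=Ranger@103!
unix_user=ranger
unix_group=ranger
```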
3、Create the Ranger database
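A sketch of the corresponding DDL, run as the MySQL root user; the database and account names are illustrative and must match the db_name/db_user values configured in install.properties:

```sql
CREATE DATABASE ranger;
-- '%' lets the Ranger Admin host connect remotely; narrow it where possible
CREATE USER 'rangeradmin'@'%' IDENTIFIED BY 'Ranger@103!';
GRANT ALL PRIVILEGES ON ranger.* TO 'rangeradmin'@'%';
FLUSH PRIVILEGES;
```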
4、Initialize Ranger Admin
Installation of Ranger PolicyManager Web Application is completed.
usermod: no changes
[2023/08/11 18:03:29]: [I] Soft linking /etc/ranger/admin/conf to ews/webapp/WEB-INF/classes/conf
5、Start Ranger Admin
Starting Apache Ranger Admin Service
Apache Ranger Admin Service with pid 18756 has started.
6、Log in to the web console
Open http://192.168.121.103:6080/login.jsp
Default user: admin
If rangerAdmin_password was not configured, the default password is admin
五、Installing usersync
1、Extract the compiled package
2、Configure install.properties
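For a Unix-source sync against the Admin on node103, the relevant install.properties keys would look roughly like this (the key names are standard usersync entries; the values are placeholders):

```properties
POLICY_MGR_URL=http://192.168.121.103:6080
SYNC_SOURCE=unix
# polling interval in minutes
SYNC_INTERVAL=1
unix_user=ranger
unix_group=ranger
rangerUsersync_password=Ranger@103!
```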
3、Install the usersync module
Creating ranger-usersync-env-logdir.sh file
Creating ranger-usersync-env-hadoopconfdir.sh file
Creating ranger-usersync-env-piddir.sh file
Creating ranger-usersync-env-confdir.sh file
4、Enable automatic synchronization
Configure /opt/ranger/ranger-2.4.0-usersync/conf/ranger-ugsync-site.xml
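Enabling the sync usually comes down to setting ranger.usersync.enabled. A sketch of the relevant properties (standard ranger-ugsync-site.xml names; the interval and UID threshold values are illustrative):

```xml
<property>
  <name>ranger.usersync.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- sync cycle interval in milliseconds (one minute here) -->
  <name>ranger.usersync.sleeptimeinmillisbetweensynccycle</name>
  <value>60000</value>
</property>
<property>
  <!-- accounts below this UID are treated as system users and skipped -->
  <name>ranger.usersync.unix.minUserId</name>
  <value>500</value>
</property>
```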
5、Start the usersync module
Starting Apache Ranger Usersync Service
Apache Ranger Usersync Service with pid 10537 has started.
6、Check user synchronization
Once the usersync module starts, it automatically synchronizes the users of the local Linux system.
Note: only regular users are synchronized; root and system (virtual) accounts, i.e. those with low UID/GID values, are skipped.
六、Installing the hdfs-plugin
1、Extract the compiled package
2、Configure install.properties
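The REPOSITORY_NAME used here can be read off the enable log below (the credential store lands in /etc/ranger/hadoopdev). A minimal install.properties sketch with standard key names and placeholder values:

```properties
POLICY_MGR_URL=http://192.168.121.103:6080
REPOSITORY_NAME=hadoopdev
# audit to Solr is optional; disabled in this sketch
XAAUDIT.SOLR.ENABLE=false
CUSTOM_USER=hadoop
CUSTOM_GROUP=hadoop
```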
3、Enable the HDFS plugin
Custom user and group is available, using custom user and group.
+ Fri Aug 18 10:22:34 CST 2023 : hadoop: lib folder=/opt/hadoop/hadoop-3.3.4/share/hadoop/hdfs/lib conf folder=/opt/hadoop/hadoop-3.3.4/etc/hadoop
+ Fri Aug 18 10:22:34 CST 2023 : Saving current config file: /opt/hadoop/hadoop-3.3.4/etc/hadoop/hdfs-site.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.hdfs-site.xml.20230818-102234 ...
+ Fri Aug 18 10:22:35 CST 2023 : Saving current config file: /opt/hadoop/hadoop-3.3.4/etc/hadoop/ranger-hdfs-audit.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.ranger-hdfs-audit.xml.20230818-102234 ...
+ Fri Aug 18 10:22:35 CST 2023 : Saving current config file: /opt/hadoop/hadoop-3.3.4/etc/hadoop/ranger-hdfs-security.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.ranger-hdfs-security.xml.20230818-102234 ...
+ Fri Aug 18 10:22:35 CST 2023 : Saving current config file: /opt/hadoop/hadoop-3.3.4/etc/hadoop/ranger-policymgr-ssl.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.ranger-policymgr-ssl.xml.20230818-102234 ...
+ Fri Aug 18 10:22:36 CST 2023 : Saving current JCE file: /etc/ranger/hadoopdev/cred.jceks to /etc/ranger/hadoopdev/.cred.jceks.20230818102236 ...
Ranger Plugin for hadoop has been enabled. Please restart hadoop to ensure that changes are effective.
4、Restart HDFS
5、Configure the HDFS service in the web console
Service Name: must match REPOSITORY_NAME
Username: the hadoop user
Password: the hadoop user's password
Namenode URL: the NameNode RPC address (not the web address); for an HA setup, list the addresses of all NameNodes separated by commas
After filling in the form, click Test Connection first.
If the connection test succeeds, click Add to finish.
6、Verify access control
Using the test user, reading data under the rangertest directory works, but uploading a file into rangertest is denied.
Configure a Ranger policy that allows the test user to operate on the rangertest directory.
With the policy in place, the test user can upload files into the rangertest directory.
七、Installing the Yarn-plugin
1、Extract the compiled package
2、Configure install.properties
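As with the hdfs-plugin, the REPOSITORY_NAME can be read off the enable log below (the credential store lands in /etc/ranger/yarn_repo). A minimal install.properties sketch with standard key names and placeholder values:

```properties
POLICY_MGR_URL=http://192.168.121.103:6080
REPOSITORY_NAME=yarn_repo
# audit to Solr is optional; disabled in this sketch
XAAUDIT.SOLR.ENABLE=false
CUSTOM_USER=hadoop
CUSTOM_GROUP=hadoop
```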
3、Enable the YARN plugin
Custom user and group is available, using custom user and group.
+ Wed Aug 30 17:29:26 CST 2023 : yarn: lib folder=/opt/hadoop/hadoop-3.3.4/share/hadoop/hdfs/lib conf folder=/opt/hadoop/hadoop-3.3.4/etc/hadoop
+ Wed Aug 30 17:29:26 CST 2023 : Saving /opt/hadoop/hadoop-3.3.4/etc/hadoop/ranger-policymgr-ssl.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.ranger-policymgr-ssl.xml.20230830-172926 ...
+ Wed Aug 30 17:29:26 CST 2023 : Saving /opt/hadoop/hadoop-3.3.4/etc/hadoop/ranger-yarn-audit.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.ranger-yarn-audit.xml.20230830-172926 ...
+ Wed Aug 30 17:29:26 CST 2023 : Saving /opt/hadoop/hadoop-3.3.4/etc/hadoop/ranger-yarn-security.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.ranger-yarn-security.xml.20230830-172926 ...
+ Wed Aug 30 17:29:26 CST 2023 : Saving current config file: /opt/hadoop/hadoop-3.3.4/etc/hadoop/ranger-policymgr-ssl.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.ranger-policymgr-ssl.xml.20230830-172926 ...
+ Wed Aug 30 17:29:26 CST 2023 : Saving current config file: /opt/hadoop/hadoop-3.3.4/etc/hadoop/ranger-yarn-audit.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.ranger-yarn-audit.xml.20230830-172926 ...
+ Wed Aug 30 17:29:27 CST 2023 : Saving current config file: /opt/hadoop/hadoop-3.3.4/etc/hadoop/ranger-yarn-security.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.ranger-yarn-security.xml.20230830-172926 ...
+ Wed Aug 30 17:29:27 CST 2023 : Saving current config file: /opt/hadoop/hadoop-3.3.4/etc/hadoop/yarn-site.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.yarn-site.xml.20230830-172926 ...
+ Wed Aug 30 17:29:27 CST 2023 : Saving lib file: /opt/hadoop/hadoop-3.3.4/share/hadoop/hdfs/lib/ranger-plugin-classloader-2.4.0.jar to /opt/hadoop/hadoop-3.3.4/share/hadoop/hdfs/lib/.ranger-plugin-classloader-2.4.0.jar.20230830172927 ...
+ Wed Aug 30 17:29:27 CST 2023 : Saving lib file: /opt/hadoop/hadoop-3.3.4/share/hadoop/hdfs/lib/ranger-yarn-plugin-impl to /opt/hadoop/hadoop-3.3.4/share/hadoop/hdfs/lib/.ranger-yarn-plugin-impl.20230830172927 ...
+ Wed Aug 30 17:29:27 CST 2023 : Saving lib file: /opt/hadoop/hadoop-3.3.4/share/hadoop/hdfs/lib/ranger-yarn-plugin-shim-2.4.0.jar to /opt/hadoop/hadoop-3.3.4/share/hadoop/hdfs/lib/.ranger-yarn-plugin-shim-2.4.0.jar.20230830172927 ...
+ Wed Aug 30 17:29:27 CST 2023 : Saving current JCE file: /etc/ranger/yarn_repo/cred.jceks to /etc/ranger/yarn_repo/.cred.jceks.20230830172927 ...
+ Wed Aug 30 17:29:29 CST 2023 : Saving current JCE file: /etc/ranger/yarn_repo/cred.jceks to /etc/ranger/yarn_repo/.cred.jceks.20230830172929 ...
Ranger Plugin for yarn has been enabled. Please restart yarn to ensure that changes are effective.
Notes:
If you see an error like the following:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory — the yarn-plugin is missing a jar; copy commons-logging-XXX.jar from /opt/ranger/ranger-2.4.0-hdfs-plugin/install/lib to /opt/ranger/ranger-2.4.0-yarn-plugin/install/lib/.
Similarly, for Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/lang3/StringUtils, copy commons-lang3-XXX.jar.
Similarly, for Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/htrace/core/Tracer$Builder, copy htrace-core4-4.1.0-incubating.jar.
Similarly, for Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/compress/archivers/tar/TarArchiveInputStream, copy commons-compress-XXX.jar.
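The four fixes above are the same copy operation repeated; a small helper (hypothetical, using the paths from this article) performs all of them in one pass:

```shell
#!/usr/bin/env bash
# Copy the jars the yarn-plugin is missing from the hdfs-plugin's lib dir.
copy_missing_jars() {
  local src=$1 dst=$2 jar
  for jar in commons-logging commons-lang3 htrace-core4 commons-compress; do
    cp "$src"/"$jar"-*.jar "$dst"/
  done
}
# On the install host:
#   copy_missing_jars /opt/ranger/ranger-2.4.0-hdfs-plugin/install/lib \
#                     /opt/ranger/ranger-2.4.0-yarn-plugin/install/lib
```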
4、Restart YARN
5、Configure the YARN service in the web console
Service Name: must match REPOSITORY_NAME
Username: the hadoop user
Password: the hadoop user's password
YARN REST URL *: the ResourceManager address; for an HA setup, list the addresses of all ResourceManagers separated by commas or semicolons
After filling in the form, click Test Connection first.
If the connection test succeeds, click Add to finish.
6、Verify access control
Configure a new policy as follows. YARN's default queue is root.default; using it as the example, allow the hadoop user to submit to queue root.default and deny the test user.
Note: a policy takes effect roughly 30 seconds after it is saved. The two outputs below are the example pi job (e.g. hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.4.jar pi 3 3) submitted first as the hadoop user, which succeeds, and then as the test user, which is rejected.
Number of Maps = 3
Samples per Map = 3
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Starting Job
2023-08-30 18:22:27,786 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
2023-08-30 18:22:27,911 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hadoop/.staging/job_1693388094495_0006
2023-08-30 18:22:28,118 INFO input.FileInputFormat: Total input files to process : 3
2023-08-30 18:22:28,243 INFO mapreduce.JobSubmitter: number of splits:3
2023-08-30 18:22:28,579 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1693388094495_0006
2023-08-30 18:22:28,581 INFO mapreduce.JobSubmitter: Executing with tokens: []
2023-08-30 18:22:28,856 INFO conf.Configuration: resource-types.xml not found
2023-08-30 18:22:28,862 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2023-08-30 18:22:29,083 INFO impl.YarnClientImpl: Submitted application application_1693388094495_0006
2023-08-30 18:22:29,208 INFO mapreduce.Job: The url to track the job: http://node102.cc.local:8088/proxy/application_1693388094495_0006/
2023-08-30 18:22:29,209 INFO mapreduce.Job: Running job: job_1693388094495_0006
2023-08-30 18:22:39,549 INFO mapreduce.Job: Job job_1693388094495_0006 running in uber mode : false
2023-08-30 18:22:39,552 INFO mapreduce.Job: map 0% reduce 0%
2023-08-30 18:22:59,558 INFO mapreduce.Job: map 100% reduce 0%
2023-08-30 18:23:07,782 INFO mapreduce.Job: map 100% reduce 100%
2023-08-30 18:23:08,807 INFO mapreduce.Job: Job job_1693388094495_0006 completed successfully
2023-08-30 18:23:08,991 INFO mapreduce.Job: Counters: 54
File System Counters
FILE: Number of bytes read=72
FILE: Number of bytes written=1121844
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=768
HDFS: Number of bytes written=215
HDFS: Number of read operations=17
HDFS: Number of large read operations=0
HDFS: Number of write operations=3
HDFS: Number of bytes read erasure-coded=0
Job Counters
Launched map tasks=3
Launched reduce tasks=1
Data-local map tasks=3
Total time spent by all maps in occupied slots (ms)=54819
Total time spent by all reduces in occupied slots (ms)=5732
Total time spent by all map tasks (ms)=54819
Total time spent by all reduce tasks (ms)=5732
Total vcore-milliseconds taken by all map tasks=54819
Total vcore-milliseconds taken by all reduce tasks=5732
Total megabyte-milliseconds taken by all map tasks=56134656
Total megabyte-milliseconds taken by all reduce tasks=5869568
Map-Reduce Framework
Map input records=3
Map output records=6
Map output bytes=54
Map output materialized bytes=84
Input split bytes=414
Combine input records=0
Combine output records=0
Reduce input groups=2
Reduce shuffle bytes=84
Reduce input records=6
Reduce output records=0
Spilled Records=12
Shuffled Maps =3
Failed Shuffles=0
Merged Map outputs=3
GC time elapsed (ms)=366
CPU time spent (ms)=3510
Physical memory (bytes) snapshot=693649408
Virtual memory (bytes) snapshot=10972426240
Total committed heap usage (bytes)=436482048
Peak Map Physical memory (bytes)=194879488
Peak Map Virtual memory (bytes)=2741276672
Peak Reduce Physical memory (bytes)=110477312
Peak Reduce Virtual memory (bytes)=2748796928
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=354
File Output Format Counters
Bytes Written=97
Job Finished in 41.735 seconds
Estimated value of Pi is 3.55555555555555555556
Number of Maps = 3
Samples per Map = 3
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Starting Job
2023-08-30 18:48:58,861 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
2023-08-30 18:48:59,045 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/test/.staging/job_1693392487740_0001
2023-08-30 18:48:59,260 INFO input.FileInputFormat: Total input files to process : 3
2023-08-30 18:48:59,444 INFO mapreduce.JobSubmitter: number of splits:3
2023-08-30 18:48:59,834 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1693392487740_0001
2023-08-30 18:48:59,835 INFO mapreduce.JobSubmitter: Executing with tokens: []
2023-08-30 18:49:00,093 INFO conf.Configuration: resource-types.xml not found
2023-08-30 18:49:00,094 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2023-08-30 18:49:00,295 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/test/.staging/job_1693392487740_0001
java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: org.apache.hadoop.security.AccessControlException: User test does not have permission to submit application_1693392487740_0001 to queue default