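I deployed a 3-node Kafka cluster across multiple hosts with Docker, using host network mode. Even after exposing JMX_PORT, CMAK still could not retrieve metrics and reported an error similar to the following: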
Failed to get broker metrics for BrokerIdentity(1,192.168.4.11,9999,false,true,Map(PLAINTEXT -> 9092))
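I consulted the following references:
- The official Confluent Kafka documentation on JMX port configuration: https://docs.confluent.io/platform/current/installation/docker/operations/monitoring.html#configure-environment
- The discussion in the yahoo/CMAK issue on GitHub: https://github.com/yahoo/CMAK/issues/563. The key passage is quoted below: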
Well, it looks like KafkaManager can't connect to the JMX port... first I would check that it's possible to connect from KM host to each of the brokers using netcat, you should see something like this:
$ netcat -z -n -v 10.0.20.7 9096
Connection to 10.0.20.7 9096 port [tcp/*] succeeded!
In case it's successful, then check which interface is KM using, check firewalls, etc, something is blocking the request between KM and Kafka.
If netcat fails then most likely there is a problem with the JMX configuration in kafka, check -Dcom.sun.management.jmxremote.local.only=false and in hosts with multiple interfaces -Djava.rmi.server.hostname= is configured to the correct ip.
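The netcat check described in the quoted comment can also be scripted, which is convenient when netcat is not installed on the CMAK host. The following is a minimal sketch in Python using only the standard library; the function name `jmx_port_reachable` and the timeout value are my own choices for illustration, not part of CMAK or Kafka. It only tests whether a TCP connection can be opened, so it says nothing about whether JMX itself is configured correctly behind the port.

```python
import socket


def jmx_port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    This mirrors `netcat -z -n -v host port`: it opens the connection
    and closes it immediately without sending any data.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Run from the CMAK host, a call such as `jmx_port_reachable("192.168.4.11", 9999)` should return True for every broker before CMAK can be expected to read metrics.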
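Based on the references above and on the kafka-run-class.sh script that ships with Kafka, I concluded that the following Docker environment variables need to be set. Taking a docker-compose.yml file as an example: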
environment:
- JMX_PORT=9999
- KAFKA_JMX_OPTS=-Djava.rmi.server.hostname=192.168.4.11 -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.rmi.port=9999 -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
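The image I am using is bitnami/kafka:2.3.0.
After further debugging, I ended up with the following parameters, which drop the -Djava.rmi.server.hostname and -Dcom.sun.management.jmxremote.local.only options: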
environment:
- JMX_PORT=9999
- KAFKA_JMX_OPTS=-Dcom.sun.management.jmxremote.rmi.port=9999 -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
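The final configuration never sets -Dcom.sun.management.jmxremote.port itself. To see why, it helps to model how kafka-run-class.sh combines JMX_PORT with KAFKA_JMX_OPTS. The sketch below is a Python approximation of that shell logic as I understand it for Kafka 2.x; it is not the script itself, and the function name `effective_jmx_opts` is hypothetical. The assumption is that the script falls back to a default option set only when KAFKA_JMX_OPTS is empty, and then appends the port option whenever JMX_PORT is set. Check the script inside your own image, since other versions or image entrypoints may behave differently.

```python
# Defaults assumed to be applied by kafka-run-class.sh when
# KAFKA_JMX_OPTS is not set (assumption, based on Kafka 2.x).
DEFAULT_JMX_OPTS = (
    "-Dcom.sun.management.jmxremote "
    "-Dcom.sun.management.jmxremote.authenticate=false "
    "-Dcom.sun.management.jmxremote.ssl=false"
)


def effective_jmx_opts(env):
    """Return the list of JMX-related JVM flags for a given environment.

    `env` is a dict of environment variables, for example the entries
    under `environment:` in docker-compose.yml.
    """
    opts = env.get("KAFKA_JMX_OPTS", "").strip()
    if not opts:
        # The defaults are used only when the variable is empty.
        opts = DEFAULT_JMX_OPTS
    port = env.get("JMX_PORT", "").strip()
    if port:
        # The registry port is appended from JMX_PORT.
        opts = f"{opts} -Dcom.sun.management.jmxremote.port={port}"
    return opts.split()


final_env = {
    "JMX_PORT": "9999",
    "KAFKA_JMX_OPTS": (
        "-Dcom.sun.management.jmxremote.rmi.port=9999 "
        "-Dcom.sun.management.jmxremote "
        "-Dcom.sun.management.jmxremote.authenticate=false "
        "-Dcom.sun.management.jmxremote.ssl=false"
    ),
}

for flag in effective_jmx_opts(final_env):
    print(flag)
```

Under these assumptions, the JVM ends up with both jmxremote.port and jmxremote.rmi.port set to 9999. Without the rmi.port option, the JVM normally picks a random port for the RMI server, so pinning both to the same value means only one port has to be reachable from CMAK.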