How to Access the CDH Solr Service with Java Code

霸姨 2022-09-21



1. Purpose of This Document



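The CDH cluster runs Solr 4.10.3, and Java applications commonly use the solrj client library to access a Solr cluster. This article describes how to access Solr clusters from Java code in both Kerberos and non-Kerberos environments.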


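  • Overview

1. Environment preparation

2. Connection examples for non-Kerberos and Kerberos environments


  • Test environment

1. Kerberos cluster: CDH 5.11.2 on Red Hat 7.2

2. Non-Kerberos cluster: CDH 5.13 on CentOS 6.5


  • Prerequisites

1. The Solr service is installed on the cluster

2. The Solr service is running normally on both the Kerberos and non-Kerberos clusters

3. A test collection named collection1 has been created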


2. Environment Preparation



1. Maven dependencies



<repositories>
    <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>org.apache.solr</groupId>
        <artifactId>solr-solrj</artifactId>
        <version>4.10.3-cdh5.11.2</version>
    </dependency>

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.6.0-cdh5.11.2</version>
    </dependency>
</dependencies>


Note: these are the CDH builds of the dependencies. With the open-source 4.10.3 artifacts, authentication against Solr in a Kerberos environment fails.
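
The summary at the end of this article attributes that failure to the Krb5HttpClientConfigurer class being absent from the open-source solrj 4.10.3 jar. A quick way to confirm which build actually ended up on the classpath is to probe for that class at startup. The following is a minimal sketch under that assumption; the class name SolrjClasspathCheck and the helper isClassAvailable are hypothetical names introduced here for illustration, not part of solrj.

```java
// Sketch: probe the classpath for the Kerberos helper class that, according to
// this article, only the CDH build of solrj 4.10.3 provides.
// SolrjClasspathCheck and isClassAvailable are illustrative names (assumptions).
class SolrjClasspathCheck {

    // Fully qualified name taken from the import list of the Kerberos example.
    static final String KRB5_CONFIGURER =
            "org.apache.solr.client.solrj.impl.Krb5HttpClientConfigurer";

    /** Returns true if the named class can be located, without initializing it. */
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, SolrjClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (isClassAvailable(KRB5_CONFIGURER)) {
            System.out.println("Krb5HttpClientConfigurer found on the classpath.");
        } else {
            System.out.println("Krb5HttpClientConfigurer missing: check the solr-solrj version.");
        }
    }
}
```

Running this with the same classpath as the application shows whether the CDH artifact from the Maven snippet above was really resolved, before any Kerberos debugging starts.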


2. Create a keytab file for accessing the Solr cluster (skip this step on a non-Kerberos cluster)


[ec2-user@ip-172-31-22-86 keytab]$ sudo kadmin.local
Authenticating as principal mapred/admin@CLOUDERA.COM with password.
kadmin.local:  listprincs fayson*
fayson@CLOUDERA.COM
kadmin.local: xst -norandkey -k fayson.keytab fayson@CLOUDERA.COM
...
kadmin.local: exit
[ec2-user@ip-172-31-22-86 keytab]$ ll
total 4
-rw------- 1 root root 514 Nov 28 10:54 fayson.keytab
[ec2-user@ip-172-31-22-86 keytab]$




3. Create a jaas-client.conf file with the following content (skip this step on a non-Kerberos cluster)


Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
keyTab="/Volumes/Transcend/keytab/fayson.keytab"
storeKey=true
useTicketCache=true
debug=true
principal="fayson@CLOUDERA.COM";
};


Change the keyTab path and the principal to your own keytab file path and Kerberos account.
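
Since only the keytab path and the principal differ between environments, the file can also be generated rather than edited by hand. The sketch below renders the same Client section from two parameters and writes it to a temporary file. The class JaasConfigWriter and its methods are hypothetical names introduced for illustration, and the sketch assumes the path contains no characters that need escaping inside a JAAS string (for example Windows backslashes).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch: generate a jaas-client.conf equivalent to the one shown above.
// JaasConfigWriter, render and writeToTempFile are illustrative names (assumptions).
class JaasConfigWriter {

    /** Renders the "Client" login section for the given keytab path and principal. */
    static String render(String keytabPath, String principal) {
        return "Client {\n"
                + "  com.sun.security.auth.module.Krb5LoginModule required\n"
                + "  useKeyTab=true\n"
                + "  keyTab=\"" + keytabPath + "\"\n"
                + "  storeKey=true\n"
                + "  useTicketCache=true\n"
                + "  debug=true\n"
                + "  principal=\"" + principal + "\";\n"
                + "};\n";
    }

    /** Writes the rendered section to a new temporary file and returns its path. */
    static Path writeToTempFile(String keytabPath, String principal) throws IOException {
        Path file = Files.createTempFile("jaas-client", ".conf");
        Files.write(file, render(keytabPath, principal).getBytes(StandardCharsets.UTF_8));
        return file;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder values: replace with your own keytab path and principal.
        Path conf = writeToTempFile("/path/to/fayson.keytab", "fayson@CLOUDERA.COM");
        // Same system property that the Kerberos example in this article sets.
        System.setProperty("java.security.auth.login.config", conf.toString());
        System.out.println("JAAS config written to " + conf);
    }
}
```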


3. Non-Kerberos Environment



1. Sample code



package com.cloudera.solr;

import com.cloudera.bean.Message;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.SolrInputDocument;

import java.io.IOException;

/**
 * package: com.cloudera.solr
 * describe: Access a Solr cluster in a non-Kerberos environment with Solrj 4.10.3-cdh5.11.2
 * create_user: Fayson
 * email: htechinfo@163.com
 * create_date: 2017/11/26
 * create_time: 12:08 AM
 * WeChat official account: Hadoop实操
 */
public class NoneKBSolrTest {

    static final String zkHost = "13.229.70.204:2181/solr";
    static final String defaultCollection = "collection1";
    static final int socketTimeout = 20000;
    static final int zkConnectTimeout = 1000;

    public static void main(String[] args) {
        CloudSolrServer cloudSolrServer = new CloudSolrServer(zkHost);
        cloudSolrServer.setDefaultCollection(defaultCollection);
        // ZooKeeper connect timeout and client (session) timeout, in milliseconds
        cloudSolrServer.setZkConnectTimeout(zkConnectTimeout);
        cloudSolrServer.setZkClientTimeout(socketTimeout);
        cloudSolrServer.connect();

        search(cloudSolrServer, "id:12345678911");
        addIndex(cloudSolrServer);
        deleteIndex(cloudSolrServer, "12345678955");
    }

    /**
     * Search
     *
     * @param solrClient  the Solr client
     * @param queryString the query string, e.g. "id:12345678911"
     */
    public static void search(CloudSolrServer solrClient, String queryString) {
        SolrQuery query = new SolrQuery();
        query.setQuery(queryString);
        try {
            QueryResponse response = solrClient.query(query);
            SolrDocumentList docs = response.getResults();

            System.out.println("Documents found: " + docs.getNumFound());
            System.out.println("Query time: " + response.getQTime());

            for (SolrDocument doc : docs) {
                String id = (String) doc.getFieldValue("id");
                String created_at = (String) doc.getFieldValue("created_at");
                String text = (String) doc.getFieldValue("text");
                String text_cn = (String) doc.getFieldValue("text_cn");
                System.out.println("id: " + id);
                System.out.println("created_at: " + created_at);
                System.out.println("text: " + text);
                System.out.println("text_cn: " + text_cn);
                System.out.println();
            }
        } catch (Exception e) {
            System.out.println("Unknown exception!");
            e.printStackTrace();
        }
    }

    /**
     * Add a document to the index
     *
     * @param solrClient the Solr client
     */
    public static void addIndex(CloudSolrServer solrClient) {
        try {
            SolrInputDocument solrInputDocument = new SolrInputDocument();
            solrInputDocument.setField("id", "1234567890");
            solrInputDocument.setField("created_at", "2017-11-25 02:35:07");
            solrInputDocument.setField("text", "hello world");
            solrInputDocument.setField("text_cn", "张三是个农民,勤劳致富,奔小康");
            solrClient.add(solrInputDocument);
            solrClient.commit();
        } catch (Exception e) {
            System.out.println("Unknown exception!");
            e.printStackTrace();
        }
    }

    /**
     * Create an index entry in the Solr cluster from a JavaBean object
     *
     * @param solrServer the Solr client
     */
    public static void addBean(CloudSolrServer solrServer) {
        Message message = new Message("12345678911", "2017-11-25 02:35:07", "hello world", "张三是个农民,勤劳致富,奔小康");
        try {
            solrServer.addBean(message);
            solrServer.commit();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (SolrServerException e) {
            e.printStackTrace();
        }
    }

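    /**
     * Delete a document from the default collection by id
     *
     * @param solrServer the Solr client
     * @param id         the id of the document to delete
     */
    public static void deleteIndex(CloudSolrServer solrServer, String id) {
        try {
            solrServer.deleteById(id);
            // Commit so that the deletion becomes visible to later searches
            solrServer.commit();
        } catch (SolrServerException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}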
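
The zkHost value used above packs two things into one string: a comma-separated list of ZooKeeper host:port pairs and, after the first slash, what appears to be the ZooKeeper chroot (/solr) under which the cluster keeps the Solr state. The sketch below splits such a string so both parts can be checked before connecting. ZkHostString is a hypothetical helper written for this illustration; it is not part of solrj and only uses the JDK.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: split a zkHost string such as "zk1:2181,zk2:2181/solr" into its
// server list and chroot path. ZkHostString is an illustrative name (assumption).
class ZkHostString {
    final List<String> servers; // e.g. ["zk1:2181", "zk2:2181"]
    final String chroot;        // e.g. "/solr", or "" when no chroot is given

    private ZkHostString(List<String> servers, String chroot) {
        this.servers = servers;
        this.chroot = chroot;
    }

    static ZkHostString parse(String zkHost) {
        // Everything from the first '/' onward is treated as the chroot.
        int slash = zkHost.indexOf('/');
        String hostPart = slash < 0 ? zkHost : zkHost.substring(0, slash);
        String chroot = slash < 0 ? "" : zkHost.substring(slash);

        List<String> servers = new ArrayList<>();
        for (String server : hostPart.split(",")) {
            String trimmed = server.trim();
            if (!trimmed.isEmpty()) {
                servers.add(trimmed);
            }
        }
        return new ZkHostString(servers, chroot);
    }

    public static void main(String[] args) {
        ZkHostString parsed = parse("13.229.70.204:2181/solr");
        // prints: servers=[13.229.70.204:2181], chroot=/solr
        System.out.println("servers=" + parsed.servers + ", chroot=" + parsed.chroot);
    }
}
```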


4. Kerberos Environment




1. Sample code


package com.cloudera.solr;

import com.cloudera.bean.Message;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.impl.HttpClientUtil;
import org.apache.solr.client.solrj.impl.Krb5HttpClientConfigurer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.SolrInputDocument;

import java.io.IOException;

/**
 * package: com.cloudera.solr
 * describe: Access Solr in a Kerberos environment
 * create_user: Fayson
 * email: htechinfo@163.com
 * create_date: 2017/11/26
 * create_time: 2:10 AM
 * WeChat official account: Hadoop实操
 */
public class KBSolrTest {

    static final String zkHost = "ip-172-31-22-86.ap-southeast-1.compute.internal:2181/solr";
    static final String defaultCollection = "collection1";
    static final int socketTimeout = 20000;
    static final int zkConnectTimeout = 10000;

    public static void main(String[] args) {
        // Kerberos and JAAS settings; the paths are hard-coded for this test
        System.setProperty("java.security.krb5.conf", "/Volumes/Transcend/keytab/krb5.conf");
        System.setProperty("javax.security.auth.useSubjectCredsOnly", "false");
        System.setProperty("sun.security.krb5.debug", "true");
        System.setProperty("java.security.auth.login.config", "/Volumes/Transcend/keytab/jaas-client.conf");

        // Register the Kerberos-aware HTTP client configurer before creating the client
        HttpClientUtil.setConfigurer(new Krb5HttpClientConfigurer());
        CloudSolrServer cloudSolrServer = new CloudSolrServer(zkHost);
        cloudSolrServer.setDefaultCollection(defaultCollection);
        cloudSolrServer.setZkConnectTimeout(zkConnectTimeout);
        cloudSolrServer.setZkClientTimeout(socketTimeout);
        cloudSolrServer.connect();

        addIndex(cloudSolrServer);

        addBeanIndex(cloudSolrServer);

        search(cloudSolrServer, "id:12345678955");
        search(cloudSolrServer, "id:12345678966");

        deleteIndex(cloudSolrServer, "12345678955");
        search(cloudSolrServer, "id:12345678955");
    }

   
    /**
     * Search
     *
     * @param solrClient  the Solr client
     * @param queryString the query string, e.g. "id:12345678955"
     */
    public static void search(CloudSolrServer solrClient, String queryString) {
        SolrQuery query = new SolrQuery();
        query.setQuery(queryString);
        try {
            QueryResponse response = solrClient.query(query);
            SolrDocumentList docs = response.getResults();

            System.out.println("Documents found: " + docs.getNumFound());
            System.out.println("Query time: " + response.getQTime());

            for (SolrDocument doc : docs) {
                String id = (String) doc.getFieldValue("id");
                String created_at = (String) doc.getFieldValue("created_at");
                String text = (String) doc.getFieldValue("text");
                String text_cn = (String) doc.getFieldValue("text_cn");
                System.out.println("id: " + id);
                System.out.println("created_at: " + created_at);
                System.out.println("text: " + text);
                System.out.println("text_cn: " + text_cn);
                System.out.println();
            }
        } catch (SolrServerException e) {
            e.printStackTrace();
        } catch (Exception e) {
            System.out.println("Unknown exception!");
            e.printStackTrace();
        }
    }

   
    /**
     * Add a document to the index
     *
     * @param solrClient the Solr client
     */
    public static void addIndex(CloudSolrServer solrClient) {
        try {
            SolrInputDocument solrInputDocument = new SolrInputDocument();
            solrInputDocument.setField("id", "12345678955");
            solrInputDocument.setField("created_at", "2017-11-25 02:35:07");
            solrInputDocument.setField("text", "hello world");
            solrInputDocument.setField("text_cn", "张三是个农民,勤劳致富,奔小康");
            solrClient.add(solrInputDocument);
            solrClient.commit();
        } catch (Exception e) {
            System.out.println("Unknown exception!");
            e.printStackTrace();
        }
    }

   
    /**
     * Create an index entry in the Solr cluster from a JavaBean object
     *
     * @param solrClient the Solr client
     */
    public static void addBeanIndex(CloudSolrServer solrClient) {
        try {
            Message message = new Message("12345678966", "2017-11-25 02:35:07", "hello world", "李四也是个农民,勤劳致富,奔小康");
            solrClient.addBean(message);
            solrClient.commit();
        } catch (Exception e) {
            System.out.println("Unknown exception!");
            e.printStackTrace();
        }
    }

   
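    /**
     * Delete a document from the index by id
     *
     * @param solrClient the Solr client
     * @param id         the id of the document to delete
     */
    public static void deleteIndex(CloudSolrServer solrClient, String id) {
        try {
            solrClient.deleteById(id);
            // Commit so that the deletion is visible to the search that follows
            solrClient.commit();
        } catch (SolrServerException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}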


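5. Packaging and Running the Project



This section uses the Kerberos environment as the example. Copy the run directory from the solrdemo project to the server and adjust it as needed to run it. The directory structure is as follows: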


[Image: directory structure of the run directory]


1. Package the project with Maven. This command builds a jar that is not directly executable


mvn clean package


Place the built jar in the lib directory.


2. Write the run.sh script


#!/bin/bash
for file in `ls lib/*jar`
do
    CLASSPATH=$CLASSPATH:$file
done
export CLASSPATH
for file in `ls /opt/cloudera/parcels/CDH/jars/*.jar`
do
    CLASSPATH=$CLASSPATH:$file
done
export CLASSPATH
/usr/java/jdk1.8.0_131-cloudera/bin/java com.cloudera.solr.KBSolrTest


Note: change the dependency jar directory and the class to run in the script above to match your own cluster.


3. Run run.sh to test


[Image: output of running run.sh]


Note: for this test Fayson hard-coded the jaas-client.conf and krb5.conf paths in the code. You can adapt the code to take them as parameters instead.
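
One possible way to make that adjustment is to read the two paths from the command-line arguments and fall back to defaults when they are absent. The sketch below is only an illustration: the class KerberosSettings, its methods, and the default paths /etc/krb5.conf and conf/jaas-client.conf are assumptions introduced here, while the system property names are the ones already used in the Kerberos example.

```java
// Sketch: take the krb5.conf and jaas-client.conf paths from program arguments
// instead of hard-coding them. KerberosSettings and the default paths below are
// illustrative assumptions, not part of the original project.
class KerberosSettings {
    static final String KRB5_CONF_PROP = "java.security.krb5.conf";
    static final String JAAS_CONF_PROP = "java.security.auth.login.config";

    /** Returns args[index] when it is present and non-empty, otherwise the fallback. */
    static String argOrDefault(String[] args, int index, String fallback) {
        if (args != null && args.length > index && args[index] != null && !args[index].isEmpty()) {
            return args[index];
        }
        return fallback;
    }

    /** Sets the Kerberos-related system properties: args[0] = krb5.conf, args[1] = JAAS file. */
    static void apply(String[] args) {
        System.setProperty(KRB5_CONF_PROP, argOrDefault(args, 0, "/etc/krb5.conf"));
        System.setProperty(JAAS_CONF_PROP, argOrDefault(args, 1, "conf/jaas-client.conf"));
        System.setProperty("javax.security.auth.useSubjectCredsOnly", "false");
    }

    public static void main(String[] args) {
        apply(args);
        System.out.println(KRB5_CONF_PROP + "=" + System.getProperty(KRB5_CONF_PROP));
        System.out.println(JAAS_CONF_PROP + "=" + System.getProperty(JAAS_CONF_PROP));
    }
}
```

Calling KerberosSettings.apply(args) at the top of main would replace the System.setProperty lines in KBSolrTest. Because these are ordinary JVM system properties, passing them with -D options on the java command line in run.sh is another option that needs no code change.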


6. Summary



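Fayson ran into quite a few pitfalls while debugging this code. The CDH cluster runs Solr 4.10.3, but the Solrj version chosen at first was 7.10.1; with it, queries against the Solr cluster worked during debugging, but adding an index to the cluster did not. Switching to the solrj 4.10.3 release from the official Solr site then caused a series of problems in the Kerberos environment, because that release has no Krb5HttpClientConfigurer class. The final choice was solrj 4.10.3-cdh5.11.2, which includes Krb5HttpClientConfigurer, and that resolved access to Solr in the Kerberos environment.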

 

GitHub source code:

https://github.com/javaxsky/cdhproject




Original article. Reposting is welcome; please credit the WeChat official account Hadoop实操 as the source.

