Splitting Hive Columns on Multi-Character Delimiters

笑望叔叔, 2023-02-21 (92 reads)




1 Introduction

This guide applies when creating a Hive table whose columns are separated by a multi-character delimiter.

2 Preparation

2.1 Environment

Hive version: 1.1.0-cdh5.4.7

3 Usage

3.1 Instructions

Method 1) Use the org.apache.hadoop.hive.contrib.serde2.RegexSerDe SerDe.

1)  Create the table

-- Use ^|~ as the delimiter
CREATE TABLE tableex3 (id STRING, name STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "^(.*)\\^\\|~(.*)$"
)
STORED AS TEXTFILE;

2)  Prepare the data

1^|~wee

2^|~do

we^|~xml

%^|~we

3)  Load the data

load data local inpath '/var/lib/hadoop-hdfs/tee.txt' into table tableex3;


4)  Verify:

select * from tableex3;

+--------------+----------------+--+
| tableex3.id  | tableex3.name  |
+--------------+----------------+--+
| 1            | wee            |
| 2            | do             |
| we           | xml            |
| %            | we             |
| NULL         | NULL           |
+--------------+----------------+--+
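
The NULL row at the end of the result most likely comes from a blank line in the data file: RegexSerDe returns NULL for every column when a line does not match input.regex. The minimal sketch below reproduces that behavior with the same pattern as it looks after Hive unescapes the string literal. Hive evaluates the pattern with Java's regex engine; Python's re module is used here only for illustration, and the helper name parse_line is hypothetical.

```python
import re

# Same pattern as input.regex once the string literal is unescaped:
# "\\^\\|~" in the DDL becomes \^\|~ in the actual regular expression.
PATTERN = re.compile(r"^(.*)\^\|~(.*)$")


def parse_line(line):
    """Return (id, name) for a matching line, or (None, None) otherwise.

    None stands in for the NULL that Hive shows for non-matching rows.
    """
    m = PATTERN.match(line)
    if m is None:
        return (None, None)
    return m.groups()


if __name__ == "__main__":
    # The four sample rows plus an empty line, as assumed above.
    for line in ["1^|~wee", "2^|~do", "we^|~xml", "%^|~we", ""]:
        print(parse_line(line))
```

Each capture group maps to one column in declaration order, so the pattern needs exactly as many groups as the table has columns.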

Method 2) Use the org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe SerDe.

-- Use ^|~ as the delimiter
CREATE TABLE multi_delim (col1 STRING, col2 STRING, col3 STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim" = "^|~");

cat /var/lib/hadoop-hdfs/tee3.txt

1^|~wee^|~hi

2^|~do^|~where

we^|~xml^|~rice

%^|~we^|~^|

load data local inpath '/var/lib/hadoop-hdfs/tee3.txt' into table multi_delim;


select * from multi_delim;

+-------------------+-------------------+-------------------+--+
| multi_delim.col1  | multi_delim.col2  | multi_delim.col3  |
+-------------------+-------------------+-------------------+--+
| 1                 | wee               | hi                |
| 2                 | do                | where             |
| we                | xml               | rice              |
| %                 | we                | ^|                |
|                   | NULL              | NULL              |
+-------------------+-------------------+-------------------+--+
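
The rows above are consistent with splitting each line on the literal three-character string ^|~. The sketch below mimics that in Python. It is an illustration only, not the MultiDelimitSerDe implementation: the helper split_row and its padding and truncation rules are assumptions chosen to mirror the output shown. Under those assumptions, a blank line at the end of the file would produce the last row, with an empty first column and NULL in the other two.

```python
DELIM = "^|~"  # the value of field.delim in the table definition
NUM_COLS = 3   # col1, col2, col3


def split_row(line, delim=DELIM, num_cols=NUM_COLS):
    """Split on the literal delimiter and pad missing columns with None.

    None stands in for NULL. Dropping surplus fields is an assumption
    made for this sketch.
    """
    fields = line.split(delim)[:num_cols]
    fields += [None] * (num_cols - len(fields))
    return fields


if __name__ == "__main__":
    for line in ["1^|~wee^|~hi", "%^|~we^|~^|", ""]:
        print(split_row(line))
```

Note that the trailing ^| in the fourth sample row is kept as data, because only the complete sequence ^|~ counts as a delimiter.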

3.2 Known issues

None so far.

4 Summary

Impala does not currently support multi-character field delimiters. The upstream issue tracks the lack of support for a multiple-character string as the field delimiter:

https://issues.cloudera.org/browse/IMPALA-2428



 
