0

点赞

收藏

分享

hive多级分区表的使用

small_Sun 2022-07-12 阅读 71

标签: hive 大数据数据 sql 二级 Hive

简介

一个分区字段就是一级目录,如果是两个分区,就是两个分区目录,比如说 area province,那么目录就是 /area/province/xxxxxx ,数据都在xxxxxx这里.

hive多级分区表的使用_大数据

创建表

二级分区使用方式就是在一级分区基础上 PARTITIONED BY后面多写个字段即可

create table order_multi_partition
(
    order_no   string,
    order_time string
)
    PARTITIONED BY (event_timt string, step string)
    row format delimited fields terminated by '\t';

准备数据

order_created.txt

10703007267488  2014-05-01  06:01:12.334+01
10101043505096  2014-05-01  07:28:12.342+01
10103043509747  2014-05-01  07:50:12.33+01
10103043501575  2014-05-01  09:27:12.33+01
10104043514061  2014-05-01  09:03:12.324+01

导入数据

现在order_created.txt所在的目录在/root/soft

[root@zjj101 soft]# ls
data  docker  hadoop-2.7.2  hive-1.2.1  myconf  order_created.txt  tmp
[root@zjj101 soft]# pwd
/root/soft

使用load方式导入
sql:

load data local inpath '/root/soft/order_created.txt' into table order_multi_partition
  PARTITION (event_timt = '2014-05-01', step = '1')

查看结果

sql

select *
from order_multi_partition
where event_timt = '2014-05-01'
  and step = '1';

hive多级分区表的使用_大数据_02

hdfs上面的数据

hive多级分区表的使用_二级_03

再次上传到另一个二级分区

上传前先查看一下有没有数据

select *
from order_multi_partition
where event_timt = '2014-05-01'
  and step = '2';

hive多级分区表的使用_sql_04
没有数据

将数据导入到step2分区

导入到(event_timt = ‘2014-05-01’, step = ‘2’)分区
sql:

load data local inpath '/root/soft/order_created.txt' into table order_multi_partition
    PARTITION (event_timt = '2014-05-01', step = '2');

查看结果

sql:

select *
from order_multi_partition
where event_timt = '2014-05-01'
  and step = '2';

hive多级分区表的使用_hive_05
此时就有了

hive多级分区表的使用_hive_06

0 条评论

关注