Avro Table Creation Syntax

i奇异 2022-01-31
  • step1: Specify the file type

  • step2: Specify the schema

  • step3: Create the table

Implementation

  • Hive official docs: LanguageManual DDL - Apache Hive - Apache Software Foundation

  • Databricks official docs: Create Table | Databricks on AWS

  • Avro usage: AvroSerDe - Apache Hive - Apache Software Foundation

Specifying the file type

  • Option 1: specify the storage format

stored as avro

  • Option 2: specify the SerDe and input/output format classes

-- SerDe class used to parse (serialize and deserialize) the table's files
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
-- InputFormat class used to read this table's data
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
-- OutputFormat class used to write this table's data
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
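
The two forms have the same effect: `stored as avro` is shorthand, available since Hive 0.14, for the three classes listed above. As a minimal sketch, assuming Hive 0.14 or later, the short form can also be combined with an ordinary column list, in which case Hive derives the Avro schema from the columns. The table name `avro_demo` and its columns are hypothetical and serve only as an illustration.

```sql
-- Hypothetical table and columns, for illustration only.
-- With STORED AS AVRO (Hive 0.14+), no avro.schema.* property is needed here:
-- the Avro schema is derived from the column definitions.
CREATE TABLE avro_demo (
  id   INT,
  name STRING
)
COMMENT 'Avro table declared with the short form'
STORED AS AVRO;
```

On Hive versions older than 0.14 the short form is not available, so the explicit SerDe, InputFormat, and OutputFormat classes would have to be spelled out.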

Specifying the schema

  • Option 1: define the schema inline

CREATE TABLE embedded
COMMENT "This is the table comment"
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES (
  'avro.schema.literal'='{
    "namespace": "com.howdy",
    "name": "some_schema",
    "type": "record",
    "fields": [ { "name":"string1","type":"string"}]
  }'
);
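
Real tables usually contain optional columns. In Avro, a nullable field is declared as a union with `"null"`, and a `null` default requires `"null"` to come first in the union. The sketch below extends the inline-schema example under that assumption; the table name `embedded_nullable` and the field `int1` are hypothetical.

```sql
-- Hypothetical variant of the inline-schema table with one nullable field.
CREATE TABLE embedded_nullable
COMMENT "Inline Avro schema with a nullable field"
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES (
  'avro.schema.literal'='{
    "namespace": "com.howdy",
    "name": "some_schema",
    "type": "record",
    "fields": [
      { "name": "string1", "type": "string" },
      { "name": "int1", "type": ["null", "int"], "default": null }
    ]
  }'
);
```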

  • Option 2: load the schema from a file

CREATE TABLE embedded
COMMENT "This is the table comment"
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES (
 'avro.schema.url'='file:///path/to/the/schema/embedded.avsc'
);
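
Because the schema location is stored as a table property, it can be inspected or repointed without recreating the table. A minimal sketch, assuming the `embedded` table above already exists; the `embedded_v2.avsc` path is a hypothetical placeholder.

```sql
-- Show which schema file the table currently points to.
SHOW TBLPROPERTIES embedded ('avro.schema.url');

-- Point the table at a different schema file (hypothetical path).
ALTER TABLE embedded SET TBLPROPERTIES (
  'avro.schema.url'='file:///path/to/the/schema/embedded_v2.avsc'
);

-- Check the columns Hive now derives from the schema.
DESCRIBE embedded;
```

On a cluster, an `hdfs://` URL is usually the safer choice, since a `file://` path has to be readable on every node that needs the schema.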


Table creation syntax

  • Option 1: specify the storage format and load the schema file

create external table one_make_ods_test.ciss_base_areas
comment 'Administrative region table'
PARTITIONED BY (dt string)
stored as avro
location '/data/dw/ods/one_make/full_imp/ciss4.ciss_base_areas'
TBLPROPERTIES ('avro.schema.url'='/data/dw/ods/one_make/avsc/CISS4_CISS_BASE_AREAS.avsc');

  • Option 2: specify the SerDe classes and load the schema file

create external table one_make_ods_test.ciss_base_areas
comment 'Administrative region table'
PARTITIONED BY (dt string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
location '/data/dw/ods/one_make/full_imp/ciss4.ciss_base_areas'
TBLPROPERTIES ('avro.schema.url'='/data/dw/ods/one_make/avsc/CISS4_CISS_BASE_AREAS.avsc');

General form, with placeholders in angle brackets:

create external table <database_name>.<table_name>
comment '<table comment>'
partitioned by (<partition_column> <type>)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
location '<HDFS path of this table>'
TBLPROPERTIES ('avro.schema.url'='<HDFS path of the schema file for this table>');
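
Since `one_make_ods_test.ciss_base_areas` is external and partitioned by `dt`, files placed under its location are not visible to queries until the corresponding partition is registered in the metastore. A sketch of that step follows; the partition value `20210101` and the per-day subdirectory layout are assumptions, not taken from the original.

```sql
-- Register one partition and point it at the directory holding that day's Avro files.
-- The date value and the subdirectory name are hypothetical.
ALTER TABLE one_make_ods_test.ciss_base_areas
  ADD IF NOT EXISTS PARTITION (dt='20210101')
  LOCATION '/data/dw/ods/one_make/full_imp/ciss4.ciss_base_areas/20210101';

-- Confirm the partition is registered, then sample a few rows.
SHOW PARTITIONS one_make_ods_test.ciss_base_areas;
SELECT * FROM one_make_ods_test.ciss_base_areas WHERE dt='20210101' LIMIT 5;
```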
