[Spark Speed-Up] Increase the size of the split files a Hive table stores on HDFS

鱼满舱 2024-05-27 14 views

Configuration parameters:
spark.hadoop.hive.exec.orc.default.stripe.size=78643200
spark.hadoop.orc.stripe.size=78643200
spark.hadoopRDD.targetBytesInPartition=78643200
spark.hadoop.hive.exec.dynamic.partition.mode=nonstrict
spark.sql.sources.partitionOverwriteMode=dynamic
spark.sql.hive.convertMetastoreOrc=true
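
As a rough illustration, the settings above can be kept in one place and rendered as `spark-submit` arguments. This is only a sketch: the names `SPARK_CONF`, `MIB`, and `to_submit_args` are made up for this example, and the values are copied unchanged from the list. It also shows that 78643200 bytes is exactly 75 MiB.

```python
# Settings copied verbatim from the list above.
SPARK_CONF = {
    "spark.hadoop.hive.exec.orc.default.stripe.size": "78643200",
    "spark.hadoop.orc.stripe.size": "78643200",
    "spark.hadoopRDD.targetBytesInPartition": "78643200",
    "spark.hadoop.hive.exec.dynamic.partition.mode": "nonstrict",
    "spark.sql.sources.partitionOverwriteMode": "dynamic",
    "spark.sql.hive.convertMetastoreOrc": "true",
}

MIB = 1024 * 1024  # bytes in one mebibyte


def to_submit_args(conf):
    """Render a settings dict as spark-submit --conf arguments (hypothetical helper)."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


# The three size settings share one value: 75 * 1024 * 1024 bytes.
STRIPE_SIZE_MIB = int(SPARK_CONF["spark.hadoop.orc.stripe.size"]) // MIB
```

The list returned by `to_submit_args(SPARK_CONF)` can be appended to a `spark-submit` command line; setting the same keys on a `SparkSession` builder would be an equivalent route.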

In your code, remove calls such as .repartition(5000) that come right before the table write.

The spark.sql.shuffle.partitions=5000 setting must be removed as well.
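
A minimal sketch of what dropping the `.repartition(5000)` call might look like in PySpark. It assumes the job writes with `DataFrameWriter.insertInto`, which the original post does not state; `write_table` and `table_name` are hypothetical names, and `df` is assumed to be a `pyspark.sql.DataFrame`. A fixed `.repartition(5000)` pins the number of write tasks at 5000 no matter how much data there is, which is presumably why it works against larger files.

```python
def write_table(df, table_name):
    """Write df into an existing Hive table without forcing a partition count.

    Assumptions: df is a pyspark.sql.DataFrame and table_name already exists.
    """
    # Before (to be removed):
    #   df.repartition(5000).write.insertInto(table_name, overwrite=True)
    # After: no explicit repartition ahead of the write.
    df.write.insertInto(table_name, overwrite=True)
```

With `spark.sql.shuffle.partitions` no longer set to 5000, Spark falls back to its default value of 200 for that setting.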
