0
点赞
收藏
分享

微信扫一扫

hive sql 优化记录

Raow1 2022-04-08 阅读 116

在调度流程执行过程中发现流程的执行时间随着时间的增加,执行时间也在逐步变长,于是决定对流程的SQL进行优化处理:

原SQL如下:

SELECT 

    kuscode,

    gstore_name,

    from_unixtime(cast(store_open_date as BIGINT),'yyyy-MM-dd') store_open_date,

    round(sum(if(member_ascribe_uscode = uscode,store_amount - tax_total_amount,0)),2) store_amount,

    count(distinct if(member_ascribe_uscode = uscode,mid,'')) mid_count,

  from tmp_order_sapply_g

where bi_dt >= '20220101'

                  and store_open_date >= date_sub(current_timestamp() ,37)

              and store_open_date <> '9999-12-31 00:00:00'   

    and order_time < date_add(store_open_date,30)   

   GROUP BY uscode,store_name,store_open_date

优化后的SQL:

SELECT 

    kuscode,

    gstore_name,

    from_unixtime(cast(store_open_date as BIGINT),'yyyy-MM-dd') store_open_date,

    round(sum(if(member_ascribe_uscode = uscode,store_amount - tax_total_amount,0)),2) store_amount,

    count(distinct if(member_ascribe_uscode = uscode,mid,'')) mid_count,

    from tmp_order_sapply_g

where  bi_dt >= regexp_replace(cast(date_sub(current_timestamp() ,40) as STRING),'-','')

               and store_open_date >= date_sub(current_timestamp() ,37)

              and store_open_date <> '9999-12-31 00:00:00'   

    and order_time < date_add(store_open_date,30)   

GROUP BY uscode,store_name,store_open_date

主要是针对分区键进行范围限定。

举报

相关推荐

0 条评论