Hadoop File Decompression

杨小羊_ba17 2022-02-17 167 views


Class

org.apache.hadoop.io.compress.CompressionCodecFactory

A factory that will find the correct codec for a given filename.

Method

CompressionCodec getCodec(Path file)

Find the relevant compression codec for the given file based on its filename suffix.

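In other words, it determines which compression algorithm the compressed data file uses. If no registered codec matches the suffix, it returns null.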
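The suffix-based lookup can be illustrated with a small stdlib-only sketch. This is not Hadoop's implementation: the class `SuffixLookup` and its methods are hypothetical stand-ins, and the table only lists the default extensions of three built-in codecs (`.gz`, `.bz2`, `.deflate`). The real factory builds its table from the codecs configured in `Configuration`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for suffix-based codec lookup; not Hadoop's implementation. */
final class SuffixLookup {

    // Suffix -> codec class name (default extensions of three built-in Hadoop codecs).
    private static final Map<String, String> CODECS = new LinkedHashMap<>();
    static {
        CODECS.put(".gz", "org.apache.hadoop.io.compress.GzipCodec");
        CODECS.put(".bz2", "org.apache.hadoop.io.compress.BZip2Codec");
        CODECS.put(".deflate", "org.apache.hadoop.io.compress.DefaultCodec");
    }

    /** Returns the codec name whose suffix ends the filename, or null if none matches. */
    static String codecFor(String filename) {
        for (Map.Entry<String, String> entry : CODECS.entrySet()) {
            if (filename.endsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return null;
    }

    /** Strips the suffix if present; otherwise returns the filename unchanged. */
    static String removeSuffix(String filename, String suffix) {
        return filename.endsWith(suffix)
                ? filename.substring(0, filename.length() - suffix.length())
                : filename;
    }

    public static void main(String[] args) {
        System.out.println(codecFor("/liguodong/data.gz"));
        System.out.println(codecFor("/liguodong/data.txt"));
        System.out.println(removeSuffix("/liguodong/data.gz", ".gz"));
    }
}
```

The null result for an unknown suffix is the case to watch for: calling a method on the returned codec without checking it first would throw a NullPointerException.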

package Compress;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.apache.hadoop.io.compress.CompressionInputStream;

/**
 * Decompresses a local file, choosing the codec from the filename suffix.
 * @author liguodong
 */
public class Decompression {

    static final String file = "/liguodong/data.gz";

    public static void main(String[] args) throws IOException {

        Configuration conf = new Configuration();

        CompressionCodecFactory codecFactory = new CompressionCodecFactory(conf);
        // Look up the codec that matches the filename suffix
        CompressionCodec codec = codecFactory.getCodec(new Path(file));
        if (codec == null) {
            // getCodec returns null when no registered codec matches the suffix
            throw new IOException("No compression codec found for " + file);
        }
        // Wrap the raw file stream so that reads return decompressed bytes
        CompressionInputStream inputStream = codec.createInputStream
                (new FileInputStream(new File(file)));
        // Write the decompressed data to a file named after the input, minus its extension
        FileOutputStream outputStream = new FileOutputStream
                (new File(CompressionCodecFactory.removeSuffix(file, codec.getDefaultExtension())));
        // This overload of copyBytes closes both streams when it finishes
        IOUtils.copyBytes(inputStream, outputStream, conf);

    }
}
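For comparison, the same three steps (strip the suffix, wrap the input in a decompressing stream, copy the bytes) can be sketched without Hadoop for the `.gz` case only, using `java.util.zip.GZIPInputStream`. The class `GunzipSketch` and its `gunzip` method are hypothetical names chosen for illustration, and `Path` here is `java.nio.file.Path`, not Hadoop's `org.apache.hadoop.fs.Path`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;

/** Hypothetical stdlib-only counterpart of the program above; handles .gz files only. */
final class GunzipSketch {

    /** Decompresses gzFile next to itself, dropping the ".gz" suffix, and returns the output path. */
    static Path gunzip(Path gzFile) throws IOException {
        String name = gzFile.toString();
        if (!name.endsWith(".gz")) {
            throw new IllegalArgumentException("Not a .gz file: " + name);
        }
        Path target = Paths.get(name.substring(0, name.length() - ".gz".length()));
        // GZIPInputStream plays the role of codec.createInputStream(...)
        try (InputStream in = new GZIPInputStream(Files.newInputStream(gzFile));
             OutputStream out = Files.newOutputStream(target)) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: GunzipSketch <file.gz>");
            return;
        }
        System.out.println(gunzip(Paths.get(args[0])));
    }
}
```

Unlike the Hadoop version, this sketch is tied to a single format. The point of `CompressionCodecFactory` is that the same code path also handles other registered codecs without changes.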

Package it as a jar: Decodec.jar

[root@master liguodong]# yarn jar Decodec.jar
15/06/05 21:54:25 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
[root@master liguodong]# ll
total 524824
-rw-r--r-- 1 root root 1492 Jun 5 19:47 codec.jar
-rw-r--r-- 1 root root 536870912 Jun 5 21:54 data
-rw-r--r-- 1 root root 521844 Jun 5 21:40 data.gz


