Sometimes it is necessary to write Hive query results to gzip files to reduce the file size, so that the files can be transferred over the network.
To do this, set the following options in Hive before running the query. The code below sets these options and then runs the Hive query, whose output is stored as gzip files.
set mapred.output.compress=true;
set hive.exec.compress.output=true;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
set io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec;
INSERT OVERWRITE DIRECTORY 'hive_out' select * from tables limit 10000;
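Once the gzip files have been transferred, they can be read back without decompressing them by hand. The following is a minimal Python sketch, not part of the original recipe. It assumes the contents of 'hive_out' were copied to a local directory (for example with hadoop fs -get), that the part files end in .gz, and that the query used Hive's default Ctrl-A (\x01) field delimiter. The function name read_hive_gzip_output is made up for illustration.

```python
import gzip
from pathlib import Path

# Assumption: the query wrote rows with Hive's default field delimiter, Ctrl-A.
HIVE_DEFAULT_DELIMITER = "\x01"


def read_hive_gzip_output(local_dir, delimiter=HIVE_DEFAULT_DELIMITER):
    """Yield each row as a list of string fields from every .gz file in local_dir."""
    # Sort the part files so rows come back in a stable, repeatable order.
    for part in sorted(Path(local_dir).glob("*.gz")):
        # "rt" mode decompresses and decodes the file as text in one step.
        with gzip.open(part, "rt", encoding="utf-8") as handle:
            for line in handle:
                yield line.rstrip("\n").split(delimiter)
```

For example, `for row in read_hive_gzip_output("hive_out"): print(row)` prints one list of column values per result row. If the query wrote its output with a different delimiter, pass it as the second argument.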