This article introduces the Hadoop file handling commands, which are run from the console to work with files. With these commands you can upload, download, search and list files. You can ssh to your Hadoop box and run them there. These commands are important, and you should learn them before moving on to the advanced concepts of the Hadoop Big Data platform.
Hadoop is a distributed file system and computation platform for handling Big Data. Hadoop has its own file system, known as HDFS (Hadoop Distributed File System), which manages the files in a Hadoop cluster. You can use the commands below to copy, download (get), view, list, delete and move files on HDFS.
Here are the usage and explanations of the Hadoop shell commands:
1. Become hdfs user
To work with HDFS correctly you should log in to your Hadoop cluster and then change user to hdfs. On a typical Linux installation where the hdfs account exists, a command such as the following does this:
sudo su - hdfs
2. Create Directory command
First we will see the command for creating a directory on HDFS. The following command creates a directory in HDFS:
hadoop fs -mkdir <paths>
Here <paths> is the path of the directory to create. For example:
hadoop fs -mkdir /test
hadoop fs -mkdir /test/usr
hadoop fs -mkdir /test/usr/deepak
The three commands above create the test, usr and deepak directories one by one. Used this way, the command creates one directory at a time, and each parent directory must exist before its child is created.
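Hadoop 2 and later also accept a -p option on -mkdir, which creates any missing parent directories in a single call. The sketch below wraps it in a small shell function. The name make_hdfs_dirs is hypothetical (it is not a Hadoop command), and the function assumes the hadoop client is on your PATH.

```shell
# Create one or more HDFS directories, including missing parents.
# make_hdfs_dirs is an illustrative helper name, not part of Hadoop.
make_hdfs_dirs() {
    if [ "$#" -eq 0 ]; then
        echo "usage: make_hdfs_dirs <path>..." >&2
        return 1
    fi
    # -p creates parent directories as needed and does not fail
    # when a directory already exists.
    hadoop fs -mkdir -p "$@"
}

# Example call (needs a running cluster):
#   make_hdfs_dirs /test/usr/deepak
```

With this helper, the three mkdir commands from the example collapse into one call.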
3. List content of a directory
Next is the command for listing the content of a directory:
hadoop fs -ls <args>
To list the content of the HDFS root directory, use the following command:
hadoop fs -ls /
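The -ls command also accepts -R to descend into subdirectories. As a small illustration, the hypothetical helper below (count_hdfs_files is not part of Hadoop) counts the regular files under a path. It assumes the usual -ls output layout, in which every entry line begins with a permission string.

```shell
# Count regular files below an HDFS path.
# count_hdfs_files is an illustrative helper name, not part of Hadoop.
count_hdfs_files() {
    # -R lists the directory tree recursively.
    # Entry lines start with a permission string: 'd' marks a directory,
    # '-' marks a regular file, so only file lines match '^-'.
    hadoop fs -ls -R "$1" | grep -c '^-'
}

# Example call (needs a running cluster):
#   count_hdfs_files /test
```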
4. Upload file in HDFS
HDFS is a file system, so you can upload files to it and download files from it. The following command uploads a file:
hadoop fs -put <localfile> <HDFS_dest_Path>
hadoop fs -put test.txt /test/usr/deepak/
The above command uploads test.txt from the current local directory to the /test/usr/deepak/ directory in HDFS.
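By default -put refuses to overwrite a file that already exists at the destination. Hadoop 2 and later accept -f to overwrite it. The following is a minimal sketch, assuming a hypothetical helper name upload_to_hdfs and a hadoop client on the PATH. It also creates the target directory first so the upload does not fail on a missing directory.

```shell
# Upload a local file into an HDFS directory, replacing any old copy.
# upload_to_hdfs is an illustrative helper name, not part of Hadoop.
upload_to_hdfs() {
    src="$1"
    dest_dir="$2"
    # Fail early if the local file is missing.
    if [ ! -f "$src" ]; then
        echo "no such local file: $src" >&2
        return 1
    fi
    # Make sure the target directory exists, then upload,
    # overwriting an existing file of the same name (-f).
    hadoop fs -mkdir -p "$dest_dir" &&
        hadoop fs -put -f "$src" "$dest_dir"
}

# Example call (needs a running cluster):
#   upload_to_hdfs test.txt /test/usr/deepak/
```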
5. Download file from HDFS
The following command downloads a file from HDFS:
hadoop fs -get <hdfs_src> <localdst>
The following command downloads test.txt from HDFS to the current local directory:
hadoop fs -get /test/usr/deepak/test.txt ./
You can also give a full local directory path if you want to download into a different directory.
6. Viewing content of a file
The following command displays the content of a file stored in HDFS:
hadoop fs -cat <path[filename]>
The following command displays the content of test.txt on the console:
hadoop fs -cat /test/usr/deepak/test.txt
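-cat writes the whole file to standard output, which is unwieldy for large files. Because the output is an ordinary stream, it can be piped into standard Unix tools. The sketch below shows only the first lines of a file. The helper name preview_hdfs_file is hypothetical.

```shell
# Show the first N lines of an HDFS file (10 when N is omitted).
# preview_hdfs_file is an illustrative helper name, not part of Hadoop.
preview_hdfs_file() {
    # head stops reading after the requested number of lines.
    hadoop fs -cat "$1" | head -n "${2:-10}"
}

# Example call (needs a running cluster):
#   preview_hdfs_file /test/usr/deepak/test.txt 5
```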
7. Copy file from one directory to another in HDFS
First create a new directory in HDFS to copy into:
hadoop fs -mkdir /test/usr/deepak2
The following command copies a file within HDFS:
hadoop fs -cp <source> <dest>
Now use the following command to copy test.txt from the /test/usr/deepak directory to the /test/usr/deepak2 directory:
hadoop fs -cp /test/usr/deepak/test.txt /test/usr/deepak2/test.txt
8. Copy file from Local file system to HDFS and copy file from HDFS to Local file system
These commands are similar to the put and get commands discussed above, except that the source of copyFromLocal and the destination of copyToLocal are restricted to a local file reference.
Example of copying to HDFS:
hadoop fs -copyFromLocal /home/deepak/test.txt /test/usr/deepak2/test.txt
Syntax of command:
hadoop fs -copyFromLocal <localsrc> URI
Downloading from HDFS:
hadoop fs -copyToLocal /test/usr/deepak2/test.txt /home/deepak/test.txt
Syntax of command:
hadoop fs -copyToLocal [-ignorecrc] [-crc] URI <localdst>
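One way to convince yourself that the two commands are inverses is to upload a file, download it again and compare the two local copies. The sketch below does that with cmp. The helper name verify_hdfs_roundtrip is hypothetical, and the upload step assumes the HDFS path does not exist yet, because copyFromLocal without extra options does not overwrite.

```shell
# Upload a local file, download it again and compare both copies.
# verify_hdfs_roundtrip is an illustrative helper name, not part of Hadoop.
verify_hdfs_roundtrip() {
    local_file="$1"
    hdfs_path="$2"
    tmp_dir="$(mktemp -d)" || return 1
    status=0
    # Upload, download to a scratch location, then compare byte for byte.
    hadoop fs -copyFromLocal "$local_file" "$hdfs_path" &&
        hadoop fs -copyToLocal "$hdfs_path" "$tmp_dir/copy" &&
        cmp -s "$local_file" "$tmp_dir/copy" || status=1
    # Remove the scratch directory whether or not the check passed.
    rm -rf "$tmp_dir"
    return "$status"
}

# Example call (needs a running cluster):
#   verify_hdfs_roundtrip /home/deepak/test.txt /test/usr/deepak2/check.txt
```

The function returns 0 when the downloaded copy is identical to the original and 1 when any step fails.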
9. Move file command
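The following command moves a file from one HDFS directory to another. Both paths refer to HDFS, so in the example the destination directory /home/deepak2/ must already exist in HDFS: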
hadoop fs -mv <src> <dest>
hadoop fs -mv /test/usr/deepak2/test.txt /home/deepak2/
10. Removing file/directory - command for deleting file or directory in HDFS
Syntax of the command:
hadoop fs -rm <arg>
The following command deletes the file /test/usr/deepak2/test.txt:
hadoop fs -rm /test/usr/deepak2/test.txt
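On Hadoop 2 and later, plain -rm does not delete directories. Use hadoop fs -rmdir <path> to delete an empty directory, or hadoop fs -rm -r <path> to delete a directory together with its contents.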
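A recursive delete is easy to run on the wrong path, so it can help to check what the path is first. The sketch below uses hadoop fs -test -d, which exits with status 0 only for a directory, before calling -rm -r. The helper name remove_hdfs_dir is hypothetical, and the -r option assumes Hadoop 2 or later.

```shell
# Delete an HDFS directory and its contents after checking it is a directory.
# remove_hdfs_dir is an illustrative helper name, not part of Hadoop.
remove_hdfs_dir() {
    # -test -d exits with status 0 only when the path is a directory.
    if ! hadoop fs -test -d "$1"; then
        echo "not an HDFS directory: $1" >&2
        return 1
    fi
    # -rm -r deletes the directory and everything beneath it.
    # When trash is enabled, the data is moved to the trash instead.
    hadoop fs -rm -r "$1"
}

# Example call (needs a running cluster):
#   remove_hdfs_dir /test/usr/deepak2
```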
11. tail command to view the end of a file
In HDFS you can also use the tail command, which displays the last kilobyte of a file:
hadoop fs -tail <path[filename]>
hadoop fs -tail /test/usr/deepak2/test.txt
12. Disk usage of a file
The following command shows the disk usage of a file:
hadoop fs -du <path>
hadoop fs -du /test/usr/deepak2/test.txt
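On Hadoop 2 and later, -du accepts -s to print one summary line for a path instead of one line per child, and -h to print human-readable sizes. As a sketch, the hypothetical helper below (hdfs_size_bytes is not part of Hadoop) extracts just the size in bytes, assuming the size is the first field of the summary line.

```shell
# Print the total size in bytes of an HDFS file or directory.
# hdfs_size_bytes is an illustrative helper name, not part of Hadoop.
hdfs_size_bytes() {
    # -s prints a single summary line for the path.
    # The first field of that line is taken to be the size in bytes.
    hadoop fs -du -s "$1" | awk '{ print $1 }'
}

# Example call (needs a running cluster):
#   hdfs_size_bytes /test/usr/deepak2
```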
13. Finding hadoop version
The following command prints the version of Hadoop installed on your system:
hadoop version
14. Running cluster balancer utility
The following command runs the cluster balancer, which spreads blocks more evenly across the DataNodes (older releases use hadoop balancer instead):
hdfs balancer
15. Emptying the trash in Hadoop
Use the following command:
hadoop fs -expunge
16. File/Directory change mode command
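The following is an example of the chmod command in Hadoop. It runs as the hdfs user through sudo and sets the permissions of test.txt to 600 (read and write for the owner only):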
sudo -u hdfs hadoop fs -chmod 600 /test/usr/deepak2/test.txt
In this tutorial we learned about the important commands of Hadoop HDFS.
Posted on: May 13, 2017