Essential Linux commands for data analysis

ssh
connect to a Unix server
ssh username@hostname

pwd
prints the current working directory
pwd

ls
list directory contents
ls
ls ../another/directory

cd
change directory
move up one directory
cd ../

return to home directory
cd ~

move to a specific directory
cd /path/to/move/to
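
A minimal sketch of pwd, ls, and cd working together. The scratch directory and the names project, data, a.txt, and b.txt are made up for illustration, so nothing real is touched.

```shell
# Remember where we started so we can come back at the end.
start=$PWD

# Scratch tree with hypothetical names.
scratch=$(mktemp -d)
mkdir -p "$scratch/project/data"
touch "$scratch/project/data/a.txt" "$scratch/project/data/b.txt"

cd "$scratch/project/data"
here=$(basename "$(pwd)")      # name of the current directory
listing=$(ls)                  # contents of the current directory

cd ../                         # up one level, into project/
parent=$(basename "$(pwd)")
sibling=$(ls data)             # list a directory without entering it

echo "$here is inside $parent"

# Clean up and return to the starting directory.
cd "$start"
rm -r "$scratch"
```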

Reading files

cat/zcat
print a file to screen (zcat does the same for a gzip-compressed file)

cat myfile.txt
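
To see the difference between the two, here is a small sketch that compresses a throwaway file and reads it back. The file contents are invented, and it assumes the GNU/Linux zcat, which reads .gz files.

```shell
scratch=$(mktemp -d)
printf 'alpha\nbeta\n' > "$scratch/myfile.txt"

# cat prints the file as-is.
plain=$(cat "$scratch/myfile.txt")

# gzip -c writes the compressed data to stdout, so the original is kept.
gzip -c "$scratch/myfile.txt" > "$scratch/myfile.txt.gz"

# zcat decompresses to the screen without writing a file to disk.
unzipped=$(zcat "$scratch/myfile.txt.gz")

echo "$unzipped"
rm -r "$scratch"
```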

less
view a file one page at a time (press q to quit)

less myfile.txt

head/tail
print the first/last n lines of a file (10 by default)

head -n 15 myfile.txt
tail -n 15 myfile.txt
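
A quick sketch on a generated 20-line file (the file is a stand-in, not real data). It uses -n, the portable spelling of the line count, and shows the less obvious "tail -n +N" form, which prints from line N to the end.

```shell
scratch=$(mktemp -d)
seq 1 20 > "$scratch/myfile.txt"    # one number per line, 1 through 20

first=$(head -n 3 "$scratch/myfile.txt")      # first 3 lines
last=$(tail -n 2 "$scratch/myfile.txt")       # last 2 lines
from19=$(tail -n +19 "$scratch/myfile.txt")   # line 19 onward (note the plus sign)

printf '%s\n' "$first" "$last"
rm -r "$scratch"
```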

Manipulating files & directories

nano [1]
basic text editor

mkdir
create a directory

mkdir new_folder

create nested directories
mkdir -p new_folder1/new_folder2

cp
copy a file
cp file_to_copy.txt copy_file_name.txt

mv
move (or rename) a file
mv resum.txt resume.txt
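
The commands so far in this section, run end to end in a scratch directory. This is a sketch only: backup.txt and the file contents are hypothetical, and everything is deleted at the end.

```shell
scratch=$(mktemp -d)

mkdir -p "$scratch/new_folder1/new_folder2"   # parents are created as needed
mkdir -p "$scratch/new_folder1/new_folder2"   # with -p, running it again is not an error

echo "draft" > "$scratch/resum.txt"
cp "$scratch/resum.txt" "$scratch/backup.txt"      # copy: the original stays
mv "$scratch/resum.txt" "$scratch/resume.txt"      # rename: the old name is gone
mv "$scratch/resume.txt" "$scratch/new_folder1/"   # move into a directory, keeping the name

remaining=$(ls "$scratch")
moved=$(ls "$scratch/new_folder1")
echo "$remaining"
echo "$moved"
rm -r "$scratch"
```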

rm
delete files or directories (permanent: there is no trash to restore from)

delete a single file
rm myfile.txt

recursively delete a directory and everything in it
rm -r mydirectory
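
A small sketch of why the -r flag matters: plain rm refuses to remove a directory, while rm -r removes it along with its contents. All names live in a scratch directory and inner.txt is made up.

```shell
scratch=$(mktemp -d)
mkdir "$scratch/mydirectory"
touch "$scratch/myfile.txt" "$scratch/mydirectory/inner.txt"

rm "$scratch/myfile.txt"          # single file: plain rm is enough

# Plain rm on a directory fails; record the outcome instead of aborting.
if rm "$scratch/mydirectory" 2>/dev/null; then
    plain_rm=ok
else
    plain_rm=refused
fi

rm -r "$scratch/mydirectory"      # -r removes the directory and its contents

# Count what is left in the scratch directory (tr strips any padding from wc).
left=$(ls -A "$scratch" | wc -l | tr -d ' ')
echo "plain rm: $plain_rm, entries left: $left"
rmdir "$scratch"
```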

Manipulating data

cut
slice a column from a delimited file

print the second column from a comma-delimited file
cut -d, -f2 myfile.csv
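
A sketch on an invented three-row CSV (people.csv and its contents are hypothetical). Note that cut splits on every comma, so it is only safe for files without quoted fields that contain commas.

```shell
scratch=$(mktemp -d)
cat > "$scratch/people.csv" <<'EOF'
name,city,age
ana,Lima,31
bo,Oslo,27
EOF

second=$(cut -d, -f2 "$scratch/people.csv")          # second column only
first_third=$(cut -d, -f1,3 "$scratch/people.csv")   # columns 1 and 3, still comma-separated

echo "$second"
echo "$first_third"
rm -r "$scratch"
```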

wc
count characters, words, or lines

count lines in myfile.txt
wc -l myfile.txt
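
When the count feeds into a script, it helps to get the bare number without the file name. One way, sketched on a throwaway two-line file, is to read from stdin:

```shell
scratch=$(mktemp -d)
printf 'one two\nthree\n' > "$scratch/myfile.txt"

# Redirecting with < leaves the file name out of the output;
# tr strips the padding that some wc implementations add.
lines=$(wc -l < "$scratch/myfile.txt" | tr -d ' ')
words=$(wc -w < "$scratch/myfile.txt" | tr -d ' ')
bytes=$(wc -c < "$scratch/myfile.txt" | tr -d ' ')

echo "$lines lines, $words words, $bytes bytes"
rm -r "$scratch"
```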

sort
sort a file

sort file in reverse numerical order of the fourth column
sort -nr -k4,4 myfile.txt
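
A sketch on invented whitespace-separated data (scores.txt is hypothetical). The key is written as -k4,4 here because -k 4 on its own makes the key run from the fourth column to the end of the line; with four columns the result is the same. The second call drops -n to show why the numeric flag matters.

```shell
scratch=$(mktemp -d)
cat > "$scratch/scores.txt" <<'EOF'
a x 1 9
b y 2 100
c z 3 25
EOF

# -k4,4 limits the key to the fourth column; n = numeric, r = reverse.
sorted=$(sort -k4,4nr "$scratch/scores.txt")

# Without n the comparison is textual, so "9" ranks above "100".
textual_top=$(sort -k4,4r "$scratch/scores.txt" | head -n 1)

echo "$sorted"
rm -r "$scratch"
```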

Misc commands

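find
search a directory tree for files by name, type, size, or age

grep
print the lines of a file that match a pattern

wget
download a file from a URL

gzip/gunzip
compress/decompress a file

tar
bundle files into a single archive, or extract one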
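
A rough sketch of typical first uses of find, grep, gzip/gunzip, and tar, run in a scratch directory with made-up files (app.log, notes.txt). wget is left out because it needs a network connection and a real URL.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/logs"
printf 'ok\nerror: disk full\nok\n' > "$scratch/logs/app.log"
echo "notes" > "$scratch/notes.txt"

# find: regular files under a directory whose names match a pattern
found=$(find "$scratch" -type f -name '*.log')

# grep: lines containing a pattern; -c counts matching lines instead
hits=$(grep 'error' "$scratch/logs/app.log")
ok_count=$(grep -c 'ok' "$scratch/logs/app.log")

# gzip replaces the file with a .gz version; gunzip reverses it
gzip "$scratch/notes.txt"
gz_name=$(ls "$scratch" | grep 'notes')
gunzip "$scratch/notes.txt.gz"

# tar: c = create, z = gzip, f = archive name; -C runs from that directory
tar -czf "$scratch/logs.tar.gz" -C "$scratch" logs
members=$(tar -tzf "$scratch/logs.tar.gz")    # t = list contents

echo "$members"
rm -r "$scratch"
```

To unpack an archive like this one, the matching command is tar -xzf logs.tar.gz (x = extract).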


1. I prefer Vim, but the learning curve is very steep.