File, Network and Revision Control Utilities
File Utilities
Batch processing
find
- find files/directories/named pipes/etc.
- See examples here.
  - The -exec option allows running a command on the find results.
  - Not everything is possible with find's -exec option.
  - A shell for loop may be more appropriate instead.

- Quickly locate a file on a local filesystem.
  - Use locate.
- If set up on the system, can find files rapidly using locate filename.
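As a minimal sketch of the -exec versus shell loop point above: the first find runs one command per result with -exec, while the loop handles a task that needs shell logic for each file (renaming with parameter expansion). The scratch directory and file names are hypothetical, and the loop assumes file names without newlines.

```shell
# Scratch directory with made-up file names, for illustration only
demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/b.txt" "$demo/notes.md"

# -exec: run one command (wc -c) on every result; {} is replaced by the path
find "$demo" -type f -name '*.txt' -exec wc -c {} \;

# Shell loop: per-file logic that -exec alone cannot express easily,
# here renaming each .txt file to .bak using parameter expansion
find "$demo" -type f -name '*.txt' | while IFS= read -r f; do
    mv "$f" "${f%.txt}.bak"
done

ls "$demo"
```

After the loop, the directory holds a.bak, b.bak and the untouched notes.md.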
 
xargs
- execute commands on a collection of arguments
- See examples here.
- Also see examples in the xargs man page.
- Often used at the back end of a find command because it's more capable than find's -exec option.
# Find all hidden files (files that begin with period ".*"),
# starting in the current dir, and then use xargs to run "ls -l"
# on all the find results.
$ find . -type f -name '.*' | xargs ls -l
# Same output as 
# $ find . -type f -name '.*' -exec ls -l {} \;
# List all users logged in and run finger on each userid
$ w -h | awk '{ print $1 }' | sort | uniq | xargs finger
#
# w -h               - list logged in users and associated info, excluding header (-h)
# awk '{ print $1 }' - print only userid from w output
# sort | uniq        - reduce duplicates
# xargs finger       - run finger on list of users from uniq
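One caveat with the find | xargs pipeline above is that xargs splits its input on whitespace, so a file name containing a space is treated as two arguments. A possible workaround, assuming a find and xargs that support -print0 and -0 (GNU and BSD versions do), is sketched below with hypothetical file names.

```shell
# Scratch directory with made-up hidden files, one containing a space
demo=$(mktemp -d)
touch "$demo/.hidden one" "$demo/.hidden_two" "$demo/visible"

# -print0 ends each path with a NUL byte; xargs -0 splits on NUL,
# so the space in ".hidden one" does not split it into two arguments
count=$(find "$demo" -type f -name '.*' -print0 | xargs -0 ls -l | wc -l)
echo "ls -l listed $count hidden files"
```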
Archiving and compression
tar
- Tape Archive (See article.)
- Archive file/dirs, preserving file/dir attributes.
  - Usually operates on directories, not files.
 
- Common options:
        c - create archive
        t - view existing archive
        v - operate verbosely
        x - extract archive
        f - use the specified tar file ( or use "-" to send tar'ed
            files to stdout ) 
- Examples:
# Create tar archive (cs370-${USER}.tar) of your ~/cs370 directory in /tmp:
# Dashes ("-") are optional for tar options.
$ tar cvf /tmp/cs370-${USER}.tar ~/cs370  # "tar -cvf ..." does the same thing
# Change to /tmp
$ cd /tmp
# "tar tvf cs370-${USER}.tar" shows the contents of cs370-${USER}.tar.
# "tar xvf cs370-${USER}.tar" extracts the contents of cs370-${USER}.tar to
# the CURRENT directory. (BE CAREFUL.)
- Since .tar archives are not compressed, tar is often used in combination with a file compressor such as gzip, bzip2 or xz. See examples below.
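Regarding the warning above about extracting into the current directory: one cautious pattern is to list the archive first and then extract into a fresh directory with tar's -C option. The sketch below uses a hypothetical scratch directory rather than your real ~/cs370.

```shell
# Scratch area standing in for a home directory (hypothetical paths)
work=$(mktemp -d)
mkdir -p "$work/cs370/examples"
echo "hello" > "$work/cs370/examples/hello.txt"

# Create the archive; -C changes directory first, so the archive stores
# the relative path cs370/... rather than a long absolute path
tar cf "$work/cs370.tar" -C "$work" cs370

# List the contents before extracting anything
tar tf "$work/cs370.tar"

# Extract into a fresh, empty directory instead of the current one
mkdir "$work/restore"
tar xf "$work/cs370.tar" -C "$work/restore"
cat "$work/restore/cs370/examples/hello.txt"
```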
gzip/bzip2/xz
- Compression of single files
  - gzip is faster;
  - bzip2 compresses more;
  - xz typically compresses better than bzip2 and decompresses faster, though it is usually slower to compress.
- See article on gzip and bzip2.
 
- Compress individual files:
$ cd /tmp
# Copy nano config files to /tmp
$ cp /usr/share/nano/*.nanorc  /tmp
$ gzip *.nanorc    # will result in all .nanorc files in current dir
                   # being compressed and given the .nanorc.gz extension
$ gunzip *.gz      # will uncompress the .nanorc.gz files and
                   # leave files w/ .nanorc extensions
$ bzip2 *.nanorc
$ bunzip2 *.bz2    # same as above, but with bzip2
$ xz *.nanorc
$ unxz *.xz        # same as above, but with xz
- Use the “-c” option to make gzip, bzip2 and xz send compressed data to stdout:
# Copy large wordlist to /tmp
$ cp /usr/share/dict/words /tmp
# Compress wordlist to separate compressed files:
$ gzip -c words > words.gz
$ bzip2 -c words > words.bz2
$ xz -c words > words.xz
# Compare size of compressed file formats.
# Also try zip:
$ zip words.zip words
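The size comparison can also be scripted. The following is a rough sketch, not a benchmark: it generates a hypothetical, highly repetitive sample file (so the ratios will look far better than for real text such as the wordlist) and tries whichever of the three compressors are installed.

```shell
work=$(mktemp -d)

# Hypothetical sample data: 2000 identical lines, which compress very well
awk 'BEGIN { for (i = 0; i < 2000; i++) print "the quick brown fox jumps over the lazy dog" }' > "$work/sample.txt"
orig_size=$(wc -c < "$work/sample.txt")
echo "original: $orig_size bytes"

# Try each compressor that is installed; -c writes to stdout and
# leaves sample.txt untouched. The output suffix is just the tool name.
for tool in gzip bzip2 xz; do
    if command -v "$tool" > /dev/null 2>&1; then
        "$tool" -c "$work/sample.txt" > "$work/sample.txt.$tool"
        echo "$tool: $(wc -c < "$work/sample.txt.$tool") bytes"
    fi
done
```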
- View compressed text files
  - On most Linux systems, program documentation under /usr/share/doc is usually compressed to save space.
 
# View nano documentation
$ cd /usr/share/doc/nano
$ ls
# View a compressed file (NEWS.gz)
$ gunzip -c NEWS.gz | less
# or, more simply,
$ zless NEWS.gz
# Cat a compressed file (NEWS.gz)
$ zcat NEWS.gz
- gzip/bzip2/xz are often used in combination with tar.
# Tar your ~/cs370 dir to tar's stdout (-) and xz it,
# redirecting the result to /tmp/cs370-${USER}.tar.xz:
$ tar cvf -  ~/cs370 | xz -c > /tmp/cs370-${USER}.tar.xz
# Change to /tmp
$ cd /tmp
# Do the reverse to view contents of cs370-${USER}.tar.xz:
$ unxz -c cs370-${USER}.tar.xz | tar tvf -
- GNU tar (the version most widely used) has command line options that make it much easier to compress tar archives with gzip, bzip2 and xz:
$ tar cvJf /tmp/cs370-${USER}.tar.xz ~/cs370 
# GNU tar's "J" option compresses the tar archive with xz;
#           "z" uses gzip instead,
#           "j" uses bzip2 instead.
# to view tar.xz    
$ tar tvJf cs370-${USER}.tar.xz
# to extract tar.xz 
$ cd /tmp; tar xvJf cs370-${USER}.tar.xz
Network Utilities
telnet/ftp
- Venerable remote login and file transfer programs
  - contain known security vulnerabilities (passwords and data are sent unencrypted)
 
- Should be avoided, especially on older, legacy systems.
- Telnet is sometimes useful for querying network ports for services:
# Check if vnc service running on plato (port 5900)
# Any response means the service is running
$ telnet plato 5900
ssh/scp
- Secure Shell and Secure Copy
  - verify ssh setup from week 1
 
- More secure and versatile remote login and file transfer programs
  - See article on ssh security mechanisms.
 
- ssh is used as both a remote login program and remote command execution method:
- Warning about running programs through ~/.bashrc and their possible effects on ssh/scp.
#
# Remote login:
#
# Login remotely to rockhopper.
# Authenticate using either a password or encrypted
# key exchange:
$ ssh <your_userid>@rockhopper
# See verbose output of a ssh login process
$ ssh -v <your_userid>@rockhopper
#
# Remote command execution:
#
# Run the 'uptime' command on plato:
$ ssh plato 'uptime'
# See logins on rockhopper
$ ssh rockhopper 'finger'
# See jchung logins on rockhopper
$ ssh rockhopper 'finger | grep -i chung'
# Same thing, but stdout from rockhopper piped to local grep
$ ssh rockhopper 'finger' | grep -i chung
# Tar your ~/cs370 dir locally, pipe to gzip on rockhopper to
# create rockhopper:/tmp/$USER-cs370.tar.gz:
$ tar cvf - ~/cs370 | ssh rockhopper "gzip -c > /tmp/$USER-cs370.tar.gz"
- Remote file transfers with scp (uses same authentication mechanism as ssh):
# Create and transfer /tmp/cs370.tar.xz to your home dir on the plato server:
$ tar cJf /tmp/cs370.tar.xz ~/cs370
$ scp /tmp/cs370.tar.xz plato:~
# Transfer ~/cs370.tar.xz from plato to local /tmp:
$ scp plato:~/cs370.tar.xz /tmp
rsync
- Remote Sync (See article)
- More efficient file transfer program that is useful for keeping remote directories synchronized with local ones
  - Rsync algorithm transfers differences between local and remote copies of files, rather than entire files.
 
- Uses ssh authentication by default
# Transfer entire ~/cs370 dir to a remote machine:/tmp
# rsync command options (similar to cp options)
#   -a archive (recursively copy dirs and preserve all file/dir attributes)
#   -u update (only transfer files that are newer than destination)
#   -v verbose
#
# In this rsync command, the source is ~/cs370 and the destination is localhost:/tmp.
$ rsync -auv ~/cs370 localhost:/tmp
# Run it again.
# Since -u (update) is being used, nothing gets transferred because the source
# and destination are both up-to-date.
$ rsync -auv ~/cs370 localhost:/tmp
# Update timestamp of ~/cs370/examples dir with 'touch',
# and run rsync again.
$ touch ~/cs370/examples
$ rsync -auv ~/cs370 localhost:/tmp
wget
- Web Get (See article)
- non-interactive URL download program
- Default mode: download and save html file at specified URL
$ wget "http://wikiless.tiekoetter.com/wiki/regular_expressions"
# Saves article to file "regular_expressions".
- The -O file_name option saves to the specified file_name.
  - -O - sends html to stdout.
 
- The cURL utility has similar functionality and is simpler.
The "-" (STDOUT) convention
- To work well with other programs (see the UNIX philosophy), utilities like tar and wget allow the use of the “-” (STDOUT) convention.
  - Output that would normally be sent to a file is sent instead to STDOUT with “-”.
  - tar cvf - ~/cs370 # (sends tar archive data to STDOUT instead of to a .tar file)
  - wget -O - http://monmouth.edu # (sends retrieved URL to STDOUT instead of to a file)
 
- A third utility we've looked at, enscript, also uses the “-” convention.
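With tar, “-” works in both directions: it means STDOUT when creating an archive and STDIN when listing or extracting one. A small sketch of that idea, copying a directory tree through a pipe without ever writing a .tar file, follows; the scratch paths are hypothetical.

```shell
work=$(mktemp -d)
mkdir -p "$work/src/sub" "$work/dst"
echo "data" > "$work/src/sub/file.txt"

# First tar writes the archive to stdout (f -); second tar reads the
# archive from stdin (f -) and extracts it under dst
tar cf - -C "$work/src" . | tar xf - -C "$work/dst"

# Same idea with a compressor and decompressor in the middle of the pipeline
tar cf - -C "$work/src" . | gzip -c | gunzip -c | tar tf -
```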
 
Diff/Patch
diff - find differences between two files
- Run the following commands first:
mkdir -p ~/cs370/examples/revcontrol
cd ~/cs370/examples/revcontrol
wget -q http://bit.ly/2zZgGiV -O diffpatch.tar.xz   # download diffpatch.tar.xz
tar xvJf diffpatch.tar.xz                           # extract the diffpatch directory
ls
cd diffpatch
- diff compares 2 files and displays a list of editing changes that would convert the first file into the second file.
  - The 3 kinds of editing changes are a (add lines), c (change lines), and d (delete lines).
 
 
        SYNOPSIS
               diff [options] from-file to-file
- Examples:
# diff input file #1
# saved as seuss1
$ cat seuss1
If a packet hits a pocket on a socket on a port,
and the bus is interrupted at a very last resort,  
and the access of the memory makes your floppy disk abort,
then the socket packet pocket has an error to report.
# diff input file #2
# saved as seuss2
$ cat seuss2
If a pocket hits a rocket on a socket on a port,
and the bus is interrupted at a very last resort,  
and the access of the memory makes your floppy abort,
then the socket packet pocket has an error to report.
       
# Use diff to show differences between seuss1 and seuss2:
$ diff seuss1 seuss2
1c1
< If a packet hits a pocket on a socket on a port,
---
> If a pocket hits a rocket on a socket on a port,
3c3
< and the access of the memory makes your floppy disk abort,
---
> and the access of the memory makes your floppy abort,
# diff input file #3
# saved as seuss3
$ cat seuss3
If a pocket hits a rocket on a socket on a port,
and the bus is interrupted at a very last resort,  
and the access of the memory makes your floppy abort,
then the socket packet pocket has an error to report.
       
If your cursor finds a menu item followed by a dash,
and the double-clicking icon puts your window in the trash,
and your data is corrupted cause the index doesn't hash,
then your situation's hopeless and your system's gonna crash!
       
# Use diff to show differences between seuss2 and seuss3:
$ diff seuss2 seuss3
4a5,9
> 
> If your cursor finds a menu item followed by a dash,
> and the double-clicking icon puts your window in the trash,
> and your data is corrupted cause the index doesn't hash,
> then your situation's hopeless and your system's gonna crash!
# diff input file #4
# saved as seuss4
$ cat seuss4
If a packet hits a pocket on a socket on a port,
and the access of the memory makes your floppy disk abort,
and the bus is interrupted at a very last resort,  
then the socket packet pocket has an error to report.
# Use diff to show differences between seuss3 and seuss4:
$ diff seuss3 seuss4
1c1,2
< If a pocket hits a rocket on a socket on a port,
---
> If a packet hits a pocket on a socket on a port,
> and the access of the memory makes your floppy disk abort,
3d3
< and the access of the memory makes your floppy abort,
5,9d4
< 
< If your cursor finds a menu item followed by a dash,
< and the double-clicking icon puts your window in the trash,
< and your data is corrupted cause the index doesn't hash,
< then your situation's hopeless and your system's gonna crash!
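Besides printing the changes, diff reports its result through its exit status: 0 when the files are identical and 1 when they differ. That makes it usable as a test in scripts. The sketch below uses small hypothetical files rather than the seuss files.

```shell
work=$(mktemp -d)
printf 'line one\nline two\n' > "$work/a"
printf 'line one\nline 2\n'   > "$work/b"
cp "$work/a" "$work/a_copy"

# Capture diff's exit status: 0 = files match, 1 = files differ
diff "$work/a" "$work/a_copy" > /dev/null && same_status=0 || same_status=$?
diff "$work/a" "$work/b" > /dev/null && diff_status=0 || diff_status=$?
echo "identical files: $same_status, differing files: $diff_status"

# The exit status can drive shell logic directly
if diff "$work/a" "$work/b" > /dev/null; then
    echo "no changes"
else
    echo "files differ"
fi
```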
patch - apply a diff file to an original
        SYNOPSIS
               patch [options] [originalfile [patchfile]]
- Example:
# Using diff and patch to merge changes
$ diff seuss3 seuss4 > diff34       # Generate diff file
$ cp seuss3 seuss3.orig             # Backup original seuss3
$ patch --verbose seuss3 diff34     # Apply diff34 to seuss3
Hmm... Looks like a normal diff to me...
Patching file seuss3 using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 4.
Hunk #3 succeeded at 5.
done
$ cat seuss3                        # seuss3 is now the same as seuss4
If a packet hits a pocket on a socket on a port,
and the access of the memory makes your floppy disk abort,
and the bus is interrupted at a very last resort,
then the socket packet pocket has an error to report.
Revision Control Utilities
- version control systems (VCS)
- Help to keep track of versions of files.
  - Store the differences between versions of files, rather than entire versions of files.
    - Saves space.
  - The UNIX diff command or equivalent functionality plays a part in defining differences between versions of files.
 
- Play an important role in software development, particularly team development
  - Single user version control systems: RCS
  - Centralized: CVS, Subversion
  - Decentralized: Git, Mercurial
 
 
Lab Activities
1. **(Do in lab)** Find all files that contain a string or regular expression
In ~/.bashrc, define a shell function called searchfiles which uses find to list all files that contain the string (or regular expression) that you pass in as the first function parameter, $1. Note that we don't want to search file names but file contents for a string, and then list the files that match. 
Answer:
# function searchfiles which uses find to list all files that 
# contain the string (or regular expression)
# that you pass in as the first function parameter, $1.
function searchfiles
{
   find . -type f |                 # list all files recursively starting in . (current dir)
   xargs grep -li "$1" 2> /dev/null # using xargs, make grep list files (-l) in which a match is found
	
   # Can also use command substitution, if not too many find results:
   # grep -li "$1" $(find . -type f) 2> /dev/null
   # 
   # If using GNU grep (the default on most Linux systems), can use just grep recursively (-r):
   # grep -rli "$1" 2> /dev/null
}
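The answer above can miss files whose names contain spaces, because xargs splits its input on whitespace. A possible variant, assuming find and xargs support -print0 and -0 (GNU and BSD versions do), is sketched here. The name searchfiles_safe and the demo files are hypothetical.

```shell
# searchfiles_safe: hypothetical variant of searchfiles that passes
# NUL-terminated paths from find to xargs
searchfiles_safe() {
    # "--" keeps a pattern that starts with "-" from being read as an option
    find . -type f -print0 | xargs -0 grep -li -- "$1" 2> /dev/null
}

# Demo in a scratch directory; one file name contains a space
demo=$(mktemp -d)
echo "needle in here" > "$demo/my notes.txt"
echo "nothing"        > "$demo/other.txt"
matches=$(cd "$demo" && searchfiles_safe needle)
echo "$matches"
```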
2. **(Do in class)** Change to directory based on find result
In ~/.bashrc, define a shell function called findcd that changes to a directory based on a find result. If what you're searching for matches a filename, then change to the directory where that file resides. If what you're searching for matches a directory name, then change to that directory.
Example usage:
# Change to a dir named randomwall or to a dir that contains a file called randomwall
findcd randomwall
# Change to a dir named examples
findcd examples
# Change to a dir that contains a file called roster
findcd roster
- Note: This should be a function and not a shell script because shell scripts run in their own sub-shells.
Answer:
# function findcd - changes to a directory based on the first hit from a find
function findcd
{
   # "head -n1" chooses first find result;
   # use head instead of tail here, else may have 
   # to wait for find to print many search results;
   # if find finds nothing, $findresult is ""
   findresult=$(find . -iname "*$1*" | head -n1)
   # If $findresult is a file, can't cd to it,
   # so have to trim $findresult to a directory
   if [ -f "$findresult" ]; then
      filename=$(basename "$findresult") # see man basename
      # delete $filename from end (\$) of $findresult
      findresult=$(echo "$findresult" | sed "s/$filename\$//")
   fi
   cd "$findresult" # If $findresult is "", nothing happens.
}
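The sed substitution in the answer above can misbehave when a file name contains characters that are special to sed, such as [ or *. One alternative, sketched here under the hypothetical name findcd_dirname, is to let dirname strip the file name and to report an error when nothing matches.

```shell
# findcd_dirname: hypothetical variant of findcd that uses dirname
findcd_dirname() {
    target=$(find . -iname "*$1*" | head -n1)
    # Nothing found: report it and leave the current directory unchanged
    if [ -z "$target" ]; then
        echo "findcd_dirname: no match for '$1'" >&2
        return 1
    fi
    # dirname strips the last path component, so a file path becomes
    # the path of the directory that contains it
    if [ -f "$target" ]; then
        target=$(dirname "$target")
    fi
    cd "$target"
}

# Demo in a scratch directory with made-up names
demo=$(mktemp -d)
mkdir -p "$demo/projects/lab1"
touch "$demo/projects/lab1/roster.txt"
cd "$demo" && findcd_dirname roster && pwd
```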
3. Find maximum directory depth
Within your home directory, find the maximum depth of a directory. Your results should include the directory's name.
Note: You'll need to use the find command's -printf option. See man find.
Answer:
# Starting in current directory (.), find directories (-type d), 
# print the depth of each directory found (-printf "%d "),
# print the path of each directory found (-print),
# do a descending, numeric sort (-rn), show only the first result (head -n1)
find . -type d -printf '%d ' -print | sort -rn | head -n1
or
find . -type d -printf '%d ' -exec ls -ld {} ';' | sort -rn | head -n1
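The -printf option is a GNU find extension, so the commands above may not work with other find implementations. A portable sketch, assuming only find, awk, sort and head, derives the depth by counting the "/" separators in each path. The demo directory names are hypothetical.

```shell
# Scratch tree: the deepest directory is ./a/b/c at depth 3
demo=$(mktemp -d)
mkdir -p "$demo/a/b/c" "$demo/x"

# With "/" as the field separator, "./a/b/c" has 4 fields, so NF - 1
# is the depth (and "." alone has depth 0, matching find's %d)
deepest=$(cd "$demo" && find . -type d | awk -F/ '{ print NF - 1, $0 }' | sort -rn | head -n1)
echo "$deepest"
```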
4. **(Do in class)** Create and use a git repository with gitlab
- Login to gitlab at cssegit.monmouth.edu.
- Set up SSH login to gitlab. (You should have created an SSH private/public key pair in Week 1.)
  - The ssh_setup.sh script can be used to check your SSH keys setup.
- NOTE: Add an SSH key to your gitlab profile before creating any new projects on gitlab.
 
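- Back up your course directory (~/cs370 or ~/se370) using cp or rsync:
# Using cp
cp -av ~/cs370 ~/cs370-$(date +%m%d%y)
# or using rsync (the trailing slash on the source copies the directory's
# contents, so the backup is not nested one level down as cs370-.../cs370)
rsync -av ~/cs370/ ~/cs370-$(date +%m%d%y)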
- Create a new repository (project) on gitlab.
  - DO NOT include a README when creating the project.
  - Make it a private project.
  - Follow the instructions on gitlab under “Push an existing folder” to git-initialize your UNIX account course directory and push the contents to gitlab.
 
- Add user jchung as a member of your gitlab project (member type: Reporter).
- If git command-line retrieval and push operations require a userid and password to be entered even though you added your SSH public key to your gitlab profile, then see this possible solution (reddit).
