====== File, Network and Revision Control Utilities ======

----

===== File Utilities =====

----

==== Batch processing ====

=== find ===

  * Find files/directories/named pipes/etc.
    * See examples [[http://www.devdaily.com/unix/edu/examples/find.shtml|here]].
  * //-exec// option allows running a command on the find results.
    * Not everything is possible with //find's// //-exec// option.
    * A shell //[[cs_370_-_text_processing_utilities#nospace|for]]// loop may be more appropriate instead.
  * //find// is often recommended over parsing //ls// output, for example in scripts or when searching recursively.
  * Quickly //locate// a file on a local filesystem.
    * Use //[[cs_370_-_unix_shells_and_shell_scripting#shell_command_substitution|locate]]//
    * If set up on the system, can find files rapidly using //locate filename//.

=== xargs ===

  * Execute commands on a collection of arguments.
    * See examples [[http://en.wikipedia.org/wiki/Xargs|here]].
    * Also see examples in the //xargs// man page.
  * Often used at the back end of a //find// command because it's more capable than //find's// -exec option.
<code>
# Find all hidden files in the cs370 dir (files that begin with
# period ".*") and then use xargs to run "ls -l" on all
# the find results.
$ find ./cs370 -type f -name '.*' | xargs ls -l
# Same output as
# $ find ./cs370 -type f -name '.*' -exec ls -l {} ';'

# List all users logged in and run finger on each userid
$ w -h | awk '{ print $1 }' | sort | uniq | xargs finger
#
# w -h               - list logged in users and associated info, excluding header (-h)
# awk '{ print $1 }' - print only userid from w output
# sort | uniq        - reduce duplicates
# xargs finger       - run finger on list of users from uniq
</code>

==== Archiving and compression ====

=== tar ===

  * Tape Archive (See [[http://en.wikipedia.org/wiki/Tar_(file_format)|article]].)
  * Archives files/dirs, preserving file/dir attributes.
  * Usually operates on directories rather than individual files.
  * Common options:
<code>
c - create archive
t - view existing archive
v - operate verbosely
x - extract archive
f - use the specified tar file ( or use "-" to send tar'ed files to stdout )
</code>
  * Examples:
<code>
# Create tar archive (cs370.tar) of your ~/cs370 directory in /tmp:
# Dashes ("-") are optional for tar options.
$ tar cvf /tmp/cs370.tar ~/cs370
# "tar -cvf ..." does the same thing

# Change to /tmp
$ cd /tmp

# "tar tvf cs370.tar" shows the contents of cs370.tar.
# "tar xvf cs370.tar" extracts the contents of cs370.tar to
# the CURRENT directory. (BE CAREFUL.)
</code>
  * Since //.tar// archives are not compressed, //tar// is often used in combination with a file compressor such as //gzip//, //bzip2// or //xz//. See examples below.

=== gzip/bzip2/xz ===

  * Compression of single files
  * //gzip// is faster; //bzip2// compresses more; //xz// usually compresses better than //bzip2// and decompresses faster, though compressing is typically slower.
    * See [[http://www.debianadmin.com/create-and-extract-bz2-and-gz-files.html|article]] on gzip and bzip2.
    * See [[http://tukaani.org/xz/format.html|information on xz format]].
  * Compress individual files:
<code>
$ cd /tmp

# Copy nano config files to /tmp
$ cp /usr/share/nano/*.nanorc /tmp

$ gzip *.nanorc     # will result in all .nanorc files in current dir
                    # being compressed and given the .nanorc.gz extension
$ gunzip *.gz       # will uncompress the .nanorc.gz files and
                    # leave files w/ .nanorc extensions

$ bzip2 *.nanorc
$ bunzip2 *.bz2     # same as above, but with bzip2

$ xz *.nanorc
$ unxz *.xz         # same as above, but with xz
</code>
  * Use the "-c" option of gzip, bzip2 and xz to send compressed data to stdout:
<code>
# Copy large wordlist to /tmp
$ cp /usr/share/dict/words /tmp

# Compress wordlist to separate compressed files:
$ gzip -c words > words.gz
$ bzip2 -c words > words.bz2
$ xz -c words > words.xz

# Compare size of compressed file formats.
# Also try zip:
$ zip words.zip words
</code>
  * View compressed text files
    * On most Linux systems, program documentation under /usr/share/doc is usually compressed to save space.
<code>
# View nano documentation
$ cd /usr/share/doc/nano
$ ls

# View a compressed file (NEWS.gz)
$ gunzip -c NEWS.gz | less
# or, more simply,
$ zless NEWS.gz

# Cat a compressed file (NEWS.gz)
$ zcat NEWS.gz
</code>
  * gzip/bzip2/xz are often used in combination with //tar//.
<code>
# Tar your ~/cs370 dir to tar's stdout (-) and xz it,
# redirecting the result to cs370.tar.xz:
$ tar cvf - ~/cs370 | xz -c > /tmp/cs370.tar.xz

# Change to /tmp
$ cd /tmp

# Do the reverse to view contents of cs370.tar.xz:
$ unxz -c cs370.tar.xz | tar tvf -
</code>
  * GNU tar (the version most widely used) has command line options that make it much easier to compress tar archives with gzip, bzip2 and xz:
<code>
$ tar cvJf /tmp/cs370.tar.xz ~/cs370
# GNU tar's "J" option forces use of xz to compress the tar archive
# "z" uses gzip instead
# "j" uses bzip2 instead

# to view tar.xz
$ tar tvJf cs370.tar.xz

# to extract tar.xz
$ cd /tmp; tar xvJf cs370.tar.xz
</code>

----

----

===== Network Utilities =====

-----

==== telnet/ftp ====

  * Venerable remote login and file transfer programs
    * Contain known security vulnerabilities (for example, passwords and data are sent unencrypted).
    * Avoid using them, especially on older, legacy systems.
  * Telnet is sometimes useful for checking whether a service is listening on a network port.
<code>
# Check if vnc service running on plato (port 5900)
# Any response means the service is running
telnet plato 5900
</code>

==== ssh/scp ====

  * Secure Shell and Secure Copy
    * Verify ssh setup from [[cs_370_-_introduction_unix_fundamentals#secure_shell_ssh|week 1]]
  * More secure and versatile remote login and file transfer programs
    * See [[http://en.wikipedia.org/wiki/Secure_Shell|article]] on ssh security mechanisms.
  * ssh is used as both a remote login program and remote command execution method:
    * Warning: programs run from ~/.bashrc (especially ones that print output) can interfere with ssh remote commands and scp transfers.
<code>
#
# Remote login:
#
# Login remotely to rockhopper.
</code>
<code>
# Authenticate using either a password or encrypted
# key exchange:
$ ssh @rockhopper

# See verbose output of a ssh login process
$ ssh -v @rockhopper

#
# Remote command execution:
#
# Run the 'uptime' command on csselin01:
$ ssh csselin01 'uptime'

# See logins on rockhopper
$ ssh rockhopper 'finger'

# See jchung logins on rockhopper
$ ssh rockhopper 'finger | grep -i chung'

# Same thing, but stdout from rockhopper piped to local grep
$ ssh rockhopper 'finger' | grep -i chung

# Tar your ~/cs370 dir locally, pipe to gzip on rockhopper to
# create rockhopper:/tmp/$USER-cs370.tar.gz:
$ tar cvf - ~/cs370 | ssh rockhopper "gzip -c > /tmp/$USER-cs370.tar.gz"
</code>
  * Remote file transfers with scp (uses same authentication mechanism as ssh):
<code>
# Create and transfer /tmp/cs370.tar.xz to your home dir on the plato server:
$ tar cJf /tmp/cs370.tar.xz ~/cs370
$ scp /tmp/cs370.tar.xz plato:~

# Transfer ~/cs370.tar.xz from plato to local /tmp:
$ scp plato:~/cs370.tar.xz /tmp
</code>

==== rsync ====

  * Remote Sync (See [[http://en.wikipedia.org/wiki/Rsync|article]])
  * More efficient file transfer program that is useful for keeping remote directories synchronized with local ones
    * Rsync algorithm transfers differences between local and remote copies of files, rather than entire files.
  * Uses ssh authentication by default
<code>
# Transfer entire ~/cs370 dir to a remote machine:/tmp
# rsync command options (similar to cp options)
# -a archive (recursively copy dirs and preserve file/dir attributes)
# -u update (only transfer files that are newer than destination)
# -v verbose
#
# In this rsync command, the source is ~/cs370 and the destination is localhost:/tmp.
$ rsync -auv ~/cs370 localhost:/tmp

# Run it again.
# Nothing gets transferred this time because the source and destination
# are already up-to-date (rsync skips files whose size and timestamp match).
$ rsync -auv ~/cs370 localhost:/tmp

# Update timestamp of ~/cs370/examples dir with 'touch',
# and run rsync again.
</code>
<code>
$ touch ~/cs370/examples
$ rsync -auv ~/cs370 localhost:/tmp
</code>

==== wget ====

  * Web Get (See [[http://en.wikipedia.org/wiki/Wget|article]])
  * Non-interactive URL download program
  * Default mode: download and save html file at specified URL
<code>
$ wget "http://en.wikipedia.org/wiki/regular_expressions"
# Saves article to file regular_expressions.
</code>
  * //-O file_name// option saves to specified //file_name//.
    * //-O -// sends html to stdout.
  * The [[https://en.wikipedia.org/wiki/CURL|cURL]] utility has similar functionality and is simpler.

==== The "-" (STDOUT) convention ====

  * To work well with other programs (see the [[cs_370_-_introduction_unix_fundamentals#the_unix_philosophy_or_style | UNIX philosophy]]), utilities like tar and wget allow the use of the "-" (STDOUT) convention.
    * Output that would normally be sent to a file is sent instead to STDOUT with "-".
      * ''tar cvf - ~/cs370'' # (sends tar archive data to STDOUT instead of to a .tar file)
      * ''wget -O - http://monmouth.edu'' # (sends retrieved URL to STDOUT instead of to a file)
  * A third utility we've looked at, [[https://cssegit.monmouth.edu/jchung/csse370repo/-/blob/main/scripts/text2png|enscript]], also uses the "-" convention.

----

----

===== Diff/Patch =====

-----

==== diff - find differences between two files ====

  * Run the following commands first:
<code>
mkdir -p ~/cs370/examples/revcontrol
cd ~/cs370/examples/revcontrol
wget -q http://bit.ly/2zZgGiV -O diffpatch.tar.xz # download diffpatch.tar.xz
tar xvJf diffpatch.tar.xz # extract the diffpatch directory
ls
cd diffpatch
</code>
  * diff
    * Compares 2 files and displays a list of editing changes that would convert the first file into the second file.
    * The 3 kinds of editing changes are ''a''-add lines, ''c''-change lines, and ''d''-delete lines.
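The three kinds of editing changes can be seen side by side on a pair of tiny files. This is only a sketch: the scratch files ''old.txt'' and ''new.txt'' are made up for the demonstration and are not part of the course materials.

```shell
#!/bin/sh
# Two small scratch files (hypothetical names old.txt / new.txt) made for this demo
tmp=$(mktemp -d)
printf 'alpha\nbeta\ngamma\ndelta\nepsilon\n' > "$tmp/old.txt"
printf 'alpha\nBETA\ngamma\nepsilon\nzeta\n'  > "$tmp/new.txt"

# diff exits with status 1 when the files differ, so "|| true" keeps
# scripts that use "set -e" from stopping here
diff_out=$(diff "$tmp/old.txt" "$tmp/new.txt" || true)
printf '%s\n' "$diff_out"
# 2c2 -> line 2 of the first file is changed into line 2 of the second file
# 4d3 -> line 4 of the first file is deleted (it would follow line 3 of the second file)
# 5a5 -> after line 5 of the first file, line 5 of the second file is added

rm -r "$tmp"
```

In each hunk header, the numbers to the left of the letter refer to lines in the first file and the numbers to the right refer to lines in the second file.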
<code>
SYNOPSIS
    diff [options] from-file to-file
</code>
  * Examples:
<code>
# diff input file #1
# saved as seuss1
$ cat seuss1
If a packet hits a pocket on a socket on a port,
and the bus is interrupted at a very last resort,
and the access of the memory makes your floppy disk abort,
then the socket packet pocket has an error to report.

# diff input file #2
# saved as seuss2
$ cat seuss2
If a pocket hits a rocket on a socket on a port,
and the bus is interrupted at a very last resort,
and the access of the memory makes your floppy abort,
then the socket packet pocket has an error to report.

# Use diff to show differences between seuss1 and seuss2:
$ diff seuss1 seuss2
1c1
< If a packet hits a pocket on a socket on a port,
---
> If a pocket hits a rocket on a socket on a port,
3c3
< and the access of the memory makes your floppy disk abort,
---
> and the access of the memory makes your floppy abort,

# diff input file #3
# saved as seuss3
$ cat seuss3
If a pocket hits a rocket on a socket on a port,
and the bus is interrupted at a very last resort,
and the access of the memory makes your floppy abort,
then the socket packet pocket has an error to report.

If your cursor finds a menu item followed by a dash,
and the double-clicking icon puts your window in the trash,
and your data is corrupted cause the index doesn't hash,
then your situation's hopeless and your system's gonna crash!

# Use diff to show differences between seuss2 and seuss3:
$ diff seuss2 seuss3
4a5,9
>
> If your cursor finds a menu item followed by a dash,
> and the double-clicking icon puts your window in the trash,
> and your data is corrupted cause the index doesn't hash,
> then your situation's hopeless and your system's gonna crash!

# diff input file #4
# saved as seuss4
$ cat seuss4
If a packet hits a pocket on a socket on a port,
and the access of the memory makes your floppy disk abort,
and the bus is interrupted at a very last resort,
then the socket packet pocket has an error to report.
</code>
<code>
# Use diff to show differences between seuss3 and seuss4:
$ diff seuss3 seuss4
1c1,2
< If a pocket hits a rocket on a socket on a port,
---
> If a packet hits a pocket on a socket on a port,
> and the access of the memory makes your floppy disk abort,
3d3
< and the access of the memory makes your floppy abort,
5,9d4
<
< If your cursor finds a menu item followed by a dash,
< and the double-clicking icon puts your window in the trash,
< and your data is corrupted cause the index doesn't hash,
< then your situation's hopeless and your system's gonna crash!
</code>

==== patch - apply a diff file to an original ====

<code>
SYNOPSIS
    patch [options] [originalfile [patchfile]]
</code>
  * Example:
<code>
# Using diff and patch to merge changes
$ diff seuss3 seuss4 > diff34     # Generate diff file
$ cp seuss3 seuss3.orig           # Backup original seuss3
$ patch --verbose seuss3 diff34   # Apply diff34 to seuss3
Hmm... Looks like a normal diff to me...
Patching file seuss3 using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 4.
Hunk #3 succeeded at 5.
done

$ cat seuss3                      # seuss3 is now the same as seuss4
If a packet hits a pocket on a socket on a port,
and the access of the memory makes your floppy disk abort,
and the bus is interrupted at a very last resort,
then the socket packet pocket has an error to report.
</code>

----

===== Revision Control Utilities =====

-----

  * Version control systems (VCS)
    * Help to keep track of versions of files.
    * Store the differences between versions of files, rather than entire versions of files.
      * Saves space.
    * UNIX //diff// command or equivalent functionality plays a part in defining differences between versions of files.
    * Plays an important role in software development, particularly team development
  * Single user version control systems: [[http://en.wikipedia.org/wiki/Revision_Control_System|RCS]]
  * Multi-user version control systems: [[http://en.wikipedia.org/wiki/Concurrent_Versions_System|CVS]], [[http://en.wikipedia.org/wiki/Subversion_(software)|Subversion]], [[http://en.wikipedia.org/wiki/Git_(software)|Git]], [[https://en.wikipedia.org/wiki/Mercurial|Mercurial]]
    * Centralized: CVS, Subversion
    * Decentralized: Git, Mercurial

----

===== Lab Activities =====

-----

==== 1. Find all files that contain a string or regular expression ====

In //~/.bashrc//, define a shell function called //searchfiles// which uses //find// to list all files that contain the string (or regular expression) that you pass in as the first function parameter, //$1//.

Note that we don't want to search file //names// but file //contents// for a string, and then list the files that match.

Answer:

<code>
# function searchfiles which uses find to list all files that
# contain the string (or regular expression)
# that you pass in as the first function parameter, $1.
function searchfiles {
   find . -type f -print0 |               # list all files recursively starting in . (current dir);
                                          # -print0 keeps file names with spaces intact
   xargs -0 grep -li "$1" 2> /dev/null    # using xargs, make grep list files (-l) in which a match is found

   # Can also use command substitution, if not too many find results
   # (breaks on file names that contain spaces):
   # grep -li "$1" $(find . -type f) 2> /dev/null
   #
   # If using GNU grep (most Linux systems), can use just grep recursively (-r):
   # grep -rli "$1" 2> /dev/null
}
</code>

==== 2. Change to directory based on find result ====

In ~/.bashrc, define a shell function called //findcd// that changes to a directory based on a find result. If what you're searching for matches a filename, then change to the directory where that file resides. If what you're searching for matches a directory name, then change to that directory.
Example usage:

<code>
# Change to a dir named randomwall or to a dir that contains a file called randomwall
findcd randomwall

# Change to a dir named examples
findcd examples

# Change to a dir that contains a file called roster
findcd roster
</code>

  * Note: This should be a function and not a shell script, because a shell script runs in its own sub-shell and a //cd// inside it would not change the directory of your interactive shell.

Answer:

<code>
# function findcd - changes to a directory based on the first hit from a find
function findcd {
   # "head -n1" chooses first find result;
   # use head instead of tail here, else may have
   # to wait for find to print many search results;
   # if find finds nothing, $findresult is ""
   findresult=$(find . -iname "*$1*" | head -n1)

   # If $findresult is a file, can't cd to it,
   # so have to trim $findresult to a directory
   if [ -f "$findresult" ]; then
      # strip the file name from the end of $findresult,
      # leaving only the directory part (see man dirname)
      findresult=$(dirname "$findresult")
   fi

   cd "$findresult" # If $findresult is "", nothing happens.
}
</code>

==== 3. Find maximum directory depth ====

Within your home directory, find the maximum depth of a directory. Your results should include the directory's name.

Note: You'll need to use the //find// command's //-printf// option. See //man find//.

Answer:

<code>
# Starting in current directory (.), find directories (-type d),
# print the depth of each directory found (-printf "%d "),
# print the path of each directory found (-print),
# do a descending, numeric sort (-rn), show only the first result (head -n1)
find . -type d -printf '%d ' -print | sort -rn | head -n1

# or
find . -type d -printf '%d ' -exec ls -ld {} ';' | sort -rn | head -n1
</code>

==== 4. Create and use a git repository with gitlab ====

**(NOTE: Counts toward your participation grade.)**

  * Log in to gitlab at [[http://cssegit.monmouth.edu|cssegit.monmouth.edu]].
  * Set up SSH login to gitlab. You should have created an SSH private/public key pair in [[cs_370_-_introduction_unix_fundamentals#secure_shell_ssh|Week 1]].
    * The [[https://cssegit.monmouth.edu/jchung/csse370repo/-/blob/main/scripts/ssh_setup.sh|ssh_setup.sh]] script can be used to check your SSH key setup.
  * Back up your course directory (''~/cs370'' or ''~/se370'') using ''cp'' or ''rsync'':
<code>
# Using cp
cp -av ~/cs370 ~/cs370-$(date +%m%d%y)

# or using rsync
# (the trailing slash on the source copies the contents of ~/cs370 into the
# backup dir, matching the cp result instead of nesting a cs370 dir inside it)
rsync -av ~/cs370/ ~/cs370-$(date +%m%d%y)
</code>
  * Create a new repository (project) on gitlab.
    * **DO NOT** include a README when creating the project.
    * Make it a private project.
  * Follow the instructions on gitlab under ''"Push an existing folder"'' to git-initialize your UNIX account course directory and push the contents to gitlab.
  * Add user jchung as a member of your gitlab project (member type: Reporter).

----
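The instructions shown by gitlab for your own project are the ones to follow in the last lab activity. As a rough sketch of what that kind of sequence usually looks like, the script below rehearses it entirely on the local machine. It assumes a throwaway directory in place of ''~/cs370'', a local bare repository in place of the gitlab project URL, a made-up user name and e-mail address, and a branch called ''main''; none of these come from the lab instructions.

```shell
#!/bin/sh
# Sketch only: a throwaway directory and a local bare repository stand in for
# your course directory and your gitlab project (both are hypothetical here).
work=$(mktemp -d)
remote="$work/project.git"     # stand-in for the gitlab project URL
course="$work/cs370"           # stand-in for ~/cs370
mkdir -p "$course"
echo "demo" > "$course/notes.txt"

# On gitlab, creating the empty project takes the place of this step
git init -q --bare "$remote"

cd "$course" || exit 1
git init -q                              # turn the folder into a git repository
git remote add origin "$remote"          # point "origin" at the project
git add .                                # stage everything in the folder
git -c user.name="Demo User" -c user.email="demo@example.com" \
    commit -q -m "Initial commit"        # record the first revision
git branch -M main                       # name the current branch "main" (assumed name)
git push -q -u origin main               # upload and remember the upstream branch

# Count the commits that arrived in the stand-in remote
pushed=$(git --git-dir="$remote" rev-list --count main)
echo "commits on remote: $pushed"
```

With a real project, the ''origin'' URL would be the SSH address that gitlab displays for the project, and the user name and e-mail would come from your own git configuration.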