====== Text Processing Utilities ======
* Also see "[[https://missing.csail.mit.edu/2020/data-wrangling|Data Wrangling]]" section of MIT's Missing Semester course.
* The [[cs_370_-_introduction_unix_fundamentals#the_unix_philosophy_or_style|UNIX philosophy]] is driven by a core set of UNIX text processing utilities that work together to process streams of text.
* These text processing utilities are line-based, meaning they process text streams one line at a time.
* Text streams are read from files named as arguments, redirected from files on disk using ''< file'', or read from standard input (e.g. from a pipe).
* The Advanced Bash Scripting Guide also has a [[http://en.tldp.org/LDP/abs/html/textproc.html|section]] on text processing utilities that includes usage examples.
* Some of the utilities that we'll cover are listed in the "Text Utilities" category in the [[https://en.wikipedia.org/wiki/List_of_GNU_Core_Utilities_commands|GNU project's core Unix utilities list]].
----
===== Examples Setup =====
(**Do in class**) Run the following command to begin setting up a subdirectory structure for the text processing utility examples:
for topic in grep sed sort uniq awk tr; do echo mkdir -p ~/cs370/examples/text/$topic; done # | bash
Remove the trailing comment marker ''#'' so the mkdir commands are actually piped (''|'') to ''bash'' and run.
Note: The above directories could have been created without a ''for'' loop, using shell [[https://tldp.org/LDP/abs/html/special-chars.html#BRACEEXPREF|brace expansion]]:
mkdir -p ~/cs370/examples/text/{grep,sed,sort,uniq,awk,tr}
===== grep =====
* print lines matching a pattern in a file or stdin
* SYNOPSIS (also see the manual page for grep)
grep [options] PATTERN [FILE...]
grep [options] [-e PATTERN | -f FILE] [FILE...]
* grep handles normal regular expressions (see [[cs370/cs_370_-_regular_expressions|Regular Expressions]])
* egrep (extended grep, equivalent to ''grep -E'') handles extended regular expressions, e.g. +, ?
* fgrep (fast grep, equivalent to ''grep -F'') searches only for fixed strings, not regular expressions
=== Basic grep ===
$ cat grepfile # see grepfile contents
Well you know it's your bedtime,
So turn off the light,
Say all your prayers and then,
Oh you sleepy young heads dream of wonderful things,
Beautiful mermaids will swim through the sea,
And you will be swimming there too.
$ grep the grepfile # look for pattern "the" in grepfile
So turn off the light,
Say all your prayers and then,
Beautiful mermaids will swim through the sea,
And you will be swimming there too.
$ cat grepfile | grep the # pipe grepfile to grep
So turn off the light,
Say all your prayers and then,
Beautiful mermaids will swim through the sea,
And you will be swimming there too.
# look for whole word "the" in grepfile and number lines found
$ grep -wn the grepfile
2:So turn off the light,
5:Beautiful mermaids will swim through the sea,
# look for lines without "the", number lines
$ grep -wnv the grepfile
1:Well you know it's your bedtime,
3:Say all your prayers and then,
4:Oh you sleepy young heads dream of wonderful things,
6:And you will be swimming there too.
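These examples can be reproduced without the original file by recreating grepfile with a here-document; two more everyday options worth knowing are ''-c'' (count matching lines) and ''-i'' (ignore case). The ''/tmp'' path below is just for illustration:

```shell
# Recreate the sample file with a here-document
cat > /tmp/grepfile <<'EOF'
Well you know it's your bedtime,
So turn off the light,
Say all your prayers and then,
Oh you sleepy young heads dream of wonderful things,
Beautiful mermaids will swim through the sea,
And you will be swimming there too.
EOF

grep -c the /tmp/grepfile    # count lines matching "the": 4
grep -ci WELL /tmp/grepfile  # -i ignores case, so "Well" matches: 1
```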
=== Pattern files with grep ===
# Read search patterns from a file, and search for the patterns in a file.
# See the grep "-f" option.
# Pattern file contents
cat ids
s1306205
s1321300
# grepfile contents
cat list_of_ids
s1064730
s1185725
s1294895
s1306205
s1321300
s1333911
s1359142
$ grep -f ids list_of_ids
s1306205
s1321300
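Since the ids are fixed strings rather than regular expressions, adding ''-F'' (fixed strings) and ''-x'' (match whole lines only) makes the lookup both stricter and faster. A sketch with inline files (the ''/tmp'' paths are illustrative):

```shell
printf 's1306205\ns1321300\n' > /tmp/ids
printf 's1064730\ns1306205\ns1321300\ns1359142\n' > /tmp/list_of_ids

# -F: treat each pattern as a literal string; -x: match entire lines only
grep -Fx -f /tmp/ids /tmp/list_of_ids
# s1306205
# s1321300

# Adding -v inverts the match: ids in the list that are NOT in the pattern file
grep -Fxv -f /tmp/ids /tmp/list_of_ids
# s1064730
# s1359142
```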
=== Regular expressions with grep ===
$ grep .nd grepfile
Say all your prayers and then,
Oh you sleepy young heads dream of wonderful things,
And you will be swimming there too.
$ grep ^.nd grepfile
And you will be swimming there too.
$ grep sw.*ng grepfile
And you will be swimming there too.
$ grep [A-D] grepfile
Beautiful mermaids will swim through the sea,
And you will be swimming there too.
$ grep "\." grepfile
And you will be swimming there too.
$ grep a. grepfile
Say all your prayers and then,
Oh you sleepy young heads dream of wonderful things,
Beautiful mermaids will swim through the sea,
$ grep a.$ grepfile
Beautiful mermaids will swim through the sea,
$ grep [a-m]nd grepfile
Say all your prayers and then,
$ grep [^a-m]nd grepfile
Oh you sleepy young heads dream of wonderful things,
And you will be swimming there too.
$ egrep s.+w grepfile
Oh you sleepy young heads dream of wonderful things,
Beautiful mermaids will swim through the sea,
$ egrep "off|will" grepfile
So turn off the light,
Beautiful mermaids will swim through the sea,
And you will be swimming there too.
$ egrep im*ing grepfile
And you will be swimming there too.
$ egrep im?ing grepfile
# No output -- why no matches?
=== grep pattern match context options ===
* By default, grep returns lines that match a given pattern.
* But, sometimes you want to see the context around the matching lines.
* grep context options are -C (context), -A (after context), -B (before context)
# The -C 1 option below means grep will show 1 line above
# and 1 line below each matching line:
$ grep -C 1 sleepy grepfile
Say all your prayers and then,
Oh you sleepy young heads dream of wonderful things,
Beautiful mermaids will swim through the sea,
# -A 2 option means show up to 2 lines AFTER the matching lines:
$ grep -A 2 sleepy grepfile
Oh you sleepy young heads dream of wonderful things,
Beautiful mermaids will swim through the sea,
And you will be swimming there too.
# -B 2 option means show up to 2 lines BEFORE the matching lines:
$ grep -B 2 sleepy grepfile
So turn off the light,
Say all your prayers and then,
Oh you sleepy young heads dream of wonderful things,
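When several matches occur in one file, GNU grep separates each context group with a ''<nowiki>--</nowiki>'' line; combined with ''-n'', match lines get a '':'' and context lines a ''-''. A small sketch on illustrative data:

```shell
printf 'a\nb\nmatch\nc\nd\ne\nmatch\nf\n' > /tmp/ctx

grep -n -C 1 match /tmp/ctx
# 2-b
# 3:match
# 4-c
# --
# 6-e
# 7:match
# 8-f
```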
----
===== sed - Stream EDitor =====
* Scans one or more files or stdin and performs an editing action on all lines that match a particular condition.
* Useful for simple repetitive editing tasks.
* SYNOPSIS
sed [ -e command ] [ -f scriptfile ] [ file... ]
=== The sed commands ===
* a\ - Append text
* c\ - Change text
* d - Delete text
* i\ - Insert text
* r - Insert file
* s/regexpr/str/ - Substitute 1st occurrence of regexpr by str
* s/regexpr/str/g - Substitute every occurrence of regexpr by str
=== Substituting Text ===
# The sed input file:
$ cat fiction
The lone monarch butterfly flew flutteringly through
the cemetery, dancing on and glancing against headstone
after headstone before alighting atop Willie Mitchell's
already lowered casket, causing gasps of awe to fly
from the open mouths of five or six lingering mourners,
until a big shovelful of dirt landed on it and it died.
$ sed 's/^/ /' fiction > fiction.indented
# contents of 'fiction' indented by one space:
$ cat fiction.indented
The lone monarch butterfly flew flutteringly through
the cemetery, dancing on and glancing against headstone
after headstone before alighting atop Willie Mitchell's
already lowered casket, causing gasps of awe to fly
from the open mouths of five or six lingering mourners,
until a big shovelful of dirt landed on it and it died.
$ sed 's/^ *//' fiction.indented # removes leading spaces
# To insert the indentations directly into 'fiction' means
# doing an "in-place" edit of 'fiction', using sed's '-i' option:
$ sed -i 's/^/ /' fiction
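Because ''-i'' rewrites the file with no undo, it is safer to keep a backup copy. GNU sed accepts an optional backup suffix attached directly to ''-i'' (note: BSD/macOS sed wants the suffix as a separate argument, so check your sed's man page). Illustrative file path below:

```shell
printf 'one\ntwo\n' > /tmp/fiction

# Edit in place, saving the original as /tmp/fiction.bak
sed -i.bak 's/^/> /' /tmp/fiction

head -1 /tmp/fiction      # edited copy: "> one"
head -1 /tmp/fiction.bak  # untouched backup: "one"
```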
=== Deleting Text ===
$ sed '/a/d' fiction # remove all lines containing char 'a'.
from the open mouths of five or six lingering mourners,
$ sed '/\<a\>/d' fiction # remove lines containing the word 'a'.
The lone monarch butterfly flew flutteringly through
the cemetery, dancing on and glancing against headstone
after headstone before alighting atop Willie Mitchell's
already lowered casket, causing gasps of awe to fly
from the open mouths of five or six lingering mourners,
=== Appending/Inserting Text ===
# Sed accepts sed scripts with the '-f' option;
# sed5 is a sed script containing sed commands;
# It will insert 2 lines at line 1:
$ cat sed5
1i\
Copyright 2002 Joe Chung\
All rights reserved\
$ sed -f sed5 fiction
Copyright 2002 Joe Chung
All rights reserved
The lone monarch butterfly flew flutteringly through
the cemetery, dancing on and glancing against headstone
after headstone before alighting atop Willie Mitchell's
already lowered casket, causing gasps of awe to fly
from the open mouths of five or six lingering mourners,
until a big shovelful of dirt landed on it and it died.
* For simpler sed text insertions based on a pattern search, can use the following:
Append text after a line that contains pattern with
sed '/pattern/a line of text here' filename
Insert text before a line that contains pattern with
sed '/pattern/i line of text here' filename
Examples of appending and inserting a line of text:
$ cat test
foo
bar
option
baz
$ sed '/option/a append text here' test
foo
bar
option
append text here
baz
$ sed '/option/i insert text here' test
foo
bar
insert text here
option
baz
=== Replacing (Changing) Text ===
# Another sed script, containing a sed change text directive:
$ cat sed6
1,3c\
Lines 1-3 are censored.\
$ sed -f sed6 fiction
Lines 1-3 are censored.
already lowered casket, causing gasps of awe to fly
from the open mouths of five or six lingering mourners,
until a big shovelful of dirt landed on it and it died.
# Another sed script, containing a sed change text directive:
$ cat sed7
1c\
Line 1 is censored.
2c\
Line 2 is obfuscated.
3c\
Line 3 is kaput.
$ sed -f sed7 fiction
Line 1 is censored.
Line 2 is obfuscated.
Line 3 is kaput.
already lowered casket, causing gasps of awe to fly
from the open mouths of five or six lingering mourners,
until a big shovelful of dirt landed on it and it died.
=== Inserting files ===
# We want to insert a file called 'fin' using sed:
$ cat fin
The End
# Direct sed to insert 'fin' at end of 'fiction'
$ sed '$r fin' fiction
The lone monarch butterfly flew flutteringly through
the cemetery, dancing on and glancing against headstone
after headstone before alighting atop Willie Mitchell's
already lowered casket, causing gasps of awe to fly
from the open mouths of five or six lingering mourners,
until a big shovelful of dirt landed on it and it died.
The End
=== Multiple sed Commands ===
# Use sed's '-e' option to perform multiple sed operations
# per line:
$ sed -e 's/^/<< /' -e 's/$/ >>/' fiction
<< The lone monarch butterfly flew flutteringly through >>
<< the cemetery, dancing on and glancing against headstone >>
<< after headstone before alighting atop Willie Mitchell's >>
<< already lowered casket, causing gasps of awe to fly >>
<< from the open mouths of five or six lingering mourners, >>
<< until a big shovelful of dirt landed on it and it died. >>
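GNU sed also accepts multiple commands in a single expression separated by '';'', and any command can be restricted to an address range. A sketch on illustrative data:

```shell
printf 'alpha\nbeta\ngamma\n' > /tmp/lines

# Two substitutions in one expression, separated by ';'
sed 's/^/<< /; s/$/ >>/' /tmp/lines
# << alpha >>
# << beta >>
# << gamma >>

# Apply a command only to an address range (lines 1-2 here)
sed '1,2s/^/* /' /tmp/lines
# * alpha
# * beta
# gamma
```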
----
===== sort - sort lines of text files or stdin =====
* SYNOPSIS
sort [OPTION]... [FILE]...
* Does ascending, lexicographic (alphabetical) sort by default.
* The entire line is the sort key by default, but you can specify which fields to use as sort keys in delimited text.
# Sort input file:
$ cat sortfile
jan Start chapter 3 10th
Jan Start chapter 1 30th
Jan Start chapter 5 23rd
Jan End chapter 3 23rd
Mar Start chapter 7 27
may End chapter 7 17th
Apr End Chapter 5 1
Feb End chapter 5 14
$ sort sortfile
Apr End Chapter 5 1
Feb End chapter 5 14
Jan End chapter 3 23rd
Jan Start chapter 1 30th
jan Start chapter 3 10th
Jan Start chapter 5 23rd
Mar Start chapter 7 27
may End chapter 7 17th
# Force reverse or descending sort:
$ sort -r sortfile
may End chapter 7 17th
Mar Start chapter 7 27
Jan Start chapter 5 23rd
jan Start chapter 3 10th
Jan Start chapter 1 30th
Jan End chapter 3 23rd
Feb End chapter 5 14
Apr End Chapter 5 1
# Sort on the 1st field only: start at the 1st (+0) field, end at the 2nd (-1) field.
# The "+POS -POS" syntax is obsolete and modern GNU sort may reject it;
# prefer the equivalent: sort --key=1,1 sortfile
$ sort +0 -1 sortfile
Apr End Chapter 5 1
Feb End chapter 5 14
jan Start chapter 3 10th
Jan End chapter 3 23rd
Jan Start chapter 1 30th
Jan Start chapter 5 23rd
Mar Start chapter 7 27
may End chapter 7 17th
# Sort by month name in 1st field;
# alternatively: sort --key=1,1 -M sortfile
$ sort +0 -1 -M sortfile
Jan End chapter 3 23rd
Jan Start chapter 1 30th
jan Start chapter 3 10th
Jan Start chapter 5 23rd
Feb End chapter 5 14
Mar Start chapter 7 27
Apr End Chapter 5 1
may End chapter 7 17th
# sort by the 5th (last) field numerically;
# alternatively: sort --key=5 -n sortfile
$ sort +4 -5 -n sortfile
Apr End Chapter 5 1
jan Start chapter 3 10th
Feb End chapter 5 14
may End chapter 7 17th
Jan End chapter 3 23rd
Jan Start chapter 5 23rd
Mar Start chapter 7 27
Jan Start chapter 1 30th
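The ''-k'' option pairs with ''-t'', which sets a field delimiter for sorting delimited data. A sketch sorting /etc/passwd-style lines numerically by their third field (illustrative data, not a real passwd file):

```shell
printf 'carol:x:1003\nalice:x:1001\nbob:x:1002\n' > /tmp/users

# -t : sets the field delimiter; -k 3,3 -n sorts numerically on field 3
sort -t : -k 3,3 -n /tmp/users
# alice:x:1001
# bob:x:1002
# carol:x:1003
```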
----
===== uniq - remove duplicate lines from a sorted file or stdin =====
* SYNOPSIS
uniq [OPTION]... [INPUT [OUTPUT]]
* Requires input to be sorted
* So, usually used in conjunction with ''sort''
# Input file for uniq:
$ cat animals
cat snake
monkey snake
dolphin elephant
dolphin elephant
goat elephant
pig pig
pig pig
monkey pig
# Default mode filters out non-unique lines:
$ uniq animals
cat snake
monkey snake
dolphin elephant
goat elephant
pig pig
monkey pig
# count instances of nonunique lines
$ uniq -c animals
1 cat snake
1 monkey snake
2 dolphin elephant
1 goat elephant
2 pig pig
1 monkey pig
# Ignore (skip) the first field of each line when
# looking for duplicates; -1 is the obsolete spelling of -f 1:
$ uniq -f 1 animals
cat snake
dolphin elephant
pig pig
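Because uniq only collapses //adjacent// duplicates, the classic idiom is to sort first, count with ''uniq -c'', then sort again by count. A sketch using data similar to the animals file:

```shell
printf 'pig pig\ncat snake\npig pig\ndolphin elephant\npig pig\n' > /tmp/animals

# sort groups duplicates together, uniq -c counts them,
# sort -rn ranks by count, highest first
sort /tmp/animals | uniq -c | sort -rn
#       3 pig pig
#       1 dolphin elephant
#       1 cat snake
```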
----
===== awk - pattern scanning and processing language =====
* Awk is a specialized programming language.
* awk programs
* can be supplied on command line surrounded by single quotes. For example,
$ awk -F "." '{ print "mkdir " $2 }'
* can be placed in a text file specified using the "-f" option. For example,
$ awk -F "." -f makedirs
where makedirs contains
{ print "mkdir " $2 }
* awk programs act on individual lines in a file or standard input and have the general form:
Synopsis:
awk [ condition ] [ { action } ]
condition can be:
- special token BEGIN or END
- expression using logical or relational operators and/or regular expression
action is performed on every line of input that matches the
condition and can be one or more C-like programming statements:
- if (conditional) statement [ else statement ]
- while (conditional) statement
- for (expression; conditional; expression ) statement
- break/continue
- variable = expression
- print [ list of expressions ] [ > expression ]
- printf format [ , list of expressions ] [ > expression ]
- next (stops processing the current line and moves on to the next input line)
- exit (stops reading input entirely; the END action, if any, still runs)
- [ list of statements ]
* Awk has its own set of built-in variables.
* In the examples below,
* $0 represents an entire line, $1 the first field, $2 the second field, etc.
* NF represents the number of fields in a line; $NF represents the last field in a line.
* NR represents the current line number.
=== Accessing individual fields of lines of text ===
# Say we have this input file:
$ cat float
Wish I was floating in blue across the sky,
My imagination is strong,
And I often visit the days
When everything seemed so clear.
Now I wonder what I'm doing here at all...
$ awk '{print NF, $0}' float
9 Wish I was floating in blue across the sky,
4 My imagination is strong,
6 And I often visit the days
5 When everything seemed so clear.
9 Now I wonder what I'm doing here at all...
# Awk fields are delimited using white space by default.
=== BEGIN and END conditions applied to lines of text ===
# Say that the file awk2 contains these awk statements:
$ cat awk2
BEGIN { print "Start of file" }
{ print $1 $3 $NF }
END { print "End of file" }
$ awk -f awk2 float
Start of file
Wishwassky,
Myisstrong,
Andoftendays
Whenseemedclear.
Nowwonderall...
End of file
# Equivalently, on the command line:
$ awk 'BEGIN { print "Start of file" } { print $1 $3 $NF } END { print "End of file" }' float
=== Logical operators in awk conditions ===
$ awk 'NR > 1 && NR < 4 { print NR, $1, $3, $NF }' float
2 My is strong,
3 And often days
=== Regular expressions in awk conditions ===
$ awk '/t.+e/ { print $0 }' float
Wish I was floating in blue across the sky,
And I often visit the days
When everything seemed so clear.
Now I wonder what I'm doing here at all...
=== Awk condition ranges ===
$ awk '/strong/,/clear/ { print $0 }' float
My imagination is strong,
And I often visit the days
When everything seemed so clear.
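A condition-free action plus an END block is the natural way to total a column; a minimal sketch that sums field 2 of its input:

```shell
# Accumulate field 2 on every line; print the total once at END
printf 'apples 3\npears 5\nplums 2\n' |
awk '{ total += $2 } END { print "total:", total }'
# total: 10
```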
=== Awk delimiters ===
* Awk uses 1 or more spaces as the default delimiter between fields.
* You can specify a delimiter with -F.
# See contents of /etc/passwd (delimited file using : as the delimiter)
$ cat /etc/passwd
# Extract fields of /etc/passwd using awk:
$ awk -F ":" '{ print $1, $3, $NF }' /etc/passwd # 1st, 3rd and last fields
=== Using cut instead of awk ===
* If the delimiter is simple, you can often use the [[https://en.wikipedia.org/wiki/Cut_(Unix) | cut]] text processing command instead of ''awk'' to extract fields from delimited lines of text.
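When the delimiter is a single character the two are largely interchangeable, with one caveat: cut always emits fields in file order and cannot reorder them the way awk can. A sketch on an illustrative /etc/passwd-style line:

```shell
line='alice:x:1001:staff'

echo "$line" | awk -F ":" '{ print $1, $3 }'  # alice 1001
echo "$line" | cut -d : -f 1,3                # alice:1001

# cut -f 3,1 still prints the fields in their original order (1 then 3)
echo "$line" | cut -d : -f 3,1                # alice:1001
```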
===== tr - TRanslating Characters =====
* SYNOPSIS
tr [-cds] string1 [string2]
* tr options
* -c use the complement of string1
* -d delete characters in string1
* -s squeeze each run of a repeated output character into a single instance
* Operates only on stdin
# Input file:
$ cat go.cart
go cart
racing
# Translating case: probably the most common use of tr
$ tr a-z A-Z < go.cart
GO CART
RACING
# Replace character ranges
$ tr a-c D-E < go.cart
go EDrt
rDEing
# Replace every non-"a" with "X"
$ tr -c a X < go.cart
XXXXaXXXXXaXXXXX
# Replace non-"a-z" with (new line)
# Could substitute '\n' for '\012'
$ tr -c a-z '\012' < go.cart
go
cart
racing
# Just delete characters
$ tr -d a-c < go.cart
go rt
ring
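The ''-s'' (squeeze) option on its own is handy for collapsing runs of repeated characters, e.g. normalizing whitespace before cutting fields:

```shell
# Squeeze runs of spaces down to single spaces
echo 'too    many   spaces' | tr -s ' '
# too many spaces

# Squeeze runs of 'a' and 'b' only; 'c' runs are untouched
echo 'aaabbbccc' | tr -s ab
# abccc
```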
----
===== Exercises =====
* **(Do in class)** Save all the following exercise scripts in your ''~/bin'' directory.
----
==== 1. nospace.sh ====
Create a script ''nospace.sh'' to look for filenames with spaces in them in the current directory and to rename those files, converting the spaces to _ (underscore).
To test ''nospace.sh'', in a separate ''nospace'' directory, use ''touch'' to create a bunch of files that have spaces in the file names:
mkdir nospace
cd nospace
touch "report one" "report two" "report three" "reports four and five"
[[https://cssegit.monmouth.edu/jchung/csse370repo/-/blob/main/scripts/nospace|Link to nospace.sh code]]
==== 2. wget pipeline ====
Download the following file using wget:
http://rockhopper.monmouth.edu/~jchung/cs370/modem.out
Write a pipeline to extract only the PPP ip address "72.68.102.102" from this file. Incorporate wget in the pipeline.
Complete the pipeline using ''sed'', and later, ''awk''.
Solution using sed:
# wget: Quiet (-q) wget output while sending fetched modem.out to stdout (-O -)
# grep: Match 1 line of modem.out containing "PPP"
# sed: Delete all information before the IP address
wget -q -O - http://rockhopper.monmouth.edu/~jchung/cs370/modem.out |
grep PPP |
sed 's/.*PPP *//' # or sed 's/.*PPP\s*//'
Solution using awk:
# wget: Quiet (-q) wget output while sending fetched modem.out to stdout (-O -)
# grep: Match 1 line of modem.out containing "PPP"
# awk: Extract IP address, which is the 5th field ($5) in the line,
# IP Network Address PPP 72.68.102.102
wget -q -O - http://rockhopper.monmouth.edu/~jchung/cs370/modem.out |
grep PPP |
awk '{print $5}'
==== 3. randlines.sh ====
Write a script ''randlines.sh'' to randomize the order of lines in standard input. Here's a start:
#!/bin/bash
#
# randlines.sh: Randomize lines in standard input
#
# Uses $RANDOM shell variable (found at the Advanced BASH
# Shell Scripting Guide).
while read -r myline # Read one line of stdin at a time (-r keeps backslashes literal).
do
  echo "$RANDOM $myline" # Prefix each line with a random number.
done
Using either the ''head'' or ''tail'' command, create a variant of ''randlines.sh'' called ''randline.sh'' that outputs just one line at random from standard input.
Note: We are just re-implementing the functionality of the ''shuf'' command which randomizes lines of files and stdin.
==== 4. wordfreq.sh ====
Create a script called ''wordfreq.sh'' to print the number of occurrences of all words in a //file or standard input//. Output must be sorted descending by number of occurrences.
Sample output if input is https://www.gutenberg.org/cache/epub/11231/pg11231.txt:
738 the
519 i
508 to
472 of
434 and
387 a
305 in
210 his
204 that
193 was
191 my
189 he
169 you
162 not
150 with
146 it
141 me
139 him
121 bartleby
...
We want ''wordfreq.sh'' to be able to handle both STDIN and files given as arguments. So, it should be able to do something like
fortune | wordfreq.sh # process STDIN with wordfreq.sh
and also
wordfreq.sh input.txt # wordfreq.sh an input file
(and also)
wordfreq.sh input*.txt # wordfreq.sh multiple input files together
[[https://cssegit.monmouth.edu/jchung/csse370repo/-/blob/main/scripts/wordfreq|Link to wordfreq.sh code]]
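The heart of such a script is usually a pipeline like the one below (a sketch of the core idea, not necessarily the linked solution): split input into one word per line with tr, lowercase, then sort and count:

```shell
printf 'The cat and the dog and the bird\n' |
tr -cs 'A-Za-z' '\n' |  # squeeze every run of non-letters into one newline
tr 'A-Z' 'a-z' |        # lowercase everything
sort | uniq -c | sort -rn
#       3 the
#       2 and
#       1 dog
#       1 cat
#       1 bird
```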
==== 5. makeuserids.sh ====
Study the [[https://wikiless.tiekoetter.com/wiki/Cut_(Unix) | cut]] text processing command. Apply ''cut'' to a file containing this list of names:
Wehman, John
Wehner, Monk
Weid, Kahn
Weigner, Ray
Weimann, Joseph
Weimmer, Nottingham
Weinberg, John
Weiner, Stephanie
Weiner, Joseph
Weinert, Molly
Weingarten, Joyce
Weinraub, John
Use ''cut'' to extract the first letters of the first names, convert to lower case, and write the letters to a file called ''firstinit''.
m
j
r
k
j
...
and so on
Use ''cut'' again to extract the first 7 letters of the last names, convert to lower case, and write to a file called ''lastname''.
wehman
wehner
weid
weigner
weimann
...
and so on
Study the [[https://wikiless.tiekoetter.com/wiki/Paste_(Unix) | paste]] text processing utility. Use ''paste'' to paste ''firstinit'' and ''lastname'' together, eliminating any spaces.
mwehman
jwehner
rweid
kweigner
jweimann
...
and so on
and redirect the result to a file called ''userids''.
Write a script ''makeuserids.sh'' to perform the above tasks on an input file.
[[https://cssegit.monmouth.edu/jchung/csse370repo/-/blob/main/scripts/makeuserids.sh|Link to makeuserids.sh code]] |
[[https://cssegit.monmouth.edu/jchung/csse370repo/-/blob/main/scripts/makeuserids-ps.sh| makeuserids-ps.sh]] (alternative version that uses process substitution)
==== 6. grep context pipeline ====
Write a pipeline to turn the following input (saved in a file called 'servers'):
# comment blah
bigblah
{
blah
{
host MA-FXDWF-14
{
hardware ethernet 00:13:21:5C:11:16;
fixed-address 192.168.19.29;
}
host MA-FXDWF-15
{
hardware ethernet 00:13:21:5D:12:17;
fixed-address 192.168.19.30;
}
host MA-FXDWF-16
{
hardware ethernet 00:13:21:5E:13:18;
fixed-address 192.168.19.31;
}
...
...
# repeats 4000 times
...
...
}
blah
}
into this (for import into a spreadsheet):
MA-FXDWF-14???00:13:21:5C:11:16???192.168.19.29
MA-FXDWF-15???00:13:21:5D:12:17???192.168.19.30
MA-FXDWF-16???00:13:21:5E:13:18???192.168.19.31
...
...
* Solution #1:
grep -A 3 "host" servers | # find lines that contain "host", list 3 lines following each matching line
tr -d '\n' | # delete new lines to put everything on one line
sed "s/--/\n/g" | # insert a new line where "--" occurs ("--" separates the grep matches)
awk '{ print $2, $6, $8 }' | # print 2nd, 6th and 8th tokens, using default awk delimiter
tr -d ';' | # delete semicolons
sed "s/ /???/g" # replace single spaces with ???
==== 6. roster processing ====
Download a class ''[[https://piazza.com/class_profile/get_resource/lwd125nsggo6gv/lwto3basqdivx|roster.txt]]''. Using ''sed'' search and replace operations, convert the raw ''roster.txt'' file to a list with the following format:
Lastname-Firstname:StudentID
The list would be even better if Lastname and Firstname were both lower case, like this:
lastname-firstname:StudentID
* Solution #1:
cat roster.txt |
awk -F ", " '{ print $1"-"$2":"$3 }' | # Using ", " as delimiter, extract and print last"-"first":"id
sed "s/ [A-Z]\.//" | # Search for and delete middle initials (space, uppercase letter, period)
tr A-Z a-z # Convert all to lowercase
==== 7. webadvisor2roster.sh ====
In the script ''webadvisor2roster.sh'' take a [[https://piazza.com/class_profile/get_resource/lwd125nsggo6gv/lx3o4zuutdk3nt|roster from webadvisor]] and transform it into
Last, First [MI], ID
format, writing to the file ''roster''.
[[https://cssegit.monmouth.edu/jchung/csse370repo/-/blob/main/scripts/webadvisor2roster|Link to webadvisor2roster.sh code]]
==== (SKIP) 8. randomseating (SKIP) ====
In the script ''randomseating'', combine last names from a ''[[https://cssegit.monmouth.edu:2443/jchung/csse370/blob/master/misc/roster|roster]]'' (see 7. above) and a ''[[https://cssegit.monmouth.edu:2443/jchung/csse370/blob/master/misc/seats|seats]]'' file to randomize seating in HH 305.
[[https://cssegit.monmouth.edu:2443/jchung/csse370/blob/master/scripts/randomseating|Link to randomseating code]]
[[https://cssegit.monmouth.edu:2443/jchung/csse370/blob/master/scripts/randomseating-v2|Link to randomseating-v2 code]] **(preferred)**
==== 9. Sum the points in quiz1 ====
Sum and display the total points in [[https://piazza.com/class_profile/get_resource/lwd125nsggo6gv/lx0zewgfjjj5zl|the quiz 1 file]].
* Solution
expression=$(cat csse370-su24-quiz1.txt | grep "[0-9] point" | sed "s/[^0-9]//g" | tr '\n' '+' | sed "s/+$//")
answer=$(( expression ))
echo $answer
or
echo $(( $(cat csse370-su24-quiz1.txt | grep "[0-9] point" | sed "s/[^0-9]//g" | tr '\n' '+' | sed "s/+$//") ))
or
# use bc, a command line calculator
echo $(cat csse370-su24-quiz1.txt | grep "[0-9] point" | sed "s/[^0-9]//g" | tr '\n' '+' | sed "s/+$//") | bc
#
# pipeline breakdown
#
grep "[0-9] point" | # find lines that contain "n point(s)"
sed "s/[^0-9]//g" | # delete all non-digit chars, leaving only a column of numbers
tr '\n' '+' | # put all on single line, separated by "+"
sed "s/+$//" # delete last "+" at end
==== 10. Sort a string ====
Sort the following string from a [[https://scavtestthree.wordpress.com/2017/09/25/first-blog-post|scavenger hunt challenge]]:
22fl6abbz7yaabcdeezez99178
See the [[https://en.wikipedia.org/wiki/Fold_(Unix)|fold]] core text processing utility.
* Solution
echo 22fl6abbz7yaabcdeezez99178 |
fold -w 1 | # lines can be only 1 char wide (print string vertically)
sort |
tr -d '\n' # remove newlines to return to horizontal
==== 11. text2png.sh ====
Write the text2png script that turns standard input into a large wallpaper-type image file.
This will be a fairly long shell script that demonstrates:
* using functions
* handling standard input into a script
* handling script command line options
[[https://cssegit.monmouth.edu/jchung/csse370repo/-/blob/main/scripts/text2png|Link to text2png.sh code]] | [[https://cssegit.monmouth.edu/jchung/csse370repo/-/blob/main/scripts/text2png_getopts|text2png_getopts.sh]] (alternate version that uses getopts)
----