====== Introduction to Python ======

===== Python =====

  * Creator: Guido van Rossum
  * Introduced: 1991
  * Open source
  * Comes standard on many UNIX/Linux systems<del>, macOS</del>
    * <del>Note: The system Python on macOS and Linux is currently version 2.x and should not be used for this course.</del>
  * Windows and macOS installers obtainable from [[https://www.python.org/downloads/ | python.org]] and [[http://activestate.com | activestate.com]]
    * **Do not install Python from the Windows Store.**
    * On Windows, be sure to select the **"Add python.exe to PATH"** option when installing.
  * Used for system programming/administration, web programming/development, network programming (the original BitTorrent client was written in Python), GUI development, games, data science

===== Python Uses =====

  * CGI/web application programming
     * [[https://wiki.python.org/moin/WebFrameworks|Web application frameworks]]
     * [[http://highscalability.com/blog/2012/3/26/7-years-of-youtube-scalability-lessons-in-30-minutes.html|Youtube]]
     * Exploit development/testing
  * Application scripting
  * System maintenance/installation
    * RedHat/Fedora Linux's [[http://en.wikipedia.org/wiki/Anaconda_(installer) | Anaconda]], [[http://en.wikipedia.org/wiki/DNF_(software) | DNF]] package management system
  * GUIs
    * [[http://wiki.python.org/moin/TkInter | TkInter (Tk for Python)]]
    * [[http://www.pygtk.org/ | PyGTK]]
  * Games
    * [[http://pygame.org | Pygame]]
  * Data science
    * [[https://en.wikipedia.org/wiki/Anaconda_(Python_distribution) | Anaconda]]
===== Python Online Resources =====

  * [[http://www.python.org|Official site]]
  * [[https://docs.python.org/3/tutorial/index.html|The Python Tutorial]]
  * [[http://docs.python.org/library/|The Python Standard Library]] (//Useful//)
  * [[http://docs.python.org/reference/index.html|The Python Language Reference]]
  * [[https://diveintopython3.net/|Dive Into Python]] (online book)
===== Python Execution (Conventional) =====

  * Programs typically given .py extension
  * Executed with //python prog.py// or //python3 prog.py//
  * Or put an interpreter (shebang) line at the top of the Python script (UNIX systems only)
    * #!/path/to/python
    * and make executable with //chmod +x prog.py//
  * Interactive shell (read-eval-print loop) by running //python// or //python3//

===== Python Execution (Jupyter Notebook) =====

  * A [[https://en.wikipedia.org/wiki/Project_Jupyter#Jupyter_Notebook|web-based interactive computational environment]] that supports Python
  * Popular for data science programming in Python
  * Notebooks are saved with the .ipynb file extension.
  * Link: [[https://realpython.com/jupyter-notebook-introduction|Installation and getting started]]
    * Can also use on [[https://colab.research.google.com/notebooks/intro.ipynb|Google Colab environment]] for data science/machine learning
===== Python Intro =====

==== What Python Looks Like ====

  * Do this in the python interactive shell (python3)

<code>
>>> import calendar
>>> cal = calendar
>>> cal.prmonth(2017,2)
  
   February 2017
Mo Tu We Th Fr Sa Su
       1  2  3  4  5
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28
  
>>> cal.weekday(2017,2,17)
4
  
>>> cal.weekday(1973,11,14)
2
    
>>> cal.prmonth(1973,11)
   November 1973
Mo Tu We Th Fr Sa Su
          1  2  3  4
 5  6  7  8  9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30

>>> # Some object introspection
>>> # Type cal. and hit tab a couple of times
>>> cal.
>>> # Also, try
>>> dir(cal)
</code>

===== Python: Everything is an Object =====

<code>
>>> x = 'hello, world!'
>>> y = x.upper()
>>> y
'HELLO, WORLD!'
  
>>> def swapper(mystr):
.  .  .      return mystr.swapcase() # indent mandatory
.  .  .
>>> swapper(x)
'HELLO, WORLD!'
  
>>> x
'hello, world!'
  
>>> def parts(mystr,sep=','):
.  .  .     return mystr.partition(sep)
.  .  .

>>> parts(x)
('hello', ',', ' world!')
</code>

===== Python: Everything is an Object (even functions) =====

<code>
>>> def personalize(greeting, name='Joni'):
.  .  .     # Replaces 'world' with a given name
.  .  .     return greeting.replace('world', name)
.  .  .

>>> x
'hello, world!'
  
>>> personalize(x, 'Joanne')
'hello, Joanne!'

>>> personalize(x) # Use the default name='Joni' parameter
'hello, Joni!'


>>> # Python functions are "first class" (http://tiny.cc/ggh4vz)
  
>>> funclist = [swapper, personalize, parts]
>>> for func in funclist:
.  .  .     func(x)
.  .  .
'HELLO, WORLD!'
'hello, Joni!'
('hello', ',', ' world!')
</code>
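
  * Because functions are objects, they can also be stored in a dictionary and looked up by name. A minimal sketch; the names ''shout'', ''whisper'' and ''dispatch'' are made up for illustration and are not part of the examples above:

```python
def shout(text):
    # Return the text in upper case
    return text.upper()

def whisper(text):
    # Return the text in lower case
    return text.lower()

# Functions stored as dictionary values, keyed by a command name
dispatch = {'loud': shout, 'quiet': whisper}

results = []
for name in ('loud', 'quiet'):
    func = dispatch[name]                 # look up the function object
    results.append(func('Hello, World!')) # call it like any other function

print(results)   # ['HELLO, WORLD!', 'hello, world!']
```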

===== Python Syntax Highlights - blocks & indentation =====

  * Python motto: "There should be one--and preferably only one--obvious way to do it."
    * Note the [[introduction_to_perl#perl_syntax_highlights|contrast with Perl]].

  * Indentation
    * Python uses indentation to indicate the run of a block.
      * That makes indentation __mandatory__.
    * Blocks in some other language:

<code>
void foo(int x) {
        if (x == 0) {
		bar();
		baz();
	} else {
		quo(x);
		foo(x - 1);
	}
}
</code>

    * Blocks in Python

<code>
def foo(x):
    if x == 0:
        bar()
        baz()
    else:
        quo(x)
        foo(x - 1)
</code>

    * Another example

<code>
x = 1                       # block 0
if x == 1:                  # header line:
    y = 2                   # block 1
    if y == 2:              # header line:
        print('in block2')  # block 2
    print('in block1')      # block 1
print('in block0')          # block 0
</code>

    * Exceptions to the indentation-as-blocks rule or the "whitespace thing"

<code>
# open list bracket [] pairs may span lines
L = ["Good",
     "Bad",
     "Ugly"]

# Backslashes allow line continuation
if a == b and c == d and \
   d == e and f == g:
   print('old')

# Parentheses allow line continuation, usually
if (a == b and c == d and
    d == e and e == f):
    print('new')
</code>

===== Python Syntax Highlights - standard input/output =====

  * stdin/out in Perl:

<code>
while ($myline = <STDIN>) {
	print $myline;
}
</code>

  * Equivalent in Python: ([[python_exercises#exercise_1_-_standard_inputoutput|Exercise 1]])

<code>
import sys
for line in sys.stdin:
	sys.stdout.write( line )
</code>

===== Python Syntax - flow control =====

  * if/elif/else, while/else, for/else

<code>
# Assume these assignments:
x = 10
y = 10
b = 1


# if then else
if (b == 1):
   y = 1
elif (x == 10):
   y = 2
else:
   y = 3


# while (else) loop
while (x != 0):
   x = x - 1
   if (b == 1): continue # continue with next loop repetition
   break                 # break out of loop; skip else:
else:                    # run if we didn't exit loop with break
   x = x + 1


# for (else) loop
for x in range(4):      # repeats 4 times x=0..3
   y = y + 1
   if (b == 1): continue
   break                # break out of loop; skip else:
else:                   # run if we didn't exit loop with break
   y = y + 1
</code>
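
  * The //else// clause of a loop is probably most useful in searches: it runs only when the loop finishes without hitting //break//. A small sketch; the ''find_first_negative'' helper is hypothetical:

```python
def find_first_negative(numbers):
    # Return the first negative number, or None if there is none
    for n in numbers:
        if n < 0:
            found = n
            break            # leave the loop; the else: block is skipped
    else:
        found = None         # loop ran to completion without break
    return found

print(find_first_negative([3, 1, -4, 1, -5]))   # -4
print(find_first_negative([3, 1, 4]))           # None
```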

===== Python Syntax - built-in objects =====

  * "Everything is an object"
    * Object types
      * Numbers -       3.1415, 1234
      * Strings -       'spam', "guido's"
      * Lists -         [1, 2, 3, 4], ['one', 'two', 'three']
        * As in Perl, arrays are named lists
      * Dictionaries -  {'food': 'spam', 'taste': 'yum'}
        * These are Python's associative arrays (hashes)
      * Tuples -        (1, 'spam', 4, 'U')
        * //immutable// lists
      * Files -         text = open('eggs.txt', 'r').read()

  * Strings, lists, and tuples are categorized as built-in "Sequence Types" in Python.
    * See https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range
      * Strings and Tuples are "immutable" sequence types.
        * Once they are created, they cannot be modified.
      * Lists are "mutable" sequence types.
  * Dictionaries are categorized as a built-in "Mapping Type"
    * See https://docs.python.org/3/library/stdtypes.html#mapping-types-dict
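
  * The difference between mutable and immutable types can be seen directly. A short sketch with made-up variable names:

```python
letters = ['a', 'b', 'c']     # list: mutable
letters[0] = 'z'              # in-place change is allowed

point = (1, 2)                # tuple: immutable
try:
    point[0] = 10             # item assignment is not supported
except TypeError as err:
    tuple_error = str(err)    # keep the error message for inspection

word = 'spam'                 # string: immutable
shouted = word.upper()        # string methods return a new string

print(letters)         # ['z', 'b', 'c']
print(word, shouted)   # spam SPAM
```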

===== Python Syntax - strings =====

  * concatenating: str1 + str2
    * cannot normally mix types around "+" when concatenating
  * repeating:     str2 * 3
  * indexing:      str2[ i ]
  * slicing:       str2[ i:j ]
  * length:        len( str2 )
  * methods:
    * str2.find( 'pa' )
    * str2.replace( 'pa', 'xx' )
    * str1.split()
  * convert to string with //str// function: str( len( str2 ) )
<code>
print("one is " + str(1))
</code>

  * See https://docs.python.org/3/library/stdtypes.html#string-methods for complete list of Python string methods
  * Do [[cs498gpl:python_exercises | Exercise 2]].
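
  * The string operations listed above, tried on two sample strings (the values of ''str1'' and ''str2'' are assumptions chosen for illustration):

```python
str1 = 'eggs and toast'
str2 = 'spam'

print(str1 + ' ' + str2)          # eggs and toast spam
print(str2 * 3)                   # spamspamspam
print(str2[0])                    # s
print(str2[1:3])                  # pa
print(len(str2))                  # 4
print(str2.find('pa'))            # 1
print(str2.replace('pa', 'xx'))   # sxxm
print(str1.split())               # ['eggs', 'and', 'toast']
```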

===== Python Syntax - string formatting =====

  * **This is old.**
  * similar to //printf// function in other languages
  * Syntax:

<code>
'formatting codes corresponding to list of objects, and other characters' % ( comma-separated list of objects )
</code>

    * Example:

<code>
print("Number of %i character words: %i" % ( x, char_count[ x ] ))      # %i denotes integer
</code>

    * See https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting for string formatting operations.

  * **It is recommended that you use the newer f-strings instead.**
    * available since Python 3.6
    * https://docs.python.org/3/tutorial/inputoutput.html
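    * A brief comparison of the two styles; the ''char_count'' data here is made up:

```python
x = 5
char_count = {5: 12}          # made-up data: 12 words of length 5

# Old printf-style formatting
old_style = "Number of %i character words: %i" % (x, char_count[x])

# f-string: expressions go directly inside {}
new_style = f"Number of {x} character words: {char_count[x]}"

print(new_style)              # Number of 5 character words: 12
print(f"{3.14159:.2f}")       # 3.14 (format spec after the colon)
```
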
===== Python Syntax - lists =====

  * Lists can be "anonymous"
<code>
for x in [1, 2, 3]:   # list [1, 2, 3] is anonymous
   print(x)
</code>

  * The //range()// function generates a list-like iterable sequence type object of values, most frequently numbers.

<code>
for x in range( 0, 10 ): # range( 0, 10 ) or just range( 10 ) yields the range 0..9
   print(x)
</code>

  * Named lists are like arrays
    * But in Python, you call them "lists", not "arrays".

<code>
List1 = [0, 1, 2, 3]
List2 = range( 1, 5 )     # Not a list, but a range object
List2 = list(range(1, 5)) # Convert range to list containing [1, 2, 3, 4]
</code>

    * Lists have to be "declared" if starting as an empty list

<code>
List3 = []   # an empty list
</code>

===== Python Syntax - list functions =====

  * Lists are a "Mutable Sequence Type" in Python.
    * See http://docs.python.org/library/stdtypes.html#mutable-sequence-types for operations and methods that apply to mutable sequence types.

  * Size of a list

<code>
len( List3 )
</code>

  * Concatenate lists

<code>
list1 = list(range( 1, 5 ))
list2 = list(range( 6, 10 ))
list3 = list1 + list2
print(list3)   # output: [1, 2, 3, 4, 6, 7, 8, 9]
</code>

  * Grow list by one object

<code>
list1.append( 4 )
</code>

  * Grow list by a list of objects

<code>
list1.extend( [5, 6, 7] )
</code>

  * Sort, reverse
    * Beware: in-place alteration of list contents
      * Make a list copy first, if needed
<code>
list1 = [1, 2, 3, 4]   # start again from a known list
list1.reverse()
print(list1)   # output: [4, 3, 2, 1]
list1.sort()
print(list1)   # output: [1, 2, 3, 4]
</code>

  * Shrink list by one object

<code>
del list3[ len(list3) - 1 ]   # delete last element at last index of list3

x = list3.pop()   # delete last element of list3 and assign val to x
</code>
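
  * Since //sort()// and //reverse()// alter the list in place, one way to keep the original is to work on a copy or to use the built-in //sorted()// and //reversed()// functions. A sketch with a made-up ''scores'' list:

```python
scores = [3, 1, 4, 2]

ordered = sorted(scores)             # new sorted list; scores is unchanged
backwards = list(reversed(scores))   # new reversed list; scores is unchanged

copy_of_scores = scores.copy()       # shallow copy of the list
copy_of_scores.sort()                # sorts only the copy

print(scores)      # [3, 1, 4, 2]
print(ordered)     # [1, 2, 3, 4]
print(backwards)   # [2, 4, 1, 3]
```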

===== Python Syntax - list iteration =====

  * Using Python's //for// structure
<code>
# Method 1:
for x in list3:
   print(x)   # print values in list3, one per line

# Method 2:
for x in range( len( list3 ) ):
   print(list3[ x ])   # print index-accessed values in list3, one per line
</code>
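
  * When both the index and the value are needed, the built-in //enumerate()// function is a common alternative to Method 2. A small sketch (the contents of ''list3'' here are made up):

```python
list3 = ['a', 'b', 'c']

# enumerate() yields (index, value) pairs
for index, value in enumerate(list3):
    print(index, value)

pairs = list(enumerate(list3))
print(pairs)   # [(0, 'a'), (1, 'b'), (2, 'c')]
```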

===== Python Syntax - dictionaries =====

  * See http://docs.python.org/library/stdtypes.html#mapping-types-dict

  * Python's version of Perl's hashes:

<code>
D2 = { 'spam': 2, 'eggs': 3 }   # 2 string keys and 2 int values
D3 = { 1: 10, 2: 14 }           # 2 int keys and 2 int values
</code>

  * Index by key

<code>
D2[ 'eggs' ] += 1   # increment number of eggs by 1
print(D2['eggs'])

D3[ 1 ] += 1        # increment value at key=1 by 1
print(D3[ 1 ])

D3[ 5 ] = 4         # new key:value pair added to D3
</code>

===== Python Syntax - dictionary functions =====

  * get keys

<code>
print(D2.keys())    # outputs a keys view object: dict_keys(['spam', 'eggs'])
</code>

  * get values

<code>
print(D2.values())  # outputs a values view object: dict_values([2, 4])
</code>

  * get key:value pairs

<code>
print(D2.items())
</code>

  * get number of key:value pairs

<code>
print(len(D2))
</code>

  * get value or a default value if nonexistent

<code>
print(D2.get('bacon'))       # outputs 'None' since nonexistent key
print(D2.get('bacon', -1))   # outputs -1 since nonexistent key
</code>

  * See if a key exists

<code>
# D3.has_key( 3 ) worked in Python 2 only; has_key() was removed in Python 3
3 in D3           # False, since the key 3 does not exist in D3
</code>

  * "merge" two dictionaries

<code>
D3 = { 'toast':4, 'muffin':5, 'spam':1000 }
D2.update( D3 )   # 'spam' value from D3 overwrites 'spam' value in D2
</code>
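
  * A typical use of //get()// with a default value is counting occurrences. A minimal sketch with a made-up word list:

```python
words = ['spam', 'eggs', 'spam', 'toast', 'spam']

counts = {}
for word in words:
    # get() returns 0 the first time a word is seen
    counts[word] = counts.get(word, 0) + 1

print(counts)   # {'spam': 3, 'eggs': 1, 'toast': 1}
```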

===== Python Syntax - dictionary iteration =====

  * Using Python's //for// structure
<code>
table = {'Python': 'Guido van Rossum',
	'Perl':   'Larry Wall',
	'Tcl':    'John Ousterhout' }

for lang in table.keys():
   print (lang, '\t', table[ lang ])
</code>

  * Note: Python dictionaries are guaranteed to preserve insertion order since version 3.7 (CPython 3.6 already did so as an implementation detail).
    * Prior to that, dictionaries were unordered like Perl hashes.

===== Python Syntax - files (not stdin) =====

  * See https://docs.python.org/3/library/io.html#module-io

  * Reading
<code>
x = open("input.txt", "r")             # open file for input and assign to object x

while True:
   y = x.readline()                    # read next line of x
   if not y:                           # empty string means end of file
      break

for eachline in x.readlines():         # x.readlines() creates [list] of lines in x
   y = eachline

y = x.read()                           # read entire file into a string
y = x.readline()                       # read next line
y = x.readlines()                      # read file into list of strings

x.close()
</code>

  * The read(), readline() and readlines() methods also apply to [[cs498gpl:introduction_to_python#python_syntax_highlights_-_standard_inputoutput | Python stdin]].

  * Writing

<code>
file = open("output.txt", "w") 

file.write("Hello World") 
file.write("This is our new text file.")

file.close() 
</code>

  * The ''with'' statement for resource management (including files)
    * ensures that files are closed properly even if exceptions occur
      * All file operations must occur within the ''with'' block.
    
<code>
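with open('input.txt', 'r') as input_file, open('output.txt', 'w') as output_file:
    text = input_file.read()   # named text, since input would shadow the built-in input()
    # ...
    # process text into output
    # ...
    output = text              # placeholder: copy the text unchanged
    output_file.write(output)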
</code>
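
  * A file object can also be iterated line by line inside a ''with'' block. The sketch below assumes a hypothetical ''number_lines'' helper and uses a temporary directory so that no real files are touched:

```python
import os
import tempfile

def number_lines(in_path, out_path):
    # Copy in_path to out_path, prefixing each line with its line number
    with open(in_path, 'r') as input_file, open(out_path, 'w') as output_file:
        for number, line in enumerate(input_file, start=1):
            output_file.write(f"{number}: {line}")

with tempfile.TemporaryDirectory() as tmpdir:
    in_path = os.path.join(tmpdir, 'input.txt')
    out_path = os.path.join(tmpdir, 'output.txt')

    with open(in_path, 'w') as f:        # create a small input file
        f.write("first\nsecond\n")

    number_lines(in_path, out_path)

    with open(out_path, 'r') as f:       # read the result back
        numbered = f.read()

print(numbered, end='')
```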

===== Python Syntax - command line arguments =====

  * //sys.argv// is the list of command line arguments, including the script name

<code>
import sys

x = sys.argv   # List of command line arguments
print(x)

print(x[ 1 ])


# Output:
# $ python cmdargs.py 1 2 3
# ['cmdargs.py', '1', '2', '3']
# 1
#
# Note: All elements in sys.argv are strings, even args that contain only digits
</code>
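
  * Because the arguments arrive as strings, numeric arguments have to be converted before doing arithmetic. A sketch with a hypothetical ''sum_args'' function; a sample list stands in for //sys.argv// so the example does not depend on how it is run:

```python
def sum_args(argv):
    # argv[0] is the script name; the remaining items are strings
    total = 0
    for arg in argv[1:]:
        total += int(arg)      # convert each string to an integer
    return total

# In a real script this would be called as sum_args(sys.argv)
sample_argv = ['cmdargs.py', '1', '2', '3']
print(sum_args(sample_argv))   # 6
```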

===== Python Syntax - functions =====

<code>
def factorial(n):                      # define a function
   if n <= 1:                          # base case (also covers n == 0)
      return 1
   else:
      return n * factorial(n - 1)      # recursion

x = factorial(5)                       # call a function
</code>

  * scope issues
    * to assign to a global scope var inside a function, declare it with //global varname// (reading a global needs no declaration)

<code>
def accessglobal():
   global glib                         # access a global scope var
   glib = 100

glib = 0
accessglobal()
print("glib is %i after call to accessglobal()" % glib)
</code>
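
  * The three cases (reading a global, assigning without //global//, assigning with //global//) can be compared side by side. A sketch; the function and variable names are made up:

```python
counter = 0

def read_only():
    return counter + 1        # reading a global needs no declaration

def shadow():
    counter = 99              # creates a new local; the global is untouched
    return counter

def increment():
    global counter            # required to rebind the global name
    counter += 1

shadow()
increment()
print(counter)       # 1
print(read_only())   # 2
```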

----
----

===== Python & regular expressions =====

  * See http://docs.python.org/library/re.html

<code>
# python_re.py

import re

text = 'I am a string'   # not named str, which would shadow the built-in str()

# store RE in a RE pattern object:
regex = re.compile(r'string$')

#
# matching:
#
# finds first instance of regex using module's (re) search function:
if re.search(r'string$', text):
    print("text ends with 'string'")

# finds first instance of regex using RE pattern object:
if regex.search(text):
    print("text ends with 'string'")

# finds all instances of regex (returns list of matching substrings)
if regex.findall(text): # findall being used as an if condition
    print("text ends with 'string'")

print(re.findall(r'[AEIOUaeiou]', text)) # findall being used normally


#
# search/replace:
#
# replaces all instances of ' a ' with ' another ' in text;
# so the default is global search and replace:
text = re.sub(r' a ', ' another ', text)
print("text is now: " + text)

# count=1 replaces only the first instance of ' a ' with ' another ' in text:
text = re.sub(r' a ', ' another ', text, count=1)


#
# split:
#
pattern = re.compile(r'\W+')  # \W matches any non-word character (not a letter, digit, or underscore);

print(pattern.split('This is a test, short and sweet, of split().'))
# output: ['This', 'is', 'a', 'test', 'short', 'and', 'sweet', 'of', 'split', '']

print(pattern.split('This is a test, short and sweet, of split().', 3))
# At most 3 splits are performed, and the rest of the string is left unsplit.
# output: ['This', 'is', 'a', 'test, short and sweet, of split().']

# Can use re module split method without using a pattern object:
print(re.split(r'\W+', 'This is a test, short and sweet, of split().'))


#
# regexp groups (back references in Perl):
#
mystring = "abcdefg"
mygroups = re.search( '(a.)(c.)(e.)(g)', mystring )
# mygroups is a "match object" (See http://docs.python.org/library/re.html#match-objects)

print(mygroups.group( 0 ))   # group( 0 ) is the whole of mystring, i.e., abcdefg
print(mygroups.group( 1 ))   # group( 1 ) is ab
print(mygroups.group( 2 ))
print(mygroups.group( 3 ))
print(mygroups.group( 4 ))   # group( 4 ) is g

# The match object .group() method returns regular strings:
mynewstring = mygroups.group(4) + mygroups.group(3) + mygroups.group(2)

print(mynewstring)
</code>
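
  * Captured groups can also be referenced inside a replacement string, and groups can be given names. A sketch using a made-up date string:

```python
import re

date = '2017-02-17'

# Three numbered groups: year, month, day
match = re.search(r'(\d{4})-(\d{2})-(\d{2})', date)
print(match.groups())   # ('2017', '02', '17')

# \1, \2, \3 in the replacement refer to the captured groups
us_style = re.sub(r'(\d{4})-(\d{2})-(\d{2})', r'\2/\3/\1', date)
print(us_style)         # 02/17/2017

# Named groups are written (?P<name>...) and read with group('name')
named = re.search(r'(?P<year>\d{4})-(?P<month>\d{2})', date)
print(named.group('year'), named.group('month'))   # 2017 02
```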

----

===== Python modules and sys.path =====

  * The Python interpreter imports modules from the Python library search path, defined in sys.path.

<code>
import sys
print(sys.path)
</code>

  * Python looks in several places when you try to import a module
  * //sys.path// is a standard Python list, which can be modified with standard list methods.
  * Python modules generally end in .py or .pyc (if compiled)
    * Some are built into Python, such as the sys module
    * Not all modules are written in Python; some are written in C for greater speed.
    * One can add a new path, e.g. ///export/home/hawkdom2/jchung/lib/python/test//, to Python's search path at runtime by appending the path name to sys.path
      * Python will then also look in that path for modules, whenever you try to import a module.
      * The effect lasts as long as Python is running.

===== sys.path and user-created modules =====

  * Altering sys.path in a Python program

<code>
import sys

# Add my own module path to sys.path;
# possible because sys.path is a standard Python list
sys.path.append( '/export/home/hawkdom2/jchung/lib/python/test' )

print(sys.path)
</code>

  * function defs in user-defined modules become attributes of those modules, called as //module.function()//

<code>
# Import /export/home/hawkdom2/jchung/lib/python/test/mymod.py
# where the mymod.py contains a simple function def:
#
#    def greeting():
#       print("This is being printed in jchung's mymod module.")
#
import mymod

# Calls the greeting() function in the imported mymod module
mymod.greeting()
</code>

  * When the Python interpreter executes //import mymod//, the module file //mymod.py// is automatically //byte compiled//.
    * A file named //mymod.pyc// or a similar name may be found in a directory called //<nowiki>__pycache__</nowiki>//.
    * Python modules can also be manually byte-compiled using //python3 -m py_compile modname.py//.
    * Byte-compiled modules load faster than non-compiled modules; they don't execute any faster.
      * They may also be useful for code obfuscation (information hiding).
  * //sys.modules// is a run-time dictionary that contains all the modules that are loaded by a Python program.
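
  * The whole sequence (create a module, extend //sys.path//, import, check //sys.modules//) can be tried without touching any real directory. This is only a sketch: the module name ''demo_mymod'' is hypothetical and the module is written into a temporary directory:

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Write a tiny module file into the temporary directory
    with open(os.path.join(tmpdir, 'demo_mymod.py'), 'w') as f:
        f.write("def greeting():\n    return 'Hello from demo_mymod'\n")

    sys.path.append(tmpdir)          # make the directory importable
    importlib.invalidate_caches()    # make sure the new file is noticed
    import demo_mymod

    message = demo_mymod.greeting()
    loaded = 'demo_mymod' in sys.modules   # the module is now registered
    sys.path.remove(tmpdir)          # undo the change to sys.path

print(message)   # Hello from demo_mymod
print(loaded)    # True
```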

===== Importing modules in various ways =====

  * The //import// statement
    * //import// identifies an external file to be loaded.
      * The name that is imported also becomes a variable in the program, a reference to the module object:

<code>
import module1                   # Get module as a whole
module1.printer('Hello world!')  # Access module member names using module name.member_name
</code>

  * The //from..import// statement
    * //from..import// imports member names from a module, so there's no need to qualify these member names:

<code>
from module1 import printer      # Import member name from module1
printer('Hello world!')          # Access module member names directly.
</code>

  * The //from..import *// statement
    * Special form of //from// that imports all module member names:

<code>
from module1 import *          # Import 'printer()' and any other names from module1
printer('Hello world!')
</code>
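
  * Both //import// forms also accept //as// to bind a shorter or different name. A sketch using the standard //calendar// module; the alias names are arbitrary:

```python
import calendar as cal                        # module bound to the name cal
from calendar import weekday as day_of_week   # one member, renamed

print(cal.weekday(2017, 2, 17))     # 4 (Friday; Monday is 0)
print(day_of_week(1973, 11, 14))    # 2 (Wednesday)
```

  * //from module import *// is usually best kept to interactive sessions, since it makes it hard to see where a name came from.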

===== Python packages =====

  * Packages are collections of modules, typically in a file system directory hierarchy.
    * Example: /usr/lib/python3.9/email
      * The package here is //email//
      * //email// is a subdirectory within /usr/lib/python3.9, which is inside Python3.9's default //sys.path//
      * Inside //email// is a collection of modules (.py files and their .pyc byte-compiled counterparts) that provide Python email handling functions.
      * Also inside //email// is a special module //%%__init__.py%%// that identifies //email// as a package
        * Since Python 3.3, a directory without //%%__init__.py%%// can still be imported as a namespace package, but regular packages such as //email// include it.
      * Package modules and submodules are accessed using **"."** notation
        * //import email.mime.audio// imports the //audio// module from /usr/lib/python3.9/email/mime/audio.py
        * //from email import message// imports the //message// module from //email//
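        * //from email import *// imports the names that //email// makes public
          * If //%%__init__.py%%// defines an //%%__all__%%// list, exactly those names are imported; otherwise only the public names defined in //%%__init__.py%%// are imported, and submodules are not imported automatically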
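
  * The dotted notation can be tried with the standard library's //email// package. A short sketch; the message text is made up:

```python
import email.mime.text          # imports the text module inside email/mime/
from email import message       # imports the message module from email

msg = email.mime.text.MIMEText('Hello')

print(type(msg).__name__)                  # MIMEText
print(isinstance(msg, message.Message))    # True
print(hasattr(email, '__path__'))          # True: packages have a __path__
```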

----

===== Python List Comprehensions =====

  * See http://docs.python.org/tutorial/datastructures.html#list-comprehensions
  * List comprehension: Syntactic construct available in some programming languages for creating a list based on existing lists
  * Each python list comprehension consists of an expression followed by a //for// clause, then zero or more //for// or //if// clauses.
    * The result is a list resulting from evaluating the expression in the context of the //for// and //if// clauses which follow it.

  * Examples:

<code>
S = [2*x for x in range(101) if x**2 > 3]
print(S)

Output: [4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50,
		 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100,
		 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140,
		 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180,
		 182, 184, 186, 188, 190, 192, 194, 196, 198, 200]


freshfruit = ['  banana', '  loganberry ', 'passion fruit  ']
print([weapon.strip() for weapon in freshfruit])

Output: ['banana', 'loganberry', 'passion fruit']


# multiple lists in a list comprehension:
vec1 = [2, 4, 6]
vec2 = [4, 3, -9]
print([x*y for x in vec1 for y in vec2])

Output: [8, 6, -18, 16, 12, -36, 24, 18, -54]


# If the expression would evaluate to a tuple, it must be parenthesized:
vec = [2, 4, 6]
print([(x, x**2) for x in vec])

Output: [(2, 4), (4, 16), (6, 36)]


# The dict() constructor builds dictionaries directly from lists of 
# key-value pairs stored as tuples.
# When the pairs form a pattern, list comprehensions can compactly
# specify the key-value list.
print(dict([(x, x**2) for x in (2, 4, 6)])) # dict applied to a list comprehension

Output: {2: 4, 4: 16, 6: 36}
</code>
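
  * A list comprehension is shorthand for a loop that appends to a list, and the same syntax with braces builds a dictionary directly. A sketch for comparison; the names ''S_loop'' and ''squares'' are made up:

```python
# The comprehension from the first example above
S = [2*x for x in range(101) if x**2 > 3]

# The equivalent explicit loop
S_loop = []
for x in range(101):
    if x**2 > 3:
        S_loop.append(2*x)

print(S == S_loop)   # True

# Dictionary comprehension: no dict() call or list of tuples needed
squares = {x: x**2 for x in (2, 4, 6)}
print(squares)       # {2: 4, 4: 16, 6: 36}
```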

----