====== Introduction to Python ======

===== Python =====

  * Creator: Guido van Rossum
  * Introduced: 1991
  * Open source
  * Comes standard on many UNIX/Linux systems, macOS
    * Note: On older macOS and Linux releases the system Python may still be version 2.x, which should not be used for this course.
  * Windows and macOS installers obtainable from [[https://www.python.org/downloads/ | python.org]] and [[http://activestate.com | activestate.com]]
    * **Do not install Python from the Windows Store.**
    * On Windows, be sure to select the **"Add Python 3.x to PATH"** option when installing.
  * Used for system programming/administration, web programming/development, network programming (the original BitTorrent client was written in Python), GUI development, games, data science

===== Python Uses =====

  * CGI/web application programming
    * [[https://wiki.python.org/moin/WebFrameworks|Web application frameworks]]
    * [[http://highscalability.com/blog/2012/3/26/7-years-of-youtube-scalability-lessons-in-30-minutes.html|YouTube]]
  * Exploit development/testing
  * Application scripting
  * System maintenance/installation
    * Red Hat/Fedora Linux's [[http://en.wikipedia.org/wiki/Anaconda_(installer) | Anaconda]] installer and [[http://en.wikipedia.org/wiki/Yellow_dog_Updater%2C_Modified | Yum]] package management system
  * GUIs
    * [[http://wiki.python.org/moin/TkInter | Tkinter (Tk for Python)]]
    * [[http://www.pygtk.org/ | PyGTK]]
  * Games
    * [[http://pygame.org | Pygame]]
  * Data science
    * [[https://en.wikipedia.org/wiki/Anaconda_(Python_distribution) | Anaconda]]

===== Python Online Resources =====

  * [[http://www.python.org|Official site]]
  * [[https://docs.python.org/3/tutorial/index.html|The Python Tutorial]]
  * [[http://docs.python.org/library/|The Python Standard Library]] (//Useful//)
  * [[http://docs.python.org/reference/index.html|The Python Language Reference]]
  * [[https://diveintopython3.net/|Dive Into Python]] (online book)

===== Python Execution (Conventional) =====

  * Programs typically given .py extension
  * Executed with //python 
prog.py// or //python3 prog.py//
  * Or put a shell-script-style "shebang" line at the top of the Python script (UNIX systems only)
    * #!/path/to/python
    * and make executable with //chmod +x prog.py//
  * Interactive shell (read-eval-print loop) by running //python// or //python3//

===== Python Execution (Jupyter Notebook) =====

  * A [[https://en.wikipedia.org/wiki/Project_Jupyter#Jupyter_Notebook|web-based interactive computational environment]] that supports Python
  * Popular for data science programming in Python
  * Notebooks are saved with the .ipynb file extension.
  * Link: [[https://realpython.com/jupyter-notebook-introduction|Installation and getting started]]
  * Can also be used in the [[https://colab.research.google.com/notebooks/intro.ipynb|Google Colab environment]] for data science/machine learning

===== Python Intro =====

==== What Python Looks Like ====

  * Do this in the python interactive shell (python3)

  >>> import calendar
  >>> cal = calendar
  >>> cal.prmonth(2017,2)
     February 2017
  Mo Tu We Th Fr Sa Su
         1  2  3  4  5
   6  7  8  9 10 11 12
  13 14 15 16 17 18 19
  20 21 22 23 24 25 26
  27 28
  >>> cal.weekday(2017,2,17)
  4
  >>> cal.weekday(1973,11,14)
  2
  >>> cal.prmonth(1973,11)
     November 1973
  Mo Tu We Th Fr Sa Su
            1  2  3  4
   5  6  7  8  9 10 11
  12 13 14 15 16 17 18
  19 20 21 22 23 24 25
  26 27 28 29 30
  >>> # Some object introspection
  >>> # Type cal. and hit tab a couple of times
  >>> cal.
  >>> # Also, try
  >>> dir(cal)

===== Python: Everything is an Object =====

  >>> x = 'hello, world!'
  >>> y = x.upper()
  >>> y
  'HELLO, WORLD!'
  >>> def swapper(mystr):
  ...     return mystr.swapcase()    # indent mandatory
  ...
  >>> swapper(x)
  'HELLO, WORLD!'
  >>> x
  'hello, world!'
  >>> def parts(mystr,sep=','):
  ...     return mystr.partition(sep)
  ...
  >>> parts(x)
  ('hello', ',', ' world!')

===== Python: Everything is an Object (even functions) =====

  >>> def personalize(greeting, name='Joni'):
  ...     # Replaces 'world' with a given name
  ...     return greeting.replace('world', name)
  ...
  >>> x
  'hello, world!'
  >>> personalize(x, 'Joanne')
  'hello, Joanne!'
  >>> personalize(x)    # Use the default name='Joni' parameter
  'hello, Joni!'
  >>> # Python functions are "first class" (http://tiny.cc/ggh4vz)
  >>> funclist = [swapper, personalize, parts]
  >>> for func in funclist:
  ...     func(x)
  ...
  'HELLO, WORLD!'
  'hello, Joni!'
  ('hello', ',', ' world!')

===== Python Syntax Highlights - blocks & indentation =====

  * Python motto: "There should be one--and preferably only one--obvious way to do it."
    * Note the [[introduction_to_perl#perl_syntax_highlights|contrast with Perl]].
  * Indentation
    * Python uses indentation to indicate the extent of a block.
    * That makes indentation __mandatory__.
  * Blocks in some other language:

  void foo(int x)
  {
      if (x == 0) {
          bar();
          baz();
      } else {
          quo(x);
          foo(x - 1);
      }
  }

  * Blocks in Python

  def foo(x):
      if x == 0:
          bar()
          baz()
      else:
          quo(x)
          foo(x - 1)

  * Another example

  x = 1                         # block 0
  if x == 1:                    # header line:
      y = 2                     # block 1
      if y == 2:                # header line:
          print('in block2')    # block 2
      print('in block1')        # block 1
  print('in block0')            # block 0

  * Exceptions to the indentation-as-blocks rule or the "whitespace thing"

  # open list bracket [] pairs may span lines
  L = ["Good",
       "Bad",
       "Ugly"]

  # Backslashes allow line continuation
  if a == b and c == d and \
     d == e and f == g:
      print('old')

  # Parentheses allow line continuation, usually
  if (a == b and c == d and
      d == e and e == f):
      print('new')

===== Python Syntax Highlights - standard input/output =====

  * stdin/out in Perl:

  while ($myline = <STDIN>) {
      print $myline;
  }

  * Equivalent in Python: ([[python_exercises#exercise_1_-_standard_inputoutput|Exercise 1]])

  import sys
  for line in sys.stdin:
      sys.stdout.write( line )

===== Python Syntax - flow control =====

  * if/elif/else, while/else, for/else

  # Assume these assignments:
  x = 10
  y = 10
  b = 1

  # if then else
  if (b == 1):
      y = 1
  elif (x == 10):
      y = 2
  else:
      y = 3

  # while (else) loop
  while (x != 0):
      x = x - 1
      if (b == 1):
          continue      # continue with next loop repetition
      break             # break out of loop; skip else:
  else:                 # run if we didn't exit loop with 
                        # break
      x = x + 1

  # for (else) loop
  for x in range(4):    # repeats 4 times x=0..3
      y = y + 1
      if (b == 1):
          continue
      break             # break out of loop; skip else:
  else:                 # run if we didn't exit loop with break
      y = y + 1

===== Python Syntax - built-in objects =====

  * "Everything is an object"
  * Object types
    * Numbers - 3.1415, 1234
    * Strings - 'spam', "guido's"
    * Lists - [1, 2, 3, 4], ['one', 'two', 'three']
      * As in Perl, arrays are named lists
    * Dictionaries - {'food': 'spam', 'taste': 'yum'}
      * These are Python's associative arrays (hashes)
    * Tuples - (1, 'spam', 4, 'U')
      * //immutable// lists
    * Files - text = open('eggs.txt', 'r').read()
  * Strings, lists, and tuples are categorized as built-in "Sequence Types" in Python.
    * See https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range
    * Strings and Tuples are "immutable" sequence types.
      * Once they are created, they cannot be modified.
    * Lists are "mutable" sequence types.
  * Dictionaries are categorized as a built-in "Mapping Type"
    * See https://docs.python.org/3/library/stdtypes.html#mapping-types-dict

===== Python Syntax - strings =====

  * concatenating: str1 + str2
    * cannot normally mix types around "+" when concatenating
  * repeating: str2 * 3
  * indexing: str2[ i ]
  * slicing: str2[ i:j ]
  * length: len( str2 )
  * methods:
    * str2.find( 'pa' )
    * str2.replace( 'pa', 'xx' )
    * str1.split()
  * convert to string with //str// function: str( len( str2 ) )

  print("one is " + str(1))

  * See https://docs.python.org/3/library/stdtypes.html#string-methods for complete list of Python string methods
  * Do [[cs498gpl:python_exercises | Exercise 2]].
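The string operations listed above can be tried together in one short script. This is a minimal sketch: the sample values of //str1// and //str2// and all result variable names are assumptions chosen for illustration, not values from the course material.

```python
# Sample values are illustrative assumptions.
str1 = 'green eggs and spam'
str2 = 'spam'

joined = str1 + ', ' + str2      # concatenating
repeated = str2 * 3              # repeating
first = str2[0]                  # indexing (first character)
middle = str2[1:3]               # slicing (characters at index 1 and 2)
length = len(str2)               # length

position = str2.find('pa')             # index where 'pa' starts, or -1 if absent
replaced = str2.replace('pa', 'xx')    # returns a new string; str2 is unchanged
words = str1.split()                   # split on whitespace into a list

# Mixing types around "+" raises TypeError, so convert with str() first.
try:
    message = 'length is ' + length
except TypeError:
    message = 'length is ' + str(length)

print(joined, repeated, first, middle, position, replaced, words, message)
```

Because strings are immutable, //replace()// and the other methods return new strings and leave //str1// and //str2// untouched.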
===== Python Syntax - string formatting =====

  * **This is the older, printf-style approach.**
  * similar to //printf// function in other languages
  * Syntax:

  'formatting codes corresponding to list of objects, and other characters' % ( comma-separated list of objects )

  * Example:

  print("Number of %i character words: %i" % ( x, char_count[ x ] ))    # %i denotes integer

  * See https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting for string formatting operations.
  * **It is recommended that you use the newer f-strings instead.**
    * available since Python 3.6
    * https://docs.python.org/3/tutorial/inputoutput.html

===== Python Syntax - lists =====

  * Lists can be "anonymous"

  for x in [1, 2, 3]:         # list [1, 2, 3] is anonymous
      print(x)

  * The //range()// function generates a list-like, immutable sequence object of integers.

  for x in range( 0, 10 ):    # range( 0, 10 ) or just range( 10 ) yields the range 0..9
      print(x)

  * Named lists are like arrays
    * But in Python, you call them "lists", not "arrays".

  List1 = [0, 1, 2, 3]
  List2 = range( 1, 5 )        # Not a list, but a range object
  List2 = list(range(1, 5))    # Convert range to list containing [1, 2, 3, 4]

  * Lists have to be "declared" if starting as an empty list

  List3 = []                   # an empty list

===== Python Syntax - list functions =====

  * Lists are a "Mutable Sequence Type" in Python.
    * See http://docs.python.org/library/stdtypes.html#mutable-sequence-types for operations and methods that apply to mutable sequence types.
  * Size of a list

  len( List3 )

  * Concatenate lists

  list1 = list(range( 1, 5 ))
  list2 = list(range( 6, 10 ))
  list3 = list1 + list2
  print(list3)
  # output: [1, 2, 3, 4, 6, 7, 8, 9]

  * Grow list by one object

  list1.append( 4 )

  * Grow list by a list of objects

  list1.extend( [5, 6, 7] )

  * Sort, reverse
    * Beware: in-place alteration of list contents
    * Make a list copy first, if needed

  list1 = [1, 2, 3, 4]    # start again from the original four elements
  list1.reverse()
  print(list1)
  # output: [4, 3, 2, 1]
  list1.sort()
  print(list1)
  # output: [1, 2, 3, 4]

  * Shrink list by one object

  del list3[ len(list3) - 1 ]    # delete last element at last index of list3
  x = list3.pop()                # delete last element of list3 and assign val to x

===== Python Syntax - list iteration =====

  * Using Python's //for// structure

  # Method 1:
  for x in list3:
      print(x)             # print values in list3, one per line

  # Method 2:
  for x in range( len( list3 ) ):
      print(list3[ x ])    # print index-accessed values in list3, one per line

===== Python Syntax - dictionaries =====

  * See http://docs.python.org/library/stdtypes.html#mapping-types-dict
  * Python's version of Perl's hashes:

  D2 = { 'spam': 2, 'eggs': 3 }    # 2 string keys and 2 int values
  D3 = { 1: 10, 2: 14 }            # 2 int keys and 2 int values

  * Index by key

  D2[ 'eggs' ] += 1    # increment number of eggs by 1
  print(D2['eggs'])
  D3[ 1 ] += 1         # increment value at key=1 by 1
  print(D3[ 1 ])
  D3[ 5 ] = 4          # new key:value pair added to D3

===== Python Syntax - dictionary functions =====

  * get keys

  print(D2.keys())      # outputs a keys view object: dict_keys(['spam', 'eggs'])

  * get values

  print(D2.values())    # outputs a values view object: dict_values([2, 4])

  * get key:value pairs

  print(D2.items())

  * get number of key:value pairs

  print(len(D2))

  * get value or a default value if nonexistent

  print(D2.get('bacon'))        # outputs 'None' since nonexistent key
  print(D2.get('bacon', -1))    # outputs -1 since nonexistent key

  * See if a key exists

  # D3.has_key( 3 )    # Python 2 only; has_key() was removed in Python 3
  3 in D3              # Check if key 3 is in dictionary D3; False here, since D3 has no key 3
  * "merge" two dictionaries

  D3 = { 'toast':4, 'muffin':5, 'spam':1000 }
  D2.update( D3 )    # 'spam' value from D3 overwrites 'spam' value in D2

===== Python Syntax - dictionary iteration =====

  * Using Python's //for// structure

  table = {'Python': 'Guido van Rossum',
           'Perl':   'Larry Wall',
           'Tcl':    'John Ousterhout' }
  for lang in table.keys():
      print (lang, '\t', table[ lang ])

  * Note: Python dictionaries have been insertion ordered since CPython 3.6; the language guarantees this ordering from version 3.7 on.
    * Prior to that, dictionaries were unordered like Perl hashes.

===== Python Syntax - files (not stdin) =====

  * See https://docs.python.org/3/library/io.html#module-io
  * Reading

  x = open("input.txt", "r")       # open file for input and assign to object x

  # The loops and calls below are alternatives; each one consumes the file,
  # so reopen it (or call x.seek(0)) before trying the next.
  while 1:
      y = x.readline()             # read next line of x
      if (not y): break            # empty string means end of file

  for eachline in x.readlines():   # x.readlines() creates [list] of lines in x
      y = eachline

  y = x.read()                     # read entire file into a string
  y = x.readline()                 # read next line
  y = x.readlines()                # read file into list of strings

  x.close()

  * The read(), readline() and readlines() methods also apply to [[cs498gpl:introduction_to_python#python_syntax_highlights_-_standard_inputoutput | Python stdin]].
  * Writing

  file = open("output.txt", "w")
  file.write("Hello World\n")                   # write() does not add a newline itself
  file.write("This is our new text file.\n")
  file.close()

===== Python Syntax - command line arguments =====

  * //sys.argv// is the list of command line arguments, including the script name

  import sys
  x = sys.argv     # List of command line arguments
  print(x)
  print(x[ 1 ])

  output:

  $ python cmdargs.py 1 2 3
  ['cmdargs.py', '1', '2', '3']
  1

===== Python Syntax - functions =====

  def factorial(n):                      # define a function
      if (n <= 1):                       # base case; also stops the recursion for n = 0
          return (1)
      else:
          return (n * factorial(n-1))    # recursion

  x = factorial(5)                       # call a function

  * scope issues
    * to assign to a global scope variable inside a function, use //global varname// (reading a global needs no declaration)

  def accessglobal():
      global glib    # access a global scope var
      glib = 100

  glib = 0
  accessglobal()
  print("glib is %i after call to accessglobal()" % glib)

----

===== Python & regular expressions =====

  * See http://docs.python.org/library/re.html

  # python_re.py

  import re

  text = 'I am a string'    # named text so the built-in str() is not shadowed

  # store RE in a RE pattern object:
  regex = re.compile(r'string$')

  #
  # matching:
  #
  # finds first instance of regex using module's (re) search function:
  if re.search(r'string$', text):
      print("text ends with 'string'")

  # finds first instance of regex using RE pattern object:
  if regex.search(text):
      print("text ends with 'string'")

  # finds all instances of regex (returns list of matching substrings)
  if regex.findall(text):    # findall being used as an if condition
      print("text ends with 'string'")

  print(re.findall(r'[AEIOUaeiou]', text))    # findall being used normally

  #
  # search/replace:
  #
  # replaces all instances of ' a ' with ' another ' in text;
  # so the default is global search and replace:
  text = re.sub(r' a ', ' another ', text)
  print("text is now: " + text)

  # count=1 replaces only the first instance of ' a ' with ' another ' in text:
  text = re.sub(r' a ', ' another ', text, count=1)

  #
  # split:
  #
  pattern = re.compile(r'\W+')    # \W matches any non-word character (not a letter, digit or underscore)
  print(pattern.split('This is a test, short and sweet, of split().'))
  # output: ['This', 
  #          'is', 'a', 'test', 'short', 'and', 'sweet', 'of', 'split', '']

  print(pattern.split('This is a test, short and sweet, of split().', 3))
  # At most 3 splits are performed, and the rest of the string is left unsplit.
  # output: ['This', 'is', 'a', 'test, short and sweet, of split().']

  # Can use re module split method without using a pattern object:
  print(re.split(r'\W+', 'This is a test, short and sweet, of split().'))

  #
  # regexp groups (back references in Perl):
  #
  mystring = "abcdefg"
  mygroups = re.search( r'(a.)(c.)(e.)(g)', mystring )
  # mygroups is a "match object" (See http://docs.python.org/library/re.html#match-objects)
  print(mygroups.group( 0 ))    # group( 0 ) is the whole match, here all of mystring, i.e., abcdefg
  print(mygroups.group( 1 ))    # group( 1 ) is ab
  print(mygroups.group( 2 ))
  print(mygroups.group( 3 ))
  print(mygroups.group( 4 ))    # group( 4 ) is g

  # The match object .group() method returns regular strings:
  mynewstring = mygroups.group(4) + mygroups.group(3) + mygroups.group(2)
  print(mynewstring)

----

===== Python modules and sys.path =====

  * The Python interpreter imports modules from the Python library search path, defined in sys.path.

  import sys
  print(sys.path)

  * Python looks in several places when you try to import a module
    * //sys.path// is a standard Python list, which can be modified with standard list methods.
  * Python modules generally end in .py or .pyc (if compiled)
    * Some are built into Python, such as the sys module
    * Not all modules are written in Python; some are written in C for greater speed.
  * One can add a new path, e.g. ///export/home/hawkdom2/jchung/lib/python/test//, to Python's search path at runtime by appending the path name to sys.path
    * Python will then also look in that path for modules, whenever you try to import a module.
    * The effect lasts as long as Python is running.
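To see where an importable module actually comes from, one option is //importlib.util.find_spec()// from the standard library. The following is a minimal sketch, assuming a standard CPython installation. The helper name //module_origin// and the choice of //sys// and //json// as examples are ours, and the file paths printed vary by system.

```python
import importlib.util

def module_origin(name):
    """Return where Python would load the named top-level module from."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None          # not found anywhere on the search path
    return spec.origin       # 'built-in', or a file path

sys_origin = module_origin('sys')      # compiled into the interpreter
json_origin = module_origin('json')    # a package written in Python
missing = module_origin('no_such_module_hopefully')   # hypothetical name

print(sys_origin)
print(json_origin)
print(missing)
```

A module compiled into the interpreter reports its origin as 'built-in', while modules written in Python report the path of a file found through //sys.path//.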
===== sys.path and user-created modules =====

  * Altering sys.path in a Python program

  import sys

  # Add my own module path to sys.path;
  # possible because sys.path is a standard Python list
  sys.path.append( '/export/home/hawkdom2/jchung/lib/python/test' )
  print(sys.path)

  * function defs in user-defined modules become attributes of those modules, called as //module.function()//

  # Import /export/home/hawkdom2/jchung/lib/python/test/mymod.py
  # where the mymod.py contains a simple function def:
  #
  # def greeting():
  #     print("This is being printed in jchung's mymod module.")
  #
  import mymod

  # Calls the greeting() function in the imported mymod module
  mymod.greeting()

  * When the Python interpreter executes //import mymod//, the module file //mymod.py// is automatically //byte compiled//.
    * A file named //mymod.pyc// or a similar name may be found in a directory called //%%__pycache__%%//.
    * Python modules can also be manually byte-compiled using the command //python3 -m py_compile modname.py//.
    * Byte-compiled modules load faster than non-compiled modules; they don't execute any faster.
    * They may also be useful for code obfuscation (information hiding).
  * //sys.modules// is a run-time dictionary that contains all the modules that are loaded by a Python program.

===== Importing modules in various ways =====

  * The //import// statement
    * //import// identifies an external file to be loaded.
    * The name that is imported also becomes a variable in the program, a reference to the module object:

  import module1                     # Get module as a whole
  module1.printer('Hello world!')    # Access module member names using module name.member_name

  * The //from..import// statement
    * //from..import// imports member names from a module, so there's no need to qualify these member names:

  from module1 import printer        # Import member name from module1
  printer('Hello world!')            # Access module member names directly.
  * The //from..import *// statement
    * Special form of //from// that imports all public module member names:

  from module1 import *              # Import 'printer()' and any other names from module1
  printer('Hello world!')

===== Python packages =====

  * Packages are collections of modules, typically in a file system directory hierarchy.
  * Example: /usr/lib/python3.9/email
    * The package here is //email//
    * //email// is a subdirectory within /usr/lib/python3.9, which is inside Python 3.9's default //sys.path//
    * Inside //email// is a collection of modules (.py files and their .pyc byte-compiled counterparts) that provide Python email handling functions.
    * Also inside //email// is a special module //%%__init__.py%%// that identifies //email// as a regular package
      * Before Python 3.3, attempts to //import email.something// would fail without //%%__init__.py%%//; since 3.3, a directory without it is treated as a "namespace package".
  * Package modules and submodules are accessed using **"."** notation
    * //import email.mime.audio// imports the //audio// module from /usr/lib/python3.9/email/mime/audio.py
    * //from email import message// imports the //message// module from //email//
    * //from email import *// imports all public member names from //email//
      * For //from ... import *// to import a package's submodules, they must be named in the //%%__all__%%// list in //%%__init__.py%%//; without //%%__all__%%//, only the names defined in //%%__init__.py%%// itself are imported

----

===== Python List Comprehensions =====

  * See http://docs.python.org/tutorial/datastructures.html#list-comprehensions
  * List comprehension: Syntactic construct available in some programming languages for creating a list based on existing lists
  * Each Python list comprehension consists of an expression followed by a //for// clause, then zero or more //for// or //if// clauses.
    * The result is a new list produced by evaluating the expression in the context of the //for// and //if// clauses which follow it.
  * Examples:

  S = [2*x for x in range(101) if x**2 > 3]
  print(S)
  # Output:
  # [4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40,
  #  42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80,
  #  82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120,
  #  122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160,
  #  162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200]

  freshfruit = [' banana', ' loganberry ', 'passion fruit ']
  print([weapon.strip() for weapon in freshfruit])
  # Output: ['banana', 'loganberry', 'passion fruit']

  # multiple lists in a list comprehension:
  vec1 = [2, 4, 6]
  vec2 = [4, 3, -9]
  print([x*y for x in vec1 for y in vec2])
  # Output: [8, 6, -18, 16, 12, -36, 24, 18, -54]

  # If the expression would evaluate to a tuple, it must be parenthesized:
  vec = [2, 4, 6]
  print([(x, x**2) for x in vec])
  # Output: [(2, 4), (4, 16), (6, 36)]

  # The dict() constructor builds dictionaries directly from lists of
  # key-value pairs stored as tuples.
  # When the pairs form a pattern, list comprehensions can compactly
  # specify the key-value list.
  print(dict([(x, x**2) for x in (2, 4, 6)]))    # dict applied to a list comprehension
  # Output: {2: 4, 4: 16, 6: 36}

----
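A list comprehension is shorthand for a loop that appends to a list. The sketch below writes the //vec1//, //vec2// example both ways and then shows a dict comprehension, a closely related form that the notes above do not cover. The variable names //products_loop//, //products_comp// and //squares// are illustrative.

```python
vec1 = [2, 4, 6]
vec2 = [4, 3, -9]

# Explicit nested loops: the leftmost "for" of the comprehension is the outer loop.
products_loop = []
for x in vec1:
    for y in vec2:
        products_loop.append(x * y)

# Equivalent list comprehension.
products_comp = [x * y for x in vec1 for y in vec2]

# Dict comprehension: builds the dictionary directly,
# without dict() and an intermediate list of tuples.
squares = {x: x**2 for x in (2, 4, 6)}

print(products_loop == products_comp)
print(squares)
```

Reading the //for// clauses left to right as nested loops, outermost first, is usually the easiest way to predict the order of the resulting list.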