Python Modules, Packages & Object Oriented Approach

Python mindmap for modules, packages and object-oriented approach


1. Python Modules

1.1. Decomposition

1.1.1. breaking down code into smaller, self-contained parts

1.2. Managing code size and complexity

1.3. Think of a module as a book, folders as shelves and folder collections as libraries, where each book's chapters consist of functions, variables, classes and objects

1.4. Python standard library

1.4.1. Modules and built-in functions that come included with a Python distribution

1.4.2. Includes modules written in C that provide access to file system

1.5. Python import module

1.5.1. use import keyword + name of module

1.5.2. you can import multiple modules in single import statement by comma separating module names

1.5.2.1. e.g. import math, sys

1.5.3. an import statement can appear anywhere in the code, but it must be executed before the first use of any name from the module

1.6. Python namespace

1.6.1. An analogy is a social group where everyone is known by a unique name, perhaps making use of nicknames to ensure unique identification

1.6.2. when you import a module, you load a source file whose names become known to your code; by default they do not override any names in your code and must be accessed by prefixing the module name, e.g. math.pi (where math is a module and pi is a constant defined inside that module)

1.6.2.1. import <module>

1.6.2.1.1. all names in that module are accessible but qualifying with module name prefix is mandatory

1.6.2.2. from <module> import <name(s)>

1.6.2.2.1. all names imported are accessible without module name qualification

1.6.2.2.2. <name(s)> can be comma separated list

1.6.2.2.3. e.g. from math import sin, pi

1.6.2.2.4. overrides any pre-existing names, but equally names can be defined in your code after the import and those definitions then override

1.6.2.3. from <module> import *

1.6.2.3.1. Imports all public entities from a module (names starting with an underscore are skipped unless the module lists them in __all__)

1.6.2.3.2. Higher risk of name conflicts

1.6.2.3.3. Convenient but considered bad practice for regular code

1.6.2.4. import <module> as <alias>

1.6.2.4.1. Imports module and assigns alias, which you use in qualifying references to that module's entities

1.6.2.4.2. note that "as" is a keyword

1.6.2.4.3. after a successful aliased import, the original module name cannot be used (unless the module has also been imported under its own name)

1.6.2.5. from <module> import <name> as <alias>

1.6.2.5.1. Imports specific entity (name) with an alias

1.6.2.5.2. <name> as <alias> can repeat in single statement with comma separations
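The import forms listed above can be compared side by side. This is a minimal sketch; the alias names m and sine are arbitrary choices made for illustration.

```python
import math                      # names need the "math." prefix
from math import sin, pi         # sin and pi are usable without a prefix
import math as m                 # m is an alias for the math module
from math import sin as sine     # sine is an alias for math.sin

# "math" is still usable here only because of the plain import on the first line.
print(math.pi == pi)             # True, the same constant reached two ways
print(m.sqrt(16))                # 4.0
print(sine(0))                   # 0.0
```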

1.7. Python dir() function

1.7.1. Built in function you can use after import <module> to list alphabetically all the entities available from the imported module

1.7.2. example:

1.7.2.1.
    import math
    dir(math)

1.7.3. if you import module with an alias, then you must use alias with dir()
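A short sketch of dir() in use, assuming the math module as the subject. The filter on a leading underscore is an illustrative choice for hiding internal names.

```python
import math

names = dir(math)                                   # sorted list of attribute names
public = [n for n in names if not n.startswith("_")]
print(public[:5])                                   # first few public names
print("pi" in names)                                # True
```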

2. Python math module

2.1. sin(x)

2.1.1. sine of x

2.2. cos(x)

2.2.1. cosine of x

2.3. tan(x)

2.3.1. tangent of x

2.4. asin(x)

2.4.1. arcsine of x

2.5. acos(x)

2.5.1. arccosine of x

2.6. atan(x)

2.6.1. arctangent of x

2.7. pi

2.7.1. constant that approximates pi value

2.8. radians(x)

2.8.1. converts x from degrees to radians

2.9. degrees(x)

2.9.1. converts x from radians to degrees

2.10. e

2.10.1. constant that approximates Euler's number

2.11. exp(x)

2.11.1. e to the power of x

2.12. log(x)

2.12.1. natural logarithm of x

2.13. log(x, b)

2.13.1. logarithm of x to base b

2.14. log10(x)

2.14.1. decimal logarithm of x, more precise than log(x, 10)

2.15. log2(x)

2.15.1. binary logarithm of x, more precise than log(x, 2)

2.16. pow(x, y)

2.16.1. x to the power of y

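2.16.2. note: pow is also available as a built-in function (and as the ** operator), so there is no need to import math to use it; math.pow(x, y) differs in that it always converts its arguments to float and returns a float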

2.17. ceil(x)

2.17.1. ceiling of x (smallest integer greater than or equal to x)

2.18. floor(x)

2.18.1. floor of x (largest integer less than or equal to x)

2.19. trunc(x)

2.19.1. value of x truncated to an integer

2.19.1.1. behaves like floor on positive numbers and ceil on negative numbers

2.20. factorial(x)

2.20.1. value of x!

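2.20.1.1. x must be non-negative (factorial(0) is 1), otherwise a ValueError exception is raised

2.20.1.2. x must be a whole number, otherwise an exception is raised; recent Python versions accept only integers and reject floats, even whole-valued ones such as 5.0

2.20.1.2.1. ok: factorial(5)

2.20.1.2.2. not ok: factorial(5.5)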

2.21. hypot(x, y)

2.21.1. returns length of hypotenuse for right-angled triangle with leg lengths of x and y
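The functions listed in this section can be tried out in a few lines. This is a minimal sketch; the argument values are arbitrary, and output comments are given only where the result is exact.

```python
import math

print(math.sin(math.radians(90)))              # approximately 1.0
print(math.degrees(math.pi))                   # 180.0
print(math.log(1000, 10), math.log10(1000))    # the first value may carry a rounding error
print(math.ceil(-1.5), math.floor(-1.5), math.trunc(-1.5))   # -1 -2 -1
print(pow(2, 10), math.pow(2, 10))             # 1024 1024.0
print(math.factorial(5))                       # 120
print(math.hypot(3, 4))                        # 5.0
```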

3. Python random module

3.1. implements pseudo-random number generators

3.1.1. algorithms aren't random - they are deterministic and predictable

3.1.2. A random number generator takes a value called a seed, treats it as an input value, calculates a "random" number based on it (the method depends on a chosen algorithm) and produces a new seed value

3.1.2.1. The initial seed value determines the order in which the generated values will appear

3.1.2.2. if you set the seed value to a fixed value and make a sequence of calls to the random number generator, the "random" numbers produced by that post-seed sequence are always reproducible if you repeat with the same seed

3.1.2.3. using a number derived from the current system date and time is a commonly used source for a seed number because it always produces a different set of random numbers due to never repeating the seed

3.2. seed()

3.2.1. sets seed value

3.2.2. argument is optional; if supplied, it is typically an integer, and some other types such as strings are accepted and converted to an integer

3.2.2.1. note: even seed('hello') will work, as 'hello' string is converted to an integer

3.2.2.2. without an argument, the seed is taken from the operating system's randomness source if one is available, otherwise from the current system time

3.2.3. you don't have to set the seed explicitly before using one of the random number generator functions; in that case the generator is seeded automatically when the module is imported, in the same way as a seed() call without an argument

3.3. random()

3.3.1. returns next random float number in the range 0.0 (inclusive) to 1.0 (exclusive)

3.4. randrange(end)

3.4.1. return random integer between 0 and end minus 1

3.4.1.1. e.g. randrange(5) returns random integer between 0 and 4

3.5. randrange(begin,end)

3.5.1. return random integer between begin and end minus 1

3.6. randrange(begin,end, step)

3.6.1. return random integer between begin and end minus 1 in steps of step

3.7. randint(left,right)

3.7.1. return random integer between left and right, both ends included

3.7.1.1. e.g. randint(1,5) returns random integer between 1 and 5

3.8. choice(sequence)

3.8.1. return random element from a sequence, such as a list of numbers

3.8.1.1. if you use with a for loop and the list .remove method on each iteration, you can use this like a lottery draw

3.9. sample(sequence, elements_to_choose)

3.9.1. return a list of length elements_to_choose (a required argument) whose elements are drawn without repetition, in random order, from a sequence such as a list

3.9.2. elements_to_choose cannot exceed length of sequence, otherwise a ValueError exception is raised
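The effect of a fixed seed, and the ranges of the functions above, can be checked with a short sketch. The seed value 42 and the 1 to 49 lottery range are arbitrary choices for illustration.

```python
import random

random.seed(42)                          # fixed seed gives a reproducible sequence
first = [random.random() for _ in range(3)]
random.seed(42)                          # same seed again
second = [random.random() for _ in range(3)]
print(first == second)                   # True

print(random.randrange(5))               # one of 0, 1, 2, 3, 4
print(random.randint(1, 5))              # one of 1, 2, 3, 4, 5
print(random.choice(["a", "b", "c"]))    # one element of the list

# Lottery-style draw: sample() never picks the same position twice.
numbers = list(range(1, 50))
draw = random.sample(numbers, 6)
print(draw)
```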

4. Python platform module

4.1. think of your code executing at the top of a pyramid: 1. Code 2. Python runtime environment 3. OS 4. Hardware (device drivers, etc.)

4.1.1. opening a file, for example, is an instruction that goes from your code to the Python runtime environment, which handles the OS instruction, and the OS understands how to interact with the hardware for the required disk reads into memory, etc.

4.2. platform(aliased = False, terse = False)

4.2.1. returns info about the platform that the Python runtime is hosted on

4.2.1.1. e.g. Windows-10

4.3. machine()

4.3.1. returns generic name of processor

4.3.1.1. e.g. AMD64

4.4. processor()

4.4.1. returns real processor name if possible

4.4.1.1. e.g. Intel64 Family 6 Model 78 Stepping 3, GenuineIntel

4.5. system()

4.5.1. returns generic name of OS

4.5.1.1. e.g. Windows

4.6. version()

4.6.1. returns version of OS

4.6.1.1. e.g. 10.0.18362

4.7. python_implementation()

4.7.1. returns Python implementation

4.7.1.1. e.g. CPython

4.8. python_version_tuple()

4.8.1. returns major version, minor version and patch level as 3-element tuple

4.8.1.1. e.g. ('3', '8', '1')
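A minimal sketch that calls the platform functions listed above. The printed values depend entirely on the machine running the code, so no fixed output is shown.

```python
import platform

print(platform.platform())                 # OS name and release in one string
print(platform.machine())
print(platform.processor())
print(platform.system())
print(platform.version())
print(platform.python_implementation())

major, minor, patch = platform.python_version_tuple()
print(major, minor, patch)                 # each element is a string, not an int
```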

5. Python Standard Module index

5.1. There are many modules, which collectively make up the Python universe, and pure Python is like a single galaxy within that universe

5.2. The idea is to find specific modules for what you need to do and then learn how to use them

6. Python packages

6.1. Group together modules

7. Python creating a module

7.1. Observation based on experiment

7.1.1. module.py is empty file representing a module

7.1.2. main.py is file in same directory as module.py and includes a single line: import module (the .py extension is not part of the module name)

7.1.3. when you run main.py for first time, it produces some effects on the file system

7.1.3.1. a __pycache__ subdirectory is created

7.1.3.2. file is created inside __pycache__ subdirectory, named with following convention: <module_name>.<python_implementation>-xy.pyc, where x is major version no and y is minor version no

7.1.3.2.1. e.g. module.cpython-36.pyc

7.1.3.3. .pyc file contains compiled bytecode, ready for execution by the Python interpreter

7.1.3.3.1. makes module code faster to load next time (the execution speed of the loaded code is unchanged)

7.1.3.3.2. Python automatically tracks changes to source module and rebuilds .pyc file when required

7.2. Every module automatically gets a variable named __name__ when Python loads it

7.2.1. __name__ holds one of two values depending on execution context

7.2.1.1. when the module file is run directly as a program, __name__ inside it is '__main__'

7.2.1.2. when the module file is imported (whether you read __name__ inside it or reference it from outside as <module_name>.__name__ after executing import <module_name>), its value is '<module_name>'

7.2.1.3. this can be used to check execution context and develop appropriate conditional logic based on that

7.2.1.3.1. For example, as modules are generally collections of functions, designed for import and not to be executed as a standalone file, you might add some logic based on __name__ to print some helpful message should someone decide to execute the module file directly
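The check described above is usually written as follows. This is a sketch of a hypothetical module file; the function greet and the printed messages are invented for illustration.

```python
# contents of a hypothetical file named module.py

def greet(name):
    """Return a greeting; stands in for the module's real functions."""
    return "Hello, " + name


if __name__ == "__main__":
    # Reached only when the file is executed directly.
    print("This file is a module; import it instead of running it.")
else:
    # Reached when another file runs "import module".
    print("module imported as", __name__)
```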

8. Python module features

8.1. When you add variables in a module, there is no way in Python to keep that variable hidden or protected from unwanted changes by the module user.

8.1.1. Python module developers must trust their users not to mis-use the module variable.

8.1.2. There is a common convention to prefix "internal" variable names with an underscore "_" or double underscore "__". This is intended to communicate to the module user that it's supposed to be an internal read-only variable.

8.2. The shebang (or hashbang) line that is often added at the top of a module file begins with "#!". To Python it is just a comment, but on Unix, Linux and macOS it tells the OS how to execute the contents of the file.

8.2.1. Example is "#!/usr/bin/env python3", which would be common to see in python modules residing on Linux for example.

8.3. It is common practice to include a string literal enclosed by triple quotes """ on either side, which may well span multiple lines.

8.3.1. This explains the purpose of the module and is known as the doc-string.

8.3.1.1. Typically this will immediately follow the hashbang comment at the top of the module file.

9. Python sys module

9.1. path variable

9.1.1. holds list of paths that are searched when running the import statement

9.1.1.1. Python supports reading zip files as directories for modules, which helps save a lot of disk space

9.1.2. appending to or inserting into sys.path list variable is how you can store usable modules in different sub-directories, distinct from program files that use them

10. Python packages

10.1. Packages are a collection of related modules organised into a hierarchical sub-directory collection in the host file system

10.2. A reference to a function in a module that is nested below the top level package directory is made using dot (.) notation to separate the sub-directory references

10.2.1. example: extra.good.best.tau.funT()

10.2.1.1. extra, good and best represent hierarchical sub-directories in the package

10.2.1.2. extra is the top level directory for the package

10.2.1.3. tau is a module (filename tau.py) located inside the best sub-directory

10.2.1.4. funT() is a function located inside the tau.py module

10.3. In order for Python to recognise that a particular collection of module files represents a package, initialisation is required

10.3.1. Package initialisation is achieved by placing a file with the following name in the top level directory for the package:

10.3.1.1. __init__.py

10.3.1.1.1. if you don't require any special initialisation for the package, this file can be empty, but it must exist for a regular package (since Python 3.3, a directory without it can still be imported as a namespace package)

10.4. A common file structure for storing program files and packages is two parallel sub-directories: packages and programs

10.4.1. here is a common piece of code to have at the top of your program files given the aforementioned structure of parallel sub-directories named programs and packages

10.4.1.1.
    from sys import path
    path.append('..\\packages')

10.4.1.1.1. the double dot (..) steps back up 1 level in directory hierarchy (from programs), and the backslash is doubled because Python recognises \ as an escape character, so we must escape it for Python to treat as a sub-directory reference
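The hard-coded backslash above is specific to Windows. A sketch of a portable variant, assuming the same parallel programs and packages directories, builds the path with os.path so the correct separator is chosen for the host OS. The relative path is resolved against the current working directory.

```python
import os
from sys import path

# os.pardir is ".." and os.path.join inserts the OS-specific separator.
packages_dir = os.path.join(os.pardir, "packages")
if packages_dir not in path:
    path.append(packages_dir)
print(path[-1])
```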

11. Python exceptions

11.1. When code is syntactically correct but results in an error, two things happen: 1. Program execution is halted 2. An exception object is created

11.1.1. known as raising an exception

11.1.2. if code does not handle exception, then program will be forcibly terminated

11.1.3. Python interpreter returns name of exception in its error message when not handled

11.1.3.1. example:

11.1.3.1.1. ZeroDivisionError: division by zero

11.1.3.2. part before colon is name of exception

11.1.4. Exception handling for all risky code:

11.1.4.1.
    try:
        <try block>
    except exc1:
        <exception block for exc1>
    except exc2:
        <exception block for exc2>
    except:
        <catch all exception block>

11.2. Rules for try-except

11.2.1. the except branches are searched in the same order in which they appear in the code

11.2.2. you must not use more than one except branch with a certain exception name

11.2.3. the number of different except branches is arbitrary - the only condition is that if you use try, you must put at least one except (named or not) or a finally branch after it

11.2.4. the except keyword must not be used without a preceding try

11.2.5. if any of the except branches is executed, no other branches will be visited

11.2.6. if none of the specified except branches matches the raised exception, the exception remains unhandled

11.2.7. if an unnamed except branch exists (one without an exception name), it has to be specified as the last

11.2.8. it is also possible to extend a standard try-except block with an else branch, which if added MUST follow all except branches and will only be executed if no exception arises from the try block at the top

11.2.8.1. it is also possible to extend a try-except block with a finally branch, which if added MUST be the last branch (i.e. after all except branches and the else branch, if the latter exists), and unlike else, the finally branch will always execute regardless of whether or not an exception arose from the try block

11.2.8.1.1. example
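The else and finally branches can be seen together in a minimal sketch. The function name reciprocal and the printed messages are illustrative choices.

```python
def reciprocal(n):
    try:
        result = 1 / n
    except ZeroDivisionError:
        print("Division failed")
        result = None
    else:
        print("Everything went fine")      # only when no exception was raised
    finally:
        print("It's time to say goodbye")  # always executed
    return result


print(reciprocal(2))    # 0.5
print(reciprocal(0))    # None
```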

11.3. Python 3 defines more than 60 built-in exceptions (the exact count depends on the version), which form a hierarchy

11.3.1. Example: ZeroDivisionError

11.3.1.1. is a more specific exception of type ArithmeticError

11.3.1.1.1. ArithmeticError is more specific exception of type Exception

11.3.2. significance of hierarchy is that your try-except block can handle exceptions at any level from most specific to most general

11.3.2.1. example: the following two code fragments behave identically because ArithmeticError is a general form of the specific ZeroDivisionError that the code is triggering

11.3.2.1.1.
    try:
        y = 1 / 0
    except ZeroDivisionError:
        print("Oooppsss...")
    print("THE END.")

11.3.2.1.2.
    try:
        y = 1 / 0
    except ArithmeticError:
        print("Oooppsss...")
    print("THE END.")

11.3.2.2. Avoid adding exception handlers for more general exceptions before more specific ones in the same hierarchy - this will make the more specific exception handlers useless as their code is unreachable

11.3.3. you can also include multiple exceptions in single except block

11.3.3.1. exceptions must be comma separated and enclosed in parentheses ( )

11.3.3.1.1. example
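A minimal sketch of one except branch handling two exception types. The function name to_ratio is hypothetical, and the "as error" part is optional.

```python
def to_ratio(text):
    try:
        return 1 / int(text)
    except (ValueError, ZeroDivisionError) as error:
        # One branch handles both a bad literal and a zero divisor.
        return type(error).__name__


print(to_ratio("4"))      # 0.25
print(to_ratio("zero"))   # ValueError
print(to_ratio("0"))      # ZeroDivisionError
```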

11.4. You can also manually trigger exceptions by using the raise keyword with the name of a built-in exception

11.4.1. e.g. raise ZeroDivisionError

11.4.2. it's also possible to use raise without naming an exception, but this is only valid from inside an except block

11.4.2.1. can be useful for distributing exception handling across your code

11.4.2.2. example:

11.4.2.2.1.
    def badFun(n):
        try:
            return n / 0
        except:
            print("I did it again!")
            raise

    try:
        badFun(0)
    except ArithmeticError:
        print("I see!")
    print("THE END.")

11.5. You can use the assert statement as a fail-safe check, which will raise an exception of type AssertionError if the expression following the assert keyword does not resolve to True

11.5.1. example usage:

11.5.1.1.
    import math
    x = float(input("Enter a number: "))
    assert x >= 0.0
    x = math.sqrt(x)
    print(x)

11.5.1.1.1. if x >= 0.0 does not resolve to True it will raise AssertionError, otherwise it will do nothing

11.5.2. assert will also raise AssertionError for any other expression that evaluates to a falsy value: a number equal to zero, an empty string, or None

11.6. Common exceptions in hierarchy:

11.6.1. BaseException

11.6.1.1. Exception

11.6.1.1.1. ArithmeticError

11.6.1.1.2. AssertionError

11.6.1.1.3. LookupError

11.6.1.1.4. MemoryError

11.6.1.1.5. StandardError (Python 2 only; it was removed in Python 3, where Exception fills its role)

11.6.1.2. KeyboardInterrupt

11.6.1.2.1. concrete exception raised when the user uses a keyboard shortcut designed to terminate a program's execution (Ctrl-C in most OSs); if handling this exception doesn't lead to program termination, the program continues its execution

11.6.1.3. BaseException is the most general (abstract) of all Python exceptions - all other exceptions are included in this one; it can be said that the following two except branches are equivalent: except: and except BaseException:

11.7. Exceptions are classes, and when an exception is raised an object is created (i.e. an instance of one of the classes in the class hierarchy that begins with BaseException)

11.7.1. the following simple example demonstrates how we can access information about a captured exception object - note the use of the "as" keyword followed by the alias (e in this example)

11.7.1.1.
    try:
        i = int("Hello!")
    except Exception as e:
        print(type(e).__name__)
        print(e.__str__())
        i = None
    print("i =", i)

11.7.1.1.1. returns:
    ValueError
    invalid literal for int() with base 10: 'Hello!'
    i = None

11.8. Python custom exceptions

11.8.1. you can define your own custom exception classes - this can be a specialised extension of a more specific exception class or, if you want to create your own very particular hierarchy, you can use the high level Exception class as your top level superclass

11.8.1.1. example:

11.8.1.1.1.
    class PizzaError(Exception):
        def __init__(self, pizza = "unknown", message = ""):
            Exception.__init__(self, message)
            self.pizza = pizza

        def __str__(self):
            return "PizzaError"

    class TooMuchCheeseError(PizzaError):
        def __init__(self, pizza = "unknown", cheese = ">100", message = ""):
            PizzaError.__init__(self, pizza, message)
            self.cheese = cheese

        def __str__(self):
            return "TooMuchCheeseError"
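A sketch of how such a hierarchy might be raised and caught. It repeats the two classes in a shortened variant without __str__ (so that printing the exception shows its message) and adds a hypothetical helper, make_pizza, with an invented menu and cheese limit.

```python
class PizzaError(Exception):
    def __init__(self, pizza="unknown", message=""):
        super().__init__(message)
        self.pizza = pizza


class TooMuchCheeseError(PizzaError):
    def __init__(self, pizza="unknown", cheese=">100", message=""):
        super().__init__(pizza, message)
        self.cheese = cheese


def make_pizza(pizza, cheese):
    # Hypothetical helper used only to trigger the two exceptions.
    if pizza not in ("margherita", "capricciosa"):
        raise PizzaError(pizza, "no such pizza on the menu")
    if cheese > 100:
        raise TooMuchCheeseError(pizza, cheese, "too much cheese")
    return pizza + " is ready"


for pizza, cheese in [("margherita", 50), ("calzone", 0), ("capricciosa", 110)]:
    try:
        print(make_pizza(pizza, cheese))
    except TooMuchCheeseError as e:     # most specific handler first
        print(e, ":", e.cheese)
    except PizzaError as e:             # catches every other PizzaError
        print(e, ":", e.pizza)
```

Placing the TooMuchCheeseError branch first matters: if the PizzaError branch came first, it would also catch the more specific exception and the second branch would be unreachable.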

12. Text Handling

12.1. Computers store characters as numbers

12.2. Common need to process character data across varied computer systems led to standardisation of character encoding systems

12.2.1. ASCII is one of the most popular, common standards, based on the Latin alphabet; it is a 7-bit code defining 128 characters, and its 8-bit extensions allow for 256

12.2.1.1. Latin alphabet + numeric digits and various common whitespace (e.g. TAB, SPACE) and control (e.g. CR, LF) characters, plus a few commonly used symbols (e.g. $, !) are encoded within the 128 code points of ASCII (0 to 127)

12.2.1.2. 8-bit extensions of ASCII use the concept of a code page for the upper 128 characters (128 to 255) to support needs for some other languages with similar alphabets

12.2.1.2.1. this means that a single code point in the 128 to 255 range can represent a different character depending on the code page that is being applied

12.2.2. ASCII is inadequate to support need for internationalization, which is a term that may be referred to as I18N (starts with "I" + 18 letters, ends with "N")

12.2.2.1. Code pages solved the I18N problem for a while but were recognised as imperfect, which led to the Unicode standard

12.2.2.1.1. Unicode assigns each character (letters, hyphens, ideograms, etc.) a unique (unambiguous) code point, with room for more than a million code points

12.2.2.1.2. First 128 characters of Unicode are identical to ASCII

12.2.2.1.3. first 256 Unicode code points are identical to the ISO/IEC 8859-1 code page (a code page designed for western European languages)

12.2.2.1.4. Unicode standard says nothing about how to code and store the characters in the memory and files. It only names all available characters and assigns them to planes (a group of characters of similar origin, application, or nature).

12.2.3. Each character (alphabetic, numeric, symbolic, whitespace or control) is represented in an encoding system by a code point, each of which has a unique number assigned to it by the encoding system

12.3. Strings in Python are immutable sequences

12.3.1. you can iterate them like lists and you can access individual characters via index references

12.3.1.1. examples

12.3.1.1.1.
    myString = 'hello world'
    for i in range(len(myString)):
        print(myString[i], sep="", end="")

12.3.1.1.2.
    myString = 'hello world'
    for c in myString:
        print(c, sep="", end="")

12.3.2. slices work with strings too

12.3.2.1.
    alpha = "abdefg"
    print(alpha[1:3])
    print(alpha[3:-2])
    print(alpha[::2])

12.3.2.1.1. returns:
    bd
    e
    adf

12.3.3. you can use the in and not in operators with strings too, like lists

12.3.4. unlike lists, you cannot use del with an index reference to remove any part of a string, although you can use del on the whole string

12.3.4.1. it follows that unlike lists, strings do not have an append() or insert() method, so any attempt to use those with a variable holding a string will raise an exception

12.3.5. min() function works with strings and lists alike

12.3.5.1. example:

12.3.5.1.1.
    print(min("aAbByYzZ"))
    t = 'The Knights Who Say "Ni!"'
    print('[' + min(t) + ']')
    t = [0, 1, 2]
    print(min(t))

12.3.5.2. string argument cannot be an empty string, otherwise it will throw a ValueError exception

12.3.6. max() function works in opposite way to min()

12.3.7. index() method returns first index position of a substring passed as argument, but if substring not found it will raise ValueError exception

12.3.7.1. example

12.3.7.1.1.
    myString = 'Hello world'
    print(myString.index('w'))        # 6
    print(myString.index('Hell'))     # 0
    print(myString.index('World'))    # raises ValueError, the search is case-sensitive

12.3.8. list() function will convert a string to a list

12.3.8.1. example:

12.3.8.1.1. print(list("abcabc"))

12.3.9. count() method works the same for strings as for lists

12.3.9.1. example:

12.3.9.1.1.
    myString = "abcabc"
    myList = list(myString)
    print(myString.count('a'))
    print(myList.count('a'))

12.4. Multiline strings are specified using either 3 apostrophes ''' or 3 quotes """

12.4.1. examples

12.4.1.1.
    multiLine = '''Line #1
    Line #2'''

12.4.1.2.
    multiLine = """Line #1
    Line #2"""

12.4.2. note that this is illegal

12.4.2.1.
    multiLine = 'Line #1
    Line #2'

12.4.3. note that len includes the whitespace characters

12.4.3.1. when you press enter in Python editor, you will get LF whitespace character added, which can also be denoted as \n

12.5. String data supports use of + and * operators, which is an example of overloading as they do not behave in same way as for arithmetic operations involving numbers

12.5.1. + performs string concatenation

12.5.1.1. print('hello ' + 'world')

12.5.1.1.1. returns hello world

12.5.2. * performs string replication (repeats the string a given number of times)

12.5.2.1. print('a' * 3)

12.5.2.1.1. returns aaa

12.5.3. += and *= are both supported for string assignments

12.6. Python ord() function

12.6.1. takes a one-character string argument and returns the Unicode code point number of that character

12.7. Python chr() function

12.7.1. opposite function of ord(), takes single integer argument and returns Unicode string character
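A minimal sketch of ord() and chr() as inverse operations; the characters are arbitrary examples.

```python
print(ord("a"))                 # 97
print(ord(" "))                 # 32
print(ord("α"))                 # 945, a code point outside the ASCII range
print(chr(97))                  # a
print(chr(ord("x")) == "x")     # True, chr() undoes ord()
```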

12.8. Python string specific methods

12.8.1. capitalize() method

12.8.1.1. if character at index[0] of source string is a letter, capitalize it, and convert any other letters in string to lower case, with result being a new string

12.8.1.1.1. example:

12.8.2. center() method

12.8.2.1. one-parameter variant of the center() method makes a copy of the original string, trying to center it inside a field of a specified width

12.8.2.1.1. example:

12.8.2.1.2. if string length exceeds argument value, then copy of original string returned without any added spaces

12.8.2.2. two-parameter variant of center() makes use of the character from the second argument, instead of a space

12.8.2.2.1. example:

12.8.3. endswith() method

12.8.3.1. returns True if source string ends with substring passed as argument, else False

12.8.3.1.1. example:

12.8.3.2. startswith() is the mirror opposite of endswith(), returning True if source string starts with substring passed as argument, else False
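The capitalize(), center(), endswith() and startswith() methods described above can be tried with a short sketch; the sample strings and field widths are arbitrary.

```python
capitalized = "aBcD".capitalize()
print(capitalized)                           # Abcd

centered = "alpha".center(10)
print("[" + centered + "]")                  # [  alpha   ]

starred = "gamma".center(20, "*")            # padded with * instead of spaces
print("[" + starred + "]")

print("[" + "alpha".center(3) + "]")         # [alpha], the width is too small to pad
print("epsilon".endswith("on"))              # True
print("omega".startswith("meg"))             # False
```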

12.8.4. find() method

12.8.4.1. similar to index(), it looks for a substring and returns the index of first occurrence of this substring, but it's safer (returns -1 if substring not found rather than raising exception like index()) and works with strings only

12.8.4.1.1. example:

12.8.4.1.2. use 2-parameter variant to start search at some position beyond index 0

12.8.4.1.3. use 3-parameter variant to limit upper position of search

12.8.4.2. rfind() method

12.8.4.2.1. has 1, 2 and 3 parameter variants that work almost identically to find() but starts search from end of string and works back
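A sketch of the one-, two- and three-parameter variants of find(), together with rfind(); the sample text is an arbitrary choice.

```python
text = "theta eta beta"

print(text.find("eta"))          # 2, inside "theta"
print(text.find("eta", 3))       # 6, search starts at index 3
print(text.find("eta", 3, 8))    # -1, no complete match between index 3 and 8
print(text.rfind("eta"))         # 11, the last occurrence
print(text.find("zeta"))         # -1, not found and no exception raised
```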

12.8.5. isalnum() method

12.8.5.1. returns True if source string consists exclusively of numeric digits and/or alphabetic characters (letters), else False

12.8.5.1.1. example:

12.8.5.1.2. will return False if string contains any spaces

12.8.5.1.3. will return True for alphabets other than Western Latin

12.8.6. isalpha() method

12.8.6.1. more specialised than isalnum(), returning True only when all characters are alphabetic letters

12.8.7. isdigit() method

12.8.7.1. more specialised than isalnum(), returning True only when all characters are numeric digits

12.8.8. islower() method

12.8.8.1. returns True when the string contains at least one letter and every letter in it is lowercase; digits, spaces and other non-letter characters are ignored

12.8.9. isupper() method

12.8.9.1. opposite of islower(), returns True when the string contains at least one letter and every letter in it is uppercase

12.8.10. isspace() method

12.8.10.1. returns True when all characters are whitespace

12.8.10.1.1. example:
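The character-class tests from isalnum() to isspace() can be compared in a short sketch; the sample strings are arbitrary.

```python
print("lambda30".isalnum())     # True
print("lambda 30".isalnum())    # False, a space is neither letter nor digit
print("Moooo".isalpha())        # True
print("Moooo7".isalpha())       # False
print("2018".isdigit())         # True
print("moooo".islower())        # True
print("MOOOO".isupper())        # True
print(" \n ".isspace())         # True
print("".isalnum())             # False, an empty string fails these tests
```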

12.8.11. join() method

12.8.11.1. takes a single list of strings as an argument and uses the source string as a separator to combine all the list strings into a single new string

12.8.11.1.1. example:

12.8.11.1.2. if list argument does not hold exclusively string elements, it will raise a TypeError exception

12.8.12. lower() method

12.8.12.1. converts all uppercase letters in source string to lowercase and returns copy of transformed string

12.8.12.1.1. example:

12.8.12.2. swapcase() method transforms all lowercase letters to upper and all uppercase letters to lower, returning transformed string

12.8.12.3. title() method transforms first letter of every word to uppercase, and all other letters to lowercase

12.8.12.3.1. example:

12.8.12.4. upper() method does the mirror opposite of lower()

12.8.13. lstrip() method

12.8.13.1. with no argument, this removes all leading whitespace characters from source string and returns transformed copy

12.8.13.1.1. example:

12.8.13.2. with a single string argument, it removes all leading characters that appear anywhere in that string (the argument is treated as a set of characters, not as a prefix)

12.8.13.2.1. example:

12.8.13.3. rstrip() method is same as lstrip(), with 0 and 1 parameter variants, but works from opposite end of string

12.8.13.3.1. actually, the substring argument works by examining rightmost part of source string for ANY combination of substring characters and strips all of them

12.8.13.4. strip() method combines lstrip() and rstrip() into one

12.8.13.4.1. examples:
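The case-changing methods and the strip family from the last two sections can be tried with a short sketch; the sample strings are arbitrary.

```python
print("SiGmA=60".lower())                       # sigma=60
print("SiGmA=60".upper())                       # SIGMA=60
print("I know that I know nothing.".swapcase()) # i KNOW THAT i KNOW NOTHING.
print("I know that I know nothing.".title())    # I Know That I Know Nothing.

print("[" + "  tau  ".lstrip() + "]")           # [tau  ]
print("[" + "  tau  ".rstrip() + "]")           # [  tau]
print("[" + "  tau  ".strip() + "]")            # [tau]

# The argument is a set of characters to strip, not a prefix or suffix.
print("www.cisco.com".lstrip("w."))             # cisco.com
print("cisco.com".rstrip(".com"))               # cis
```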

12.8.14. replace() method

12.8.14.1. requires two substring parameters, searches the source string for every occurrence of the first substring, replaces each with the second substring, and returns the result as a new string

12.8.14.1.1. example:

12.8.14.2. there is a 3 parameter variant, where the 3rd parameter is an integer limiting the number of replacements

12.8.14.2.1. example:
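A minimal sketch of the two- and three-parameter variants of replace(); the sample sentence is arbitrary.

```python
sentence = "This is it!"

print(sentence.replace("is", "are"))       # Thare are it!
print(sentence.replace("is", "are", 1))    # Thare is it!
print(sentence.replace("is", ""))          # Th  it!
print(sentence)                            # This is it!, the original is unchanged
```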

12.8.15. split() method

12.8.15.1. places all substrings found in source string into elements of a list that is returned

12.8.15.1.1. example:

12.8.15.2. assumes that whitespaces are substring delimiters

12.8.15.3. join() method performs opposite action of split(), where the source string would typically be a space or some other delimiter and the argument of join() is a list of strings

12.8.15.3.1. example:
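A sketch showing split() and join() as opposite operations; the sample words are arbitrary.

```python
words = "phi       chi\npsi".split()         # any run of whitespace separates words
print(words)                                 # ['phi', 'chi', 'psi']

print(",".join(["omicron", "pi", "rho"]))    # omicron,pi,rho
print(" ".join(words))                       # phi chi psi

try:
    ",".join(["pi", 3.14])                   # a non-string element is rejected
except TypeError as error:
    print("TypeError:", error)
```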

12.9. Python string comparison operators

12.9.1. All the usual comparison operators that can be used with numbers can also be used with strings (== , != , > , >= , < , <=)

12.9.1.1. the main thing to remember when comparing strings is that the comparison is always based on the ordinal ASCII/Unicode value of each character

12.9.1.2. remember that uppercase letters all occupy lower ordinal code values in ASCII and Unicode

12.9.1.2.1. so... print("hello" > "Hello") returns: True

12.9.1.3. bear in mind that a longer string compared to a shorter string is always greater than the shorter one when the longer string holds identical characters at its beginning to the shorter one

12.9.1.3.1. so... print("alpha" < "alphabet") returns: True

12.9.1.4. remember that comparing numbers that are strings is done on same basis as alphabetic letter comparisons

12.9.1.4.1. so... print('10' == '010') returns: False

12.9.1.4.2. and... print('10' < '8') returns: True

12.9.1.5. comparing a string with a number is possible using == and != but it's generally a bad idea to make any comparisons between strings and numbers

12.9.1.5.1. so... print('10' == 10) returns: False

12.9.1.5.2. if you try and use one of the other comparison operators, you'll get a TypeError exception

12.10. Converting strings to numbers and vice versa

12.10.1. str() function

12.10.1.1. always safe, can convert any numeric type to a string

12.10.2. int() function

12.10.2.1. converts string representation of integer to an integer, but if string does not represent integer, it will raise a ValueError exception

12.10.2.1.1. note: int will not convert a string that represents a float to an int, but will convert an actual float to int by truncating it toward zero (e.g. int(2.9) is 2)

12.10.3. float() function

12.10.3.1. converts string representation of number (float or int) to a float, but if string does not represent a number, it will raise a ValueError exception
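The conversion rules above can be checked with a short sketch; the literal values are arbitrary.

```python
print(str(13), str(1.3))              # 13 1.3
print(int("13") + 1)                  # 14
print(float("1.3"), float("13"))      # 1.3 13.0
print(int(2.9), int(-2.9))            # 2 -2, truncation toward zero

try:
    int("1.3")                        # a float written inside a string is rejected
except ValueError as error:
    print("ValueError:", error)
```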

13. Python string list sorting

13.1. Python sorted() function

13.1.1. takes an iterable (e.g. a list) and returns a new list with all elements sorted

13.2. Python list sort() method

13.2.1. sorts and modifies source list (i.e. does not return copy, but actually changes the subject list)
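The difference between sorted() and the sort() method shows up in what happens to the original list. A minimal sketch with an arbitrary list of strings:

```python
greek = ["omega", "alpha", "pi", "gamma"]

ordered = sorted(greek)       # new list, greek is left untouched
print(greek)                  # ['omega', 'alpha', 'pi', 'gamma']
print(ordered)                # ['alpha', 'gamma', 'omega', 'pi']

result = greek.sort()         # sorts in place and returns None
print(greek)                  # ['alpha', 'gamma', 'omega', 'pi']
print(result)                 # None
```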

14. Classes and Objects

14.1. Classes categorise by a grouping of characteristics

14.2. A class can have many sub classes and sub classes have super classes

14.2.1. Sub classes inherit from super classes

14.3. Objects are created from classes and automatically belong to a class hierarchy

14.3.1. Conceptually every object has a unique grouping of 3 attribute types: a name (think "noun"), properties (think "adjectives") and actions (think "verbs")

14.4. Python classes

14.4.1. create class using class keyword followed by class name, colon and indented class definition (similar to how function is defined)

14.4.1.1. example:

14.4.1.1.1. class TheSimplestClass: pass

14.4.2. Python objects

14.4.2.1. once a class is defined, you can create any number of objects from it by calling the class like a function and assigning the result to a variable

14.4.2.1.1. example:

14.4.2.1.2. object creation is called instantiation (i.e. it becomes an instance of the class)
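A minimal sketch of instantiation, reusing the TheSimplestClass definition shown earlier. The variable names first and second are illustrative.

```python
class TheSimplestClass:
    pass

# Calling the class like a function creates a new object each time.
first = TheSimplestClass()
second = TheSimplestClass()

print(type(first).__name__)   # both objects are instances of the same class
print(first is second)        # but they are two distinct objects
```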

14.5. Procedural approach suffers from issues when creating an object (e.g. a stack)

14.5.1. 1. variables referencing built-in types like lists can be accidentally altered by other code in ways not intended

14.5.1.1. solved by the class-object paradigm, which delivers encapsulation (objects cannot have their internal properties altered by external means)

14.5.2. 2. creating multiple versions of an object can often require copying code

14.5.2.1. solved by concept of instantiation (class defines all necessary properties and methods, defined just once, and having many copies at once as objects is easy)

14.5.3. 3. extending the functionality of an object can be fiddly and awkward to manage

14.5.3.1. solved by inheritance and ability to create sub classes

14.6. Python object constructor method

14.6.1. first method of a class should be: def __init__(self):

14.6.1.1. constructor methods require at least one parameter and first one must refer to the object being created - using "self" is a convention (i.e. not compulsory) but it is highly recommended to always follow that convention

14.6.2. constructor methods cannot return a value (returning anything other than None raises a TypeError), because the new object instance is produced by calling the class, not by __init__ itself

14.6.3. constructor methods are not meant to be invoked directly, either from the object or from inside its class; the usual exception is a sub class constructor explicitly invoking the constructor of one of its super classes

14.7. Python object encapsulation

14.7.1. to make object properties private, you must declare them in the class constructor method with a double underscore (__) prefix

14.7.1.1. examples:

14.7.1.1.1.

    class Stack:
        def __init__(self):
            self.stackList = []    # public: visible from outside the class

    stackObject = Stack()
    print(len(stackObject.stackList))    # prints 0

14.7.1.1.2.

    class Stack:
        def __init__(self):
            self.__stackList = []    # private: stored as _Stack__stackList

    stackObject = Stack()
    print(len(stackObject.__stackList))    # raises AttributeError

14.8. Python object methods

14.8.1. When defining class methods, you must always define them with at least one parameter and the first parameter should be "self"

14.8.1.1. example:

14.8.1.1.1.

    class Stack:
        def __init__(self):
            self.__stackList = []

        def push(self, val):
            self.__stackList.append(val)

        def pop(self):
            val = self.__stackList[-1]
            del self.__stackList[-1]
            return val

14.8.2. note: a method is a function defined inside a class, but unlike a plain function, a regular (instance) method cannot be defined without parameters: it must declare at least one, and the first is conventionally named "self" (you can still invoke such a method without passing any arguments explicitly)

14.8.2.1. you should never attempt to explicitly pass an argument for self when invoking a method on an object, as Python will do this automatically for you

14.8.3. self is a reference to the object the method was invoked on (inside the constructor, the object being initialised) and gives access to all the object/class variables/methods

14.8.4. object methods can be made hidden (private) just like variables by prefixing with __ and the same property name mangling occurs as for variables

14.9. Python object inheritance

14.9.1. Inheritance is achieved by defining a new class with the name of its super class in parentheses after the class name. Furthermore, the constructor method should explicitly invoke the constructor method of its super class

14.9.1.1. example:

14.9.1.1.1.

    class AddingStack(Stack):
        def __init__(self):
            Stack.__init__(self)
            self.__sum = 0

14.9.2. overriding methods via inheritance involves defining the method in the sub class as a mix of invoking the super class method and adding new functionality

14.9.2.1. example:

14.9.2.1.1.

    class AddingStack(Stack):
        def __init__(self):
            Stack.__init__(self)
            self.__sum = 0

        def push(self, val):
            self.__sum += val        # new functionality
            Stack.push(self, val)    # then reuse the super class method

14.10. Python object return hidden property value

14.10.1. the way to return a value for a hidden object property is to define a "getter" method for it with a return statement that uses dot notation with a self reference

14.10.1.1. example:

14.10.1.1.1.

    class AddingStack(Stack):
        def __init__(self):
            Stack.__init__(self)
            self.__sum = 0

        def getSum(self):
            return self.__sum

14.11. Python instance variables

14.11.1. the idea of instance variables is that different objects of same class can have different properties that are entirely isolated from each other, and it is even possible to extend an object with new properties, post instantiation

14.11.1.1. example:

14.11.1.1.1.

    class ExampleClass:
        def __init__(self, val=1):
            self.first = val

        def setSecond(self, val):
            self.second = val

    exampleObject1 = ExampleClass()
    exampleObject2 = ExampleClass(2)
    exampleObject2.setSecond(3)
    exampleObject3 = ExampleClass(4)
    exampleObject3.third = 5

    print(exampleObject1.__dict__)    # {'first': 1}
    print(exampleObject2.__dict__)    # {'first': 2, 'second': 3}
    print(exampleObject3.__dict__)    # {'first': 4, 'third': 5}

14.11.2. when using private variables (with the double underscore __ prefix), Python creates the instance variable names differently when those variables are created from inside class methods - it adds prefix of "_<class_name>" to the private variable name - but not when variable is created directly from outside

14.11.2.1. example:

14.11.2.1.1.

    class ExampleClass:
        def __init__(self, val=1):
            self.__first = val

        def setSecond(self, val=2):
            self.__second = val

    exampleObject1 = ExampleClass()
    exampleObject2 = ExampleClass(2)
    exampleObject2.setSecond(3)
    exampleObject3 = ExampleClass(4)
    exampleObject3.__third = 5    # created from outside, so not mangled

    print(exampleObject1.__dict__)    # {'_ExampleClass__first': 1}
    print(exampleObject2.__dict__)    # {'_ExampleClass__first': 2, '_ExampleClass__second': 3}
    print(exampleObject3.__dict__)    # {'_ExampleClass__first': 4, '__third': 5}

14.11.2.2. this changing of a hidden property from __<property> to _<class>__<property> is known as property name mangling

14.12. Python object __dict__

14.12.1. Python creates a number of built in properties and methods for every new object and __dict__ is a dictionary property that holds names and values of all properties (variables) that the object is currently holding

14.13. Python class variables

14.13.1. Class variables are declared inside class definition, outside of methods, and they can be altered by methods, and key difference with instance variables is that they exist before any objects exist and keep a single value independently of all objects

14.13.1.1. example

14.13.1.1.1.

    class ExampleClass:
        counter = 0

        def __init__(self, val=1):
            self.__first = val
            ExampleClass.counter += 1

    print(ExampleClass.counter)    # 0: the class variable exists before any object

    exampleObject1 = ExampleClass()
    exampleObject2 = ExampleClass(2)
    exampleObject3 = ExampleClass(4)

    print(exampleObject1.__dict__, exampleObject1.counter)
    print(exampleObject2.__dict__, exampleObject2.counter)
    print(exampleObject3.__dict__, exampleObject3.counter)

14.13.2. Class variables exhibit same behaviour as instance variables when defining them as "private" by using the __ prefix convention

14.13.3. Class variables are members of the class __dict__ property, which can be accessed via class_name.__dict__

14.14. Python hasattr() function

14.14.1. Python's approach to OOP differs from that of many other languages: it allows objects of the same class to have different properties, and the hasattr() function is provided as a safe way to check whether a property exists

14.14.1.1. hasattr() requires two parameters: 1. name of class or object (unquoted) 2. name of property (quoted)

14.14.1.1.1. returns True or False

14.14.1.1.2. note: hasattr() will return True when 1st arg is object name and 2nd arg is class variable, but will return False when 1st arg is class name and 2nd arg is instance variable
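The four combinations described above can be sketched as follows. ExampleClass here is a cut-down illustrative class, and checks is a name introduced just for this sketch.

```python
class ExampleClass:
    counter = 0              # class variable

    def __init__(self):
        self.first = 1       # instance variable

obj = ExampleClass()

checks = {
    "object / class variable": hasattr(obj, "counter"),
    "class / instance variable": hasattr(ExampleClass, "first"),
    "object / instance variable": hasattr(obj, "first"),
    "class / class variable": hasattr(ExampleClass, "counter"),
}
for label, found in checks.items():
    print(label, found)
```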

14.15. Python class __name__ property

14.15.1. __name__ is a string property tied to class only, which returns the name of the class

14.15.1.1. example:

14.15.1.1.1.

    class Classy:
        pass

    print(Classy.__name__)    # Classy

14.15.2. use type() function on object to return class and then return __name__ property from result

14.15.2.1. examples:

14.15.2.1.1.

    class Classy:
        pass

    obj = Classy()
    print(type(obj))    # <class '__main__.Classy'>

14.15.2.1.2.

    class Classy:
        pass

    obj = Classy()
    print(type(obj).__name__)    # Classy

14.15.2.1.3. note: print(obj.__name__) will raise an AttributeError as __name__ does not exist in context of object

14.16. Python __module__ property

14.16.1. __module__ is a string property for classes and objects that returns the name of the module that defines the class

14.16.1.1. when the class definition is in the current file (as it would also be if running via interactive interpreter) the result of module is always "__main__"

14.16.1.2. when you use it on an object/class after first importing the module that defines the class, then you will get the proper external module name

14.16.1.3. example:

14.16.1.3.1.

    class Classy:
        pass

    print(Classy.__module__)    # __main__ when run as a script

    obj = Classy()
    print(obj.__module__)       # same value, reached via the object

14.17. Python class __bases__ property

14.17.1. __bases__ is a tuple property built in for all classes, where the elements are superclasses

14.17.1.1. example:

14.17.1.1.1.

    class SuperOne:
        pass

    class SuperTwo:
        pass

    class Sub(SuperOne, SuperTwo):
        pass

    print('( ', end='')
    for x in Sub.__bases__:
        print(x.__name__, end=' ')
    print(')')    # ( SuperOne SuperTwo )

14.17.2. where a class has no superclass, it inherits from a built in Python class named object

14.17.2.1. example:

14.17.2.1.1.

    class SuperOne:
        pass

    print('( ', end='')
    for x in SuperOne.__bases__:
        print(x.__name__, end=' ')
    print(')')    # ( object )

14.18. Introspection

14.18.1. ability of a program to examine the type or properties of an object at runtime

14.18.1.1. Python essentially allows you to interrogate all meta data about objects and classes

14.18.2. Python issubclass() function

14.18.2.1. takes two arguments that must both reference a class and returns True if the 1st is a subclass of the 2nd (the 2nd may also be a tuple of classes)

14.18.2.1.1. note: Python considers a class to be a subclass of itself, so that will result in True

14.18.2.1.2. example:

14.18.2.2. Python isinstance() function

14.18.2.2.1. similar to issubclass(), it takes first argument as an object reference and second as a class reference and returns True if object is an instance of the class or any of the class's subclasses
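A minimal sketch of both functions. The three-level Vehicle hierarchy is hypothetical and exists only to give the calls something to inspect.

```python
class Vehicle:
    pass

class LandVehicle(Vehicle):
    pass

class TrackedVehicle(LandVehicle):
    pass

# issubclass(candidate, parent): both arguments are classes.
print(issubclass(TrackedVehicle, Vehicle))   # True, via LandVehicle
print(issubclass(Vehicle, TrackedVehicle))   # False, wrong direction
print(issubclass(Vehicle, Vehicle))          # True, a class is its own subclass

# isinstance(object, class): first argument is an object.
tank = TrackedVehicle()
print(isinstance(tank, Vehicle))             # True, through the hierarchy
print(isinstance(Vehicle(), TrackedVehicle)) # False
```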

14.18.3. Python is operator

14.18.3.1. use the is operator to check if one variable holds a reference to the same object as another variable, returning True or False

14.18.3.2. note: this is different to ==, which can return True when comparing two objects of the same type with identical property values, whereas for the is operator to return True, the two variables must refer to a single, common object
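The distinction between is and == can be sketched with plain lists. The variable names are illustrative.

```python
first = [1, 2, 3]
alias = first            # a second name for the same list object
clone = list(first)      # a new list holding equal values

print(alias is first)    # True: one object, two references
print(clone == first)    # True: equal contents
print(clone is first)    # False: two separate objects

alias.append(4)          # a change made through one name ...
print(first)             # ... is visible through the other
print(clone)             # the separate object is unaffected
```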

14.18.4. Python super() function

14.18.4.1. The super() function can be used inside class definitions to gain access to the properties (variables and methods) of a superclass without having to explicitly name the superclass

14.18.4.1.1. example:

14.18.4.2. when using super() to invoke an inherited method, do not pass self explicitly as the first argument
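A short sketch of super() in a constructor. The class names Super and Sub and the attribute name are hypothetical.

```python
class Super:
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return "My name is " + self.name + "."

class Sub(Super):
    def __init__(self, name):
        # No superclass name and no explicit self are needed here.
        super().__init__(name)

obj = Sub("Andy")
print(obj)
```

The equivalent explicit form would be Super.__init__(self, name), which does require passing self.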

14.19. Reflection

14.19.1. ability of a program to manipulate the values, properties and/or functions of an object at runtime

14.19.1.1. here is a simple program that demonstrates both introspection and reflection in Python - it creates an object, adds instance variables to it (reflection), queries attributes of the object using __dict__ and getattr() (introspection) and changes state using setattr() (reflection)

14.19.1.1.1.

    class MyClass:
        pass

    obj = MyClass()
    obj.a = 1
    obj.b = 2
    obj.i = 3
    obj.ireal = 3.5
    obj.integer = 4
    obj.z = 5

    def incIntsI(obj):
        # increment every int attribute whose name starts with 'i'
        for name in obj.__dict__.keys():
            if name.startswith('i'):
                val = getattr(obj, name)
                if isinstance(val, int):
                    setattr(obj, name, val + 1)

    print(obj.__dict__)
    incIntsI(obj)
    print(obj.__dict__)

14.20. Python __str__() method

14.20.1. __str__() is a built-in method inherited by everything in Python and its purpose is to describe an object in a string

14.20.1.1. it can be handy to override the __str__() function for any classes you create, enabling a more user friendly description of an object

14.20.1.1.1. example:
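A minimal sketch of overriding __str__(). The Star class and its attributes are hypothetical.

```python
class Star:
    def __init__(self, name, galaxy):
        self.name = name
        self.galaxy = galaxy

    def __str__(self):
        # Used by print() and str() in place of the inherited default.
        return self.name + " in " + self.galaxy

sun = Star("Sun", "Milky Way")
print(sun)                    # uses the overridden method
print(object.__str__(sun))    # the inherited default, for comparison
```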

14.21. Python multiple inheritance

14.21.1. one form is multilevel inheritance, where a class (and all objects instantiated from it) inherits from every superclass in its chain

14.21.1.1. lookup works bottom to top, which means if any properties or methods are overridden, the definition lowest in the super/sub class chain is the one used

14.21.1.1.1. example:

14.21.2. another type of multiple inheritance occurs when a class inherits from two or more unrelated superclasses

14.21.2.1. inheritance works left to right, based on the order of the parameters in the subclass definition

14.21.2.1.1. example:

14.21.3. take care using the super() function when dealing with multiple inheritance: it follows the method resolution order (MRO), so the class it resolves to may not be the one you expect

14.21.4. although multiple inheritance is possible it should not be your first choice as it is riskier than single inheritance and can violate a principle known as the single responsibility principle

14.21.4.1. consider using composition before going for multiple inheritance

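14.21.4.2. note that Python raises a TypeError for any multiple inheritance where the order of the superclasses makes a consistent method resolution order (MRO) impossible, for example listing a class before one of its own subclasses; a plain diamond is allowed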

14.21.4.2.1. example:
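A sketch of how Python resolves multiple inheritance, under the assumption that the behaviour of interest is the method resolution order (MRO). All class names are hypothetical.

```python
class Left:
    def who(self):
        return "Left"

class Right:
    def who(self):
        return "Right"

class Sub(Left, Right):      # lookup goes left to right
    pass

print(Sub().who())
print([c.__name__ for c in Sub.__mro__])

# A diamond: both middle classes share one superclass. This is accepted.
class Top:
    pass

class MiddleA(Top):
    pass

class MiddleB(Top):
    pass

class Bottom(MiddleA, MiddleB):
    pass

print([c.__name__ for c in Bottom.__mro__])

# Listing a class before its own subclass cannot be ordered consistently.
try:
    class Broken(Top, MiddleA):
        pass
    mro_error = None
except TypeError as exc:
    mro_error = exc
print(type(mro_error).__name__)
```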

14.22. Polymorphism

14.22.1. ability of subclasses to inherit from superclasses but change characteristics or behaviour
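A minimal sketch of polymorphism: describe() is written once in the superclass, yet its result depends on which subclass overrides speak(). The Animal, Dog and Cat classes are hypothetical.

```python
class Animal:
    def speak(self):
        return "..."

    def describe(self):
        # Calls whichever speak() the actual object provides.
        return type(self).__name__ + " says " + self.speak()

class Dog(Animal):
    def speak(self):
        return "Woof"

class Cat(Animal):
    def speak(self):
        return "Meow"

lines = [animal.describe() for animal in (Animal(), Dog(), Cat())]
for line in lines:
    print(line)
```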

14.23. Composition

14.23.1. Composition is the process of composing an object using other different objects

14.23.1.1. example:

14.23.1.1.1.

    class Hello:
        firstWord = 'Hello'

        def __init__(self, nextWord):
            self.phrase = Hello.firstWord + ' ' + nextWord

    class World:
        def getWord(self):
            return 'world'

    myPhrase = Hello(World().getWord()).phrase
    print(myPhrase)    # Hello world

15. Python generator

15.1. a generator is a special type of function that produces multiple outputs and returns these encapsulated inside an iterable object

15.2. range() behaves in a similar way, although strictly it returns a lazy range object rather than a generator

15.2.1. e.g. range(5) produces 5 values, 0 to 4 and returns an object that can be iterated by a for loop

15.3. a generator-like object can also be built from a class that provides two methods: __iter__(), __next__() (strictly, such an object is an iterator)

15.3.1. __iter__() method returns the object and is invoked once

15.3.2. __next__() should return the next value (first, second, and so on) of the desired series - it will be invoked by the for/in statements in order to pass through the next iteration; if there are no more values to provide, the method should raise the StopIteration exception

15.3.3. example:

15.3.3.1.

    class Fib:
        def __init__(self, nn):
            print("__init__")
            self.__n = nn
            self.__i = 0
            self.__p1 = self.__p2 = 1

        def __iter__(self):
            print("__iter__")
            return self

        def __next__(self):
            print("__next__")
            self.__i += 1
            if self.__i > self.__n:
                raise StopIteration
            if self.__i in [1, 2]:
                return 1
            ret = self.__p1 + self.__p2
            self.__p1, self.__p2 = self.__p2, ret
            return ret

    for i in Fib(10):
        print(i)

15.3.3.1.1. results show that constructor __init__() runs first, then __iter__(), then __next__() is called repeatedly, and the final time is the StopIteration exception that halts the iteration process but is gracefully handled

15.3.3.1.2. note that Fib does not use recursion: each call to __next__() derives the next value from the two previous values stored in __p1 and __p2

15.3.3.1.3. the Fib object conforms to the iterator protocol - otherwise the for..in construct would raise an exception

15.4. Iterator protocol

15.4.1. way in which an object should behave to conform to the rules imposed by the context of the for and in statements

15.4.2. An object conforming to the iterator protocol is called an iterator
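The protocol can be driven by hand with the built-in iter() and next() functions. This is a rough sketch of what a for loop does behind the scenes; the variable names are illustrative.

```python
numbers = [10, 20]
iterator = iter(numbers)          # invokes numbers.__iter__()

collected = []
while True:
    try:
        collected.append(next(iterator))   # invokes iterator.__next__()
    except StopIteration:
        break                     # a for statement handles this silently

print(collected)
print(iter(iterator) is iterator) # an iterator returns itself from __iter__()
```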

15.5. Python yield statement

15.5.1. The iterator protocol is rather inconvenient (as the Fib example above shows, code is longer and harder to comprehend), which leads to the yield statement, which can be likened to a special form of the return statement

15.5.1.1. example:

15.5.1.1.1.

    def fun(n):
        for i in range(n):
            yield i

    for v in fun(5):
        print(v)

15.5.2. using yield instead of return converts a function to a generator, which yields a generator object that is iterable

15.5.3. as well as using a generator in a regular for loop, we can also use it in list comprehension

15.5.3.1. example

15.5.3.1.1.

    def powersOf2(n):
        pow = 1
        for i in range(n):
            yield pow
            pow *= 2

    t = [x for x in powersOf2(5)]
    print(t)    # [1, 2, 4, 8, 16]

15.5.4. list() function can take a generator as its argument and convert it into a regular list

15.5.4.1. example

15.5.4.1.1.

    def powersOf2(n):
        pow = 1
        for i in range(n):
            yield pow
            pow *= 2

    t = list(powersOf2(3))
    print(t)    # [1, 2, 4]

15.5.5. we can use a generator with the in operator in place of a regular list

15.5.5.1. example

15.5.5.1.1.

    def powersOf2(n):
        pow = 1
        for i in range(n):
            yield pow
            pow *= 2

    for i in range(20):
        if i in powersOf2(4):
            print(i)    # prints 1, 2, 4 and 8

15.6. Python list comprehension to generator

15.6.1. in addition to a list comprehension using a generator (such as range() ), you can tweak any list comprehension expression so that it yields a generator rather than a list

15.6.1.1. the only change you have to make is to replace the square brackets [ ] of the list comprehension with regular parentheses ( )

15.6.1.1.1. example
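A minimal sketch comparing the two forms. The variable names are illustrative.

```python
squares_list = [x * x for x in range(5)]   # built in full straight away
squares_gen = (x * x for x in range(5))    # values produced on demand

print(type(squares_list).__name__)
print(type(squares_gen).__name__)

first_pass = list(squares_gen)    # consumes the generator
second_pass = list(squares_gen)   # nothing left the second time round
print(first_pass, second_pass)
```

Unlike the list, the generator can be iterated only once, which is worth bearing in mind when swapping brackets for parentheses.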

16. Python lambda functions

16.1. A lambda function is a function without a name (you can also call it an anonymous function)

16.2. The lambda function is a concept borrowed from mathematics, more specifically, from a part called the Lambda calculus, but these two phenomena are not the same

16.3. Mathematicians use the Lambda calculus in many formal systems connected with logic, recursion, or theorem provability. Programmers use the lambda function to simplify the code, to make it clearer and easier to understand.

16.4. declaration of the lambda function doesn't resemble a normal function declaration in any way

16.4.1. lambda parameters : expression

16.4.1.1. note that as with regular functions, parameters are optional and when there are two or more, these must be separated by commas

16.4.2. very simple example that shows you can actually name lambda functions, although this means they are no longer anonymous and in normal use you will see lambdas used anonymously

16.4.2.1.

    two = lambda: 2
    sqr = lambda x: x * x
    pwr = lambda x, y: x ** y

    for a in range(-2, 3):
        print(sqr(a), end=" ")
        print(pwr(a, two()))

16.5. a lambda function can be substituted anywhere that a regular named function can

16.5.1. following two examples return same result, 1st one shows function as a named one called poly and 2nd example shows that poly can easily be replaced by an anonymous lambda function

16.5.1.1.

    def printfunction(args, fun):
        for x in args:
            print('f(', x, ')=', fun(x), sep='')

    def poly(x):
        return 2 * x**2 - 4 * x + 2

    printfunction([x for x in range(-2, 3)], poly)

16.5.1.2.

    def printfunction(args, fun):
        for x in args:
            print('f(', x, ')=', fun(x), sep='')

    printfunction([x for x in range(-2, 3)], lambda x: 2 * x**2 - 4 * x + 2)

16.6. Python map() function

16.6.1. map() function is one that is a common use case for lambdas (as programmers view it as more elegant coding)

16.6.2. syntax is:

16.6.2.1. map(fun, iter)

16.6.2.2. fun is a function name, which can be a named function, but can also be a lambda function

16.6.2.3. iter is an iterable such as a list, tuple or generator

16.6.2.4. there can be more than one iter argument passed

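16.6.3. feeds the items of iter (an iterable) into fun (function) as a series of arguments and returns a map object that produces the results of the repeated function calls, where the map object is itself iterable; with several iter arguments it stops when the shortest one is exhausted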

16.6.3.1. example

16.6.3.1.1.

    firstNames = ["Ian", "Favorita", "Cristina", "Busty", "Amanda"]
    lastNames = ["Bradshaw", "Barbarello", "Barbarello-Bradshaw"]

    greetings = map(lambda f: "hello " + f, firstNames)
    for m in greetings:
        print(m)

    # with two iterables, only the first three names are paired up
    greetings = map(lambda f, l: "hello " + f + " " + l, firstNames, lastNames)
    for m in greetings:
        print(m)

16.6.4. list() function can be used to convert new map object into a list

16.6.4.1. example

16.6.4.1.1.

    firstNames = ["Ian", "Favorita", "Cristina"]
    print(list(map(lambda f: "hello " + f, firstNames)))

16.7. Python filter() function

16.7.1. filter() function is another one, similar to map(), that is often combined with lambda functions for more elegant syntax

16.7.2. syntax is same as map() except that it accepts only 2 arguments (i.e. you cannot specify multiple iterator arguments, like you can with map() )

16.7.3. function (1st argument) is fed values from its 2nd argument and must subject that argument to a True/False test, capturing the element from the 2nd argument whenever the function yields True

16.7.3.1. example

16.7.3.1.1.

    data = [0, 1, 2, 3, 4]
    filtered = list(filter(lambda x: x > 0 and x % 2 == 0, data))
    print(data)        # [0, 1, 2, 3, 4]
    print(filtered)    # [2, 4]

17. Python closures

17.1. Closures provide an alternative for classes that would typically only be created with one method.

17.2. They avoid the use of global variables and provide a form of data hiding.

17.3. Following criteria must be met to create closure in Python:

17.3.1. 1. Must have a nested function

17.3.2. 2. Nested function must refer to value defined in enclosing function

17.3.3. 3. Enclosing function must return nested function

17.4. example

17.4.1.

    def makeclosure(par):
        loc = par

        def power(p):
            return p ** loc

        return power

    fsqr = makeclosure(2)
    fcub = makeclosure(3)

    for i in range(5):
        print(i, fsqr(i), fcub(i))

17.4.1.1. returns

17.4.1.1.1.

    0 0 0
    1 1 1
    2 4 8
    3 9 27
    4 16 64

17.4.1.2. Note that power() function references variable loc, defined by makeclosure.

17.4.1.3. Note that makeclosure returns the power() function itself (not the result of calling it) by using return statement and name of function WITHOUT parentheses.

17.4.1.4. Note that each call to makeclosure creates a separate power() function with the value of loc "locked in" so to speak; we assign these to variables fsqr and fcub respectively and then invoke fsqr and fcub as functions (separate power() functions each with a different fixed value for loc).

18. Python file processing

18.1. One of the most common issues in the developer's job is to process data stored in files

18.2. Different operating systems can treat the files in different ways. For example, Windows uses a different naming convention than the one adopted in Unix/Linux systems.

18.3. canonical file names

18.3.1. name which uniquely defines the location of the file regardless of its level in the directory tree

18.3.2. different between Windows and Linux

18.3.2.1. example

18.3.2.1.1. Windows

18.3.2.1.2. Linux/Unix

18.3.2.2. Windows uses drive letters, Linux does not

18.3.2.3. Root directory is \ in Windows but / in Linux, and sub-directories are denoted same way

18.3.2.4. Linux canonical file names are case sensitive, but not so for Windows

18.3.3. care is needed when specifying canonical file names on a Windows platform because the backslash \ acts as an escape character in Python string expressions

18.3.3.1. one option is to escape the backslash with \\

18.3.3.1.1. name = "C:\\dir\\file"

18.3.3.2. another option is to take advantage of an automated conversion provided by Python, which allows you to express Windows canonical file names using the forward slash

18.3.3.2.1. name = "C:/dir/file"

18.4. Python (like almost every other programming language) does not interact directly with files but does so via abstractions that are commonly referred to as handles or streams

18.4.1. Python provides a rich set of functions and methods that perform operations on streams, which affect the real files using mechanisms contained in the operating system kernel

18.4.2. A stream must be connected to a physical file in a process known as binding

18.4.2.1. when a stream is connected, this is called opening the file

18.4.2.2. when a stream is disconnected, this is called closing the file

18.4.2.3. in between opening and closing a file, the program is free to invoke functions/methods on the stream that will manipulate the file in some way

18.4.2.4. opening a file can fail for multiple reasons and it is important that your program is designed to handle such failures

18.5. File stream opening and closing

18.5.1. must declare open mode

18.5.1.1. read mode

18.5.1.1.1. a stream opened in this mode allows read operations only; trying to write to the stream will cause an exception

18.5.1.2. write mode

18.5.1.2.1. a stream opened in this mode allows write operations only; attempting to read the stream will cause an exception

18.5.1.3. update mode

18.5.1.3.1. a stream opened in this mode allows both writes and reads

18.5.2. attempting an operation not permitted for open mode will cause UnsupportedOperation exception, which inherits OSError and ValueError, and comes from the io module

18.5.3. think of a stream as behaving rather like a tape recorder

18.5.3.1. When you read something from a stream, a virtual head moves over the stream, reading data into memory

18.5.3.2. When you write something to the stream, the same head moves along the stream recording the data from the memory

18.5.3.3. current file position

18.5.3.3.1. a commonly used term, picture this as the current position of the tape recorder read/write head, except it is referring of course to the file stream

18.5.4. file streams are provided via the io module

18.5.4.1. most file streams will inherit from IOBase, which is superclass for the following 3 subclasses

18.5.4.1.1. TextIOBase

18.5.4.1.2. BufferedIOBase

18.5.4.1.3. RawIOBase

18.5.4.2. Python open() function

18.5.4.2.1. built in function that creates file stream object and attempts to connect stream to file (i.e. opening the file)

18.5.4.2.2. note: it is not possible to use constructors for IOBase or any of its subclasses to create file stream, you must use the built in open() function

18.5.4.2.3. has one mandatory parameter for the file name, and 7 other parameters that are all optional as they have default values

18.5.4.2.4. open mode is 2nd parameter

18.5.4.3. Python close() method

18.5.4.3.1. invoke this method on an object created via the open() function to destroy the file stream object, removing its connection to the file

18.5.4.3.2. when close() method invoked on stream object, the buffering (a.k.a. caching) mechanism that handles transfer of data from memory to physical device forces a flush of buffers
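The opening, mode checking and closing described above can be sketched as follows. The scratch folder and file names are hypothetical, created through tempfile so that the sketch is self-contained.

```python
import io
import os
import tempfile

# Hypothetical scratch location for the demonstration.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo.txt")

# Write mode: open, use, and close explicitly.
stream = open(path, "wt", encoding="utf-8")
try:
    stream.write("hello\n")
    try:
        stream.read()                  # not permitted in write mode
        read_error = None
    except io.UnsupportedOperation as exc:
        read_error = exc
finally:
    stream.close()                     # flushes buffers and releases the file

# Read mode: the with statement closes the stream automatically.
with open(path, "rt", encoding="utf-8") as stream:
    content = stream.read()

# Opening can fail, so handle OSError (here the file does not exist).
try:
    open(os.path.join(folder, "missing.txt"), "rt")
    open_error = None
except OSError as exc:
    open_error = exc

print(repr(content), type(read_error).__name__, type(open_error).__name__)
```

The with form is generally the safer choice, since the stream is closed even if an exception interrupts the block.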

18.6. File stream types

18.6.1. text stream

18.6.1.1. text streams are structured in lines; that is, they contain typographical characters (letters, digits, punctuation, etc.) arranged in rows (lines), as seen with the naked eye when you look at the contents of the file in the editor

18.6.1.2. A text file is written (or read) mostly character by character, or line by line

18.6.1.2.1. portability consideration

18.6.2. binary stream

18.6.2.1. binary streams don't contain text but a sequence of bytes of any value. This sequence can be, for example, an executable program, an image, an audio or a video clip, a database file, etc

18.6.2.2. Because these files don't contain lines, the reads and writes relate to portions of data of any size. Hence the data is read/written byte by byte, or block by block, where the size of the block usually ranges from one byte to an arbitrarily chosen value.

18.7. Python pre-opened streams

18.7.1. the general rule for streams is that they must be explicitly opened before they can be used, but there are 3 exceptions to this rule

18.7.1.1. the following 3 streams are pre-opened when every Python program starts, and they are defined within the sys module

18.7.1.1.1. sys.stdin

18.7.1.1.2. sys.stdout

18.7.1.1.3. sys.stderr

18.8. Python stream read() method

18.8.1. syntax is read(size)

18.8.1.1. size is integer representing number of characters for a text stream (or bytes for a binary stream)

18.8.1.2. size is optional, if omitted it defaults to -1 which tells Python to read in the entire file

18.8.1.3. remember that 1 character is only assured of occupying 1 byte in the file when the text conforms to the ASCII encoding set

18.8.1.4. if you specify size as a number that exceeds the total number of bytes in the file, it just behaves as if you specified -1 (i.e. reads in the whole file)

18.8.2. returns a string representing either the whole file or first x characters from file

18.8.3. example:

18.8.3.1.

    stream = open("./Test-Files/tzop.txt", "rt", encoding="utf-8")
    print(stream.read())
    stream.close()

18.8.3.1.1. opens file (tzop.txt), which creates a text stream in read mode for file with utf-8 encoding

18.8.3.1.2. reads full content of file and prints to the screen

18.8.4. warning, using read() method without any parameters can be dangerous - reading a terabyte-long file using this method tries to load all of it into memory at once, which may exhaust memory and crash the program or destabilise the system

18.9. Python stream readline() method

18.9.1. syntax is readline(size)

18.9.1.1. as with read() method, size caps how much is read (characters for a text stream), and if that would take the "virtual read head" beyond the end of the line or the EOF (end of file) marker the method simply returns everything up to that point

18.9.1.2. new line character is also returned

18.9.2. returns a string representing a single line from file

18.9.3. each invocation of readline() returns another line until EOF marker hit, after which it will return an empty string

18.9.4. example:

18.9.4.1.

    from os import strerror

    try:
        ccnt = lcnt = 0
        s = open('./Test-Files/tzop.txt', 'rt')
        line = s.readline()
        while line != '':
            lcnt += 1
            for ch in line:
                print(ch, end='')
                ccnt += 1
            line = s.readline()
        s.close()
        print("\n\nCharacters in file:", ccnt)
        print("Lines in file: ", lcnt)
    except IOError as e:
        print("I/O error occurred:", strerror(e.errno))

18.9.4.1.1. open file "tzop.txt" in read mode using a text stream

18.9.4.1.2. read 1st line in, then enter while loop that repeatedly reads successive lines in until EOF (readline() returns ''), count number of lines, and for each line loop through every character, printing to screen and counting total number of characters

18.10. Python stream readlines() method

18.10.1. syntax is readlines(hintsize)

18.10.1.1. hintsize is optional, if omitted will return every line in file as element in list

18.10.1.2. if hintsize is used, it acts as a guide as to when to stop reading lines: whole lines are read until the total size of the lines read so far exceeds hintsize, so the amount actually read is effectively rounded up to a line boundary

18.10.2. returns a list where every element is a string representing a line from the file (including the \n end of line character)

18.10.3. example:

18.10.3.1.

    from os import strerror

    try:
        s = open("./Test-Files/test.txt", "rt")
        lines = s.readlines()
        s.close()
        print(lines)
    except IOError as e:
        print("I/O error occurred:", strerror(e.errno))

18.10.3.1.1. note: test.txt is a 3 line text file and print(lines) returns a list that include 3 string elements, the first two of which conclude with the \n (new line) character

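18.11. Python stream objects are iterators

18.11.1. this provides an alternative means to process a file, line by line

18.11.2. the __next__() method of stream object yields the next line from file

18.11.3. iterating a stream does not itself invoke close(); in the example below the unnamed stream is closed when Python discards it after the loop (immediately in CPython), so use a with statement when closing must be guaranteed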

18.11.4. example:

18.11.4.1.

    from os import strerror

    try:
        for line in open("./Test-Files/test.txt", "rt"):
            print(line, end="")
    except IOError as e:
        print("I/O error occurred:", strerror(e.errno))

18.11.4.1.1. note how compact and elegant this code is when you need to process a text file line by line

18.11.4.1.2. we can take advantage of the stream object being an iterator, which avoids the need to invoke the readlines() method; the close() method is not invoked explicitly here either, as the unnamed stream is closed once Python discards it
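A variant of the same idea that does not rely on the interpreter discarding the stream: a with statement guarantees the close. The scratch file is hypothetical, created through tempfile so that the sketch is self-contained.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "wt", encoding="utf-8") as ws:
    ws.write("line one\nline two\n")

lines = []
with open(path, "rt", encoding="utf-8") as stream:
    for line in stream:          # the stream is its own iterator
        lines.append(line)

print(lines)
print(stream.closed)             # closed on leaving the with block
```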

18.12. Python stream write() method

18.12.1. takes a single string argument that will be written to the underlying file via the stream

18.12.2. will raise exception if file stream has been opened in a mode not compatible with write operations

18.12.3. does not automatically add new line characters, so you must add \n at the end of lines if required

18.12.3.1. note that you use \n even in Python programs running on Windows hosts and these will be automatically converted to \r\n in the final file

18.12.3.1.1. do not specify \r\n in your Python code for line endings as this will turn into a combination of CR+CR+LF

18.12.4. example

18.12.4.1.

    from os import strerror

    try:
        ws = open('./Test-Files/test.txt', 'wt')
        ws.write("This is my first file that I created in Python\n")
        ws.write("What do you think?\n")
        ws.write("Pretty cool, huh?")
        ws.close()
    except IOError as e:
        print("I/O error occurred:", strerror(e.errno))

18.12.5. you can also invoke write method on stderr if you want to write directly to that - just remember that you do not have to explicitly open/close that stream

18.12.5.1. example

18.12.5.1.1.

    import sys

    sys.stderr.write("Error message")

18.13. Processing amorphous data

18.13.1. Amorphous data is data which have no specific shape or form - they are just a series of bytes

18.13.1.1. This doesn't mean that these bytes cannot have their own meaning, or cannot represent any useful object, e.g., bitmap graphics

18.13.2. Python requires specialized classes to handle amorphous data

18.13.2.1. bytearray is one such class for handling amorphous data

18.13.2.1.1. bytearray is builtin and available to invoke its constructor method - no need to import any modules to use it

18.13.2.1.2. bytearray() constructor takes 3 optional arguments: source, encoding and errors

18.13.2.1.3. bytearray is similar to a list in that it is mutable, and can be iterated and have its elements read and/or updated

18.13.2.1.4. writing bytearray objects to file streams is done in normal way with stream .write method - you must open the stream in mode compatible for writing to a binary file

18.13.2.1.5. Python stream .readinto() method

18.13.2.1.6. you can also read an entire binary file into memory using the stream .read() method
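A sketch of the bytearray round trip described above: writing to a binary stream, then reading back with readinto() and with read(). The scratch file is hypothetical, created through tempfile so that the sketch is self-contained.

```python
import os
import tempfile

# Ten bytes holding the values 10..19.
data = bytearray(10)
for i in range(len(data)):
    data[i] = 10 + i

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "file.bin")

with open(path, "wb") as bf:         # binary write mode
    bf.write(data)

# readinto() fills an existing bytearray and returns the byte count.
buffer = bytearray(10)
with open(path, "rb") as bf:
    count = bf.readinto(buffer)

# read() returns an immutable bytes object; wrap it to get a bytearray.
with open(path, "rb") as bf:
    whole = bytearray(bf.read())

print(count, list(buffer))
print(whole == data)
```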