# Python Basics

## Syntax and Data Types

### Variable assignment

In [1]:
a = 42
type(a) # you can use the type function to check the type of any object

int

In [2]:
b = a # associates the name b with the object stored in a
a = "a string" # a is now associated with a new object
print(b) # b is still associated with the original object

42
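
The same name-binding rule matters for mutable objects such as lists. A minimal sketch (the names `first` and `second` are illustrative only):

```python
first = [1, 2, 3]
second = first          # both names refer to the same list object
second.append(4)        # modifies the shared object in place
print(first)            # [1, 2, 3, 4]: the change is visible through either name
print(first is second)  # True

second = list(first)    # an explicit copy creates a new, independent object
second.append(5)
print(first)            # [1, 2, 3, 4]: the original is unaffected
```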


### Numeric Types

In [3]:
a = 5         # integers
b = 2.7e30    # floating-point number (double precision)
c = 5. + 1.7j # complex floating-point numbers

Most operators are the same as in the C-language family.

In [4]:
5 % 2 # modulo operation

1

In [5]:
a += 2 # equivalent to a = a + 2
a

7

In [6]:
a != 7 # inequality operator

False

In [7]:
2 | 4 # bitwise or

6

In [8]:
1 ^ 9 # bitwise xor

8

In [9]:
1 << 10 # bit-shift operation

1024

#### Notable exceptions
Python does not support the syntax `a++` for `a = a + 1`. Use `a += 1` instead.

In [10]:
1/2 # division of integers always returns a float (different from Python 2)

0.5

In [11]:
1//2 # explicit integer division is available in Python 2 and 3

0

In [12]:
2**12 # power operator as in Fortran

4096

In [13]:
# Integers have no limit on their value. (Python 2 had a special long integer type)
2**1000

10715086071862673209484250490600018105614048117055336074437503883703510511249361224931983788156958581275946729175531468251871452856923140435984577574698574803934567774824230985421074605062371141877954182153046474983581941267398767559165543946077062914571196477686542167660429831652624386837205668069376

### Boolean Type

In [14]:
1 == 1

True

In [15]:
import math
a = math.nan  # not a number
a != a

True
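
Because NaN never compares equal to anything, an equality test cannot detect it. A short sketch using `math.isnan` instead:

```python
import math

x = math.nan
print(x == x)          # False: NaN is not even equal to itself
print(math.isnan(x))   # True: the reliable way to test for NaN
```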

In [16]:
a = 17
a > 10 and a < 20

True

In [17]:
not True

False

### Strings

In [18]:
a = "first half "
b = "second half"
a + b # adding is concatenation

'first half second half'

In [19]:
a = ['a','list','of','words']
'-'.join(a)

'a-list-of-words'

In [20]:
a = "text"
print(a.upper()) # and a.lower(), of course
print(a.capitalize())

TEXT
Text


In [21]:
10 * "Yes" # integer multiplication repeats strings

'YesYesYesYesYesYesYesYesYesYes'

#### Escaping

Special characters (e.g. newline) can be expressed using C-style escape sequences.

In [22]:
print("line one\nline two")

line one
line two


This can lead to unintended effects.

In [23]:
print("$M_\text{star}$") # \t is the tab character

$M_	ext{star}$


In [24]:
print(r"$M_\text{star}$") # The r prefix makes it a raw string, switching off all escape sequences. Useful for LaTeX.
print("$M_\\text{star}$") # Alternative: Escape the backslash as \\

$M_\text{star}$
$M_\text{star}$


#### String Formatting

In [25]:
"%06d" % 50 # printf-style formatting is supported (see man 3 printf)

'000050'

In [26]:
m = 1989099999999999887809347321856.0 
print("%.3e" % m) # always use exponential notation with 3 decimal places
print("%.3g" % m) # automatic use of exponential with 3 significant figures
print("%.3f" % m) # never use exponential notation

1.989e+30
1.99e+30
1989099999999999887809347321856.000


In [27]:
"The %s is %d." % ("answer", 0o52)

'The answer is 42.'

#### Alternative Formatting

In [28]:
"{name} {value:.3g}".format(value=m, name="solar mass")

'solar mass 1.99e+30'

In [29]:
"The {} is {}.".format("answer", 0x2a)

'The answer is 42.'
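
Assuming Python 3.6 or newer, formatted string literals (f-strings) accept the same format specifiers as `.format()` and read the values directly from variables. A small sketch with illustrative values:

```python
m = 1.9891e30
name = "solar mass"
text = f"{name} {m:.3g}"      # same specifier syntax as in .format()
print(text)                   # solar mass 1.99e+30
answer = f"The answer is {0x2a}."
print(answer)                 # The answer is 42.
```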

#### Useful String Methods

In [30]:
'car' in 'incarnation'

True

In [31]:
'The sky is grey.'.replace('grey', 'blue') # more advanced replacement is possible with the regular expression module (re)

'The sky is blue.'

In [32]:
"Python".endswith("on") # also .startswith(); use regular expressions for more advanced tests on strings

True

#### Strings and Bytes

Python strings are internally stored as Unicode. For input and output they have to be converted to bytes, specifying the desired encoding.

In [33]:
a = "Großbritannien"
a.encode('utf-8')

b'Gro\xc3\x9fbritannien'

In [34]:
a.encode('iso-8859-1') # also known as latin-1

b'Gro\xdfbritannien'

In [35]:
b = "パイソン"
b.encode('utf-8')

b'\xe3\x83\x91\xe3\x82\xa4\xe3\x82\xbd\xe3\x83\xb3'

In [36]:
b.encode('utf-16')

b'\xff\xfe\xd10\xa40\xbd0\xf30'

In [37]:
b.encode('iso-8859-1')

UnicodeEncodeError: 'latin-1' codec can't encode characters in position 0-3: ordinal not in range(256)

The situation is very different in Python 2, where the default `str` type holds bytes and Unicode text needs a separate type (`u"a string"`). This difference is one of the most common sources of problems when converting code from Python 2 to 3.
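
The reverse direction uses `bytes.decode`. The sketch below round-trips a string, shows what happens when the wrong encoding is assumed, and uses the `errors` argument to avoid an exception:

```python
text = "Großbritannien"
data = text.encode("utf-8")         # str -> bytes
roundtrip = data.decode("utf-8")    # bytes -> str, using the same encoding
print(roundtrip == text)            # True

# Decoding with the wrong encoding does not fail here, it silently garbles the text.
wrong = data.decode("iso-8859-1")
print(repr(wrong))

# Characters missing from the target encoding can be replaced instead of raising.
replaced = "パイソン".encode("iso-8859-1", errors="replace")
print(replaced)                     # b'????'
```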

### Lists

In [38]:
li = [12, 45, "text", ["another", "list"]]
# lists can contain objects of different types
li[3][0]

'another'

In [39]:
li.append("last") # modifies the list in place
li

[12, 45, 'text', ['another', 'list'], 'last']

In [40]:
["one", "list"] + li[3] # creates a new list

['one', 'list', 'another', 'list']

In [41]:
a = [3, 5, 6]
a.insert(1, 4)
a

[3, 4, 5, 6]

In [42]:
del a[0]
a

[4, 5, 6]

In [43]:
len(a) # returns the number of elements of any list-like object

3

#### List Slicing
Slicing works on many list-like objects, e.g. strings.

In [44]:
a = list(range(10))
a

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [45]:
a[1:5] # the stop index itself is not included in the slice

[1, 2, 3, 4]

In [46]:
a[:-1] # -1 refers to the last element. Leaving out a value extends the slice to the start or end of the list.

[0, 1, 2, 3, 4, 5, 6, 7, 8]

In [47]:
a[::2] # third argument is the step size

[0, 2, 4, 6, 8]

In [48]:
a[::-1] # easiest method of reversing a list

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
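
The same slice syntax applies to strings and tuples, as this short sketch with illustrative values shows:

```python
s = "incarnation"
middle = s[2:5]
print(middle)          # car
backwards = s[::-1]
print(backwards)       # noitanracni

t = (0, 1, 2, 3, 4)
tail = t[-2:]          # the last two elements, returned as a new tuple
print(tail)            # (3, 4)
```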

### Tuples

Tuples are similar to lists, but they cannot be changed after creation. They are implicitly created when a function returns more than one value.

In [49]:
t = ("this", "is", 1, "tuple")
t[1]

'is'

In [50]:
t[1] = 12

TypeError: 'tuple' object does not support item assignment

In [51]:
('only one element',) # Without the comma these would just be parentheses used for grouping.

('only one element',)

### Sets
Sets can only contain unique elements and do not preserve order. Internally, a set uses the hash of each object to check for uniqueness.

In [52]:
s = set(['abc',2,4,17,2*2])
s

{17, 2, 'abc', 4}

In [53]:
a = {1,2,3}
b = {3,4,5}
a.intersection(b) # typical set operations are supported

{3}
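
The other common set operations are also available as operators. A brief sketch, including the frequent use of a set to remove duplicates from a list:

```python
a = {1, 2, 3}
b = {3, 4, 5}
union = a | b              # same as a.union(b)
difference = a - b         # elements of a that are not in b
symmetric = a ^ b          # elements in exactly one of the two sets
print(union, difference, symmetric)

unique_sorted = sorted(set([3, 1, 3, 2, 1]))  # drop duplicates, then sort
print(unique_sorted)       # [1, 2, 3]
```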

### Dictionaries
Dictionaries store key-value pairs. In other languages they are known as associative arrays, hash tables, or maps.

In [54]:
d = dict(building="Herschel", level=3)
d['room number'] = "3.19"
d

{'building': 'Herschel', 'level': 3, 'room number': '3.19'}

In [55]:
'room number'  in d

True

In [56]:
del d['room number'] # deleting single elements
'room number' in d

False

In [57]:
d[-17.5] = "negative seventeen point five" # any hashable type can be a key
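
Two further operations come up constantly: looping over key-value pairs and looking up a key that may be missing. A minimal sketch:

```python
d = {"building": "Herschel", "level": 3}

pairs = []
for key, value in d.items():    # iterate over key-value pairs
    pairs.append((key, value))
    print(key, "->", value)

room = d.get("room number", "unknown")  # returns the default instead of raising KeyError
print(room)                     # unknown
```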

## Control Flow
Blocks for loops, if clauses, etc. are generally defined using indentation. Any type of indentation (spaces or tabs) can be used as long as it is consistent in one block.

It is best practice to use 4 spaces per indentation level and no tabs. Configure your editor accordingly!

### Line breaks

Normally there is one statement per line. Lines can have arbitrary length but it is best practice not to go above 80 characters per line if possible.

In [58]:
a = 1; print(a) # You can have several statements separated by a semicolon.
# This should mostly be avoided for readability.

b = 2 \
    * \
    a
# You can continue a statement in the next line by placing a \ at the end of line.
print(a,
      b) # expressions inside parentheses are automatically continued.

1
1 2


### Loops

In [59]:
for i in range(5):
    print(i)

0
1
2
3
4


In [60]:
i = 5
while i > 0:
    print(i)
    i -= 1

5
4
3
2
1


### `if` Clauses

In [61]:
a = "Natus nostrum esse deleniti autem commodi temporibus atque"
nwords = len(a.split(' '))
if nwords < 3:
    print("few words")
elif nwords < 10: # any number of elif clauses is possible
    print("moderate number of words")
else:
    print("many words")

moderate number of words


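Python does not have a direct equivalent of the `switch` or `select` statements from C or Fortran; a chain of `elif` clauses is normally used instead. Python 3.10 and later offer the `match` statement for similar purposes.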
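
A dictionary lookup is a common substitute for a simple `switch`. The sketch below uses a hypothetical helper `days_in_month`, invented for this example:

```python
def days_in_month(month):
    """Return the number of days in a month (leap years ignored)."""
    # The keys play the role of the case labels;
    # the second argument of .get() acts as the default case.
    short_months = {"feb": 28, "apr": 30, "jun": 30, "sep": 30, "nov": 30}
    return short_months.get(month, 31)

print(days_in_month("feb"))   # 28
print(days_in_month("jul"))   # 31
```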

### Advanced Loop Control

In [62]:
# This finds a root of x**2 - 2
x = 0.1 # start value
for i in range(30): # maximum number of iterations
    res = abs(x**2 - 2) # residual error
    if res < 1e-13: # convergence threshold
        break # Stop the loop. Equivalent to exit in Fortran.
    x -= (x**2 - 2) / (2 * x) # next iteration using the derivative
    print("step {}: x = {:.2f} (res = {:.3g})".format(i, x, res))
else: # this block is only executed if the loop finished its last iteration without break
    print("no convergence")
print("result:", x)

step 0: x = 10.05 (res = 1.99)
step 1: x = 5.12 (res = 99)
step 2: x = 2.76 (res = 24.3)
step 3: x = 1.74 (res = 5.6)
step 4: x = 1.44 (res = 1.03)
step 5: x = 1.41 (res = 0.0879)
step 6: x = 1.41 (res = 0.000924)
step 7: x = 1.41 (res = 1.07e-07)
result: 1.4142135623730956


There is also `continue` which immediately jumps to the next iteration. In Fortran this would be called `cycle`.
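
A minimal sketch of `continue`, collecting the squares of the odd numbers below 10:

```python
odd_squares = []
for i in range(10):
    if i % 2 == 0:
        continue              # skip the rest of the body for even numbers
    odd_squares.append(i**2)
print(odd_squares)            # [1, 9, 25, 49, 81]
```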

### More loops

In [63]:
li = ['one', 'two', 'three']
for l in li:
    # you can directly loop over any iterable object without using an index
    print(l)

one
two
three


In [64]:
for i, l in enumerate(li): # in case you still need the index
    print(i, ":", l)

0 : one
1 : two
2 : three


In [65]:
flowers = ['roses', 'violets']
colours = ['red', 'blue']
for f, c in zip(flowers, colours): # loop over two lists simultaneously
    print(f.capitalize(), "are", c)

Roses are red
Violets are blue


#### List Comprehensions
Compact way of constructing lists.

In [66]:
[2 * a for a in range(5,10)]

[10, 12, 14, 16, 18]

In [67]:
lorem = "Natus nostrum esse deleniti autem commodi temporibus atque"
long_words = [x for x in lorem.split(' ') if len(x) > 5]
long_words

['nostrum', 'deleniti', 'commodi', 'temporibus']
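
For comparison, the filtering comprehension above corresponds to the following explicit loop (a sketch; the result is the same list):

```python
lorem = "Natus nostrum esse deleniti autem commodi temporibus atque"

long_words = []
for x in lorem.split(' '):
    if len(x) > 5:            # the condition after "if" in the comprehension
        long_words.append(x)  # the expression before "for" in the comprehension
print(long_words)
```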

## Functions

In [68]:
def add(a, b):
    return a + b
add(1, 2)

3

In [69]:
add("a", " string") # No type checking is done automatically.
# This often makes functions more versatile. (duck typing)

'a string'

In [70]:
add(1, [2]) # an error inside the function raises an exception

TypeError: unsupported operand type(s) for +: 'int' and 'list'

In [71]:
def split_half(x):
    """This splits the argument into two pieces.
    
    Args:
        x: any list-like object
    Returns:
        first half
        second half
    """
    h = len(x) // 2
    return x[:h], x[h:]
split_half('this is a long text')

('this is a', ' long text')

In [72]:
help(split_half) # In IPython you can also use the command "split_half?".

Help on function split_half in module __main__:

split_half(x)
    This splits the argument into two pieces.
    
    Args:
        x: any list-like object
    Returns:
        first half
        second half



In [73]:
first, second = split_half('abcdef') # you can unpack the returned tuple into multiple variables
second

'def'

#### Keyword Arguments
Named arguments with a default value.

In [74]:
def power_add(a, b, c=0.0):
    return a ** b + c

power_add(5.0, 2.0) # Keyword arguments are optional.

25.0

In [75]:
power_add(5.0, 2.0, c=17.0) # You can override default values.

42.0

In [76]:
power_add(b=10.0, a=2.0)
# You can specify positional arguments out of order if you supply the name.

1024.0

In [77]:
power_add(1.0) # arguments without a default value are always required

TypeError: power_add() missing 1 required positional argument: 'b'

#### Flexible Argument Lists

In [78]:
def add_all(*args): # args is a tuple with all remaining positional arguments
    r = 0
    for x in args:
        r += x
    return r

add_all(5,10,20)

35

In [79]:
add_all() # args can be empty

0

In [80]:
li = [1, 12, 15]
add_all(*li) # The * operator unpacks the list items into individual arguments.

28

#### Flexible Keyword Arguments

In [81]:
def power(a, exponent=1.0):
    return a**exponent

def power_multiply(x, factor=1.0, **kwargs):  # all remaining keyword arguments are put into a dictionary called kwargs
    return power(x, **kwargs) * factor  # This passes on the keywords to the power function

power_multiply(5.0, factor=2.0)

10.0

In [82]:
power_multiply(5.0, exponent=3.0, factor=2.0)

250.0

In [83]:
power_multiply(5.0, another_keyword=27) # This raises an error as the power function doesn't handle this kwarg

TypeError: power() got an unexpected keyword argument 'another_keyword'

#### Lambda Functions
A way of specifying anonymous functions inline. Can always be replaced by using named functions defined with `def`.

In [84]:
addone = lambda x: x + 1
addone(10)

11

This can be useful in some contexts, e.g. maps.

In [85]:
list(map(addone, [5, 10, 20])) # map applies the first argument to all elements of the second argument

[6, 11, 21]

In [86]:
list(map(lambda x: x + 1, [5, 10, 20])) # use carefully to maintain readability

[6, 11, 21]

In [87]:
evenodd = lambda x: 'even' if x % 2 == 0 else 'odd'
evenodd(11)

'odd'
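
Another typical place for a lambda is the `key` argument of `sorted`. A short sketch with an illustrative word list, alongside the equivalent named function:

```python
words = ['banana', 'kiwi', 'apple', 'fig']

by_length = sorted(words, key=lambda w: len(w))  # sort by length instead of alphabetically
print(by_length)   # ['fig', 'kiwi', 'apple', 'banana']

def word_length(w):
    return len(w)

same_result = sorted(words, key=word_length)     # named function, identical result
```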

## Modules and Standard Library

### Importing Modules

- standard way of making functionality from other Python files available
- code is only executed at first import
- subsequent imports only make the namespace available
- `import` can be done at any place in code
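
The "executed only at first import" behaviour can be checked with a throwaway module. In this sketch `demo_mod` is a hypothetical module written to a temporary directory only for the demonstration:

```python
import contextlib
import importlib
import io
import os
import sys
import tempfile

sys.modules.pop("demo_mod", None)       # make sure the example starts from a clean state

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "demo_mod.py"), "w") as f:
        f.write("print('executing demo_mod')\nvalue = 42\n")

    sys.path.insert(0, tmp)             # make the temporary directory importable
    importlib.invalidate_caches()
    output = io.StringIO()
    try:
        with contextlib.redirect_stdout(output):
            import demo_mod             # first import: runs the module code
            first = demo_mod
            import demo_mod             # second import: reuses the cached module
            second = demo_mod
    finally:
        sys.path.remove(tmp)

run_count = output.getvalue().count("executing demo_mod")
print(run_count)                        # 1: the module body ran only once
print(first is second)                  # True: both imports give the same module object
```

Imported modules are cached in `sys.modules`, which is why the second `import` statement only binds the name again.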

In [3]:
import numpy
numpy.array([1,2,3])

array([1, 2, 3])

In [4]:
import numpy as np # specify an alias
np.array([1,2,3])

array([1, 2, 3])

In [5]:
from numpy import array # just import select items
array([1,2,3])

array([1, 2, 3])

In [6]:
from numpy import * # all items directly available, danger of overwriting local variables
array([1,2,3])

array([1, 2, 3])

#### `sys` Module
Functions for interacting with the Python interpreter.

In [7]:
import sys
print(sys.version)

3.5.3 (default, Jan 19 2017, 14:11:04) 
[GCC 6.3.0 20170118]


In [8]:
print(sys.platform)

linux


`sys.path` is the list of paths to search when importing a module.

`sys.argv` is the list of command line arguments the script was run with.

#### `os` Module
Interacting with the operating system, e.g. accessing the file system or running commands.

In [10]:
import os
os.listdir()

['.ipynb_checkpoints',
 'syntax.ipynb',
 'syntax.slides.html',
 'custom.css',
 'reveal.js']

`os.system` runs an arbitrary command in the shell. A safer and more powerful alternative is available in the `subprocess` module.
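
A minimal sketch of the `subprocess` route, assuming Python 3.5 or newer for `subprocess.run`. The child command is simply the Python interpreter itself, chosen so that the example does not depend on a particular shell:

```python
import subprocess
import sys

result = subprocess.run(
    [sys.executable, "-c", "print('hello from a child process')"],
    stdout=subprocess.PIPE,     # capture the output instead of printing it
    universal_newlines=True,    # decode the output to str
    check=True,                 # raise CalledProcessError on a non-zero exit code
)
print(result.returncode)
print(result.stdout.strip())
```

Passing the command as a list avoids the shell entirely, so arguments need no quoting.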

Functions like `os.remove`, `os.mkdir`, `os.chdir`, `os.chmod` are similar to their command-line counterparts. Check the docstrings!

#### `math` Module
All standard math functions, acting on scalar values. Array versions are available in `numpy`. More advanced functions are available in `scipy`.

In [12]:
import math
math.sin(0.5 * math.pi)

1.0

#### Many others
There are many integrated modules for reading different file formats, communicating over the network (load webpage, send email, …) and many other tasks.

See https://docs.python.org/3/py-modindex.html

## Simple File Input/Output

### Writing a file

In [21]:
f = open('mynewfile.txt','w')
f.write('This is one line.\nThis is another line.\n')
f.close()

If an error occurs before the file is closed, not all contents might be written. A context manager ensures that the file is closed in any case.

In [22]:
with open('mynewfile.txt','w') as f:
    f.write('This is one line.\nThis is another line.\n')

# f is automatically closed, once the block is left, even if an exception occurs.

Using `with` is the preferred method.
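
The following sketch simulates a failure inside the `with` block and checks that the file was nevertheless closed and its contents written. The file name `demo.txt` and the temporary directory are assumptions made for the example:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")
    try:
        with open(path, "w") as f:
            f.write("first line\n")
            raise RuntimeError("simulated failure")  # error before the block ends
    except RuntimeError:
        pass
    closed_after_error = f.closed    # the context manager closed the file

    with open(path) as f:
        content = f.read()           # the data written before the error is on disk

print(closed_after_error)            # True
print(repr(content))                 # 'first line\n'
```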

### Reading a file

In [27]:
with open('mynewfile.txt','r') as f: # 'r' is also the default when not specifying a mode
    a = f.read() # reads everything in one go (be careful with large files)

print(a)

This is one line.
This is another line.



In [29]:
with open('mynewfile.txt','r') as f: # 'r' is also the default when not specifying a mode
    a = f.readlines() # automatically splits lines into a list
a

['This is one line.\n', 'This is another line.\n']

In [30]:
with open('mynewfile.txt','r') as f: # 'r' is also the default when not specifying a mode
    a = f.read(6) # only read 6 characters at a time
    while a != '':
        print(a)
        a = f.read(6)

This i
s one 
line.

This i
s anot
her li
ne.
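
A file object can also be iterated over directly, which reads one line at a time and therefore suits large files. A sketch that writes the example file into a temporary directory first (the directory is an assumption of this example):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mynewfile.txt")
    with open(path, "w") as f:
        f.write("This is one line.\nThis is another line.\n")

    lines = []
    with open(path) as f:
        for line in f:                       # yields one line per iteration
            lines.append(line.rstrip("\n"))  # strip the trailing newline

print(lines)   # ['This is one line.', 'This is another line.']
```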



### File Modes

- `'r'` read (default)
- `'w'` write, truncate file to zero length
- `'x'` create new file, fail if exists
- `'a'` append at end of existing file, create otherwise

The following can be added to a mode:
- `'t'` text mode (default), reads/writes strings using the default system encoding, newlines are converted (important on DOS/Windows)
- `'b'` binary mode, reads/writes bytes instead of strings, no conversion of newlines
- `'+'` open for reading and writing, e.g. `'r+'`
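
A short sketch exercising several of these modes. The file name `modes.txt` and the temporary directory are assumptions made for the example:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "modes.txt")

    with open(path, "x") as f:       # 'x': create, fails if the file exists
        f.write("one\n")

    exists_error = False
    try:
        open(path, "x")              # the file exists now, so this raises
    except FileExistsError:
        exists_error = True

    with open(path, "a") as f:       # 'a': append to the existing content
        f.write("two\n")

    with open(path, "r") as f:       # text mode returns str
        text = f.read()
    with open(path, "rb") as f:      # binary mode returns bytes, newlines untouched
        raw = f.read()

print(exists_error)
print(repr(text))
print(type(raw))
```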