Basic data types: strings, ints, and floats

In [1]:
"hello, world"
Out[1]:
'hello, world'

This is cool

  • Remember to think about this

$\sqrt{3}$

In [2]:
5
Out[2]:
5
In [3]:
5.0
Out[3]:
5.0
In [4]:
3 - 2
Out[4]:
1
In [5]:
3 + 5
Out[5]:
8
In [6]:
3 * 2
Out[6]:
6
In [7]:
3 ** 2
Out[7]:
9

In Python 2, dividing one int by another truncates the result to an int. To get floating point division, make sure that at least one value in the division is a float.

In [8]:
3/2
Out[8]:
1
In [9]:
3./2.
Out[9]:
1.5
In [10]:
float(4)/3
Out[10]:
1.3333333333333333
In [11]:
a = 3./2.
In [12]:
a
Out[12]:
1.5
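The division cells above can be condensed into a short sketch. The `//` operator does not appear in the notebook; it is included here as an assumption about what you would reach for when you want an integer quotient on purpose. Converting one operand with float() behaves the same way in Python 2 and Python 3.

```python
# Converting one operand to float forces floating point division.
numerator = 3
denominator = 2

true_quotient = float(numerator) / denominator   # floating point result
floor_quotient = numerator // denominator        # explicit integer (floor) division

print(true_quotient)
print(floor_quotient)
```

Writing `//` when truncation is intended makes it obvious to a reader that the integer result is deliberate.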

Assignment

In [13]:
a = 0
In [14]:
a = a + 1
In [15]:
a
Out[15]:
1
In [16]:
a = a + 1
In [17]:
a
Out[17]:
2
In [18]:
a += 1
a
Out[18]:
3

Exercise one, evaluating functions

In [19]:
x = 1
y = 2
z = 3
In [20]:
V = x*float(y)/z
In [21]:
V
Out[21]:
0.6666666666666666

Functions inside of objects

Using string.find to find start codons

In [22]:
dna = "GGCCGTATGAGGTCATGCACACACACACCGAGAGTATGA"
dna
Out[22]:
'GGCCGTATGAGGTCATGCACACACACACCGAGAGTATGA'
In [23]:
dna.find("ATG")
Out[23]:
6
In [24]:
dna.find("ATG",6)
Out[24]:
6
In [25]:
dna.find("ATG",7)
Out[25]:
14
In [26]:
dna.find("M")
Out[26]:
-1
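As a rough summary of the cells above: find returns the index of the first match at or after the optional start position, and -1 when there is no match. The sketch below wraps that check in a helper called has_motif, which is a hypothetical name introduced only for illustration.

```python
dna = "GGCCGTATGAGGTCATGCACACACACACCGAGAGTATGA"

def has_motif(seq, motif):
    """Return True if motif occurs anywhere in seq (hypothetical helper)."""
    return seq.find(motif) != -1

first_start = dna.find("ATG")                   # search from the beginning
next_start = dna.find("ATG", first_start + 1)   # resume just past the previous hit

print(first_start)
print(next_start)
print(has_motif(dna, "M"))
```

Starting the second search at first_start + 1 matters: starting at first_start itself would return the same position again, as In [24] shows.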

Conditional expressions

In [27]:
dna.find("M") == -1
Out[27]:
True
In [28]:
2 == 3
Out[28]:
False
In [29]:
2 != 3
Out[29]:
True
In [30]:
2 > 3
Out[30]:
False

Using a while loop to find all start codons

In [31]:
offset = dna.find("ATG")
print(offset)
while offset != -1:
    offset = dna.find("ATG", offset + 1)
    print(offset)
6
14
35
-1
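The loop above also prints the final -1, because the print happens after the search and before the condition is checked again. One possible rearrangement, assuming you only want the real hits, is to print first and search second:

```python
dna = "GGCCGTATGAGGTCATGCACACACACACCGAGAGTATGA"

offset = dna.find("ATG")
while offset != -1:
    print(offset)                          # only real hits reach this line
    offset = dna.find("ATG", offset + 1)   # then look for the next one
```

With this ordering the loop prints 6, 14, and 35 and stops as soon as find returns -1.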

Lists

In [32]:
[6,14,35,-1]
Out[32]:
[6, 14, 35, -1]
In [33]:
a = [5,2,1,"hi"]
In [34]:
a
Out[34]:
[5, 2, 1, 'hi']

Concatenating and appending to lists

In [35]:
a += ["3",13]
In [36]:
a
Out[36]:
[5, 2, 1, 'hi', '3', 13]
In [37]:
a.append(5)
a
Out[37]:
[5, 2, 1, 'hi', '3', 13, 5]
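The difference between the two operations is easy to miss, so here is a small sketch. The last append call is not in the notebook; it is added as an illustration of what happens when a list is passed to append.

```python
a = [5, 2, 1, "hi"]

a += ["3", 13]      # concatenation: each element of the right-hand list is added
a.append(5)         # append: exactly one element is added
a.append([7, 8])    # a list counts as one element, so it ends up nested

print(a)
print(len(a))
```

Use += when you have several values to add, and append when you have a single value.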
In [38]:
starts = []
offset = dna.find("ATG")
while(offset != -1):
    starts.append(offset)
    offset = dna.find("ATG",offset+1)
starts
Out[38]:
[6, 14, 35]

Defining functions

In [39]:
def find_starts(dna):
    starts = []
    offset = dna.find("ATG")
    while(offset != -1):
        starts.append(offset)
        offset = dna.find("ATG",offset+1)
    return starts
In [40]:
find_starts(dna)
Out[40]:
[6, 14, 35]
In [41]:
other_seq = "XXXXATGXXXX"
find_starts(other_seq)
Out[41]:
[4]
In [42]:
other_seq = "XXXXaTGXXXX"
find_starts(other_seq)
Out[42]:
[]
In [43]:
find_starts(dna)
Out[43]:
[6, 14, 35]
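The lowercase example above returns an empty list because find is case sensitive. A minimal sketch of one way around that, assuming it is acceptable to treat lowercase and uppercase bases as equivalent, is to normalise the sequence with upper() before searching:

```python
def find_starts(dna):
    """Return the offsets of every ATG in dna (same logic as In [39])."""
    starts = []
    offset = dna.find("ATG")
    while offset != -1:
        starts.append(offset)
        offset = dna.find("ATG", offset + 1)
    return starts

other_seq = "XXXXaTGXXXX"
raw_hits = find_starts(other_seq)                 # case-sensitive search
normalised_hits = find_starts(other_seq.upper())  # search after upper-casing

print(raw_hits)
print(normalised_hits)
```

Note that upper() returns a new string; other_seq itself is left unchanged.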

Using the modulus operator to find reading frame

In [44]:
6 % 3
Out[44]:
0
In [45]:
for i in find_starts(dna):
    print(i)
6
14
35
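The two cells above can be combined: taking each start offset modulo 3 gives the reading frame (0, 1, or 2) that the start codon falls in. A minimal sketch, where the list name frames is introduced only for this example:

```python
def find_starts(dna):
    """Return the offsets of every ATG in dna (same logic as In [39])."""
    starts = []
    offset = dna.find("ATG")
    while offset != -1:
        starts.append(offset)
        offset = dna.find("ATG", offset + 1)
    return starts

dna = "GGCCGTATGAGGTCATGCACACACACACCGAGAGTATGA"

frames = []
for start in find_starts(dna):
    frames.append(start % 3)   # remainder after dividing by the codon length

print(frames)
```

Start codons with the same remainder are in the same reading frame, so here the second and third start codons share a frame and the first does not.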

Two tricks that I didn't mention explicitly in class:

  • Combining multiple expressions with boolean logic (you can use and, or, and not)
  • Finding the length of a string or list with the len function
In [46]:
(5 == 3) or (5 == 5)
Out[46]:
True
In [47]:
len(dna)
Out[47]:
39
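Both tricks are useful together. The sketch below is an illustration rather than something shown in class: it asks whether the last start codon in dna is followed by another complete codon, using and, not, and len. The variable names are made up for this example.

```python
dna = "GGCCGTATGAGGTCATGCACACACACACCGAGAGTATGA"

offset = dna.find("ATG", 15)   # search past the first two start codons

found = offset != -1
complete_codon = found and (offset + 3 <= len(dna))   # the ATG itself fits
room_for_next = found and (offset + 6 <= len(dna))    # one more codon after it
truncated = found and not room_for_next

print(offset)
print(complete_codon)
print(truncated)
```

Checking found first matters: if find had returned -1, the length comparisons would be meaningless.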

At this point I crashed my notebook demonstrating an infinite while loop.

As a reminder, you can usually break out of an infinite loop via Kernel -> Interrupt from inside Jupyter.

If that doesn't work, find IPython's process ID via:

  • Linux:

    ps -ealf | grep -i ipython
  • OS X:

    ps -awx | grep -i ipython

Then pass that process ID to kill to terminate it.
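The notebook does not record which loop hung. One plausible way to get an infinite loop with the code in these notes is to drop the "+ 1" from the find call, so the same hit is found forever. The sketch below keeps the correct call and adds a safety counter; max_iterations is a hypothetical limit chosen for illustration, not something from the class.

```python
dna = "GGCCGTATGAGGTCATGCACACACACACCGAGAGTATGA"

max_iterations = 1000   # hypothetical safety limit, far more than needed here
iterations = 0

starts = []
offset = dna.find("ATG")
while offset != -1 and iterations < max_iterations:
    starts.append(offset)
    # Without the "+ 1", find would return the same offset on every pass.
    offset = dna.find("ATG", offset + 1)
    iterations += 1

print(starts)
print(iterations)
```

A guard like this is useful while experimenting; once the loop is known to terminate, it can be removed.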

Pseudo-random numbers

A simple linear congruential generator based on modular arithmetic

In [48]:
a = 6
m = 13
z = 2
In [49]:
numbers = []
while(len(numbers) < 20):
    z = (a*z) % m
    numbers.append(z)
numbers
Out[49]:
[12, 7, 3, 5, 4, 11, 1, 6, 10, 8, 9, 2, 12, 7, 3, 5, 4, 11, 1, 6]
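Since the same loop is reused below with different parameters, it can be wrapped in a function. This is a minimal sketch; the function name lcg and its parameter names are assumptions made for this example, and the update rule is the same z = (a*z) % m used above.

```python
def lcg(a, m, seed, count):
    """Return count values from the recurrence z = (a * z) % m (hypothetical helper)."""
    numbers = []
    z = seed
    while len(numbers) < count:
        z = (a * z) % m        # multiply, then keep the remainder modulo m
        numbers.append(z)
    return numbers

numbers = lcg(6, 13, 2, 20)
print(numbers)
```

Called with a = 6, m = 13, and a seed of 2, it reproduces the list in Out[49].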

Checking for unique elements with a set

In [50]:
set(numbers)
Out[50]:
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
In [51]:
a = 6
m = 12
z = 1
In [52]:
numbers = []
while(len(numbers) < 20):
    z = (a*z) % m
    numbers.append(z)
numbers
Out[52]:
[6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
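Counting the distinct values makes the difference between the two parameter choices concrete. The sketch below repeats the loop from above inside a helper named lcg (a name assumed for this example) and compares the size of each set.

```python
def lcg(a, m, seed, count):
    """Return count values from the recurrence z = (a * z) % m (hypothetical helper)."""
    numbers = []
    z = seed
    while len(numbers) < count:
        z = (a * z) % m
        numbers.append(z)
    return numbers

good = set(lcg(6, 13, 2, 20))   # parameters from In [48]
bad = set(lcg(6, 12, 1, 20))    # parameters from In [51]

print(len(good))
print(len(bad))
```

With m = 13 the generator visits every value from 1 to 12 before repeating, while with m = 12 it collapses to 0 after a single step and never recovers.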

A very good choice of a and m (determined empirically) from Comm. ACM 31:1192

In [53]:
a = 16807
m = 2**31-1
z = 1
In [54]:
numbers = []
while(len(numbers) < 20):
    z = (a*z) % m
    numbers.append(z)
numbers
Out[54]:
[16807,
 282475249,
 1622650073,
 984943658,
 1144108930,
 470211272,
 101027544,
 1457850878,
 1458777923,
 2007237709,
 823564440,
 1115438165,
 1784484492,
 74243042,
 114807987,
 1137522503,
 1441282327,
 16531729,
 823378840,
 143542612]
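These integers are rarely used directly. A common follow-up, which was not part of the notebook and is shown here only as a sketch, is to divide each value by m to get a float strictly between 0 and 1. The list name uniforms is made up for this example.

```python
a = 16807
m = 2**31 - 1
z = 1

uniforms = []
while len(uniforms) < 5:
    z = (a * z) % m
    uniforms.append(z / float(m))   # float() keeps this a true division in Python 2

print(uniforms)
```

Because z is never 0 and is always less than m for these parameters, each scaled value lies strictly between 0 and 1.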