Basic data types: strings, ints, and floats

In [2]:
"hello, world"
Out[2]:
'hello, world'
In [3]:
5
Out[3]:
5
In [4]:
5.
Out[4]:
5.0


Assignment

In [5]:
a = 5
In [6]:
a
Out[6]:
5
In [7]:
a = a + 3
In [8]:
a
Out[8]:
8
In [9]:
a += 3
a
Out[9]:
11
In [10]:
2*3
Out[10]:
6
In [11]:
2-3
Out[11]:
-1
In [12]:
6/4
Out[12]:
1.5

Integer division (//) discards the remainder, which is useful when we need whole numbers (e.g., list indices for making a histogram).

In [13]:
6//4
Out[13]:
1
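
As a hedged illustration of the histogram use case, the sketch below sorts a few made-up values into bins of width 10. The names values, bin_width, and counts, and the data itself, are invented for this example.

```python
# Hypothetical data: values between 0 and 49, binned in steps of 10.
values = [3, 17, 12, 45, 29, 8, 31]
bin_width = 10
counts = [0] * 5  # one counter per bin: 0-9, 10-19, ..., 40-49

for v in values:
    index = v // bin_width  # integer division gives a whole-number bin index
    counts[index] += 1

print(counts)  # [2, 2, 1, 1, 1]
```

Regular division (/) would give a float such as 1.7 here, which cannot be used as a list index.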

Modular division (%) gives the remainder left over after integer division. We used it for the reading frame problem.

In [14]:
6%4
Out[14]:
2
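
A small sketch of how the two operators fit together, using the same numbers as above: the quotient and remainder always recombine into the original number. The built-in divmod returns both at once. The variable names are chosen for this example only.

```python
a, b = 6, 4
quotient = a // b    # integer division
remainder = a % b    # modular division

# quotient * b + remainder reconstructs a
print(quotient * b + remainder == a)  # True
print(divmod(a, b))                   # (1, 2)
```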

Exercise one: evaluating functions

In [15]:
A = .03
L = 70
C = (.02*A)/(330*L)
C
Out[15]:
2.5974025974025973e-08
In [16]:
.02*A/330/L
Out[16]:
2.5974025974025973e-08
In [17]:
L=25
NGC=13
MM=2
81.5+(41*NGC-100*MM-675)/L
Out[17]:
67.82

Functions

In [23]:
def Tm(L,NGC,MM):
    """Calculate the Tm of a QuikChange mutagenesis primer
    with length L, NGC GC bases, and MM mismatches."""
    return 81.5+(41*NGC-100*MM-675)/L
In [21]:
Tm(25,13,2)
Out[21]:
67.82
In [22]:
Tm(25,13,1)
Out[22]:
71.82
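
With three positional arguments it is easy to mix up their order. As a sketch, the same function can also be called with keyword arguments, which name each value explicitly. The function is repeated here only so the block runs on its own.

```python
def Tm(L, NGC, MM):
    """Tm of a mutagenesis primer with length L, NGC GC bases, and MM mismatches."""
    return 81.5 + (41 * NGC - 100 * MM - 675) / L

# Keyword arguments may be given in any order.
print(Tm(L=25, NGC=13, MM=2))
print(Tm(MM=2, NGC=13, L=25))
```

Both calls compute the same value as Tm(25,13,2).
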
In [24]:
L
Out[24]:
25
In [25]:
print(L)
25
In [26]:
print(L,MM,Tm(L,NGC,MM))
25 2 67.82
In [27]:
Tm
Out[27]:
<function __main__.Tm>
In [28]:
b = Tm
In [29]:
b(L,NGC,MM)
Out[29]:
67.82
In [31]:
sum([1,2,3,4,5])
Out[31]:
15

Strings part 2: indexing and slicing

In [32]:
dna = "ATGCATAGATATTAA"
In [33]:
dna
Out[33]:
'ATGCATAGATATTAA'

Python indexing counts from 0

In [34]:
dna[0]
Out[34]:
'A'
In [35]:
dna[1]
Out[35]:
'T'

or from -1 when indexing from the end of the string

In [36]:
dna[-1]
Out[36]:
'A'

Taking multiple elements is called slicing. The slice includes the first index and stops right before the second index, so dna[2:4] returns the elements at positions 2 and 3.

In [37]:
dna[2:4]
Out[37]:
'GC'
In [38]:
dna[:4]
Out[38]:
'ATGC'
In [39]:
dna[2:]
Out[39]:
'GCATAGATATTAA'
In [40]:
dna[:]
Out[40]:
'ATGCATAGATATTAA'
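
A few more slicing patterns, sketched on the same string. Negative indices work in slices too, and an optional third number sets the step. The variable names are chosen for this example.

```python
dna = "ATGCATAGATATTAA"

first_codon = dna[0:3]   # positions 0, 1, 2
last_codon = dna[-3:]    # the last three characters
every_third = dna[::3]   # positions 0, 3, 6, 9, 12

print(first_codon)  # ATG
print(last_codon)   # TAA
print(every_third)  # ACATT
```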

Lists

Lists are like strings, but they can hold arbitrary data.

In [41]:
numbers = [1,2,3,4,5]
In [42]:
numbers
Out[42]:
[1, 2, 3, 4, 5]
In [43]:
numbers[2:4]
Out[43]:
[3, 4]
In [44]:
numbers[3]
Out[44]:
4

Unlike strings, we can overwrite individual elements of a list

In [45]:
numbers[3] = 7
numbers
Out[45]:
[1, 2, 3, 7, 5]
In [46]:
numbers[3] = "seven"
numbers
Out[46]:
[1, 2, 3, 'seven', 5]
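
For comparison, here is a sketch of what happens when we try the same thing with a string. Assigning to a single position raises a TypeError, so a changed string has to be built as a new string. The name mutated is invented for this example.

```python
dna = "ATGCATAGATATTAA"

try:
    dna[0] = "G"  # strings cannot be changed in place
except TypeError as err:
    print("TypeError:", err)

# Build a new string from a replacement character and a slice instead.
mutated = "G" + dna[1:]
print(mutated)  # GTGCATAGATATTAA
print(dna)      # ATGCATAGATATTAA (unchanged)
```
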
In [47]:
numbers[3] = 7
In [48]:
a = numbers
In [49]:
a
Out[49]:
[1, 2, 3, 7, 5]
In [50]:
numbers
Out[50]:
[1, 2, 3, 7, 5]
In [52]:
numbers[2] = "hi"
numbers
Out[52]:
[1, 2, 'hi', 7, 5]
In [53]:
a
Out[53]:
[1, 2, 'hi', 7, 5]
In [54]:
a = numbers[:]
a
Out[54]:
[1, 2, 'hi', 7, 5]
In [55]:
numbers[2] = 3
numbers
Out[55]:
[1, 2, 3, 7, 5]
In [56]:
a
Out[56]:
[1, 2, 'hi', 7, 5]
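
The cells above show that a = numbers gives the same list a second name, while a = numbers[:] makes a separate copy. The sketch below puts both cases side by side and uses the is operator to check whether two names refer to the same object. The names alias and snapshot are invented for this example.

```python
numbers = [1, 2, 3, 7, 5]
alias = numbers        # a second name for the same list
snapshot = numbers[:]  # a new list with the same elements

numbers[2] = "hi"

print(alias)     # [1, 2, 'hi', 7, 5]  (changes with numbers)
print(snapshot)  # [1, 2, 3, 7, 5]     (independent copy)
print(alias is numbers, snapshot is numbers)  # True False
```
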
In [57]:
x = [1,2,3,4,5]

For loops

In [59]:
for i in x:
    print(i**2)
    print(i/2)
1
0.5
4
1.0
9
1.5
16
2.0
25
2.5

For loop example: a sum function

In [60]:
def f(x):
    s = 0
    for i in x:
        s += i
    return s
In [61]:
f(x)
Out[61]:
15
In [62]:
f([2,3,4])
Out[62]:
9
In [65]:
for i in [sum, f]:
    print(i(x))
15
15

Functions inside of objects (methods): finding start codons

In [66]:
dna = "GCGATAATGCCGATAGACTGGATATGCA"
In [67]:
dna
Out[67]:
'GCGATAATGCCGATAGACTGGATATGCA'
In [68]:
dna.find("ATG")
Out[68]:
6

Conditional expressions

See the slides for more expressions like these. The important detail is that they evaluate to True or False, so they can be used as the condition for a while loop.

In [69]:
3 != 2
Out[69]:
True
In [70]:
3 == 2
Out[70]:
False
In [71]:
assert(3 != 2)
In [72]:
assert(3 == 2)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-72-14628a6b7041> in <module>()
----> 1 assert(3 == 2)

AssertionError: 
In [73]:
dna.find("ATG")
Out[73]:
6
In [74]:
dna.find("ATG",6)
Out[74]:
6
In [75]:
dna.find("ATG",7)
Out[75]:
23
In [76]:
dna.find("ATG",24)
Out[76]:
-1
In [77]:
offset = dna.find("ATG")
print(offset)
while offset > -1:
    offset = dna.find("ATG", offset + 1)
    print(offset)
6
23
-1
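
The loop above relies on find returning -1 when there is no further match. As a sketch, here are some related string tools that answer similar questions: the in operator gives True or False directly, index behaves like find but raises a ValueError instead of returning -1, and count reports the number of non-overlapping matches.

```python
dna = "GCGATAATGCCGATAGACTGGATATGCA"

print("ATG" in dna)     # True
print(dna.find("TTT"))  # -1, because TTT does not occur

try:
    dna.index("TTT")
except ValueError:
    print("index() raises ValueError when nothing is found")

print(dna.count("ATG"))  # 2
```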

Using modular division for reading frame

In [78]:
6%3
Out[78]:
0
In [79]:
23%3
Out[79]:
2

Sets are like lists but do not have a predictable order and contain only unique elements.

In [80]:
set([0,2,2,0])
Out[80]:
{0, 2}
In [81]:
s = set([2,4])
In [82]:
s
Out[82]:
{2, 4}

We can add elements to a set with add

In [83]:
s.add("hello")
s
Out[83]:
{2, 4, 'hello'}
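
A short sketch of the "unique elements" property: adding a value that is already present leaves the set unchanged. Membership can be tested with in, and sorted gives a predictable order when one is needed. The name frames is chosen for this example.

```python
frames = set([0, 2, 2, 0])
frames.add(2)  # already present, so nothing changes

print(len(frames))     # 2
print(1 in frames)     # False
print(sorted(frames))  # [0, 2]
```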

Likewise, we can add elements to a list with append

In [84]:
L = [1,2,3]
L
Out[84]:
[1, 2, 3]
In [85]:
L.append(4)
L
Out[85]:
[1, 2, 3, 4]
In [96]:
L = []

Exercise: finding all reading frames in a DNA sequence

In [87]:
def reading_frames(dna):
    offset = dna.find("ATG")
    L = []
    while offset > -1:
        L.append(offset)
        offset = dna.find("ATG", offset + 1)
    frames = []
    for offset in L:
        frames.append(offset%3)
    return set(frames)
In [88]:
reading_frames(dna)
Out[88]:
{0, 2}

A slightly more succinct implementation that builds the set directly

In [87]:
def reading_frames(dna):
    offset = dna.find("ATG")
    L = []
    while offset > -1:
        L.append(offset)
        offset = dna.find("ATG", offset + 1)
    frames = set()
    for offset in L:
        frames.add(offset%3)
    return frames
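
One possible further simplification, offered as a sketch rather than the official solution: since only the frames are returned, the intermediate list of offsets can be skipped and each frame added to the set as soon as its start codon is found. The name reading_frames_short is invented here to avoid overwriting the version above.

```python
def reading_frames_short(dna):
    """Return the set of reading frames (0, 1, or 2) that contain an ATG."""
    frames = set()
    offset = dna.find("ATG")
    while offset > -1:
        frames.add(offset % 3)                # frame of this start codon
        offset = dna.find("ATG", offset + 1)  # look for the next one
    return frames

print(reading_frames_short("GCGATAATGCCGATAGACTGGATATGCA"))  # {0, 2}
```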

Useful functions for the homework: range and len

In [90]:
for i in range(10):
    print(i)
0
1
2
3
4
5
6
7
8
9
In [91]:
dna
Out[91]:
'GCGATAATGCCGATAGACTGGATATGCA'
In [92]:
dna[0]
Out[92]:
'G'
In [95]:
for i in range(len(dna)):
    print(dna[i])
G
C
G
A
T
A
A
T
G
C
C
G
A
T
A
G
A
C
T
G
G
A
T
A
T
G
C
A
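
range also accepts a start, a stop, and a step. As a sketch of how that combines with len and slicing, the loop below walks through the sequence three bases at a time, starting in frame 0. The name codons is invented for this example, and the single leftover base at the end is ignored.

```python
dna = "GCGATAATGCCGATAGACTGGATATGCA"

codons = []
# Start at 0, step by 3, and stop early enough that every slice has 3 bases.
for i in range(0, len(dna) - 2, 3):
    codons.append(dna[i:i + 3])

print(codons)
# ['GCG', 'ATA', 'ATG', 'CCG', 'ATA', 'GAC', 'TGG', 'ATA', 'TGC']
```

Changing the start value from 0 to 1 or 2 would read the other two frames.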