## Basic data types: strings, ints, and floats

In [2]:
"hello, world"

'hello, world'

In [3]:
5

5

In [4]:
5.

5.0

This is *really* cool

$\pi$

## Assignment

In [5]:
a = 5

In [6]:
a

5

In [7]:
a = a + 3

In [8]:
a

8

In [9]:
a += 3
a

11

In [10]:
2*3

6

In [11]:
2-3

-1

In [12]:
6/4

1.5

Integer (floor) division discards the remainder, which is useful when we need a whole number (*e.g.*, a list index when making a histogram)

In [13]:
6//4

1
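As a rough illustration of the histogram use case, the sketch below counts some made-up ages in 10-year bins. The names `ages`, `bin_width`, and `counts` are invented for this example and do not come from the lecture.

```python
# Hypothetical data: ages to be grouped into 10-year bins.
ages = [3, 17, 25, 29, 41, 48, 49]
bin_width = 10

# Five bins covering 0-9, 10-19, 20-29, 30-39, and 40-49.
counts = [0, 0, 0, 0, 0]
for age in ages:
    # Integer division turns a value into a whole-number bin index.
    counts[age // bin_width] += 1

print(counts)  # [1, 1, 2, 0, 3]
```

Regular division would give a float such as 2.5, which cannot be used as a list index.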

The modulo operator gives the remainder after division; we used it for the reading frame problem

In [14]:
6%4

2
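Integer division and the remainder fit together: the quotient times the divisor, plus the remainder, gives back the original number. A small sketch of that relationship, using the same numbers as the cells above (the variable names are made up here):

```python
n = 6
d = 4

quotient = n // d    # 1
remainder = n % d    # 2

# The two results always recombine into the original number.
print(quotient * d + remainder == n)  # True

# The built-in divmod returns both values at once.
print(divmod(n, d))  # (1, 2)
```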

### Exercise one: evaluating functions

In [15]:
A = .03
L = 70
C = (.02*A)/(330*L)
C

2.5974025974025973e-08

In [16]:
.02*A/330/L

2.5974025974025973e-08

In [17]:
L=25
NGC=13
MM=2
81.5+(41*NGC-100*MM-675)/L

67.82

## Functions

In [23]:
def Tm(L,NGC,MM):
    """Calculate the Tm of a QuikChange mutagenesis primer
    with length L, NGC GC bases, and MM mismatches."""
    return 81.5+(41*NGC-100*MM-675)/L

In [21]:
Tm(25,13,2)

67.82

In [22]:
Tm(25,13,1)

71.82
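The two calls above differ only in the last argument, which is easy to misread. One possible way to make such calls clearer is to pass the arguments by name. The function is repeated here only so the block runs on its own; the result names are invented for this sketch.

```python
def Tm(L, NGC, MM):
    """Same formula as the Tm function above, repeated so this block is self-contained."""
    return 81.5 + (41 * NGC - 100 * MM - 675) / L

# Keyword arguments show which number is which.
tm_two_mismatches = Tm(L=25, NGC=13, MM=2)
tm_one_mismatch = Tm(L=25, NGC=13, MM=1)

# Each mismatch contributes 100/L to the result, so for L = 25
# removing one mismatch changes the value by 4.
print(round(tm_one_mismatch - tm_two_mismatches, 2))  # 4.0
```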

In [24]:
L

25

In [25]:
print(L)

25


In [26]:
print(L,MM,Tm(L,NGC,MM))

25 2 67.82


In [27]:
Tm

<function __main__.Tm>

In [28]:
b = Tm

In [29]:
b(L,NGC,MM)

67.82

In [31]:
sum([1,2,3,4,5])

15

## Strings part 2: indexing and slicing

In [32]:
dna = "ATGCATAGATATTAA"

In [33]:
dna

'ATGCATAGATATTAA'

Python indexing counts from 0

In [34]:
dna[0]

'A'

In [35]:
dna[1]

'T'

Negative indices count backwards from the end of the string, starting at -1

In [36]:
dna[-1]

'A'

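Taking multiple elements is called *slicing*. A slice *includes* the first index and stops right before the second index, so dna[2:4] returns the characters at indices 2 and 3. Leaving out an index means "from the start" or "to the end"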

In [37]:
dna[2:4]

'GC'

In [38]:
dna[:4]

'ATGC'

In [39]:
dna[2:]

'GCATAGATATTAA'

In [40]:
dna[:]

'ATGCATAGATATTAA'
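Because a slice stops right before its second index, consecutive slices line up without overlapping. The sketch below uses that to pull codons out of the same sequence; the variable names are chosen here for illustration.

```python
dna = "ATGCATAGATATTAA"

first_codon = dna[0:3]    # indices 0, 1, 2
second_codon = dna[3:6]   # starts exactly where the first slice stopped
last_codon = dna[-3:]     # negative indices work in slices too

print(first_codon, second_codon, last_codon)  # ATG CAT TAA

# The length of a slice is the second index minus the first.
print(len(dna[2:4]))  # 2
```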

## Lists
Lists are indexed and sliced like strings, but they can hold arbitrary data

In [41]:
numbers = [1,2,3,4,5]

In [42]:
numbers

[1, 2, 3, 4, 5]

In [43]:
numbers[2:4]

[3, 4]

In [44]:
numbers[3]

4

Unlike strings, we can overwrite individual elements of a list

In [45]:
numbers[3] = 7
numbers

[1, 2, 3, 7, 5]

In [46]:
numbers[3] = "seven"
numbers

[1, 2, 3, 'seven', 5]

In [47]:
numbers[3] = 7

In [48]:
a = numbers

In [49]:
a

[1, 2, 3, 7, 5]

In [50]:
numbers

[1, 2, 3, 7, 5]

In [52]:
numbers[2] = "hi"
numbers

[1, 2, 'hi', 7, 5]

In [53]:
a

[1, 2, 'hi', 7, 5]

In [54]:
a = numbers[:]
a

[1, 2, 'hi', 7, 5]

In [55]:
numbers[2] = 3
numbers

[1, 2, 3, 7, 5]

In [56]:
a

[1, 2, 'hi', 7, 5]
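The cells above show two different behaviours: plain assignment gives a second name for the same list, while a full slice makes a new list. A compact sketch of the difference, with names (`alias`, `shallow_copy`) invented for this example:

```python
numbers = [1, 2, 3]

alias = numbers            # a second name for the same list object
shallow_copy = numbers[:]  # a new list with the same elements

numbers[0] = "changed"

print(alias)         # ['changed', 2, 3]
print(shallow_copy)  # [1, 2, 3]

# The "is" operator checks whether two names refer to the same object.
print(alias is numbers)         # True
print(shallow_copy is numbers)  # False
```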

In [57]:
x = [1,2,3,4,5]

## For loops

In [59]:
for i in x:
    print(i**2)
    print(i/2)

1
0.5
4
1.0
9
1.5
16
2.0
25
2.5


### For loop example: a sum function

In [60]:
def f(x):
    s = 0
    for i in x:
        s += i
    return s

In [61]:
f(x)

15

In [62]:
f([2,3,4])

9

In [65]:
for i in [sum, f]:
    print(i(x))

15
15
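Since functions are ordinary values, they can also be passed as arguments to other functions. The helper `apply_to_list` below is a hypothetical name made up for this sketch; it is not part of Python or of the lecture code.

```python
def apply_to_list(func, values):
    """Hypothetical helper: call func on values and return the result."""
    return func(values)

x = [1, 2, 3, 4, 5]

# Built-in functions are passed by name, without parentheses.
print(apply_to_list(sum, x))  # 15
print(apply_to_list(max, x))  # 5
print(apply_to_list(len, x))  # 5
```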


## Functions inside of objects: finding start codons

In [66]:
dna = "GCGATAATGCCGATAGACTGGATATGCA"

In [67]:
dna

'GCGATAATGCCGATAGACTGGATATGCA'

In [68]:
dna.find("ATG")

6

## Conditional expressions

See the slides for more expressions like these. The important detail is that each one evaluates to **True** or **False**, so it can be used as the condition for a **while** loop

In [69]:
3 != 2

True

In [70]:
3 == 2

False
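As a minimal sketch of a comparison driving a **while** loop, the countdown below re-evaluates its condition before every pass and stops as soon as it becomes False. The names `remaining` and `visited` are invented for this example.

```python
remaining = 3
visited = []

# The comparison is checked before each pass through the loop body.
while remaining > 0:
    visited.append(remaining)
    remaining -= 1

print(visited)        # [3, 2, 1]
print(remaining > 0)  # False, which is why the loop stopped
```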

In [71]:
assert(3 != 2)

In [72]:
assert(3 == 2)

AssertionError: 

In [73]:
dna.find("ATG")

6

In [74]:
dna.find("ATG",6)

6

In [75]:
dna.find("ATG",7)

23

In [76]:
dna.find("ATG",24)

-1
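The -1 returned for "not found" needs some care, because -1 is also a valid index (the last character). The sketch below shows that pitfall along with two standard alternatives, the **in** operator and the string method **index**; the search string "TTT" is just an example that does not occur in this sequence.

```python
dna = "GCGATAATGCCGATAGACTGGATATGCA"

missing = dna.find("TTT")
print(missing)       # -1
print(dna[missing])  # A (the last base, not an error)

# "in" answers the yes/no question directly.
print("TTT" in dna)  # False

# index behaves like find but raises an error when nothing is found.
raised = False
try:
    dna.index("TTT")
except ValueError:
    raised = True
print(raised)  # True
```

Checking the result against -1 before using it as an index avoids silently reading the wrong character.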

In [77]:
offset = dna.find("ATG")
print(offset)
while(offset > -1):
    offset = dna.find("ATG", offset + 1)
    print(offset)

6
23
-1


### Using modular division for reading frame

In [78]:
6%3

0

In [79]:
23%3

2

Sets are like lists but do not have a predictable order and contain only unique elements.

In [80]:
set([0,2,2,0])

{0, 2}

In [81]:
s = set([2,4])

In [82]:
s

{2, 4}

We can add elements to a set with **add**

In [83]:
s.add("hello")
s

{2, 4, 'hello'}
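A short sketch of the "unique elements" property, applied to a DNA string. Building a set from a string keeps one copy of each character; `sorted` is used only to get a predictable order for printing. The variable name `bases` is made up here.

```python
dna = "ATGCATAGATATTAA"

bases = set(dna)
print(sorted(bases))  # ['A', 'C', 'G', 'T']

# Membership tests work with "in".
print("G" in bases)   # True

# Adding an element that is already present changes nothing.
bases.add("A")
print(len(bases))     # 4
```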

Likewise, we can add elements to a list with **append**

In [84]:
L = [1,2,3]
L

[1, 2, 3]

In [85]:
L.append(4)
L

[1, 2, 3, 4]

In [96]:
L = []

## Exercise: finding all reading frames in a DNA sequence

In [87]:
def reading_frames(dna):
    offset = dna.find("ATG")
    L = []
    while(offset > -1):
        L.append(offset)
        offset = dna.find("ATG", offset + 1)
    frames = []
    for offset in L:
        frames.append(offset%3)
    return set(frames)

In [88]:
reading_frames(dna)

{0, 2}

An alternative implementation that collects the frames directly in a set instead of building a list and converting it at the end

In [87]:
def reading_frames(dna):
    offset = dna.find("ATG")
    L = []
    while(offset > -1):
        L.append(offset)
        offset = dna.find("ATG", offset + 1)
    frames = set()
    for offset in L:
        frames.add(offset%3)
    return frames
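One possible further simplification, offered as a sketch rather than the intended homework answer, is to drop the intermediate list and add each frame to the set inside the search loop. The function name `reading_frames_single_loop` is invented here to avoid overwriting the versions above.

```python
def reading_frames_single_loop(dna):
    """Return the set of reading frames (0, 1, or 2) that contain an ATG."""
    frames = set()
    offset = dna.find("ATG")
    while offset > -1:
        # Record the frame as soon as each start codon is found.
        frames.add(offset % 3)
        # Resume the search one position after the current match.
        offset = dna.find("ATG", offset + 1)
    return frames

print(reading_frames_single_loop("GCGATAATGCCGATAGACTGGATATGCA"))  # {0, 2}
```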

## Useful functions for the homework: range and len

In [90]:
for i in range(10):
    print(i)

0
1
2
3
4
5
6
7
8
9


In [91]:
dna

'GCGATAATGCCGATAGACTGGATATGCA'

In [92]:
dna[0]

'G'

In [95]:
for i in range(len(dna)):
    print(dna[i])

G
C
G
A
T
A
A
T
G
C
C
G
A
T
A
G
A
C
T
G
G
A
T
A
T
G
C
A
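When only the characters are needed, a string can be looped over directly, and the built-in **enumerate** supplies the index alongside each character. The sketch below uses a shortened sequence and an invented list name, `positions_of_a`.

```python
dna = "GCGATA"

# Loop over the characters directly, without range or len.
for base in dna:
    print(base)

# enumerate yields (index, character) pairs.
positions_of_a = []
for i, base in enumerate(dna):
    if base == "A":
        positions_of_a.append(i)

print(positions_of_a)  # [3, 5]
```

The range(len(dna)) form is still the natural choice when the index is used for something else, such as taking the slice dna[i:i+3].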
