In [1]:

print "hello, world"

hello, world

Commands that start with "%" are IPython "magics", which are not part of the Python language and are handled by IPython itself rather than being sent directly to the interpreter.

You can get a list of available magics with:

%magic

In [2]:

%pwd

Out[2]:

u'/home/mvoorhie/BMS270'

Commands that start with "!" are sent directly to the operating system shell rather than to the Python interpreter.

On OS X and Linux, the shell is typically bash, which interprets !ls as "list the files in the current directory".

On Windows, !dir should give the equivalent result in the DOS shell (if you start IPython from Cygwin, you should be able to use bash commands).

In [3]:

!ls

notebook1.ipynb  notebook1.log

Here is a platform-independent way to have Python create a list of the filenames in the current directory:

In [4]:

import glob
glob.glob("*")

Out[4]:

['notebook1.log', 'notebook1.ipynb']
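As a small extension of the glob call above, the following sketch lists only the files that match a pattern and sorts the result, since glob makes no promise about ordering. The temporary directory and the file names are made up for illustration, and the code is written to run under both Python 2 and Python 3.

```python
import glob
import os
import shutil
import tempfile

# Build a throwaway directory with hypothetical file names for the demo.
workdir = tempfile.mkdtemp()
for name in ["notebook1.ipynb", "notebook1.log", "notes.txt"]:
    open(os.path.join(workdir, name), "w").close()

# glob accepts shell-style wildcards; sorted() gives a reproducible order.
pattern = os.path.join(workdir, "notebook1.*")
matches = sorted(os.path.basename(p) for p in glob.glob(pattern))
print(matches)

# os.listdir returns every entry in a directory, with no pattern matching.
everything = sorted(os.listdir(workdir))
print(everything)

shutil.rmtree(workdir)  # clean up the demo directory
```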

The %logstart magic retroactively starts logging this IPython session (i.e., it also captures commands typed before the log was started).

The -o option turns on logging for outputs in addition to inputs.

In [5]:

%logstart -o notebook1.log

Activating auto-logging. Current session state plus future input saved.
Filename       : notebook1.log
Mode           : backup
Output logging : True
Raw input log  : False
Timestamping   : False
State          : active

Use matched single or double quotes for simple strings, triple quotes for multi-line strings.

In [6]:

"this is a string"

Out[6]:

'this is a string'

In [7]:

'i can single quote'

Out[7]:

'i can single quote'

In [8]:

"""this won't do what I want

multiple

                 lines
"""

Out[8]:

"this won't do what I want\n\nmultiple\n\n                 lines\n"

In [9]:

print """this won't do what I want

multiple

                 lines
"""

this won't do what I want

multiple

                 lines

Use a decimal point to distinguish integer (int) and double-precision floating point (float) values.

In [10]:

5

Out[10]:

5

In [11]:

5.

Out[11]:

5.0

In [12]:

5+3

Out[12]:

8

In [13]:

5-3

Out[13]:

2

In [14]:

5*2

Out[14]:

10

In [15]:

10/2

Out[15]:

5

In Python 2, dividing two integers gives an integer result, rounded down. (In Python 3, / always performs floating point division and // rounds down.)

In [16]:

10/3

Out[16]:

3

Make sure to include at least one float if you want floating point division.

In [17]:

10./3.

Out[17]:

3.3333333333333335

In [18]:

10/3.

Out[18]:

3.3333333333333335
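A minimal sketch of the operators involved, written to give the same results under Python 2 and Python 3: // for floor division, float() to force floating point division, and divmod for the quotient and remainder together.

```python
# Floor division rounds toward negative infinity, not toward zero.
print(10 // 3)        # 3
print(-10 // 3)       # -4

# Converting one operand to float forces floating point division.
ratio = 10 / float(3)
print(ratio)

# divmod returns the floor quotient and the remainder in one call.
quotient, remainder = divmod(10, 3)
print("%d remainder %d" % (quotient, remainder))   # 3 remainder 1
```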

String substitution works via the mod (%) operator

In [19]:

"This is a %s mad lib" % "neat"

Out[19]:

'This is a neat mad lib'

Use parentheses to group multiple arguments for string substitution.

Technically, the parentheses define a tuple, which is an immutable version of a list.

In [20]:

"Now %s and %s" % ("one","two")

Out[20]:

'Now one and two'

Most Python objects can be converted to strings, which is what the %s format string is asking for.

In [21]:

"Now %s and %s" % (1,2)

Out[21]:

'Now 1 and 2'

In [22]:

"%s" % (10./3)

Out[22]:

'3.33333333333'

The %f format string lets us specify special formatting for floating point values; e.g., the number of decimal places:

In [23]:

"%.5f" % (10./3)

Out[23]:

'3.33333'

In [24]:

"%2.5f" % (10./3)

Out[24]:

'3.33333'

The %e format string gives scientific notation:

In [25]:

"%e" % (10./3)

Out[25]:

'3.333333e+00'
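The format strings above follow a width.precision pattern. The sketch below shows a few more variations; note that a width smaller than the formatted text, as in "%2.5f", has no visible effect. The values are chosen only for illustration.

```python
value = 10 / 3.0

# width.precision: pad to at least 8 characters, 3 digits after the point.
padded = "%8.3f" % value
print(repr(padded))       # '   3.333'

# A width smaller than the formatted text changes nothing.
same = "%2.5f" % value
print(repr(same))         # '3.33333'

# %d formats integers; a leading 0 pads with zeros instead of spaces.
zero_padded = "%05d" % 42
print(repr(zero_padded))  # '00042'

# A minus sign left-justifies the text within the field.
left = "%-6s|" % "ab"
print(repr(left))         # 'ab    |'
```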

In [26]:

2**3

Out[26]:

8

Assignment

The assignment operator (=) lets a name "point at" (reference) a piece of data.

We can refer to the names as variables (because their value may vary with reassignment) or as references (to emphasize their pointing nature).

In [27]:

a = 8
b = "hello"

We can then operate on the references as if they were the underlying data

In [28]:

a

Out[28]:

8

In [29]:

b

Out[29]:

'hello'

In [30]:

a + 3

Out[30]:

11

In [31]:

a + b

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-31-f96fb8f649b6> in <module>()
----> 1 a + b

TypeError: unsupported operand type(s) for +: 'int' and 'str'

Addition on strings is concatenation:

In [32]:

b + "hi"

Out[32]:

'hellohi'
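The TypeError from a + b can be avoided by converting the integer to a string explicitly before concatenating. A minimal sketch, reusing the values assigned to a and b earlier:

```python
a = 8
b = "hello"

# Convert the int to a string explicitly, then concatenate.
joined = str(a) + b
print(joined)            # 8hello

# Or let the % operator do the conversion.
formatted = "%s%s" % (a, b)
print(formatted)         # 8hello

# Strings can also be repeated with *.
repeated = b * 2
print(repeated)          # hellohello
```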

In [33]:

Out[33]:

In [35]:

Out[35]:

In [37]:

Out[37]:

In [38]:

(a+2)*3

Out[38]:

Exercise: using variables to evaluate a formula

In [43]:

ngc = 13
L = 25
nmm = 2

# Wrong: gives an unintended rounding error due to integer division
Tm1 = 81.5 + (41*ngc - 100*nmm - 675)/L
# Here are two ways to ensure floating point division:
Tm2 = 81.5 + (41*ngc - 100*nmm - 675.)/L
Tm3 = 81.5 + (41*ngc - 100*nmm - 675)/float(L)

In [44]:

Tm1, Tm2, Tm3

Out[44]:

(67.5, 67.82, 67.82)
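One way to make the formula reusable and immune to the integer division pitfall is to wrap it in a function that always converts the divisor to float. The function name tm_estimate and its parameter names are hypothetical, chosen here for illustration only.

```python
def tm_estimate(ngc, length, nmm):
    """Evaluate 81.5 + (41*ngc - 100*nmm - 675)/length with float division."""
    return 81.5 + (41 * ngc - 100 * nmm - 675) / float(length)

tm = tm_estimate(ngc=13, length=25, nmm=2)
print("%.2f" % tm)       # 67.82

# The numerator is -342. Floor division rounds -13.68 down to -14,
# which is how the integer version arrives at 67.5.
floored = 81.5 + (41 * 13 - 100 * 2 - 675) // 25
print(floored)           # 67.5
```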

IPython magic for listing just the names that we've explicitly assigned in this session:

In [45]:

%who

L	Tm1	Tm2	Tm3	a	b	glob	ngc	nmm

How to remove a reference:

In [48]:

%who

L	Tm2	Tm3	a	b	glob	ngc	nmm
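The cell that removed Tm1 is not shown in this log. Python's del statement is the standard way to unbind a name, so the sketch below assumes something like it was used; the variable name scratch is hypothetical.

```python
scratch = 81.5      # hypothetical name, bound only for this demo
del scratch         # unbind the name from the current namespace

# Using the name after del raises NameError, just as if it had never been assigned.
try:
    scratch
except NameError as err:
    message = str(err)

print(message)
```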

In [49]:

Out[49]:

Comparison operators

= $\rightarrow$ assignment

== $\rightarrow$ test for equality

In [50]:

a == 5

Out[50]:

False

In [52]:

a == 5

Out[52]:

True

In [54]:

a > 6

Out[54]:

False

In [55]:

a < 5

Out[55]:

False

In [56]:

a != 3

Out[56]:

True

In [57]:

a >= 3

Out[57]:

True

In [58]:

a <= 3

Out[58]:

False

Boolean values can be combined with and, or, and not.

Remember that this type of logical expression will "short circuit" as soon as only one outcome is possible.

In [59]:

(5 > 3) and (3 > 2)

Out[59]:

True

In [60]:

(5 < 3) or (3 > 2)

Out[60]:

True
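Short-circuiting can be observed directly by logging which operands actually get evaluated. In this sketch, check is a hypothetical helper that records its label before returning its value.

```python
calls = []

def check(label, value):
    """Record that this operand was evaluated, then return its value."""
    calls.append(label)
    return value

# 'and' stops at the first false operand, so "right" is never evaluated.
result_and = check("left", False) and check("right", True)
calls_and = list(calls)
print("%r %r" % (result_and, calls_and))   # False ['left']

del calls[:]   # empty the log before the next test

# 'or' stops at the first true operand.
result_or = check("left", True) or check("right", False)
calls_or = list(calls)
print("%r %r" % (result_or, calls_or))     # True ['left']
```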

We can act on boolean values using if/elif/else statements.

Things to remember for this type of block statement:

First part ends with a colon (:)
The block of code that the first part refers to is indicated by indentation -- everything at the same indent level is part of the same code block.

In [61]:

if(5 > 6):
    print "it is"
else:
    print "it isn't"

it isn't

In [62]:

if(5 > 6):
    print "it is"
else:
    print "it isn't"
    print "it really isn't"

it isn't
it really isn't

In [63]:

if(5 > 6):
    print "hi"
elif(5> 3):
    print "hello"
else:
    print "the end"

hello

In [64]:

if(5 > 3):
    if(4 > 7):
        print "first"
    else:
        print "second"
else:
    print "oops"

second
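The same if/elif/else structure is often wrapped in a function so that the decision can be reused. A minimal sketch with a hypothetical function name; only the first branch whose condition is true is executed.

```python
def describe(n):
    """Return a short label for a number using if/elif/else."""
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"

print(describe(-2))   # negative
print(describe(0))    # zero
print(describe(7))    # positive
```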

Lists are:

Indicated by square brackets ([])
Heterogeneous (can contain more than one type of data)
Dynamic (can be grown with +,+=, and append, and can have elements reassigned)
Indexed from zero

In [65]:

mylist = [1,2,"apple","orange",2**5]

In [66]:

mylist

Out[66]:

[1, 2, 'apple', 'orange', 32]

In [67]:

mylist += ["first","second"]

In [68]:

mylist

Out[68]:

[1, 2, 'apple', 'orange', 32, 'first', 'second']

In [69]:

mylist.append(5)

In [70]:

mylist

Out[70]:

[1, 2, 'apple', 'orange', 32, 'first', 'second', 5]

In [71]:

list2 = []

In [72]:

list2.append(1)

In [73]:

list2

Out[73]:

[1]

In [74]:

mylist[3]

Out[74]:

'orange'

In [75]:

mylist[3:6]

Out[75]:

['orange', 32, 'first']

We can find the length of a sequence or collection with len

When defining your own classes, you can support len by adding a __len__ method.

In [76]:

len(mylist)

Out[76]:

8

In [77]:

len("string")

Out[77]:

6

In [78]:

len(6)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-78-d42e7c5a4468> in <module>()
----> 1 len(6)

TypeError: object of type 'int' has no len()
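To illustrate the note about __len__, here is a minimal sketch of a class that supports len. The class Plate and its samples attribute are hypothetical, invented for this example.

```python
class Plate(object):
    """Hypothetical container of sample names, used only for illustration."""

    def __init__(self, samples):
        self.samples = list(samples)

    def __len__(self):
        # len(plate) calls this method behind the scenes.
        return len(self.samples)

plate = Plate(["s1", "s2", "s3"])
print(len(plate))        # 3
```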

In [79]:

mylist[len(mylist)-1]

Out[79]:

5

In [80]:

mylist[-1]

Out[80]:

5

In [81]:

mylist[:3]

Out[81]:

[1, 2, 'apple']

In [82]:

mylist[3:]

Out[82]:

['orange', 32, 'first', 'second', 5]

In [83]:

mylist[:]

Out[83]:

[1, 2, 'apple', 'orange', 32, 'first', 'second', 5]
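The slices above all follow the pattern start:stop, with the start included and the stop excluded. The sketch below adds negative indices and the optional third step value, using a made-up list.

```python
letters = ["a", "b", "c", "d", "e", "f"]

print(letters[1:4])     # ['b', 'c', 'd']  (start included, stop excluded)
print(letters[-2:])     # ['e', 'f']       (last two elements)
print(letters[::2])     # ['a', 'c', 'e']  (every second element)
print(letters[::-1])    # ['f', 'e', 'd', 'c', 'b', 'a']  (reversed copy)

# Out-of-range slice bounds are clipped rather than raising an error.
print(letters[4:100])   # ['e', 'f']
```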

The "pointing at" nature of Python references can give unexpected behavior for lists.

Use x[:] to make a "shallow" copy of a list for independent modification.

In [84]:

otherlist = mylist

In [85]:

print otherlist
print mylist

[1, 2, 'apple', 'orange', 32, 'first', 'second', 5]
[1, 2, 'apple', 'orange', 32, 'first', 'second', 5]

In [86]:

otherlist[3] = "thing"

In [87]:

otherlist

Out[87]:

[1, 2, 'apple', 'thing', 32, 'first', 'second', 5]

In [88]:

mylist

Out[88]:

[1, 2, 'apple', 'thing', 32, 'first', 'second', 5]

In [89]:

otherlist = mylist[:]

In [90]:

otherlist[3] = "new"

In [91]:

print mylist
print otherlist

[1, 2, 'apple', 'thing', 32, 'first', 'second', 5]
[1, 2, 'apple', 'new', 32, 'first', 'second', 5]

In [92]:

c = mylist[:3]

In [94]:

print c
print mylist

[1, 2, 5]
[1, 2, 'apple', 'thing', 32, 'first', 'second', 5]
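The copy made by x[:] is called "shallow" because only the outer list is duplicated; any lists nested inside it are still shared. A minimal sketch of the difference, using copy.deepcopy from the standard library for a fully independent copy:

```python
import copy

nested = [[5, 6], [3, 4]]

shallow = nested[:]             # new outer list, same inner lists
deep = copy.deepcopy(nested)    # new outer list and new inner lists

nested[0][1] = 99               # modify an inner list in place

print(shallow[0][1])            # 99: the shallow copy shares the inner list
print(deep[0][1])               # 6: the deep copy is unaffected
```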

We can create a (possibly ragged) multi-dimensional array with a list of lists:

In [95]:

L = [[5,6],[3,4]]

In [96]:

L

Out[96]:

[[5, 6], [3, 4]]

In [97]:

L[0][1]

Out[97]:

6

We can also create multidimensional arrays with the numpy array class.

Restrictions on arrays:

Homogeneous (all data must be of the same type, which can be specified with dtype)
Not ragged (all rows must be the same length)
Not dynamic (dimensions are fixed at the time of creation; however, an array may be reshaped, and values within the array can be modified).

Benefits of arrays (due to underlying C implementation):

More memory-efficient than python lists
Faster math via numpy/scipy/etc

When starting out, you'll usually want Python lists.

In [98]:

from numpy import array  # not needed if IPython was started in pylab mode
A = array(L)

In [99]:

A

Out[99]:

array([[5, 6],
       [3, 4]])

In [100]:

A.dtype

Out[100]:

dtype('int64')

What is the largest signed or unsigned integer that can be represented by a given number of bits?

In [101]:

2**64-1

Out[101]:

18446744073709551615L

In [102]:

2**32-1

Out[102]:

4294967295

In [103]:

2**(32-1)-1

Out[103]:

2147483647

In [104]:

2**(16-1)-1

Out[104]:

32767
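The pattern in these calculations can be collected into one helper. This is a sketch that assumes two's complement for the signed range; the function name int_limits is hypothetical.

```python
def int_limits(bits):
    """Return (signed_min, signed_max, unsigned_max) for a given bit width.

    Assumes two's complement representation for the signed range.
    """
    signed_max = 2 ** (bits - 1) - 1
    signed_min = -(2 ** (bits - 1))
    unsigned_max = 2 ** bits - 1
    return signed_min, signed_max, unsigned_max

for bits in (8, 16, 32, 64):
    print("%2d bits: signed %d to %d, unsigned 0 to %d" % ((bits,) + int_limits(bits)))
```

One bit of a signed integer is spent on the sign, which is why the signed maximum uses bits - 1 in the exponent.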

Iteration with for loops

In [105]:

mylist

Out[105]:

[1, 2, 'apple', 'thing', 32, 'first', 'second', 5]

In [106]:

for i in mylist:
    print i

1
2
apple
thing
32
first
second
5
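Two common companions to the basic for loop are enumerate, which supplies the index alongside each element, and range, which generates a sequence of integers to count over. A minimal sketch with a made-up list:

```python
items = [1, 2, "apple", "thing"]

# enumerate yields (index, element) pairs, with the index starting at zero.
numbered = []
for index, item in enumerate(items):
    numbered.append("%d: %s" % (index, item))
    print(numbered[-1])

# range(n) yields 0, 1, ..., n-1.
total = 0
for k in range(5):
    total += k
print(total)    # 10
```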

Practical Bioinformatics -- Day 1
