Goal: Explore file I/O
%ls example*
example1 example2 example.pdf
%pwd
u'/home/bms270/BMS270_2017'
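Examples of the command-line download tools wget and curl, run from the notebook with a leading !. curl ships with macOS, so it should be available on all Macs. Because example1 already exists in this directory (see the %ls output above), wget saves its copy as example1.1, while the curl command redirects its output into example1 and overwrites it.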
!wget 'http://histo.ucsf.edu/BMS270/BMS270_2017/data/example1'
--2017-04-25 15:26:47--  http://histo.ucsf.edu/BMS270/BMS270_2017/data/example1
Resolving histo.ucsf.edu (histo.ucsf.edu)... 128.218.234.54
Connecting to histo.ucsf.edu (histo.ucsf.edu)|128.218.234.54|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3252152 (3.1M)
Saving to: ‘example1.1’

example1.1 100%[===================>] 3.10M 5.74MB/s in 0.5s

2017-04-25 15:26:47 (5.74 MB/s) - ‘example1.1’ saved [3252152/3252152]
!curl 'http://histo.ucsf.edu/BMS270/BMS270_2017/data/example1' > example1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 3175k  100 3175k    0     0  5365k      0 --:--:-- --:--:-- --:--:-- 5373k
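The same download can also be done from Python. The following is a minimal sketch, assuming Python 3's urllib.request (in Python 2, which this notebook uses, the corresponding function is urllib.urlretrieve). The helper name download is hypothetical and not part of the course material.

```python
import os
import urllib.request


def download(url, dest):
    """Fetch url, save it to dest, and return the number of bytes on disk."""
    # urlretrieve writes the response body to a file, much like `curl URL > dest`.
    urllib.request.urlretrieve(url, dest)
    return os.path.getsize(dest)
```

For example, download('http://histo.ucsf.edu/BMS270/BMS270_2017/data/example1', 'example1_py') would save another copy under the hypothetical name example1_py without touching the existing files.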
%ls
Day1.html        Day1_warmup.ipynb  example1    example.pdf      Untitled.ipynb
Day1.ipynb       Day2b.ipynb        example1.1  Untitled1.ipynb
Day1_post.ipynb  Day2.ipynb         example2    Untitled2.ipynb
data = open("example1").read()
len(data)
3252152
data[:100]
'@SRR4244242.20068354 20068354 length=112\nCTGGGCTCCACCTCTAGGGTGATGGTCTTGCAGGTCAGGGTCTTCGCGAAGATCTGCAT'
ord(data[0])
64
chr(64)
'@'
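ord and chr convert between a character and its integer code, and the same idea is commonly used to decode FASTQ quality strings. The sketch below assumes the widely used Phred+33 encoding; this is an assumption, since nothing here states which encoding example1 uses. The helper name phred_scores is hypothetical.

```python
def phred_scores(quality, offset=33):
    """Convert a FASTQ quality string to a list of integer Phred scores."""
    # Assumption: each character encodes one score as chr(score + offset).
    return [ord(ch) - offset for ch in quality]


# 'AAAA.' is the start of the first read's quality string in example1.
scores = phred_scores("AAAA.")
print(scores)
```

Under that assumption, 'A' (code 65) maps to a score of 32 and '.' (code 46) to 13.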
print data[:100]
@SRR4244242.20068354 20068354 length=112
CTGGGCTCCACCTCTAGGGTGATGGTCTTGCAGGTCAGGGTCTTCGCGAAGATCTGCAT
fp = open("example1")
fp
<open file 'example1', mode 'r' at 0x7f16e10119c0>
fp.read(100)
'@SRR4244242.20068354 20068354 length=112\nCTGGGCTCCACCTCTAGGGTGATGGTCTTGCAGGTCAGGGTCTTCGCGAAGATCTGCAT'
fp.read(100)
'TATGACCTGATAACAAATGTGATGAAAGCACAAACCGCCCAGCGCGTCGAAAC\n+SRR4244242.20068354 20068354 length=112\nAAAA.'
fp.readline()
'FFFF))FFF)FFFFAF<FAF.FAFF))FAFFFFFFFFFF7)FF<FFA7FFFF.7F7FFAFFFF.AAFF<F.FF.<AFFFAAFF<F.)<FFAFF<.AF<FAFF.F..F\n'
fp.readline()
'@SRR4244242.6143545 6143545 length=116\n'
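Each read or readline call continues where the previous one stopped, because the file object keeps track of a current position. A minimal sketch of that behavior on a small temporary file, written for Python 3; the file name and contents are made up for illustration.

```python
import os
import tempfile

# Write a tiny one-record stand-in file; its contents are illustrative only.
path = os.path.join(tempfile.mkdtemp(), "tiny.fastq")
with open(path, "w") as out:
    out.write("@r1\nACGT\n+r1\nFFFF\n")

with open(path) as fp:
    first = fp.read(2)       # consumes '@r'
    rest = fp.readline()     # continues mid-line and returns '1\n'
    position = fp.tell()     # current offset, just past the first line
    fp.seek(0)               # rewind to the start of the file
    again = fp.readline()    # the full first line, '@r1\n'

print(repr(first), repr(rest), repr(again))
```

The with statement also closes the file automatically, which the bare open(...) calls in this notebook leave to the interpreter.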
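count = 0
for line in open("example1"):
    # Each line keeps its trailing "\n", so print adds a blank line after it.
    print line
    count += 1
    # Stop once 101 lines have been printed.
    if count > 100:
        break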
@SRR4244242.20068354 20068354 length=112 CTGGGCTCCACCTCTAGGGTGATGGTCTTGCAGGTCAGGGTCTTCGCGAAGATCTGCATTATGACCTGATAACAAATGTGATGAAAGCACAAACCGCCCAGCGCGTCGAAAC +SRR4244242.20068354 20068354 length=112 AAAA.FFFF))FFF)FFFFAF<FAF.FAFF))FAFFFFFFFFFF7)FF<FFA7FFFF.7F7FFAFFFF.AAFF<F.FF.<AFFFAAFF<F.)<FFAFF<.AF<FAFF.F..F @SRR4244242.6143545 6143545 length=116 TTTCACCTCAGTGACGCAGCCCTTCTCTCTCCAGTCCACAGTGTCAGGCAATGTCCGATTAGAGTATGACCTGAAAGTGACAGTCTTCGGAGACTGTCGGGGAATTCTCAGAGCAC +SRR4244242.6143545 6143545 length=116 AAAAAFFFFFFFFFFFFFFFFFFFFFFFFFFF)FFF7.FFFFFFFFFFFFFFFFFFFF<F<FFFFFFFFFFFFFFFFFA<FFFFFFFFAF<FAFFFFF.FFFFFFFFAFFFAFFAF @SRR4244242.28027200 28027200 length=139 CTGTGATGGGGAAGACCAGAGTCTTATATCATGAATTGCATCGGTGCTGTGGGGCAGGCACATAGGATGCCAGGGCAAAGGGAGACGGAGCTCTGTGCTGACAAGGAATCACACTGAGCCCAGCTTCAGGGGGCCCAGG +SRR4244242.28027200 28027200 length=139 AAAAAFFFFFFFFFAFFFFFFFFFFAFFFFAFFAFFFFFFFFFFFFFFFFFFFFFAFFFFFFFAFFFFFFFFFFFF<FFFFFFFFFFFFFFFFFFFFFFFAFAFFFFFFFFFFFFA7F<FFFFFFFFFFAFFFFFFAFA @SRR4244242.21033314 21033314 length=91 CTGGCATGTTGGAACAATGTAGGTAAGGGAAGTCGGCAAGCCGGATCCGTAACTTCGGGATAAGGATTGGCTCTAAGGGCTGGGTCGGTCG +SRR4244242.21033314 21033314 length=91 AAAAAFFF<FFFFFFF<FFFFFF7FAF7FFFFFFF7FFFFFFFFFFFAFFF<FFFFA7FAFAFFFFFFFF<FFFF<FFF7FA<FFF.FF<A @SRR4244242.19134434 19134434 length=149 TAAAAGACAAAAGTGAGAATGGTGCAGAAAAGGCGCAGGCACAACGGCTAGAAGAGGACCCAGCCAGCTAGGACCCTGCACGGATGTGTTGATGGCGGCCTCACAGGAACAGCGAATGGTAGAGAGTGGAGTGATCTCCCAACAACCCC +SRR4244242.19134434 19134434 length=149 AAA.AFF<FF.7A)7)7F.AFAFFFAA7FFA.A)F.F7.F<F)F.)A.FAFFFFF..7FFF77FF)FFF.FF..AFFF).))FF))F<)F)FF)F)<.FFFFFFF<F.FFAF.<.FF.7).F<A.77F..))A7A)AAAA).<.<F.7< @SRR4244242.2375668 2375668 length=76 TAACACAGAAGCAATGCTGTCACCTACCCCGGGGTGGACTCAGGGCATGGACGCGACCATCCTCCTCTTAGGAGTG +SRR4244242.2375668 2375668 length=76 A.AAAFFFAFFA.FFAFFFFF.FFF.FFFF)A.FF..7FFFAFF.FFFFA<F.7F.FFFFFFFFFFFAF7.FFAFF @SRR4244242.11970718 11970718 length=119 
GAGTAATAAGAGCGAGGAGGGAGGGAAAGAACCATCTTCGAGTGCTCTCGAGGAGCCAAGCCCGCCTCAGCTGTCTTCAAAAGCAAACAAAGCCATCTTTGGAATTTGCAGACTAAGAT +SRR4244242.11970718 11970718 length=119 AAAA<AFFFF7FFFFFFFFFFFFFFFFFF<FFFFFFFFFF.FFFFFFAFFFFFFFFFF7FF7)FFFFFFFFAFFFFFFFFFFFFFAFFFFAFFF7FF)FFFFAFFA7FFAF.FFFFFFF @SRR4244242.10413608 10413608 length=49 GTAGACATGGGTTGCTCCTCCTTCCTCTGGCATAGACAAGTAGTATTTC +SRR4244242.10413608 10413608 length=49 AA<AAF7FFFFFFFFF<FFFFFFFFFFFFFFFFAFFFFFFFAFFFFFFF @SRR4244242.5105782 5105782 length=148 ATTTTTATGCTAAGTTCGAATGTATTTTTTTTGAGAATACAAAAAGTAACCCTTGAAAATCAGAATATATAACAGAAAAGAGCACAATAACTTAAGTATTAAACATCTGTATGAAATAACTTGCAAAGTTTGACAAATATGCACACAT +SRR4244242.5105782 5105782 length=148 AAAAAFFFFFFFFFFFFFFAFFFFFFFFFFFFFFFFAFFFFFF<FFFF.FAFFFFFAFFFFAFFFFFFFFFFFFFF.FFFFFFAFF<FAFFFFAFFFFFFAFFF7FFFFFFFFF7FFFFFFF<F7FAFF<FAFF7FFFFAF.A<7FFA @SRR4244242.11374318 11374318 length=151 CTTCTCCTCCCTCCATCAGAAGATGATCTGGAAATATTCAAGAAATACCAACACCTGTTTCTTCAGGGCAATGCAACATGGATGCCTTTCTTTCACTGCCCAAAATGGAATGTTCGGACAATTCTAAGAGGAGAGCATAACTTCTTCTCTG +SRR4244242.11374318 11374318 length=151 AAAAAFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFAFFFFFFFFFF)F7FFAFFAFFFFFFFFFFFFFF<AFFFFFFFFFFFFFFFFFFFFAFFAFF<<AFFFFAAFFF<FF)FFAFAFF.7AAFF.FAAF<.A7FF7<<7.<A... 
@SRR4244242.3778568 3778568 length=151 TGGCAATCCAGGAGGAAGCAAAATTCGCACTGGTTACACATGACCAGGTCACCTGGTATCTGGCAGACACGGCAGATAGTGGCACTGTCATCCAGGATACCTGGCCCACTTACTGGGGCTGAGCTCACCTCAGGAGCCACCACCTCCAAGC +SRR4244242.3778568 3778568 length=151 AA<AAFFFAFFF.FF77FFFAF<FFFAFFFFFAFFAF7FF<)A)FF.<FFFAFFAF))FFFFFFFA<A.FF<FFFFF<AF.FAFFF.FFFF<<<F).7.F<F).FF..F<)FF.FAFA<F.7.)F..7<F<.AF.7)A7..7A.AA7.A)7 @SRR4244242.21767697 21767697 length=151 CCCCTCCTTAGGCAACCTGGTGGTCCCCCGCTCCCGGGAGGTCACCATATTGATGCCGAACTTAGTGCGGACACCCGATCGGCATAGCGCACTACAGCCCAGAACTCCTGGACTCAAGCGATCCTCCTGTCTCAGCCTCCCGAGTAGCTGG +SRR4244242.21767697 21767697 length=151 AAAAAFFFFFFFFF<FFFFFFFFFFFFFFFFFFFFFFAFFFFF<FFAFFFFF.FFFFFFAFFFFFFFAFFFF.FFFF.FFFFFFF7FFFFFFFFF<FFFFF7)<FFFFFFFAFFFFFFFFFFF<FFF7FFFF<.<AAFF<A<AFA<FFFAF @SRR4244242.19608439 19608439 length=106 CTTTGGTCTCCACGGTTGTAGTTTGTAGCTCGTGTGTTATAATTGCTCTCGTGCTGAGCTAAACACACCCAGTCGGCCAGGCTGACTCCATAGTAGCCTGCCATTC +SRR4244242.19608439 19608439 length=106 AAAAAFFFFFFFFFFFFFFFFFFFFFFFAFFFFFFFFF.FFFFAF<FFFFFFFFFFF.FFFFFF.FFF<F7FFFFFAF7FFAAF<77FFAFFAF.7AA)FFF<F7< @SRR4244242.9959947 9959947 length=137 CCGACCTGGGCCGGTTCACCCCTCCTTAGGCAACCTGGTGGTCCCCCGCTCCCGGGAGGTCACCATATTGATGCCGAACTTAGTGCGGACACCCGATCGGCATAGCGCACTACAGCCCAGAACTCCTGGACTCAAGC +SRR4244242.9959947 9959947 length=137 AAA<AFFFFFFFFFFFF)F.7FFFFFF<FFFF<FFFFFFFFFFF.FFFFF<A)FFAFFFFF)<F<FFF<.FFFFFFF7FF<AAFFFFFFF7F.FF.FFFAFFFFFFF.<FFFF7AFFFFF7<FFFA7FAFFFF7F7F @SRR4244242.9702802 9702802 length=95 AGGGAGGCATCCGCTCCGGCGAGGGAGGCATCCGCCCCGACTCGGGGCTTCTCCTGCCCAGTCTGCCCCAGCGTAGAGCCCTGCTCTCTGGGAAC +SRR4244242.9702802 9702802 length=95 A.AA<FFFFFFFAFFFFFFFFA<FA<FAF<FFFF<)FFF.FFFFFFF.FA)7FFFF..F.FAFFFAAF)FFFF...7FF.FFF))FFFFFFF.FF @SRR4244242.10361791 10361791 length=111 TGAATACATGACCATTTCTCTTTTAGCACGCTCTTTATTCTCCTCTTCCAGAAGTTGGAGACGACTATTTAATTTGATTATCTGACGTCTTAATGAAGCTGCATCTACAAC +SRR4244242.10361791 10361791 length=111 AAAAAFFFFF<FFAFFFFFFFFFFAFF.FFFFFFFF<FFFAFFFFFFFF<<AFFFFF7FFFF..FF<FFF..FFFFFAA.AAFF<F.)FFF.<F77FF7F.F.77F.F.FF 
@SRR4244242.8071116 8071116 length=151 CTGGAGTCTTGGAAGCTTGACTACCCTACGTTCTCCTACAATGGACCTTGAGAGCTTGTTTGGAGGTTCTAGCAGGGGAGCGCAGCTACTCGTATACCCTTGACCGAAGACCGGTCCTCCTCTATTCGGGGAAGGTCGTCCTCTTCGACCG +SRR4244242.8071116 8071116 length=151 AAAAAFFFAFFF<AFFFFFAAFFF7FFFFFFFFFFFFFFFFFFFFFAFFFFFFFFFAFFFFFFAFAFFFFFFFFFFFFFFFAF)FFA7FFFFFFA.FFF<F<)FFF.<F<FFFFFAFFFFFFAAAFFFFFFFFFFFAFF<.A.FF<F<F<< @SRR4244242.7864080 7864080 length=150 GCCAGCTCTGCGGCAGGGTGTTCAGGCCTCAGTCCAGCACTGAAGGCAGGTGGTGTGGCCTCTACAGCTCATCCATGGCTTGGACAGGGGATTCTTCCTCATCTTCCTCCTTCTCATCTTCTTCGTCCTCATCTTCATCTCAATCAGATC +SRR4244242.7864080 7864080 length=150 AAAAAFFFFFFFFFAFFFFFFAFFFF<FFFF.FFFFF.)FFFFFFFAF.FFFFFFFF.7FFFF)FFFFAF<FF7.F)FFFFA<AF77)F.FFFF.FFFFF<F)FFF)<.FF<7F77F.F77F<<77.<F7<<.77)<F.AA.)AA<<..< @SRR4244242.9494196 9494196 length=120 TTTACATAGCAGTTCCAGATCACTCAGATACACAGTAAGACCCTGTCTAGGATCCTTTCTGAAAAACAGATTATTGCAGCTGGAACAACTATATAATGCCTACTACATGCCAAGCTCCAG +SRR4244242.9494196 9494196 length=120 A<AAAFFFFFAFFFFF7F<FFFFFFFFFFFFAFFFFA<F.FFFFFFFFAFF.FFFFFFAAFFFFFFFFFFFFFFAFF<F<FFF..FFFFFAFFFAA<FFFFFFF7FFFFFF..FFAF7FF @SRR4244242.16207425 16207425 length=113 CTTGATCTTGATTTTCAGTACGAATACAGACCGTGAAAGCGGGGCCTCACGATCCTTCTGACCTTTTGGGTTTTAAGCAGGAGGTGTCAGAAAAGTTACCACAGGGATAACTG +SRR4244242.16207425 16207425 length=113 AAAAAFFFFFFFFFAFFFFFFFAFFFFAFFFFFFFFFFFFFFFFFFFFFFFAFFFFAFFF.FFFFFFFFFF7FFFFFFAFFAFFFFFFFFFFAFFFFFFFFF<FFFAFAFFFF @SRR4244242.10008174 10008174 length=151 GTAGGACTGAGGCAGGTAGGTCCCGGCCTTAATGTTAATAAGGAGCTCCAGCAGGTTTTTGGGCGGCATAACGATGGAAGTGTTCAGAGGAATCACGTAGCACTTGTCCAGGTTAAGGTCCAAATAAGCAGTGAGTTTCTTGTTGAAGTCG +SRR4244242.10008174 10008174 length=151 AAAAAFFFFFFFFFFFFFFFFFFFFFFF)FFFFFFFFFFFFFFFFFFFFFFFFFFFAFFFFFFFFFF7FFFFFFFFFFFFFFFFFAFFFFFAFF.FFFFFFAFFFFFF<FFFF<.FFFFFF7FFFAFFFFF<FFAFAFFF<<<AF<FFFF. 
@SRR4244242.22742394 22742394 length=108 GCCGCTCGTCGGAGTACAGGATGCTAGCTGAAAGACTGTGATCCCGCTGACTGTTCCCTCGCCCACCTGGGATCTTCAGGGGTGGGCGAGGGACATCAGGAGCACCAC +SRR4244242.22742394 22742394 length=108 AAAAAFFFFFFFFFFFFFFFFFFFFAFFFFFFAF7FFFFF<FFAFFFFF<FFFAAFFF7FFFFFFFFFF7FFFF<FFF<FFFFFFFFFFFFFFFFFFF<FFFFFFFFF @SRR4244242.15611876 15611876 length=94 TGGACTGTTATCAAAACACCTAAGGAGGATATTAATCATGAGGAAGATATTCCTTGCATATTATATTCCTTGCATGAATATAAACTGGATGATT +SRR4244242.15611876 15611876 length=94 AAAAAFFFFAFFAFFFF<FFFAFFFFFFFFAFFFFFFFFFFFFFFFFFFFFFFFFFFFA.F7AFFAFFFFFFFFFFFFFFFFFFFF<FFFFFFF @SRR4244242.21281 21281 length=150 CGGTTCACCCCTCCTTAGGCAACCTGGTGGTCCCCCGCTACCGGGAGGTAACCATATTGATGCCGAACTTAGTGCGGACACCCGATCGGCAGATAGGAAAAGCACACGTCTAAACTCCAAACACAACACAAAAACATATAACATAATATA +SRR4244242.21281 21281 length=150 AAAAAF).F<F)FFFF<)FFAFF<A7<A7FFF)7FAFA7)FFAFFFFFF.FFF<F7FAF.<FFF<.FFFF.FFF).)FAAF77)F7F)F.7F7F.))FA.A.7A)FA.)F))7.).)7.))).7<)..).<7...))..7.)<)..)..7 @SRR4244242.3210697 3210697 length=151 CCCCTGGGGCGCGCAAGTCTGCGCTGGTTGTGGCCCCGCCACACTGCGGAGGTTGGTCAGATGGTTGCCCATCTTCATGATGAGTTTCACCTCCTTATCAAGAAAGAGGCTTTCCAGGAAGTCACAGAGATGAGGGTCCGCGCGGGCAGAC +SRR4244242.3210697 3210697 length=151 AAAA<FF<FAFF.7)<FFFAFF<FFFFFFF<AF)F)FFFF)F)<.FFFFFFF<.FFFFAF<).77)F<7F.AFF7F.FF.FFFF.FAF).F7FFA.F.F..<A))F)FF.7)F<.)FAA)F)7)FFFFFF<FAAFF7)7<FF<F.AF)..) @SRR4244242.27105289 27105289 length=117
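The output above repeats in groups of four lines: a header starting with @, the sequence, a separator starting with +, and the quality string. A hedged sketch of reading such records follows, assuming every record occupies exactly four lines (true of the output shown, although FASTQ in general permits wrapped sequences). read_fastq is a hypothetical helper, not a library function, and the code is written for Python 3.

```python
import io
from itertools import islice


def read_fastq(handle, limit=None):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ handle."""
    records = 0
    while limit is None or records < limit:
        # Take the next four lines; fewer than four means end of file
        # (or a truncated final record, which is skipped).
        block = [line.rstrip("\n") for line in islice(handle, 4)]
        if len(block) < 4:
            break
        header, sequence, _separator, quality = block
        yield header[1:], sequence, quality  # drop the leading '@'
        records += 1


# Two short made-up records standing in for a real file such as example1.
sample = io.StringIO("@r1 1 length=4\nACGT\n+r1 1 length=4\nFFFF\n"
                     "@r2 2 length=3\nTTG\n+r2 2 length=3\nA.F\n")
for header, sequence, quality in read_fastq(sample):
    print(header, len(sequence), quality)
```

Passing an open file instead of the StringIO object, for example read_fastq(open("example1"), limit=25), would iterate over the first 25 reads without loading the whole file into memory.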
data2 = open("example2").read()
data2[:100]
'RIFFF\x1f2\x00WAVEfmt \x10\x00\x00\x00\x01\x00\x02\x00D\xac\x00\x00\x10\xb1\x02\x00\x04\x00\x10\x00LIST\x1a\x00\x00\x00INFOISFT\x0e\x00\x00\x00Lavf56.40.101\x00data\x00\x1f2\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
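The leading bytes of example2 follow the RIFF/WAVE layout, so this file is binary audio rather than text. A minimal sketch of decoding the first fields with the struct module is shown here. The 36 header bytes are copied from the output above, and the field names are illustrative labels for the canonical WAVE header layout. In Python 3 such a file would need to be opened with mode "rb" to obtain bytes.

```python
import struct

# First 36 bytes of example2, copied from the output above.
header = (b"RIFF" b"F\x1f2\x00" b"WAVE" b"fmt " b"\x10\x00\x00\x00"
          b"\x01\x00\x02\x00D\xac\x00\x00\x10\xb1\x02\x00\x04\x00\x10\x00")

# Little-endian layout: 4-byte tags, 32-bit sizes and rates, 16-bit format fields.
fields = struct.unpack("<4sI4s4sIHHIIHH", header)
(riff_tag, riff_size, wave_tag, fmt_tag, fmt_size,
 audio_format, channels, sample_rate, byte_rate, block_align, bits) = fields

print(channels, sample_rate, bits)
```

This prints 2 44100 16, which corresponds to 16-bit stereo audio sampled at 44.1 kHz.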