All language subtitles for 77031x_PR_DNA_Markers_01_Assembly_v1-en

PETER REDDIEN: I'd say it's just an amazing era we're operating in now with the power of DNA sequencing. You may think of DNA sequencing as something you do just to get the sequence of a genome or the sequence of an individual, but there is a bewildering array of methods that use DNA sequencing to explore biology, lots of different contexts in which you can use sequencing to get information about cells or an organism. So for example, you can get information about epigenetic states, which we'll talk about, gene expression levels, variants, regions that regulate genes, and so on. So lots of information can come from using the method of DNA sequencing. You'll get some exposure to some of these methods throughout the course.

OK. Right now we're really focused on just sequencing to get the sequence of a genome and, today, a little bit on the uses of sequencing with respect to that. So the first part of the lecture here is about mapping reads to an assembly, like an assembled genome sequence. Then I'll talk about genetic mapping with DNA sequencing.
And finally, time permitting, a little bit of commentary on genome annotation.

OK, so let's start with this first part, mapping reads to an assembly. All right. So after performing some kind of DNA sequencing, you have a set of individual reads, so we have these sequencing reads. And we may have some huge number of them, millions of them depending upon the method used, where each of these reads has some sequence reading from 5 prime to 3 prime. And then we want to do something with these reads. OK? Taking these reads, which are all individual pieces of data (so this is your data), and then trying to see where they align, to stitch them all together into some kind of assembly. So that was one application: for example, to assemble the sequence of a genome, a genome assembly.

OK. OK, so what else could you do with these reads?
Well, a second application could be not to take these and, from scratch, try to assemble them into a genome, but simply to map them to an existing genome sequence or some existing reference assembly. So we could map them to a reference sequence. OK? And this reference sequence, for example, could be an already existing assembled genome.

OK, so let's take a look at that. So let's say we have some assembly here, like some genome reference. This will most likely exist in multiple different pieces of sequence. In a perfect assembly there'd be one piece of sequence for each chromosome. Most assemblies are very imperfect and are fragmented into lots of different pieces. You have this assembly, and then you can take your reads and go on a search, computationally intensive because the genome is large, to look for where in the genome these things align. So you can get these reads and align them. So you might take this read, some read one, and you might notice that this aligns here to this reference. And by align I mean it has a sequence match.
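As a rough illustration of the search just described, here is a minimal Python sketch of mapping reads to a fragmented reference by exact sequence match. Everything in it is an assumption made for illustration and not something from the lecture: the contig names and toy sequences, the seed length `K`, and the helper names `build_kmer_index` and `map_read`. It only looks at the forward strand and only accepts perfect matches, whereas real read aligners also tolerate mismatches and gaps.

```python
from collections import defaultdict


def build_kmer_index(contigs, k):
    """Record every (contig, offset) at which each k-mer of the reference occurs."""
    index = defaultdict(list)
    for name, seq in contigs.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i))
    return index


def map_read(read, contigs, index, k):
    """Return every (contig, offset) where the whole read matches exactly."""
    hits = []
    # Use the first k bases of the read as a seed, then verify the full read.
    for name, pos in index.get(read[:k], []):
        if contigs[name][pos:pos + len(read)] == read:
            hits.append((name, pos))
    return hits


# Hypothetical toy reference in two pieces, like a fragmented assembly.
contigs = {
    "contig1": "ACGTTAGCCGATAGGCTA",
    "contig2": "TTGACCGTAAGCTTGCA",
}
K = 4  # illustrative seed length
index = build_kmer_index(contigs, K)

# Hypothetical reads: two come from the reference, one does not.
reads = {"read1": "GCCGATAG", "read2": "CGTAAGCT", "read3": "GGGGGGGG"}
placements = {name: map_read(seq, contigs, index, K) for name, seq in reads.items()}

for name, hits in placements.items():
    print(name, hits)
```

Indexing the reference once and looking up a short seed from each read is one common way to avoid scanning the whole genome for every read, which is where the computational cost mentioned above comes from.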
It's the same sequence. So it's a sequence match to that location in the genome assembly. OK? And so then you can do this with all of your reads, line them up, and they'll all line up, stacking up at some locations in the genome. So you'll know those reads, from wherever your sample came from, came from that part of the genome.

Now, one of the things you could do from this: these reads, your data, could be from some individual. Let's say someone has presented with some kind of disease in a clinic. You could do some DNA sequencing from that individual, take the reads you get, align them to a reference sequence, the human reference sequence, and then look at your alignment for any relevant information. For instance, you might notice that at this position, you have, in the reference sequence, a nucleotide A. And then when you look at the sequence reads from the individual at that position, you might notice it's a G. And that way you would have identified a DNA sequence variant that existed in your individual with respect to some reference sequence, your genome assembly.
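The A-versus-G comparison above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the method used in the course: the reference string, the reads and their alignment offsets are made up, the reads are assumed to be already aligned without gaps, and the thresholds `min_depth` and `min_fraction` are arbitrary illustrative choices. With `min_fraction=0.8`, a heterozygous variant seen in roughly half the reads would not be reported.

```python
from collections import Counter, defaultdict


def pileup(alignments):
    """Count the bases that aligned reads place at each reference position."""
    columns = defaultdict(Counter)
    for offset, read in alignments:
        for i, base in enumerate(read):
            columns[offset + i][base] += 1
    return columns


def call_variants(reference, alignments, min_depth=2, min_fraction=0.8):
    """Report positions where most reads disagree with the reference base."""
    variants = []
    columns = pileup(alignments)
    for pos in sorted(columns):
        counts = columns[pos]
        depth = sum(counts.values())
        base, n = counts.most_common(1)[0]
        if base != reference[pos] and depth >= min_depth and n / depth >= min_fraction:
            variants.append((pos, reference[pos], base, depth))
    return variants


# Hypothetical reference with an A at position 6 (0-based).
reference = "TTGACCATAAGC"

# Hypothetical reads from the individual, as (alignment offset, read sequence).
# Each one carries a G where the reference has the A.
alignments = [(2, "GACCGTA"), (4, "CCGTAAG"), (5, "CGTAAGC")]

variants = call_variants(reference, alignments)
for pos, ref_base, alt_base, depth in variants:
    print(f"position {pos}: reference {ref_base}, reads {alt_base}, depth {depth}")
```

Requiring several stacked reads to agree before reporting a difference is a simple guard against calling a variant from a single sequencing error.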
So you'd have a sequence variant here, and you might have learned something that might be relevant for your purposes from this. OK, so there are lots of other applications of aligning sequencing reads to a genome assembly. This is just one, and we'll come back to some others.