1. PATTERN MATCHING BASICS

Hello everyone, and welcome to our bonus lecture on pattern matching in PostgreSQL.

In this lecture, we will start by discussing the different types of pattern matching and their syntax, and then we will discuss a few examples of each type. At the end, we will also give you some tips on the usage of the different methods. So let's start.

We can define pattern matching as a method to identify strings that comply with a given format. Suppose we want to identify all the customer names that start with the letter A, or the customers who have provided an email ID instead of a name, or the customers who have not provided the domain name in their email ID. We can find all such customers using pattern matching in PostgreSQL.

This lecture will also lay the foundation for other advanced string functions, such as string split, etc.
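As a rough illustration of the first two cases mentioned above, here is a minimal sketch. The table name_demo, its column customer_name, and the sample rows are assumptions made up for this example, not the data used in the course. It relies on the LIKE operator, which is discussed later in this lecture.

```sql
-- Hypothetical sample data; the table, column, and rows are assumptions.
CREATE TEMP TABLE name_demo (customer_name text);
INSERT INTO name_demo (customer_name) VALUES
    ('Alice Brown'),
    ('Bob Stone'),
    ('carol@example.com');

-- Customer names that start with the letter A.
SELECT customer_name
FROM name_demo
WHERE customer_name LIKE 'A%';      -- matches 'Alice Brown'

-- Customers who entered something that looks like an email ID instead of
-- a name (simplified here to "the value contains an @ sign").
SELECT customer_name
FROM name_demo
WHERE customer_name LIKE '%@%';     -- matches 'carol@example.com'
```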
So understand the concept thoroughly, and you will be able to use the other functions as well.

In PostgreSQL there are three methods to perform pattern matching. First is the LIKE statement. Second is the SIMILAR TO statement. And third is using the tilde operators with regular expressions, which are also known as regex.

We have already discussed LIKE statements in our previous videos, and we will only cover a few examples to refresh our memory.

Now moving on to SIMILAR TO. It is part of the SQL standard, and the only reason PostgreSQL supports it is to stay compliant with that standard. Internally, every SIMILAR TO expression is rewritten in the form of a regular expression, and therefore there will always be a regular expression that does the same job faster than the SIMILAR TO statement. So there is no point in discussing SIMILAR TO expressions, and you should also avoid them in your queries and try regular expressions instead.

Regular expressions with the tilde operator provide us with a very powerful and flexible tool to perform pattern matching.

One thing to note here is that the wildcards of LIKE statements and the wildcards of regular expressions are
different, and you should try to learn them separately. In this video, we will mainly focus on regular expressions.

Another major difference between LIKE statements and regular expressions is that LIKE statements perform pattern matching on the whole string, whereas regular expressions can also match on a part of the string. So suppose I want to find customer names with just the character A. A LIKE statement will find only the customer names that consist of the single character A, whereas a regular expression will find all the customers whose name contains the character A.

So let's start with the LIKE operator. There are two wildcards in the LIKE operator: first is the percentage symbol, and second is the underscore symbol. The percentage symbol allows you to match any string of any length (including an empty one), whereas the underscore symbol allows you to match exactly one character.

Let's look at some examples. We have already discussed this in our previous videos, so in this video we will only discuss the queries and not execute them in pgAdmin.

Suppose we want to find all the customers whose first name starts with the characters J and o.
We will write SELECT * from the customer table where first name is LIKE 'Jo%', that is, J, o, and then the percentage symbol. We are using the percentage symbol because we don't know the length of the name, and we want to identify all the customer names where the starting characters are J and o.

Now suppose you want to find all the customers whose first name contains the letters o and d. We will write SELECT * from the customer table where first name LIKE '%od%': the percentage symbol, then od, then the percentage symbol. This means that the first name should contain o and d adjacent to each other.

In the next example, suppose I want customer names that start with J, a, s, then have exactly one character that can be anything, and then the character n. In that case I will use the underscore, as in 'Jas_n'. The underscore ensures that exactly one character is substituted.

We can also use NOT with the LIKE statement. In the next example, you can see that we have used NOT LIKE 'J%' to identify all the customers whose names don't start with J.

Now suppose I want to identify all the customers whose names
start with A, B, C, or E. For this I have to write four different LIKE statements with OR keywords in between.

Now consider a more complicated case, where I want the first name to start with A, B, C, D, or E, and the second name to start with F or G. In this case I have to write 5 times 2, so a total of 10 LIKE statements, if I spell out every combination with the OR and AND keywords.

Now suppose I also want to put a constraint on the length of the first name and the last name. In this case, I first have to identify the separator between the first name and the last name, that is, the position of the space in the name, and then I have to write two more conditions on the lengths of the first name and the last name.

So you can see that as the number of conditions and their complexity increase, the number of LIKE statements in the SQL grows very quickly.

In general, LIKE statements provide a quick and easy way to solve simple pattern matching problems, but for complex matching problems we have to use the regex
functions.

You must be thinking that we will hardly encounter any such situation in our professional career, so let me give you an example. Suppose you want to filter out invalid email IDs from your data. A valid email ID should contain a string of alphanumeric characters, with either a dot or an underscore symbol; then there should be an @ sign; then again there should be an alphanumeric string, for example google or yahoo; then there should be a dot; and after that dot there should be 2 to 8 letters, such as .com or .in.

A valid email ID should contain all of these parts, and we will learn how to write this using regular expressions.
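The LIKE examples discussed in this lecture, and the difference between whole-string and partial matching, can be sketched as follows. The table customer_demo, its columns, and the sample rows are hypothetical. The email pattern at the end is only one possible reading of the rules listed above (it assumes, for instance, that dots and underscores may appear anywhere before the @ sign); it is not necessarily the pattern the course arrives at.

```sql
-- Hypothetical sample data; names and rows are assumptions for illustration.
CREATE TEMP TABLE customer_demo (first_name text, last_name text, email text);
INSERT INTO customer_demo (first_name, last_name, email) VALUES
    ('John',   'Ford',   'john.ford@example.com'),
    ('Jason',  'Green',  'jason_green@example'),
    ('Rodney', 'Hill',   'rodney@example.in'),
    ('A',      'Fisher', 'a.fisher@example.org'),
    ('Alan',   'Gray',   'alan gray@example.com');

-- First name starts with "Jo".
SELECT * FROM customer_demo WHERE first_name LIKE 'Jo%';     -- John

-- First name contains "od" (o and d adjacent).
SELECT * FROM customer_demo WHERE first_name LIKE '%od%';    -- Rodney

-- "Jas", then exactly one character, then "n".
SELECT * FROM customer_demo WHERE first_name LIKE 'Jas_n';   -- Jason

-- First name does not start with J.
SELECT * FROM customer_demo WHERE first_name NOT LIKE 'J%';  -- Rodney, A, Alan

-- LIKE must match the whole string; ~ matches anywhere in the string.
SELECT * FROM customer_demo WHERE first_name LIKE 'A';       -- A
SELECT * FROM customer_demo WHERE first_name ~ 'A';          -- A, Alan

-- Starts with A, B, C, or E: four LIKE conditions ...
SELECT * FROM customer_demo
WHERE first_name LIKE 'A%' OR first_name LIKE 'B%'
   OR first_name LIKE 'C%' OR first_name LIKE 'E%';          -- A, Alan

-- ... or one regular expression with a character class.
SELECT * FROM customer_demo WHERE first_name ~ '^[ABCE]';    -- A, Alan

-- First name starts with A to E and last name starts with F or G.
SELECT * FROM customer_demo
WHERE first_name ~ '^[ABCDE]' AND last_name ~ '^[FG]';       -- A, Alan

-- Rows whose email does NOT fit the assumed pattern:
-- letters/digits/dot/underscore, @, letters/digits, a dot, 2 to 8 letters.
SELECT * FROM customer_demo
WHERE email !~ '^[A-Za-z0-9._]+@[A-Za-z0-9]+[.][A-Za-z]{2,8}$';  -- Jason, Alan
```

Here ^ and $ anchor the regular expression to the start and end of the string, which is what makes it behave like a whole-string match; without them, ~ succeeds as soon as any part of the value fits the pattern.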