1
00:00:01,410 --> 00:00:03,360
In this lesson, we'll
examine the question,
2
00:00:03,360 --> 00:00:05,090
what is a database?
3
00:00:05,090 --> 00:00:06,630
Let's imagine for
a moment that we're
4
00:00:06,630 --> 00:00:09,780
in a room with a telephone
book for the 50 largest
5
00:00:09,780 --> 00:00:11,370
American cities.
6
00:00:11,370 --> 00:00:13,260
I tell you to find
the phone number
7
00:00:13,260 --> 00:00:15,690
of Carl Roth in Los Angeles.
8
00:00:15,690 --> 00:00:16,980
What would you do?
9
00:00:16,980 --> 00:00:19,920
Simply pick up the Los
Angeles telephone book,
10
00:00:19,920 --> 00:00:22,530
flip to the R's, and
find his number--
11
00:00:22,530 --> 00:00:24,040
not terribly difficult.
12
00:00:24,040 --> 00:00:27,900
But what if, first, I tore out
all the pages of each book,
13
00:00:27,900 --> 00:00:31,500
threw the pages in the air, and
scattered them on the floor?
14
00:00:31,500 --> 00:00:35,040
Then I asked you to find
Carl Roth in Los Angeles.
15
00:00:35,040 --> 00:00:37,800
That task is considerably
more difficult.
16
00:00:37,800 --> 00:00:41,530
This is a good example of why
the database is so important.
17
00:00:41,530 --> 00:00:43,320
Think about all the
information about you
18
00:00:43,320 --> 00:00:45,330
that exists in databases--
19
00:00:45,330 --> 00:00:48,240
our hobbies, our preferences,
our interests,
20
00:00:48,240 --> 00:00:51,570
purchases that we've made,
friends that we have.
21
00:00:51,570 --> 00:00:54,330
All of this information
must be stored somewhere.
22
00:00:54,330 --> 00:00:57,510
And the vast majority of
it is stored in databases.
23
00:00:57,510 --> 00:01:01,050
Put simply, a database is an
organized collection of data.
24
00:01:01,050 --> 00:01:04,260
Computers use databases
to organize and store
25
00:01:04,260 --> 00:01:07,470
that data in a way that it
can be easily retrieved.
26
00:01:07,470 --> 00:01:09,510
But the kind of
databases we have today
27
00:01:09,510 --> 00:01:12,100
haven't always
been in existence.
28
00:01:12,100 --> 00:01:15,300
The first computers were
primarily used for computation
29
00:01:15,300 --> 00:01:18,820
in scientific research and
even some military applications
30
00:01:18,820 --> 00:01:21,150
to calculate numbers quickly.
31
00:01:21,150 --> 00:01:24,780
As time went on, businesses
began to use them, as well.
32
00:01:24,780 --> 00:01:26,760
When they came into
more common use,
33
00:01:26,760 --> 00:01:29,370
requirements increased
to be able to store
34
00:01:29,370 --> 00:01:31,500
much of the data that
was being calculated.
35
00:01:31,500 --> 00:01:33,630
As more data was
stored, new ways
36
00:01:33,630 --> 00:01:36,930
were required to preserve and
retrieve that data quickly.
37
00:01:36,930 --> 00:01:40,410
The first databases were
called flat-file databases.
38
00:01:40,410 --> 00:01:42,150
A flat-file database
is something
39
00:01:42,150 --> 00:01:44,190
that we're basically
familiar with.
40
00:01:44,190 --> 00:01:46,440
If you've ever seen a
comma-separated values
41
00:01:46,440 --> 00:01:49,410
file, or CSV,
you've seen the way
42
00:01:49,410 --> 00:01:51,840
a flat-file database is stored.
43
00:01:51,840 --> 00:01:56,100
Let's look at an example of
some simple flat-file values.
44
00:01:56,100 --> 00:01:58,500
Here we see first
name, last name,
45
00:01:58,500 --> 00:02:02,170
and other information about
our customers in the database.
46
00:02:02,170 --> 00:02:04,140
Notice that the first
value in each row
47
00:02:04,140 --> 00:02:07,350
is the first name and the
second value in each row
48
00:02:07,350 --> 00:02:08,760
is the last name.
49
00:02:08,760 --> 00:02:11,970
This is consistent
across all of the values.
50
00:02:11,970 --> 00:02:14,670
The data is read from,
essentially, left to right.
51
00:02:14,670 --> 00:02:17,550
The first value is read,
then a comma delimiter,
52
00:02:17,550 --> 00:02:20,520
the second value is read,
and so on and so forth.
53
00:02:20,520 --> 00:02:24,720
Each individual value within
that row, separated by commas,
54
00:02:24,720 --> 00:02:26,310
is called a field.
55
00:02:26,310 --> 00:02:29,340
This data was generally accessed
using programmatic methods
56
00:02:29,340 --> 00:02:32,070
that read individual records.
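To make the idea of reading records and fields concrete, here is a minimal sketch in Python. The sample rows, the city and phone fields, and the name `read_records` are assumptions made up for illustration; they are not the values shown on screen in the lesson.

```python
import csv
import io

# Hypothetical flat-file contents: one record per line,
# with the fields separated by comma delimiters.
FLAT_FILE = """Carl,Roth,Los Angeles,555-0100
Ana,Lopez,Chicago,555-0101
"""

def read_records(text):
    """Return each record as a list of fields, read left to right."""
    return [row for row in csv.reader(io.StringIO(text))]

records = read_records(FLAT_FILE)
for first_name, last_name, city, phone in records:
    # The first field is always the first name, the second the last name.
    print(first_name, last_name, city, phone)
```

Finding one person this way means scanning the records one at a time, which is part of why very large flat files become hard to manage.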
57
00:02:32,070 --> 00:02:33,810
This example of a
flat-file database
58
00:02:33,810 --> 00:02:37,950
is very limited in that there are
only a few records and fields.
59
00:02:37,950 --> 00:02:39,660
In a real flat-file
database, there
60
00:02:39,660 --> 00:02:42,720
could be millions and millions
of records and hundreds
61
00:02:42,720 --> 00:02:43,710
of fields.
62
00:02:43,710 --> 00:02:46,580
This is one of the problems
that developed over time.
63
00:02:46,580 --> 00:02:48,900
The flat files
grew unmanageable.
64
00:02:48,900 --> 00:02:50,430
Another approach
was needed to be
65
00:02:50,430 --> 00:02:52,290
able to manage
this type of data.
66
00:02:52,290 --> 00:02:56,160
That's where a man named Dr.
Ted Codd comes into the story.
67
00:02:56,160 --> 00:02:58,950
Dr. Codd, employed
at IBM at the time,
68
00:02:58,950 --> 00:03:01,890
was working on a solution
for some of these problems.
69
00:03:01,890 --> 00:03:05,040
In the 1970s, Dr.
Codd presented a paper
70
00:03:05,040 --> 00:03:07,410
that introduced a new
type of storage paradigm
71
00:03:07,410 --> 00:03:10,770
for databases, called
the relational paradigm.
72
00:03:10,770 --> 00:03:12,390
The relational
paradigm was a way
73
00:03:12,390 --> 00:03:14,160
to deal with many
of the problems
74
00:03:14,160 --> 00:03:16,620
in typical flat-file databases.
75
00:03:16,620 --> 00:03:20,160
The relational paradigm depends
on organizing information
76
00:03:20,160 --> 00:03:23,100
into what are called
entities and attributes.
77
00:03:23,100 --> 00:03:26,280
An entity is any
person, place, or thing.
78
00:03:26,280 --> 00:03:30,180
Attributes are things about
that person, place, or thing.
79
00:03:30,180 --> 00:03:32,250
So we might organize
our information the way
80
00:03:32,250 --> 00:03:33,720
you see here.
81
00:03:33,720 --> 00:03:36,360
We have an employee entity
that has attributes,
82
00:03:36,360 --> 00:03:38,580
such as first name
and last name.
83
00:03:38,580 --> 00:03:40,680
Thus, we can organize
all of the information
84
00:03:40,680 --> 00:03:43,140
that we've had in
flat-file databases
85
00:03:43,140 --> 00:03:46,320
and put it into an
entity-attribute structure.
86
00:03:46,320 --> 00:03:49,380
The real strength of Dr.
Codd's relational model
87
00:03:49,380 --> 00:03:52,920
is that the information can be
related to other information.
88
00:03:52,920 --> 00:03:56,010
That is to say, we can
relate one entity to another.
89
00:03:56,010 --> 00:03:58,590
In a flat-file database
all the information
90
00:03:58,590 --> 00:04:00,630
about a given employee
would basically
91
00:04:00,630 --> 00:04:02,160
be in a single record.
92
00:04:02,160 --> 00:04:05,760
In Dr. Codd's model, we
separate that information out
93
00:04:05,760 --> 00:04:06,900
and relate it.
94
00:04:06,900 --> 00:04:08,820
Here we have
employee information
95
00:04:08,820 --> 00:04:10,270
and address information.
96
00:04:10,270 --> 00:04:13,260
Dr. Codd's theory was that
similar pieces of information
97
00:04:13,260 --> 00:04:15,300
could be structured
in such a way
98
00:04:15,300 --> 00:04:17,010
that they formed relationships.
99
00:04:17,010 --> 00:04:19,740
So rather than combining
numerous pieces of information
100
00:04:19,740 --> 00:04:22,110
together in the same
structure, we instead
101
00:04:22,110 --> 00:04:25,860
separate them out into
related pieces of information.
102
00:04:25,860 --> 00:04:28,230
Because both of these
entities share a common piece
103
00:04:28,230 --> 00:04:31,620
of information, in
this case employee ID,
104
00:04:31,620 --> 00:04:32,850
they can be related.
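As a rough sketch of relating two entities through a shared employee ID, the Python example below uses the standard-library `sqlite3` module with an in-memory database. The column names and the sample row are assumptions chosen for illustration, not the exact structure shown in the lesson.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    first_name  TEXT,
    last_name   TEXT
);
CREATE TABLE address (
    address_id  INTEGER PRIMARY KEY,
    employee_id INTEGER REFERENCES employee(employee_id),
    city        TEXT
);
INSERT INTO employee VALUES (1, 'Carl', 'Roth');
INSERT INTO address  VALUES (10, 1, 'Los Angeles');
""")

# Relate the two entities through the piece of information they share.
rows = conn.execute("""
    SELECT e.first_name, e.last_name, a.city
    FROM employee AS e
    JOIN address AS a ON a.employee_id = e.employee_id
""").fetchall()
print(rows)
conn.close()
```

The employee's name is stored once, and the address points back to it by ID instead of repeating it.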
105
00:04:32,850 --> 00:04:36,330
This type of database is known
as an RDBMS or Relational
106
00:04:36,330 --> 00:04:38,320
Database Management System.
107
00:04:38,320 --> 00:04:39,870
So why is this model better?
108
00:04:39,870 --> 00:04:42,210
The primary reason
we use an RDBMS
109
00:04:42,210 --> 00:04:44,010
is the removal of
duplicate data.
110
00:04:44,010 --> 00:04:47,040
Consider an order entry
system with customer records.
111
00:04:47,040 --> 00:04:49,410
If an individual
places numerous orders,
112
00:04:49,410 --> 00:04:51,240
the customer's
basic information,
113
00:04:51,240 --> 00:04:53,340
such as name and
contact information,
114
00:04:53,340 --> 00:04:57,060
must be stored in every record
using the flat-file method.
115
00:04:57,060 --> 00:04:58,710
With the relational
method, we simply
116
00:04:58,710 --> 00:05:00,730
have an entity for
customer information
117
00:05:00,730 --> 00:05:03,010
and another entity
for order information.
118
00:05:03,010 --> 00:05:04,690
We then relate them together.
119
00:05:04,690 --> 00:05:07,180
The process of transforming
a flat-file data
120
00:05:07,180 --> 00:05:10,960
model into a relational one
is known as normalization.
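A simplified sketch of that transformation, in Python, might look like the following. The sample orders and the function name `normalize` are hypothetical, and formal normalization involves defined normal forms that this sketch does not cover; it only shows duplicate customer data being pulled out into its own entity.

```python
# Hypothetical flat-file style records: the customer's details
# are repeated on every order.
flat_orders = [
    {"name": "Carl Roth", "phone": "555-0100", "item": "lamp"},
    {"name": "Carl Roth", "phone": "555-0100", "item": "desk"},
    {"name": "Ana Lopez", "phone": "555-0101", "item": "chair"},
]

def normalize(flat_rows):
    """Split flat rows into a customer entity and an order entity."""
    customer_ids = {}  # (name, phone) -> customer_id
    orders = []
    for row in flat_rows:
        key = (row["name"], row["phone"])
        # Assign a new ID only the first time a customer is seen.
        customer_id = customer_ids.setdefault(key, len(customer_ids) + 1)
        orders.append({"customer_id": customer_id, "item": row["item"]})
    customers = [
        {"customer_id": cid, "name": name, "phone": phone}
        for (name, phone), cid in customer_ids.items()
    ]
    return customers, orders

customers, orders = normalize(flat_orders)
print(customers)
print(orders)
```

Each customer now appears once, and every order refers to its customer by ID.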
121
00:05:10,960 --> 00:05:15,370
Today, we refer to an entity as
a table and records and fields
122
00:05:15,370 --> 00:05:18,580
as rows and columns,
respectively.
123
00:05:18,580 --> 00:05:21,580
The RDBMS has been the data
storage model of choice
124
00:05:21,580 --> 00:05:23,200
for three decades, now.
125
00:05:23,200 --> 00:05:27,130
And although Oracle is the RDBMS
with the largest market share,
126
00:05:27,130 --> 00:05:29,530
there are other popular
systems, as well.
127
00:05:29,530 --> 00:05:33,240
IBM's flagship database
product is known as Db2.
128
00:05:33,240 --> 00:05:35,500
It evolved from their
first RDBMS product,
129
00:05:35,500 --> 00:05:38,560
called System R. It is very
popular with customers that
130
00:05:38,560 --> 00:05:40,960
use IBM hardware and
includes the ability
131
00:05:40,960 --> 00:05:42,730
to run on mainframe systems.
132
00:05:42,730 --> 00:05:45,670
Oracle actually has another
RDBMS that it provides,
133
00:05:45,670 --> 00:05:48,100
although it is free of
licensing restrictions.
134
00:05:48,100 --> 00:05:50,290
When Oracle acquired
Sun Microsystems,
135
00:05:50,290 --> 00:05:53,200
they also acquired
the MySQL RDBMS
136
00:05:53,200 --> 00:05:55,210
and continue to
support it today.
137
00:05:55,210 --> 00:05:58,720
SQL Server from Microsoft
is another popular RDBMS.
138
00:05:58,720 --> 00:06:01,090
Microsoft began
work on SQL Server
139
00:06:01,090 --> 00:06:04,360
when they purchased the code
base for Sybase SQL Server
140
00:06:04,360 --> 00:06:05,440
from Sybase.
141
00:06:05,440 --> 00:06:07,330
Although popular
on Windows systems,
142
00:06:07,330 --> 00:06:09,940
SQL Server lacks the
cross-platform abilities
143
00:06:09,940 --> 00:06:13,000
of other database systems since
it cannot run on any other
144
00:06:13,000 --> 00:06:15,990
operating system
besides Windows.