1
00:00:05,330 --> 00:00:12,920
In the previous lesson, we saw how we can import a folder of files and also a text file into Excel.
2
00:00:13,670 --> 00:00:18,410
And it's this text file that we're going to clean up and prepare for analysis.
3
00:00:18,830 --> 00:00:24,500
Now if you find that when you are opening these files, you get this security warning running across
4
00:00:24,500 --> 00:00:31,400
the top in yellow, then you can simply click Enable Content. That's just because we've created external
5
00:00:31,400 --> 00:00:33,560
links when we've imported this data.
6
00:00:34,130 --> 00:00:39,230
So I'm going to click Enable Content and I'm going to say, do not ask me again for network files.
7
00:00:39,470 --> 00:00:40,880
So now we have this data set.
8
00:00:41,240 --> 00:00:45,200
And the first thing we're going to deal with here are these blank rows.
9
00:00:45,740 --> 00:00:51,140
I'm also going to show you how to handle blank cells and how to remove duplicate entries.
10
00:00:51,860 --> 00:00:54,410
So let's start with the blank rows, first of all.
11
00:00:54,680 --> 00:00:59,810
Now, blank rows in general can cause problems when you're analyzing your data.
12
00:01:00,440 --> 00:01:06,260
For example, if I want to put this data into a pivot table, then start creating things like charts.
13
00:01:06,590 --> 00:01:10,160
If I have blank rows in there, it's going to throw off my data.
14
00:01:10,790 --> 00:01:17,300
So you always want to make sure that you remove any blank rows that you have in your dataset.
15
00:01:17,480 --> 00:01:22,700
Now, if you have a very large dataset and you have quite a lot of blank rows in there, it's going
16
00:01:22,700 --> 00:01:26,180
to be pretty tedious to go through and try and delete all of these manually.
17
00:01:26,630 --> 00:01:32,300
So I might decide I want to select these two rows, hold down Ctrl, select these two, scroll down,
18
00:01:32,300 --> 00:01:33,140
select them all.
19
00:01:33,350 --> 00:01:35,510
That's not particularly time efficient.
20
00:01:35,750 --> 00:01:42,020
Fortunately, there is a quick way to do this in Excel, so all we need to do here is click somewhere
21
00:01:42,020 --> 00:01:49,460
in our data, go up to the Home tab and then, all the way over in the Editing group, go to Find
22
00:01:49,460 --> 00:01:53,150
& Select and then Go To Special.
23
00:01:53,880 --> 00:01:57,380
And this is going to open this little Go To Special dialog box.
24
00:01:57,950 --> 00:02:03,290
Now, if you're interested in keyboard shortcuts, there isn't a keyboard shortcut to go directly
25
00:02:03,290 --> 00:02:09,740
to Go To Special, but what you could do is press Ctrl+G and then click the Special button.
26
00:02:10,850 --> 00:02:16,520
Now what this allows you to do, amongst other things, is select different items within your spreadsheet.
27
00:02:17,370 --> 00:02:21,170
One of the options that we have here is to select all blanks.
28
00:02:21,590 --> 00:02:27,770
So let's select that, click on OK, and it highlights all of those blank rows.
29
00:02:28,010 --> 00:02:34,700
So now with them all highlighted, I can go up to that Home tab again, into the Cells group, and using
30
00:02:34,700 --> 00:02:41,000
the Delete dropdown, I can choose Delete Sheet Rows, and I've deleted all of them in one go.
31
00:02:41,690 --> 00:02:44,210
So that's going to save me a lot of time.
32
00:02:44,780 --> 00:02:48,920
Now, another thing you want to make sure you deal with in your spreadsheets is blank cells.
33
00:02:49,520 --> 00:02:54,170
Now, I don't have any blank cells, but if I did, let's just delete a few things out of some of these
34
00:02:54,170 --> 00:02:54,740
cells.
35
00:02:55,130 --> 00:03:01,770
And let's go like that, like so. Now again, blank cells can cause a bit of a problem.
36
00:03:01,790 --> 00:03:08,180
It's always better to have some kind of numeric value in these cells, even if that is just a zero.
37
00:03:08,480 --> 00:03:15,080
So once again, if I want to select all blank cells in this particular column, all I would need to
38
00:03:15,080 --> 00:03:24,590
do is select the column, press Ctrl+G, click Special, and I can select Blanks and click on OK.
39
00:03:25,280 --> 00:03:29,870
And that will select all of the blank cells just in the range that I specified.
40
00:03:30,170 --> 00:03:34,280
So now I can enter a zero into all of them in one go.
41
00:03:34,940 --> 00:03:39,320
So all I need to do here is type zero and press Ctrl+Enter.
42
00:03:39,860 --> 00:03:42,920
And that's going to put a zero in all of those blank cells.
43
00:03:42,920 --> 00:03:46,070
I haven't had to scroll through and do them all individually.
44
00:03:46,580 --> 00:03:49,820
So again, that is a real time saver of a trick.
45
00:03:50,360 --> 00:03:55,580
And the final thing I want to show you here is removing duplicates from your dataset.
46
00:03:56,360 --> 00:04:01,010
So it might be that when you import your data, some duplicates come across for whatever reason.
47
00:04:01,580 --> 00:04:07,220
Now, when I say duplicates, I mean an exact duplicate of every single column in this row.
48
00:04:07,610 --> 00:04:13,130
Now it's always worth removing duplicates, even if you're not sure if you have duplicates in your spreadsheet.
49
00:04:13,370 --> 00:04:18,290
And in Excel, we have a button that can check duplicates for us and remove them quickly.
50
00:04:18,530 --> 00:04:25,880
So let's make sure we've clicked somewhere in our spreadsheet, go up to the Data tab, and in the Data Tools group
51
00:04:25,880 --> 00:04:26,330
we have
52
00:04:26,330 --> 00:04:27,890
a Remove Duplicates button.
53
00:04:28,580 --> 00:04:33,440
Now, when I click this, it's going to ask me to determine what is actually a duplicate.
54
00:04:34,100 --> 00:04:39,590
And the first thing I need to make sure that I select is "My data has headers", and notice
55
00:04:39,590 --> 00:04:44,240
it's picked up all of the different columns and it's put a tick in the box next to all of them.
56
00:04:44,780 --> 00:04:51,320
So what this basically means is that every column has to be the same for it to be considered a duplicate.
57
00:04:51,560 --> 00:04:58,770
Now I'm happy with that because I only want to remove duplicates where every value in the row is duplicated.
58
00:04:59,450 --> 00:05:00,860
So let's click on OK.
59
00:05:01,490 --> 00:05:04,760
You can see that it's actually found two duplicate
60
00:05:04,940 --> 00:05:07,010
values and it's removed them.
61
00:05:07,640 --> 00:05:08,510
Click on OK.
62
00:05:09,050 --> 00:05:10,220
And we are done.
63
00:05:10,430 --> 00:05:19,940
So that is how you can very quickly delete blank rows, input data into blank cells and also remove duplicates.
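
For anyone who wants to repeat the three clean-up steps from this lesson outside Excel, here is a rough Python sketch. It is not the method shown in the video, which uses Go To Special and Remove Duplicates. The function names, the sample rows and the choice of None or an empty string as a "blank" are all assumptions made for illustration. Note that this sketch drops only rows that are blank in every cell, and treats a duplicate as a row that matches an earlier row exactly in every column.

```python
def drop_blank_rows(rows):
    """Remove rows in which every cell is blank (None or empty string)."""
    return [row for row in rows if any(cell not in (None, "") for cell in row)]


def fill_blank_cells(rows, column, value=0):
    """Put a value into the blank cells of one column (like typing 0, then Ctrl+Enter)."""
    filled = []
    for row in rows:
        new_row = list(row)  # copy so the input rows are left untouched
        if new_row[column] in (None, ""):
            new_row[column] = value
        filled.append(new_row)
    return filled


def remove_duplicates(rows):
    """Keep the first occurrence of each row; every column must match to count as a duplicate."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample data (no header row): region, product, units sold.
rows = [
    ["North", "Widget", 120],
    [None, None, None],          # a blank row
    ["South", "Gadget", None],   # a blank cell in the third column
    ["North", "Widget", 120],    # an exact duplicate of the first row
]

cleaned = remove_duplicates(fill_blank_cells(drop_blank_rows(rows), column=2))
print(cleaned)
```

The steps run in the same order as the lesson: blank rows first, then blank cells, then duplicates.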