1
00:00:00,200 --> 00:00:06,180
Hey everyone, so for this demo I want to dive into another one of Power BI's AI-driven visuals called
2
00:00:06,180 --> 00:00:08,330
the key influencers visual.
3
00:00:08,400 --> 00:00:13,440
Now, instead of applying this to our current Adventure Works project, what I want to do is show you how
4
00:00:13,440 --> 00:00:18,900
we can use this visual to explore a new demo dataset that's a little bit more appropriate for this kind
5
00:00:18,900 --> 00:00:20,060
of analysis.
6
00:00:20,070 --> 00:00:26,250
So what I've done here is create a new Power BI file. It's called Power BI AI Visuals, and you'll find
7
00:00:26,250 --> 00:00:28,410
a completed version of it as well.
8
00:00:28,560 --> 00:00:30,260
Both are available for download.
9
00:00:30,270 --> 00:00:32,360
You can follow along or just watch.
10
00:00:32,520 --> 00:00:33,850
Totally up to you.
11
00:00:34,350 --> 00:00:38,940
But the dataset that we're working with here there's one query in this file and it's called Kickstarter
12
00:00:38,940 --> 00:00:39,980
projects.
13
00:00:40,110 --> 00:00:45,600
Now for anyone not familiar with Kickstarter it's basically a platform where entrepreneurs can post
14
00:00:46,080 --> 00:00:50,760
projects or business ideas and try to get them funded from other users.
15
00:00:50,760 --> 00:00:52,990
So let's take a quick look at what we're dealing with here.
16
00:00:53,040 --> 00:00:59,970
Going to pop into the Data tab, and we have one record, or one row, for each project. Got a unique project
17
00:01:00,000 --> 00:01:01,740
ID, project name.
18
00:01:01,860 --> 00:01:05,140
It's categorized into subcategories and categories.
19
00:01:05,280 --> 00:01:09,090
The goal amount is basically the target level of funding
20
00:01:09,090 --> 00:01:15,060
in order for the project to be successful. We know when that project launched, and ultimately we know
21
00:01:15,060 --> 00:01:16,350
what the outcome was.
22
00:01:16,350 --> 00:01:17,730
Was it successful?
23
00:01:17,730 --> 00:01:18,860
Did it fail?
24
00:01:18,870 --> 00:01:21,470
Is it currently live or in progress?
25
00:01:21,480 --> 00:01:23,390
That's what we have in this column here.
26
00:01:23,640 --> 00:01:28,880
And then we also know the number of backers, the number of people who pledged or supported the project.
27
00:01:28,980 --> 00:01:30,690
We know what country it came from.
28
00:01:30,690 --> 00:01:33,470
We know the total amount pledged in U.S. dollars.
29
00:01:33,510 --> 00:01:35,770
And we've got a year field here as well.
30
00:01:35,880 --> 00:01:38,950
So I filtered this down to just 2017 projects.
31
00:01:38,970 --> 00:01:42,430
Got about forty thousand records here in this table.
32
00:01:42,570 --> 00:01:47,610
But if you're interested, jump into Power Query and you'll see the full unfiltered dataset, which is about
33
00:01:47,760 --> 00:01:52,740
three hundred and fifty thousand records so you can play with the full one you can cut it up different
34
00:01:52,740 --> 00:01:54,600
ways totally your call.
35
00:01:55,380 --> 00:02:01,020
So let's head back to our report view and we're gonna go ahead and insert the key influencers visual
36
00:02:01,080 --> 00:02:08,090
looks like this, kind of like a little lollipop-chart-looking thing with the AI light bulb in the corner.
37
00:02:08,160 --> 00:02:09,020
So let's drop it in.
38
00:02:09,020 --> 00:02:13,810
If you don't see that visual, make sure you've updated your Power BI Desktop version.
39
00:02:13,950 --> 00:02:16,060
I'm on December 2019.
40
00:02:16,470 --> 00:02:18,060
So make sure you're current.
41
00:02:18,330 --> 00:02:23,050
And let's just stretch this out to take up almost our entire canvas.
42
00:02:23,130 --> 00:02:27,850
Now here's the thing: before we just start dragging and dropping all willy-nilly here,
43
00:02:27,990 --> 00:02:33,300
let's take a minute and understand what the purpose or goal of this visual really is.
44
00:02:33,420 --> 00:02:36,390
Because this is not your average visual.
45
00:02:36,480 --> 00:02:43,020
Most chart types and templates, like bar charts and pie and area charts and histograms, whatever, they're designed
46
00:02:43,080 --> 00:02:49,230
to visualize concrete data. They just change the format; they bring it to life in a visual form.
47
00:02:49,230 --> 00:02:52,970
What the key influencers visual does is much more sophisticated.
48
00:02:53,010 --> 00:02:59,640
It actually helps us understand and expose the individual factors that drive some sort of an outcome
49
00:03:00,090 --> 00:03:05,460
and that outcome could be categorical like a project being successful or unsuccessful like we have in
50
00:03:05,460 --> 00:03:12,030
this case or maybe it's a customer review being positive or negative or that outcome could be numerical
51
00:03:12,030 --> 00:03:18,270
or continuous like determining what factors impact the price of a house or make the price of a house
52
00:03:18,360 --> 00:03:19,920
increase or decrease.
53
00:03:20,010 --> 00:03:22,230
That's what this visual is all about.
54
00:03:22,230 --> 00:03:28,230
And for anyone who wants to kind of dig deeper under the hood into the actual statistics and data science
55
00:03:28,350 --> 00:03:31,110
behind this it's outside the scope of this course.
56
00:03:31,110 --> 00:03:37,380
But basically what we're working with here are regression models either a logistic regression for categorical
57
00:03:37,380 --> 00:03:41,310
variables or linear regression for numerical variables.
58
00:03:41,310 --> 00:03:48,900
Basically the idea is to understand the dependence and the interrelationships, the correlation, between variables
59
00:03:49,170 --> 00:03:55,800
here in our table and specifically how changes to some independent variables impact the predicted value
60
00:03:56,340 --> 00:03:59,000
of our outcome or our dependent variable.
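For anyone who does want a peek under the hood, here is a minimal sketch of a logistic regression with one yes/no factor, written in plain Python. The data is invented for illustration, and the function name `fit_logistic` is hypothetical; this is not Power BI's implementation, just the general idea of how a change in an independent variable shifts the predicted odds of the outcome.

```python
import math

def fit_logistic(xs, ys, lr=0.5, steps=5000):
    """Fit outcome ~ sigmoid(b0 + b1 * x) by plain gradient descent."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))  # predicted probability
            g0 += p - y
            g1 += (p - y) * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

# Invented toy data: x = 1 means the factor is present, y = 1 means "successful".
xs = [1] * 10 + [0] * 10
ys = [1] * 6 + [0] * 4 + [1] * 3 + [0] * 7

b0, b1 = fit_logistic(xs, ys)
odds_ratio = math.exp(b1)  # how much the factor multiplies the odds of success
print(round(odds_ratio, 2))
```

A positive coefficient `b1` means the factor pushes the outcome toward "successful"; a real model would include many factors at once, which is where the "all else equal" wording comes from.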
61
00:03:59,010 --> 00:04:00,680
So all right that was a lot of words.
62
00:04:00,720 --> 00:04:01,740
I apologize.
63
00:04:01,770 --> 00:04:03,040
Getting a little heavy there.
64
00:04:03,090 --> 00:04:05,370
Now let's start playing with this and see what this is all about.
65
00:04:05,370 --> 00:04:11,640
So we've selected our visual here. We've got three different fields or wells that we can play with in
66
00:04:11,650 --> 00:04:19,140
the Visualizations tab: Analyze, Explain by, and Expand by. So Analyze is the outcome that we care about.
67
00:04:19,140 --> 00:04:23,760
So let's go ahead and start with that project outcome field which is categorical.
68
00:04:23,820 --> 00:04:31,150
And as you can see, it says, all right, what influences project outcome to be failed, live, or successful?
69
00:04:31,230 --> 00:04:36,600
I want to see what impacts successful projects and select successful there.
70
00:04:36,690 --> 00:04:43,230
And now my next step is to drag fields into this Explain by well, based on basically any field that
71
00:04:43,230 --> 00:04:46,800
I think might influence this outcome.
72
00:04:46,860 --> 00:04:47,090
OK.
73
00:04:47,100 --> 00:04:53,790
So if I think that country might be a factor here I could just drag country drop it in and the visual
74
00:04:53,790 --> 00:04:54,840
updates.
75
00:04:54,840 --> 00:04:58,340
So we see a lot of information here on the left side.
76
00:04:58,350 --> 00:05:05,600
We'll have a list of any key influencers, any factors that Power BI has deemed to be statistically significant
77
00:05:06,080 --> 00:05:09,140
and influential in impacting that outcome.
78
00:05:09,200 --> 00:05:14,660
In this case we just have one, and then on the right side we have a visual representation of our entire
79
00:05:14,660 --> 00:05:17,710
data set for a given field.
80
00:05:17,960 --> 00:05:21,210
So we've got five countries in our dataset.
81
00:05:21,290 --> 00:05:30,370
UK, US, Canada, France, and Australia, and Power BI has told us that there's one significant influencer
82
00:05:30,370 --> 00:05:36,950
here which is that when the country is the United Kingdom the likelihood of this project outcome being
83
00:05:36,950 --> 00:05:40,840
successful increases by 1.11 times.
84
00:05:40,880 --> 00:05:42,290
All else equal.
85
00:05:42,640 --> 00:05:49,280
And the way that that value or that factor is derived is by comparing the success rate or the project
86
00:05:49,280 --> 00:05:53,900
success outcome in the UK compared to the average country.
87
00:05:53,900 --> 00:05:56,210
So all other countries averaged out.
88
00:05:56,270 --> 00:06:03,900
So in the UK, for the 5,856 projects, 42 percent of those were successful.
89
00:06:04,040 --> 00:06:09,170
When we compare that against all other countries averaged out, we see that only
90
00:06:09,170 --> 00:06:11,530
36.24 percent are successful otherwise.
91
00:06:11,870 --> 00:06:18,980
So quite an increase there and that's why when country equals UK that's identified as a key influencer.
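To make that comparison concrete, here is a small sketch in plain Python that computes the success rate for one country against all other countries pooled together. The rows are invented toy data, and the helper names are hypothetical. Keep in mind that the factor the visual reports comes out of its model, so a plain ratio of two rates like this one will not necessarily match the number shown on screen.

```python
# Invented toy data: (country, outcome) pairs, not the real Kickstarter table.
projects = [
    ("UK", "successful"), ("UK", "successful"), ("UK", "failed"),
    ("UK", "failed"), ("UK", "successful"),
    ("US", "failed"), ("US", "successful"), ("US", "failed"),
    ("CA", "failed"), ("CA", "successful"),
]

def success_rate(rows):
    """Share of rows whose outcome is 'successful'."""
    return sum(1 for _, outcome in rows if outcome == "successful") / len(rows)

def rate_vs_rest(rows, country):
    """Success rate inside one country versus every other country pooled."""
    in_group = [r for r in rows if r[0] == country]
    rest = [r for r in rows if r[0] != country]
    return success_rate(in_group), success_rate(rest)

uk_rate, rest_rate = rate_vs_rest(projects, "UK")
lift = uk_rate / rest_rate  # how many times higher the rate is inside the group
print(uk_rate, rest_rate, lift)
```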
92
00:06:18,980 --> 00:06:24,230
Now note that we only have one influencer here in the list on the left but we're seeing five countries
93
00:06:24,230 --> 00:06:25,280
here on the right.
94
00:06:25,490 --> 00:06:32,900
And that's because whether or not a value or factor is noted as an influencer has to do with a number
95
00:06:32,900 --> 00:06:33,800
of things.
96
00:06:33,800 --> 00:06:38,120
For one it has to do with the difference of success rate compared to the average.
97
00:06:38,120 --> 00:06:44,170
So for instance, the US and Canada are very, very similar to the average. And it also has to do with volume.
98
00:06:44,240 --> 00:06:50,570
So you can see the count of Kickstarter projects from France, only 979; from Australia,
99
00:06:50,600 --> 00:06:58,230
1,500; compared to the US, which is almost 33,000, and the UK, which is almost 6,000.
100
00:06:58,220 --> 00:07:04,700
So those factors all go into this regression model behind the scenes in order to determine this list
101
00:07:04,700 --> 00:07:05,720
of key influencers.
102
00:07:06,230 --> 00:07:12,260
So at this point we're only looking at one factor country but we obviously know that there are other
103
00:07:12,260 --> 00:07:17,110
things that determine if a project ends up being successful or not.
104
00:07:17,120 --> 00:07:23,570
So all we need to do is think about okay what other factors might possibly be influential here and maybe
105
00:07:23,570 --> 00:07:25,780
category is a factor as well.
106
00:07:26,060 --> 00:07:32,300
So we can drag category in right next to our country, and now all of a sudden we see a totally new list
107
00:07:32,390 --> 00:07:33,950
of influencers here.
108
00:07:33,950 --> 00:07:40,970
In fact, that "country equals UK" influencer is now pushed way down to number seven in the list
109
00:07:41,390 --> 00:07:46,380
and we have these category influencers which are outweighing it significantly.
110
00:07:46,400 --> 00:07:52,430
So now we're actually seeing that when you factor in or consider country and category the number one
111
00:07:52,430 --> 00:07:58,940
influencer is when the category equals comics, followed by dance projects, theatre projects, music projects,
112
00:07:59,330 --> 00:08:00,630
and so on and so forth.
113
00:08:00,770 --> 00:08:03,530
And you can continue with this process.
114
00:08:03,530 --> 00:08:08,750
Right now we're looking at two categorical fields but we could pull continuous or numerical fields in
115
00:08:08,750 --> 00:08:15,480
here as well like the number of backers or the total amount pledged in US dollars.
116
00:08:15,590 --> 00:08:21,470
And now what you're seeing, kind of as you'd expect, is that these new fields we just pulled in are the
117
00:08:21,470 --> 00:08:24,800
new biggest or most influential factors.
118
00:08:24,800 --> 00:08:25,660
And it makes sense.
119
00:08:25,670 --> 00:08:31,490
The number one factor is that if you have a project that raised more than
120
00:08:31,490 --> 00:08:36,800
$4,968, you're almost four times more likely to have a successful project.
121
00:08:36,800 --> 00:08:39,500
All else equal and that makes sense.
122
00:08:39,500 --> 00:08:42,330
Same story here with backers more than 100.
123
00:08:42,380 --> 00:08:46,850
Here's the thing: you raise more money, you get more backers, your project is more likely to hit
124
00:08:46,850 --> 00:08:49,660
its target so that all is intuitive.
125
00:08:49,670 --> 00:08:51,140
That makes sense.
126
00:08:51,140 --> 00:08:56,270
And when we click on one of those factors that's based on a continuous field, what Power BI does here
127
00:08:56,270 --> 00:09:02,150
is it kind of bins those values, just like you would with a histogram, into different chunks.
128
00:09:02,150 --> 00:09:08,540
So more than 4,968, between 2,635 and 4,968, you can
129
00:09:08,540 --> 00:09:13,630
kind of see how each of those bins of values compares to the average.
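If it helps to picture the binning, here is a rough sketch in plain Python. The two bin edges, 2,635 and 4,968, come from the demo; the rows are invented, the names are hypothetical, and how Power BI actually picks its bin edges is not covered here.

```python
from bisect import bisect_left

# Bin edges taken from the demo; everything else below is invented toy data.
EDGES = [2635, 4968]
LABELS = ["2,635 or less", "between 2,635 and 4,968", "more than 4,968"]

def bin_index(pledged):
    """Return 0, 1, or 2 depending on which chunk the pledged amount falls in."""
    return bisect_left(EDGES, pledged)

# (pledged in USD, was the project successful?)
rows = [(500, False), (1200, False), (3000, True),
        (4000, False), (5200, True), (9000, True)]

by_bin = {}
for pledged, success in rows:
    by_bin.setdefault(bin_index(pledged), []).append(success)

# Success rate inside each bin, ready to compare against the overall average.
rates = {LABELS[b]: sum(v) / len(v) for b, v in sorted(by_bin.items())}
overall = sum(s for _, s in rows) / len(rows)
print(rates, overall)
```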
130
00:09:13,730 --> 00:09:19,850
If we had a clear linear relationship here, Power BI might plot this as a scatter plot with a line
131
00:09:19,850 --> 00:09:25,550
of best fit. But the one thing that's kind of missing to this point is that we're sorting by default
132
00:09:25,640 --> 00:09:29,450
based on the impact of these influence factors here.
133
00:09:29,450 --> 00:09:32,160
But what we don't know at first glance is, OK,
134
00:09:32,180 --> 00:09:38,240
this is an influential factor, but how much of the dataset does it actually represent?
135
00:09:38,240 --> 00:09:39,260
Is it a large portion?
136
00:09:39,260 --> 00:09:44,430
Is it a very small piece? And what you can do is hover over these bubbles here and it will tell you.
137
00:09:44,540 --> 00:09:52,460
So this influencer contains 25.38 percent of the data, this one contains 19 percent.
138
00:09:52,460 --> 00:09:56,690
This one down here contains under 1 percent.
139
00:09:56,870 --> 00:10:02,690
And what you can do to make this a little bit more clear is actually go into the Format pane here, drill
140
00:10:02,690 --> 00:10:09,200
into the analysis options, and you can enable counts. And what that does, it's kind of subtle, but it adds
141
00:10:09,200 --> 00:10:14,450
that little ring around each of these bubbles which represents the percentage.
142
00:10:14,450 --> 00:10:20,650
Right now it's based on an absolute percentage so 25 percent actually looks like a quarter of the circle.
143
00:10:20,750 --> 00:10:27,050
You could change that to relative, which makes the largest factor 100 percent and then indexes all the
144
00:10:27,050 --> 00:10:28,460
other ones accordingly.
145
00:10:28,460 --> 00:10:31,820
Sometimes that can make it a little bit easier to see and interpret.
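The difference between the absolute and relative count types is just a change of denominator, which this short Python sketch shows. The influencer names and counts are invented for illustration and are not read from the demo file.

```python
# Invented counts: how many rows each influencer covers, out of 10,000 total.
counts = {"influencer A": 2538, "influencer B": 1900, "influencer C": 90}
total_rows = 10_000

# Absolute: share of the whole dataset, so 25% fills a quarter of the ring.
absolute = {name: n / total_rows for name, n in counts.items()}

# Relative: the largest influencer becomes 100% and the rest are indexed to it.
largest = max(counts.values())
relative = {name: n / largest for name, n in counts.items()}

print(absolute)
print(relative)
```

With relative counts the small influencers become easier to tell apart, at the cost of no longer reading directly as a share of the data.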
146
00:10:32,360 --> 00:10:37,280
And the last thing that that actually does is it gives us an option here to sort either by impact, which
147
00:10:37,280 --> 00:10:43,850
we're doing by default or we can sort by the count and now we see that actually this factor here which
148
00:10:43,850 --> 00:10:51,460
isn't a very big influencer is a big piece of our data. It represents almost 26 percent of our dataset.
149
00:10:51,670 --> 00:10:52,020
Now,
150
00:10:52,130 --> 00:10:57,380
last thing to cover really quickly here before we move on to the next demo, which is going to cover continuous
151
00:10:57,380 --> 00:11:04,400
variables, is this Top segments tab. And what the Top segments tab does is it actually runs a cluster
152
00:11:04,400 --> 00:11:10,760
analysis behind the scenes, and what Power BI is doing here is it's combining factors into segments
153
00:11:10,760 --> 00:11:15,850
or populations that seem to have a very high level of influence.
154
00:11:15,950 --> 00:11:20,040
So you can click on any of these to see more information about them.
155
00:11:20,060 --> 00:11:26,600
So segment 1 is defined as a segment where the number of backers is greater than one hundred and the
156
00:11:26,600 --> 00:11:29,190
category is not games.
157
00:11:29,210 --> 00:11:29,660
OK.
158
00:11:29,720 --> 00:11:35,770
So within that segment almost 90 percent of those projects were successful.
159
00:11:35,910 --> 00:11:38,370
And that's compared to an average of only 37 percent.
160
00:11:38,630 --> 00:11:44,810
So a very, very influential, very successful segment that we've defined here. You can also see that it contains
161
00:11:44,810 --> 00:11:50,090
about 6,400 records, which is about 14.6 percent of our data.
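To see what those segment numbers mean, here is a minimal Python sketch that applies the segment 1 rule from the demo (more than 100 backers, category is not games) to a handful of invented rows. It only evaluates a rule that is already known; finding good segments in the first place is the part Power BI does for you, and that is not shown here.

```python
# Invented toy rows: (backers, category, successful?)
rows = [
    (150, "Comics", True), (300, "Music", True), (120, "Games", True),
    (20, "Comics", False), (5, "Music", False), (200, "Dance", False),
    (80, "Games", False), (110, "Theater", True),
]

def in_segment(backers, category):
    """Segment 1 as described in the demo: backers > 100 and category is not Games."""
    return backers > 100 and category != "Games"

segment = [r for r in rows if in_segment(r[0], r[1])]

segment_rate = sum(r[2] for r in segment) / len(segment)  # success rate inside the segment
overall_rate = sum(r[2] for r in rows) / len(rows)        # success rate across all rows
segment_share = len(segment) / len(rows)                  # how much of the data it covers

print(segment_rate, overall_rate, segment_share)
```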
162
00:11:50,120 --> 00:11:55,580
You can learn more and explore down here but that's basically the idea you can click through see which
163
00:11:55,580 --> 00:11:58,240
types of segments Power BI is defining.
164
00:11:58,230 --> 00:12:03,650
And this can help you understand things like, you know, who your target audience really is. You know, maybe
165
00:12:03,650 --> 00:12:10,820
you realize that middle aged women in the Northwest are your best customers and you're really not resonating
166
00:12:10,820 --> 00:12:13,100
with men in the south.
167
00:12:13,100 --> 00:12:18,590
Those are the types of insights that this sort of segment analysis can really enable.
168
00:12:18,590 --> 00:12:19,810
So there you go.
169
00:12:19,820 --> 00:12:25,490
That's the key influencers visual, looking at a categorical outcome like we are here. In the next demo
170
00:12:25,490 --> 00:12:30,800
we'll keep this example going and instead of looking at a categorical outcome we'll pull in a continuous
171
00:12:30,800 --> 00:12:33,430
variable there and check out some of the differences.
172
00:12:33,440 --> 00:12:33,950
Stay tuned.