plt.scatter Video Lecture Transcript

This transcript was automatically generated by Zoom, so there may be discrepancies between the video and the text.

15:39:35 Hi, everybody! Welcome back! As we continue to learn more about Matplotlib, in this video we're gonna learn about the scatter function of Matplotlib. So go ahead and open up scatter. I'm gonna open up the complete version, which will be complete by the end of this video.

15:39:52 So let's talk about scatter plots in this notebook. Before we even dive into Matplotlib, I'll give a quick rundown of what a scatter plot is. This is one of the most used plots in data visualization. It marks the values of one variable of interest against another, so think of it as y versus x. You go out and you collect a bunch of data about something. Maybe you suspect that whatever your variable y is, it is impacted by the value of variable x. Then, depending on what your x and y actually are, you might make a scatter plot, where you just plot the points (x, y) to see how y changes with different values of x.

15:40:37 So some examples: maybe you want to see an adult's full-grown height against their parent's full-grown height, which is an example we'll use in this notebook. Maybe you want to mark the occurrence of some event at various locations by plotting the latitude and longitude pairs. This is a very famous example from John Snow's work on public health. Maybe you want to see the volume of chemical Y that results from an experiment using various volumes of chemical X. The examples could go on and on; scatter plots really are one of the most used plot types.

15:41:13 Because they're so popular and so ubiquitous, it's important to know how to make one in Matplotlib.
15:41:22 And the proper way to make one is with the scatter command. While we could make one with the plot command and just turn off our lines, the appropriate way to make one in Matplotlib is the scatter command. So why do we use the scatter command over the plot command? Well, the scatter command is going to offer us some functionality that is not available in the plot command alone, and we'll see what that functionality is in this notebook.

15:41:39 To make a scatter plot, you're going to call either plt.scatter or ax.scatter, and then, like plt.plot, you'll put in the x argument followed by the y argument. These are the horizontal positions and the vertical positions.

15:41:59 So we're gonna look at an example from Galton's famous study, which is kind of the origin of the phrase "regressing to the mean." It compares the height of someone who's fully grown against the height of their mother. In this example we're gonna look at people who are daughters, so born female, and then look at the height of their mothers, and see how those two relate. We're gonna use just a subset of 100 observations from the data set. If you're interested in the paper, you can click on this link here and it will pop up.

15:42:41 So x is gonna have the heights of the mothers, their full-grown heights, and then y will have the heights of the daughters, their full-grown heights. And here's just a look at the first 5 rows of this random sample from the data set.

15:43:00 So let's go ahead, and we're just gonna use scatter. We're thinking that the daughter's height, maybe, is impacted by the mother's height. So the x argument, or the horizontal variable, will be the height of the mother.
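The basic call described here can be sketched as follows. This is a minimal illustration, not the notebook's own code: the heights are synthetic stand-ins generated with NumPy (they are not Galton's measurements), and the names rng, fig, ax, and points are chosen only for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in data: these are NOT Galton's measurements,
# just made-up heights (in inches) so the example runs on its own.
rng = np.random.default_rng(0)
x = rng.normal(64.0, 2.5, size=100)                  # "mother" heights
y = 32.0 + 0.5 * x + rng.normal(0.0, 2.0, size=100)  # "daughter" heights

fig, ax = plt.subplots()
points = ax.scatter(x, y)  # horizontal positions first, then vertical

# plt.scatter(x, y) draws the same thing on the current axes.
# In a script, call plt.show() to display the figure.
```

Whether to use plt.scatter or ax.scatter is mostly a matter of style; the ax form is used here only because it makes explicit which axes the points land on.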
15:43:15 We've already gone ahead and stored that in x, so x will go first. After that we're going to put the daughter's height, which I've already stored in y, so we'll put that as the second argument. And then, when I call this, we'll see our scatter plot. So here are our points, and we have the mother's height on the horizontal axis and the daughter's height on the vertical. They're not labeled, which is bad form, but we're gonna worry about labels and formatting the plot in a later notebook. For now we're just focusing on .scatter.

15:43:48 So how can we customize these points further? What are some of those arguments I told you about that plot alone does not have? Well, plot did have a marker type, but we still need to learn how to do that in scatter. So what is the marker type's argument? It's again controlled by the marker argument, and these are the exact same markers that we saw in the plot video and notebook. You're going to put in your x, your y, and then marker equals. So now I can make them all x's if I want: marker equals lowercase x will make them all x's. Similarly, you could do marker equals capital X, and now they're bold X's. I prefer the lowercase ones myself, because it's a little easier to see that they're x's. Again, there are a number of pre-made markers, which you can find here. I'll leave it to you to explore more on your own.

15:44:39 You can control the marker size with the s argument, s for size. You can make all of them the same size if you just feed s a non-negative float. The size of your markers is given by this formula, in inches. It's not very useful, but you might be wondering what the units are. So in my plt.scatter, I set s equal to 10.
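As a rough sketch of the marker and s arguments discussed so far, assuming made-up data in place of the notebook's heights sample (the variable names are illustrative only). For the units: per the Matplotlib documentation, s is the marker area in points squared, where a point is 1/72 of an inch.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up data standing in for the notebook's x (mothers) and y (daughters).
rng = np.random.default_rng(0)
x = rng.normal(64.0, 2.5, size=100)
y = 32.0 + 0.5 * x + rng.normal(0.0, 2.0, size=100)

fig, (ax_thin, ax_bold, ax_small) = plt.subplots(1, 3, figsize=(12, 4))

thin = ax_thin.scatter(x, y, marker="x")   # thin x-shaped markers
bold = ax_bold.scatter(x, y, marker="X")   # filled, bolder X markers
small = ax_small.scatter(x, y, s=10)       # default circles, all with s=10

ax_thin.set_title('marker="x"')
ax_bold.set_title('marker="X"')
ax_small.set_title("s=10")
```

Putting the three variants side by side is just a convenience for comparison; each call works the same way on its own axes.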
15:45:08 Now all of my markers are going to be of this size, in inches. And then here's another example, where now I'm gonna set s equal to 100. You can see how they're bigger, and maybe you can try and argue to yourself, "Oh yeah, those look about 10 times bigger than the smaller ones."

15:45:27 You can have differing sizes as well. So I'm gonna make a random array of sizes. This is going to be a uniform random draw of the same length as the data, and we can look at it. So here's this variable, sizes: here's the array, with all these different sizes. Then I can set the s argument equal to sizes, and when I do this, you're gonna see that the markers are differing sizes. These sizes are determined by this array, so whatever the 0 entry is, and we can even look: let's print x at 0 and y at 0. So the point at (66, 67) should be one of our larger points. (66, 67) is about here, and this is a pretty large point.

15:46:39 You can change the color of your markers with the c argument. We can make everything the same color by setting c equal to a string of the color. So here I make all of my points green by doing c equals the string green. Once again, you can use the same colors from plt.plot, so this big, long list of colors here that I got from the Matplotlib documentation. Or you can go to that hex code color picker website and use that as well.

15:47:13 You can also make it so that each point is its own unique color. Here's one example where I'm going to randomly choose a color from red, black, and dodgerblue for every point. So here's that array. Now I set c equal to colors1, which is the variable that I stored all those choices in.
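The per-point sizes and named colors described here might look something like the sketch below. The data is synthetic, the size range of 5 to 200 is an arbitrary choice for illustration, and sizes and colors1 simply mirror the variable names mentioned in the lecture; the notebook's actual random draws will differ.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up data standing in for the notebook's heights sample.
rng = np.random.default_rng(0)
x = rng.normal(64.0, 2.5, size=100)
y = 32.0 + 0.5 * x + rng.normal(0.0, 2.0, size=100)

# One size and one color name per point (ranges and choices are illustrative).
sizes = rng.uniform(5.0, 200.0, size=len(x))
colors1 = rng.choice(["red", "black", "dodgerblue"], size=len(x))

fig, (ax_s, ax_g, ax_c) = plt.subplots(1, 3, figsize=(12, 4))
sized = ax_s.scatter(x, y, s=sizes)      # entry i of sizes sets the size of point i
green = ax_g.scatter(x, y, c="green")    # one color string colors every point
colored = ax_c.scatter(x, y, c=colors1)  # entry i of colors1 colors point i

# Look up the first point and the size it was assigned.
print(x[0], y[0], sizes[0])
```

Because the arrays line up with the data by position, checking entry 0 of x, y, and sizes is a quick way to confirm which marker got which size.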
15:47:39 You can see how each point is its own unique color. Well, maybe not unique, since there are repeats, but each point gets its color from this array.

15:47:53 So this was an array where I specified the color according to the name of the color. You can also specify the color according to a number. If I let my c argument be an array of numbers, it's gonna use a color map. A color map is a map from a range of numbers to a range of colors. This is a linear mapping where the lowest value of the range of numbers is assigned to one color, the highest value of that range is assigned to another, and all of the numbers in between are assigned linearly to a gradated color, like a transition, in one example we'll see, from white all the way up to black. We'll see this in a little more detail in a later notebook, where we specifically talk about color maps. For now, just know that a color map is a way to take a set of numbers and map it, linearly by default (there are other ways), to a gradation of colors from one color to another.

15:49:12 So here I'm making a second random array of colors called colors2, and these are values from 0 to 1 that are randomly selected. Now I set c equal to colors2, and I set a color map. This one is viridis, which, again, we'll talk about more in a later notebook. Now you can see how each point, and maybe I'll make the points a little bit bigger so we can see them better, let's say 50. Hmm, maybe 100. Now you can see how each point is its own color, and those colors are being determined by this array. Again, we're going to see a little bit more about color maps in a later notebook.
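Here is a small sketch of the numeric c plus cmap combination, again on synthetic data. The name colors2 follows the lecture, while mapped and the colorbar are additions for illustration; the colorbar is not mentioned in the video but makes the number-to-color mapping visible.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up data standing in for the notebook's heights sample.
rng = np.random.default_rng(0)
x = rng.normal(64.0, 2.5, size=100)
y = 32.0 + 0.5 * x + rng.normal(0.0, 2.0, size=100)

# One number between 0 and 1 per point; scatter maps these through the color map.
colors2 = rng.uniform(0.0, 1.0, size=len(x))

fig, ax = plt.subplots()
mapped = ax.scatter(x, y, c=colors2, cmap="viridis", s=100)

# Optional extra (not from the lecture): show which color each value maps to.
fig.colorbar(mapped, ax=ax, label="colors2 value")
```

With the default normalization, the smallest value in colors2 lands at one end of viridis and the largest at the other, with everything in between placed linearly.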
15:49:56 But for now, know that this is an option for plt.scatter.

15:50:00 Just like with your lines, you can control how see-through your points are with the alpha argument. Remember, alpha equals 1 means a completely solid color, and alpha equals 0 means a completely see-through color. You can do this uniformly by setting alpha equal to a number. So here's alpha equals 1, totally solid. Here's alpha equals 0.5, about halfway to being solid.

15:50:27 You can also set it so that each point has its own unique opaqueness, so its own unique see-throughness. Here's another random array, and this random array can now be fed into the alpha argument. Each point will have its opaqueness set according to the values here. So that point, I think it was (66, 67), will be pretty see-through, 'cause it's got an alpha of 0.197. So (66, 67), if you can see it, and we can zoom in: that's pretty see-through.

15:51:04 Just like the inside color can be set with c, you can also set the edge color. The edge color is set with the edgecolor argument. I'm going to set the interior color to white to help us better see the edge color, and then I'm gonna set edgecolor equal to black. Now, if I zoom in, you can see the inside is white, but the outside of these circles is black.

15:51:33 Each of these edges can also have its own unique color. You can set edgecolor to an array of possible edge color values. But I want to point out that there's no way to set the edge colors directly from a continuous variable.
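The alpha and edgecolor options covered in this stretch can be sketched like this. The data and the random arrays are made up for illustration, and the names alphas and edge_choices are hypothetical. Passing an array to alpha assumes a reasonably recent Matplotlib (3.4 or later, as far as I know).

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up data standing in for the notebook's heights sample.
rng = np.random.default_rng(0)
x = rng.normal(64.0, 2.5, size=100)
y = 32.0 + 0.5 * x + rng.normal(0.0, 2.0, size=100)

alphas = rng.uniform(0.1, 1.0, size=len(x))                       # one opacity per point
edge_choices = rng.choice(["red", "blue", "black"], size=len(x))  # one edge color per point

fig, axes = plt.subplots(2, 2, figsize=(9, 8))

half = axes[0, 0].scatter(x, y, alpha=0.5)       # every point half transparent
varied = axes[0, 1].scatter(x, y, alpha=alphas)  # per-point transparency
edged = axes[1, 0].scatter(x, y, c="white", edgecolor="black")        # white fill, black edge
multi = axes[1, 1].scatter(x, y, c="white", edgecolor=edge_choices)   # per-point edge color
```

The white interior in the bottom row is only there to make the edges easier to see, as in the lecture.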
You can't do this color map part with the edges, unless you want your edges to be the exact same color as the inside. So you couldn't have one continuous variable set the color of the inside and a separate continuous variable set the color of the edge. Even if, let's say, the inside was all one uniform color, you can't set the edge color to a variable like that. You can set it to be something like this, where some have red edges, some have blue edges, and some have black edges, but you cannot do the same thing with color maps.

15:52:33 You can set your edge width with the confusingly named linewidths argument. I would think it would be edgewidth, but it's linewidths. This can take in a float or an array of floats. So, for instance, here we can set linewidths equal to 2 to double the size of the edges, and we can compare: they do look thicker, those edges are thicker. Here I'm also setting up a random array of widths, just like we've done with the other examples, and now each circle will have its own width determined by that random draw. You can see some of these are a little bit thinner, and others are much thicker. And I guess if we really wanted to step it up, we can change this 2 to a 10, and we can increase the size. Now you can really tell the difference, right?

15:53:29 Similar to plot, scatter also works really well with datetime input. So if either your horizontal variable or your vertical variable has datetime-formatted entries, it will accommodate those well also.

15:53:45 Okay, so I would say that we now, between these two notebooks, have a good understanding of the plot and scatter commands.
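To round out the options above, here is a hedged sketch of linewidths and of datetime input. Everything in it is illustrative: the heights are synthetic, the widths range is arbitrary, and the dates array is a hypothetical 30-day series built with NumPy rather than anything from the notebook.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up data standing in for the notebook's heights sample.
rng = np.random.default_rng(0)
x = rng.normal(64.0, 2.5, size=100)
y = 32.0 + 0.5 * x + rng.normal(0.0, 2.0, size=100)

widths = rng.uniform(0.5, 5.0, size=len(x))  # one edge width per point

fig, (ax_u, ax_v) = plt.subplots(1, 2, figsize=(9, 4))
thick = ax_u.scatter(x, y, c="white", edgecolor="black", linewidths=2)
mixed = ax_v.scatter(x, y, c="white", edgecolor="black", linewidths=widths, s=100)

# Datetime input: a hypothetical daily series, 2023-01-01 through 2023-01-30.
dates = np.arange("2023-01-01", "2023-01-31", dtype="datetime64[D]")
values = rng.normal(size=len(dates))

fig2, ax_d = plt.subplots()
dated = ax_d.scatter(dates, values)  # scatter converts the dates for the axis
fig2.autofmt_xdate()                 # tilt the date labels so they don't overlap
```

The larger s in the second panel is only there so the differences in edge width are easier to spot.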
15:53:56 These are really only two commands, but they give you a lot of flexibility to make really interesting plots. These are probably going to be the commands that you use the most when you're making plots in Matplotlib, and we know them really well. We haven't covered everything about them; otherwise those would be pretty long notebooks. But if you want to learn more, I encourage you to check out the documentation that we've linked to in both notebooks and explore more on your own. Try different things.

15:54:13 Then, before we go and learn about more chart types, because there are a lot of charts that you can make in Matplotlib, I want to take a second to go over what happens when I try to add more than one plot element onto a single axes object. So what if I want to put a line plot over a scatter plot? How does that work in Matplotlib? This will help us because it's pretty much the same in every plotting package in Python.

15:54:41 Okay, so that's going to do it for this notebook. I really hope you enjoyed learning about scatter plots. I enjoyed having you here learning about scatter plots, and I will see you, hopefully, in the next video.