box-and-whisker plot
(noun)
a convenient way of graphically depicting groups of numerical data through their quartiles
Examples of box-and-whisker plot in the following topics:
-
Making a Box Model
- A box plot (also called a box-and-whisker diagram) is a simple visual representation of key features of a univariate sample.
- Another common extension of the box model is the 'box-and-whisker' plot, which adds vertical lines extending from the top and bottom of the plot to, for example, the maximum and minimum values.
- Alternatively, the whiskers could extend to the 2.5 and 97.5 percentiles.
- Finally, it is common in the box-and-whisker plot to show outliers (however defined) with asterisks at the individual values beyond the ends of the whiskers.
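The two whisker conventions mentioned above (minimum and maximum, or the 2.5 and 97.5 percentiles) can be sketched in a few lines of Python. This is only an illustration: the function names `whisker_ends` and `beyond_whiskers` are hypothetical, and the percentile method (`statistics.quantiles` with `method="inclusive"`) is an assumption, since the excerpts do not say how percentiles are computed.

```python
from statistics import quantiles


def whisker_ends(values, rule="minmax"):
    """Return (low, high) whisker endpoints under one of two conventions.

    rule="minmax":     whiskers reach the smallest and largest values.
    rule="percentile": whiskers reach the 2.5th and 97.5th percentiles.
    """
    data = sorted(values)
    if rule == "minmax":
        return data[0], data[-1]
    if rule == "percentile":
        # n=40 yields 39 cut points spaced 2.5 percentage points apart,
        # so the first and last are the 2.5th and 97.5th percentiles.
        cuts = quantiles(data, n=40, method="inclusive")
        return cuts[0], cuts[-1]
    raise ValueError(f"unknown rule: {rule}")


def beyond_whiskers(values, low, high):
    """Values past the whisker ends, i.e. candidates to mark individually."""
    return [v for v in values if v < low or v > high]
```

With min/max whiskers no value can fall beyond the ends, so individual outlier marks only appear under a narrower convention such as the percentile rule.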
-
Box Plots
- Box plots or box-whisker plots give a good graphical image of the concentration of the data.
- To construct a box plot, use a horizontal number line and a rectangular box.
- The "whiskers" extend from the ends of the box to the smallest and largest data values.
- The box plot gives a good quick picture of the data.
- NOTE: You may encounter box and whisker plots that have dots marking outlier values.
-
Box plots, quartiles, and the median
- A box plot summarizes a data set using five statistics while also plotting unusual observations.
- Figure 1.25 provides a vertical dot plot alongside a box plot of the num_char variable from the email50 data set.
- The IQR is the length of the box in a box plot.
- In a sense, the box is like the body of the box plot and the whiskers are like its arms trying to reach the rest of the data.
- A vertical dot plot next to a labeled box plot for the number of characters in 50 emails.
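As a rough illustration of the five statistics and of the IQR as the length of the box, the sketch below computes both with Python's standard library. The helper names are hypothetical, and the quartile rule (`method="inclusive"`) is an assumption: textbooks and software use several slightly different quartile definitions, so results on small samples may differ from a given book's figures.

```python
from statistics import quantiles


def five_number_summary(values):
    """Minimum, first quartile, median, third quartile, and maximum."""
    data = sorted(values)
    # n=4 returns the three cut points Q1, Q2 (the median), and Q3.
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return {"min": data[0], "q1": q1, "median": q2, "q3": q3, "max": data[-1]}


def iqr(values):
    """Interquartile range: the length of the box, Q3 - Q1."""
    summary = five_number_summary(values)
    return summary["q3"] - summary["q1"]
```

For the made-up sample 1 through 9, this gives quartiles of 3, 5, and 7, so the box spans 3 to 7 and the IQR is 4.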
-
Box Plots
- Box plots are useful for identifying outliers and for comparing distributions.
- Continuing with the box plots, we put "whiskers" above and below each box to give additional information about the spread of data.
- Whiskers are drawn from the upper and lower hinges to the upper and lower adjacent values (24 and 14 for the women's data).
- Although we don't draw whiskers all the way to outside or far out values, we still wish to represent them in our box plots.
- Box plots showing the individual scores and the means.
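The hinge, adjacent-value, and outside/far-out vocabulary can be made concrete with a short sketch. It assumes the usual Tukey-style conventions, which the excerpts above do not spell out: a step is 1.5 times the distance between the hinges, inner fences lie one step beyond each hinge, and outer fences lie two steps beyond. The hinges are approximated here by the 25th and 75th percentiles, and the function name `whisker_summary` is hypothetical.

```python
from statistics import quantiles


def whisker_summary(values, step_factor=1.5):
    """Adjacent values plus outside and far-out values for a box plot."""
    data = sorted(values)
    # Assumption: hinges taken as the 25th and 75th percentiles.
    lower_hinge, _, upper_hinge = quantiles(data, n=4, method="inclusive")
    step = step_factor * (upper_hinge - lower_hinge)

    inner_low, inner_high = lower_hinge - step, upper_hinge + step
    outer_low, outer_high = lower_hinge - 2 * step, upper_hinge + 2 * step

    # Adjacent values: the most extreme data points still inside the inner fences.
    inside = [v for v in data if inner_low <= v <= inner_high]
    # Outside values: beyond an inner fence but not beyond an outer fence.
    outside = [v for v in data
               if outer_low <= v < inner_low or inner_high < v <= outer_high]
    # Far-out values: beyond an outer fence.
    far_out = [v for v in data if v < outer_low or v > outer_high]

    return {
        "lower_adjacent": inside[0],
        "upper_adjacent": inside[-1],
        "outside": outside,
        "far_out": far_out,
    }
```

For the made-up sample 1, 2, ..., 9, 20, 40, the whiskers would stop at 1 and 9, with 20 reported as an outside value and 40 as a far-out value.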
-
Graphs for Quantitative Data
- Plots play an important role in statistics and data analysis.
- The procedures here can broadly be split into two parts: quantitative and graphical.
- Below are brief descriptions of some of the most common plots:
- Box plot: In descriptive statistics, a boxplot, also known as a box-and-whisker diagram, is a convenient way of graphically depicting groups of numerical data through their five-number summaries (the smallest observation, lower quartile (Q1), median (Q2), upper quartile (Q3), and largest observation).
- This is an example of a scatter plot, depicting the waiting time between eruptions and the duration of the eruption for the Old Faithful geyser in Yellowstone National Park, Wyoming, USA.
-
Comparing numerical data across groups
- Here two convenient methods are introduced: side-by-side box plots and hollow histograms.
- The side-by-side box plot is a traditional tool for comparing across groups.
- An example is shown in the left panel of Figure 1.43, where there are two box plots, one for each group, placed into one plotting window and drawn on the same scale.
- (Looking into the data set, we would find that 8 of these 15 counties are in Alaska and Texas.) The box plots indicate there are many observations far above the median in each group, though we should anticipate that many observations will fall beyond the whiskers when using such a large data set.
- The side-by-side box plots are especially useful for comparing centers and spreads, while the hollow histograms are more useful for seeing distribution shape, skew, and groups of anomalies.
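A side-by-side box plot compares centers and spreads on a common scale, and the same comparison can be done numerically. The sketch below summarizes each group by its median and IQR; the function name `compare_groups` and the group labels and values in the usage example are made up for illustration and are not the county data discussed above.

```python
from statistics import median, quantiles


def compare_groups(groups):
    """Map each group name to its center (median) and spread (IQR)."""
    summary = {}
    for name, values in groups.items():
        q1, _, q3 = quantiles(sorted(values), n=4, method="inclusive")
        summary[name] = {"median": median(values), "iqr": q3 - q1}
    return summary


# Hypothetical data: two groups measured on the same scale.
example = compare_groups({
    "group_a": [1, 2, 3, 4, 5],
    "group_b": [10, 20, 30, 40, 50],
})
```

These numbers capture what the boxes show, but not distribution shape or skew, which is where the hollow histograms are more informative.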
-
Interquartile Range
- The IQR is used to build box plots, which are simple graphical representations of a probability distribution.
- A box plot separates the quartiles of the data.
- The box starts at the lower quartile and ends at the upper quartile, so the difference, or length of the box, is the IQR.
- If you wanted to leave out the outliers for a more accurate reading, you would subtract the values at the ends of both "whiskers":
-
Graphing Quantitative Variables
- The upcoming sections cover the following types of graphs: (1) stem and leaf displays, (2) histograms, (3) frequency polygons, (4) box plots, (5) bar charts, (6) line graphs, (7) scatter plots, and (8) dot plots.
- Some graph types such as stem and leaf displays are best-suited for small to moderate amounts of data, whereas others such as histograms are best-suited for large amounts of data.
- Graph types such as box plots are good at depicting differences between distributions.
- Scatter plots are used to show the relationship between two variables.
-
Examining numerical data exercises
- Describe the distribution in the histograms below and match them to the box plots.
- What characteristics of the distribution are apparent in the histogram and not in the box plot?
- (a) What features of the distribution are apparent in the histogram and not the box plot?
- The box plot makes it easy to identify more precise values of observations outside of the whiskers. 1.41 (a) The median is better; the mean is substantially affected by the two extreme observations.
- The box plot shows the distribution of finishing times for male and female marathon winners.
-
Statistical Graphics
- Statistical graphics allow results to be displayed in some sort of pictorial form and include scatter plots, histograms, and box plots.
- They include plots such as scatter plots, histograms, probability plots, residual plots, box plots, block plots and bi-plots.
- Many familiar forms, including bivariate plots, statistical maps, bar charts, and coordinate paper were used in the 18th century.
- Multivariate distribution and correlation in the late 19th and 20th century.
- A scatter plot helps identify the type of relationship (if any) between two variables.