Density plot

A density plot represents the distribution of a numeric variable. It uses a kernel density estimate to show the probability density function of the variable. In ggplot2, the geom_density() function takes care of the kernel density estimation and plots the results.
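To make the idea of a kernel density estimate concrete, here is a minimal sketch using base R's density() function on simulated data. The sample x and the seed are assumptions for illustration and are not part of the original example. The sketch checks that the estimated curve integrates to roughly 1, as a probability density should.

```r
# Simulated sample (illustrative data only)
set.seed(42)
x <- rnorm(200)

# Kernel density estimate with the default Gaussian kernel
kde <- density(x)

# Trapezoidal rule: the area under the estimated curve should be close to 1
area <- sum(diff(kde$x) * (head(kde$y, -1) + tail(kde$y, -1)) / 2)
print(area)
```

geom_density() draws this kind of curve directly from the raw values, so you normally do not need to call density() yourself.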

Arguments of theme_ipsum()

base_family, base_size

base font family and size

plot_title_family, plot_title_face, plot_title_size, plot_title_margin

plot title family, face, size and margin

subtitle_family, subtitle_face, subtitle_size

plot subtitle family, face and size

subtitle_margin

plot subtitle margin bottom (single numeric value)

strip_text_family, strip_text_face, strip_text_size

facet label font family, face and size

caption_family, caption_face, caption_size, caption_margin

plot caption family, face, size and margin

axis_title_family, axis_title_face, axis_title_size

axis title font family, face and size

axis_title_just

axis title font justification, one of [blmcrt]

plot_margin

plot margin (specify with ggplot2::margin)

grid

panel grid (TRUE, FALSE, or a combination of X, x, Y, y)

axis

add x or y axes? TRUE, FALSE, "xy"

ticks

if TRUE, add ticks

To create a density plot in R, use the geom_density() function of the ggplot2 package.

Syntax: ggplot(data, aes(x)) + geom_density(fill, color, alpha)

Parameters:

· fill: fill color of the area under the curve

· color: color of the density line

· alpha: opacity of the fill, from 0 (fully transparent) to 1 (opaque)

Density plots are built in ggplot2 with the geom_density geom. Only one numeric variable is needed as input.

# Libraries

library(ggplot2)

library(dplyr)

 # Load dataset from github

data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/1_OneNum.csv", header=TRUE)

 # Make the density plot

data %>%

  filter( price<300 ) %>%

  ggplot( aes(x=price)) +

    geom_density(fill="#69b3a2", color="#e9ecef", alpha=0.8)

Custom with theme_ipsum

theme_ipsum() is set up in such a way that you can build your own variant by wrapping the call and changing the parameters.

The hrbrthemes package offers a set of pre-built themes for your charts. I am personally a big fan of theme_ipsum: it is easy to use and makes your chart look more professional:

# Libraries

library(ggplot2)

library(dplyr)

library(hrbrthemes)

 # Load dataset from github

data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/1_OneNum.csv", header=TRUE)

 # Make the density plot

data %>%

  filter( price<300 ) %>%

  ggplot( aes(x=price)) +

    geom_density(fill="#69b3a2", color="#e9ecef", alpha=0.8) +

    ggtitle("Night price distribution of Airbnb apartments") +

    theme_ipsum()
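The wrapping idea mentioned above can be sketched as follows. The wrapper name theme_ipsum_large and the chosen values are hypothetical and only for illustration; the arguments themselves come from the theme_ipsum() argument list. The example uses the built-in mtcars data so that it runs without a download.

```r
library(ggplot2)
library(hrbrthemes)

# Hypothetical wrapper (theme_ipsum_large is not part of hrbrthemes):
# it fixes a few arguments and forwards everything else to theme_ipsum()
theme_ipsum_large <- function(...) {
  theme_ipsum(
    base_size = 14,          # larger base font
    plot_title_size = 22,    # larger plot title
    axis_title_just = "mc",  # axis title justification
    grid = "Y",              # keep only the major horizontal grid lines
    ...
  )
}

# Usage on a built-in dataset (mtcars ships with R)
p <- ggplot(mtcars, aes(x = mpg)) +
  geom_density(fill = "#69b3a2", color = "#e9ecef", alpha = 0.8) +
  theme_ipsum_large()
```

Arguments that the wrapper already sets (base_size, for instance) cannot be passed again through the dots; any other theme_ipsum() argument can.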

Mirror density chart with ggplot2

A density chart is built with the geom_density geom of ggplot2. The density can be plotted upside down by mapping y = -..density.. in aes(). Since ggplot2 3.4.0 the ..density.. notation is deprecated in favor of after_stat(density), although it still works with a warning. It is advised to use geom_label to indicate variable names.

# Libraries

library(ggplot2)

library(hrbrthemes)

 # Dummy data

data <- data.frame(

  var1 = rnorm(1000),

  var2 = rnorm(1000, mean=2)

)

 # Chart (x is mapped separately in each layer)

p <- ggplot(data) +

  # Top

  geom_density( aes(x = var1, y = ..density..), fill="#69b3a2" ) +

  geom_label( aes(x=4.5, y=0.25, label="variable1"), color="#69b3a2") +

  # Bottom

  geom_density( aes(x = var2, y = -..density..), fill= "#404080") +

  geom_label( aes(x=4.5, y=-0.25, label="variable2"), color="#404080") +

  theme_ipsum() +

  xlab("value of x")

p

Histogram with `geom_histogram`

Of course it is possible to apply exactly the same technique using geom_histogram instead of geom_density to get a mirror histogram:

# Chart

p <- ggplot(data) +

  geom_histogram( aes(x = var1, y = ..density..), fill="#69b3a2" ) +

  geom_label( aes(x=4.5, y=0.25, label="variable1"), color="#69b3a2") +

  geom_histogram( aes(x = var2, y = -..density..), fill= "#404080") +

  geom_label( aes(x=4.5, y=-0.25, label="variable2"), color="#404080") +

  theme_ipsum() +

  xlab("value of x")

p

Density chart with several groups

A multi density chart is a density chart where several groups are represented. It lets you compare their distributions. The issue with this kind of chart is that it easily gets cluttered: groups overlap each other and the figure becomes unreadable.

An easy workaround is to use transparency. However, it won’t solve the issue completely, and it is often better to consider the examples suggested further in this document.

# Libraries

library(ggplot2)

library(hrbrthemes)

library(dplyr)

library(tidyr)

library(viridis)

 # The diamonds dataset ships with the ggplot2 package.

 # Without transparency (left)

p1 <- ggplot(data=diamonds, aes(x=price, group=cut, fill=cut)) +

    geom_density(adjust=1.5) +

    theme_ipsum()

p1

# With transparency (right)

p2 <- ggplot(data=diamonds, aes(x=price, group=cut, fill=cut)) +

    geom_density(adjust=1.5, alpha=.4) +

    theme_ipsum()

p2
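The adjust argument used above controls the smoothness of the curve: it multiplies the default bandwidth, so values above 1 give a smoother estimate. Base R's density() has an argument with the same name and meaning, which makes the effect easy to check. A minimal sketch on simulated data (illustrative values, not the diamonds data):

```r
# Simulated sample (illustrative data only)
set.seed(1)
x <- rnorm(500)

bw_default <- density(x)$bw                # bandwidth chosen by the default rule
bw_smooth  <- density(x, adjust = 1.5)$bw  # same rule, scaled by adjust

# Ratio of the two bandwidths
print(bw_smooth / bw_default)
```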

Here is an example with another dataset where it works much better. The groups have very distinct distributions, so it is easy to tell them apart even on the same chart.

# Load dataset from github

data <- read.table("https://raw.githubusercontent.com/zonination/perceptions/master/probly.csv", header=TRUE, sep=",")

data <- data %>%

  gather(key="text", value="value") %>%

  mutate(text = gsub("\\.", " ",text)) %>%

  mutate(value = round(as.numeric(value),0))

 # A dataframe for annotations

annot <- data.frame(

  text = c("Almost No Chance", "About Even", "Probable", "Almost Certainly"),

  x = c(5, 53, 65, 79),

  y = c(0.15, 0.4, 0.06, 0.1)

)

 # Plot

data %>%

  filter(text %in% c("Almost No Chance", "About Even", "Probable", "Almost Certainly")) %>%

  ggplot( aes(x=value, color=text, fill=text)) +

    geom_density(alpha=0.6) +

    scale_fill_viridis(discrete=TRUE) +

    scale_color_viridis(discrete=TRUE) +

    geom_text( data=annot, aes(x=x, y=y, label=text, color=text), hjust=0, size=4.5) +

    theme_ipsum() +

    theme(

      legend.position="none"

    ) +

    ylab("") +

    xlab("Assigned Probability (%)")
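The gather() call above reshapes the data from wide (one column per phrase) to long (one row per answer). gather() still works, but tidyr now documents pivot_longer() as its replacement. Here is a sketch of the equivalent reshape on a small made-up data frame; the values are invented for illustration and are not taken from the survey file.

```r
library(tidyr)

# Toy wide table: one column per phrase (values are made up)
wide <- data.frame(
  "Almost No Chance" = c(2, 5),
  "About Even"       = c(50, 48),
  check.names = FALSE
)

# Long format: one row per (phrase, value) pair
long <- pivot_longer(wide, cols = everything(),
                     names_to = "text", values_to = "value")
print(long)
```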

Small Multiple with `facet_wrap()`

Facet wrap

facet_wrap() makes a long ribbon of panels (generated by any number of variables) and wraps it into 2d. This is useful if you have a single variable with many levels and want to arrange the plots in a more space efficient manner.

You can control how the ribbon is wrapped into a grid with ncol, nrow, as.table and dir. ncol and nrow control how many columns and rows (you only need to set one). as.table controls whether the facets are laid out like a table (TRUE), with highest values at the bottom-right, or a plot (FALSE), with the highest values at the top-right. dir controls the direction of wrap: horizontal or vertical.

facet_grid() lays out plots in a 2d grid, as defined by a formula:

· . ~ a spreads the values of a across the columns. This direction facilitates comparisons of y position, because the vertical scales are aligned.

# Using Small multiple

ggplot(data=diamonds, aes(x=price, group=cut, fill=cut)) +

    geom_density(adjust=1.5) +

    theme_ipsum() +

    facet_wrap(~cut) +

    theme(

      legend.position="none",

      panel.spacing = unit(0.1, "lines"),

      axis.ticks.x=element_blank()

    )
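As a sketch of the layout arguments described in this section, the following variations use the diamonds data from ggplot2. The object names base, p_wrap and p_grid are only for illustration.

```r
library(ggplot2)

# Shared base plot: one density per cut, legend hidden
base <- ggplot(diamonds, aes(x = price, fill = cut)) +
  geom_density(adjust = 1.5) +
  theme(legend.position = "none")

# facet_wrap(): wrap the five panels into 2 columns, filling column by column
p_wrap <- base + facet_wrap(~cut, ncol = 2, dir = "v")

# facet_grid(): spread the levels of cut across the columns (. ~ cut)
p_grid <- base + facet_grid(. ~ cut)
```

Printing p_wrap or p_grid draws the corresponding chart. With . ~ cut all panels share one vertical scale, which makes the peak heights directly comparable.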

Stacked density chart

Another solution is to stack the groups. This shows which group is the most frequent for a given value, but it makes it hard to understand the distribution of a group that is not at the bottom of the chart.

# Stacked density plot:

p <- ggplot(data=diamonds, aes(x=price, group=cut, fill=cut)) +

    geom_density(adjust=1.5, position="fill") +

    theme_ipsum()

p
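Note that position="fill" rescales the stack at each price so that the groups sum to 1, which shows proportions rather than absolute densities. To stack the curves without rescaling, position="stack" can be used instead. The sketch below shows both, assuming ggplot2 3.3.0 or later for after_stat(). Mapping y to the computed count weights each group by its number of observations; that choice is an assumption of this sketch, not part of the original example.

```r
library(ggplot2)

# Stacked: the group curves are added on top of each other
p_stack <- ggplot(diamonds, aes(x = price, y = after_stat(count), fill = cut)) +
  geom_density(adjust = 1.5, position = "stack")

# Filled: the same stack, rescaled to a total height of 1 at every price
p_fill <- ggplot(diamonds, aes(x = price, y = after_stat(count), fill = cut)) +
  geom_density(adjust = 1.5, position = "fill")
```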

S3PROGRAMMINGTECH

How to create density plot using ggplot2 in R Programming
