How to Create a Population Pyramid in R?

In this article, we will discuss how to create a population pyramid in the R programming language.

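A population pyramid is also known as an age-sex pyramid. It visualizes how a population is distributed across age groups and genders, and the resulting chart usually has the shape of a pyramid. By convention, males are shown on the left and females on the right. The bars can represent either the number or the percentage of male and female residents in each age group.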

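To create a population pyramid in R, we use the geom_bar() function of the ggplot2 package. ggplot2 is a system for declaratively creating graphics, based on "The Grammar of Graphics". geom_bar() draws bar charts. By default it makes the height of each bar proportional to the number of cases in each group; with stat = "identity" it uses the values in the data as the bar heights instead.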

Creating a Basic Population Pyramid in R

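To create a population pyramid, we use geom_bar() together with coord_flip() to draw a horizontal bar chart. We then use the mutate() function from dplyr to make the male population values negative. The male bars then extend to the left and the female bars to the right, which gives us the desired pyramid.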

Syntax:

ggplot(df, aes(x = age, y = population)) + geom_bar(stat = "identity") + coord_flip()

Parameters:

df: the data frame that contains the population data.
age and population: the columns of df mapped to the x and y axes. A gender column in df is used to separate male and female values.
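
To make the role of stat = "identity" concrete, here is a small sketch on a made-up data frame. The `toy` object and its values are hypothetical and are not taken from the article's CSV. It compares the default counting behavior of geom_bar() with stat = "identity", and with geom_col(), which is the ggplot2 shorthand for the same thing.

```r
library(ggplot2)

# Hypothetical toy data: one row per age group, values invented for illustration
toy <- data.frame(
  age        = c("0-9", "10-19", "20-29"),
  population = c(2.1, 1.8, 1.5)
)

# Default stat ("count"): each bar has height 1, because each age appears once
p_count <- ggplot(toy, aes(x = age)) +
  geom_bar()

# stat = "identity": bar height is taken from the population column
p_identity <- ggplot(toy, aes(x = age, y = population)) +
  geom_bar(stat = "identity")

# geom_col() behaves like geom_bar(stat = "identity")
p_col <- ggplot(toy, aes(x = age, y = population)) +
  geom_col()

# layer_data() exposes the bar heights ggplot2 computed for each plot
count_heights    <- layer_data(p_count)$y
identity_heights <- layer_data(p_identity)$y
col_heights      <- layer_data(p_col)$y
```

Since a population table already holds one total per age group and gender, the pyramid needs stat = "identity" (or geom_col()) rather than the default count.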

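Example:

Here is a basic population pyramid. The example reads its data from a CSV file named Population.CSV, which has age, gender and population columns.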

R

# load sample data
sample_data <- read.csv("Population.CSV")
  
# load library ggplot2 and dplyr
library(ggplot2)
library(dplyr)
  
# change male population to negative
sample_data %>%
    mutate(population = ifelse(gender == "M", -population, population)) %>%
    ggplot(aes(x = age, y = population)) +
    geom_bar(stat = "identity") +
    coord_flip()

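Output: [figure: basic population pyramid, with male bars extending to the left and female bars to the right]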
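
The CSV file itself is not reproduced in this text, so the following sketch builds a small in-memory data frame with the three columns the example expects (age, gender, population). All numbers and age group names are invented for illustration. The sketch also converts age to a factor, because otherwise ggplot2 sorts the age groups alphabetically and "5-9" would end up after "20+".

```r
library(ggplot2)
library(dplyr)

# Hypothetical stand-in for Population.CSV; the values are made up
age_levels <- c("0-4", "5-9", "10-14", "15-19", "20+")
sample_data <- data.frame(
  age        = rep(age_levels, times = 2),
  gender     = rep(c("M", "F"), each = length(age_levels)),
  population = c(3.1, 2.8, 2.4, 1.9, 1.2,   # males, in millions
                 3.0, 2.9, 2.5, 2.1, 1.6)   # females, in millions
)

pyramid_data <- sample_data %>%
  mutate(
    # keep the age groups in the order listed in age_levels
    age = factor(age, levels = age_levels),
    # males become negative so their bars extend to the left
    population = ifelse(gender == "M", -population, population)
  )

# Same plotting calls as in the example above
p <- ggplot(pyramid_data, aes(x = age, y = population)) +
  geom_bar(stat = "identity") +
  coord_flip()
```

Calling print(p) draws the pyramid with the youngest group at the bottom.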

Color Customization of a Population Pyramid in R

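To customize the colors of the male and female bars, we use the fill aesthetic of the ggplot() function. We can pass the variable by which the bars should be colored, or we can specify the exact colors to use. We can also use the scale_fill_brewer() function to take the colors from a predefined palette.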

Syntax:

ggplot( df, aes(x, y, fill) )

Parameters:

fill: determines the variable according to which bars are to be colored.

Example:

In this example, the plot is colored using sequential palette number 7.

R

# load sample data
sample_data <- read.csv("Population.CSV")
  
# load library ggplot2 and dplyr
library(ggplot2)
library(dplyr)
  
# change male population to negative
sample_data %>%
    mutate(population = ifelse(gender == "M", -population, population)) %>%
    ggplot(aes(x = age, y = population, fill = gender)) +
    geom_bar(stat = "identity") +
    coord_flip() +
    scale_fill_brewer(type = "seq", palette = 7)

Output: [figure: population pyramid with bars filled by gender using a sequential Brewer palette]
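
The text above also mentions passing exact colors. One way to do that, shown here as a sketch and not as the article's own method, is scale_fill_manual() with a named vector that maps each gender value to a color. The data frame and the two color choices (`steelblue`, `tomato`) are assumptions made for illustration.

```r
library(ggplot2)
library(dplyr)

# Hypothetical data, invented for illustration (population in millions)
sample_data <- data.frame(
  age        = rep(c("0-19", "20-39", "40-59", "60+"), times = 2),
  gender     = rep(c("M", "F"), each = 4),
  population = c(3.2, 2.9, 2.2, 1.3,
                 3.1, 3.0, 2.4, 1.7)
)

# Named vector: names must match the values in the gender column
gender_colors <- c(M = "steelblue", F = "tomato")

p <- sample_data %>%
  mutate(population = ifelse(gender == "M", -population, population)) %>%
  ggplot(aes(x = age, y = population, fill = gender)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  scale_fill_manual(values = gender_colors)

# Fill colors ggplot2 actually assigned to the bars
used_fills <- unique(layer_data(p)$fill)
```

Using a named vector keeps the color assignment stable even if the order of the gender levels changes.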

Label Customization of a Population Pyramid in R

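To customize the labels of the plot, we use the title, x and y arguments of the labs() function. Here, title, x and y set the plot title, the x-axis label and the y-axis label respectively. Because coord_flip() swaps the axes, the x label (age) is drawn along the vertical axis and the y label (population) along the horizontal axis.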

Syntax:

ggplot( df, aes(x, y) )+ labs( title, x, y)

Parameters:

title: determines the title of the plot.
x and y: determines the label of x and y axis respectively.

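Example:

Here is a population pyramid with bars colored by gender and with custom labels. It uses the same Population.CSV file as the earlier examples.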

R

# load sample data
sample_data <- read.csv("Population.CSV")
  
# load library ggplot2 and dplyr
library(ggplot2)
library(dplyr)
  
# change male population to negative
sample_data %>%
    mutate(population = ifelse(gender == "M", -population, population)) %>%
    ggplot(aes(x = age, y = population, fill = gender)) +
    geom_bar(stat = "identity") +
    coord_flip() +
    labs(title = "Title of plot", x = "Age",
         y = "Population (in millions)")

Output: [figure: population pyramid with a plot title and custom axis labels]
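
labs() accepts more than title, x and y. As a sketch on hypothetical data, the example below also sets a subtitle, a caption, and the legend title through the fill argument (the legend title is named after the aesthetic it describes). The label texts and the data values are invented for illustration.

```r
library(ggplot2)
library(dplyr)

# Hypothetical data, invented for illustration (population in millions)
sample_data <- data.frame(
  age        = rep(c("0-19", "20-39", "40-59", "60+"), times = 2),
  gender     = rep(c("M", "F"), each = 4),
  population = c(3.2, 2.9, 2.2, 1.3,
                 3.1, 3.0, 2.4, 1.7)
)

p <- sample_data %>%
  mutate(population = ifelse(gender == "M", -population, population)) %>%
  ggplot(aes(x = age, y = population, fill = gender)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(
    title    = "Population by age group",
    subtitle = "Males on the left, females on the right",
    caption  = "Illustrative data only",
    x        = "Age",
    y        = "Population (in millions)",
    fill     = "Gender"   # legend title for the fill aesthetic
  )
```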

Axis Customization of a Population Pyramid in R

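In the example above, the population pyramid is not centered because the female population is larger. To handle such cases, we can use the scale_x_continuous() or scale_y_continuous() function to fix the limits of the axis. The same functions can also set the axis breaks, through their breaks argument.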

Syntax:

scale_x/y_continuous( limits, breaks)

Parameters:

limits: determines the limits of the x or y-axis.
breaks: determines the axis breaks of the x or y-axis.

Example:

In this example, we set symmetric limits on the y-axis so that the pyramid is centered.

R

# load sample data
sample_data <- read.csv("Population.CSV")
  
# load library ggplot2 and dplyr
library(ggplot2)
library(dplyr)
  
# change male population to negative
sample_data %>%
    mutate(population = ifelse(gender == "M", -population, population)) %>%
    ggplot(aes(x = age, y = population, fill = gender)) +
    geom_bar(stat = "identity") +
    coord_flip() +
    scale_y_continuous(limits = c(-4, 4),
                       breaks = seq(-4, 4, by = 2)) +
    labs(title = "Title of plot", x = "Age",
         y = "Population (in millions)")

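Output: [figure: population pyramid with the population axis fixed to the range -4 to 4 and breaks every 2]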
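
Two refinements are common at this step, sketched here on hypothetical data. The first is to compute symmetric limits from the data instead of hard-coding them, since with the default settings bars that fall outside limits are dropped from the plot. The second is to pass labels = abs so that the male side of the axis does not show negative numbers. The data values and the `axis_max` helper variable are assumptions made for illustration.

```r
library(ggplot2)
library(dplyr)

# Hypothetical data, invented for illustration (population in millions)
sample_data <- data.frame(
  age        = rep(c("0-19", "20-39", "40-59", "60+"), times = 2),
  gender     = rep(c("M", "F"), each = 4),
  population = c(3.2, 2.9, 2.2, 1.3,
                 3.1, 3.0, 2.4, 1.7)
)

pyramid_data <- sample_data %>%
  mutate(population = ifelse(gender == "M", -population, population))

# Largest absolute value, rounded up to the next whole number
axis_max <- ceiling(max(abs(pyramid_data$population)))

p <- ggplot(pyramid_data, aes(x = age, y = population, fill = gender)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  scale_y_continuous(
    limits = c(-axis_max, axis_max),            # symmetric around zero
    breaks = seq(-axis_max, axis_max, by = 1),
    labels = abs                                # show 2 instead of -2
  ) +
  labs(x = "Age", y = "Population (in millions)")
```

Deriving the limits from the data keeps the pyramid centered when the input changes, at the cost of the axis range varying between data sets.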