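ggraph is a network visualization package developed by Thomas Lin Pedersen. The official documentation is here: https://ggraph.data-imaginist.com/index.html.
It differs from igraph: igraph can also draw networks, but it is aimed mainly at network analysis, and its plotting is not very user-friendly.
This post is a rough record of how to build network plots with ggraph.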
1 Installation
Two packages need to be installed. The examples below also load tidyverse.
install.packages("ggraph")
install.packages("tidygraph")
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 3.6.1
## -- Attaching packages ------------------------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.2.1 v purrr 0.3.2
## v tibble 2.1.3 v dplyr 0.8.3
## v tidyr 1.0.0 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.4.0
## Warning: package 'ggplot2' was built under R version 3.6.1
## Warning: package 'tibble' was built under R version 3.6.1
## Warning: package 'tidyr' was built under R version 3.6.1
## Warning: package 'dplyr' was built under R version 3.6.1
## -- Conflicts ---------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(tidygraph)
## Warning: package 'tidygraph' was built under R version 3.6.1
##
## Attaching package: 'tidygraph'
## The following object is masked from 'package:stats':
##
## filter
library(ggraph)
## Warning: package 'ggraph' was built under R version 3.6.1
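2 Building the required data
The simplest approach is to store the data as a table (a data frame) whose first two columns are named from and to. Let's build an example dataset:
2.1 Building the edges data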
set.seed(0)
edges <- data.frame(from = sample(1:15, 80, replace = TRUE),
to = sample(1:15, 80, replace = TRUE),
stringsAsFactors = FALSE) %>%
distinct()
This is the simplest form of edge information: each row is one edge. We can, of course, also attach attributes to every edge.
edges <- data.frame(edges,
edge.width = rnorm(n = nrow(edges), mean = 1, sd = 0.5),
edge.colour = rnorm(n = nrow(edges), mean = 0, sd = 0.5),
stringsAsFactors = FALSE)
edges %>% head
## from to edge.width edge.colour
## 1 14 6 0.2382166 0.225093551
## 2 9 8 1.2969731 -0.009279916
## 3 4 7 1.1664752 -0.159034187
## 4 7 11 1.5315499 -0.464681074
## 5 1 1 0.8479080 -0.743730155
## 6 2 4 1.1850094 -0.537596148
2.2 Building the node data
node <- unique(c(edges$from, edges$to)) %>% sort()
node
## [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
nodes <- data.frame(node, node.size = rnorm(n = length(node), mean = 1, sd = 0.5),
node.colour = sample(c("Class A", "Class B"), length(node), replace = TRUE),
stringsAsFactors = FALSE)
nodes %>% head
## node node.size node.colour
## 1 1 0.3733551 Class A
## 2 2 1.3211207 Class B
## 3 3 0.9776454 Class A
## 4 4 0.1333908 Class B
## 5 5 1.0010659 Class B
## 6 6 0.6848498 Class A
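One detail is worth checking before these tables are passed to tidygraph::tbl_graph(): integer from/to values are read as row positions in the node table, not matched against the node column. The example here works because the sorted node IDs happen to be 1 to 15. The sketch below uses made-up IDs (raw_edges, node_table and indexed_edges are hypothetical names, not part of the original example) to show one way of remapping arbitrary IDs to row positions with match().

```r
# Hypothetical edge list whose node IDs are not 1..n
raw_edges <- data.frame(from = c(101, 205, 101),
                        to   = c(205, 307, 307))

# Node table: one row per distinct ID, sorted
node_table <- data.frame(node = sort(unique(c(raw_edges$from, raw_edges$to))))

# Replace each ID by the row position of that ID in node_table
indexed_edges <- data.frame(
  from = match(raw_edges$from, node_table$node),
  to   = match(raw_edges$to,   node_table$node)
)
indexed_edges
```

With the edges rewritten this way, row i of the node table is guaranteed to describe the node that the edges call i.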
2.3 Building the data required by ggraph
Once we have edges and nodes, they need to be converted into the format that ggraph expects.
graph_data <- tidygraph::tbl_graph(nodes = nodes, edges = edges,
directed = FALSE)
graph_data
## # A tbl_graph: 15 nodes and 67 edges
## #
## # An undirected multigraph with 1 component
## #
## # Node Data: 15 x 3 (active)
## node node.size node.colour
## <int> <dbl> <chr>
## 1 1 0.373 Class A
## 2 2 1.32 Class B
## 3 3 0.978 Class A
## 4 4 0.133 Class B
## 5 5 1.00 Class B
## 6 6 0.685 Class A
## # ... with 9 more rows
## #
## # Edge Data: 67 x 4
## from to edge.width edge.colour
## <int> <int> <dbl> <dbl>
## 1 6 14 0.238 0.225
## 2 8 9 1.30 -0.00928
## 3 4 7 1.17 -0.159
## # ... with 64 more rows
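The printout describes an undirected multigraph. distinct() only drops rows that are exactly identical, so a pair stored in both directions, or a self-loop such as the 1 to 1 row shown earlier, survives. If a simple graph is wanted, one option is to drop loops and put each pair in a fixed order before de-duplicating. This is a base R sketch on a small made-up table (edges_demo and edges_simple are hypothetical names); edge attributes would be attached after this step.

```r
# Small made-up edge list with a reversed duplicate (1-2 / 2-1),
# a self-loop (3-3) and another reversed duplicate (2-4 / 4-2)
edges_demo <- data.frame(from = c(1, 2, 3, 2, 4),
                         to   = c(2, 1, 3, 4, 2))

# Drop self-loops
no_loops <- edges_demo[edges_demo$from != edges_demo$to, ]

# Store every undirected pair as (smaller, larger), then de-duplicate
edges_simple <- unique(data.frame(
  from = pmin(no_loops$from, no_loops$to),
  to   = pmax(no_loops$from, no_loops$to)
))
edges_simple
```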
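3 Plotting
3.1 Basic plotting
With the data in hand we can start plotting. As in ggplot2, the plot is built in layers that are stacked one on top of another. Let's start with a simple example.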
ggraph(graph = graph_data) +
geom_edge_fan() +
geom_node_point()
## Using `stress` as default layout
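First, a plot is started with ggraph(), and the two required kinds of geom are geom_edge_xxx and geom_node_xxx, which draw the edges and the nodes respectively. They are used much like ggplot2 geoms and take very similar arguments; the names simply carry an edge or node prefix.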
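To see which geoms follow this naming pattern, the exported names of the attached package can be listed. This is only a convenience sketch (edge_geoms and node_geoms are made-up variable names), and the exact list depends on the installed ggraph version.

```r
library(ggraph)

# Names exported by ggraph that start with the edge / node geom prefixes
edge_geoms <- ls("package:ggraph", pattern = "^geom_edge_")
node_geoms <- ls("package:ggraph", pattern = "^geom_node_")

head(edge_geoms)
node_geoms
```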
Next, let's polish the figure.
plot <-
ggraph(graph = graph_data, layout = "linear", circular = TRUE) +
geom_edge_arc(aes(edge_colour = edge.colour, edge_width = edge.width)) +
scale_edge_colour_gradient2(low = "#155F83FF", mid = "white", high = "#800000FF") +
scale_edge_width_continuous(range = c(0.2,2)) +
geom_node_point(aes(colour = node.colour, size = node.size)) +
scale_size_continuous(range = c(5,10)) +
scale_colour_manual(values = c("Class A" = "#8A9045FF", "Class B" = "#155F83FF")) +
theme_void()
plot
As the example above shows, edge aesthetics are controlled with the scale_edge_xxx family of functions, whereas node aesthetics use the ordinary ggplot2 scale_xxx functions directly.
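The reason the two families do not interfere with each other is that each scale is registered for a differently named aesthetic. As a small check, assuming the scale objects expose an aesthetics field as ggplot2 scale objects normally do, the two scales used above can be inspected directly (edge_width_scale and node_size_scale are made-up names):

```r
library(ggraph)

# Build the two scales used in the plot above without adding them to a plot
edge_width_scale <- scale_edge_width_continuous(range = c(0.2, 2))
node_size_scale  <- scale_size_continuous(range = c(5, 10))

# Each scale reports the aesthetic it applies to
edge_width_scale$aesthetics
node_size_scale$aesthetics
```

The edge scale should report an edge-prefixed aesthetic and the node scale the plain ggplot2 one, which is why a width mapping on the edges and a size mapping on the nodes can coexist in one plot.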
3.2 Adding text
Text can be added with geom_node_text() and geom_node_label().
For a circular layout, the angle of each node label needs some adjustment.
node_name = paste("Node", node, sep = "_")
node_name
## [1] "Node_1" "Node_2" "Node_3" "Node_4" "Node_5" "Node_6" "Node_7"
## [8] "Node_8" "Node_9" "Node_10" "Node_11" "Node_12" "Node_13" "Node_14"
## [15] "Node_15"
angle <- 360 * (c(1:length(node_name)) - 0.5)/length(node_name)
hjust <- ifelse(angle > 180, 1, 0)
angle <- ifelse(angle > 180, 90 - angle + 180, 90 - angle)
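To make the arithmetic easier to follow, here is the same calculation wrapped in a small helper and applied to four labels. label_angles is a hypothetical name introduced only for this illustration.

```r
# Same formulas as above, for n evenly spaced labels
label_angles <- function(n) {
  raw   <- 360 * (seq_len(n) - 0.5) / n            # position of each label in degrees
  hjust <- ifelse(raw > 180, 1, 0)                 # right-align labels past 180 degrees
  angle <- ifelse(raw > 180, 90 - raw + 180, 90 - raw)  # flip those labels by 180 degrees
  data.frame(raw = raw, angle = angle, hjust = hjust)
}

label_angles(4)
```

For four labels the raw positions are 45, 135, 225 and 315 degrees, and the resulting text angles are 45, -45, 45 and -45. The final angle always stays between -90 and 90, so no label is drawn upside down; labels past 180 degrees are right-aligned instead.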
Then add the node text.
plot +
geom_node_text(aes(x = x * 1.05,
y = y * 1.15,
label = node_name),
angle = angle,
hjust = hjust,
colour = "black",
size = 3.5)
Some of the text runs outside the plotting area; expanding the axis limits fixes this.
plot +
geom_node_text(aes(x = x * 1.05,
y = y * 1.15,
label = node_name),
angle = angle,
hjust = hjust,
colour = "black",
size = 3.5) +
expand_limits(x = c(-1.5, 1.5), y = c(-1.5, 1.5))
We can take a look at what the original axes look like.
ggraph(graph = graph_data, layout = "linear", circular = TRUE) +
geom_edge_arc(aes(edge_colour = edge.colour, edge_width = edge.width)) +
scale_edge_colour_gradient2(low = "#155F83FF", mid = "white", high = "#800000FF") +
scale_edge_width_continuous(range = c(0.2,2)) +
geom_node_point(aes(colour = node.colour, size = node.size)) +
scale_size_continuous(range = c(5,10)) +
scale_colour_manual(values = c("Class A" = "#8A9045FF", "Class B" = "#155F83FF")) +
theme_bw()
Setting the theme to theme_bw() makes the underlying coordinate system clearly visible.
The legends also appear in a somewhat arbitrary order. This can be set in guides().
ggraph(graph = graph_data, layout = "linear", circular = TRUE) +
geom_edge_arc(aes(edge_colour = edge.colour, edge_width = edge.width)) +
scale_edge_colour_gradient2(low = "#155F83FF", mid = "white", high = "#800000FF") +
scale_edge_width_continuous(range = c(0.2,2)) +
geom_node_point(aes(colour = node.colour, size = node.size)) +
scale_size_continuous(range = c(5,10)) +
scale_colour_manual(values = c("Class A" = "#8A9045FF", "Class B" = "#155F83FF")) +
guides(colour = guide_legend(order = 1),
size = guide_legend(order = 2),
edge_colour = guide_edge_colourbar(order = 3)) +
theme_void()
When text is added, labels can overlap each other and overlap the nodes, as the following figure shows:
ggraph(graph = graph_data, layout = "auto", circular = TRUE) +
geom_edge_arc(aes(edge_colour = edge.colour, edge_width = edge.width)) +
scale_edge_colour_gradient2(low = "#155F83FF", mid = "white", high = "#800000FF") +
scale_edge_width_continuous(range = c(0.2,2)) +
geom_node_point(aes(colour = node.colour, size = node.size)) +
scale_size_continuous(range = c(5,10)) +
scale_colour_manual(values = c("Class A" = "#8A9045FF", "Class B" = "#155F83FF")) +
theme_void() +
geom_node_text(aes(x = x,
y = y,
label = node_name,
colour = node.colour),
size = 3.5)
## Using `stress` as default layout
In ggplot2, the ggrepel package solves this problem. Here, we can simply set repel to TRUE.
ggraph(graph = graph_data, layout = "auto", circular = TRUE) +
geom_edge_arc(aes(edge_colour = edge.colour, edge_width = edge.width)) +
scale_edge_colour_gradient2(low = "#155F83FF", mid = "white", high = "#800000FF") +
scale_edge_width_continuous(range = c(0.2,2)) +
geom_node_point(aes(colour = node.colour, size = node.size)) +
scale_size_continuous(range = c(5,10)) +
scale_colour_manual(values = c("Class A" = "#8A9045FF", "Class B" = "#155F83FF")) +
theme_void() +
geom_node_text(aes(label = node_name,
colour = node.colour),
size = 3.5, repel = TRUE)
## Using `stress` as default layout
Of course, geom_node_label() can also be used for the labels.
ggraph(graph = graph_data, layout = "auto", circular = TRUE) +
geom_edge_arc(aes(edge_colour = edge.colour, edge_width = edge.width)) +
scale_edge_colour_gradient2(low = "#155F83FF", mid = "white", high = "#800000FF") +
scale_edge_width_continuous(range = c(0.2,2)) +
geom_node_point(aes(colour = node.colour, size = node.size)) +
scale_size_continuous(range = c(5,10)) +
scale_colour_manual(values = c("Class A" = "#8A9045FF", "Class B" = "#155F83FF")) +
theme_void() +
geom_node_label(aes(label = node_name,
colour = node.colour),
size = 3.5, repel = TRUE)
## Using `stress` as default layout
3.3 Using different layouts
A network can be drawn with different layouts. The layout can be chosen either through the layout argument of ggraph(), or by giving the graph object a layout directly.
ggraph(graph = graph_data, layout = "auto", circular = TRUE) +
geom_edge_arc(aes(edge_colour = edge.colour, edge_width = edge.width)) +
scale_edge_colour_gradient2(low = "#155F83FF", mid = "white", high = "#800000FF") +
scale_edge_width_continuous(range = c(0.2,2)) +
geom_node_point(aes(colour = node.colour, size = node.size)) +
scale_size_continuous(range = c(5,10)) +
scale_colour_manual(values = c("Class A" = "#8A9045FF", "Class B" = "#155F83FF")) +
theme_void()
## Using `stress` as default layout
ggraph(graph = graph_data, layout = "linear", circular = FALSE) +
geom_edge_arc(aes(edge_colour = edge.colour, edge_width = edge.width)) +
scale_edge_colour_gradient2(low = "#155F83FF", mid = "white", high = "#800000FF") +
scale_edge_width_continuous(range = c(0.2,2)) +
geom_node_point(aes(colour = node.colour, size = node.size)) +
scale_size_continuous(range = c(5,10)) +
scale_colour_manual(values = c("Class A" = "#8A9045FF", "Class B" = "#155F83FF")) +
geom_node_text(aes(colour = node.colour),
hjust = 1,
angle = 65,
nudge_y = -0.3,
label = node_name,
size = 3.5) +
expand_limits(x = c(-1.5, 1.5), y = c(-1.5, 1.5)) +
theme_void()
ggraph(graph = graph_data, layout = "eigen", circular = FALSE) +
geom_edge_arc(aes(edge_colour = edge.colour, edge_width = edge.width)) +
scale_edge_colour_gradient2(low = "#155F83FF", mid = "white", high = "#800000FF") +
scale_edge_width_continuous(range = c(0.2,2)) +
geom_node_point(aes(colour = node.colour, size = node.size)) +
scale_size_continuous(range = c(5,10)) +
scale_colour_manual(values = c("Class A" = "#8A9045FF", "Class B" = "#155F83FF")) +
theme_void() +
geom_node_text(aes(x = x,
y = y,
label = node_name,
colour = node.colour),
size = 3.5)
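The second route mentioned at the start of this section, giving the graph its layout up front, can be sketched with ggraph's create_layout(). The example uses a small ring graph from tidygraph::create_ring() rather than graph_data so that it runs on its own; ring, ring_layout and ring_plot are made-up names.

```r
library(tidygraph)
library(ggraph)

# A small stand-in graph: six nodes connected in a ring
ring <- create_ring(6)

# Compute the layout once; the result is a data frame with x and y per node
ring_layout <- create_layout(ring, layout = "circle")
nrow(ring_layout)

# Pass the precomputed layout to ggraph() instead of the graph itself
ring_plot <- ggraph(ring_layout) +
  geom_edge_link() +
  geom_node_point(size = 4) +
  theme_void()
ring_plot
```

Computing the layout separately is handy when node positions need to be inspected or adjusted before plotting, or reused across several plots.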
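3.4 Different edge types
In the examples above the nodes are connected by curves (geom_edge_arc). Other connection styles, such as straight lines, are available through the other geom_edge_xxx() functions.
Straight lines, for example, are drawn with geom_edge_link(). It comes in three variants (geom_edge_link(), geom_edge_link2() and geom_edge_link0()); I have not examined the differences closely yet, but they are described in ?geom_edge_link.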
ggraph(graph = graph_data, layout = "auto", circular = TRUE) +
geom_edge_link(aes(edge_colour = edge.colour, edge_width = edge.width)) +
scale_edge_colour_gradient2(low = "#155F83FF", mid = "white", high = "#800000FF") +
scale_edge_width_continuous(range = c(0.2,2)) +
geom_node_point(aes(colour = node.colour, size = node.size)) +
scale_size_continuous(range = c(5,10)) +
scale_colour_manual(values = c("Class A" = "#8A9045FF", "Class B" = "#155F83FF")) +
theme_void()
## Using `stress` as default layout
ggraph(graph = graph_data, layout = "auto", circular = TRUE) +
geom_edge_link2(aes(edge_colour = edge.colour, edge_width = edge.width)) +
scale_edge_colour_gradient2(low = "#155F83FF", mid = "white", high = "#800000FF") +
scale_edge_width_continuous(range = c(0.2,2)) +
geom_node_point(aes(colour = node.colour, size = node.size)) +
scale_size_continuous(range = c(5,10)) +
scale_colour_manual(values = c("Class A" = "#8A9045FF", "Class B" = "#155F83FF")) +
theme_void()
## Using `stress` as default layout
ggraph(graph = graph_data, layout = "auto", circular = TRUE) +
geom_edge_link0(aes(edge_colour = edge.colour, edge_width = edge.width)) +
scale_edge_colour_gradient2(low = "#155F83FF", mid = "white", high = "#800000FF") +
scale_edge_width_continuous(range = c(0.2,2)) +
geom_node_point(aes(colour = node.colour, size = node.size)) +
scale_size_continuous(range = c(5,10)) +
scale_colour_manual(values = c("Class A" = "#8A9045FF", "Class B" = "#155F83FF")) +
theme_void()
## Using `stress` as default layout
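The three plots above look the same because none of them uses a feature that is specific to one variant. As far as the ggraph documentation describes it, the base variant splits each edge into many points so that an aesthetic can change along the edge, the 2 variant interpolates node attributes between the two end points, and the 0 variant draws plain segments quickly. The sketch below shows each on a made-up four-node graph (toy, p_gradient, p_interp and p_fast are hypothetical names) and assumes ggplot2 3.3.0 or later for after_stat().

```r
library(tidygraph)
library(ggraph)

# Made-up directed graph: a path through four nodes in two groups
toy <- tbl_graph(
  nodes = data.frame(group = c("A", "A", "B", "B")),
  edges = data.frame(from = c(1L, 2L, 3L), to = c(2L, 3L, 4L)),
  directed = TRUE
)

# Base variant: alpha increases along each edge, hinting at direction
p_gradient <- ggraph(toy, layout = "circle") +
  geom_edge_link(aes(alpha = after_stat(index))) +
  geom_node_point()

# "2" variant: edge colour is interpolated between the groups of its end nodes
p_interp <- ggraph(toy, layout = "circle") +
  geom_edge_link2(aes(colour = node.group)) +
  geom_node_point(aes(colour = group))

# "0" variant: simple segments, no interpolation
p_fast <- ggraph(toy, layout = "circle") +
  geom_edge_link0() +
  geom_node_point()
```

Printing each object draws the corresponding plot.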
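Edges with some curvature can be drawn with geom_edge_fan(). Note that it bends only edges that run in parallel between the same two nodes, fanning them out so that they do not overlap.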
ggraph(graph = graph_data, layout = "auto", circular = TRUE) +
geom_edge_fan(aes(edge_colour = edge.colour, edge_width = edge.width)) +
scale_edge_colour_gradient2(low = "#155F83FF", mid = "white", high = "#800000FF") +
scale_edge_width_continuous(range = c(0.2,2)) +
geom_node_point(aes(colour = node.colour, size = node.size)) +
scale_size_continuous(range = c(5,10)) +
scale_colour_manual(values = c("Class A" = "#8A9045FF", "Class B" = "#155F83FF")) +
theme_void()
## Using `stress` as default layout
3.5 Using different themes
A dark theme can make the figure look more striking.
ggraph(graph = graph_data, layout = "auto", circular = TRUE) +
geom_edge_arc(aes(edge_colour = edge.colour, edge_width = edge.width)) +
scale_edge_colour_gradient2(low = "#155F83FF", mid = "white", high = "#800000FF") +
scale_edge_width_continuous(range = c(0.2,2)) +
geom_node_point(aes(colour = node.colour, size = node.size)) +
scale_size_continuous(range = c(5,10)) +
scale_colour_manual(values = c("Class A" = "#8A9045FF", "Class B" = "#155F83FF")) +
ggdark::dark_theme_void() +
geom_node_text(aes(label = node_name,
colour = node.colour),
size = 3.5, repel = TRUE)
## Using `stress` as default layout
## Inverted geom defaults of fill and color/colour.
## To change them back, use invert_geom_defaults().
Some legend keys are drawn in black and the theme does not change them, so they have to be adjusted by hand.
ggraph(graph = graph_data, layout = "auto", circular = TRUE) +
geom_edge_arc(aes(edge_colour = edge.colour, edge_width = edge.width)) +
scale_edge_colour_gradient2(low = "#155F83FF", mid = "white",
high = "#800000FF") +
scale_edge_width_continuous(range = c(0.2,2),
guide = guide_legend(override.aes = list(colour = "white", alpha =1))) +
guides(colour = guide_legend(override.aes = list(size = 5))) +
geom_node_point(aes(colour = node.colour, size = node.size)) +
scale_size_continuous(range = c(5,10)) +
scale_colour_manual(values = c("Class A" = "#8A9045FF", "Class B" = "#155F83FF")) +
ggdark::dark_theme_void() +
geom_node_text(aes(label = node_name,
colour = node.colour),
size = 3.5, repel = TRUE)
## Using `stress` as default layout