Network Visualizations in R

using ggraph and graphlayouts

David Schoch

The tutorial has been updated for graphlayouts 0.5.0 and ggraph >= 1.0.2.9999.
Running the code with older versions may produce errors.

Introduction

Most network analytic tasks are fairly easy to do in R. But when it comes to visualizing networks, R may lag behind some standalone software tools. Not because it is impossible to produce nice figures, but because it takes some time to obtain pleasing results. Just take a look at the default output when plotting a network with the plot() function.

library(igraph) # provides read_graph() and the plot method for graphs

gotS1 <- read_graph("data/GoT/gotS1.graphml",format = "graphml")
plot(gotS1,margin = -0.5)

It is definitely possible to produce nice figures with the igraph package (check out this wonderful tutorial), yet it may take some time to familiarize yourself with the syntax. Additionally, most of the layout algorithms of igraph are non-deterministic. This means that running the same plot call twice may produce different results.
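The non-determinism mentioned above can be tamed by fixing the random seed. The following is a minimal sketch, assuming the karate club network that ships with igraph as a stand-in for your own data; the object names are only for illustration.

```r
library(igraph)

# Zachary's karate club ships with igraph, so no data file is needed
g <- make_graph("Zachary")

# Fruchterman-Reingold starts from random positions, so two unseeded
# calls will generally return different coordinates
xy_a <- layout_with_fr(g)
xy_b <- layout_with_fr(g)
isTRUE(all.equal(xy_a, xy_b))

# Setting the same seed right before each call makes the layout repeatable
set.seed(42)
xy_c <- layout_with_fr(g)
set.seed(42)
xy_d <- layout_with_fr(g)
identical(xy_c, xy_d)
```

Seeding works, but you have to remember to do it before every plot call. A deterministic layout algorithm avoids the issue altogether.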

In this tutorial, you will learn the basics of ggraph, the “ggplot2 of networks”, together with the graphlayouts package, which introduces additional useful layout algorithms to R. Arguably, using ggraph is not really easier than igraph. But once the underlying principle of the grammar of graphics is understood, you’ll see that it is actually quite intuitive to use.

Required libraries

To run all the code in this tutorial, you need to install and load several packages:

install.packages(c("igraph","graphlayouts","ggraph","ggplot2"))

igraph is mostly used for its data structures and graphlayouts and ggraph for visualizations. The package ggplot2 is only needed as a dependency of ggraph.

library(igraph)
library(ggraph)
library(graphlayouts)

Quick plots

It is always a good idea to take a quick look at your network before starting any analysis. This can be done with the function qgraph() from the ggraph package (in the released ggraph 2.0.0 and later, the same function is called autograph()).

qgraph(gotS1)

qgraph() allows you to specify node/edge colours too but it really is only meant to give you a quick overview without writing a massive amount of code.

The basics of ggraph

Once you move beyond quick plots, you need to understand the basics of, or at least develop a feeling for, the grammar of graphics to work with ggraph.

If you are interested in the technical background, there is a white paper by Hadley Wickham on the topic. Instead of explaining the grammar, let us jump directly into some code and work through it one line at a time.

# define a custom color palette
got_palette <- c("#1A5878", "#C44237", "#AD8941", "#E99093", "#50594B")

ggraph(gotS1,layout = "stress")+
  geom_edge_link0(aes(edge_width = weight),edge_colour = "grey66")+
  geom_node_point(aes(fill = clu,size = size),shape=21)+
  geom_node_text(aes(filter = size >= 26, label = name),family="serif")+
  scale_fill_manual(values = got_palette)+
  scale_edge_width(range = c(0.2,3))+
  scale_size(range = c(1,6))+
  theme_graph()+
  theme(legend.position = "none")

ggraph works with layers. Each layer adds a new feature to the plot and thus builds the figure step-by-step. The following sections work through each of them separately.

Layout

ggraph(gotS1,layout = "stress")

The first step is to calculate a layout. The layout parameter specifies the algorithm to use. The “stress” layout is part of the graphlayouts package and is always a safe choice since it is deterministic and produces nice layouts for almost any graph. I recommend using it as your default choice. Other algorithms, e.g. for concentric layouts and clustered networks, are described further down in this tutorial. For the sake of completeness, here is a list of layout algorithms of igraph.

c("layout_with_dh", "layout_with_drl", "layout_with_fr", 
  "layout_with_gem", "layout_with_graphopt", "layout_with_kk", 
  "layout_with_lgl", "layout_with_mds", "layout_with_sugiyama",
  "layout_as_bipartite", "layout_as_star", "layout_as_tree")

To use them, you just need the last part of the name.

ggraph(gotS1,layout = "dh")+
  ...
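If you want to check the claim that “stress” is deterministic, you can call the layout function directly. A minimal sketch, assuming the karate club network from igraph as example data:

```r
library(igraph)
library(graphlayouts)

g <- make_graph("Zachary")

# layout_with_stress() is the function behind layout = "stress";
# it returns a two-column matrix of x and y coordinates
xy1 <- layout_with_stress(g)
xy2 <- layout_with_stress(g)

dim(xy1)
# No seed is set, yet both runs should agree
all.equal(xy1, xy2)
```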

Edges

geom_edge_link0(aes(edge_width = weight),edge_colour = "grey66")

The second layer specifies how to draw the edges. Edges can be drawn in many different ways as the list below shows.

c("geom_edge_arc", "geom_edge_arc0", "geom_edge_arc2", "geom_edge_density", 
  "geom_edge_diagonal", "geom_edge_diagonal0", "geom_edge_diagonal2", 
  "geom_edge_elbow", "geom_edge_elbow0", "geom_edge_elbow2", "geom_edge_fan", 
  "geom_edge_fan0", "geom_edge_fan2", "geom_edge_hive", "geom_edge_hive0", 
  "geom_edge_hive2", "geom_edge_link", "geom_edge_link0", "geom_edge_link2", 
  "geom_edge_loop", "geom_edge_loop0")

You can do a lot of fancy things with these geoms, but for a standard network plot you should always stick with geom_edge_link0(), since it simply draws a straight line between the endpoints. Some tools draw curved edges by default. While this may add some artistic value, it reduces readability. Always go with straight lines! If your network has multiple edges between two nodes, then you can switch to geom_edge_parallel().

In case you are wondering what the “0” stands for: The standard geom_edge_link() draws 100 dots on each edge compared to only two dots (the endpoints) in geom_edge_link0(). This is done to allow, e.g., gradients along the edge.

You can reproduce this figure by substituting

`geom_edge_link(aes(edge_alpha = ..index..),edge_colour = "black")`

in the code above.

The drawback of using geom_edge_link() is that the time to render the plot increases and so does the size of the file if you export the plot. Typically, you do not need gradients along an edge. Hence, geom_edge_link0() should be your default to draw edges.

Within geom_edge_link0(), you can specify the appearance of the edge, either by mapping edge attributes to aesthetics or by setting them globally for the graph. Mapping attributes to aesthetics is done within aes(). In the example, we map the edge width to the edge attribute “weight”. ggraph then automatically scales the edge width according to the attribute (you will learn later how to control this). The colour of all edges is globally set to “grey66”.

The following aesthetics can be used within geom_edge_link0(), either within aes() or globally: edge_colour, edge_width, edge_linetype and edge_alpha (the opacity, a value between 0 and 1).

ggraph does not automatically plot arrows if your graph is directed. You need to do this manually using the arrow parameter.

geom_edge_link0(aes(...),..., 
                arrow = arrow(angle = 30, length = unit(0.15, "inches"),
                              ends = "last", type = "closed"))

The default arrowhead type is “open”, yet “closed” usually has a nicer appearance.
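Here is a runnable sketch of a directed plot with arrows. It assumes a small directed ring as toy data. Note one deliberate deviation from the advice above: it uses geom_edge_link() rather than geom_edge_link0(), because the former supports end_cap, which stops each edge short of its target node so that the arrowhead is not hidden behind the node.

```r
library(igraph)
library(ggraph)

# toy data: a directed ring with six nodes
g_dir <- make_ring(6, directed = TRUE)

p <- ggraph(g_dir, layout = "stress") +
  geom_edge_link(
    edge_colour = "grey66",
    # closed arrowhead at the target end of each edge
    arrow = arrow(angle = 30, length = unit(0.15, "inches"),
                  ends = "last", type = "closed"),
    # leave a 3 mm gap before the target node
    end_cap = circle(3, "mm")
  ) +
  geom_node_point(shape = 21, fill = "grey25", size = 4) +
  theme_graph()
p
```

For small graphs, the extra rendering cost of geom_edge_link() is negligible. For large graphs, you may prefer geom_edge_link0() with smaller nodes instead.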

Nodes

geom_node_point(aes(fill = clu,size = size),shape = 21)+
geom_node_text(aes(filter = size >= 26, label = name),family = "serif")

On top of the edge layer, we draw the node layer. Always draw the node layer above the edge layer. Otherwise, edges will be visible on top of nodes. There are slightly fewer geoms available for nodes.

c("geom_node_arc_bar", "geom_node_circle", "geom_node_label", 
"geom_node_point", "geom_node_text", "geom_node_tile", "geom_node_treemap")

The most important ones here are geom_node_point() to draw nodes as simple geometric objects (circles, squares,…) and geom_node_text() to add node labels. You can also use geom_node_label(), but this draws labels within a box.

The mapping of node attributes to aesthetics is similar to edge attributes. In the example code, we map the fill attribute of the node shape to the “clu” attribute, which holds the result of a clustering, and the size of the nodes to the attribute “size”. The shape of the node is globally set to 21.

The figure below shows all possible shapes that can be used for the nodes.

Personally, I prefer “21” since it draws a border around the nodes. If you prefer another shape, say “19”, you have to be aware of several things. To change the colour of shapes 0-20, you need to use the colour parameter. For shapes 21-25, you need to use fill. The colour parameter only controls the border in these cases.

The following aesthetics can be used within geom_node_point(), either within aes() or globally: alpha, colour, fill, shape, size and stroke (the width of the border for shapes 21-25).

For geom_node_text(), there are a lot more options available, but the most important ones are label, colour, family and size.

Note that we also used a filter within aes() of geom_node_text(). The filter parameter allows you to specify a rule for when to apply the aesthetic mappings. The most frequent use case is for node labels (but it can also be used for edges or nodes). In the example, we only display the node label if the size attribute is at least 26.
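The filter mechanism can be tried out without the Game of Thrones data. The sketch below assumes the karate club network from igraph and two made-up node attributes (name and deg); the threshold of 10 is arbitrary.

```r
library(igraph)
library(ggraph)

g <- make_graph("Zachary")
V(g)$name <- paste0("v", seq_len(vcount(g))) # hypothetical labels
V(g)$deg <- degree(g)

p <- ggraph(g, layout = "stress") +
  geom_edge_link0(edge_colour = "grey66") +
  geom_node_point(aes(size = deg), shape = 21, fill = "grey80") +
  # only nodes with degree of at least 10 receive a label
  geom_node_text(aes(filter = deg >= 10, label = name), family = "serif") +
  scale_size(range = c(1, 6)) +
  theme_graph() +
  theme(legend.position = "none")
p
```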

Scales

scale_fill_manual(values = got_palette)+
scale_edge_width_continuous(range = c(0.2,3))+
scale_size_continuous(range = c(1,6))

The scale_* functions are used to control aesthetics that are mapped within aes(). You do not necessarily need to set them, since ggraph can take care of it automatically.

ggraph(gotS1,layout = "stress")+
  geom_edge_link0(aes(edge_width = weight),edge_colour = "grey66")+
  geom_node_point(aes(fill = clu,size = size),shape = 21)+
  geom_node_text(aes(filter = size >= 26, label = name),family="serif")+
  theme_graph()+
  theme(legend.position = "none")

While the node fill and size seem reasonable, the edges are a little too thick. In general, it is always a good idea to add a scale_* for each aesthetic within aes().

What kind of scale_* function you need depends on the aesthetic and on the type of attribute you are mapping. Generally, scale functions are structured like this:
scale_<aes>_<variable type>().

The “aes” part is easy. Just use the type you specified within aes(). For edges, however, you have to prepend edge_. The “variable type” part depends on which scale the attribute is on. Before we continue, it may be a good idea to briefly discuss which aesthetics make sense for which variable type.

aesthetic        | variable type          | notes
-----------------|------------------------|------
node size        | continuous             |
edge width       | continuous             |
node colour/fill | categorical/continuous | use a gradient for continuous variables
edge colour      | continuous             | categorical only if there are different types of edges
node shape       | categorical            | only if there are a few categories (1-5). Colour should be the preferred choice
edge linetype    | categorical            | only if there are a few categories (1-5). Colour should be the preferred choice
node/edge alpha  | continuous             |

The easiest to use scales are those for continuous variables mapped to edge width and node size (also the alpha value, which is not used here). While there are several parameters within scale_edge_width_continuous() and scale_size_continuous(), the most important one is “range” which fixes the minimum and maximum width/size. It usually suffices to adjust this parameter.

For continuous variables that are mapped to node/edge colour, you can use scale_colour_gradient(), scale_colour_gradient2() or scale_colour_gradientn() (add edge_ before colour for edge colours). The difference between these functions is in how the gradient is constructed. gradient creates a two-colour gradient (low-high). Simply specify the two colours to be used (e.g. low = “blue”, high = “red”). gradient2 creates a diverging colour gradient (low-mid-high) (e.g. low = “blue”, mid = “white”, high = “red”) and gradientn creates a gradient from an arbitrary number of colours (specified with the colours parameter).
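As a sketch of how the gradient scales fit together, the example below maps a continuous node attribute to fill and a continuous edge attribute to edge colour. The karate club network and the attribute names btw and ebtw are assumptions for illustration only. Since the nodes use shape 21, the node scale is scale_fill_gradient() rather than scale_colour_gradient().

```r
library(igraph)
library(ggraph)

g <- make_graph("Zachary")
V(g)$btw <- betweenness(g)        # continuous node attribute
E(g)$ebtw <- edge_betweenness(g)  # continuous edge attribute

p <- ggraph(g, layout = "stress") +
  geom_edge_link0(aes(edge_colour = ebtw), edge_width = 0.5) +
  geom_node_point(aes(fill = btw), shape = 21, size = 4) +
  # edge scales carry the edge_ prefix
  scale_edge_colour_gradient(low = "grey80", high = "black") +
  # two-colour gradient for the node fill
  scale_fill_gradient(low = "white", high = "red") +
  theme_graph() +
  theme(legend.position = "bottom")
p
```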

Tip: use the colourpicker addin to easily create your own palette or choose gradient colours. Choosing the right colours for your purposes is important and non-trivial (see this series of blog posts).

For categorical variables that are mapped to node colours (or fill in our example), you can use scale_fill_manual(). This forces you to choose a colour for each category yourself. Simply create a vector of colours (see the got_palette) and pass it to the function with the parameter values.

ggraph then assigns the colours in the order of the unique values of the categorical variable. These are either the factor levels (if the variable is a factor) or the result of sorting the unique values (if the variable is a character).

sort(unique(V(gotS1)$clu))
## [1] "1" "2" "3" "4" "5"
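The pairing rule can be spelled out in base R. This is only an illustration of the ordering described above, with a made-up attribute vector and a three-colour palette. It does not call any ggraph internals.

```r
# hypothetical cluster labels and an unnamed palette
clu <- c("3", "1", "2", "1", "3")
pal <- c("#1A5878", "#C44237", "#AD8941")

# character variable: colours follow the sorted unique values
pairs_chr <- data.frame(value = sort(unique(clu)), colour = pal)
pairs_chr

# factor variable: colours follow the factor levels instead
clu_f <- factor(clu, levels = c("3", "1", "2"))
pairs_fct <- data.frame(value = levels(clu_f), colour = pal)
pairs_fct
```

In the first case, “1” gets the first colour. In the second case, “3” gets it, because it is the first factor level.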

If you want more control over which value is mapped to which colour, you can pass the vector of colours as a named vector.

got_palette2 <- c("5" = "#1A5878","3" = "#C44237","2" = "#AD8941",
                  "1" = "#E99093", "4" = "#50594B")

Using your own colour palette gives your network a unique touch. If you can’t be bothered with choosing colours, you may want to consider scale_fill_brewer() and scale_colour_brewer(). These functions offer all palettes available at colorbrewer2.org.

ggraph(gotS1,layout = "stress")+
  geom_edge_link0(aes(edge_width = weight),edge_colour = "grey66")+
  geom_node_point(aes(fill = clu,size = size),shape = 21)+
  geom_node_text(aes(filter = size >= 26, label = name),family = "serif")+
  scale_fill_brewer(palette = "Dark2")+
  scale_edge_width_continuous(range = c(0.2,3))+
  scale_size_continuous(range = c(1,6))+
  theme_graph()+
  theme(legend.position = "none")

Themes

theme_graph()+
theme(legend.position = "none")

Themes control the overall look of the plot. There are a lot of options within the theme() function of ggplot2. Luckily, we really don’t need any of those. theme_graph() is used to erase all of the default ggplot theme elements (e.g. axes, background, grids, etc.) since they are irrelevant for networks. The only worthwhile option in theme() is legend.position, which we set to “none”, i.e. don’t show the legend.

The code below gives an example for a plot with a legend.

ggraph(gotS1,layout = "stress")+
  geom_edge_link0(aes(edge_width = weight),edge_colour = "grey66")+
  geom_node_point(aes(fill = clu,size = size),shape=21)+
  geom_node_text(aes(filter = size>=26, label = name),family="serif")+
  scale_fill_manual(values = got_palette)+
  scale_edge_width_continuous(range = c(0.2,3))+
  scale_size_continuous(range = c(1,6))+
  theme_graph()+
  theme(legend.position = "bottom")

Another full example

Let us work through one more visualization using a very special data set: the “Grey’s Anatomy” hook-up network.

edges <- read.csv("data/greys-edges.csv", header = FALSE,stringsAsFactors = FALSE)
nodes <- read.csv("data/greys-nodes.csv", header = TRUE,stringsAsFactors = FALSE)

ga <- graph_from_data_frame(d = edges,directed = FALSE,vertices = nodes)
ga
## IGRAPH b8f7f86 UN-- 54 57 -- 
## + attr: name (v/c), sex (v/c), race (v/c), birthyear (v/n),
## | position (v/c), season (v/n), sign (v/c)
## + edges from b8f7f86 (vertex names):
##  [1] Arizona Robbins--Leah Murphy     Alex Karev     --Leah Murphy    
##  [3] Arizona Robbins--Lauren Boswell  Arizona Robbins--Callie Torres  
##  [5] Erica Hahn     --Callie Torres   Alex Karev     --Callie Torres  
##  [7] Mark Sloan     --Callie Torres   George O'Malley--Callie Torres  
##  [9] Izzie Stevens  --George O'Malley Meredith Grey  --George O'Malley
## [11] Denny Duqutte  --Izzie Stevens   Izzie Stevens  --Alex Karev     
## [13] Derek Sheperd  --Meredith Grey   Preston Burke  --Cristina Yang  
## + ... omitted several edges

Start with the qgraph call.

qgraph(ga)

The network consists of several components. Note that the igraph standard is to pack all components in a circle. The standard in graphlayouts is to arrange them in a rectangle. The underlying “bin packing” algorithm has been updated, giving the user a bit more control. You can specify the bbox parameter to arrange the components differently. The plot above arranges all components on one level, but two levels may be desirable. You may need to experiment a bit with the parameter, but for this network, bbox=15 seems to work best (see below).
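To get a feeling for the bbox parameter, you can compare the coordinate ranges of two layouts. The sketch below assumes a toy graph built from five small components; the value 15 is taken from the text above and the object names are made up. How the components end up arranged depends on the graph, so inspect the ranges (or the plot) rather than expecting a specific result.

```r
library(igraph)
library(graphlayouts)

# toy graph with five disconnected components
g <- disjoint_union(
  make_ring(10),
  make_star(8, mode = "undirected"),
  make_full_graph(5),
  make_ring(4),
  make_full_graph(3)
)
components(g)$no

xy_default <- layout_with_stress(g)         # default bounding box
xy_bbox <- layout_with_stress(g, bbox = 15) # bounding box from the text

# compare the horizontal and vertical extent of both layouts
apply(xy_default, 2, range)
apply(xy_bbox, 2, range)
```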

We will use this network to quickly illustrate what can be done with geom_edge_link2(). The function allows you to interpolate node attributes between the start and end node along the edges. In the code below, we use the “position” attribute. The line which adds the node labels illustrates two further features of ggraph. First, aesthetics don’t need to be node attributes. Here, for instance, we calculate the degree and then map it to the font size. The second one is the repel = TRUE argument. This option places the node labels in such a way that they do not overlap.

ggraph(ga,"stress",bbox = 15)+
  geom_edge_link2(aes(edge_colour = node.position),edge_width = 0.5)+
  geom_node_point(aes(fill = sex),shape = 21,size = 3)+
  geom_node_text(aes(label = name,size = degree(ga)),
                 family = "serif",repel = TRUE)+
  scale_edge_colour_brewer(palette = "Set1")+
  scale_fill_manual(values=c("F" = "#EEB422","M" = "#424242","grey66"))+
  scale_size(range=c(2,5),guide = "none")+
  theme_graph()+
  theme(legend.position = "bottom")

While the coloured edges look kind of artistic, we should go back to the “0” version.

ggraph(ga,"stress",bbox = 15)+
  geom_edge_link0(edge_colour = "grey66",edge_width = 0.5)+
  geom_node_point(aes(fill = sex),shape = 21,size = 3)+
  geom_node_text(aes(label = name,size = degree(ga)),
                 family = "serif",repel = TRUE)+
  scale_fill_manual(values=c("F" = "#EEB422","M" = "#424242","grey66"))+
  scale_size(range=c(2,5),guide = "none")+
  theme_graph()+
  theme(legend.position = "bottom")

Miscellaneous

Everything we covered above should be enough to produce nice network visualizations for scientific publications. However, ggraph has a lot more advanced functions/parameter settings to further enhance your visualization. If you are looking for something specific, it is always a good idea to read the documentation of the geoms.

Some things that I frequently use are the following:

The stress layout also works well with medium to large graphs. The network shows the global football competition network between 2016-2018. It consists of ~5000 nodes (clubs) and ~15000 edges (games). Node colour corresponds to the confederation of the club.

snahelper

Even with a lot of experience, it may still be a painful process to produce nice looking figures by writing ggraph code. Enter the snahelper.

install.packages("snahelper")

The snahelper is an RStudio addin which provides you with a GUI to plot networks. Instead of writing code, you simply use drop-down menus to assign attributes to aesthetics or change appearances globally. One great feature of the addin is that you can adjust the position of nodes individually if you are not satisfied with their location. Once you are done, you can either directly export the figure to png or automatically insert the code to produce the figure into your script. That way, you can review the code and hopefully learn something from it. Below is a demo that shows its functionality.

To use the addin, simply highlight the variable name of your network within an R script and choose the SNAhelper from the Addins drop-down menu within RStudio.

Non-standard layouts

While “stress” is the key layout algorithm in graphlayouts, there are other, more specialized layouts that can be used for different purposes. In this part, we work through some examples with concentric layouts and learn how to disentangle extreme “hairball” networks.

Concentric layouts

Circular layouts are generally not advisable. Concentric circles, on the other hand, help to emphasize the position of certain nodes in the network. The graphlayouts package has two functions for concentric layouts, layout_with_focus() and layout_with_centrality().

The first one allows you to focus the network on a specific node and arrange all other nodes in concentric circles (depending on the geodesic distance) around it. Below we focus on the character Ned Stark.

ggraph(gotS1,layout = "focus",focus = 1)+
  geom_edge_link0(aes(edge_width = weight),edge_colour = "grey66")+
  geom_node_point(aes(fill = clu,size=size),shape = 21)+
  geom_node_text(aes(filter = (name == "Ned"),size = size,label = name),
                 family = "serif")+
  scale_edge_width_continuous(range = c(0.2,1.2))+
  scale_size_continuous(range = c(1,5))+
  scale_fill_manual(values = got_palette)+
  coord_fixed()+
  theme_graph()+
  theme(legend.position = "none")

The parameter focus in the first line is used to choose the node id of the focal node. The function coord_fixed() is used to always keep the aspect ratio at one (i.e. the circles are always displayed as a circle and not an ellipse).
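To see how the circles relate to geodesic distance, you can call layout_with_focus() directly and compare each node's distance from the focal node in the plane with its shortest-path distance in the graph. This is a sketch on the karate club network (an assumption, any connected graph should do); the variable names are made up.

```r
library(igraph)
library(graphlayouts)

g <- make_graph("Zachary")
focus <- 1

# the xy element holds the coordinates
xy <- layout_with_focus(g, v = focus)$xy

# geodesic distance of every node to the focal node
geo <- as.vector(distances(g, v = focus))

# Euclidean distance of every node to the focal node in the layout
radius <- sqrt(rowSums(sweep(xy, 2, xy[focus, ])^2))

# radii grouped by geodesic distance: nodes at the same
# distance should share (roughly) the same radius
tapply(round(radius, 2), geo, unique)
```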

The function draw_circle() can be used to add the circles explicitly.

ggraph(gotS1,layout="focus",focus = 1)+
  draw_circle(col = "#00BFFF", use = "focus",max.circle = 3)+
  geom_edge_link0(aes(width = weight),edge_colour = "grey66")+
  geom_node_point(aes(fill = clu,size = size),shape = 21)+
  geom_node_text(aes(filter = (name == "Ned"),size = size,label = name),
                 family = "serif")+
  scale_edge_width_continuous(range = c(0.2,1.2))+
  scale_size_continuous(range = c(1,5))+
  scale_fill_manual(values = got_palette)+
  coord_fixed()+
  theme_graph()+
  theme(legend.position = "none")

layout_with_centrality() works in a similar way. You can specify any centrality index (or numeric vector for that matter), and create a concentric layout where the most central nodes are put in the center and the most peripheral nodes in the biggest circle. The numeric attribute used for the layout is specified with the cent parameter. Here, we use the weighted degree of the characters.

ggraph(gotS1,layout = "centrality",cent = strength(gotS1))+
  geom_edge_link0(aes(edge_width = weight),edge_colour = "grey66")+
  geom_node_point(aes(fill=clu,size=size),shape = 21)+
  geom_node_text(aes(size = size,label = name),family = "serif")+
  scale_edge_width_continuous(range = c(0.2,0.9))+
  scale_size_continuous(range = c(1,8))+
  scale_fill_manual(values = got_palette)+
  coord_fixed()+
  theme_graph()+
  theme(legend.position = "none")

Concentric layouts are not only helpful to focus on specific nodes, but also make for a good tool to visualize collections of ego networks. The data repository contains a set of 32 ego networks, which we can read into R as follows.

ego_files <- list.files("data/egonet/",full.names = TRUE)
egonets <- lapply(ego_files,function(x) read_graph(x,format = "graphml"))
egonets[[1]]
## IGRAPH a963a62 UNW- 21 64 -- 
## + attr: name (v/c), gender (v/c), age (v/c), rank (v/n), id (v/c),
## | weight (e/n)
## + edges from a963a62 (vertex names):
##  [1] 1 --2   1 --8   1 --20  1 --ego 2 --5   2 --9   2 --12  2 --13 
##  [9] 2 --14  2 --19  2 --ego 3 --4   3 --6   3 --11  3 --12  3 --15 
## [17] 3 --18  3 --ego 4 --14  4 --ego 5 --9   5 --12  5 --13  5 --16 
## [25] 5 --ego 6 --17  6 --19  6 --ego 7 --12  7 --13  7 --16  7 --ego
## [33] 8 --10  8 --11  8 --16  8 --18  8 --19  8 --ego 9 --17  9 --18 
## [41] 9 --19  9 --20  9 --ego 10--12  10--ego 11--12  11--15  11--18 
## [49] 11--ego 12--13  12--ego 13--19  13--20  13--ego 14--20  14--ego
## + ... omitted several edges

Each network has three node variables (age, rank and gender) and one edge variable (weight). Our goal is to plot each network with ego in the center and arrange the alters in concentric circles around them, according to their rank (which corresponds to how close ego and alter are). The data is coded such that a rank of 1 means “very close” and 4 means “not close at all”. Since layout_with_centrality() assumes that large values mean “more central”, we need to invert the rank attribute.

ggraph(egonets[[1]],layout = "centrality", cent = 5-V(egonets[[1]])$rank)+
  draw_circle(use = "cent")+
  geom_edge_link0(aes(edge_width = weight),edge_colour = "grey66")+
  geom_node_point(aes(fill = gender),shape = 21,size = 5)+
  scale_edge_width_continuous(range = c(0.2,1.2),guide = "none")+
  scale_fill_manual(values = c("w" = "#EEB422", "m" = "#3D3D3D"))+
  coord_fixed()+
  theme_graph()+
  theme(legend.position = "bottom")
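The inversion 5 - rank used in the cent argument is plain arithmetic. The sketch below applies it to a made-up rank vector and also shows a variant that does not hard-code the maximum rank, which may be handy if your own coding scheme uses a different range.

```r
# hypothetical ranks of six alters (1 = very close, 4 = not close at all)
alter_rank <- c(1, 2, 3, 4, 2, 1)

# inversion as used in the plot: 1 becomes 4, 4 becomes 1
cent_fixed <- 5 - alter_rank
cent_fixed

# same result without hard-coding the largest rank
cent_general <- max(alter_rank) + 1 - alter_rank
identical(cent_fixed, cent_general)
```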

We can easily reuse the code from above for our entire ego network collection that is stored in egonets. We start by turning the ggraph code into a function, which only takes a network as input.

plot_ego <- function(net){
  ggraph(net,layout = "centrality", cent = 5-V(net)$rank)+
  draw_circle(use = "cent")+
  geom_edge_link0(aes(edge_width = weight),edge_colour = "grey66")+
  geom_node_point(aes(fill = gender),shape = 21,size = 5)+
  scale_edge_width_continuous(range = c(0.2,1.2),guide = "none")+
  scale_fill_manual(values = c("w" = "#EEB422", "m" = "#3D3D3D"))+
  coord_fixed()+
  theme_graph()+
  theme(legend.position = "none",plot.margin = margin(0,0,0,0,"cm")) #no margins
}

Using lapply, we apply it to our whole collection.

ego_plots <- lapply(egonets,plot_ego)

To put all networks into one final plot, we can use the patchwork package.

# install.packages("remotes")
# remotes::install_github("thomasp85/patchwork")
library(patchwork)

With the package loaded, it is very easy to put two networks next to each other.

ego_plots[[1]] + ego_plots[[2]]

Looping over the whole list, we can combine all plots into one.

p <- ego_plots[[1]]
for(i in 2:32){
 p <- p + ego_plots[[i]] 
}

The patchwork package has some additional functions which allow us to, e.g., control how many plots are put into each row (i.e. how many columns there should be).

p + plot_layout(ncol = 8)

Using patchwork, you can avoid saving each plot separately and then gluing them together with an image processing program.
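If you prefer to avoid the explicit loop, patchwork also offers wrap_plots(), which accepts a list of plots. The sketch below uses six made-up scatter plots as a stand-in for the ego network plots, so that it runs without the data files, and builds the same grid in two ways.

```r
library(ggplot2)
library(patchwork)

# six toy plots as a stand-in for the list of ego network plots
plots <- lapply(1:6, function(i) {
  ggplot(data.frame(x = 1:3, y = (1:3)^i), aes(x, y)) +
    geom_point() +
    ggtitle(paste("plot", i))
})

# variant 1: add up all plots, then set the number of columns
p_loop <- Reduce(`+`, plots) + plot_layout(ncol = 3)

# variant 2: hand the whole list to wrap_plots()
p_wrap <- wrap_plots(plots, ncol = 3)
p_wrap
```

For the ego networks, the second variant would read wrap_plots(ego_plots, ncol = 8), which also keeps working if the number of networks changes.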

Backbone layout

layout_as_backbone() is a layout algorithm that can help emphasize hidden group structures. To illustrate the performance of the algorithm, we create an artificial network with a subtle group structure using sample_islands() from igraph.

g <- sample_islands(9,40,0.4,15)
g <- simplify(g)
V(g)$grp <- as.character(rep(1:9,each = 40))

The network consists of 9 groups with 40 vertices each. The density within each group is 0.4 and there are 15 edges running between each pair of groups. Let us try to visualize the network with what we have learned so far.

ggraph(g,layout = "stress")+
  geom_edge_link0(edge_colour = "black",edge_width = 0.1, edge_alpha = 0.5)+
  geom_node_point(aes(fill = grp), shape = 21)+
  scale_fill_brewer(palette = "Set1")+
  theme_graph()+
  theme(legend.position = "none")

As you see, the graph seems to be a proper “hairball” without any special structural features standing out. In this case, though, we know that there should be 9 groups of vertices that are internally more densely connected than externally. To uncover this group structure, we turn to the “backbone layout”.

bb <- layout_as_backbone(g,keep = 0.4)
E(g)$col <- FALSE
E(g)$col[bb$backbone] <- TRUE

Technical details can be found in the paper.

The idea of the algorithm is as follows. For each edge, an embeddedness score is calculated which serves as an edge weight attribute. These weights are then ordered and only the edges with the highest scores are kept. The number of edges to keep is controlled with the keep parameter. In our example, we keep the top 40%. The parameter usually requires some experimenting to find out what works best. Since this may result in a disconnected network, we add all edges of the union of all maximum spanning trees. The resulting network is the “backbone” of the original network and the “stress” layout algorithm is applied to this network. Once the layout is calculated, all edges are added back to the network.

The output of the function are the x and y coordinates for nodes and a vector that gives the ids of the edges in the backbone network. In the code above, we use this vector to create a binary edge attribute that indicates if an edge is part of the backbone or not.
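One way to check whether the backbone picks up the planted groups is to cross-tabulate backbone membership against whether an edge runs within a group. The sketch below regenerates the artificial network so that it runs on its own. It assumes that the oaqc package, which layout_as_backbone() relies on, is installed. The helper names (grp, el, within_group, in_backbone) are made up, and the exact counts depend on the random graph.

```r
library(igraph)
library(graphlayouts)

set.seed(1) # sample_islands() is random; fix the seed for repeatability
g <- simplify(sample_islands(9, 40, 0.4, 15))
grp <- rep(1:9, each = 40) # planted group of every vertex

bb <- layout_as_backbone(g, keep = 0.4)

# endpoints of every edge as vertex ids
el <- ends(g, E(g), names = FALSE)

# does an edge connect two vertices of the same group?
within_group <- grp[el[, 1]] == grp[el[, 2]]

# is an edge part of the backbone?
in_backbone <- seq_len(ecount(g)) %in% bb$backbone

table(within_group, in_backbone)
```

If the backbone works as intended, most backbone edges should fall into the within-group row of the table.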

To use the coordinates, we set the layout parameter to “manual” and provide the x and y coordinates as parameters.

ggraph(g,layout = "manual",x = bb$xy[,1],y = bb$xy[,2])+
  geom_edge_link0(aes(edge_colour = col),edge_width = 0.1)+
  geom_node_point(aes(fill = grp),shape = 21)+
  scale_fill_brewer(palette = "Set1")+
  scale_edge_color_manual(values=c(rgb(0,0,0,0.3),rgb(0,0,0,1)))+
  theme_graph()+
  theme(legend.position = "none")

The groups are now clearly visible! Of course the network used in the example is specifically tailored to illustrate the power of the algorithm. Using the backbone layout in real world networks may not always result in such a clear division of groups. It should thus not be seen as a universal remedy for drawing hairball networks. Keep in mind: It can only emphasize a hidden group structure if it exists.

Figure: a Facebook friendship network of a university in the US. Node colour corresponds to the dormitory of students. (left) stress layout and (right) backbone layout.

Further reading

The tutorial “Network Analysis and Visualization with R and igraph” by Katherine Ognyanova (link) comes with in-depth explanations of the built-in plotting function of igraph.

For further help on ggraph see the blog posts on layouts (link), nodes (link) and edges (link) by @thomasp85. Thomas is also the creator of tidygraph and there is also an introductory post on his blog (link).

More details and algorithms of the graphlayouts package can be found on my blog (link1, link2) and on github.