The tab_spanner() function

Let’s create a gt table using a small portion of the gtcars dataset. Over several columns (hp, hp_rpm, trq, trq_rpm, mpg_c, mpg_h) we’ll use tab_spanner() to add a spanner with the label "performance". This effectively groups together several columns related to car performance under a unifying label.

gtcars |>
  dplyr::select(
    -mfr, -trim, bdy_style,
    -drivetrain, -trsmn, -ctry_origin
  ) |>
  dplyr::slice(1:8) |>
  gt(rowname_col = "model") |>
  tab_spanner(
    label = "performance",
    columns = c(
      hp, hp_rpm, trq, trq_rpm, mpg_c, mpg_h
    )
  )
                                 ------------- performance -------------
              year  bdy_style    hp   hp_rpm  trq  trq_rpm  mpg_c  mpg_h  msrp
GT            2017  coupe        647  6250    550  5900     11     18     447000
458 Speciale  2015  coupe        597  9000    398  6000     13     17     291744
458 Spider    2015  convertible  562  9000    398  6000     13     17     263553
458 Italia    2014  coupe        562  9000    398  6000     13     17     233509
488 GTB       2016  coupe        661  8000    561  3000     15     22     245400
California    2015  convertible  553  7500    557  4750     16     23     198973
GTC4Lusso     2017  coupe        680  8250    514  5750     12     17     298000
FF            2015  coupe        652  8000    504  6000     11     16     295000

With the default gather = TRUE option, columns selected for a particular spanner will be moved so that there is no separation between them. This can be seen with the example below that uses a subset of the towny dataset. The starting column order is name, latitude, longitude, population_2016, density_2016, population_2021, and density_2021. The first two uses of tab_spanner() deal with making separate spanners for the two population and two density columns. After their use, the columns are moved to this new ordering: name, latitude, longitude, population_2016, population_2021, density_2016, and density_2021. The third and final call of tab_spanner() doesn’t further affect the ordering of columns.

towny |>
  dplyr::slice_max(population_2021, n = 5) |>
  dplyr::select(
    name, latitude, longitude,
    ends_with("2016"), ends_with("2021")
  ) |>
  gt() |>
  tab_spanner(
    label = "Population",
    columns = starts_with("pop")
  ) |>
  tab_spanner(
    label = "Density",
    columns = starts_with("den")
  ) |>
  tab_spanner(
    label = md("*Location*"),
    columns = ends_with("itude"),
    id = "loc"
  )
             ----- Location ----  ---------- Population ----------  --------- Density --------
name         latitude  longitude  population_2016  population_2021  density_2016  density_2021
Toronto      43.74167  -79.37333  2731571          2794356          4328.27       4427.75
Ottawa       45.42472  -75.69500  934243           1017449          335.07        364.91
Mississauga  43.60000  -79.65000  721599           717961           2464.98       2452.56
Brampton     43.68833  -79.76083  593638           656480           2232.65       2468.99
Hamilton     43.25667  -79.86917  536917           569353           480.11        509.12

While columns are moved, it is only the minimal amount of moving required (pulling in columns from the right) to ensure that columns are gathered under the appropriate spanners. With the last call, there are two more things to note: (1) label values can use the md() (or html()) helper functions to help create styled text, and (2) an id value may be supplied for reference later (e.g., for styling with tab_style() or applying footnotes with tab_footnote()).
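As a minimal sketch of point (2), the spanner ID can be passed to cells_column_spanners() to target that spanner in tab_style() or tab_footnote(). The table below is a trimmed-down variant of the one above, keeping only the "loc" spanner. The bold styling and the footnote wording are illustrative choices, not part of the original example.

```r
library(gt)

# Trimmed-down version of the towny table: only the location columns are kept
loc_tbl <-
  towny |>
  dplyr::slice_max(population_2021, n = 5) |>
  dplyr::select(name, latitude, longitude) |>
  gt() |>
  tab_spanner(
    label = md("*Location*"),
    columns = ends_with("itude"),
    id = "loc"
  ) |>
  # Target the spanner through its ID rather than its label
  tab_style(
    style = cell_text(weight = "bold"),
    locations = cells_column_spanners(spanners = "loc")
  ) |>
  # Illustrative footnote text attached to the same spanner
  tab_footnote(
    footnote = "Coordinates are given in decimal degrees.",
    locations = cells_column_spanners(spanners = "loc")
  )

loc_tbl
```

Referring to the ID is convenient here because the label is an md() object rather than a plain string.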

It’s possible to stack multiple spanners atop each other with consecutive calls of tab_spanner(). It’s a bit like playing Tetris: putting a spanner down anywhere there is another spanner (i.e., there are one or more shared columns) means that the second spanner will reside a level above the prior one. Let’s look at a few examples to see how this works, and we’ll also explore a few lesser-known placement tricks. We’ll use a cut-down version of exibble for this, set up a few level-1 spanners, and then place a level-2 spanner over two other spanners.

exibble_narrow <- exibble |> dplyr::slice_head(n = 3)

exibble_narrow |>
  gt() |>
  tab_spanner(
    label = "Row Information",
    columns = c(row, group)
  ) |>
  tab_spanner(
    label = "Numeric Values",
    columns = where(is.numeric),
    id = "num_spanner"
  ) |>
  tab_spanner(
    label = "Text Values",
    columns = c(char, fctr),
    id = "text_spanner"
  ) |>
  tab_spanner(
    label = "Numbers and Text",
    spanners = c("num_spanner", "text_spanner")
  )
----------- Numbers and Text ----------
-- Numeric Values --  -- Text Values --                                       -- Row Information --
num       currency    char      fctr     date        time   datetime          row       group
0.1111    49.95       apricot   one      2015-01-15  13:35  2018-01-01 02:22  row_1      grp_a
2.2220    17.95       banana    two      2015-02-15  14:40  2018-02-02 14:33  row_2      grp_a
33.3300   1.39        coconut   three    2015-03-15  15:45  2018-03-03 03:44  row_3      grp_a

In the above example, we used the spanners argument to define where the "Numbers and Text"-labeled spanner should reside. For that, we supplied the "num_spanner" and "text_spanner" ID values for the two spanners associated with the num, currency, char, and fctr columns. Alternatively, we could have given those column names to the columns argument and achieved the same result. You could actually use a combination of spanners and columns to define where the spanner should be placed. Here is an example of just that:

exibble_narrow_gt <-
  exibble_narrow |>
  gt() |>
  tab_spanner(
    label = "Numeric Values",
    columns = where(is.numeric),
    id = "num_spanner"
  ) |>
  tab_spanner(
    label = "Text Values",
    columns = c(char, fctr),
    id = "text_spanner"
  ) |>
  tab_spanner(
    label = "Text, Dates, Times, Datetimes",
    columns = contains(c("date", "time")),
    spanners = "text_spanner"
  )

exibble_narrow_gt
                      ------------ Text, Dates, Times, Datetimes -----------
-- Numeric Values --  -- Text Values --
num       currency    char      fctr     date        time   datetime          row    group
0.1111    49.95       apricot   one      2015-01-15  13:35  2018-01-01 02:22  row_1  grp_a
2.2220    17.95       banana    two      2015-02-15  14:40  2018-02-02 14:33  row_2  grp_a
33.3300   1.39        coconut   three    2015-03-15  15:45  2018-03-03 03:44  row_3  grp_a

And, again, we could have solely supplied all of the column names to columns instead of using this hybrid approach, but it is interesting to express the definition of spanners with this flexible combination.
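For comparison, here is a sketch of the columns-only form of the same table. The object name columns_only_gt is introduced here purely for illustration. The only change from the hybrid version is that the char and fctr columns are named directly instead of being reached through "text_spanner".

```r
library(gt)

exibble_narrow <- exibble |> dplyr::slice_head(n = 3)

columns_only_gt <-
  exibble_narrow |>
  gt() |>
  tab_spanner(
    label = "Numeric Values",
    columns = where(is.numeric),
    id = "num_spanner"
  ) |>
  tab_spanner(
    label = "Text Values",
    columns = c(char, fctr),
    id = "text_spanner"
  ) |>
  # Same placement as before, expressed only through column names
  tab_spanner(
    label = "Text, Dates, Times, Datetimes",
    columns = c(char, fctr, date, time, datetime)
  )

columns_only_gt
```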

What if you wanted to extend the above example and place a spanner above the date, time, and datetime columns? If you try that in the manner shown above, the spanner will be placed in the third level of spanners:

exibble_narrow_gt |>
  tab_spanner(
    label = "Date and Time Columns",
    columns = contains(c("date", "time")),
    id = "date_time_spanner"
  )
                                         ------ Date and Time Columns ------
                      ------------ Text, Dates, Times, Datetimes -----------
-- Numeric Values --  -- Text Values --
num       currency    char      fctr     date        time   datetime          row    group
0.1111    49.95       apricot   one      2015-01-15  13:35  2018-01-01 02:22  row_1  grp_a
2.2220    17.95       banana    two      2015-02-15  14:40  2018-02-02 14:33  row_2  grp_a
33.3300   1.39        coconut   three    2015-03-15  15:45  2018-03-03 03:44  row_3  grp_a

Remember that the approach taken by tab_spanner() is to keep stacking atop existing spanners. But there is space next to the "Text Values" spanner on the first level. You can either revise the order of the tab_spanner() calls or use the level argument to force the spanner into that level (so long as there is space).

exibble_narrow_gt |>
  tab_spanner(
    label = "Date and Time Columns",
    columns = contains(c("date", "time")),
    level = 1,
    id = "date_time_spanner"
  )
                      ------------ Text, Dates, Times, Datetimes -----------
-- Numeric Values --  -- Text Values --  ------ Date and Time Columns ------
num       currency    char      fctr     date        datetime          time   row    group
0.1111    49.95       apricot   one      2015-01-15  2018-01-01 02:22  13:35  row_1  grp_a
2.2220    17.95       banana    two      2015-02-15  2018-02-02 14:33  14:40  row_2  grp_a
33.3300   1.39        coconut   three    2015-03-15  2018-03-03 03:44  15:45  row_3  grp_a

That puts the spanner in the intended level. If there aren’t free locations available in the level specified you’ll get an error stating which columns cannot be used for the new spanner (this can be circumvented, if necessary, with the replace = TRUE option). If you choose a level higher than the maximum occupied, then the spanner will be dropped down. Again, these behaviors are indicative of Tetris-like rules which tend to work well for the application of spanners.
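The occupied-level case can be sketched as follows. This is an illustration under assumptions rather than a definitive account: the table, the "Text and Dates" label, and the object names (base_gt, attempt, replaced_gt) are made up for this example, and how the displaced "Text Values" spanner ends up being rendered is not asserted here.

```r
library(gt)

exibble_narrow <- exibble |> dplyr::slice_head(n = 3)

# A base table with one level-1 spanner over the `char` and `fctr` columns
base_gt <-
  exibble_narrow |>
  gt() |>
  tab_spanner(
    label = "Text Values",
    columns = c(char, fctr),
    id = "text_spanner"
  )

# Level 1 is already occupied above `char` and `fctr`, so this call is
# expected to fail; tryCatch() captures the error instead of stopping
attempt <- tryCatch(
  base_gt |>
    tab_spanner(
      label = "Text and Dates",
      columns = c(char, fctr, date),
      level = 1
    ),
  error = function(e) e
)

# Inspect the captured message to see which columns were rejected
if (inherits(attempt, "error")) message(conditionMessage(attempt))

# With `replace = TRUE`, the new spanner may take over the occupied columns
replaced_gt <-
  base_gt |>
  tab_spanner(
    label = "Text and Dates",
    columns = c(char, fctr, date),
    level = 1,
    replace = TRUE
  )

replaced_gt
```

Reserving replace = TRUE for deliberate overrides keeps the default error useful as a guard against accidental overlaps.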

Using a subset of the towny dataset, we can create an interesting gt table. First, only certain columns are selected from the dataset, some filtering of rows is done, rows are sorted, and then only the first 10 rows are kept. After the data is introduced to gt(), we then apply some spanner labels using two calls of tab_spanner(). In the second of those, we incorporate unit notation text (within "{{"/"}}") in the label to get a display of nicely formatted units.

towny |>
  dplyr::select(
    name, ends_with(c("2001", "2006")), matches("2001_2006")
  ) |>
  dplyr::filter(population_2001 > 100000) |>
  dplyr::slice_max(pop_change_2001_2006_pct, n = 10) |>
  gt() |>
  fmt_integer() |>
  fmt_percent(columns = matches("change"), decimals = 1) |>
  tab_spanner(
    label = "Population",
    columns = starts_with("population")
  ) |>
  tab_spanner(
    label = "Density, {{*persons* km^-2}}",
    columns = starts_with("density")
  ) |>
  cols_label(
    ends_with("01") ~ "2001",
    ends_with("06") ~ "2006",
    matches("change") ~ md("Population Change,<br>2001 to 2006")
  ) |>
  cols_width(everything() ~ px(120))
               -- Population --  -- Density, persons km⁻² --  Population Change,
name           2001     2006     2001           2006          2001 to 2006
Brampton       325,428  433,806  1,224          1,632         33.3%
Vaughan        182,022  238,866  668            877           31.2%
Markham        208,615  261,573  989            1,240         25.4%
Barrie         103,710  128,430  1,047          1,297         23.8%
Richmond Hill  132,030  162,704  1,310          1,614         23.2%
Oakville       144,738  165,613  1,042          1,192         14.4%
Mississauga    612,925  668,599  2,094          2,284         9.1%
Cambridge      110,372  120,371  977            1,065         9.1%
Burlington     150,836  164,415  810            883           9.0%
Guelph         106,170  114,943  1,214          1,315         8.3%