pivot_wider {tidyr}    R Documentation

Pivot data from long to wide

Description

Maturing lifecycle

pivot_wider() "widens" data, increasing the number of columns and decreasing the number of rows. The inverse transformation is pivot_longer().

Learn more in vignette("pivot").

Usage

pivot_wider(data, id_cols = NULL, names_from = name,
  names_prefix = "", names_sep = "_", names_repair = "check_unique",
  values_from = value, values_fill = NULL, values_fn = NULL)

Arguments

data

A data frame to pivot.

id_cols

A set of columns that uniquely identifies each observation. Defaults to all columns in data except for the columns specified in names_from and values_from. Typically used when you have additional variables that are directly related to the identifying columns.

names_from, values_from

A pair of arguments describing which column (or columns) to get the name of the output column from (names_from), and which column (or columns) to get the cell values from (values_from).

If values_from contains multiple columns, the name of each value column will be added to the front of the output column name.

names_prefix

String added to the start of every variable name. This is particularly useful if names_from is a numeric vector and you want to create syntactic variable names.

names_sep

If names_from or values_from contains multiple variables, this will be used to join their values together into a single string to use as a column name.

names_repair

What happens if the output has invalid column names? The default, "check_unique", is to error if any column names are duplicated. Use "minimal" to allow duplicates in the output, or "unique" to de-duplicate by adding numeric suffixes. See vctrs::vec_as_names() for more options.

values_fill

Optionally, a named list specifying what each value should be filled in with when missing.

values_fn

Optionally, a named list providing a function that will be applied to the value in each cell in the output. You will typically use this when the combination of id_cols and names_from columns does not uniquely identify an observation.

Details

pivot_wider() is an updated approach to spread(), designed to be both simpler to use and to handle more use cases. We recommend you use pivot_wider() for new code; spread() isn't going away but is no longer under active development.

See Also

pivot_wider_spec() to pivot "by hand" with a data frame that defines a pivoting specification.

Examples

# See vignette("pivot") for examples and explanation

fish_encounters
fish_encounters %>%
  pivot_wider(names_from = station, values_from = seen)
# Fill in missing values
fish_encounters %>%
  pivot_wider(
    names_from = station,
    values_from = seen,
    values_fill = list(seen = 0)
  )

# Generate column names from multiple variables
us_rent_income %>%
  pivot_wider(names_from = variable, values_from = c(estimate, moe))
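
# When names_from has several columns, names_sep joins their values.
# A minimal sketch: `counts` and its columns are made-up illustration
# data, not a dataset shipped with tidyr.
counts <- tibble(
  id = c(1, 1, 1, 1),
  sex = c("f", "f", "m", "m"),
  age = c("young", "old", "young", "old"),
  n = c(3, 4, 5, 6)
)
counts %>%
  pivot_wider(names_from = c(sex, age), values_from = n, names_sep = ".")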

# Can perform aggregation with values_fn
warpbreaks <- as_tibble(warpbreaks[c("wool", "tension", "breaks")])
warpbreaks
warpbreaks %>%
  pivot_wider(
    names_from = wool,
    values_from = breaks,
    values_fn = list(breaks = mean)
  )
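
# names_prefix helps when names_from is numeric and you want syntactic
# column names. A minimal sketch: `sales` and its columns are made-up
# illustration data, not a dataset shipped with tidyr.
sales <- tibble(
  region = c("east", "east", "west", "west"),
  year = c(2018, 2019, 2018, 2019),
  total = c(10, 12, 7, 9)
)
sales %>%
  pivot_wider(names_from = year, values_from = total, names_prefix = "yr_")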

[Package tidyr version 1.0.0 Index]