Skip to contents

Mutating joins for an ir object

Usage

inner_join.ir(
  x,
  y,
  by = NULL,
  copy = FALSE,
  suffix = c(".x", ".y"),
  ...,
  keep = FALSE,
  na_matches = c("na", "never")
)

left_join.ir(
  x,
  y,
  by = NULL,
  copy = FALSE,
  suffix = c(".x", ".y"),
  ...,
  keep = FALSE,
  na_matches = c("na", "never")
)

right_join.ir(
  x,
  y,
  by = NULL,
  copy = FALSE,
  suffix = c(".x", ".y"),
  ...,
  keep = FALSE,
  na_matches = c("na", "never")
)

full_join.ir(
  x,
  y,
  by = NULL,
  copy = FALSE,
  suffix = c(".x", ".y"),
  ...,
  keep = FALSE,
  na_matches = c("na", "never")
)

Source

mutate-joins

Arguments

x

An object of class ir.

y

A data frame.

by

A character vector of variables to join by.

If NULL, the default, *_join() will perform a natural join, using all variables in common across x and y. A message lists the variables so that you can check they're correct; suppress the message by supplying by explicitly.

To join by different variables on x and y, use a named vector. For example, by = c("a" = "b") will match x$a to y$b.

To join by multiple variables, use a vector with length > 1. For example, by = c("a", "b") will match x$a to y$a and x$b to y$b. Use a named vector to match different variables in x and y. For example, by = c("a" = "b", "c" = "d") will match x$a to y$b and x$c to y$d.

To perform a cross-join, generating all combinations of x and y, use by = character().

copy

If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it.

suffix

If there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2.

...

Other parameters passed onto methods.

keep

Should the join keys from both x and y be preserved in the output?

na_matches

Should NA and NaN values match one another?

The default, "na", treats two NA or NaN values as equal, like %in%, match(), merge().

Use "never" to always treat two NA or NaN values as different, like joins for database sources, similarly to merge(incomparables = FALSE).

Value

x and y joined. If the spectra column is renamed, the ir

class is dropped. See mutate-joins.

Examples

## inner_join
set.seed(234)
dplyr::inner_join(
  ir_sample_data,
  tibble::tibble(
    id_measurement = c(1:5, 101:105),
    nitrogen_content = rbeta(n = 10, 0.2, 0.1)
  ),
  by = "id_measurement"
)
#> # A tibble: 5 × 8
#>   id_measurement id_sample sample_type sample_comment              klason_lignin
#> *          <int> <chr>     <chr>       <chr>                                 [1]
#> 1              1 GN 11-389 needles     Abies Firma Momi fir                0.360
#> 2              2 GN 11-400 needles     Cupressocyparis leylandii …         0.339
#> 3              3 GN 11-407 needles     Juniperus chinensis Chines…         0.268
#> 4              4 GN 11-411 needles     Metasequoia glyptostroboid…         0.350
#> 5              5 GN 11-416 needles     Pinus strobus Torulosa              0.331
#> # … with 3 more variables: holocellulose [1], spectra <named list>,
#> #   nitrogen_content <dbl>


## left_join
set.seed(234)
dplyr::left_join(
  ir_sample_data,
  tibble::tibble(
    id_measurement = c(1:5, 101:105),
    nitrogen_content = rbeta(n = 10, 0.2, 0.1)
  ),
  by = "id_measurement"
)
#> # A tibble: 58 × 8
#>    id_measurement id_sample sample_type sample_comment             klason_lignin
#>  *          <int> <chr>     <chr>       <chr>                                [1]
#>  1              1 GN 11-389 needles     Abies Firma Momi fir               0.360
#>  2              2 GN 11-400 needles     Cupressocyparis leylandii…         0.339
#>  3              3 GN 11-407 needles     Juniperus chinensis Chine…         0.268
#>  4              4 GN 11-411 needles     Metasequoia glyptostroboi…         0.350
#>  5              5 GN 11-416 needles     Pinus strobus Torulosa             0.331
#>  6              6 GN 11-419 needles     Pseudolarix amabili Golde…         0.279
#>  7              7 GN 11-422 needles     Sequoia sempervirens Cali…         0.330
#>  8              8 GN 11-423 needles     Taxodium distichum Cascad…         0.357
#>  9              9 GN 11-428 needles     Thuja occidentalis Easter…         0.369
#> 10             10 GN 11-434 needles     Tsuga caroliniana Carolin…         0.289
#> # … with 48 more rows, and 3 more variables: holocellulose [1],
#> #   spectra <named list>, nitrogen_content <dbl>


## right_join
set.seed(234)
dplyr::right_join(
  ir_sample_data,
  tibble::tibble(
    id_measurement = c(1:5, 101:105),
    nitrogen_content = rbeta(n = 10, 0.2, 0.1)
  ),
  by = "id_measurement"
)
#> # A tibble: 10 × 8
#>    id_measurement id_sample sample_type sample_comment             klason_lignin
#>  *          <int> <chr>     <chr>       <chr>                                [1]
#>  1              1 GN 11-389 needles     Abies Firma Momi fir               0.360
#>  2              2 GN 11-400 needles     Cupressocyparis leylandii…         0.339
#>  3              3 GN 11-407 needles     Juniperus chinensis Chine…         0.268
#>  4              4 GN 11-411 needles     Metasequoia glyptostroboi…         0.350
#>  5              5 GN 11-416 needles     Pinus strobus Torulosa             0.331
#>  6            101 NA        NA          NA                                NA    
#>  7            102 NA        NA          NA                                NA    
#>  8            103 NA        NA          NA                                NA    
#>  9            104 NA        NA          NA                                NA    
#> 10            105 NA        NA          NA                                NA    
#> # … with 3 more variables: holocellulose [1], spectra <named list>,
#> #   nitrogen_content <dbl>


## full_join
set.seed(234)
dplyr::full_join(
  ir_sample_data,
  tibble::tibble(
    id_measurement = c(1:5, 101:105),
    nitrogen_content = rbeta(n = 10, 0.2, 0.1)
  ),
  by = "id_measurement"
)
#> # A tibble: 63 × 8
#>    id_measurement id_sample sample_type sample_comment             klason_lignin
#>  *          <int> <chr>     <chr>       <chr>                                [1]
#>  1              1 GN 11-389 needles     Abies Firma Momi fir               0.360
#>  2              2 GN 11-400 needles     Cupressocyparis leylandii…         0.339
#>  3              3 GN 11-407 needles     Juniperus chinensis Chine…         0.268
#>  4              4 GN 11-411 needles     Metasequoia glyptostroboi…         0.350
#>  5              5 GN 11-416 needles     Pinus strobus Torulosa             0.331
#>  6              6 GN 11-419 needles     Pseudolarix amabili Golde…         0.279
#>  7              7 GN 11-422 needles     Sequoia sempervirens Cali…         0.330
#>  8              8 GN 11-423 needles     Taxodium distichum Cascad…         0.357
#>  9              9 GN 11-428 needles     Thuja occidentalis Easter…         0.369
#> 10             10 GN 11-434 needles     Tsuga caroliniana Carolin…         0.289
#> # … with 53 more rows, and 3 more variables: holocellulose [1],
#> #   spectra <named list>, nitrogen_content <dbl>