Filtering joins for an ir object

Usage

semi_join.ir(x, y, by = NULL, copy = FALSE, ..., na_matches = c("na", "never"))

anti_join.ir(x, y, by = NULL, copy = FALSE, ..., na_matches = c("na", "never"))

Source

filter-joins

Arguments

x

An object of class ir.

y

A data frame.

by

A character vector of variables to join by.

If NULL, the default, *_join() will perform a natural join, using all variables in common across x and y. A message lists the variables so that you can check they're correct; suppress the message by supplying by explicitly.

To join by different variables on x and y, use a named vector. For example, by = c("a" = "b") will match x$a to y$b.

To join by multiple variables, use a vector with length > 1. For example, by = c("a", "b") will match x$a to y$a and x$b to y$b. Use a named vector to match different variables in x and y. For example, by = c("a" = "b", "c" = "d") will match x$a to y$b and x$c to y$d.

To perform a cross-join, generating all combinations of x and y, use by = character().
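The named-vector forms of by can be sketched on plain data frames. This is a minimal illustration that assumes the dplyr package is installed; the tables x and y and their columns are hypothetical, and it uses ordinary data frames rather than ir objects.

```r
library(dplyr)

# Hypothetical toy tables; column names are made up for illustration.
x <- data.frame(a = c(1, 2, 3), c = c("u", "v", "w"))
y <- data.frame(b = c(2, 3, 4), d = c("v", "z", "w"))

# Match x$a to y$b only: keeps rows of x whose `a` occurs in y$b.
one_key <- semi_join(x, y, by = c("a" = "b"))

# Match x$a to y$b and x$c to y$d: both keys must match
# within the same row of y.
two_keys <- semi_join(x, y, by = c("a" = "b", "c" = "d"))

one_key$a
two_keys$a
```

With one key, the rows with a equal to 2 and 3 are kept; with two keys, only the row (2, "v") has a full match in y.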

copy

If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it.

...

Other parameters passed on to methods.

na_matches

Should NA and NaN values match one another?

The default, "na", treats two NA or NaN values as equal, like %in%, match(), and merge().

Use "never" to always treat two NA or NaN values as different, like joins for database sources, similarly to merge(incomparables = FALSE).
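The effect of na_matches can be sketched with a key column that contains NA. This is a minimal illustration on plain data frames, assuming dplyr is installed; the tables are hypothetical, and the ir methods are assumed to handle this argument the same way as the dplyr data frame methods.

```r
library(dplyr)

# Hypothetical tables with a missing key on both sides.
x <- data.frame(key = c(1, NA, 3))
y <- data.frame(key = c(NA, 3))

# Base R analogue of the default: NA is found in a vector containing NA.
na_in_y <- NA %in% y$key

# Default: the NA row of x matches the NA row of y and is kept.
keep_na <- semi_join(x, y, by = "key", na_matches = "na")

# "never": NA keys never match, so only the row with key 3 is kept.
drop_na <- semi_join(x, y, by = "key", na_matches = "never")

keep_na$key
drop_na$key
```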

Value

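x, filtered to the rows that have (semi_join) or do not have (anti_join) a match in y; no columns from y are added. If the spectra column is renamed, the ir class is dropped. See filter-joins.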
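The filtering behaviour can be sketched on plain data frames. This is a minimal illustration assuming dplyr is installed; the tables are hypothetical, and it does not demonstrate the ir class handling, only that semi_join() and anti_join() split the rows of x and return only the columns of x.

```r
library(dplyr)

# Hypothetical tables: y carries an extra column that is never returned.
x <- data.frame(id = 1:6, value = letters[1:6])
y <- data.frame(id = c(2L, 4L, 9L), extra = c(TRUE, FALSE, TRUE))

matched   <- semi_join(x, y, by = "id")  # rows of x with a match in y
unmatched <- anti_join(x, y, by = "id")  # rows of x without a match in y

# Only the columns of x are present in the result.
names(matched)

# Together, the two results cover every row of x exactly once.
nrow(matched) + nrow(unmatched) == nrow(x)
```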

Examples

## semi_join
set.seed(234)
dplyr::semi_join(
  ir_sample_data,
  tibble::tibble(
    id_measurement = c(1:5, 101:105),
    nitrogen_content = rbeta(n = 10, 0.2, 0.1)
  ),
  by = "id_measurement"
)
#> # A tibble: 5 × 7
#>   id_measurement id_sample sample_type sample_comment              klason_lignin
#> *          <int> <chr>     <chr>       <chr>                       <units>      
#> 1              1 GN 11-389 needles     Abies Firma Momi fir        0.359944     
#> 2              2 GN 11-400 needles     Cupressocyparis leylandii … 0.339405     
#> 3              3 GN 11-407 needles     Juniperus chinensis Chines… 0.267552     
#> 4              4 GN 11-411 needles     Metasequoia glyptostroboid… 0.350016     
#> 5              5 GN 11-416 needles     Pinus strobus Torulosa      0.331100     
#> # … with 2 more variables: holocellulose <units>, spectra <named list>


## anti_join
set.seed(234)
dplyr::anti_join(
  ir_sample_data,
  tibble::tibble(
    id_measurement = c(1:5, 101:105),
    nitrogen_content = rbeta(n = 10, 0.2, 0.1)
  ),
  by = "id_measurement"
)
#> # A tibble: 53 × 7
#>    id_measurement id_sample sample_type sample_comment             klason_lignin
#>  *          <int> <chr>     <chr>       <chr>                      <units>      
#>  1              6 GN 11-419 needles     Pseudolarix amabili Golde… 0.279360     
#>  2              7 GN 11-422 needles     Sequoia sempervirens Cali… 0.329672     
#>  3              8 GN 11-423 needles     Taxodium distichum Cascad… 0.356950     
#>  4              9 GN 11-428 needles     Thuja occidentalis Easter… 0.369360     
#>  5             10 GN 11-434 needles     Tsuga caroliniana Carolin… 0.289050     
#>  6             11 GN 11-435 needles     Picea abies Norway Spruce  0.288000     
#>  7             12 GN 11-460 needles     Pinus taeda Loblolly pine  0.322300     
#>  8             13 HW 07-151 hardwood    Quercus sp. Red oak (from… 0.238095     
#>  9             14 HW 11-137 hardwood    Acer saccharum Sugar maple 0.242592     
#> 10             15 HW 11-144 hardwood    Fraxinus americana White … 0.259224     
#> # … with 43 more rows, and 2 more variables: holocellulose <units>,
#> #   spectra <named list>