Extract the TaxonClassification table for all taxa present in the current dataset. The result is a collected tibble (not a vault_pipe) and is intended as a "pipeline split": the user can inspect, edit, and pass the result back into get_taxa() or get_traits() via their classification_data argument.

Usage

get_classification_table(con = NULL, return_raw_data = FALSE)

Arguments

con

A vault_pipe object created by open_vault(). Must already contain taxon_id in the data, i.e. get_taxa() with classify_to = "original" must have been called earlier in the pipe.

return_raw_data

A logical. If FALSE (default), returns a tibble with both IDs and resolved names: taxon_id, taxon_name, taxon_species, species_name, taxon_genus, genus_name, taxon_family, and family_name. This format is human-readable and can also be fed back as classification_data in get_taxa() or get_traits(). If TRUE, returns a raw tibble with only ID columns taxon_id, taxon_species, taxon_genus, and taxon_family matching the TaxonClassification schema.

Value

When return_raw_data = FALSE (default): a tibble with eight columns (taxon_id, taxon_name, taxon_species, species_name, taxon_genus, genus_name, taxon_family, family_name) restricted to taxa present in the data. Both IDs and resolved names are included, so the tibble can be inspected directly and also passed to the classification_data argument of get_taxa() or get_traits(). When return_raw_data = TRUE: a tibble with columns taxon_id, taxon_species, taxon_genus, and taxon_family restricted to the species-level taxon IDs present in the data.

Details

This function must be called after get_taxa() with classify_to = "original" so that the data contains species-level taxon_id values that match the TaxonClassification table. Calling it after a higher-level classification (e.g. "genus") will yield an empty result because the genus IDs are not primary keys in TaxonClassification.
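The empty result can be illustrated with a small mock. This is only a sketch of the key-matching idea, not the package's actual implementation: it assumes the restriction to taxa present in the data behaves like a join on taxon_id, and all tables and IDs below are made up for illustration.

```r
library(dplyr)

# Hypothetical stand-in for the TaxonClassification table (IDs are made up)
taxon_classification <- tibble::tibble(
  taxon_id      = c(101L, 102L, 103L),
  taxon_species = c(101L, 102L, 103L),
  taxon_genus   = c(11L, 11L, 12L),
  taxon_family  = c(1L, 1L, 1L)
)

# Mock data after classify_to = "original": taxon_id values are keys of the table
data_original <- tibble::tibble(taxon_id = c(101L, 103L))

# Mock data after classify_to = "genus": taxon_id now holds genus IDs
data_genus <- tibble::tibble(taxon_id = c(11L, 12L))

# Keep only classification rows whose taxon_id occurs in the data
matched_original <- semi_join(taxon_classification, data_original, by = "taxon_id")
matched_genus <- semi_join(taxon_classification, data_genus, by = "taxon_id")

nrow(matched_original) # 2
nrow(matched_genus)    # 0
```

In the mock, the genus IDs 11 and 12 appear only in the taxon_genus column and never as taxon_id, so nothing matches.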

Typical workflow:

con_taxa <-
  open_vault(path) |>
  get_datasets() |>
  get_samples() |>
  get_taxa(classify_to = "original")

data_class <-
  get_classification_table(con_taxa, return_raw_data = TRUE)

# inspect / edit data_class, then feed back:
con_taxa |>
  get_taxa(
    classify_to = "genus",
    classification_data = data_class
  )
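
The "inspect / edit" step is ordinary tibble manipulation. The sketch below uses a mock tibble with the eight columns of the default output (return_raw_data = FALSE); all IDs and names are placeholders, and the reassignment itself is hypothetical. Whether the name columns must stay consistent with the ID columns when the table is fed back is not stated here, so the example updates both.

```r
library(dplyr)

# Hypothetical stand-in for the default output (all IDs and names are made up)
data_class <- tibble::tibble(
  taxon_id      = c(101L, 102L),
  taxon_name    = c("Taxon A", "Taxon B"),
  taxon_species = c(101L, 102L),
  species_name  = c("Species A", "Species B"),
  taxon_genus   = c(11L, 12L),
  genus_name    = c("Genus A", "Genus B"),
  taxon_family  = c(1L, 1L),
  family_name   = c("Family A", "Family A")
)

# Inspect: how many taxa fall under each genus
count(data_class, taxon_genus, genus_name)

# Edit: reassign taxon 102 to genus 11, leaving every other row unchanged
data_class_edited <-
  data_class |>
  mutate(
    genus_name  = if_else(taxon_id == 102L, "Genus A", genus_name),
    taxon_genus = if_else(taxon_id == 102L, 11L, taxon_genus)
  )
```

The edited tibble keeps the same columns and row count as the original, so it can take the place of data_class in the classification_data argument shown above.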