Merging taxa in assemblage data

One possible reason for the impossible percent values I’ve found in assemblage data is that taxa have been merged in Excel after percentages were calculated. Doing anything in Excel invites disaster; if nothing else, it is very difficult to check what has been done.

Merging and renaming taxa is an almost inevitable step in the workflow for processing community and assemblage data. We need a reproducible method: here I show how it can be done with R.

I’m going to assume that the assemblage data are in wide format (one column per taxon) and that there are metadata (depths, ages, etc.) in one or more columns. If the metadata are in the rownames (which is very convenient for the ‘rioja’ and ‘vegan’ packages, less so for ‘dplyr’ as tibbles don’t have rownames), they can be moved into a column with rownames_to_column.
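As a minimal sketch of that last step, assuming a hypothetical data.frame called counts with depths stored as rownames (the object and its values are invented for illustration), tibble::rownames_to_column moves them into an ordinary column:

```r
library("tibble")

# Hypothetical counts with depths stored as row names (illustration only)
counts <- data.frame(
  sp_A = c(4, 4, 5),
  sp_b = c(8, 3, 8),
  row.names = c("1", "2", "3")
)

# Move the row names into an ordinary column called depth_cm
counts_col <- rownames_to_column(counts, var = "depth_cm")

# Row names are always character, so convert numeric metadata back
counts_col$depth_cm <- as.numeric(counts_col$depth_cm)
counts_col
```

The conversion back to numeric matters if depth is later used for sorting or plotting.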

Here is a small artificial assemblage dataset.

library("tidyverse")
set.seed(1)
# tibble() replaces the deprecated data_frame()
spp <- tibble(
  depth_cm = 1:3,
  sp_A = rpois(3, 5),
  sp_b = rpois(3, 5),
  sp.C = rpois(3, 5),
  sp_D = rpois(3, 5))
spp_save <- spp # keep copy for later
spp

## # A tibble: 3 x 5
##   depth_cm  sp_A  sp_b  sp.C  sp_D
##      <int> <int> <int> <int> <int>
## 1        1     4     8     9     2
## 2        2     4     3     6     3
## 3        3     5     8     6     3

If we just want to rename a couple of taxa, the simplest solution is to use rename, where we set new_name = old_name. rename can take pairs of new and old names, separated by commas.

spp %>% rename(sp_B = sp_b, sp_C = sp.C)

## # A tibble: 3 x 5
##   depth_cm  sp_A  sp_B  sp_C  sp_D
##      <int> <int> <int> <int> <int>
## 1        1     4     8     9     2
## 2        2     4     3     6     3
## 3        3     5     8     6     3

If there are many names that need altering, or we need to make the same changes to multiple data.frames, we need a different solution as rename gets tedious.

I like to make a data.frame of the old and new names and then use plyr::mapvalues to change the old names into the new ones. (plyr is a useful package, but it has several conflicts with dplyr, so it is safer to use the :: notation than to load it.)

changes <- read.csv(stringsAsFactors = FALSE, text =
"old, new
sp_b, sp_B
sp.C, sp_C", strip.white = TRUE) # this can go in a separate file

names(spp) <- plyr::mapvalues(names(spp), from = changes$old, to = changes$new)
spp

## # A tibble: 3 x 5
##   depth_cm  sp_A  sp_B  sp_C  sp_D
##      <int> <int> <int> <int> <int>
## 1        1     4     8     9     2
## 2        2     4     3     6     3
## 3        3     5     8     6     3
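If you would rather stay within dplyr, a possible alternative (a sketch, assuming dplyr 1.0.0 or later) is to pass rename a named vector wrapped in all_of. The objects spp_demo and lookup are names I have made up for this example; spp_demo simply retypes the artificial data.

```r
library("dplyr")

# The artificial data typed in directly so the example stands alone
spp_demo <- tibble(
  depth_cm = 1:3,
  sp_A = c(4, 4, 5),
  sp_b = c(8, 3, 8),
  sp.C = c(9, 6, 6),
  sp_D = c(2, 3, 3)
)

# Named vector in the form new_name = "old_name" (same order as rename)
lookup <- c(sp_B = "sp_b", sp_C = "sp.C")

# all_of() throws an error if an old name is missing from the data;
# any_of() would silently skip missing names instead
renamed <- spp_demo %>% rename(all_of(lookup))
names(renamed)
```

The same lookup vector can then be reused on every data.frame that needs the same changes.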

Merging taxa is possible in the wide format, but much easier in a thin format. We can convert from a wide format to a thin format with gather, and back with spread.

spp <- spp_save # original version

spp_thin <- spp %>%
  gather(key = taxa, value = count, -depth_cm) # don't gather depth_cm
spp_thin

## # A tibble: 12 x 3
##    depth_cm taxa  count
##       <int> <chr> <int>
##  1        1 sp_A      4
##  2        2 sp_A      4
##  3        3 sp_A      5
##  4        1 sp_b      8
##  5        2 sp_b      3
##  6        3 sp_b      8
##  7        1 sp.C      9
##  8        2 sp.C      6
##  9        3 sp.C      6
## 10        1 sp_D      2
## 11        2 sp_D      3
## 12        3 sp_D      3
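In recent versions of tidyr (1.0.0 and later), gather and spread are superseded by pivot_longer and pivot_wider. The following is a sketch of the equivalent round trip, assuming that version; spp_demo is a made-up name for a retyped copy of the artificial data. Note that pivot_longer orders the rows by depth rather than by taxon, which makes no difference to the merging steps.

```r
library("dplyr")
library("tidyr")

# The artificial data typed in directly so the example stands alone
spp_demo <- tibble(
  depth_cm = 1:3,
  sp_A = c(4L, 4L, 5L),
  sp_b = c(8L, 3L, 8L),
  sp.C = c(9L, 6L, 6L),
  sp_D = c(2L, 3L, 3L)
)

# Wide to thin: every column except depth_cm becomes a taxa/count pair
spp_long <- spp_demo %>%
  pivot_longer(cols = -depth_cm, names_to = "taxa", values_to = "count")

# Thin back to wide: one column per taxon again
spp_wide <- spp_long %>%
  pivot_wider(names_from = taxa, values_from = count)
spp_wide
```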

If there are just a few taxa that need merging, we can use recode within mutate, followed by summarise. Note that, in contrast with rename, recode expects “old_name” = “new_name”.

spp_thin %>%
  mutate(taxa = recode(taxa, "sp.C" = "sp_D")) %>%
  group_by(depth_cm, taxa) %>%
  summarise(count = sum(count)) %>%
  spread(key = taxa, value = count)

## # A tibble: 3 x 4
## # Groups:   depth_cm [3]
##   depth_cm  sp_A  sp_b  sp_D
## *    <int> <int> <int> <int>
## 1        1     4     8    11
## 2        2     4     3     9
## 3        3     5     8     9

If there are many taxa that need merging (or some that need merging and some renaming) we can use mapvalues again.

changes <- read.csv(stringsAsFactors = FALSE, text =
"old, new
sp_b, sp_B
sp.C, sp_D", strip.white = TRUE) # this can go in a separate file

spp_thin %>%
  mutate(taxa = plyr::mapvalues(taxa, from = changes$old, to = changes$new)) %>%
  group_by(depth_cm, taxa) %>%
  summarise(count = sum(count)) %>%
  spread(key = taxa, value = count)

## # A tibble: 3 x 4
## # Groups:   depth_cm [3]
##   depth_cm  sp_A  sp_B  sp_D
## *    <int> <int> <int> <int>
## 1        1     4     8    11
## 2        2     4     3     9
## 3        3     5     8     9
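A table of old and new names can also be fed to recode without plyr by splicing a named vector with !!!. This is only a sketch: changes_demo, taxa_demo and lookup are names invented for the example, and recode is marked as superseded in dplyr 1.1.0 and later, although it still works.

```r
library("dplyr")

# Hypothetical lookup table with the same layout as the changes table
changes_demo <- data.frame(
  old = c("sp_b", "sp.C"),
  new = c("sp_B", "sp_D"),
  stringsAsFactors = FALSE
)

# Taxon names as they would appear in the taxa column of the thin data
taxa_demo <- c("sp_A", "sp_b", "sp.C", "sp_D")

# recode wants old = "new", so the old names become the vector names
lookup <- setNames(changes_demo$new, changes_demo$old)

# !!! splices the named vector into recode as individual arguments;
# names without a match are left unchanged
recoded <- recode(taxa_demo, !!!lookup)
recoded
```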

This can also be done with a left_join.

spp2 <- spp_thin %>%
  left_join(changes, by = c("taxa" = "old")) %>%
  mutate(taxa = coalesce(new, taxa)) %>% # takes original name if no new one
  select(-new) %>%
  group_by(depth_cm, taxa) %>%
  summarise(count = sum(count)) %>%
  spread(key = taxa, value = count)
spp2

## # A tibble: 3 x 4
## # Groups:   depth_cm [3]
##   depth_cm  sp_A  sp_B  sp_D
## *    <int> <int> <int> <int>
## 1        1     4     8    11
## 2        2     4     3     9
## 3        3     5     8     9
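Because the whole point is to be able to check what has been done, it may be worth adding a few sanity checks around the join. The sketch below uses made-up objects (spp_thin_demo, changes_demo, and a deliberately misspelt taxon sp_X) and assumes dplyr 1.0.0 or later for the .groups argument. It flags old names that match nothing, duplicated old names (which would duplicate rows in the join), and confirms that the total count at each depth is unchanged by merging.

```r
library("dplyr")

# Thin version of the artificial data, typed in so the example stands alone
spp_thin_demo <- tibble(
  depth_cm = rep(1:3, times = 4),
  taxa = rep(c("sp_A", "sp_b", "sp.C", "sp_D"), each = 3),
  count = c(4, 4, 5, 8, 3, 8, 9, 6, 6, 2, 3, 3)
)

# Hypothetical changes table; sp_X is a deliberate typo that matches nothing
changes_demo <- tibble(
  old = c("sp_b", "sp.C", "sp_X"),
  new = c("sp_B", "sp_D", "sp_Y")
)

# Old names that are absent from the data are probably typos
unmatched <- setdiff(changes_demo$old, unique(spp_thin_demo$taxa))
unmatched

# A duplicated old name would duplicate rows in the join and inflate counts
stopifnot(anyDuplicated(changes_demo$old) == 0)

merged <- spp_thin_demo %>%
  left_join(changes_demo, by = c("taxa" = "old")) %>%
  mutate(taxa = coalesce(new, taxa)) %>%
  group_by(depth_cm, taxa) %>%
  summarise(count = sum(count), .groups = "drop")

# Merging must not change the total count at any depth
totals_before <- tapply(spp_thin_demo$count, spp_thin_demo$depth_cm, sum)
totals_after <- tapply(merged$count, merged$depth_cm, sum)
stopifnot(all(totals_before == totals_after))
```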

Now the data are ready for further analysis. Remember that some functions will want you to remove the metadata first. For example:

vegan::cca(select(ungroup(spp2), -depth_cm)) # ungroup first, otherwise select() keeps the grouping column depth_cm
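Returning to the problem that motivated this post: percentages should be calculated only after the taxa have been merged. A minimal sketch, using a made-up thin data set merged_demo that holds the merged counts from above, computes percentages within each depth and checks that they sum to 100.

```r
library("dplyr")

# Merged counts from the examples above, retyped in thin format
merged_demo <- tibble(
  depth_cm = rep(1:3, each = 3),
  taxa = rep(c("sp_A", "sp_B", "sp_D"), times = 3),
  count = c(4, 8, 11, 4, 3, 9, 5, 8, 9)
)

# Percentages are calculated within each depth, after merging
spp_percent <- merged_demo %>%
  group_by(depth_cm) %>%
  mutate(percent = count / sum(count) * 100) %>%
  ungroup()

# Each depth should sum to 100 (within floating point error)
percent_totals <- spp_percent %>%
  group_by(depth_cm) %>%
  summarise(total = sum(percent))
stopifnot(all(abs(percent_totals$total - 100) < 1e-8))
percent_totals
```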
