Pourquoi Phyloseq?

Why Phyloseq?

1. Commençons! / Let’s start!

Charger les librairies / Load packages

library(phyloseq)
library(vegan)
library(ggplot2)

R version 3.4.3
Phyloseq version 1.22.3
Linux machine image 4.4.0-64-generic

Importer les données / Data import

## Sample data
soiltab <- read.table('data/soil.csv', sep = ';', row.names = 1, header = TRUE)

# Taxonomy table
load("data/taxotab.01.05.rdata")

# ASV table
load("data/seqtab.nochim.rdata")


Construire l’objet Phyloseq / Construct the Phyloseq object

ps <- phyloseq(otu_table(seqtab.nochim, taxa_are_rows=FALSE), 
               sample_data(soiltab), 
               tax_table(as.matrix(taxo$tax)))
ps
## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 546 taxa and 74 samples ]
## sample_data() Sample Data:       [ 74 samples by 8 sample variables ]
## tax_table()   Taxonomy Table:    [ 546 taxa by 7 taxonomic ranks ]


Réordonner les facteurs / Reorder factors

sample_data(ps)$horizon = factor(sample_data(ps)$horizon, levels = c('L', 'F', 'H', 'Ae', 'B'))
sample_data(ps)$forest = factor(sample_data(ps)$forest, levels = c('AS', 'Mixed', 'FG'))


Pour commencer directement avec l’objet Phyloseq / To start directly from Phyloseq object

#load(file = "data/ps.rdata")
#ps


2. Filtrage et transformation / Filtering and transformation

Subset of the dataset with only fungi

ps.fungi = subset_taxa(ps, Kingdom=="k__Fungi")
ntaxa(ps)
## [1] 546
ntaxa(ps.fungi)
## [1] 524

Filtered 22 non-fungal taxa

Enlever les taxons avec seulement 1 ou 2 sequences au total / Remove singletons and doubletons

ps.fungi.nosd = filter_taxa(ps.fungi, function(x) sum(x) > 2, TRUE)
ntaxa(ps.fungi.nosd)
## [1] 523


Transformation log + 1 pour normaliser les données / Shifted log transformation to normalize

ps.fungi.nosd.log <- transform_sample_counts(ps.fungi.nosd, function(x) log(x+1))


Agglomerer les taxons au niveau specifique / Agglomerate taxa at the species level

ps.fungi.nosd.log.sp = tax_glom(ps.fungi.nosd.log, "Species", NArm = FALSE) # NArm = FALSE in order to keep ESV not taxonomically assigned at the species level


Exemple de transmformation / Example of other transformation

ps.fungi.nosd.hel = transform_sample_counts(ps.fungi.nosd, function(x) sqrt(x/sum(x))) # Hellinger transformation


3. Exploration des données / Data exploration

Details de l’objet Phyloseq / Details of the object Phyloseq

ntaxa(ps.fungi.nosd.log)
## [1] 523
nsamples(ps.fungi.nosd.log)
## [1] 74
rank_names(ps.fungi.nosd.log)
## [1] "Kingdom" "Phylum"  "Class"   "Order"   "Family"  "Genus"   "Species"
sample_variables(ps.fungi.nosd.log)
## [1] "sampleID" "plot"     "block"    "forest"   "horizon"  "Carbon"  
## [7] "Nitrogen" "pH"
otu_table(ps.fungi.nosd.log)[1:2, 1:2]
## OTU Table:          [2 taxa and 2 samples]
##                      taxa are columns
##     ATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCCCTTTGGTATTCCGAAGGGCATGCCTGTTTGAGTGTCATTAAATTATCAACCTTGCTCGCTTTTATTAGCTTGAGTTAGGCTTGGATGTGAGGGTTTTGCTGGCTTCCTTCAGTGGATGGTCTGCTCCCTTTGAATGCATTAGCGGGATCTCTTGTGGACCGTCACTTGGTGTGATAATTATCTATGCCTTGAGACTTTGAAACAAACTTATGAGAATCTGCTTATAACCGTCCTCACGGACAACTTTTGACAATTTGACCTCAAATCAGGTAGGACTACCCGCTGAACTTAA
## S10                                                                                                                                                                                                                                                                                                                                                              0.000000
## S11                                                                                                                                                                                                                                                                                                                                                              1.098612
##     ATGCGATACGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACATTGCACTCCTTGGTATTCCGAGGAGTATGCCTGTTTCAGTATCATGAGCACTCTCACACCTAACCTTTGGGTTTATGGCGTGGAATTGGAATGCGCCGACTGTCATGGTTGGCCCTTCTAAAATGTAGTTCTTGGCTGTCACCTAATACAGCAGTTTGGCCTAATAGTTTTGGCATTCATTGTCAAATCTTTGGCTAACATTTGCTCCAGGAGTCAGTCTTGATAATACAGAAAACTCATTCAAATTTTGATCTGAAATCAGGTAGGGCTACCCGCTGAACTTAA
## S10                                                                                                                                                                                                                                                                                                                                                0.000000
## S11                                                                                                                                                                                                                                                                                                                                                2.639057
tax_table(ps.fungi.nosd.log)[1:2, 1:2]
## Taxonomy Table:     [2 taxa by 2 taxonomic ranks]:
##                                                                                                                                                                                                                                                                                                                                                                       Kingdom   
## ATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCCCTTTGGTATTCCGAAGGGCATGCCTGTTTGAGTGTCATTAAATTATCAACCTTGCTCGCTTTTATTAGCTTGAGTTAGGCTTGGATGTGAGGGTTTTGCTGGCTTCCTTCAGTGGATGGTCTGCTCCCTTTGAATGCATTAGCGGGATCTCTTGTGGACCGTCACTTGGTGTGATAATTATCTATGCCTTGAGACTTTGAAACAAACTTATGAGAATCTGCTTATAACCGTCCTCACGGACAACTTTTGACAATTTGACCTCAAATCAGGTAGGACTACCCGCTGAACTTAA "k__Fungi"
## ATGCGATACGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACATTGCACTCCTTGGTATTCCGAGGAGTATGCCTGTTTCAGTATCATGAGCACTCTCACACCTAACCTTTGGGTTTATGGCGTGGAATTGGAATGCGCCGACTGTCATGGTTGGCCCTTCTAAAATGTAGTTCTTGGCTGTCACCTAATACAGCAGTTTGGCCTAATAGTTTTGGCATTCATTGTCAAATCTTTGGCTAACATTTGCTCCAGGAGTCAGTCTTGATAATACAGAAAACTCATTCAAATTTTGATCTGAAATCAGGTAGGGCTACCCGCTGAACTTAA               "k__Fungi"
##                                                                                                                                                                                                                                                                                                                                                                       Phylum            
## ATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCCCTTTGGTATTCCGAAGGGCATGCCTGTTTGAGTGTCATTAAATTATCAACCTTGCTCGCTTTTATTAGCTTGAGTTAGGCTTGGATGTGAGGGTTTTGCTGGCTTCCTTCAGTGGATGGTCTGCTCCCTTTGAATGCATTAGCGGGATCTCTTGTGGACCGTCACTTGGTGTGATAATTATCTATGCCTTGAGACTTTGAAACAAACTTATGAGAATCTGCTTATAACCGTCCTCACGGACAACTTTTGACAATTTGACCTCAAATCAGGTAGGACTACCCGCTGAACTTAA "p__Basidiomycota"
## ATGCGATACGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACATTGCACTCCTTGGTATTCCGAGGAGTATGCCTGTTTCAGTATCATGAGCACTCTCACACCTAACCTTTGGGTTTATGGCGTGGAATTGGAATGCGCCGACTGTCATGGTTGGCCCTTCTAAAATGTAGTTCTTGGCTGTCACCTAATACAGCAGTTTGGCCTAATAGTTTTGGCATTCATTGTCAAATCTTTGGCTAACATTTGCTCCAGGAGTCAGTCTTGATAATACAGAAAACTCATTCAAATTTTGATCTGAAATCAGGTAGGGCTACCCGCTGAACTTAA               "p__Mucoromycota"


CHALLENGE

Pouvez-vous trouvez le nombre de phylum de champignons?
Can you find the number of fungal phylum?

Profondeur de sequencage après filtrage de la canalisation Dada2 / Sequencing depth per samples after filtering of Dada2 pipeline

plot_bar(ps, x = "Sample", y = "Abundance", fill = "Kingdom") +
  geom_bar(stat="identity", color = "black") +
  labs(x = "Sample ID", y = "Reads abundance")

plot_bar(ps.fungi,  x = "Sample", y = "Abundance", fill = "Kingdom") +
  geom_bar(stat="identity", color = "black") +
  labs(x = "Sample ID", y = "Reads abundance of fungi")