Skip to content

Simulating a BEF relationship with/without phylogenetic signals, followed by evaluating different analytical methods based on type-I/type-II errors, estimated phylogenetic signals, and R2.

Usage

BEF_simulate(comm,
             CV_sp=NULL,
             nspp=15,
             nsite=50,
             min_richness=1,
             max_richness=4,
             spaMM_formula,
             b0=0,
             b1=0,
             signals_X="phy_cor",
             noise_mean=0,
             noise_sd=1,
             non_phy_cor_mean=0,
             non_phy_cor_sd=1,
             lambda_true=1,
             scale_all=FALSE,
             conv_fail_drop=TRUE,
             ...)

Arguments

comm

A species-site matrix.

VCV_sp

A phylogenetic covariance matrix at species levels. If ignored, a stochastic birth-death tree will be generated.

nspp

Species pool size. The default is 15.

nsite

Number of sites. The default is 50.

min_richness

The minimum species richness among all sites. The default is 1.

max_richness

The maximum species richness among all sites. The default is 1.

spaMM_formula

A formula in spaMM syntax. See Details for specifying phylogenetic random effects

b0

The value of b0, which is the global intercept. The default is 0

b1

The value of b1, which is the effect of X on Y.

signals_X

Type of X simulated or used. If signals_X = "sr", species richness of simulated communities will be used. If signals_X = "phy_cor", a continuous variable with phylogenetic signals will be generated. If signals_X = "no_phy_cor", a continuous variable without phylogenetic signals will be generated. The mean and S.D. of the normal distribution should be specified in argument non_phy_cor_mean and non_phy_cor_sd. The default is "phy_cor".

noise_mean

The mean of the random noise generated as a normal distribution. The default is 0

noise_sd

The S.D. of the random noise generated as a normal distribution. The default is 1

non_phy_cor_mean

The mean of the normal distribution used to generate X if it is specified as "no_phy_cor". The default is 0.

non_phy_cor_sd

The S.D. of the normal distribution used to generate X if it is specified as "no_phy_cor". The default is 1.

lambda_true

Phylogenetic signals in how species contribution to ecosyste functions. This can lead to phylogeny influencing compositional autocorrelation between communities. The default is 1.

scale_all

Standardizing all variables after centering, The default is FALSE.

conv_fail_drop

Drop results when spaMM failed to converge. The default is TRUE.

...

Othert arguments that will be passed to EcoCoMix.

Details

The simulated relationship is $$Y = X_{0}+b_{1}X_{1}+X_{2}+X_{3}$$

\(X_{0}\) is the population-level intercept. \(X_{2}\) is a variable with compoisitional autocorrelation, with the extent of phylogenetic signal depending on lambda_true. \(X_{1}\) can be species richness, a variable with or without compositional autocorrelation, depending on signals_X. \(b_{1}\) is the poluation-level effect of \(X_{1}\) and can be modified by users. \(X_{3}\) is the random noise, defaulted as \(N(0,1)\).

\(X_{2}\) (and \(X_{1}\) if signals_X = "phy_cor") follows $$MVN(0,{C_{comm})}$$

\(C_{comm}\) represents the correlation matrix between communities. The estent of phylogenetic influence depends on lambda_true.

Value

A list of data frames with the length equals the number of simulation.

The following indicates whether X1 is statistically significant based on different model specification. 1=Y; 0=N.

m_optim_sig

Model based on the optimized phylogenetic structure.

m_true_sig

Model based on the phylogenetic structure.

m_original_sig

Model based on the phylogenetic structure.

m_best_sig

The most parsimonious model.

m_without_comp_sig

Model without considering the compositional autocorrelation.

The following indicates the estimated slope of X1:

m_optim_slope

Model based on the optimized phylogenetic structure.

m_true_slope

Model based on the true phylogenetic structure.

m_original_slope

Model based on the provided phylogenetic structure.

m_best_slope

The most parsimonious model.

m_without_comp_slope

Model without considering the compositional autocorrelation.

Optimized lambda:

optim_lambda

Optimized lambda for the model m_optim

optim_lambda_int

Optimized lambda for the intercept-only model.

The AIC of models:

AIC_without_comp

Model without considering compositional autocorrelation.

AIC_original_VCV

Model based on the provided phylogenetic structure

AIC_optim

Model based on the optimized phylogenetic structure.

AIC_star

Model based on the star phylogeny.

AIC_true

Model based on the true phylogenetic structure

AIC_optim_int

Intercept-only model based on the optimized phylogenetic structure

Simulation properties:

min_richness

The minimum richness of communities for the simulation.

max_richness

The maximum richness of communities for the simulation.

nspp

Species pool size.

true_lambda

True lambda of the phylogenetic structure

r_x1x2

Correlation between \(X_{1}\) and \(X_{2}\)

b1

True value of \(b_{1}\).

Count

Number of iterations. >1 if any model failed to converge and if conv_fail_drop = T.

signals_X

Input for the signals_X argument.

phylo_structure

"Simulated" (if VCV_sp is null) or "Provided" (if VCV_sp is not null).

Other results:

optim_r2m

Marginal R-squared of the optimized model.

optim_r2c

Conditional R-squared of the optimized model.

NumDF

Numerator degree-of-freedom based on the Satterthwaite approximation for the optimized model.

DenDF

Conditional degree-of-freedom based on the Satterthwaite approximation for the optimized model.

Author

Toby P.N. Tsang

See also

Examples


library(phytools)
library(tidyverse)
library(EcoCoMix)
library(ape)

nspp <- 14
sim <- 50
b1 <- 0 #assume no effects of species richness
signals_X <- "sr"
lambda_true <- runif(sim) #random phylogenetic signals

count <- 1

result <- list()

spaMM_formula <-  y~x1+corrMatrix(1|comp_id)

for (i in 1:sim) {
  message("sim_",i)

  result[[i]] <- tryCatch(BEF_simulate(comm = NULL,
                                           VCV_sp = NULL,
                                           nspp=nspp,
                                           nsite=88,
                                           min_richness=1,
                                           max_richness= 4,
                                           spaMM_formula=spaMM_formula,
                                           b1=b1,
                                           signals_X=signals_X,
                                           noise_mean = 0,
                                           noise_sd = 0.01, #very low noise
                                           lambda_true= lambda_true[[i]],
                                           conv_fail_drop = TRUE,
                                           scale_all=FALSE,
                                           control.optim=list(factr=1e12),
                                           optim.lambda=TRUE,
                                           init=list(), #specification for spaMM - calculations will be faster
                                           int_model=FALSE),
                              error = function(e) e) #catch error return
  }


all_result <- as.data.frame(do.call(rbind,result))