
Simulating a BEF relationship and evaluting different analytical methods
BEF_simulate.Rd
Simulating a BEF relationship with/without phylogenetic signals, followed by evaluating different analytical methods based on type-I/type-II errors, estimated phylogenetic signals, and R2.
Usage
BEF_simulate(comm,
CV_sp=NULL,
nspp=15,
nsite=50,
min_richness=1,
max_richness=4,
spaMM_formula,
b0=0,
b1=0,
signals_X="phy_cor",
noise_mean=0,
noise_sd=1,
non_phy_cor_mean=0,
non_phy_cor_sd=1,
lambda_true=1,
scale_all=FALSE,
conv_fail_drop=TRUE,
...)
Arguments
- comm
A species-site matrix.
- VCV_sp
A phylogenetic covariance matrix at species levels. If ignored, a stochastic birth-death tree will be generated.
- nspp
Species pool size. The default is 15.
- nsite
Number of sites. The default is 50.
- min_richness
The minimum species richness among all sites. The default is 1.
- max_richness
The maximum species richness among all sites. The default is 1.
- spaMM_formula
A formula in spaMM syntax. See Details for specifying phylogenetic random effects
- b0
The value of b0, which is the global intercept. The default is 0
- b1
The value of b1, which is the effect of X on Y.
- signals_X
Type of X simulated or used. If signals_X = "sr", species richness of simulated communities will be used. If signals_X = "phy_cor", a continuous variable with phylogenetic signals will be generated. If signals_X = "no_phy_cor", a continuous variable without phylogenetic signals will be generated. The mean and S.D. of the normal distribution should be specified in argument non_phy_cor_mean and non_phy_cor_sd. The default is "phy_cor".
- noise_mean
The mean of the random noise generated as a normal distribution. The default is 0
- noise_sd
The S.D. of the random noise generated as a normal distribution. The default is 1
- non_phy_cor_mean
The mean of the normal distribution used to generate X if it is specified as "no_phy_cor". The default is 0.
- non_phy_cor_sd
The S.D. of the normal distribution used to generate X if it is specified as "no_phy_cor". The default is 1.
- lambda_true
Phylogenetic signals in how species contribution to ecosyste functions. This can lead to phylogeny influencing compositional autocorrelation between communities. The default is 1.
- scale_all
Standardizing all variables after centering, The default is FALSE.
- conv_fail_drop
Drop results when spaMM failed to converge. The default is TRUE.
- ...
Othert arguments that will be passed to EcoCoMix.
Details
The simulated relationship is $$Y = X_{0}+b_{1}X_{1}+X_{2}+X_{3}$$
\(X_{0}\) is the population-level intercept. \(X_{2}\) is a variable with compoisitional autocorrelation, with the extent of phylogenetic signal depending on lambda_true. \(X_{1}\) can be species richness, a variable with or without compositional autocorrelation, depending on signals_X. \(b_{1}\) is the poluation-level effect of \(X_{1}\) and can be modified by users. \(X_{3}\) is the random noise, defaulted as \(N(0,1)\).
\(X_{2}\) (and \(X_{1}\) if signals_X = "phy_cor") follows $$MVN(0,{C_{comm})}$$
\(C_{comm}\) represents the correlation matrix between communities. The estent of phylogenetic influence depends on lambda_true.
Value
A list of data frames with the length equals the number of simulation.
The following indicates whether X1 is statistically significant based on different model specification. 1=Y; 0=N.
- m_optim_sig
Model based on the optimized phylogenetic structure.
- m_true_sig
Model based on the phylogenetic structure.
- m_original_sig
Model based on the phylogenetic structure.
- m_best_sig
The most parsimonious model.
- m_without_comp_sig
Model without considering the compositional autocorrelation.
The following indicates the estimated slope of X1:
- m_optim_slope
Model based on the optimized phylogenetic structure.
- m_true_slope
Model based on the true phylogenetic structure.
- m_original_slope
Model based on the provided phylogenetic structure.
- m_best_slope
The most parsimonious model.
- m_without_comp_slope
Model without considering the compositional autocorrelation.
Optimized lambda:
- optim_lambda
Optimized lambda for the model m_optim
- optim_lambda_int
Optimized lambda for the intercept-only model.
The AIC of models:
- AIC_without_comp
Model without considering compositional autocorrelation.
- AIC_original_VCV
Model based on the provided phylogenetic structure
- AIC_optim
Model based on the optimized phylogenetic structure.
- AIC_star
Model based on the star phylogeny.
- AIC_true
Model based on the true phylogenetic structure
- AIC_optim_int
Intercept-only model based on the optimized phylogenetic structure
Simulation properties:
- min_richness
The minimum richness of communities for the simulation.
- max_richness
The maximum richness of communities for the simulation.
- nspp
Species pool size.
- true_lambda
True lambda of the phylogenetic structure
- r_x1x2
Correlation between \(X_{1}\) and \(X_{2}\)
- b1
True value of \(b_{1}\).
- Count
Number of iterations. >1 if any model failed to converge and if conv_fail_drop = T.
- signals_X
Input for the signals_X argument.
- phylo_structure
"Simulated" (if VCV_sp is null) or "Provided" (if VCV_sp is not null).
Other results:
- optim_r2m
Marginal R-squared of the optimized model.
- optim_r2c
Conditional R-squared of the optimized model.
- NumDF
Numerator degree-of-freedom based on the Satterthwaite approximation for the optimized model.
- DenDF
Conditional degree-of-freedom based on the Satterthwaite approximation for the optimized model.
Examples
library(phytools)
library(tidyverse)
library(EcoCoMix)
library(ape)
nspp <- 14
sim <- 50
b1 <- 0 #assume no effects of species richness
signals_X <- "sr"
lambda_true <- runif(sim) #random phylogenetic signals
count <- 1
result <- list()
spaMM_formula <- y~x1+corrMatrix(1|comp_id)
for (i in 1:sim) {
message("sim_",i)
result[[i]] <- tryCatch(BEF_simulate(comm = NULL,
VCV_sp = NULL,
nspp=nspp,
nsite=88,
min_richness=1,
max_richness= 4,
spaMM_formula=spaMM_formula,
b1=b1,
signals_X=signals_X,
noise_mean = 0,
noise_sd = 0.01, #very low noise
lambda_true= lambda_true[[i]],
conv_fail_drop = TRUE,
scale_all=FALSE,
control.optim=list(factr=1e12),
optim.lambda=TRUE,
init=list(), #specification for spaMM - calculations will be faster
int_model=FALSE),
error = function(e) e) #catch error return
}
all_result <- as.data.frame(do.call(rbind,result))