Releases: samuel-watson/glmmrBase
v0.10.3
- Fixed multiple bugs, including in the calculation of the log-binomial likelihood, the squared exponential function, and the marginal effects.
- Added quantile regression as an experimental feature using the asymmetric Laplace distribution for the likelihood.
- Added further data checks for y in the Model object to prevent errors
- Allowed optional specification of the outcome variable when instantiating a new Model object
- Added lme4-type wrapper functions `lmer_mcml` and `glmer_mcml` to mimic `lmer` and `glmer` functionality.
- Reduced data copying between mean and covariance objects
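For the experimental quantile regression feature listed above, the following is a minimal sketch of an asymmetric Laplace log-likelihood under the standard location, scale, and quantile-level parameterisation. The function names `check_loss`, `ald_log_density`, and `ald_log_likelihood` are hypothetical and are not taken from the package source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Check (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0}).
inline double check_loss(double u, double tau) {
    return u * (tau - (u < 0.0 ? 1.0 : 0.0));
}

// Log-density of the asymmetric Laplace distribution with location mu,
// scale sigma > 0 and quantile level tau in (0, 1):
// log f(y) = log(tau) + log(1 - tau) - log(sigma) - rho_tau((y - mu) / sigma).
inline double ald_log_density(double y, double mu, double sigma, double tau) {
    return std::log(tau) + std::log(1.0 - tau) - std::log(sigma)
           - check_loss((y - mu) / sigma, tau);
}

// Sum of log-densities over a sample; mu holds one linear predictor per observation.
inline double ald_log_likelihood(const std::vector<double>& y,
                                 const std::vector<double>& mu,
                                 double sigma, double tau) {
    double ll = 0.0;
    for (std::size_t i = 0; i < y.size(); ++i) {
        ll += ald_log_density(y[i], mu[i], sigma, tau);
    }
    return ll;
}
```

Maximising this likelihood in `mu` for fixed `sigma` is equivalent to minimising the check loss, which is why it serves as a working likelihood for the `tau`-th quantile.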
What's Changed
- Heckman by @samuel-watson in #43
Full Changelog: v0.8.1...v0.10.3
v0.8.1
- Fixed error with updating u with MCNR/MCEM
- Improved method of tracking log-likelihood values between iterations
- Several bug fixes for Model R class including: accessing gradient and MCMC samples
- Added a range of S3 methods for mcml, mcml.summary, and Model classes, including summary, coef, print, log.lik, and so forth.
- Fixed error with printing not ending in a new line
- Added Hessian correction for non-linear models
- Improved formula parsing
- Added "twoway" functions as a specifiable function
- Reduced memory overhead of model class by removing calc objects from model, which were seldom used.
- NOTE: OpenMP has been disabled for this release: there was an error in one of the parallelised loops causing nonsense results. The error has not been identified and so all parallelisation within this package is disabled until it is resolved.
- Some optimisations for the calculator class to improve calculation speed
- Fixed formula errors in some gradient functions, and improved functionality to better support Newton-Raphson methods.
- Set default Eigen index to int to remove warnings about converting types.
- Fixed issue with HSGP where log-likelihoods did not update.
v0.7.1
This version:
- Adds rstan functionality to the package. Previous versions did not include rstan because it did not produce reasonable results. The issue appeared to be in the implementation of `reduce_sum`, so within-chain parallelisation is removed for rstan sampling but available with the cmdstanr sampling. The user can select the sampler with the argument `mcmc.pkg`.
- Adds stochastic approximation expectation maximisation algorithm to the MCML sampler. This algorithm uses a Robbins-Monro approach to estimating the log-likelihood and so requires far fewer MCMC samples per iteration, as all MCMC samples are retained and used on each iteration. This algorithm can be used with or without Ruppert-Polyak averaging.
- Adaptive sample sizes are included for MCMC-ML.
- New convergence criteria are included based on the marginal improvement in the log-likelihood. At convergence the log-likelihood will fail to improve. To account for the stochastic nature of the algorithm, an upper bound is used based on the estimated variance of the log-likelihood differences.
- Some small bugs and errors are fixed.
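As a rough illustration of the stochastic approximation step mentioned in the list above, the sketch below applies a generic scalar Robbins-Monro update with optional Ruppert-Polyak averaging. It is not the package's implementation: the struct name `StochasticApproximation`, the step-size rule `gamma_k = k^(-alpha)`, and the scalar target are all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>

// Generic Robbins-Monro recursion: estimate_k = estimate_{k-1} + gamma_k * (noisy_k - estimate_{k-1}).
struct StochasticApproximation {
    double alpha;     // step-size exponent (assumed rule gamma_k = k^(-alpha), usually in (0.5, 1])
    bool averaging;   // if true, report the Ruppert-Polyak running mean of the iterates
    int k;            // iteration counter
    double estimate;  // current Robbins-Monro iterate
    double average;   // running mean of all iterates so far

    StochasticApproximation(double alpha_, bool averaging_)
        : alpha(alpha_), averaging(averaging_), k(0), estimate(0.0), average(0.0) {}

    // Fold one noisy evaluation (e.g. a Monte Carlo estimate) into the running estimate.
    double update(double noisy_value) {
        ++k;
        const double gamma = std::pow(static_cast<double>(k), -alpha);
        estimate += gamma * (noisy_value - estimate);
        average += (estimate - average) / static_cast<double>(k);
        return averaging ? average : estimate;
    }
};
```

Because each update reuses the previous estimate, earlier samples keep contributing with decaying weight, which is the reason fewer new samples are needed per iteration.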
What's Changed
- v0.6.1 fixed LA in optim by @samuel-watson in #31
- Add DIRECT and incorporate BOBYQA natively by @samuel-watson in #32
- Update README.md by @samuel-watson in #33
- Newuoa & L-BFGS by @samuel-watson in #35
Full Changelog: v0.6.1...v0.7.1
v0.6.1
Major update
Encompasses unreleased v0.5.3 and v0.5.4. v0.5.3 was a revert to C++17 from C++20, as C++20 cannot be used on CRAN with Eigen because RcppEigen is <3.4.0, where checks for deprecated functions in C++20 were included. The main changes this entailed were removing `using enum class` statements and a couple of other small changes (although incorporation of ranges was planned). As a further note, a build using rstan is available on branch `stan`; however, it is significantly slower to run and longer to compile, so it hasn't been incorporated into this version. Other changes:
- Improved storage and copying of sparse matrices to work with SparseChol 0.3.1 and to reduce data copying. The `Covariance` class now only contains an object of class `SparseChol` and not `sparse`, and all data are copied directly into that object.
- Includes an optional permutation of the covariance matrix prior to factorisation using the approximate minimum degree algorithm. This does not affect block diagonal matrices, but for matrices using a compactly supported covariance function it may improve fitting time, although further testing is required.
- Changed the compactly supported covariance functions. Removed existing options as they were not parameterised in a useful way, they were complex, and they also did not correctly create a sparse matrix. These functions have been replaced with truncated power and Cauchy functions with 1 or 2 parameters. The specification of the sparse matrix has also been fixed.
- Removed dependency on package `rminqa` and natively incorporated the BOBYQA algorithm. A much improved function binding scheme is used, which cuts the overhead when optimising and improves speed of model fitting.
- Added the DIRECT algorithm as an option for model fitting. However, it is not used by default and not yet exposed to the user directly as the settings do not necessarily reduce function evaluations, but it may be useful in some scenarios. It may be exposed after further testing.
- Fixed an error in the non-linear optimisation module for Laplace approximation where the bounds were not completely initialised properly, which caused an error in model fitting.
- Added the Kenward-Roger improved approximation for covariance non-linear in parameters.
- Added the Satterthwaite degrees of freedom correction
- Renamed Model class function `kenward_roger` to `small_sample_correction` to include both types of Kenward-Roger and Satterthwaite in R.
- Renamed `ModelMatrix<modeltype>` class function `kenward_roger` to `small_sample_correction` in C++.
- Added the Box correction for Gaussian mixed models: accessible as `Box matrix.box()` and `model$box()` for C++ and R, respectively.
- Added member function for R `Model` class `update_y` so that the outcome data can be updated independently of model fitting.
- Added additional error catching and checking in R builds (use `#define ENABLE_DEBUG`)
- Fixed an error where, when specifying a non-linear function of fixed effect parameters that included a reciprocal of a parameter, that parameter would be initialised to zero and cause a crash. For non-linear models, parameters are now initialised to 1 by default.
- Fixed calculator interpreter error for `vcalc` when adding random effects that would lead the iterator to go out of bounds and cause a crash.
- Fixed error in calculation of p-values for SE DoF corrections
- Modified the robust sandwich error to use model residuals (but it still produces unsatisfactory results for some models).
- Changed the small sample corrections to all use the `CorrectionData` class return type.
- Removed `subset_cols()` functions from Model and Covariance classes as this causes errors with the formulae and requires regenerating the models.
- Removed the dependency on `digest` in R and the hash functions as models are now updated using the update functions and do not require checking against hashes.
- Removed the `check` functions from the Model, Covariance, and LinearPredictor classes as updates are now handled through update functions.
- Fixed error in formula parse where `factor(x)` functions would register as non-linear.
- Allowed for a parameter to be used more than once in a formula.
- Extended the calculator number array to 20 items.
Full Changelog: v0.5.4...v0.6.1
v0.5.2
(Quick new version, 0.5.1 was not released on CRAN)
- Added the `marginal` member function to the R `Model` class. This function calculates marginal effects (either derivative, difference, or ratio) conditional on or averaging over fixed and random effects. See Model documentation for details.
- Refactored some code to improve readability, including moving family link and distributions and covariance functions to enum classes.
v0.5.1
- Added Gaussian process approximations: Nearest neighbour Gaussian Process and Hilbert Space Gaussian Process. These can be specified with the prefix `nngp_` and `hsgp_`, for example `(1|nngp_sqexp(x,y))` and `(1|hsgp_fexp(x,y))`. Currently the HSGP only supports exponential and squared exponential covariance functions. An additional `griddata` class has been added to store location information and generate nearest neighbours. The approximation parameters (number of nearest neighbours or number of basis functions and boundary) are set in `model$covariance$nngp_data()` and `model$covariance$hsgp_data()`, respectively. There are two new C++ covariance classes derived from `Covariance`: `nngpCovariance` and `hsgpCovariance`. A new method for calculating the information matrix has been implemented for HSGP as the approximate covariance matrix is not positive definite and so cannot be inverted.
- As a result of the new covariance types there are three types of model class in the package (i.e. `Model<ModelBits<Covariance,LinearPredictor>>` and so forth). The Rcpp modules have been updated; to maintain readability all exported functions require a type argument and the pointer is generated using the `glmmrType` struct which includes a `std::variant` for the different classes. This also simplifies future expansion with new classes (including in the `rts2` package). A variant is also used as a generic capture for the return values. The generic structure of the Rcpp module functions is now:
    // [[Rcpp::export]]
    SEXP Model__P(SEXP xp, int type = 0){
      glmmrType model(xp,static_cast<Type>(type));
      auto functor = overloaded {
        [](int) { return returnType(0);},
        [](auto ptr){return returnType(ptr->model.linear_predictor.P());}
      };
      auto S = std::visit(functor,model.ptr);
      return wrap(std::get<int>(S));
    }
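The dispatch pattern in the snippet above can be tried outside Rcpp with a self-contained sketch like the one below. The classes `ModelA` and `ModelB`, the alias `ModelPtr`, and the function `number_of_parameters` are hypothetical stand-ins for the package's model types and exported functions; only `std::variant`, `std::visit`, and the usual `overloaded` helper are standard C++17.

```cpp
#include <cassert>
#include <memory>
#include <variant>

// Hypothetical stand-ins for the different model classes.
struct ModelA { int P() const { return 3; } };
struct ModelB { int P() const { return 5; } };

// Standard helper that merges several lambdas into one visitor.
template <class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template <class... Ts> overloaded(Ts...) -> overloaded<Ts...>;

// The int alternative plays the role of an "empty" pointer, mirroring the [](int) branch.
using ModelPtr = std::variant<int, std::shared_ptr<ModelA>, std::shared_ptr<ModelB>>;

// One function serves every model type; the variant decides which lambda runs.
inline int number_of_parameters(const ModelPtr& ptr) {
    auto functor = overloaded{
        [](int) { return 0; },                // no model held
        [](const auto& p) { return p->P(); }  // any shared_ptr alternative
    };
    return std::visit(functor, ptr);
}
```

Adding a further model class in this sketch only requires extending `ModelPtr`; the generic lambda covers the new alternative as long as it provides `P()`.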
- Fixed and improved formula parsing. Non-linear formulas had several small bugs which led to either incorrect calculations or crashes in some cases and also only accepted integer numbers. The improved parsing fixes these bugs and allows for any doubles to be used. In formulae any data name is automatically multiplied by a parameter unless it is wrapped in brackets, e.g. `2^x` will parse as `pow(2,b_x*x)` and `2^(x)` will be `pow(2,x)`. An example complex spatial model with non-linear functional form used in a linked study is:
    model <- Model$new(
      ~ b_eff*((1-(0.5*((b_del/d)^100+1)^(-0.01*b_k)))^b_v) + (1|nngp_fexp(x,y)),
      data = df,
      covariance = c(0.5,0.5),
      mean = c(0.5,0.5,0.5,1,3),
      family = gaussian()
    )
- Instructions for calculations have moved to using an `enum class` to massively improve the readability of the code. For example, instructions now read as `{PushData, PushParameter, Multiply, Add}` rather than `{0,2,5,3}`. The R `Model` class now includes a function `calculator_instructions()` to print the list of instructions to the console to help understand specification and parsing of non-linear functions.
- Two optional flags have been added to the `general.h` file. Compiling with `#define ENABLE_DEBUG` will include a large number of debugging steps; currently this is mostly only useful when building for R as the debug prints to the R console. This will be updated for future releases. Using `#define R_BUILD` brings in Rcpp and adds R specific printing and output.
- The initialiser for the R6 `Model` class now accepts just a vector of parameters for the `covariance` and `mean` arguments as this was the most common use case. For example, one can specify a model as
    mod <- Model$new(
      ~ (1|fexp(x,y)),
      data = expand.grid(x = seq(-1,1,length.out = 25),
                         y = seq(-1,1,length.out = 25)),
      covariance = c(0.5,1),
      mean = c(0),
      family = gaussian()
    )
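The enum-based calculator instructions described earlier in this list can be pictured as a small stack machine. The sketch below is an assumption about the general idea only: the enum name `Instruction`, the function `evaluate`, and the rule that data and parameters are consumed in order are all hypothetical, and the package's calculator class supports many more operations.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical instruction set using the four names quoted in the release note.
enum class Instruction { PushData, PushParameter, Multiply, Add };

// Evaluate a postfix program; data and parameters are consumed in order of appearance.
inline double evaluate(const std::vector<Instruction>& program,
                       const std::vector<double>& data,
                       const std::vector<double>& parameters) {
    std::vector<double> stack;
    std::size_t data_idx = 0;
    std::size_t par_idx = 0;
    for (Instruction op : program) {
        switch (op) {
            case Instruction::PushData:
                stack.push_back(data.at(data_idx++));
                break;
            case Instruction::PushParameter:
                stack.push_back(parameters.at(par_idx++));
                break;
            case Instruction::Multiply:
            case Instruction::Add: {
                if (stack.size() < 2) throw std::runtime_error("stack underflow");
                const double a = stack.back(); stack.pop_back();
                const double b = stack.back(); stack.pop_back();
                stack.push_back(op == Instruction::Multiply ? a * b : a + b);
                break;
            }
        }
    }
    if (stack.empty()) throw std::runtime_error("empty program");
    return stack.back();
}
```

With this layout, the program `{PushParameter, PushData, PushParameter, Multiply, Add}` evaluates an intercept plus slope times covariate, and the named instructions make that readable in a way the integer codes do not.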
- Upper and lower bounds can optionally be set on the fixed effect parameters for the MCML and Laplace approximation model fitting.
- There is now a function `Model__set_bobyqa_control()` that will set the BOBYQA control parameters `npt`, `rho`, and `rhobeg`. This is not exposed to the user in this package (although it is used in `rts2`).
- Sparse matrix functions and operators have been moved out of this package and into the `sparseChol` package (v0.2.2)
- Compilation now requires C++20
- The R `Model` class member function `fitted` now includes an option `sample=TRUE` to generate fitted values by resampling from the fixed effect parameter sampling distribution.
- The R `Model` class now overwrites the pointer in its member `Covariance` class with itself. Previously the member covariance object contained a pointer to a separate covariance class object and so doubled the memory requirement. The covariance class still initially generates its own class, which will be improved in future versions.
- An error in the `predict` function has been fixed that caused a crash for some models.
- A non-user exposed function is added to calculate the Hessian matrix numerically, `Model__hessian_numerical()`. It is slow and not required here but may be useful for derived packages and functions.
- A covariance only model with no intercept can now be specified as `-1+(1|f(x))`; previously this caused an error.
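For readers curious what a numerical Hessian involves, the sketch below uses central finite differences on a generic objective. It is not the code behind `Model__hessian_numerical()`, whose method is not described in these notes; the function name `numerical_hessian`, the step size `h`, and the use of `std::function` are assumptions for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Central-difference Hessian:
// H_ij ~ [f(x+h e_i+h e_j) - f(x+h e_i-h e_j) - f(x-h e_i+h e_j) + f(x-h e_i-h e_j)] / (4 h^2).
// Costs four function evaluations per entry of the upper triangle, hence slow for many parameters.
inline Mat numerical_hessian(const std::function<double(const Vec&)>& f,
                             const Vec& x, double h = 1e-4) {
    const std::size_t n = x.size();
    Mat H(n, Vec(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i; j < n; ++j) {
            Vec pp = x, pm = x, mp = x, mm = x;
            pp[i] += h; pp[j] += h;
            pm[i] += h; pm[j] -= h;
            mp[i] -= h; mp[j] += h;
            mm[i] -= h; mm[j] -= h;
            const double value = (f(pp) - f(pm) - f(mp) + f(mm)) / (4.0 * h * h);
            H[i][j] = value;
            H[j][i] = value;  // the Hessian is symmetric
        }
    }
    return H;
}
```

On the diagonal the same formula reduces to the usual second difference with step `2h`, so no special case is needed.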
v0.4.6
- Minor bug fixes and code improvements
- Added templated C++ classes to enable use in broader applications (e.g. rts2)
v0.4.5
Updates:
- Fixed errors in Kenward-Roger implementation
- Added Kenward-Roger degrees of freedom correction
- Added between-within degrees of freedom correction
- Improved printing of mcml class output (added DoF for some SE methods and prints the SE method)
v0.4.4
A rapid new version as previous version did not fix an issue causing an Error on CRAN's checks. This version:
- Fixes an error when compiled on machines without OpenMP support. Some omp commands had not been defined for the case where OpenMP wasn't available.
- Tidied up some compiler warnings.
- Added an option to specify the number of threads when using `setParallel`, and set the default to a small number (2) to comply with CRAN policy
v0.4.3
- Disables Eigen's "stupid warnings" which CRAN do not allow you to suppress at the package level, which caused an error on MacOS.
- Refactored the C++ code into more bitesize classes to remove the bloated model class.