Dependence Modeling with Copulas
This is the web site for the book:
Joe, H. (2014). Dependence Modeling with Copulas.
Chapman & Hall/CRC. Published June/July 2014.
Details about the book are at the
publisher's
web page.
- Software and code mentioned below provide one level of reproducibility.
An interested reader can check how (almost) all numerical computations in
the book were done.
The book has algorithms only and no code.
- Mathematical proofs and derivations, if short, are included in the
book, especially if they illustrate important techniques.
Otherwise, references to the literature are given.
- There are non-trivial proofs of some properties of bivariate parametric
copula families, for which only hints are given for the derivations.
Supplementary documents might be provided at this web site in the future,
and then everything in the book would be checkable.
Errata
- Corrected Table 7-18 as a pdf file (2015-01-25),
after a small error was found in the code
- Corrected equation (5.16) and the last displayed equation on
page 249 (2015-09-15);
thanks to Prof. Marek Omelka for discovering the error in the
integration.
- Errata list (July 2016).
CopulaModel software
The companion software for the book is available from the link below.
In addition to data analysis, the software can be used to
learn about copulas by getting into the code and adapting it for
particular applications.
An additional reference for some of the functions in the software is
Krupskii's PhD thesis.
Patches and additions will infrequently be made.
The first patch, dated 2015.01.25, is available from the software web page.
The interface is R
but much of the code is written in fortran90 or C
for better speed and memory management,
and also so that the algorithms are readable.
This software is distributed under the GPL-3 license.
Go here for downloads, and see below for
installation instructions.
To minimize effects of scanners and robots, a login and password
are needed. For login, use the initials of the title of the book
(all upper case or all lower case).
For password, use vinecop.
An earlier version of the software package was beta-tested and
installed on Linux, MacOS and Windows. It has been used with R versions
2.9.0 and above. The source files contain more documentation
than the R help pages, so reading the source
code is important for learning about copula modeling.
Binary versions of the R package are not being provided. Below are brief instructions
on compiling the source for different platforms.
Because of a lack of resources, the software does not come with support.
Support might be offered in the future.
- Linux: need development packages gcc, gfortran, make and maybe
a few others.
- Windows: minimally Rtools from
http://cran.r-project.org/bin/windows/Rtools/ ;
an alternative is cygwin or
Unix in Windows, with the mingw compilers for gcc and gfortran.
An important step in installing Rtools is to check the "edit the
system PATH" checkbox.
Also check
that the R binary directory (e.g. C:\Program Files\R\R-3.0.2\bin) is in the
PATH variable. Directions for appending it can be found at e.g.
http://www.java.com/en/download/help/path.xml
See updated information below.
- MacOS: need Apple's development tools (compilers, linkers, make, etc).
To find the correct version of gcc and gfortran (gnu C and fortran
compilers), go to
http://hpc.sourceforge.net.
If you have more than one version of gfortran, check that R will link
to the correct version of gfortran when compiling.
See updated information below.
Updated information for Windows and MacOS (June 1, 2016)
Below is further information from a user who installed CopulaModel on
both Windows and MacOS.
Windows: building an R-package
- Download the Rtools and InnoSetup.
- In addition, download a LaTeX compiler (e.g., MiKTeX)
and HTML Help Workshop.
- The third step is to set the paths of all these programs in the environment variable PATH.
MacOS: building an R-package
- Install a full version of Xcode.
- In addition, install a LaTeX compiler (e.g., MacTeX).
The following might work as an alternative to the command line instructions
given below.
In R, type
install.packages(pkgs="$path/$tarfile", type="source", repos=NULL)
where $path is replaced with the location of the source file,
and $tarfile is replaced with the gzipped tar source file.
Installation from the command line (briefest instructions)
The briefest instructions are the following (but if these fail, re-read the
sections above and below this one).
Installation from the command line (generic instructions)
Suppose the R package is called mypkg.
After the source file is unpacked, go to the directory
which has mypkg underneath it. Then
R CMD INSTALL mypkg
should compile the package and install it (providing you have the
Rtools above and the binary search path is set up correctly).
For Unix systems (including MacOS), you might need superuser privileges.
In Unix, to install into your own disk space, suppose
you have a directory ~/Rlib. Then
R CMD INSTALL --library=$HOME/Rlib mypkg
will work, where $HOME is replaced by your home directory.
Then to use, you need something like:
library(mypkg,lib.loc="$HOME/Rlib")
Special features of the software/code in CopulaModel are the following.
- For the 1-, 2-, 3-parameter bivariate copula
families in the book, most have each of the following:
pcop, dcop, rcop, pcondcop,
qcondcop for the copula cdf, copula density, copula random
generation, conditional distribution C_{2|1} and
conditional quantile C_{2|1}^{-1}.
Also for many bivariate copula families, there are
conversions among copula parameter,
Kendall's tau, Spearman's rho, Blomqvist's beta, correlation of normal scores,
and tail dependence parameters.
- Templates for copula log-likelihood and full log-likelihood with
univariate margins, for discrete and continuous responses, when the copula cdf has
a simple form. For the discrete case, included are functions
for the multivariate Gaussian/normal copula with univariate
regression models.
- R-vine (regular vine) for continuous data with specified vine array and
pair-copulas.
For continuous R-vines, not all of the capabilities of VineCopula
(R package available at CRAN) are
included. The interface is quite different, as it allows the user to include
parametric copula families, not available in VineCopula, for the edges of the vine.
- R-vine for discrete response data with possibility of covariates.
- Sequential minimum spanning trees for choosing a "suboptimal" truncated vine
based on partial correlations, and exhaustive search for the best truncated
partial correlation vines for dimension d<=8.
- Factor copula models for continuous response (factor models here
are truncated vines with latent variables).
- Factor copula models for ordinal (item) response.
- Simulation for R-vines and various factor structures.
- Kullback-Leibler optimization between (bivariate) copula families
and KL sample size for log-likelihood ratio procedure.
- Diagnostic methods to help in detecting asymmetries and choosing
copula families.
- Diagnostic methods for assessing adequacy of fit and comparing models.
- Utilities: monotone interpolation, modified Newton-Raphson
minimization to handle maximum likelihood with several hundred
parameters, Gaussian quadrature, operations on partial correlations.
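As a rough illustration of the first item in the list above (copula cdf, density, conditional cdf, conditional quantile, random generation and parameter conversions for one bivariate family), here is a minimal sketch for the one-parameter Clayton copula with theta > 0. Python is used only to keep the sketch self-contained. The function names are hypothetical and are not the names or signatures used in CopulaModel; the package source remains the reference for the actual implementation.

```python
import math
import random


def clayton_cdf(u, v, theta):
    """Copula cdf C(u, v) for 0 < u, v <= 1 and theta > 0."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def clayton_density(u, v, theta):
    """Copula density c(u, v), the mixed second derivative of C."""
    s = u ** -theta + v ** -theta - 1.0
    return (1.0 + theta) * (u * v) ** (-theta - 1.0) * s ** (-2.0 - 1.0 / theta)


def clayton_cond_cdf(v, u, theta):
    """Conditional cdf C_{2|1}(v | u), the derivative of C(u, v) in u."""
    s = u ** -theta + v ** -theta - 1.0
    return u ** (-theta - 1.0) * s ** (-1.0 - 1.0 / theta)


def clayton_cond_quantile(p, u, theta):
    """Conditional quantile: the v that solves C_{2|1}(v | u) = p."""
    a = p ** (-theta / (1.0 + theta)) - 1.0
    return (a * u ** -theta + 1.0) ** (-1.0 / theta)


def clayton_random(n, theta, rng=random):
    """Draw n pairs (u, v) by conditional inversion."""
    pairs = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids u = 0
        p = 1.0 - rng.random()
        pairs.append((u, clayton_cond_quantile(p, u, theta)))
    return pairs


def tau_to_theta(tau):
    """Kendall's tau (0 < tau < 1) to copula parameter."""
    return 2.0 * tau / (1.0 - tau)


def theta_to_tau(theta):
    """Copula parameter to Kendall's tau."""
    return theta / (theta + 2.0)


def lower_tail_dependence(theta):
    """Lower tail dependence parameter of the Clayton copula."""
    return 2.0 ** (-1.0 / theta)


def blomqvist_beta(theta):
    """Blomqvist's beta, 4 C(1/2, 1/2) - 1."""
    return 4.0 * clayton_cdf(0.5, 0.5, theta) - 1.0


theta = tau_to_theta(0.5)  # theta = 2 corresponds to Kendall's tau 0.5
sample = clayton_random(5, theta, random.Random(2014))
```

The conditional quantile is what makes simulation cheap here: it has a closed form, so each pair needs only two uniform draws. For families without a closed-form inverse, a numerical root-finder on the conditional cdf would take its place.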
Style of code
- Algorithms are coded with minimal use of features of any
specific programming language.
- Code is largely portable between R, C, Fortran90, Matlab ...
- Code currently has many templates, from which small modifications
can be made for specific applications.
Numerical methods
- Gauss-Legendre quadrature for 1-dimensional and 2-dimensional
integration of copula density and its first and second partial derivatives
with respect to copula parameters.
- The gradient and sometimes the Hessian of the negative
log-likelihoods are
efficiently coded in Fortran90 (which has built-in vector/matrix functions).
- Modified Newton-Raphson minimization of negative log-likelihood,
with positive definite approximation of Hessian in early iterations.
Code for this function is written in R.
Convergence is quadratically fast when near the optimum.
Quasi-Newton methods with numerical derivatives get near the optimum
quickly but then take too long to estimate the Hessian.
- For some non-trivial computations,
the use of piecewise cubic interpolation (table look-up),
linked to fortran77 code,
leads to a tremendous reduction in total computational time with almost
no loss of accuracy. An example is:
if bivariate t_nu copulas are used in tree 1 of the factor/vine model,
many evaluations of the T_{nu+1} univariate
cdf pt() are needed for the log-likelihood.
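To make the Gauss-Legendre item above concrete, the following is a minimal sketch of 2-dimensional quadrature of a copula density over the unit square. It is written in Python for self-containment and is not the package's Fortran or R code. The helper names (legendre, gauss_legendre, integrate_2d, frank_density) and the choice of the Frank copula with 20 nodes per dimension are assumptions made for this illustration only.

```python
import math


def legendre(n, x):
    """Return P_n(x) and its derivative P_n'(x) for n >= 1 and |x| < 1."""
    p_prev, p = 1.0, x
    for k in range(2, n + 1):
        # three-term recurrence: k P_k = (2k-1) x P_{k-1} - (k-1) P_{k-2}
        p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
    return p, n * (x * p - p_prev) / (x * x - 1.0)


def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [0, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # starting guess
        for _ in range(50):  # Newton iterations on P_n(x) = 0
            p, dp = legendre(n, x)
            step = p / dp
            x -= step
            if abs(step) < 1e-15:
                break
        _, dp = legendre(n, x)
        w = 2.0 / ((1.0 - x * x) * dp * dp)  # weight on [-1, 1]
        nodes.append(0.5 * (x + 1.0))        # map node to [0, 1]
        weights.append(0.5 * w)              # rescale weight to [0, 1]
    return nodes, weights


def integrate_2d(f, n):
    """Tensor-product Gauss-Legendre approximation of the integral of f on [0,1]^2."""
    nodes, weights = gauss_legendre(n)
    return sum(wi * wj * f(xi, xj)
               for xi, wi in zip(nodes, weights)
               for xj, wj in zip(nodes, weights))


def frank_density(u, v, theta):
    """Density of the Frank copula with parameter theta != 0."""
    e = 1.0 - math.exp(-theta)
    num = theta * e * math.exp(-theta * (u + v))
    den = e - (1.0 - math.exp(-theta * u)) * (1.0 - math.exp(-theta * v))
    return num / (den * den)


# A copula density integrates to 1 over the unit square.
total_mass = integrate_2d(lambda u, v: frank_density(u, v, 2.0), 20)
```

The Frank density is bounded on the unit square, so a modest number of nodes is enough in this sketch. Densities that are unbounded near a corner would generally need more nodes or a change of variables. The same integrate_2d routine can be applied to derivatives of the density with respect to the copula parameter by passing a different integrand.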
Adding your own functions to the package (for personal use)
- If you have functions written completely in R, they can be put in *.R files
and read into your session with source(). To add to the package, put the *.R files in
the directory CopulaModel/R and add the function names to the file CopulaModel/NAMESPACE.
Then recompile the package.
- If you have functions written in R linked to C or Fortran, the .c, .f or .f90 files
can be put in the directory CopulaModel/src (after they have been debugged) and the
.R files for the interface can be put in the directory CopulaModel/R.
Add the R function names to the file CopulaModel/NAMESPACE.
Then recompile the package.
To check the R interface to C and Fortran code, use something like one of the following to
compile into a shared object (extension .so):
# option 1: let R compile and link; the shared object takes its name from the first file in the list
R CMD SHLIB *.c *.f *.f90
# option 2: compile and link by hand
gcc -fpic -c *.c
gfortran -fpic -c *.f *.f90
ld -shared -o myadd.so *.o
In the R interface file, add dyn.load() for the .so file to test the
interface before adding all of the source files to the R package.