clr.transform {GCDkit}    R Documentation

Centered-log-ratio transformation

Description

Implementation of centred-log-ratio (clr) transformation for compositional data.

Usage

clr.trans(comp.data=NULL,GUI=FALSE)
    
pr.comp.clr(comp.data=NULL,use.cov=FALSE,scale=TRUE,GUI=FALSE)

lda.clr(comp.data=NULL,grouping=groups,GUI=FALSE)

Arguments

comp.data

a numeric matrix with the data to be transformed, or the names of variables in the data matrix 'WR'.

use.cov

logical; should the covariance matrix be used instead of the correlation matrix?

scale

logical; should scaling be applied to each variable?

GUI

logical; is the function called from a menu (GUI)?

grouping

character or factor; grouping information for each of the samples.

Details

Compositional data, i.e. multivariate data in which all the components sum up to some constant (e.g. 1, or 100 for percentages), are widespread in the geosciences. A typical example is major-element analyses of whole-rock samples.

Numerous workers have argued that much of the correlation in such closed datasets is spurious, due to the so-called constant-sum or closure effect (e.g., Chayes 1960; Rock 1988; Rollinson 1992, 1993).

This effect arises from the fact that the components of a compositional dataset cannot vary independently. If one oxide increases in abundance, for instance SiO2, which dominates the whole-rock analyses of many igneous rocks, the sum of all the other oxides must decrease. As a result, the other oxides tend to be anti-correlated with silica regardless of any geological process.
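The closure effect can be illustrated with a small simulation. The sketch below is not part of GCDkit; it uses made-up, independently generated abundances (the component names 'A', 'B' and 'C' and all parameter values are assumptions) and shows that negative correlations with the dominant component appear once each sample is recalculated to 100 %.

```r
# Illustration only: simulated, independent "raw" abundances (not real analyses)
set.seed(42)
n <- 500
raw <- cbind(
  A = rlnorm(n, meanlog = log(60), sdlog = 0.3),  # dominant component
  B = rlnorm(n, meanlog = log(15), sdlog = 0.3),
  C = rlnorm(n, meanlog = log(5),  sdlog = 0.3)
)

# Before closure the components are independent by construction
cor.raw <- cor(raw)

# Closure: recalculate each row (sample) to sum to 100 %
closed <- 100 * raw / rowSums(raw)
cor.closed <- cor(closed)

round(cor.raw, 2)
round(cor.closed, 2)   # A is now negatively correlated with B and C
```

Because the raw variables were drawn independently, any correlation in 'cor.closed' is an artefact of the constant sum alone.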

For their correct statistical treatment, compositional data have to be transformed, or 'opened'. A classic remedy for the closure effect is the use of log-ratio transformations (Aitchison 1986; Buccianti et al. eds 2006).

The functions 'clr.trans', 'pr.comp.clr' and 'lda.clr' implement the so-called centred-log-ratio (clr) transformation. Data opening in this case is done by dividing each value of a variable by the geometric mean of all the variables for that sample and then taking logarithms. It is critical of course that all the variables are expressed in the same measurement unit.

For instance, for MgO, the centred-log-ratio transformed version is given as:

MgO_clr = ln(C_MgO/geom.mean)

where 'ln' is the natural logarithm, 'C_MgO' is the concentration in wt. % of the selected variable (oxide) and the denominator 'geom.mean' is the geometric mean of all variables being transformed for the given sample (e.g., Pawlowsky-Glahn & Egozcue 2006).
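The formula can be sketched in a few lines of base R. The helper 'clr.sketch' below is hypothetical and is not the code used by 'clr.trans'; the two analyses are made-up numbers. It relies on the identity ln(C/g) = ln(C) - mean(ln(C)), where g is the geometric mean of the sample.

```r
# Hypothetical helper for illustration; not the GCDkit implementation
clr.sketch <- function(x) {
  x <- as.matrix(x)
  stopifnot(all(x > 0))        # logarithms need strictly positive values
  lx <- log(x)
  out <- lx - rowMeans(lx)     # ln(C / g) = ln(C) - mean(ln(C)), per sample
  colnames(out) <- paste0(colnames(x), "_clr")
  out
}

# Made-up analyses (wt. %), for illustration only
x <- rbind(
  s1 = c(SiO2 = 60, Al2O3 = 16, MgO = 3),
  s2 = c(SiO2 = 50, Al2O3 = 14, MgO = 8)
)
x.clr <- clr.sketch(x)
x.clr
rowSums(x.clr)   # clr values of each sample sum to zero (up to rounding)
```

Note that zeros and missing values are not handled here; they would have to be replaced or removed before taking logarithms.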

The function 'pr.comp.clr' performs principal components analysis on the clr-transformed data and plots a biplot (Gabriel 1971; Buccianti & Peccerillo 1999). The function 'lda.clr' performs linear discriminant analysis.
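As a rough illustration of the idea behind 'pr.comp.clr', principal components of clr-transformed data can be obtained with the base R functions 'prcomp' and 'biplot'. This is a sketch under stated assumptions, not the GCDkit code: the data are simulated, the component names 'A' to 'D' are invented, and the covariance-based setting (scale. = FALSE) is just one possible choice.

```r
set.seed(1)
# Simulated positive data closed to 100 %, standing in for real analyses
raw <- matrix(rlnorm(200, meanlog = 1, sdlog = 0.5), ncol = 4,
              dimnames = list(NULL, c("A", "B", "C", "D")))
comp <- 100 * raw / rowSums(raw)

# clr transformation: log of each value over the geometric mean of its row
lx <- log(comp)
comp.clr <- lx - rowMeans(lx)

# PCA on the covariance matrix (scale. = FALSE) of the clr data
pca <- prcomp(comp.clr, center = TRUE, scale. = FALSE)
summary(pca)
biplot(pca)   # samples as points, variables as arrows
```

Since the clr values of each sample sum to zero, the last principal component carries practically no variance; this is expected and is not a sign of a numerical problem.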

Value

For clr.trans, a numeric matrix 'results'. The names of components are preserved, and supplemented by a suffix '_clr'.

Plugin

disclosure.r

Author(s)

Vojtěch Janoušek, vojtech.janousek@geology.cz

Vladimír Kusbach, kusbach@gmail.com

References

Aitchison J (1986) The Statistical Analysis of Compositional Data. Methuen, New York, pp 1-416

Buccianti A, Mateu-Figueras G, Pawlowsky-Glahn V (eds) (2006) Compositional Data Analysis in the Geosciences. Geological Society London Special Publications 264: pp 1-212

Chayes F (1960) On correlation between variables of constant sum. J Geophys Res 65: 4185-4193 doi: 10.1029/JZ065i012p04185

Gabriel KR (1971) The biplot graphical display of matrices with application to principal component analysis. Biometrika 58: 453-467 doi: 10.1093/biomet/58.3.453

Greenacre MJ (2010) Biplots in Practice. Fundación BBVA, Bilbao

Pawlowsky-Glahn V, Egozcue JJ (2006) Compositional data and their analysis: an introduction. In: Buccianti A, Mateu-Figueras G, Pawlowsky-Glahn V (eds) Compositional Data Analysis in the Geosciences. Geological Society London Special Publications 264: pp 1-10 doi: 10.1144/GSL.SP.2006.264.01.01

Reimann C, Filzmoser P, Garrett R, Dutter R (2008) Statistical Data Analysis Explained: Applied Environmental Statistics with R. John Wiley & Sons, Chichester, pp 1-362

Rock NMS (1988) Numerical geology. A Source Guide, Glossary and Selective Bibliography to Geological Uses of Computers and Statistics. Lecture Notes in Earth Sciences 18, Springer, Berlin, pp 1-427 doi: 10.1007/BFb0045143

Rollinson HR (1992) Another look at the constant sum problem in geochemistry. Mineral Mag 56: 469-475 doi: 10.1180/minmag.1992.056.385.03

Rollinson HR (1993) Using Geochemical Data: Evaluation, Presentation, Interpretation. Longman, London, pp 1-352 doi: 10.4324/9781315845548

van den Boogaart KG, Tolosana-Delgado R (2008) "compositions": a unified R package to analyze compositional data. Comput Geosci 34: 320-338 doi: 10.1016/j.cageo.2006.11.017

van den Boogaart KG, Tolosana-Delgado R (2013) Analyzing Compositional Data with R. Springer, Berlin, pp 1-258

Venables WN, Ripley BD (1999) Modern Applied Statistics with S-Plus. Springer, Berlin. doi: 10.1007/978-1-4757-3121-7

See Also

prComp princomp lda

See Reimann et al. (2008) and van den Boogaart and Tolosana-Delgado (2013) for further details, and van den Boogaart and Tolosana-Delgado (2008) for the implementation of a comprehensive R package dealing with compositional data.

Examples

    sampleDataset("sazava")
    
    # Centered-log-ratio transformation
    ox<-c("SiO2","Al2O3","FeOt","MgO","CaO")
    clr.trans(ox)
    addResults() # Needed to append the clr-transformed data to the matrix 'WR'
    
    multiple(x="SiO2_clr", y="Al2O3_clr,FeOt_clr,MgO_clr,CaO_clr")
    plateCex(2)
    plateCexLab(1.3)
    
    # Principal components on basis of clr-transformed data
    pr.comp.clr()
    
    pr.comp.clr("SiO2,TiO2,Al2O3,MgO,CaO")

[Package GCDkit version 6.2.0 Index]