Averages and Correlations

Note: This page is highly mathematical.

This document defines notation that I use for indicating “average” values, and also defines a particular type of “correlation” factor that I use.

Overview

In this section, I’ll try to summarize the minimum you need to know about averages to be able to follow my analyses.

  • I denote the average of a quantity A by the notation \ex{A}.
  • In the context of my climate analyses, \ex{A} generally indicates a global average (i.e., an average over the surface of the globe) within some time window.
  • I sometimes express things in terms of the weighted average of A relative to a weight B, which I denote as \exw{A}{B}. Provided B is non-negative everywhere, the value of this weighted average is always within the range of variation of A, i.e., between the minimum and maximum values of A.
  • I sometimes express things in terms of the multiplicative correlation of A and B, which I denote as \mc{A}{B}. This is a dimensionless quantity. When A and B are uncorrelated, \mc{A}{B} = 1. However, in general, it could be larger or smaller than 1. The notation can be extended to multiple variables, as in \mct{A}{B}{C}, which denotes the correlation of A, B, and C.

These values are related by several identities, which I often use to transform the way a quantity is expressed:

  1. \ex{A \, B} = \exw{A}{B} \cdot \ex{B}
  2. \ex{A \, B} = \mc{A}{B} \cdot \ex{A} \ex{B}
  3. \exw{A}{B} = \mc{A}{B} \cdot \ex{A}

(I use the notation A \cdot B to indicate multiplication when I want to make the factors more visually distinct.)

If you are reading one of my analyses, taking in the above information may be enough to allow you to follow the analysis and understand the results. If not, then more detailed information is provided below.

Averages

Average

If A is a discrete quantity and there are a finite number of instances of A, then the average value of A, denoted \ex{A}, is given by:

(1)   \begin{equation*} \ex{A} = \frac{\sum_{i=1}^N A_i}{N} \end{equation*}

If A is a continuous quantity which varies with respect to a variable x, then the value of A averaged over x is given by:

(2)   \begin{equation*} \ex{A} = \frac{\int A(x)\,dx}{\int\,dx} \end{equation*}
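As a rough illustration of equations (1) and (2), here is a minimal Python sketch. The function names, the sample values, the test function A(x) = x², and the use of the midpoint rule to approximate the integral are all assumptions made for illustration; none of them come from the definitions above.

```python
import math

def discrete_average(values):
    """Equation (1): sum of the N instances divided by N."""
    values = list(values)
    return sum(values) / len(values)

def continuous_average(func, x_min, x_max, steps=1000):
    """Equation (2), approximated with the midpoint rule (an assumed choice).

    Numerator: integral of A(x) dx. Denominator: integral of dx = x_max - x_min.
    """
    h = (x_max - x_min) / steps
    integral = sum(func(x_min + (i + 0.5) * h) for i in range(steps)) * h
    return integral / (x_max - x_min)

# Hypothetical examples.
avg_discrete = discrete_average([2.0, 4.0, 9.0])                 # (2 + 4 + 9) / 3 = 5
avg_continuous = continuous_average(lambda x: x**2, 0.0, 3.0)    # exact value is 3
```

The continuous result is only approximate; increasing `steps` brings it closer to the exact value.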

Averages are “linear,” meaning they behave in a simple way with respect to addition and multiplication by a constant. In particular, if k is a constant, then:

(3)   \begin{equation*} \ex{A+B} = \ex{A} + \ex{B} \end{equation*}

(4)   \begin{equation*} \ex{k\,A} = k\,\ex{A} \end{equation*}
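Equations (3) and (4) can be checked numerically. The sketch below uses hypothetical sample values and a helper named `average` (an assumed name). It also shows that no similar rule holds for products in general, which is what motivates the weighted averages and correlation factors defined later.

```python
import math

def average(values):
    """Discrete average: sum of the values divided by their count."""
    values = list(values)
    return sum(values) / len(values)

# Hypothetical sample values, chosen only for illustration.
A = [1.0, 4.0, 2.0, 5.0]   # average is 3.0
B = [3.0, 0.5, 2.5, 2.0]   # average is 2.0
k = 2.5

# Equation (3): the average of a sum is the sum of the averages.
average_of_sum = average(a + b for a, b in zip(A, B))
sum_of_averages = average(A) + average(B)

# Equation (4): a constant factor can be pulled out of the average.
average_of_scaled = average(k * a for a in A)
scaled_average = k * average(A)

# Products do not behave this way in general.
average_of_product = average(a * b for a, b in zip(A, B))   # 5.0
product_of_averages = average(A) * average(B)               # 6.0
```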

Average over more than one variable

One can average over different variables. When I want to distinguish averages with respect to different variables, I denote the average of A relative to the variable x by \ex{A}_x. When averaging over multiple variables, it doesn’t matter in which order averaging is done:

(5)   \begin{equation*} \ex{\ex{A}_x}_y = \ex{\ex{A}_y}_x \end{equation*}
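A small numerical check of equation (5), assuming the simplest setting: a rectangular grid of values with every point weighted equally. The grid values and the helper name `average` are hypothetical.

```python
import math

def average(values):
    """Discrete average: sum of the values divided by their count."""
    values = list(values)
    return sum(values) / len(values)

# Hypothetical grid: grid[i][j] holds A at the i-th x value and j-th y value.
grid = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [0.0, 1.0, 0.0, 1.0],
]
n_y = len(grid[0])

# Inner average over x (down each column), outer average over y.
x_then_y = average(average(row[j] for row in grid) for j in range(n_y))

# Inner average over y (along each row), outer average over x.
y_then_x = average(average(row) for row in grid)
```

Both orders give 32/12, the plain average of all twelve grid values.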

Weighted average

The weighted average of A with respect to B is defined by:

(6)   \begin{equation*} \exw{A}{B} = \ex{A\cdot\frac{B}{\ex{B}}} = \frac{\ex{A \, B}}{\ex{B}} \end{equation*}

When working with averages of quantities that are the result of multiplying several factors together, using weighted averages can make expressions work out more neatly.

For example, when talking about Earth’s albedo, the global albedo value quoted typically reflects the average of local albedo values, as weighted by the intensity of received sunlight. This makes the math work out simply: To determine the total fraction of sunlight reflected, it’s only necessary to multiply the total sunlight incident on the Earth by the global albedo value. (If one used an unweighted average of local albedo values in that calculation, that would yield an incorrect answer.)
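The albedo example can be sketched with three hypothetical equal-area regions. All numbers below are invented for illustration and are not real measurements; the function names are assumptions as well.

```python
import math

def average(values):
    """Discrete average over equal-area regions."""
    values = list(values)
    return sum(values) / len(values)

def weighted_average(A, B):
    """Equation (6): <A B> / <B>, the average of A weighted by B."""
    return average(a * b for a, b in zip(A, B)) / average(B)

# Invented values for three equal-area regions.
albedo = [0.1, 0.3, 0.6]            # local albedo (dimensionless)
sunlight = [400.0, 300.0, 100.0]    # incident sunlight, W/m^2

# Average reflected sunlight, computed directly from local values.
mean_reflected = average(a * s for a, s in zip(albedo, sunlight))

weighted_albedo = weighted_average(albedo, sunlight)   # 190 / 800 = 0.2375
unweighted_albedo = average(albedo)                    # 1/3

# Only the sunlight-weighted albedo reproduces the directly computed value.
reflected_from_weighted = weighted_albedo * average(sunlight)
reflected_from_unweighted = unweighted_albedo * average(sunlight)
```

In this invented case the unweighted albedo overstates the reflected sunlight, because the most reflective region receives the least sunlight.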

If A has minimum value A_\mathrm{min} and maximum value A_\mathrm{max}, then, provided B \ge 0 everywhere, \exw{A}{B} will always be between these two limits:

(7)   \begin{equation*} A_\mathrm{min} \leq  \; \exw{A}{B} \; \leq A_\mathrm{max} \end{equation*}
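One way to see why equation (7) holds, sketched under the additional assumption that \ex{B} > 0 so that the weighted average is defined. Because B \ge 0, multiplying A_\mathrm{min} \leq A \leq A_\mathrm{max} by B preserves the inequalities at every point:

\begin{equation*} A_\mathrm{min}\,B \;\leq\; A\,B \;\leq\; A_\mathrm{max}\,B \end{equation*}

Averaging preserves these inequalities, and since A_\mathrm{min} and A_\mathrm{max} are constants, equation (4) gives:

\begin{equation*} A_\mathrm{min}\,\ex{B} \;\leq\; \ex{A\,B} \;\leq\; A_\mathrm{max}\,\ex{B} \end{equation*}

Dividing through by \ex{B} and applying the definition in equation (6) yields equation (7).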

Multiplicative Correlations

When people talk about the “correlation” of two variables, they are typically talking about what could be referred to as an “additive” correlation. Such a “correlation” of A and B involves calculating \ex{(A-\ex{A})\,(B-\ex{B})}, which characterizes the relationship between each variable’s additive deviation from its mean (average) value.

Rather than focusing on additive correlations, I’d like to consider each variable’s multiplicative deviation from its mean value. I denote the multiplicative correlation of A and B (or the multiplicative correlation factor for A and B) as \mc{A}{B} where this is defined by:

(8)   \begin{equation*} \mc{A}{B} =  \ex{\frac{A}{\ex{A}} \cdot \frac{B}{\ex{B}}} = \frac{\ex{A \, B}}{\ex{A}\ex{B}} \end{equation*}

This can be generalized to more variables:

(9)   \begin{equation*} \mcf{A}{B}{\dots}{Z} =  \ex{\frac{A}{\ex{A}} \cdot\frac{B}{\ex{B}} \cdots \frac{Z}{\ex{Z}} } = \frac{\ex{A \, B \cdots Z}}{\ex{A}\ex{B}\cdots\ex{Z}} \end{equation*}

Variables A and B are said to be “uncorrelated” if \mc{A}{B}=1.
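The condition \mc{A}{B}=1 can be connected to the additive measure \ex{(A-\ex{A})\,(B-\ex{B})}. The following short derivation is a sketch that assumes \ex{A} and \ex{B} are both nonzero. Expanding the product and using linearity (equations (3) and (4)), with \ex{A} and \ex{B} treated as constants:

\begin{equation*} \ex{(A-\ex{A})\,(B-\ex{B})} = \ex{A \, B} - \ex{A}\ex{B} \end{equation*}

Dividing by \ex{A}\ex{B} and using equation (8):

\begin{equation*} \mc{A}{B} = 1 + \frac{\ex{(A-\ex{A})\,(B-\ex{B})}}{\ex{A}\ex{B}} \end{equation*}

So \mc{A}{B}=1 exactly when the additive quantity is zero, and \mc{A}{B} departs from 1 by that quantity measured relative to the product of the means.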

Writing formulas using multiplicative correlation factors allows one to clearly distinguish what depends on the average values of individual variables, and what depends on relationships between variables (as characterized by their correlation).

If the variables A and B satisfy 0 \leq A_\mathrm{min} \leq A \leq A_\mathrm{max} and 0 \leq B_\mathrm{min} \leq B \leq B_\mathrm{max}, then their multiplicative correlation satisfies:

(10)   \begin{equation*} \frac{A_\mathrm{min}}{\ex{A}} \cdot \frac{B_\mathrm{min}}{\ex{B}} \;\leq \;\mc{A}{B}  \;\leq\; \frac{A_\mathrm{max}}{\ex{A}} \cdot \frac{B_\mathrm{max}}{\ex{B}} \end{equation*}
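A minimal Python sketch of equations (8) through (10), using hypothetical sample values. The function name `multiplicative_correlation` is an assumption, and `math.prod` requires Python 3.8 or later.

```python
import math

def average(values):
    """Discrete average: sum of the values divided by their count."""
    values = list(values)
    return sum(values) / len(values)

def multiplicative_correlation(*series):
    """Equations (8) and (9): <A B ... Z> / (<A> <B> ... <Z>)."""
    mean_of_product = average(math.prod(point) for point in zip(*series))
    product_of_means = math.prod(average(s) for s in series)
    return mean_of_product / product_of_means

# Hypothetical sample values.
A = [1.0, 2.0, 3.0, 4.0]
B_rising = [2.0, 4.0, 6.0, 8.0]     # large where A is large
B_falling = [8.0, 6.0, 4.0, 2.0]    # small where A is large

mc_rising = multiplicative_correlation(A, B_rising)     # 15 / 12.5 = 1.2
mc_falling = multiplicative_correlation(A, B_falling)   # 10 / 12.5 = 0.8
mc_three = multiplicative_correlation(A, B_rising, B_falling)   # 50 / 62.5 = 0.8

# Bounds from equation (10) for the rising case.
lower = (min(A) / average(A)) * (min(B_rising) / average(B_rising))   # 0.16
upper = (max(A) / average(A)) * (max(B_rising) / average(B_rising))   # 2.56
```

In these examples the factor is above 1 when the variables tend to be large together and below 1 when one tends to be large where the other is small.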

Note: The terms I use, “additive correlation” and “multiplicative correlation,” might not match terminology used elsewhere.

Identities

The definitions above lead to these identities, which will sometimes be used in my analyses:

(11)   \begin{equation*} \ex{A \, B} = \exw{A}{B} \, \ex{B}  \end{equation*}

(12)   \begin{equation*} \ex{A \, B} = \mc{A}{B}  \cdot \ex{A} \cdot \ex{B}  \end{equation*}

(13)   \begin{equation*} \exw{A}{B}  = \mc{A}{B} \cdot \ex{A} \end{equation*}
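Identities (11) through (13) follow directly from definitions (6) and (8), and can be confirmed numerically. The sample values and function names below are hypothetical.

```python
import math

def average(values):
    """Discrete average: sum of the values divided by their count."""
    values = list(values)
    return sum(values) / len(values)

def weighted_average(A, B):
    """Equation (6): <A B> / <B>."""
    return average(a * b for a, b in zip(A, B)) / average(B)

def multiplicative_correlation(A, B):
    """Equation (8): <A B> / (<A> <B>)."""
    return average(a * b for a, b in zip(A, B)) / (average(A) * average(B))

# Hypothetical sample values.
A = [0.2, 0.5, 0.9, 0.4]
B = [3.0, 1.0, 2.0, 6.0]

mean_AB = average(a * b for a, b in zip(A, B))
mc = multiplicative_correlation(A, B)

identity_11 = math.isclose(mean_AB, weighted_average(A, B) * average(B))
identity_12 = math.isclose(mean_AB, mc * average(A) * average(B))
identity_13 = math.isclose(weighted_average(A, B), mc * average(A))
```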

Averages for global climate analysis

Typical averaging

In my analyses of climate, unless I indicate otherwise, averages should be interpreted as being averages over the surface of the globe within some time window:

(14)   \begin{equation*} \ex{A} = \ex{\ex{A}_t}_\loc = \ex{\ex{A}_\loc}_t  \end{equation*}

where \ex{A}_t is the average over a time window, and \ex{A}_\loc is the average over latitude and longitude.

Time average

When dealing with climate data, data is usually averaged over some time window, typically at least over a day, if not over a month or a year.

A time average can be defined relative to a series of discrete time windows W_n(t) or relative to a sliding time window, W(\tau).

Time average over a discrete time window

The average of a variable A(t) relative to a discrete time window W_n(t) is defined as:

(15)   \begin{equation*} \ex{A(t)}_{t,n} = \int_{-\infty}^{\infty} A(t)\, W_n(t) \; dt \end{equation*}

where the time window is normalized so that:

(16)   \begin{equation*} \int_{-\infty}^{\infty} W_n(t) \, dt = 1 \end{equation*}
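As one possible illustration of equations (15) and (16), the sketch below assumes a rectangular ("boxcar") window of fixed width, which is only one of many window shapes that satisfy the normalization. The signal, the window width of 24 hours, the function names, and the midpoint-rule integration are all assumptions.

```python
import math

def boxcar_window(n, width):
    """Return W_n(t) = 1/width on [n*width, (n+1)*width) and zero elsewhere.

    The height 1/width makes the window integrate to 1, as in equation (16).
    """
    start, end = n * width, (n + 1) * width

    def W(t):
        return 1.0 / width if start <= t < end else 0.0

    return W, start, end

def windowed_average(A, W, t_start, t_end, steps=2400):
    """Equation (15) by the midpoint rule, integrating only over
    [t_start, t_end], which is assumed to contain all nonzero values of W."""
    h = (t_end - t_start) / steps
    midpoints = (t_start + (i + 0.5) * h for i in range(steps))
    return sum(A(t) * W(t) for t in midpoints) * h

def A(t):
    """Hypothetical signal (t in hours): mean of 10 plus a daily cycle."""
    return 10.0 + 2.0 * math.sin(2.0 * math.pi * t / 24.0)

W, start, end = boxcar_window(n=3, width=24.0)
normalization = windowed_average(lambda t: 1.0, W, start, end)   # close to 1
daily_mean = windowed_average(A, W, start, end)                  # close to 10
```

Because the window spans exactly one cycle of the assumed signal, the oscillation averages out and only the mean remains.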

Time average over a sliding time window

The average of a variable A(t) relative to a sliding time window W(\tau) is defined as:

(17)   \begin{equation*} \ex{A(t)}_t = \int_{-\infty}^{t} A(t^\prime)\, W(t-t^\prime) \; dt^\prime \end{equation*}

where the time window is normalized so that:

(18)   \begin{equation*} \int_0^{\infty} W(\tau) \, d\tau = 1 \end{equation*}
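A sketch of equations (17) and (18), assuming an exponential window W(\tau) = e^{-\tau/T}/T with a hypothetical time constant T. This window shape is an illustrative choice, not one prescribed above. The code substitutes \tau = t - t^\prime, so the integral runs over \tau from 0 upward, and it truncates the integral at a finite `tau_max`, which is another assumption.

```python
import math

def exponential_window(time_constant):
    """Return W(tau) = exp(-tau / T) / T, which integrates to 1 over tau >= 0."""
    def W(tau):
        return math.exp(-tau / time_constant) / time_constant
    return W

def sliding_average(A, t, W, tau_max, steps=20000):
    """Equation (17) with tau = t - t', by the midpoint rule on [0, tau_max]."""
    h = tau_max / steps
    total = 0.0
    for i in range(steps):
        tau = (i + 0.5) * h
        total += A(t - tau) * W(tau)
    return total * h

T = 1.0   # hypothetical time constant
W = exponential_window(T)

# Equation (18): the window integrates to (very nearly) 1.
normalization = sliding_average(lambda t: 1.0, 0.0, W, tau_max=40.0 * T)

# For the linear signal A(t) = t, this window returns t - T:
# the sliding average lags the signal by the time constant.
lagged = sliding_average(lambda t: t, 5.0, W, tau_max=40.0 * T)   # close to 4
```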

Average over latitude and longitude

The average of A over latitude and longitude is made up of the averages over those two variables:

(19)   \begin{equation*} \ex{A}_\loc = \ex{\ex{A}_\phi}_\theta \end{equation*}

Average over longitude: zonal average

If \theta and \phi are latitude and longitude, respectively (expressed in radians), then the average of A over longitude is:

(20)   \begin{equation*} \ex{A(\phi)}_\phi =   \frac{1}{2 \pi} \int_{-\pi}^{\pi}  A(\phi) \; d\phi \end{equation*}

In the context of climate, an average like this that eliminates longitude and leaves only latitude is referred to as a “zonal” average.

Average over latitude

For simple applications, the average of A over latitude may be computed as:

(21)   \begin{equation*} \ex{A(\theta)}_\theta = \frac{1}{2} \int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} A(\theta) \; \cos(\theta) \, d\theta \end{equation*}
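Equations (19) through (21) can be combined into a spherical global average. The sketch below uses a hypothetical field A(\theta, \phi), assumed function names, and midpoint-rule integration. It also computes an unweighted mean over latitude to show the effect of the \cos(\theta) factor.

```python
import math

def zonal_average(A, theta, steps=180):
    """Equation (20): average over longitude phi at fixed latitude theta."""
    h = 2.0 * math.pi / steps
    total = sum(A(theta, -math.pi + (i + 0.5) * h) for i in range(steps))
    return total * h / (2.0 * math.pi)

def latitude_average(f, steps=720):
    """Equation (21): average over latitude with cos(theta) weighting."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -math.pi / 2.0 + (i + 0.5) * h
        total += f(theta) * math.cos(theta)
    return 0.5 * total * h

def global_average(A):
    """Equation (19): zonal average first, then average over latitude."""
    return latitude_average(lambda theta: zonal_average(A, theta))

def A(theta, phi):
    """Hypothetical field: largest near the poles, varying with longitude."""
    return math.sin(theta) ** 2 * (1.0 + 0.5 * math.cos(phi))

area_weighted = global_average(A)                     # exact value is 1/3
constant_check = global_average(lambda th, ph: 1.0)   # exact value is 1

# Unweighted mean of sin^2(theta) over latitude, for comparison (equals 1/2).
n = 720
unweighted = sum(
    math.sin(-math.pi / 2.0 + (i + 0.5) * math.pi / n) ** 2 for i in range(n)
) / n
```

The unweighted mean is larger because it gives polar latitudes, where this field is largest, as much weight as equatorial latitudes even though they cover less area.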

However, that formula assumes the planet is a sphere. More accurate work must take into account the actual shape of the planet. Earth is more accurately treated as an oblate spheroid. The NASA CERES project provides information on how to apply geodetic weighting to compute accurate global averages.