Sums and Correlations

Note: This page is highly mathematical.

This document defines notation that I use for indicating “summed” values, and also defines a particular type of “correlation” factor that I use. This is an alternative to Averages and Correlations. The two frameworks are equivalent ways of representing the same underlying mathematics.

Overview

In this section, I’ll try to summarize the minimum you need to know about sums in order to be able to follow my analyses.

  • I denote the sum of a quantity A by the notation \sumb{A}.
  • In the context of my climate analyses, I use \osum{A} to indicate a global sum (i.e., a sum over the surface of the globe) within some time window.
  • I sometimes express things in terms of the weighted sum of A relative to a weight B, which I denote as \osumw{A}{B}.
  • I sometimes express things in terms of the multiplicative correlation of A and B, which I denote as \mc{A}{B}. This is a dimensionless quantity. When A and B are uncorrelated, \mc{A}{B} = 1. However, in general, it could be larger or smaller than 1. The notation can be extended to multiple variables, as in \mct{A}{B}{C}, which denotes the correlation of A, B, and C.

These values are related by several identities, which I often use to transform the way a quantity is expressed:

  1. \osum{A \, B} = \osumw{A}{B} \cdot \osum{B}
  2. \osum{A \, B} = \mc{A}{B} \cdot \osum{A} \osum{B}
  3. \osumw{A}{B} = \mc{A}{B} \cdot \osum{A}

(I use the notation A \cdot B to indicate multiplication when I want to make the factors more visually distinct.)

If you are reading one of my analyses, taking in the above information may be enough to allow you to follow the analysis and understand the results. If not, then more detailed information is provided below.

Sums

Sum

If A is a discrete quantity and there are a finite number of instances of A, then the summed value of A, denoted \int\inbraces{A}, is given by:

(1)   \begin{equation*} \int\inbraces{A} = \sum_{i=1}^N {A_i} \end{equation*}

If A is a continuous quantity which varies with respect to a variable x, then the value of A “summed” over x is given by:

(2)   \begin{equation*} \int\inbraces{A} = \int A(x)\,dx \end{equation*}

Sums are “linear,” meaning they behave in a simple way with respect to addition and multiplication by a constant. In particular, if k is a constant, then:

(3)   \begin{equation*} \int\inbraces{A+B} = \int\inbraces{A} + \int\inbraces{B} \end{equation*}

(4)   \begin{equation*} \sumb{k\,A} = k\,\sumb{A} \end{equation*}
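As a quick numerical illustration of equations (3) and (4), here is a minimal sketch for the discrete case of equation (1). The sample values and the helper name `total` are illustrative assumptions, not part of the original analysis.

```python
# Discrete samples of two quantities A and B (illustrative values only).
A = [1.0, 2.5, -0.5, 4.0]
B = [0.5, 1.5, 2.0, -1.0]
k = 3.0  # an arbitrary constant


def total(values):
    """Discrete sum, as in equation (1)."""
    return sum(values)


# Equation (3): the sum of A + B equals the sum of A plus the sum of B.
sum_of_added = total([a + b for a, b in zip(A, B)])
added_sums = total(A) + total(B)

# Equation (4): a constant factor can be pulled out of the sum.
sum_of_scaled = total([k * a for a in A])
scaled_sum = k * total(A)

print(sum_of_added, added_sums)
print(sum_of_scaled, scaled_sum)
```

Each printed pair should agree, up to floating-point rounding.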

Sum over more than one variable

One can sum over different variables. When I want to distinguish sums with respect to different variables, I denote the sum of A with respect to the variable x by \int_x\inbraces{A}. When summing over multiple variables, it doesn’t matter in which order the summing is done:

(5)   \begin{equation*} \int_x\int_y\inbraces{A} = \int_y\int_x\inbraces{A}  \end{equation*}
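For a finite, discrete table of values, equation (5) can be checked directly. The sketch below assumes a small made-up table in which rows stand for x and columns for y.

```python
# A small table of values A[i][j]; rows play the role of x, columns of y.
A = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]

# Inner sum over y (within each row), then outer sum over x (across rows).
x_outer = sum(sum(row) for row in A)

# Inner sum over x (down each column), then outer sum over y (across columns).
y_outer = sum(sum(column) for column in zip(*A))

print(x_outer, y_outer)
```

Both orders visit every entry exactly once, so the two totals agree.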

Weighted sum

The weighted sum of A with respect to B is defined by:

(6)   \begin{equation*} \sumbw{A}{B} = \sumb{A\cdot\frac{B}{\sumb{B}}} = \frac{\sumb{A \, B}}{\sumb{B}} \end{equation*}

When working with sums of quantities that are the result of multiplying several factors together, using weighted sums can make expressions work out more neatly.
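The two forms in equation (6) can be compared numerically. This is only a sketch for discrete data; the values and the function names `weighted_sum` and `weighted_sum_normalized` are hypothetical.

```python
A = [2.0, 4.0, 6.0]
B = [1.0, 1.0, 2.0]  # the weights


def total(values):
    """Discrete sum, as in equation (1)."""
    return sum(values)


def weighted_sum(a, b):
    """Right-hand form of equation (6): sum(A * B) / sum(B)."""
    return total([x * w for x, w in zip(a, b)]) / total(b)


def weighted_sum_normalized(a, b):
    """Middle form of equation (6): sum of A times the normalized weight."""
    b_total = total(b)
    return total([x * (w / b_total) for x, w in zip(a, b)])


print(weighted_sum(A, B), weighted_sum_normalized(A, B))
```

Dividing each weight by the total weight before summing gives the same result as dividing once at the end.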

If A has minimum value A_\mathrm{min} and maximum value A_\mathrm{max}, then, provided B \ge 0 everywhere, \exw{A}{B} will always be between these two limits:

(7)   \begin{equation*} \sumb{A_\mathrm{min}} \leq  \; \sumbw{A}{B} \; \leq \sumb{A_\mathrm{max}} \end{equation*}

Multiplicative Correlations

When people talk about the “correlation” of two variables, they are typically talking about what could be referred to as an “additive” correlation. Such a “correlation” of A and B involves calculating \ex{(A-\ex{A})\,(B-\ex{B})} where \ex{X} denotes the average of X. This measures the relationship between each variable’s additive deviation from its mean (average) value.

Rather than focusing on additive correlations, I’d like to consider each variable’s multiplicative deviation from its mean value. I denote the multiplicative correlation of A and B (or the multiplicative correlation factor for A and B) as \mc{A}{B} where this is defined by:

(8)   \begin{equation*} \mc{A}{B} =  \sumb{\frac{A}{\sumb{A}} \cdot \frac{B}{\sumb{B}}} = \frac{\sumb{A \, B}}{\sumb{A}\sumb{B}} \end{equation*}

This can be generalized to more variables:

(9)   \begin{equation*} \mcf{A}{B}{\dots}{Z} =  \sumb{\frac{A}{\sumb{A}} \cdot\frac{B}{\sumb{B}} \cdots \frac{Z}{\sumb{Z}} } = \frac{\sumb{A \, B \cdots Z}}{\sumb{A}\sumb{B}\cdots\sumb{Z}} \end{equation*}
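Equations (8) and (9) translate directly into code for discrete data. The sketch below assumes equal-length lists with nonzero sums; the function names and sample values are hypothetical.

```python
import math


def mult_corr(*variables):
    """Right-hand form of equation (9) for discrete variables of equal length."""
    products = [math.prod(values) for values in zip(*variables)]
    return sum(products) / math.prod(sum(v) for v in variables)


def mult_corr_normalized(*variables):
    """Middle form of equation (9): sum the product of the normalized variables."""
    totals = [sum(v) for v in variables]
    return sum(
        math.prod(value / t for value, t in zip(values, totals))
        for values in zip(*variables)
    )


A = [2.0, 4.0, 6.0]
B = [1.0, 1.0, 2.0]
C = [3.0, 1.0, 2.0]

print(mult_corr(A, B), mult_corr_normalized(A, B))        # two variables, equation (8)
print(mult_corr(A, B, C), mult_corr_normalized(A, B, C))  # three variables, equation (9)
```

With two arguments the functions reduce to equation (8). In each printed pair the two forms should agree up to rounding.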

Variables A and B are said to be “uncorrelated” if \mc{A}{B}=1.

Writing formulas using multiplicative correlation factors allows one to clearly distinguish what depends on the average values of individual variables, and what depends on relationships between variables (as characterized by their correlation).

If the variables A and B satisfy 0 \leq A_\mathrm{min} \leq A \leq A_\mathrm{max} and 0 \leq B_\mathrm{min} \leq B \leq B_\mathrm{max}, then their multiplicative correlation satisfies:

(10)   \begin{equation*} \frac{A_\mathrm{min}}{\ex{A}} \cdot \frac{B_\mathrm{min}}{\ex{B}} \;\leq \;\mc{A}{B}  \;\leq\; \frac{A_\mathrm{max}}{\ex{A}} \cdot \frac{B_\mathrm{max}}{\ex{B}} \end{equation*}

Note: The terms I use, “additive correlation” and “multiplicative correlation,” might not match terminology used elsewhere. This definition of “multiplicative correlation” is exactly equivalent to the definition on the page Averages and Correlations.

Identities

The definitions above lead to these identities, which will sometimes be used in my analysis:

(11)   \begin{equation*} \sumb{A \, B} = \sumbw{A}{B} \, \sumb{B}  \end{equation*}

(12)   \begin{equation*} \sumb{A \, B} = \mc{A}{B}  \cdot \sumb{A} \cdot \sumb{B}  \end{equation*}

(13)   \begin{equation*} \sumbw{A}{B}  = \mc{A}{B} \cdot \sumb{A} \end{equation*}
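Identities (11) through (13) follow algebraically from definitions (6) and (8), and can be spot-checked on discrete data. The sketch below uses made-up values and hypothetical helper names.

```python
import math

A = [2.0, 4.0, 6.0]
B = [1.0, 1.0, 2.0]


def total(values):
    """Discrete sum, as in equation (1)."""
    return sum(values)


def product(a, b):
    """Element-by-element product A * B."""
    return [x * y for x, y in zip(a, b)]


def weighted_sum(a, b):
    """Equation (6): sum(A * B) / sum(B)."""
    return total(product(a, b)) / total(b)


def mult_corr(a, b):
    """Equation (8): sum(A * B) / (sum(A) * sum(B))."""
    return total(product(a, b)) / (total(a) * total(b))


sum_ab = total(product(A, B))

identity_11 = math.isclose(sum_ab, weighted_sum(A, B) * total(B))
identity_12 = math.isclose(sum_ab, mult_corr(A, B) * total(A) * total(B))
identity_13 = math.isclose(weighted_sum(A, B), mult_corr(A, B) * total(A))

print(identity_11, identity_12, identity_13)
```

A numerical check like this is not a proof, but it is a convenient guard against sign or normalization slips when rewriting an expression from one form to another.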

Sums for global climate analysis

Typical summing

In my analyses of climate, I’ll use the notation \osum{A} to denote a sum over the surface of the globe over some time window:

(14)   \begin{equation*} \osum{A} = \int_\loc\int_t{A} = \int_t\int_\loc{A}   \end{equation*}

where \int_t{A} is the sum over a time window, and \int_\loc{A} is the sum over latitude and longitude.

Time sum

When dealing with climate data, data is usually summed or averaged over some time window, typically at least over a day, if not over a month or a year.

A sum over time can be defined relative to a series of discrete time windows W^\prime_n(t) or relative to a sliding time window, W^\prime(\tau).

Time sum over a discrete time window

The sum of a variable A(t) relative to a discrete time window W^\prime_n(t) is defined as:

(15)   \begin{equation*} \int_{t,n}\inbraces{A(t)} = \int_{-\infty}^{\infty} A(t)\, W^\prime_n(t) \; dt \end{equation*}

where the window function W^\prime_n(t) is 1 within the n-th time window, and 0 outside it.

Time sum over a sliding time window

The sum of a variable A(t) relative to a sliding time window W^\prime(\tau) is defined as:

(16)   \begin{equation*} \int_t\inbraces{A(t)} = \int_{-\infty}^{t} A(t^\prime)\, W^\prime(t-t^\prime) \; dt^\prime \end{equation*}

where again the window function W^\prime(\tau) is 1 within the time window, and 0 outside it.
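The two kinds of time sum can be approximated numerically. The sketch below is only illustrative and makes several assumptions that are not in the text above: the n-th discrete window covers n*length <= t < (n+1)*length, the sliding window covers 0 <= tau < length, the signal A(t) is a made-up function, and the integrals are approximated with a midpoint rule. All function names are hypothetical.

```python
import math


def midpoint_integral(f, start, stop, steps=10_000):
    """Midpoint-rule approximation of the integral of f from start to stop."""
    dt = (stop - start) / steps
    return sum(f(start + (i + 0.5) * dt) for i in range(steps)) * dt


def discrete_window(n, length):
    """W'_n(t): 1 inside the n-th window [n*length, (n+1)*length), else 0."""
    return lambda t: 1.0 if n * length <= t < (n + 1) * length else 0.0


def sliding_window(length):
    """W'(tau): 1 for 0 <= tau < length, else 0."""
    return lambda tau: 1.0 if 0.0 <= tau < length else 0.0


def A(t):
    """A made-up signal used only for illustration."""
    return 2.0 + math.sin(t)


def discrete_time_sum(signal, n, length):
    """Time sum relative to the n-th discrete window."""
    w = discrete_window(n, length)
    # The integrand vanishes outside the window, so finite limits suffice.
    return midpoint_integral(lambda t: signal(t) * w(t), n * length, (n + 1) * length)


def sliding_time_sum(signal, t, length):
    """Time sum relative to a sliding window that ends at time t."""
    w = sliding_window(length)
    # W'(t - t') is nonzero only for t - length < t' <= t.
    return midpoint_integral(lambda tp: signal(tp) * w(t - tp), t - length, t)


window_length = math.pi
print(discrete_time_sum(A, 0, window_length))
print(sliding_time_sum(A, window_length, window_length))
```

Under these assumed conventions, the sliding-window sum evaluated at the end of a discrete window covers the same interval as that window, so the two printed values should match. For this signal both approximate 2*pi + 2.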

Sum over latitude and longitude

The sum of A over latitude and longitude is made up of the integrals over those two variables, with \theta denoting latitude and \phi denoting longitude:

(17)   \begin{equation*} \int_\loc\inbraces{A} =  \int_{-\pi/2}^{\pi/2} \left[ \int_{-\pi}^{\pi}  A(\theta,\phi) \, \dd\phi \right] R^2(\theta) \, (\cos\theta)\, \dd\theta \end{equation*}
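A useful sanity check on the latitude-longitude sum is that summing the constant 1 over a sphere gives its surface area, 4*pi*R^2. The sketch below assumes that theta is latitude running from -pi/2 to pi/2, that phi is longitude running from -pi to pi, and that the radius is constant (a unit sphere). The function name `surface_sum` and the grid resolution are hypothetical choices.

```python
import math


def surface_sum(A, radius, n_lat=200, n_lon=400):
    """Midpoint-rule approximation of the latitude-longitude sum of A(theta, phi)."""
    d_theta = math.pi / n_lat
    d_phi = 2.0 * math.pi / n_lon
    result = 0.0
    for i in range(n_lat):
        theta = -math.pi / 2 + (i + 0.5) * d_theta
        # Inner integral over longitude at this latitude.
        inner = sum(
            A(theta, -math.pi + (j + 0.5) * d_phi) for j in range(n_lon)
        ) * d_phi
        # Outer integral over latitude, weighted by R^2 * cos(theta).
        result += inner * radius(theta) ** 2 * math.cos(theta) * d_theta
    return result


# Sum of the constant 1 over a unit sphere (assumed constant radius of 1).
area = surface_sum(lambda theta, phi: 1.0, lambda theta: 1.0)
print(area, 4.0 * math.pi)
```

The cos(theta) factor accounts for lines of longitude converging toward the poles, so each latitude band contributes in proportion to its actual area.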

Relationship between sums and averages

Sums and averages, as I’ve defined them, are related as follows:

(18)   \begin{equation*} \osum{A} = \ex{A} \osum{1} \end{equation*}

(19)   \begin{equation*} \osumw{A}{B} = \exw{A}{B} \osum{1} \end{equation*}

The quantity \osum{1} is simply the surface area of the planet times the duration of the time window.

Thus, global sums and averages are identical except for a multiplicative constant.
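Equation (18) can be illustrated with a coarse discretization. The sketch below assumes the globe and time window are split into a few cells, each with a value of A and a size (surface area times duration, in arbitrary units), and that \ex{A} is the size-weighted mean. All values and names are made up for illustration.

```python
# Hypothetical cells: a value of A and a size (area times duration) for each.
values = [10.0, 20.0, 30.0]
sizes = [1.0, 2.0, 1.0]

global_sum_A = sum(v * s for v, s in zip(values, sizes))  # plays the role of \osum{A}
global_sum_1 = sum(sizes)                                  # plays the role of \osum{1}
average_A = global_sum_A / global_sum_1                    # plays the role of \ex{A}

# Equation (18): the global sum is the average times the constant \osum{1}.
print(global_sum_A, average_A * global_sum_1)
```

Because \osum{1} depends only on the planet and the time window, not on A, converting between a sum and an average never changes how two quantities compare.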