More accurate, calibrated bootstrap confidence intervals for estimating the correlation between two time series

Ólafsdóttir, K. B.; Mudelsee, M.

doi:10.1007/s11004-014-9523-4

Published: 18 February 2014

Volume 46, pages 411–427, (2014)

Mathematical Geosciences

K. B. Ólafsdóttir¹,² & M. Mudelsee¹

Abstract

Estimation of Pearson’s correlation coefficient between two time series, in the evaluation of the influences of one time-dependent variable on another, is an often used statistical method in climate sciences. Data properties common to climate time series, namely non-normal distributional shape, serial correlation, and small data sizes, call for advanced, robust methods to estimate accurate confidence intervals to support the correlation point estimate. Bootstrap confidence intervals are estimated in the Fortran 90 program PearsonT (Mudelsee, Math Geol 35(6):651–665, 2003), where the main intention is to obtain accurate confidence intervals for correlation coefficients between two time series by taking the serial dependence of the data-generating process into account. However, Monte Carlo experiments show that the coverage accuracy of the confidence intervals for smaller data sizes can be substantially improved. In the present paper, the existing program is adapted into a new version, called PearsonT3, by calibrating the confidence interval to increase the coverage accuracy. Calibration is a bootstrap resampling technique that performs a second bootstrap loop (it resamples from the bootstrap resamples). It offers, like the non-calibrated bootstrap confidence intervals, robustness against the data distribution. Pairwise moving block bootstrap resampling is used to preserve the serial dependence of both time series. The calibration is applied to standard error-based bootstrap Student’s $t$ confidence intervals. The performance of the calibrated confidence interval is examined with Monte Carlo simulations and compared with the performance of confidence intervals without calibration. The coverage accuracy is evidently better for the calibrated confidence intervals where the coverage error is acceptably small already (i.e., within a few percentage points) for data sizes as small as 20.
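The two-loop idea in the abstract can be illustrated with a minimal Python sketch. It is not the PearsonT3 implementation: the function names, the block length argument `block_len`, and the loop sizes are hypothetical, and the interval is built directly on the correlation scale and clipped to [-1, 1]. Instead of searching over nominal confidence levels of a Student's $t$ interval, the sketch calibrates the multiplier of the bootstrap standard error directly. For a symmetric interval this amounts to the same thing, because the $t$ quantile increases monotonically with the level: in resample $b$, an interval $r^{*}_{b} \pm c \cdot se^{**}_{b}$ covers the original estimate $r$ exactly when $|r^{*}_{b} - r| / se^{**}_{b} \le c$.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else 0.0  # degenerate resample: report 0


def pairwise_mbb(x, y, block_len, rng):
    """Pairwise moving block bootstrap: the same blocks are drawn from x and y."""
    n = len(x)
    xs, ys = [], []
    while len(xs) < n:
        start = rng.randrange(n - block_len + 1)  # random block start
        xs.extend(x[start:start + block_len])
        ys.extend(y[start:start + block_len])
    return xs[:n], ys[:n]  # trim the last block to the original length


def calibrated_ci(x, y, block_len, coverage=0.95,
                  n_outer=200, n_inner=50, seed=0):
    """Symmetric, calibrated, standard-error-based interval (a sketch)."""
    rng = random.Random(seed)
    r = pearson(x, y)
    outer, ratios = [], []
    for _ in range(n_outer):
        # First bootstrap loop: resample the original pairs.
        xb, yb = pairwise_mbb(x, y, block_len, rng)
        rb = pearson(xb, yb)
        outer.append(rb)
        # Second bootstrap loop: resample from the resample.
        inner = [pearson(*pairwise_mbb(xb, yb, block_len, rng))
                 for _ in range(n_inner)]
        se_b = statistics.stdev(inner)
        if se_b > 0:
            ratios.append(abs(rb - r) / se_b)
    if not ratios:
        raise ValueError("all second-loop standard errors were zero")
    se = statistics.stdev(outer)  # bootstrap standard error of r
    ratios.sort()
    k = min(len(ratios) - 1, math.ceil(coverage * len(ratios)) - 1)
    q = ratios[k]  # calibrated multiplier replacing the t quantile
    return r, max(-1.0, r - q * se), min(1.0, r + q * se)
```

Calling `calibrated_ci(x, y, block_len=4)` returns the point estimate with the lower and upper bounds. The cost grows with `n_outer * n_inner`, which is the price of the second loop.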

References

Beran R (1987) Prepivoting to reduce level error of confidence sets. Biometrika 74(3):457–468
Carlstein E (1986) The use of subseries values for estimating the variance of a general statistic from a stationary sequence. Ann Stat 14(3):1171–1179
Cronin TM (2010) Paleoclimates: understanding climate change past and present. Columbia University Press, New York
DiCiccio TJ, Efron B (1996) Bootstrap confidence intervals. Stat Sci 11(3):189–212
Efron B, Tibshirani RJ (1993) An introduction to the bootstrap. Chapman and Hall, New York
Gordon AL (1986) Interocean exchange of thermocline water. J Geophys Res 91(C4):5037–5046
Granger CW, Maasoumi E, Racine J (2004) A dependence metric for possibly nonlinear processes. J Time Ser Anal 25(5):649–669
Hall P (1986) On the bootstrap and confidence intervals. Ann Stat 14(4):1431–1452
Hall P, Martin MA (1988) On bootstrap resampling and iteration. Biometrika 75(4):661–671
Kendall MG (1954) Note on bias in the estimation of autocorrelation. Biometrika 41(3–4):403–404
Künsch HR (1989) The jackknife and the bootstrap for general stationary observations. Ann Stat 17(3):1217–1241
Liu RY, Singh K (1992) Moving blocks jackknife and bootstrap capture weak dependence. In: LePage R, Billard L (eds) Exploring the limits of bootstrap. Wiley, New York, pp 225–248
Loh W-Y (1987) Calibrating confidence coefficients. J Am Stat Assoc 82(397):155–162
Mudelsee M (2002) TAUEST: a computer program for estimating persistence in unevenly spaced weather/climate time series. Comput Geosci 28(1):69–72
Mudelsee M (2003) Estimating Pearson’s correlation coefficient with bootstrap confidence interval from serially dependent time series. Math Geol 35(6):651–665
Mudelsee M (2010) Climate time series analysis: classical statistical and bootstrap methods. Springer, Dordrecht
Pearson K (1896) Mathematical contributions to the theory of evolution—III. Regression, heredity, and panmixia. Philos Trans R Soc Lond Ser A 187:253–318
Peeters FJC, Acheson R, Brummer G-JA, de Ruijter WPM, Schneider RR, Ganssen GM, Ufkes E, Kroon D (2004) Vigorous exchange between the Indian and Atlantic oceans at the end of the past five glacial periods. Nature 430(7000):661–665
Pisias NG, Mix AC, Zahn R (1990) Nonlinear response in the global climate system: evidence from benthic oxygen isotopic record in core RC13-110. Paleoceanography 5(2):147–160
Rehfeld K, Marwan N, Heitzig J, Kurths J (2011) Comparison of correlation analysis techniques for irregularly sampled time series. Nonlinear Process Geophys 18(3):389–404
Sherman M, Speed FM Jr, Speed FM (1998) Analysis of tidal data via the blockwise bootstrap. J Appl Stat 25(3):333–340
von Storch H, Zwiers FW (1999) Statistical analysis in climate research. Cambridge University Press, Cambridge

Acknowledgments

We thank Alexander Gluhovsky and three anonymous persons for constructive review comments. We thank Michael Schulz, Arne Biastoch, Jonathan Durgadoo, Frank Peeters, Conor Purcell, and Gema Martínez Méndez for discussions and helpful comments. The work described in this paper and the research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007–2013), Marie-Curie ITN, under Grant Agreement No. 238512, GATEWAYS project.

Author information

Authors and Affiliations

Climate Risk Analysis, Schneiderberg 26, 30167 Hannover, Germany
K. B. Ólafsdóttir & M. Mudelsee
MARUM, Center for Marine Environmental Sciences, University of Bremen, 28334 Bremen, Germany
K. B. Ólafsdóttir

Corresponding author

Correspondence to M. Mudelsee.

Appendices

Appendix A: Software

The calibration method described in this paper was implemented in the Fortran 90 software PearsonT3. The software is freely available at http://www.climate-risk-analysis.com. Installation requires copying the PearsonT3 executable file into an appropriate directory and installing the free graphics program Gnuplot (http://www.gnuplot.info/). The gnuplot executable file, gnuplot.exe, needs to be in the same directory as PearsonT3.exe. The software is command-line driven and can be run from the Windows command prompt or simply by double-clicking the executable file. After starting, PearsonT3 asks for the name and path of the input data file. The data file should be a plain text file in the format

$$\begin{array}{lll} t_{1} & x_{1} & y_{1} \\ t_{2} & x_{2} & y_{2} \\ \vdots & \vdots & \vdots \\ t_{n} & x_{n} & y_{n}, \end{array}$$

where $t$ denotes the sampling times and $x$ and $y$ are two equally long time series with data size $\ge 10$. The time series are automatically mean detrended (the mean of the data is subtracted from the data). A linear detrend option was included in the old version of the software but is not included in PearsonT3. If the time series contain a linear or more complex trend, we recommend detrending prior to the analysis in PearsonT3 to fulfill the assumption of weak stationarity.
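As an illustration of preparing such a file, the following Python sketch removes a least-squares linear trend from each series and writes the three columns. The helper names `linear_detrend` and `write_input_file`, the whitespace delimiter, and the number format are assumptions for illustration only; the text above specifies just the column order and the minimum data size.

```python
import os
import tempfile


def linear_detrend(t, v):
    """Subtract the ordinary least-squares line fitted to v against t."""
    n = len(t)
    mt, mv = sum(t) / n, sum(v) / n
    stt = sum((a - mt) ** 2 for a in t)
    slope = sum((a - mt) * (b - mv) for a, b in zip(t, v)) / stt
    return [b - (mv + slope * (a - mt)) for a, b in zip(t, v)]


def write_input_file(path, t, x, y):
    """Write one 't x y' row per sampling time (delimiter is an assumption)."""
    if not (len(t) == len(x) == len(y)):
        raise ValueError("t, x and y must have the same length")
    if len(t) < 10:
        raise ValueError("at least 10 data points are required")
    with open(path, "w", encoding="ascii") as fh:
        for ti, xi, yi in zip(t, x, y):
            fh.write(f"{ti:.6f} {xi:.6f} {yi:.6f}\n")


# Usage with toy data: two series that carry a linear trend.
t = [float(i) for i in range(12)]
x = [2.0 + 0.5 * ti for ti in t]
y = [1.0 - 0.2 * ti + (0.3 if i % 2 else -0.3) for i, ti in enumerate(t)]
x_res = linear_detrend(t, x)
y_res = linear_detrend(t, y)
demo_path = os.path.join(tempfile.mkdtemp(), "demo_input.txt")
write_input_file(demo_path, t, x_res, y_res)
```

Trends that are more complex than linear would need a different fit before writing the file; the sketch covers only the linear case.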

The persistence times are estimated with the least-squares algorithm TAUEST (Mudelsee 2002) with automatic bias correction. If the bias-corrected equivalent autocorrelation coefficient (Eq. 6) exceeds 1, which can occur if $n$ is small and the autocorrelation coefficient is large, the bias correction is not performed. After the estimation, the time series are plotted on the screen along with an $x$-$y$ scatterplot to allow a visual check of the linear relationship. The results are printed on the screen: the data file name, the time interval $[t(1);\,t(n)]$, the number of data points ($n$), the persistence times ($\tau _{X}$ and $\tau _{Y}$), and the estimated correlation coefficient ($r_{XY}$) with 95 % calibrated confidence interval. The results are also written into a result file, along with the data, means, and mean detrended data. The final result file, named PearsonT3.dat, is a plain ASCII file, which is saved in the same directory as the executable file PearsonT3.exe.
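A least-squares persistence time estimate for uneven spacing can be sketched as follows. The sketch minimizes the sum of squared one-step AR(1) prediction errors over a logarithmic grid of candidate values. Whether TAUEST uses exactly this criterion or minimizer is not stated in the text above, so the criterion, the grid search, the search range, and the function names are all assumptions, and no bias correction is applied.

```python
import math
import random


def ls_sum(tau, t, x):
    """Sum of squared AR(1) one-step prediction errors on uneven spacing."""
    return sum(
        (x[i] - math.exp(-(t[i] - t[i - 1]) / tau) * x[i - 1]) ** 2
        for i in range(1, len(x))
    )


def estimate_tau(t, x, n_grid=300):
    """Grid-search minimizer of ls_sum; the series is mean detrended first."""
    mean = sum(x) / len(x)
    xc = [v - mean for v in x]
    d_bar = (t[-1] - t[0]) / (len(t) - 1)  # average spacing
    lo, hi = 0.01 * d_bar, 100.0 * d_bar   # search range: an assumption
    grid = [lo * (hi / lo) ** (k / (n_grid - 1)) for k in range(n_grid)]
    return min(grid, key=lambda tau: ls_sum(tau, t, xc))


# Usage: simulate an unevenly spaced AR(1) series with persistence time 3.
rng = random.Random(7)
t_sim, x_sim = [0.0], [rng.gauss(0, 1)]
for _ in range(1999):
    d = rng.uniform(0.5, 1.5)              # uneven time step
    a = math.exp(-d / 3.0)
    t_sim.append(t_sim[-1] + d)
    x_sim.append(a * x_sim[-1] + rng.gauss(0, math.sqrt(1 - a * a)))
tau_hat = estimate_tau(t_sim, x_sim)
print(f"estimated persistence time: {tau_hat:.2f}")
```

A grid search is slow but has no convergence issues, which keeps the sketch short. For small data sizes the uncorrected estimate is biased, which is why the program applies a bias correction.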

Appendix B: Bivariate AR(1) process

The bivariate AR(1) process for uneven spacing is given by (Mudelsee 2010, Ch. 7.6)

$$\begin{aligned}
X(1) &= \mathcal{E}_{\mathrm{N}(0,\,1)}^{X}(1), \\
Y(1) &= \mathcal{E}_{\mathrm{N}(0,\,1)}^{Y}(1), \\
X(i) &= \exp\left\{-\left[T(i)-T(i-1)\right]/\tau_X\right\} \cdot X(i-1) \\
&\quad + \mathcal{E}_{\mathrm{N}(0,\,1-\exp\left\{-2\left[T(i)-T(i-1)\right]/\tau_X\right\})}^{X}(i), \qquad i = 2,\ldots,n, \\
Y(i) &= \exp\left\{-\left[T(i)-T(i-1)\right]/\tau_Y\right\} \cdot Y(i-1) \\
&\quad + \mathcal{E}_{\mathrm{N}(0,\,1-\exp\left\{-2\left[T(i)-T(i-1)\right]/\tau_Y\right\})}^{Y}(i), \qquad i = 2,\ldots,n,
\end{aligned} \tag{12}$$

where the white-noise terms are correlated as

$$\mathrm{CORR}\left[\mathcal{E}_{\mathrm{N}(0,\,1)}^{X}(1),\ \mathcal{E}_{\mathrm{N}(0,\,1)}^{Y}(1)\right] = \rho_{\mathcal{E}}, \tag{13}$$

$$\begin{aligned}
&\mathrm{CORR}\Bigl[\mathcal{E}_{\mathrm{N}(0,\,1-\exp\left\{-2\left[T(i)-T(i-1)\right]/\tau_X\right\})}^{X}(i),\ \mathcal{E}_{\mathrm{N}(0,\,1-\exp\left\{-2\left[T(j)-T(j-1)\right]/\tau_Y\right\})}^{Y}(j)\Bigr] = 0, \\
&\qquad i, j = 2,\ldots,n, \qquad i \ne j,
\end{aligned}$$

$$\begin{aligned}
&\mathrm{CORR}\Bigl[\mathcal{E}_{\mathrm{N}(0,\,1-\exp\left\{-2\left[T(i)-T(i-1)\right]/\tau_X\right\})}^{X}(i),\ \mathcal{E}_{\mathrm{N}(0,\,1)}^{Y}(1)\Bigr] = 0, \\
&\qquad i = 2,\ldots,n,
\end{aligned}$$

$$\begin{aligned}
&\mathrm{CORR}\Bigl[\mathcal{E}_{\mathrm{N}(0,\,1)}^{X}(1),\ \mathcal{E}_{\mathrm{N}(0,\,1-\exp\left\{-2\left[T(i)-T(i-1)\right]/\tau_Y\right\})}^{Y}(i)\Bigr] = 0, \\
&\qquad i = 2,\ldots,n.
\end{aligned}$$

The process is strictly stationary with the properties

$$\mathrm{E}[X(i)] = \mathrm{E}[Y(i)] = 0, \tag{14}$$

$$\mathrm{VAR}[X(i)] = \mathrm{VAR}[Y(i)] = 1, \tag{15}$$

$$\mathrm{CORR}[X(i), Y(i)] = \rho_{XY} = \rho_{\mathcal{E}}. \tag{16}$$
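A simulation of such a process might look like the following Python sketch. The equations as shown here give the innovation correlation only for $i = 1$ (Eq. 13) and for unequal time indices, so the same-time correlation for $i \ge 2$ used below is an assumption made for this illustration. With $a_X = \exp\{-[T(i)-T(i-1)]/\tau_X\}$ and $a_Y$ defined likewise, it is set to $\rho\,(1 - a_X a_Y)/\sqrt{(1-a_X^2)(1-a_Y^2)}$, which keeps the correlation of Eq. (16) constant because $a_X a_Y \rho + \rho\,(1 - a_X a_Y) = \rho$. The function name and arguments are hypothetical.

```python
import math
import random


def simulate_bivariate_ar1(t, tau_x, tau_y, rho, seed=0):
    """Simulate X, Y on times t with unit variance and CORR[X(i), Y(i)] = rho."""
    rng = random.Random(seed)

    def correlated_pair(sd_x, sd_y, corr):
        # Two zero-mean Gaussians with the given standard deviations
        # and correlation, built from two independent draws.
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        mix = math.sqrt(max(0.0, 1.0 - corr ** 2))
        return sd_x * z1, sd_y * (corr * z1 + mix * z2)

    x0, y0 = correlated_pair(1.0, 1.0, rho)  # first step of Eq. (12), Eq. (13)
    x, y = [x0], [y0]
    for i in range(1, len(t)):
        d = t[i] - t[i - 1]
        ax, ay = math.exp(-d / tau_x), math.exp(-d / tau_y)
        vx, vy = 1.0 - ax ** 2, 1.0 - ay ** 2  # innovation variances, Eq. (12)
        # ASSUMPTION: same-time innovation correlation chosen so that
        # CORR[X(i), Y(i)] stays equal to rho at every step.
        corr = rho * (1.0 - ax * ay) / math.sqrt(vx * vy)
        if abs(corr) > 1.0:
            raise ValueError("rho is not attainable for these persistence times")
        ex, ey = correlated_pair(math.sqrt(vx), math.sqrt(vy), corr)
        x.append(ax * x[-1] + ex)
        y.append(ay * y[-1] + ey)
    return x, y
```

For example, `simulate_bivariate_ar1([float(i) for i in range(100)], 2.0, 1.0, 0.6)` returns two lists of length 100. When the persistence times differ strongly, the required innovation correlation can exceed 1, in which case the sketch raises an error instead of silently changing the target correlation.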

About this article

Cite this article

Ólafsdóttir, K.B., Mudelsee, M. More accurate, calibrated bootstrap confidence intervals for estimating the correlation between two time series. Math Geosci 46, 411–427 (2014). https://doi.org/10.1007/s11004-014-9523-4

Received: 12 September 2013
Accepted: 19 January 2014
Published: 18 February 2014
Issue Date: May 2014
DOI: https://doi.org/10.1007/s11004-014-9523-4
