Politics : Foreign Affairs Discussion Group


To: DavesM who wrote (187751)   5/31/2006 5:24:07 AM
From: Wharf Rat
 
You mean this one?
GEOPHYSICAL RESEARCH LETTERS, VOL. 32, L03710, doi:10.1029/2004GL021750, 2005

Hockey sticks, principal components, and spurious significance

Stephen McIntyre

Northwest Exploration Co., Ltd., Toronto, Ontario, Canada

Ross McKitrick

Department of Economics, University of Guelph, Guelph, Ontario, Canada
===========================================

2) What is the point of contention in MM05?

MM05 contend that the particular PC convention used in MBH98 in dealing with the N. American tree rings selects for the 'hockey stick' shape and that the final reconstruction result is simply an artifact of this convention.

3) What convention was used in MBH98?

MBH98 were particularly interested in whether the tree ring data showed significant differences from the 20th century calibration period, and therefore normalized the data so that the mean over this period was zero. As discussed above, this will emphasize records that have the biggest differences from that period (either positive or negative). Since the underlying data have a 'hockey stick'-like shape, it is therefore not surprising that the most important PC found using this convention resembles the 'hockey stick'. There are actually two significant PCs found using this convention, and both were incorporated into the full reconstruction.
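To make the two conventions concrete, here is a minimal sketch in Python with NumPy. The toy array, the helper name `center` and the choice of calibration rows are all made up for illustration; this is not the MBH98 code or data, just the arithmetic of subtracting a mean taken over different sets of rows.

```python
import numpy as np

def center(data, rows=None):
    """Subtract each column's mean, computed over `rows` (all rows if None)."""
    ref = data if rows is None else data[rows]
    return data - ref.mean(axis=0)

# Hypothetical toy network: 6 "years" x 2 "proxy series".
# The last 2 rows stand in for the calibration period.
toy = np.array([[0.0, 1.0],
                [0.0, 1.0],
                [0.0, 1.0],
                [0.0, 1.0],
                [3.0, 1.0],
                [3.0, 1.0]])
calib = slice(4, 6)

# Zero mean over the calibration rows only (the convention described above).
short_centered = center(toy, calib)
# Zero mean over the whole record (the alternative convention).
full_centered = center(toy)

# Sum of squares per series under each convention.
short_ss = (short_centered ** 2).sum(axis=0)
full_ss = (full_centered ** 2).sum(axis=0)
```

In this toy case the first series, which departs from its calibration-period level, ends up with a sum of squares of 36 under the calibration-period convention against 12 under the whole-record convention, while the flat second series is zero either way. That is the sense in which the first convention emphasizes records that differ most from the calibration period.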

4) Does using a different convention change the answer?

As discussed above, a different convention (MM05 suggest one that has zero mean over the whole record) will change the ordering, significance and number of important PCs. In this case, the number of significant PCs increases from 2 to 5 (maybe 6). This is the difference between the blue points (MBH98 convention) and the red crosses (MM05 convention) in the first figure. Also, PC1 in the MBH98 convention moves down to PC4 in the MM05 convention. This is illustrated in the figure on the right: the red curve is the original PC1 and the blue curve is MM05 PC4 (adjusted to have the same variance and mean). But as we stated above, the underlying data have a hockey stick structure, and so in either case the 'hockey stick'-like PC explains a significant part of the variance. Therefore, using the MM05 convention, more PCs need to be included to capture the significant information contained in the tree ring network.
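The reordering can be reproduced on a small synthetic example. The sketch below is an illustration under made-up assumptions, not the MBH98 or MM05 calculation: the "network" is six noise-free series built from just two shapes (a ramp named `stick` and a sine named `cycle`), so the hockey-stick-like pattern can only drop from PC1 to PC2 rather than to PC4 as in the real network. All names are hypothetical.

```python
import numpy as np

def leading_pcs(data, n_pcs):
    """Return the first n_pcs PC time series (columns) and all variance fractions."""
    u, s, _ = np.linalg.svd(data, full_matrices=False)
    return u[:, :n_pcs], s ** 2 / np.sum(s ** 2)

def abs_corr(a, b):
    """Absolute Pearson correlation between two 1-D series."""
    return abs(np.corrcoef(a, b)[0, 1])

years = np.arange(500)
calib = slice(400, 500)  # assumed calibration period: last 100 "years"

# Two made-up shapes: flat-then-rising, and a steady oscillation.
stick = np.where(years >= 400, (years - 400) / 100.0, 0.0)
cycle = 0.5 * np.sin(2 * np.pi * years / 100.0)
network = np.column_stack([stick] * 3 + [cycle] * 3)

# Same data, two centering conventions.
short = network - network[calib].mean(axis=0)  # zero mean over calibration period
full = network - network.mean(axis=0)          # zero mean over whole record

pcs_short, frac_short = leading_pcs(short, 2)
pcs_full, frac_full = leading_pcs(full, 2)

# How closely does each of the first two PCs track the ramp shape?
short_match = [abs_corr(pcs_short[:, k], stick) for k in range(2)]
full_match = [abs_corr(pcs_full[:, k], stick) for k in range(2)]
```

With calibration-period centering, the first PC tracks `stick` closely; with whole-record centering, the first PC is mostly `cycle` and the `stick` pattern shows up in the second PC. The pattern is present in the data either way, and only its rank changes.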

This figure shows the difference in the final result between using the original convention with 2 PCs (blue) and the MM05 convention with 5 PCs (red). The MM05-based reconstruction is slightly less skillful when judged over the 19th century validation period but is otherwise very similar. In fact, any calibration convention will lead to approximately the same answer as long as the PC decomposition is done properly and one determines how many PCs are needed to retain the primary information in the original data.

5) What happens if you just use all the data and skip the whole PCA step?

This is a key point. If the PCs being used were inadequate in characterizing the underlying data, then the answer you get using all of the data will be significantly different. If, on the other hand, enough PCs were used, the answer should be essentially unchanged. This is shown in the figure below. The reconstruction using all the data is in yellow (the green line is the same thing but with the 'St-Anne River' tree ring chronology taken out). The blue line is the original reconstruction, and as you can see the correspondence between them is high. The validation is slightly worse, illustrating the trade-off mentioned above: when using all of the data, over-fitting during the calibration period (due to the increased number of degrees of freedom) leads to a slight loss of predictability in the validation step.
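The degrees-of-freedom point can be sketched with an ordinary least-squares fit. Everything below is a hypothetical stand-in (random synthetic proxies, a plain regression with an intercept, made-up calibration and validation windows) and is not the reconstruction method used in the papers. It only shows the mechanics of comparing a fit on a few PCs with a fit on every series.

```python
import numpy as np

def fit_and_score(predictors, target, calib, valid):
    """Least-squares fit (with intercept) on calib rows; return (calib_mse, valid_mse)."""
    design = np.column_stack([np.ones(len(target)), predictors])
    coef, *_ = np.linalg.lstsq(design[calib], target[calib], rcond=None)
    resid = target - design @ coef
    return np.mean(resid[calib] ** 2), np.mean(resid[valid] ** 2)

rng = np.random.default_rng(0)
n_years, n_proxies = 150, 20
calib = slice(100, 150)  # assumed calibration window
valid = slice(50, 100)   # assumed validation window

# Made-up "temperature" and noisy proxies that each partly record it.
target = np.cumsum(rng.normal(size=n_years)) / 5.0
loadings = rng.uniform(0.5, 1.5, size=n_proxies)
proxies = np.outer(target, loadings) + rng.normal(size=(n_years, n_proxies))

# PC scores from the whole-record-centered network.
centered = proxies - proxies.mean(axis=0)
u, s, _ = np.linalg.svd(centered, full_matrices=False)
scores = u * s  # column k is the time series of PC k+1

calib_pcs, valid_pcs = fit_and_score(scores[:, :2], target, calib, valid)
calib_all, valid_all = fit_and_score(proxies, target, calib, valid)
```

The fit on all series can never do worse than the 2-PC fit over the calibration window, because the PC scores are linear combinations of the same series plus a constant. Whether it does better or worse over the validation window depends on the data, which is why the validation comparison is the informative one.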

6) So how do MM05 conclude that this small detail changes the answer?

MM05 claim that the reconstruction using only the first 2 PCs with their convention is significantly different from MBH98. Since PCs 3, 4 and 5 (at least) are also significant, they are leaving out good data. It is mathematically wrong to retain the same number of PCs if the convention of standardization is changed. In this case, it causes a loss of information that is very easily demonstrated, first by showing that any such results do not resemble the results from using all data, and second by checking the validation of the reconstruction for the 19th century. The MM version of the reconstruction can be matched by simply removing the N. American tree ring data along with the 'St Anne River' Northern treeline series from the reconstruction (shown in yellow below). Compare this curve with the ones shown above.
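One way to see why the number of retained PCs has to be re-derived after changing the convention is to count how many PCs are needed to reach a given share of the total variance. The sketch below uses a simple cumulative-variance threshold purely as a stand-in; it is not the selection rule used in either paper, and the function name and demo matrix are hypothetical.

```python
import numpy as np

def n_pcs_needed(centered, threshold):
    """Smallest number of leading PCs whose variance share reaches `threshold`.

    `centered` must already be centered under whichever convention is in use.
    """
    s = np.linalg.svd(centered, compute_uv=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(s))

# Made-up matrix with singular values 3, 2, 1, i.e. variance shares
# 9/14, 4/14 and 1/14.
demo = np.diag([3.0, 2.0, 1.0])

k_low = n_pcs_needed(demo, 0.60)
k_mid = n_pcs_needed(demo, 0.90)
k_high = n_pcs_needed(demo, 0.95)
```

Here the three thresholds need 1, 2 and 3 PCs respectively. Because the centering convention changes the singular values, the count has to be recomputed on the re-centered data rather than carried over.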

As you might expect, throwing out data also worsens the validation statistics, as can be seen by eye when comparing the reconstructions over the 19th century validation interval. Compare the green line in the figure below to the instrumental data in red. To their credit, MM05 acknowledge that their alternate 15th century reconstruction has no skill.

7) Basically then the MM05 criticism is simply about whether selected N. American tree rings should have been included, not that there was a mathematical flaw?

Yes. Their argument since the beginning has essentially not been about methodological issues at all, but about 'source data' issues. Particular concerns with the "bristlecone pine" data were addressed in the follow-up paper MBH99, but the fact remains that including these data improves the statistical validation over the 19th century period and they therefore should be included.

8) So does this all matter?

No. If you use the MM05 convention and include all the significant PCs, you get the same answer. If you don't use any PCA at all, you get the same answer. If you use a completely different methodology (i.e. Rutherford et al., 2005), you get basically the same answer. Only if you remove significant portions of the data do you get a different (and worse) answer.
realclimate.org