Critical Code Studies

Climategate, Chapter 4

A file from the leaked source code in the so-called Climategate scandal.

Climategate Code | Background on Climategate | Climategate report | The HARRY_READ_ME.txt file

Programming Language: Interactive Data Language
Developed: 1998
Principal Authors: Tim Mitchell, Mark New, Ian “Harry” Harris
Platform: Windows, Mac OS, Unix, VMS, and others.
Libraries Used:
Source file: (leaked file)
Interoperating files: age-banded and Hugershoff-standardized datasets (corr_age2hug.out)

II. Notes:

1. ; precedes comments in IDL. John Graham-Cumming offers a useful gloss of this code on his blog (2009).

2. MXD is “Maximum Latewood Density,” a correlate of regional temperature. Hugershoff refers to the Hugershoff function (Warren 1980).

7. The phrase “A VERY ARTIFICIAL CORRECTION” became the “smoking gun” for accusations of climate data manipulation. But as one commentator adds, “Certainly if I wanted to actually fudge something without anyone knowing, a 15-star comment wouldn’t be my first thought” (Clark 2009).

9. Creates the list (aka vector) [1400, 1904, 1909, 1914, 1919, 1924, 1929, 1934, 1939, 1944, 1949, 1954, 1959, 1964, 1969, 1974, 1979, 1984, 1989, 1994]. Findgen creates a floating-point array of the dimensions specified. Note that this array is declared without static typing, arguably a weakness in IDL.
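As a rough illustration in Python rather than IDL, the list above can be rebuilt by prepending 1400 to nineteen values derived from an index ramp of the kind Findgen supplies. The names `findgen` and `yrloc`, and the step of 5 starting at 1904, are assumptions read off the listed values, not a transcription of the original line.

```python
def findgen(n):
    """Mimic IDL's FINDGEN: the floats 0.0, 1.0, ..., n-1."""
    return [float(i) for i in range(n)]

# Assumed construction: 1400 followed by 1904, 1909, ..., 1994.
yrloc = [1400.0] + [x * 5.0 + 1904.0 for x in findgen(19)]

print(len(yrloc))
print(yrloc[:3], yrloc[-1])
```

Like the IDL original, nothing here declares a type: the elements are floats only because of how they are computed.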

10. Here are the adjustments: down for 1929-1943 and 1949-1953, then increasingly upward after 1953. Each number is multiplied by 0.75.

11. The “fudge factor” labels the offset used to correct the data.

12. If the sizes of the two arrays are not equal, display an error message. However, no doubt the use of “Ooops” was also read as signaling a larger error and perhaps, too, a lax attitude toward the code.
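The guard described in this note is a plain length comparison. A minimal Python sketch, assuming the two arrays are the list of years and the list of adjustments (the function and parameter names are hypothetical):

```python
def check_same_length(years, adjustments):
    """Stop with an error if the two lists cannot be paired one-to-one."""
    if len(years) != len(adjustments):
        # The message text follows the note; the exception type is an assumption.
        raise ValueError("Ooops!")

check_same_length([1400, 1904, 1909], [0.0, 0.0, 0.0])  # passes silently
```

A check like this is routine defensive programming: the interpolation that follows needs exactly one adjustment value per year.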

14. Load color table 39 (“rainbow and white”) for the color display palette.

16. plot draws a line of the vector arguments (0, 1).

17. nrow is the number of rows in the graph.

21. endif ends the first if condition, just as endelse ends an else.

24. See 21.

28. Restores the variables from the reglists file.

29. This array contains the regions: northwest Canada, west North America, central Canada, northwest Europe, southwest Europe, north Siberia, central Siberia, TIB?, east Siberia, and all sites. The name harryfn most likely refers to Ian “Harry” Harris, one of the programmers (with “fn” possibly referring to “file name”).

32. rawdat=fltarr(4,2000) creates “rawdat,” a floating-point array of 4 x 2000.
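For readers unfamiliar with IDL, fltarr returns an array filled with zeros, and its first dimension counts columns. A loose Python analogue using nested lists (the helper below is a hypothetical stand-in, not an IDL binding):

```python
def fltarr(ncols, nrows):
    """Zero-filled table of floats: nrows rows, each with ncols entries."""
    return [[0.0] * ncols for _ in range(nrows)]

rawdat = fltarr(4, 2000)
print(len(rawdat), len(rawdat[0]))
```

The array is only a container at this point; the values arrive when the data files are read into it.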

33-38. Opens the harryfn files and reads in the data.

40. Reform reshapes the array after a subsection (in this case 2:3) is removed.

45. The boundaries of the data are 1400 and 1992.

52-53. Notably, this normalization runs from 1881 to 1960, before the tree data become less reliable. Another version of this code contains a commented-out section that applies this normalization to 1881-1940.
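To make the role of the reference period concrete, here is a small Python sketch of normalizing a series against a baseline window. It assumes that normalization means subtracting the reference-period mean and dividing by the reference-period sample standard deviation; the leaked routine may differ in detail, and the function name and sample data are hypothetical.

```python
from statistics import mean, stdev

def normalize(years, values, ref_start=1881, ref_end=1960):
    """Express every value relative to the mean and spread of the reference period."""
    ref = [v for y, v in zip(years, values) if ref_start <= y <= ref_end]
    mu = mean(ref)
    sigma = stdev(ref)  # assumption: sample standard deviation
    return [(v - mu) / sigma for v in values]

# Made-up values; only 1881, 1920 and 1960 fall inside the reference period.
years = [1880, 1881, 1920, 1960, 1961]
values = [5.0, 1.0, 2.0, 3.0, 10.0]
print(normalize(years, values))
```

Changing `ref_end` from 1960 to 1940 changes the mean and spread that every value is measured against, which is why the choice of window matters.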

57. Uses linear interpolation to fill in the data in between the adjusted years.
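The interpolation step can be sketched in Python as follows. The argument order mirrors IDL's interpol (values, their locations, then the locations wanted), but the function is a hand-written stand-in, and the adjustment values are hypothetical, chosen only to show the mechanics.

```python
def interpol(values, knots, targets):
    """Piecewise-linear interpolation of (knots, values) at each target.

    Assumes knots are strictly increasing and every target lies
    between the first and last knot.
    """
    result = []
    for t in targets:
        for i in range(len(knots) - 1):
            x0, x1 = knots[i], knots[i + 1]
            if x0 <= t <= x1:
                frac = (t - x0) / (x1 - x0)
                result.append(values[i] + frac * (values[i + 1] - values[i]))
                break
        else:
            raise ValueError(f"{t} lies outside the knot range")
    return result

# Hypothetical adjustments defined every five years...
knot_years = [1904, 1909, 1914]
knot_adjust = [0.0, 1.0, 0.5]
# ...spread across every individual year in between.
yearly = interpol(knot_adjust, knot_years, list(range(1904, 1915)))
print(yearly)
```

The effect is that an adjustment specified at a handful of years becomes a smooth ramp applied to each year of the series.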

58. This adjustment is made to data containing “all bands.”

77. Perhaps a typo: the code that follows begins its work after 1400, not 1600.

79. Similar to above, the code narrows the range to after 1400.

81. Narrows range of adjustment to just years after 1400.

85. See comment on line 57.

86. This second adjustment is made just to the data containing “2-6 bands.”

94-95. oplot plots points over a previously drawn plot without redrawing the axes.

124. printf, 1 prints to an open file (1).