Bamber et al. (2019)

Title:

Ice sheet contributions to future sea-level rise from structured expert judgment

Key Points:

Potential contributions of ice sheets to future sea-level rise (SLR) remain the largest source of uncertainty in SLR projections
For a +2 °C temperature scenario, consistent with the Paris Agreement, the study estimates a median SLR contribution of 26 cm by 2100, with a 95th percentile value of 81 cm. For a +5 °C temperature scenario, more consistent with unchecked emissions growth, the corresponding values are 51 cm and 178 cm, respectively.
Beyond 2100, uncertainty and projected SLR increase rapidly. For the +5 °C scenario, the 95th percentile ice sheet contribution by 2200 is 7.5 m due to instabilities in both West and East Antarctica.
Scenarios of 21st century global total SLR exceeding 2 m should be used for planning purposes.

Keywords:

sea-level rise, climate predictions, ice sheets, Greenland, Antarctica

Corresponding author:

Jonathan L. Bamber

Citation:

Bamber, J. L., Oppenheimer, M., Kopp, R. E., Aspinall, W. P., & Cooke, R. M. (2019). Ice sheet contributions to future sea-level rise from structured expert judgment. Proceedings of the National Academy of Sciences, 116(23), 11195–11200. doi:10.1073/pnas.1817205116

URL:

https://www.pnas.org/doi/full/10.1073/pnas.1817205116

Significance

Future sea level rise (SLR) poses serious threats to the viability of coastal communities, but continues to be challenging to project using deterministic modeling approaches. Nonetheless, adaptation strategies urgently require quantification of future SLR uncertainties, particularly upper-end estimates. Structured expert judgement (SEJ) has proved a valuable approach for similar problems. Our findings, using SEJ, produce probability distributions with long upper tails that are influenced by interdependencies between processes and ice sheets. We find that a global total SLR exceeding 2 m by 2100 lies within the 90% uncertainty bounds for a high emission scenario. This is more than twice the upper value put forward by the Intergovernmental Panel on Climate Change in the Fifth Assessment Report.

Abstract

Despite considerable advances in process understanding, numerical modeling, and the observational record of ice sheet contributions to global mean sea-level rise (SLR) since the Fifth Assessment Report (AR5) of the Intergovernmental Panel on Climate Change, severe limitations remain in the predictive capability of ice sheet models. As a consequence, the potential contributions of ice sheets remain the largest source of uncertainty in projecting future SLR. Here, we report the findings of a structured expert judgement study, using unique techniques for modeling correlations between inter- and intra-ice sheet processes and their tail dependences. We find that since the AR5, expert uncertainty has grown, in particular because of uncertain ice dynamic effects. For a +2 °C temperature scenario consistent with the Paris Agreement, we obtain a median estimate of a 26 cm SLR contribution by 2100, with a 95th percentile value of 81 cm. For a +5 °C temperature scenario more consistent with unchecked emissions growth, the corresponding values are 51 and 178 cm, respectively. Inclusion of thermal expansion and glacier contributions results in a global total SLR estimate that exceeds 2 m at the 95th percentile. Our findings support the use of scenarios of 21st century global total SLR exceeding 2 m for planning purposes. Beyond 2100, uncertainty and projected SLR increase rapidly. The 95th percentile ice sheet contribution by 2200, for the +5 °C scenario, is 7.5 m as a result of instabilities coming into play in both West and East Antarctica. Introducing process correlations and tail dependences increases estimates by roughly 15%.

Introduction

Global mean sea-level rise (SLR), which during the last quarter century has occurred at an accelerating rate (1), averaging about +3 mm⋅y−1, threatens coastal communities and ecosystems worldwide. Adaptation measures accounting for the changing hazard, including building or raising permanent or movable structures such as surge barriers and sea walls, enhancing nature-based defenses such as wetlands, and selective retreat of populations and facilities from areas threatened by episodic flooding or permanent inundation, are being planned or implemented in several countries. Risk assessment for such adaptation efforts requires projections of future SLR, including careful characterization and evaluation of uncertainties (2) and regional projections that account for ocean dynamics, gravitational and rotational effects, and vertical land motion (3). During the nearly 40 y since the first modern, scientific assessments of SLR, understanding of the various causes of this rise has advanced substantially. Improvements during the past decade include closing the historic sea-level budget, attributing global mean SLR to human activities, confirming acceleration of SLR since the nineteenth century and during the satellite altimetry era, and developing analytical frameworks for estimating regional and local mean sea level and extreme water level changes. Nonetheless, long-term SLR projections remain acutely uncertain, in large part because of inadequate understanding of the potential future behaviors of the Greenland and Antarctic ice sheets and their responses to future global climate change. This limitation is especially troubling, given that the ice sheet influence on SLR has been increasing since the 1990s (4) and has overtaken mountain glaciers to become the largest barystatic (mass) contribution to SLR (5). In addition, for any given future climate scenario, the ice sheets constitute the component with the largest uncertainties by a substantial margin, especially beyond 2050 (6).

Advances since the Fifth Assessment Report (AR5) of the Intergovernmental Panel on Climate Change (7) include improved process understanding and representation in deterministic ice sheet models (8, 9), probabilistic projections calibrated against these models and the observational record (10), and new semiempirical models, based on the historical relationship between temperature and sea-level changes. Each of these approaches has limitations that stem from factors including poorly understood processes, poorly constrained boundary conditions, and a short (∼25 y) satellite observation record of ice sheets that does not capture the time scales of internal variability in the ice sheet climate system. As a consequence, it is unclear to what extent recent observed ice sheet changes (11) are a result of internal variability (ice sheet weather) or external forcing (ice sheet climate).

Where other methods are intractable for scientific or practical reasons, structured expert judgement (SEJ), using calibrated expert responses, provides a formal approach for estimating uncertain quantities according to current scientific understanding. It has been used in a wide range of applications, including natural and anthropogenic hazards such as earthquakes, volcanic eruptions, vector-borne disease spread, and nuclear waste security (12). That said, it should not be regarded as a substitute for fundamental research into driving processes, but instead as a source of complementary insights into the current state of knowledge and, in particular, the extent of the uncertainties (12). An SEJ study conducted in advance of the AR5 (13) (hereafter BA13) provided valuable insights into the uncertainties around ice sheet projections, as assessed at that time.

Since then, regional- and continental-scale, process-based modeling of ice sheets has advanced substantially (8, 9, 14–16), with the inclusion of new positive feedbacks that could potentially accelerate mass loss, and negative feedbacks that could potentially slow it. These include solid Earth and gravitational processes (17, 18), Antarctic marine ice cliff instability (19), and the influences of organic and inorganic impurities on the albedo of the Greenland Ice Sheet (20). The importance of these feedbacks is an area of continuing research. Therefore, alternative approaches must be exploited to assess future SLR and, critically, its associated uncertainties (21), to serve the more immediate needs of the science and policy communities.

Here, we report the findings of an SEJ exercise undertaken in 2018 via separate, 2-d workshops held in the United States and United Kingdom, involving 22 experts (hereafter SEJ2018). Details of how experts were selected are provided in SI Appendix, Note 1. The questions and format of the workshops were identical, so that the findings could be combined using an impartial weighting approach (Methods). SEJ (as opposed to other types of expert elicitation) weights each expert using objective estimates of their statistical accuracy and informativeness (22), determined using experts’ uncertainty evaluations over a set of seed questions from their field with ascertainable values (Methods). The approach is analogous to weighting climate models based on their skill in capturing a relevant property, such as the regional 20th century surface air temperature record (23). In SEJ, the synthetic expert (i.e., the performance weighted [PW] combination of all of the experts’ judgments) in general outperforms an equal weights (EW) combination in terms of statistical accuracy and informativeness, as illustrated in SI Appendix, Fig. S3. The approach is particularly effective at identifying those experts who are able to quantify their uncertainties with high statistical accuracy for specified problems rather than, for example, experts with restricted domains of knowledge or even high scientific reputation (12).

The participating experts quantified their uncertainties for three physical processes relevant to ice sheet mass balance: accumulation, discharge, and surface runoff. They did this for each of the Greenland, West Antarctic, and East Antarctic ice sheets (GrIS, WAIS, and EAIS, respectively), and for two schematic temperature change scenarios. The first temperature trajectory (denoted L) stabilized in 2100 at +2 °C above preindustrial global mean surface air temperature (defined as the average for 1850–1900), and the second (denoted H) stabilized at +5 °C (SI Appendix, Fig. S1). The experts generated values for four dates: 2050, 2100, 2200, and 2300. Experts also quantified the dependence between accumulation, runoff, and discharge within each of the three ice sheets, and between each ice sheet for discharge only, for the H scenario in 2100. We used temperature trajectories rather than emissions scenarios to isolate the experts’ judgements about the relationship between global mean surface air temperature change and ice sheet changes from judgements about climate sensitivity.

An important and unique element of SEJ2018 was the elicitation of intra- and inter-ice sheet dependencies (SI Appendix, Note 1.5). Two features of dependence were elicited: a central dependence and an upper tail dependence. The former captures the probability that one variable exceeds its median given that the other variable exceeds its median, whereas the latter captures the probability that one variable exceeds its 95th percentile given that the other exceeds its 95th percentile. It is well known that these two types of dependence are, in general, markedly different, a property that is not captured by the usual Gaussian dependence model. The latter always imposes tail independence, regardless of the degree of central dependence, and can produce large errors when applied inappropriately (24). For example, if GrIS discharge exceeds its 95th percentile, what is the probability that runoff will also exceed its 95th percentile? This probability may be substantially higher than the independent probability of 5%, and ignoring tail dependence may lead to underestimating the probability of high SLR contributions. On the basis of each expert’s responses, a joint distribution was constructed to capture the dependencies among runoff, accumulation, and discharge for GrIS, WAIS, and EAIS, with dependence structures chosen, per expert, to capture central and tail dependences (Methods and SI Appendix, Note 1.5). In BA13, heuristic dependency values were applied on the basis of simple assumptions about the response of processes to a common forcing.

To help interpret the findings, experts were also asked to provide qualitative and rank-order information on what they regard to be the leading processes that could influence ice dynamics and surface mass balance (snowfall minus ablation); henceforth, this is termed the descriptive rationale. Further details can be found in the SI Appendix. The combined sea-level contribution from all processes and ice sheets was determined assuming either independence or dependence. Here, we focus on the findings with dependence; we examine the effect of the elicited dependencies and the approach taken in SI Appendix, Note 1.5.

The ice sheet contributions were expressed as anomalies from the 2000–2010 mean states, which were predefined (SI Appendix, Table S7). The baseline sea-level contribution for this period was prescribed as 0.76 mm⋅y−1 (0.56, 0.20, and 0.00 mm⋅y−1 for GrIS, WAIS, and EAIS, respectively) and has been added to the elicited values discussed here. This is close to an observationally derived value of 0.79 mm⋅y−1 for the same period, which was published subsequently to the SEJ workshops (4).

Results and Discussion

Fig. 1 shows the probability density functions (PDFs) for both temperature trajectory scenarios for the combined ice sheet contributions, assuming some dependencies exist between ice sheet processes, as elicited from the expert group (SI Appendix, Note 1.5). The associated numerical values are detailed in Table 1, and plots for all four epochs are provided in SI Appendix, Fig. S2. They display similar characteristics to Fig. 1. The PDFs were generated using Monte Carlo sampling from the intrinsic range obtained from the expert responses (22). All PDFs are non-Gaussian and exhibit an extended upper tail, especially for the H temperature scenario. We believe this reflects the experts’ joint view that large amplitude, nonlinear instabilities could be triggered at this higher temperature, even by 2050. For example, for 2050, the median [and likely range, defined as the 17–83% probability range, as in the AR5 (25)] of the ice sheet contributions are 10 cm (5–18 cm) for the L scenario and 12 cm (6–24 cm) for the H scenario. The tail behavior is discussed further in SI Appendix, Note 1.1. By 2100, the differences between the scenarios grow larger, with projected contributions of 26 cm (12–53 cm) and 51 cm (22–113 cm; Fig. 2 and Table 1).

Fig. 1: PDFs for the L (blue) and H (red) temperature scenarios for the combined ice sheet SLR contributions at (A) 2100 and (B) 2300. All four time intervals are shown in SI Appendix, Fig. S2. The horizontal bars show the fifth, 17th, 50th (median), 83rd, and 95th percentile values. The baseline rate of 0.76 mm⋅a−1 is included. Note that there is more than a factor five change in the x axis scales.

Table 1: Projected sea-level rise contributions from each ice sheet and combined. Individual ice sheet and total sea-level contributions for both temperature scenarios and for the four periods considered: 2050, 2100, 2200, and 2300. All values assume the dependencies elicited for the 2100 H case. Because the PDFs are not Gaussian, the mean and median values differ; the latter is a better measure of central tendency. All values are cumulative from 2000 and include the baseline imbalance for 2000–2010 of 0.76 mm y−1. The AR5-defined likely range (17–83%) is provided alongside the 90% credible interval. PW01 denotes the performance weighted combination of experts based on their calibration score.

Year and ice sheet Low High Mean ± SD 50% 5–95% 17–83% Mean ± SD 50% 5–95% 17–83% 2050 PW01 SLR 11 ± 8 10 1–27 5–18 15 ± 12 12 1–38 6–24 GrIS 7 ± 5 5 2–18 3–11 9 ± 7 6 2–27 4–14 WAIS 7 ± 8 5 0–23 1–7 5 ± 6 4 0–18 1–10 EAIS 0 ± 2 0 −4–4 −2–1 0 ± 4 0 −6–7 −3–2 2100 PW01 SLR 32 ± 25 26 3–81 12–53 67 ± 56 51 7–178 22–112 GrIS 19 ± 16 13 2–57 7–31 33 ± 30 23 2–99 10–60 WAIS 13 ± 16 8 −3–44 2–23 27 ± 33 18 −5–93 3–46 EAIS 3 ± 6 0 −8–12 −3–4 6 ± 17 2 −11–46 −4–11 2200 PW01 SLR 89 ± 72 72 5–231 30–149 204 ± 260 130 5–750 40–251 GrIS 49 ± 47 34 5–149 19–79 77 ± 69 55 3–216 23–122 WAIS 37 ± 45 26 −24–128 1–76 80 ± 113 51 −25–324 −3–138 EAIS 4 ± 15 2 −15–34 −6–10 48 ± 158 6 −29–398 −10–19 2300 PW01 SLR 155 ± 137 120 0–426 47–259 310 ± 322 225 14–988 87–466 GrIS 78 ± 75 55 7–237 30–145 130 ± 117 98 7–349 39–225 WAIS 67 ± 88 44 −47–248 6–131 117 ± 136 83 −36–384 7–228 EAIS 10 ± 41 3 −29–96 −8–24 63 ± 195 10 −53–498 −14–51

Fig. 2: Median and likely range (17th–83rd percentile as used in the AR5) estimates of the ice sheet SLR contributions for different temperature scenarios and different studies. AR5 RCP ice sheet contributions are shown for RCP 2.6 and RCP 8.5 by combining contributions from the different sources (gray bars). BA13 is shown for the elicited temperature increase of 3.5 °C by 2100 (orange bar). This study (SEJ2018, in blue) is shown for the L and H temperature scenarios using solid lines. Dashed lines are interpolated from the L and H findings, using stochastic resampling of the distributions assuming a linear relationship between pairs of L and H samples.

The relative contribution of each ice sheet to total SLR (used here to refer to the sum of the three ice sheet contributions) depends on the temperature scenario. To demonstrate this, we compare the mean projections for the three ice sheets for the overall 2100 H distribution, for the same distribution conditional on the total contribution being above the median total projection (>51 cm), and the same distribution conditional on the total being above the 90th percentile (>141 cm). In the unconditional distribution, GrIS dominates the mean projection, contributing 33 cm (49%) of the 67-cm total, compared with 27 cm for WAIS and 6 cm for EAIS: proportions that approximately mirror the present-day contributions (4). The GrIS share declines for larger total contributions. For the mean of the upper half of total SLR projections, GrIS contributes 49 cm (46%) of 106 cm total compared with 44 cm for WAIS and 13 cm for EAIS; for the mean of the top decile, GrIS contributes 60 cm (30%) of the 194-cm total compared with 95 cm for WAIS and 39 cm for EAIS.

Statistically, the declining GrIS share and declining GrIS/AIS ratio reflect a higher mean estimate but slightly less skewed distribution for GrIS than for WAIS, and a long tail for EAIS (Fig. 3), as well as the assessed dependence structure between different terms. Physically, this is likely a result of the role of highly nonlinear dynamic processes coming into play for marine sectors of the AIS that are needed to achieve the higher values of total SLR, whereas at lower total SLR values, more linear processes dominate. It is also noteworthy that the fifth percentiles for both temperature scenarios and for all epochs are less than their current values, suggesting a scenario in which increased snowfall, primarily over the AIS (Table 1), plausibly compensates for any changes in ice dynamics and enhanced melting over the GrIS.

Fig. 3: Individual ice sheet contributions to SLR for 2100 L (A) and H (B) temperature scenarios, assuming dependences between the ice sheets in terms of the processes of accumulation, runoff, and discharge. PDFs were generated from 50,000 realizations of the relevant SEJ distributions. Horizontal bars indicate the fifth, 50th, and 95th percentile values (i.e., the 90% credible range). Also shown are the likely range (17th–83rd percentile) as defined in the AR5 and the total AIS contribution (WAIS plus EAIS assuming the inter ice sheet dependencies elicited). Note that this is not simply the sum of WAIS and EAIS contributions because of inter-ice sheet dependencies. The AIS values are compared with a recent emulator approach (30) in SI Appendix, Fig. S11.

Direct comparison with the AR5 is complicated by the use of different external forcings. Our L scenario is slightly warmer than the median projection for Representative Concentration Pathway (RCP) 2.6, and cooler than the median projection for RCP 4.5 at 2100 (2081–2100 global mean warming of +1.9 °C compared with medians of +1.6 °C and +2.4 °C, for RCP 2.6 and RCP 4.5, respectively), whereas our H scenario is roughly comparable to the median projection for RCP8.5 (2081–2100 global mean warming of +4.5 °C compared with a median of +4.3 °C for RCP 8.5), although with different trajectories (SI Appendix, Fig. S1). Our two temperature scenarios were chosen to assess the potential consequences, in terms of SLR, of the goal of the COP21 Paris agreement to keep global temperatures below +2 °C above preindustrial and of a scenario closer to business as usual, as opposed to matching a specific RCP. For comparison, the AR5 likely range ice sheet SLR contribution for RCP8.5 at 2100 is 6–35 cm, with a median of 19 cm (7) (Fig. 2). As mentioned, comparing our findings with those from the AR5 requires transforming temperatures and percentiles to match those used in the AR5. Nonetheless, given these caveats, it is clear the SEJ median and upper value of the likely range (83rd percentile) are statistically significantly larger than the corresponding AR5 values (Fig. 2). Our likely range upper bound is almost three times the AR5 value for RCP 8.5 (94 vs. 35 cm, estimated by summing the individual components considered in the AR5 and, hence, assuming perfect dependence). This is driven, primarily, by larger uncertainty ranges for the WAIS and GrIS contributions (Fig. 3), possibly resulting from experts’ consideration of the aforementioned nonlinear processes. We note also that the uncertainties have grown substantially in comparison with BA13, where the elicited temperature increase above preindustrial was +3.5 °C (indicated by the orange line in Fig. 2). In comparison, our current findings result in a larger uncertainty range at a lower temperature increase (Fig. 2). There has been recent consideration of the benefits of limiting warming to +1.5 °C (26) and what difference this would make compared with the Paris Agreement +2 °C. The reduction in the sea-level contribution from the ice sheets at this lower temperature for our study is broadly in line with the findings of the Intergovernmental Panel on Climate Change Special Report on 1.5 °C, which obtained a value of 10 cm reduction in global mean sea level from all sources (26).

Another important point is the positive skews of the distributions, which result in long upper tails that are less apparent in the AR5 values (limited to the likely range). For example, the median values obtained here and in the AR5 for RCP2.6 differ by 8 cm (Fig. 2), but the 83rd percentile from the SEJ is about 100% larger (51 vs. 25 cm). This becomes even more important if considering probabilities beyond the likely range defined in the AR5, such as the very likely range (the 90th percentile confidence interval). This is apparent from the values in Table 1. Kurtosis provides a quantitative measure of tail behavior and is discussed in SI Appendix, Note 1.1.

Fig. 3 illustrates the PDFs for 2100 L and H temperature trajectories for each ice sheet. The 90% credible intervals for the GrIS and WAIS (approximately equivalent to the very likely range in Intergovernmental Panel on Climate Change terminology) are broadly similar to one another in both scenarios (c.f. the 90% credible interval bars in Fig. 3 A and B). For the 2100 L and H scenarios, the EAIS uncertainty ranges are about a factor of three and two smaller, respectively. Median values for the GrIS and WAIS are broadly comparable (13/8 cm for L and 23/18 cm for H), whereas the EAIS median values are 0 and 2 cm for L and H, respectively. Both the WAIS and GrIS show a strong skew with a long positive tail, which is absent for the EAIS for 2100 L but begins to emerge for 2100 H. There is, consequently, a substantial difference between the high-end, 95th percentile values considered here versus the 83rd percentile value presented in the AR5, which is far more pronounced than differences between the fifth and 17th percentiles (Fig. 3). For WAIS under 2100 H, the difference between the 83rd and 95th percentile is a factor two (Fig. 3 and Table 1), and a factor four for the EAIS. This is also seen when considering the total SLR from the ice sheets. For 2200 H, the 83rd and 95th percentiles are 251 and 750 cm, respectively (Table 1). By limiting consideration only to the likely range, the AR5 results miss this tail behavior, which is a critical component of risk management.

The present SEJ demonstrates a shift in expert opinion since BA13 (i.e., in 2012), when it was found that the GrIS had the narrowest 90% credible range but the largest median SLR rate (13). Here, the GrIS still has the largest median value (for both L and H), but the upper tail of the distribution is now comparable to that of the WAIS (Fig. 3A). It is difficult to determine the basis for this, but we note that the experts overwhelmingly believe that the recent (last 2 decades) acceleration in mass loss from the GrIS is predominantly a result of external forcing, rather than internal variability. Of the 22 experts, 18 judge the acceleration is largely or entirely a result of external forcing (SI Appendix, Fig. S9A and Table S6). This is an important and statistically significant shift from the findings in BA13. In contrast, for the WAIS, opinion remains divided, with seven experts indicating their view that it is largely a result of internal variability, seven placing more weight on external forcing, and eight giving equal weights to each. This reflects the earlier conclusions of BA13.

The findings of SEJ2018 cannot be directly compared with BA13 because the target questions differ, as do the temperature scenarios. The closest comparison that can be made between SEJ2018 and BA13 is for the latter’s cumulative 5/50/95% SLR values of 10/29/84 cm for 2010–2100, which comprised two-thirds from GrIS, one-third from WAIS, and a negligible amount from EAIS, for a temperature increase based on experts’ judgement of +3.5 °C (13). For SEJ2018, we obtain −5/18/73 cm for +2 °C rise and −1/43/170 cm for +5 °C rise (integrated over 2000–2100). Fig. 2 compares the likely range in BA13 and the various temperature markers used here and in the AR5. It is evident that opinion has shifted toward a stronger ice sheet response and a larger credible range, for a given temperature change, than was considered plausible by the experts 6 y ago.

The rather high median and 95% values for 2100 SLR (Fig. 2 and Table 1), found here, likely reflect recent studies that have explored, in particular, AIS sensitivity to CO2 forcing during previous warm periods (27, 28) and new positive feedback processes such as the Marine Ice Cliff Instability (19), alongside the increasing evidence for a secular trend in Arctic climate (29) and subsequent increasing GrIS mass loss (4). A recent study (30) has used an emulator approach to reexamine the potential role of the Marine Ice Cliff Instability in explaining past sea level and how this affects projections, and we can compare our AIS results with the projections reported in ref. 30. Our results lie between the emulation with Marine Ice Cliff Instability and without, lying closer to the latter for the median values (SI Appendix, Fig. S11). Uncertainties for the H temperature scenario grow rapidly beyond 2100, with 90th percentile credible ranges of −10 to 740 cm and −9 to 970 cm for 2200 and 2300, respectively. Limiting projections to the likely range largely obscures the real, and potentially critical, extent of the deep uncertainties evident in this study.

Global Total SLR Projections

To place these results in the context of total SLR projections, including contributions from ocean thermal expansion, glaciers, and land-water storage, we use a probabilistic SLR projection framework (3). Specifically, we substitute Monte Carlo samples from the PW01 joint probability distribution in SEJ2018 for the ice sheet values used in Kopp et al. (3), while keeping the remaining projections for other components of SLR. For thermal expansion and glaciers, these projections are driven by CMIP5 model projections, using an approach similar to that of AR5. For land-water storage, the projections are based on semiempirical relationships among population, dam construction, and groundwater withdrawal (3). We combine the L scenario ice sheet projections with the other components from the +2.0 °C scenario developed by Rasmussen et al. (31), and for the H scenario with those for RCP 8.5 from Kopp et al. (3).

Compared with other SLR projections for 2050 developed over the last 6 y (32), the 2050 L projections are broadly comparable (very likely range of 16–49 cm), whereas the 2050 H projections are somewhat fatter tailed, with the very likely range extending up to 61 cm (Table 2). This compares with the 20 studies compiled by Horton et al. (32), which spanned from 12 cm at the low end of fifth percentile projections to 48 cm at the high end of 95th percentile projections. There are relatively few +2 °C studies to compare with our 2100 L projections, but those that are compiled in Horton et al. (32) range from 0.2 m at the low end of fifth percentile projections to 1.1 m at the high end of the 95th percentile projections. The SEJ2018 distributions fall on the high side of this range, with a median projection of 0.7 m and a 90th percentile range of 0.4–1.3 m.

Table 2: Total global-mean sea-level rise projections. Produced by combining PW01 ice sheet projections with thermal expansion, glacier, and land water storage distributions from Kopp et al (3). Centimeters above 2000 CE 50% 17–83% 5–95% 1–99% 2050 L 30 22–40 16–49 10–61 2050 H 34 26–47 21–61 16–77 2100 L 69 49–98 36–126 21–163 2100 H 111 79–174 62–238 43–329

The 2100 H projections fall within the existing range of RCP 8.5 projections, which have extended upward in recent years, substantially beyond the AR5 range. The 2100 H median projection of 1.1 m falls midway between the AR5 projection of 0.7 m and the 1.5 m that Kopp et al. (6) projected using the Antarctic ice sheet projections of DeConto and Pollard (19), which provided an initial attempt at explicit, continental-scale physical modeling of ice shelf hydrofracturing and marine ice cliff instability. The very likely range of 0.6–2.4 m falls within the 0.4–2.4-m low-fifth percentile to high-95th percentile range in the compilation of Horton et al. (32). This comparison emphasizes the skewness of the expert distribution: although the median projection falls in the middle of recently published projections, the 95th percentile tracks the high end of published projections. Although none of these studies is entirely independent of the others, taken together, they provide strong support for recent coastal planning scenarios that anticipate SLR well above the AR5 range (33–35).

Conclusions

This study suggests that experts’ judgments of uncertainties in projections of the ice sheet contribution to SLR have grown during the last 6 y and since publication of the AR5. This is likely a consequence of a focused effort by the glaciological community to refine process understanding and improve process representation in numerical ice sheet models. It may also be related to the observational record, which indicates continued increase in mass loss from both the AIS and GrIS during this time. This negative learning (36, 37) may appear a counter intuitive conclusion, but is not an uncommon outcome: as understanding of the complexity of a problem improves, so can uncertainty bounds grow. We note that for risk management applications, consideration of the upper tail behavior of our SLR estimates is crucial for robust decision making. Limiting attention to the likely range, as was the case in the Intergovernmental Panel on Climate Change AR5, may be misleading and will likely lead to a poor evaluation of the true risks. We find it plausible that SLR could exceed 2 m by 2100 for our high-temperature scenario, roughly equivalent to business as usual. This could result in land loss of 1.79 M km2, including critical regions of food production, and displacement of up to 187 million people (38). A SLR of this magnitude would clearly have profound consequences for humanity.

Materials and Methods

Experts were convened in two separate 2-d workshops, one in Washington, DC, drawing on experts working in North America, followed by one near London, drawing on European experts. The experts were notified in advance of the objectives of the exercise and received examples of questions to be asked, along with a description of the method to be applied for analyzing their responses (SI Appendix, Note 4). To minimize misunderstandings and ambiguities and to clarify issues and aspects of the problem, group discussion of the target questions was allowed before experts individually (and privately) completed each of the three categories of questions. These comprised seed questions used for calibration of the experts, target questions for eliciting judgments on topics for which our goal was to quantify uncertainties, and a set of descriptive rationale questions, through which experts could articulate or summarize their reasoning about the target items (SI Appendix, Note 3). The period for answering questions was unlimited, but in practice was about 6–8 h overall. At the conclusion of the first day, responses were collated and preliminary probability distributions were developed from EW and from performance weights combination solutions, using the Classical Model Decision Maker approach (22). These preliminary outcomes were presented to the experts on the second day, and they were given an opportunity to discuss and, if they wished, to revise their initial judgments. Although a broad discussion revealed what motivated many of the responses and provided a basis for our interpretation here of the key contributory factors, few experts changed any of their responses after this provisional presentation.

After the elicitation, the target item uncertainty distributions were recalculated with the Classical Model to conform to the goal of achieving optimal statistical accuracy with minimal credible bounds (e.g., high informativeness). This is accomplished by forming a weighted combination of those experts for which the hypothesis that their probabilistic assessments were statistically accurate would be not rejected at the 0.01 level (denoted PW01). The threshold 0.01 was chosen to achieve robust representation of experts from both workshops while enforcing standard scientific constraints on statistical hypotheses. On this basis, the judgments of six US and two European experts were preferred, and the outcomes of pooling their judgments are shown in SI Appendix, Table S1, for each of the temperature scenarios. Instead of choosing a statistical rejection threshold based on standard hypothesis testing, the Classical Model also allows choosing an optimal threshold that maximizes the statistical accuracy and informativeness of the resulting combination. The effect of this optimization is a moderate reduction in the 90th percentile credible range relative to the PW01 combination.

The Classical Model Decision Maker combined score is an asymptotic strictly proper scoring rule if experts get zero weight when their P value drops below some threshold (22). This means that, with such a cutoff, an expert receives their maximal expected weight in the long run by, and only by, stating percentiles that reflect their true beliefs. The weight of an expert is determined by his/her statistical accuracy and informativeness. For comparison, an equally weighted combination of the eight preferred experts (denoted EW01) is formed. EW01’s credible intervals are wider than those of PW01 (SI Appendix, Note 1.1). We use PW01 here to provide robust representation from both panels, as explained here. All combinations concern the experts’ joint distributions, based on the elicited dependence information. Expert scoring is shown in SI Appendix, Table S3, where further details can be found. Rutgers, Princeton University, and Resources for the Future (RFF) considered this study to be exempt from requiring informed consent.

Supplementary Information

Determining expert quantiles

Thirteen experts participated in the expert elicitation on contribution to sea level rise from ice sheets, held at RFF, Washington DC, USA on Jan 25-26. Nine experts participated in a similar elicitation held near London, UK on Feb 20-21, 2018. The two elicitations used the same elicitation protocol. The assessments concerned Accumulation, Runoff and Discharge for GrIS, WAIS and EAIS for the time  temperature scenarios shown in Figure S1. Experts were chosen based on whether they were research active in the topic, assessed on their publications over the last ~5 years and involvement in related initiatives such as NASA SeaRise, Delta Commission (Netherlands Govt), EU Ice2Sea project, Ice Sheet MIPS etc. A working minimum group size, from previous experience, is about six experts and more than 20 provides diminishing returns in terms of the performance of the synthetic pooled expert. We also wished to obtain a balance in age, gender and specialism within the broad field of ice sheets and SLR and to avoid accessing multiple experts from the same group. In addition to the 22 that participated, nine experts were invited who could not attend (4) or did not wish to (5).

The participating experts are listed below US elicitation: Robert Bindschadler, Rob DeConto, Natalya Gomez, Ian Howat, Ian Joughin, Shawn Marshall, Sophie Nowicki, Stephen Price, Eric Rignot, Ted Scambos, Christian Schoof, Helene Seroussi, Ryan Walker EU elicitation: Gaël Durand, Johannes Fuerst, Hilmar Gudmundsson, Anders Levermann, Frank Pattyn, Catherine Ritz, Ingo Sasgen, Aimee Slangen, Bert Wouters

The assessments were combined using equal weighting and performance-based weighting. In the EU expert panel, one expert provided judgments based on a conceptual interpretation of the three processes, Accumulation, Runoff and Discharge, that differed significantly from the definitional framework outlined in the questionnaire; the expert acknowledged this to be the case upon enquiry, and their judgments were not included in subsequent processing. In the US panel one expert misinterpreted the baseline values, as a result, their uncertainty judgments contained systematic discrepancies in relation to others in the panel. Unfortunately, there was not an opportunity to re-visit and correct this expert’s evaluations in a timely manner, and so the relevant inputs were removed from the analysis reported here.

The combined assessments were convolved to obtain the overall ice sheet contribution to global sea level rise using dependence information provided by the experts

Overall Results

In this exercise experts quantified their 5th, 50th and 95th percentiles for accumulation, for discharge and for runoff for each of GrIS, WAIS and EAIS as anomalies from the 2000-2010 baseline trend (see Supplementary note 5).

They also quantified their dependence between these quantities at 2100 with 5˚ C warming with respect to pre-industrial. This same dependence structure was applied for all other scenarios. As an extension, more articulated dependence structures could be elicited for the different scenarios and applied to the present assessments. In the terminology of SEJ, a Decision Maker (DM) is a “synthetic pooled expert” that is some weighted combination of experts. Equal Weights (EW) is sometime referred to as “one person one vote”. Performance Weighting (PW) is where experts are weighted based on measures of their informative and accuracy quantified using a set of calibration questions or items (described in greater detail in SI Note 1.2).

The results with Performance Weighting (PW) are shown in Table S1 in yellow. For the final results, it was decided to use the performance weighted combination of all experts whose statistical accuracy (P- value) was greater than 0.01 (PW01). EW denotes Equal Weighted combinations.

Total ice-sheet SLR is the sum of SLR from all three ice sheets: however, this is a sum of stochastic variables. For 2300H the total mean of 287 cm is the sum of 63 cm, 113 cm and 111 cm, but the quantiles do not sum in this way. For 2300H, the total 95th percentile, 966cm, is smaller than 498 cm + 332 cm + 378 cm = 1208 cm. Adding stochastic variables requires knowledge of their joint distribution. The quantiles will add only if the variables are completely rank dependent (sometimes called co- monotonic). In this case one variable is at or above its 95th percentile if and only if the others are as well. The chance of that happening is then 5%, which means that the sum of the 95th percentiles is exceeded with probability 5%. If the variables are independent, then the chance that all three are at or above their respective 95th percentiles is 0.053 = 0.000125. In this case the 95th percentile will be much lower than the sum of the separate 95th percentiles. In fact, if the three ice sheets are independent the 95th percentile of PW01 (Figure S9b) is 823 cm. The difference 966 cm − 823 cm reflects the effect of the dependence.

The choice of a cutoff for statistical accuracy (P-value) beneath which experts are unweighted is imposed by the theory of strictly proper scoring rules (see Supplementary Information section 1.2). The scoring rule theory does not say what this cutoff should be, only that there should be some positive lower bound to the admissible statistical accuracy scores. Optimal performance weighting (PWOpt) chooses a cutoff which optimizes the scores of the resulting combination. PW01 reflects the choice to include all those experts who have acceptable statistical accuracy so as to ensure wider representation. The distributions of PW01 are somewhat wider than those of PWOpt. With the optimal cutoff of 0.399, only experts 3 and 14 are weighted. Cutoff = 0.01 forms a weighted combination of eight experts whose statistical accuracy is above 0.01; these are experts 3,6,8,9,12,14,24 and 27. EW01 forms an equal weighted combination of these same eight experts. All combinations concern the experts’ joint distributions based on the elicited dependence information.

Expert Scoring

The expert judgment methodology applied here is termed the “Classical Model” because of its analogy to classical hypothesis testing (1). The key idea is that experts are treated as statistical hypotheses. Experts were given a PowerPoint presentation to explain the basic features of the method (see SI Section 8), on which this section is based. Expert scoring is shown in Table S2. For detailed explanations please refer to (2), especially the online supplementary material (Appendix A).

An expert’s statistical accuracy is the P-value (column 2 in Table S2) at which we would falsely reject the hypothesis that an expert’s probability assessments are statistically accurate. Roughly, an expert is statistically accurate if, in a statistical sense, 5% of the realizations fall beneath his/her 5th percentile, 45% of the realizations fall between the 5th and 50th percentile, etc. High values (near 1) are good, low values (near 0) reflect low statistical accuracy. An expert’s informativeness is measured as the Shannon relative information in the expert’s distribution relative to a uniform background measure over an interval containing all experts’ percentile assessments and the realizations, variable-wise. Columns 3 and 4 give the average information scores for each expert for all variables (column 3) and all calibration variables (column 4). The number of calibration variables is shown in column 5 for each expert (in this case all experts assessed all 16 calibration variables). The product of columns 2 and 4 is the combined score for each expert. Note that statistical accuracy scores vary over seven orders of magnitude whereas information scores vary within a factor three. Therefore, by design, the ratios of the products of combined scores are dominated by the statistical accuracy. If an expert’s P-value is above a cut-off value (in this case P=0.01) then the expert is weighted with weight proportional to the combined score. Normalized weights for weighted experts are shown in column 6.

A combination of the experts’ distributions is termed a “decision maker” (DM). Column 7 gives each expert’s Shannon relative information with respect to the equal weight (EW) DM (1). These dimensionless numbers indicate the divergence among the experts themselves and are compared with perturbations caused by dropping a single expert or a single calibration variable (Supplementary Tables 3 and 4). Note that the scores in column 7 are somewhat smaller than the scores in column 3. This suggests that EW is somewhat more informative than the background measure, relative to which the experts’ informativeness is measured in column 3.

Other DMs in Table 2, besides EW, are PW01, the performance weighted combination of the eight weighted experts, and PWOpt, the performance weighted combination with the cutoff chosen to optimize the combined score of the DM. Indeed, the combined score of PWOpt (0.4914) is (only) slightly greater than that of PW01 (0.4795). As is typical in such studies, the information of EW is about half that of PWOpt. Very roughly, this translates to EW’s average 90% confidence bands being twice as large as those of PWOpt. Similarly, EW’s statistical accuracy (P-value) is inferior to that of PWOpt. This is an “in-sample” comparison since DM’s are compared on the same set over which PWOpt is optimized. For “out-of-sample” comparisons see below.

Six of the 13 US experts had a statistical accuracy score above 0.01. This is a high number for SEJ studies, especially considering the fact that 16 calibration variables were used, constituting a more powerful statistical test than the traditional number of ten calibration items. Two of the eight EU experts had a statistical accuracy score above 0.01, which is in line with most SEJ studies. There is very little difference between the scores of PWOpt and PW01, though there are modest differences in SLR predictions (see Table S1). A scoring system is asymptotically strictly proper if and only if an expert obtains his/her highest expected score in the long run by, and only by, stating percentiles corresponding to his/her true beliefs. The combined score is an asymptotic strictly proper scoring rule if experts get zero weight when their P-value drops below some threshold (1). If (s)he tries to game the system to maximize his/her expected weight, (s)he will eventually figure out that (s)he must say exactly what (s)he thinks. Honesty is the only optimal strategy. The theory does not say what the cut-off value should be, so that is often chosen by optimization.

In the Classical Model, the optimization works as follows: starting with a cutoff beneath the lowest P- value includes all experts with weight proportional to their combined scores. The combined score of the resulting DM is stored. Taking the expert with the lowest P-value, we next exclude that expert, normalize the remaining combined scores, compute the resulting DM, apply this DM to the calibration variables and store the resulting DM’s combined score. Then we remove the next lowest P-value expert and repeat. With N expert P-values this results in N-1 different DM’s. We choose the DM whose combined score is the highest. In this case, setting the cut-off at 0.399 and retaining experts 3 and 14 produced the highest scoring DM. With this scoring system it is impossible that a weighted expert has a lower P-value than an unweighted expert, even though doing so might produce a higher DM score. This system can thus be regarded as optimal weighting under a strictly proper scoring rule constraint. The theory was developed in the 1980s and is detailed in (1) and (2).

The Classical Model has been applied in hundreds of expert panels and has been validated both in- and out-of-sample (2-5). In the absence of observations of the variables of interest, out-of-sample validation comes down to cross-validation whereby the calibration variables are repeatedly separated into subsets of training- and test variables. The PW model is initialized on the training variables and scored on the test variables. The superiority of PW over EW in terms of statistical accuracy and informativeness has been demonstrated using this approach.

Robustness on Experts

Robustness on experts examines the effect on the PW01 “decision maker” (i.e. the synthetic pooled expert) of losing individual experts. Experts are removed one at a time and PW01 is recomputed. Table S3 shows the resulting information and P-values of the “perturbed” PW01. The rightmost column of Table S4 shows the divergence among the experts themselves. Comparison with the rightmost column of table S3 shows that the scoring results are very robust against loss of a single expert.

A more complete sense of robustness would examine the effect of the method of recruitment of experts and of the elicitation team. Before the Classical Model was adopted for the European uncertainty analysis of accident consequence codes for nuclear power plants (6), the authorities in Brussels required that parallel elicitations be carried out using the same elicitation protocol, but with different elicitation teams independently recruiting different experts. The findings in this case indicate a strong convergence of elicitation results from the two groups (6). Such an approach is generally far beyond the budgets of most applications. However, the results here (Table S1) show good general agreement on SLR between the US and European panels who were elicited separately. A different type of robustness is gleaned from the 14 year running expert judgment assessments of risks from the Montserrat volcano (7). Those assessments concerned a consistent elicitation method, applied to the same variables under changing conditions, with some exchanging of participating experts over elicitations. The approach showed good consistency of performance for volcanic hazard assessment purposes, over more than seventy repeat elicitations.

Robustness on Items

Seed variables are removed one at a time and PW01 is recomputed. These scores are extremely robust against loss of a seed variable. Comparing the rightmost columns of Supplementary Tables 3 and 5 shows that the perturbation caused by loss of a single calibration variable is very small relative to the divergence among the experts themselves.

Dependence Elicitation

Dependence and especially tail dependence are unfamiliar concepts for many scientists. A PowerPoint presentation was given to the experts, before the elicitation, to introduce these notions, where the reader can find precise definitions (see Supplementary Section 8 for links). Figure S6 from the presentation shows how aggregation affects uncertainty. 3 sigma or one in 1000 upper tail events are depicted for the sum of 10 zero mean normal variables. If the variables have a pairwise correlation of 0.5, the distribution dilates such that the 3 sigma event coincides with the sigma event of the sum of independent Normals. If this pairwise correlation is realized with an upper tail dependent copula, the 3 sigma event coincides with the 7 sigma event for independent Normals. Thus an event whose probability is 1/1000 (3 sigma will appear to be an event with probability 1.28*10^{-12} (7 sigma)) when tail dependence is present but ignored.

The dependence elicitation for pairs of variables was accomplished by eliciting conditional exceedance probabilities: for central correlation, experts answered: “what is the probability that variable X exceeds its median given that variable Y has exceeded its median?”. Numerical and verbal answers were accepted (Table S8). For upper tail dependence, “median” was replaced by “95th percentile” in the above question and verbal responses were elicited as indicated in Table S8.

Three random variables (Runoff, Discharge and Accumulation) for each of the three ice sheets yield 36 pairs of variables. Potential dependences between ice sheets were also identified. Based on judgments of size and relevance, the analysis team pared this down to 10 pairs corresponding to the colored nodes in Figure S8, in addition to 3 inter-ice sheet relations. This structure is a “dependence vine” for determining a high dimensional joint distribution based on bivariate and conditional bivariate distributions. Unspecified (conditional) bivariate distributions are conditionally independent, making it easy to extend a partially specified structure to the minimally informative realization of the specified structure.

The basic “dependence vine” for expert 14, as an example, is shown in Figure S8. The ellipses represent the variables (GA = Greenland Accumulation; WD = West Antarctica Discharge; etc). The dependences, represented by arcs, are quantified by assessing exceedance probabilities. The colored nodes are those between which dependence is assessed. Conditional independence is assumed elsewhere. Calculations and sampling were performed with the freeware UNINET. This exposition of vine theory is necessarily incomplete; a Wiki page provides more background and references. A full exposition is in (8, 9).

For each of the eight experts with P-value > 0.01, a comparable regular vine was constructed using the dependence information elicited from each individual expert. These eight joint distributions were combined with the various weighting schemes shown in Table S2.

From expert quantiles to SLR

The procedure of going from expert quantiles to distributions for SLR is as follows (for detailed information see (2), especially the online supplementary material Appendix A):

For each variable we determine an “intrinsic range” (IR), the smallest interval that contains all expert assessments plus the realization (in case of seed variables) + a 10% overshoot below and above (10% is a parameter that can be adjusted in the code)
We put a background measure on each IR. In the code the user can choose between the uniform and log-uniform background measure. Log-uniform is indicated when experts reason in orders of magnitude. In this case all background measures are uniform. Other choices could be made but would require re-coding.
For each expert and each variable, we fit a density that is minimally informative with respect to the background measure and complies with the expert’s quantile assessments. For the uniform background, this is a piecewise uniform density. This density “adds as little as possible” to the expert’s assessment. Note that fitting a two-parameter family such as the Gaussian distribution will often be unable to match 3 quantiles.

4) ASSUMING INDEPENDENCE a. With N experts we form the EW combination by simple averaging of the experts’ densities. DO NOT average the quantiles; that can give a very overconfident result. b. With PW, we take a weighted average of densities. c. Simple Monte Carlo sampling is used to build a distribution for SLR. For each ice sheet we sample D, R and A and store D+R-A. d. Monte Carlo sampling is used to build a distribution of total SLR as SLRGr + SLREAIS + SLRWAIS. Again, do not sum the quantiles.

WITH DEPENDENCE, we build a joint density for each expert based on the elicited exceedance probabilities. This cannot be done with generic software. XL add-ons like @Risk and Crystal Ball impose the assumption of the Gaussian copula. Based on a pilot elicitation with the 2012 experts, we anticipated that tail dependence could be significant, rendering the Gaussian copula inappropriate. For each expert we obtain a distribution for total SLR, and we take a weighted average of these densities to find the combined distribution for SLR. Each expert’s total SLR distributions incorporates his/her dependence.

Steps (1) – (3) can be done with freeware EXCALIBUR (EXpert CALIBRation). Step (4) can be done with Freeware UNICORN (UNCertainty analysis wIth CORelatioNs)– which has limited dependence modeling capability). Step (5) uses freeware UNINET which is much more powerful. All these programs can be downloaded from http://www.lighttwist.net/wp/.

Experts’ rationales

This section summarizes and collates the expert responses to the rationale questionnaire that is reproduced in supplementary note 6. For description of the process considered see note 6.

It is apparent that SEJ2018 value spreads for Antarctic Ice Sheet contribution to sea level rise in 2100CE lie above Edwards et al (2019)(10) no-MICI but are substantially lower than the 50th and 95th percentile MICI values obtained using the emulator in (10).

Briefing note sent to experts prior to elicitation

The following briefing note was sent to all experts prior to the elicitation:

Eliciting Ice Sheets’ Contribution to Sea Level Rise Sept 28, 2017

Introduction: Probabilistic prediction for Ice Sheet contributions to Sea Level Rise

With global warming, ice sheets in Greenland and Antarctica are likely to become the primary agents of Sea Level Rise (SLR) in the coming decades and centuries. In their normally slow, march to the sea, glaciers draining the ice sheets exhibit dynamics which are highly variable from place to place, with neighboring glaciers or ice streams responding in markedly different ways to the same external forcing. Dynamic models must account for things like bedrock properties (including slipperiness and topography), ice shelf buttressing, precipitation, melt water effects on ice stiffness, grounding line migration, ocean currents, and ice cliff instability. Some of these features are directly observable, many are not.

Glaciologists focusing on individual glaciers must contend with many uncertainties when predicting future ice mechanics and dynamics out to, say 2100CE, or even 2300CE. Point predictions, whatever their pedigree, are of limited value when the uncertainties are very large. Scientists must therefore make probabilistic predictions; they must say, in effect “My best estimate is a +0.2mm contribution to SLR by 2100CE from this particular glacier, and I am 90% confident the contribution will be between - 3mm and +6mm”. A narrative might explain, say, “the contribution could actually be negative (the ice sheet would actually grow) if warming and changing atmospheric and ocean circulation increased winter precipitation inland while leaving the buttressing ice shelves largely intact; a very high contribution might result if increased storminess and shifting ocean currents break up ice shelves or summer coastal precipitation causes increased calving and instability”. Capturing the narrative behind the uncertainty assessments is essential for understanding and communicating our current state of knowledge.

That is the easy part. Judging the cumulative future effects of the main ice sheets on sea level rise raises a host of new questions and methodological challenges, lying further outside most physical scientists’ comfort zone. What might be the joint impacts of ice sheet responses on SLR if extreme conditions were encountered under global climate change?

A proof of concept

We describe a proof-of-concept demonstration for using expert judgments to constrain quantitative estimates of dependences in potentially correlated processes that affect the ice sheet (2), and indicate some preliminary trial results. We also explore the influence on these results of different ways of combining expert judgments (3).

A talk on this subject was given at the Banff research center in 2013 by Roger Cooke and can be streamed from http://www.birs.ca/events/2013/5-day-workshops/13w5146/videos/watch/201305221037-Cooke.html
A talk on performance weighting was given at the Centers for Disease Control and Prevention in Atlanta GA on May 23, 2017 by Willy Aspinall, and may be streamed at https://www.youtube.com/watch?v=FPC-h-br8i8&feature=youtu.be

A recent study extending (Bamber and Aspinall, 2013, henceforth B&A) made a first pass at assessing dependences between macro process variables relating to the Greenland, West Antarctica and East Antarctica ice sheets. Estimates of contributions to SLR were based on the B&A protocol. A typical question was

In the case of Greenland, for a global mean annual Surface Average Temperature rise of 3°C by 2100 with respect to pre-industrial, what will be the integrated contribution, in mm to SLR relative to 2000- 2010 of the following:

i) accumulation 5% value: ___________ 95% value: ___________ 50%value: ____________

ii) runoff 5% value: ___________ 95% value: ___________ 50%value: ____________

iii) discharge 5% value: ___________ 95% value: ___________ 50%value:: ____________

Similar questions were directed to West and East Antarctica, and to different temperatures, out to 2200.

The dependence elicitation was based on exceedance probabilities, as pioneered in the 1990’s by uncertainty analyses for nuclear power plants in the US and in Europe. Whereas the earlier nuclear work used only the 50% exceedance probabilities, our ice sheet follow-on study asked also for 95% exceedance probabilities.

Within ice sheet process dependencies to 2100CE

Greenland Ice Sheet, 2100 3°C warming

Given discharge >= your 50% value, what is probability that runoff also >= your 50% =____ Given discharge >= your 95% value, what is probability that runoff also >= your 95% =____ Given accumulation >= your 50% value, what is probability that discharge also >= your 50% =____ Given accumulation >= your 50% value, what is probability that runoff also >= your 50% =____

In answering questions concerning the 95% exceedances, the experts had to consider whether factors likely to produce extreme values in one variable would also produce extreme values in the other.

An extensive procedures guide for structured expert judgment emerging from these nuclear studies has informed many subsequent applications, including B&A. Following the Classical Model for structured expert judgment (Cooke 1991; Cooke and Goossens 2008), calibration variables from the experts’ field were used by B&A to score the experts’ statistical accuracy and informativeness. True values of calibration variables are known post hoc, they preferably concern near term future measurements, but can also involve unfamiliar intersections of past data or literature. An illustrative calibration variable from B&A was

There are nine main glacier/ice caps on Iceland. What was their 2009/2010 average climatic balance Bclim, in Kg/m2? (please indicate gain by +value, loss by -value) 5% value: ___________ 95% value: ___________ 50%value: ____________

Based on extensive experience with the Classical Model, an equally weighted combination of experts tends to give statistically accurate assessments exhibiting wide confidence bounds (low information). The goal of the Classical Model is to demonstrate high statistical accuracy with narrow confidence bounds. This is accomplished by differentially weighting the experts so as to favor those with high statistical accuracy and high information. Recent background on the Classical Model for climate uncertainty quantification may be found here. Other recent applications are summarized here, a Wikipedia page gives some background, and an extensive study of out-of-sample validation with complete mathematical exposition in supplementary material is here.

Dependence and aggregation

The SLR contribution of, say, the Jakobshaven glacier in western Greenland up to 2100CE is a random variable; it can be described mathematically by giving a range of possible values and a probability that each value would be realized. Quantifying the uncertainty in the contribution to SLR from the Greenland and Antarctic ice sheets involves adding together hundreds of random variables. Adding random variables is not like adding ordinary numbers. In adding two random variables, say the Jakobshaven and the Petermann glaciers’ contributions to SLR by 2100CE, we must consider all possible combinations of values for Jakobshaven and for Petermann, and consider the probability that these values arise together. Suppose the contribution from Jakobshaven were very large. According to the above narrative, that suggests certain influencing factors are in play; how would these influences affect the Petermann glacier, 1274 km to the north? If they would also tend to induce high contribution values for the Petermann, then this could indicate a positive dependence between the SLR contributions of the two glaciers. If, on the contrary, the drivers of elevated ice mass loss in the west of Greenland were conducive to more stable conditions in the north, then the interglacier dependence might be negative.

The more random variables we aggregate, the more important the effects of long range, global correlations can become, a feature which our intuitions easily under-appreciate. A neglected weak global correlation of  = 0.2 when summing 500 normal variables underestimates the confidence interval of the sum by an order of magnitude. Global correlations also amplify the correlation of aggregations. In the above example, the correlation between the sum of the first 250 variables and the sum of the second 250 variables is 0.992. In contemplating the uncertainty in the effects of hundreds of glaciers, we must consider the overall effects of these dependencies.

Tail Dependence

The correlation coefficient represents a sort of average association between two random variables. This often yields an adequate measure of their association, but not always. The linkage between two variables may be primarily due to factors driving the extreme values, not the more mundane, central values.

For example, under normal conditions it may be that the mass loss at Jakobshaven and at Petermann vary according to local weather conditions, which are largely uncorrelated. However, very large mass loss at Jakobshaven would implicate large scale warming factors, which in turn could imply large mass loss at Petermann as well. In such cases one speaks of positive tail dependence between the two variables, here glacier processes. Tail dependence can be positive or negative, can affect the upper or lower tails of distributions, or both, and bears no direct relation to the ordinary correlation. Thus, two Gaussian variables are always tail independent. Given that one of them exceeds its rth percentile, the probability that the second also exceeds its own rth percentile tends toward the independent value of (100-r)% as r tends to 100% regardless of the correlation, provided it is strictly between 1 and −1.

In other words, a very high value of one variable tends not to entrain a high value of the other with two Gaussian variables, but this will not be true for variables characterized by other distributions.

Results

The calculations were performed by Aspinall and Cooke defining a regular vine, using the experts’ responses. Regular vines capture dependence in terms of nested bivariate and conditional bivariate distributions on the ranks of random variables, called copulae. The copulae are chosen to mimic the elicited exceedance probabilities. Our highest weighted experts evinced tail dependence between Greenland Discharge and Greenland Runoff, between West Antarctica Discharge and West Antarctica Runoff, and between West Antarctica Discharge and East Antarctica Discharge. Although the actual values of tail dependence varied between experts, they were comparable in magnitude. Other variables exhibited median dependence without tail dependence.

Table 1 presents the overall results for the case of +3˚C global warming by 2100CE, and enables us to gauge the effects of dependence and of performance weighting. “EW” denotes the combination based on equal weighting of all nine experts, “PW” denotes the optimal performance weighting in which two experts were weighted, based on statistical accuracy and informativeness4. “Indep” signifies that experts’ dependence information was not used. The contribution to SLR was computed, per ice sheet, as Runoff + Discharge − Accumulation as if these were independent random variables. “tail indep” signifies that tail dependence was ignored and dependence was based only on the 50% exceedance probabilities. “tail dep” includes the information on tail dependence.

Table 1: “EW” denotes the combination based on equal weighting of all experts, “PW” denotes the optimal performance weighting in which the experts were weighted. “Indep” signifies that no dependence information was used. “tail indep” signifies that dependence was based only on the 50% exceedance probabilities. “tail dep” includes the information on tail dependence.

Ice sheet contribution to SLR by 2100CE with +3˚C warming [mm] mean stdev 5% 50% 95% Expert combination method EW indep 615 270 238 581 1120 PW Indep 335 200 71 307 719 PW tail indep 337 216 64 305 749 PW tail dep 338 229 71 292 785 The largest effect is wrought by using performance-based weighting instead of expert equal weighting. The mean SLR of “EW indep” is nearly twice that of the PW combinations, and the 5- 50- and95- percentiles are substantially shifted upwards, relative to any of the alternative combinations. Focusing on the PW combinations, the effect of including dependence information is most visible in the 95th percentiles; the corresponding means are notably consistent. Including ice sheet inter-dependence, without tail dependence, raises the 95th percentile by +37 mm relative to the independent case; including tail dependence raises this percentile by +66mm relative to the independent case.

These are “linear pooling methods”; other methods have also been proposed for ice sheet uncertainty quantification, for a discussion see Bamber et al (2016).

Conclusions

Predicting the cumulative effect of ice sheets on Sea Level Rise by 2100CE involves large uncertainties. Developing science-based quantifications of these uncertainties obliges scientists to venture outside their comfort zone of deterministic model-based predictions and deal with expert subjective uncertainty assessments. Adding information on dependence and tail dependence increased the values of the upper tail 95th percentiles of the performance weight combination. However, that increase effect was dominated by the reduction in SLR predictions produced by restricting the elicitation solution to our statistically accurate experts.

Reference values for ice-sheet processes

In the elicitation workshops, there was extensive discussion of how to define ice sheet contributions to sea level over future periods of time in relation to the temperature rise trajectories shown in Fig. S1. It was agreed that the ice sheet contributions would be expressed as anomalies from the 2000-2010 mean mass change states, as pre-defined in Table S7. On this basis, the net baseline sea-level contributions for this period were prescribed as 0.76 mm yr-1 for overall SLR, and 0.56, 0.20, and 0.00 mm yr-1 for GrIS, WAIS, and EAIS, respectively. (The resulting joint contribution of the three ice sheets is close to an observationally-derived value of 0.79 mm yr-1 for the same period, which was published subsequently to the SEJ workshops (4)).

For the SLR results presented in the main text, baseline contributions – integrated over the relevant time periods (i.e. from 2000CE to 2050CE; 2100CE; 2200CE and 2300CE) – have been added to the elicited SLR values reported in Supplementary note 1.

Table S8: Reference probabilities assumed for estimating central and tail dependencies.

Prompts for discussion of rationales

Some of the questions below give you options for answering that are not independent (e.g., on the second question, buttressing is not independent of hydrofracturing). In such cases, indicate the option that best captures your overall judgment. In cases where you feel more than one answer is absolutely necessary to best characterize your judgment, feel free to fit in more than one response. Where changes are referred to and a future period is specified, these are for the difference between the future period and the base period, 2000-2010.

Mass change observations and assumptions

Are the recent ~decadal trends in mass balance largely due to internal variability of the atmosphere/ice/ocean/climate system or anthropogenic forcing, for each ice sheet overall?

Sheet: GrIS WAIS EAIS IV EF IV EF (no trend) (no trend)

Dynamical processes

How will changes in near-field gravitational and vertical land motion due to past and future ice sheet unloading affect marine ice sheet instability: Decrease instability (D), Increase instability (I), or No significant change (N)? Sheet GrIS WAIS EAIS D I N

Among buttressing by ice shelves (B), basal traction (BT), transverse stresses (TS), hydrofracturing (HF), ice cliff instability (IC), and dissipation after iceberg formation at exit gates (DI), which one will be the most important for controlling the overall 21st, 22nd, and 23rd century discharge rate and grounding line migration for key ice streams and outlet glaciers (recognizing time variations in the role of each)? Key ice streams are those that you expect to control overall discharge for that ice sheet.

2°C scenario: Sheet GrIS WAIS EAIS 21st 22nd 23rd 21st 22nd 23rd 21st 22nd 23rd B BT TS HF IC DI 5°C scenario: Sheet GrIS WAIS EAIS 21st 22nd 23rd 21st 22nd 23rd 21st 22nd 23rd B BT TS HF IC DI

Surface mass balance

Between atmospheric circulation/moisture transport changes (AM) and albedo changes (AC), which do you consider more important for determining surface mass balance of grounded ice during the 21st, 22nd, and 23rd century. 2°C scenario Sheet GrIS WAIS EAIS 21st 22nd 23rd 21st 22nd 23rd 21st 22nd 23rd AM AC 5°C scenario Sheet GrIS WAIS EAIS 21st 22nd 23rd 21st 22nd 23rd 21st 22nd 23rd AM AC
Among changes in summer sea ice extent (SI), atmospheric circulation/moisture transport changes (AM), and albedo changes (AC), which do you consider most important for determining surface mass balance and rate of thinning of ice shelves in 21st, 22nd, and 23rd century? 2°C scenario Sheet GrIS WAIS EAIS 21st 22nd 23rd 21st 22nd 23rd 21st 22nd 23rd SI AM AC 5°C scenario Sheet GrIS WAIS EAIS 21st 22nd 23rd 21st 22nd 23rd 21st 22nd 23rd SI AM AC

Ocean processes

Among Antarctic circumpolar current changes (ACC), changes in intrusion of circumpolar deep water onto continental shelf (CDW), and changes in AMOC (MOC), which do you consider will have the largest effect on sub-shelf basal melt rates during the 21st, 22nd, and 23rd century? 2°C scenario Sheet GrIS WAIS EAIS 21st 22nd 23rd 21st 22nd 23rd 21st 22nd 23rd ACC CDW MOC 5°C scenario Sheet GrIS WAIS EAIS 21st 22nd 23rd 21st 22nd 23rd 21st 22nd 23rd ACC CDW MOC

Polar Amplification

Please provide the polar amplification factor (e.g., 1.5x, 2x) or range of factors that you used in your estimates. 2050 2100 2200 2300 1.5°C 2.0°C 2°C 5°C 2°C 5°C 2°C 5°C North South

Low-probability, high-consequence scenarios

Are there high-outcome scenarios above the 95% values you provided that deserve attention? If so, what are they?

Ambiguity relating to discharge versus sea level contribution

The questionnaire provided to experts asked for their estimate of changes in discharge (defined as the ice flux across the grounding line) that would contribute to SLR. For ice grounded below sea level, such as in large sectors of the WAIS and parts of the EAIS, the change in volume of discharge and the sea level contribution are not the same quantity. This is because it is only the volume above flotation (VAF) that contributes to SLR, while the change in discharge includes ice below flotation that will be displaced by sea water.

This issue was identified during the second SEJ workshop held in Europe and to address it we asked all experts what value they were using for discharge: total discharge or VAF (the same as sea level equivalent). Of the 22 experts, four stated they had calculated total discharge and the rest VAF. Of these four, one had a high calibration score and a strong weighting in the PW01 solutions and a correction to these discharge values for the WAIS and EAIS were considered necessary. To do this, we utilized the output of a thermomechanical ice sheet model coupled to a solid earth deformation model in a climate forced deglaciation experiment (11, 12) and calculated the ratio of total discharge to VAF. This is shown in Figure S12 alongside the gradients for the ratio for the WAIS, EAIS and AIS. For the WAIS, the ratio changes after a volume loss of about 1 m sea level equivalent (SLE), while for the EAIS it is relatively constant. For the EAIS discharge we used a constant ratio, while for the WAIS it varied as a function of the discharge anomaly. The change in gradient is only significant for the 2300 L and H and 2200 H scenarios, where WAIS discharge anomaly exceeds 1 m for this expert.

Figure S12: The ratio of volume above flotation to total ice discharge for a present-day deglaciation experiment for the Antarctic ice sheet. Fig a) AIS, b) WAIS, c) EAIS. Blue dots represent the first 400 years, red dots are for the remaining 20 Kyr.

Elicitation questions

The Elicitation questions are available as a pdf and Excel file at https://doi.org/10.5523/bris.23k1jbtan6sjv2huakf63cqgav