Constraining the models' response of tropical clouds to SST forcings using CALIPSO observations

Here we present preliminary results from the analysis of the low cloud cover (LCC) and cloud radiative effect (CRE) interannual changes in response to sea surface temperature (SST) forcings in two GISS climate models and 12 other climate models. We further classify the models according to their ability to reproduce the vertical structure of the cloud response to SST change against 10 years of CALIPSO observations: the "constrained" models, which match the observational constraint, and the "unconstrained" models, which do not. The constrained models replicate the observed interannual LCC change particularly well (ΔLCCcon=-3.49 ±1.01 %/K vs. ΔLCCobs=-3.59 ±0.28 %/K) as opposed to the unconstrained models, which largely underestimate it (ΔLCCunc = -1.32 ± 1.28 %/K). As a result, the amount of short-wave warming simulated by the constrained models (ΔCREcon=2.60 ±1.13 W/m2 /K) is in better agreement with the observations (ΔCREobs=3.05 ± 0.28 W/m2 /K) than that of the unconstrained models (ΔCREunc=0.87 ±2.63 W/m2 /K). The observed relationship between cloud/radiation and surface temperature varies with the type of low cloud. Over the stratocumulus regions, increasing SSTs generate a higher cloud top height along with a large decrease of the cloud fraction below, as opposed to a slight decrease of the cloud fraction at each level over the trade cumulus regions. Our results suggest that the models must generate sustainable stratocumulus decks and moist processes in the planetary boundary layer to reproduce these observed features. Future work will focus on defining a method to objectively discriminate these cloud types that can be applied consistently in both the observations and the models.


INTRODUCTION
Low-level clouds are ubiquitous in the tropics. Their presence is tied to the sea surface temperature (SST), which affects temperature and moisture differences between the surface and the free troposphere 1,2 . While the underlying processes are not fully understood, recent observationally-based studies confirm that low-cloud cover (LCC) and SST are negatively correlated 3,4,5 .
Therefore, in a warming world, marine boundary layer clouds are expected to dissipate, which will result in more incoming solar radiation, reinforcing the surface warming through a positive feedback. However, there is no consensus in global climate models (GCMs) on whether the low-level cloud amount will increase or decrease in future climate projections 6 . Moreover, not all models are able to reproduce the observed loss of low-level cloud in response to increased surface temperatures in present day climate and the majority continue to underestimate the low-level cloud amount 7,8 . Added together, these problems limit our confidence in future climate projections.
As a result, recent efforts have been devoted to evaluating climate models against these observations 3,5,6,9 . This is based on the assumption that models must reproduce the LCC-SST relationship in the current climate as a necessary but not sufficient condition to have confidence in their ability to simulate a more realistic future climate change in regions dominated by low clouds, although there is no guarantee that current climate variability itself is indicative of longer term climate changes 10 . Their results suggest that models that are in better agreement with observations in this way are those with a higher climate sensitivity-i.e., warmer surface temperature change in the future.
All these studies used passive sensor measurements to study this relationship and evaluate the models, because they provide good spatial and temporal coverage along with a long record, which reduces uncertainty in the LCC-SST relationship. However, space-borne passive instruments typically cannot resolve the vertical extent of clouds and miss some clouds that are shielded by higher clouds. In comparison, the vertical structure of cloud changes in response to surface temperature variations has received far less attention in climate models 4 . Yet, the 2-dimensional cloud amount as seen from space (i.e., LCC) may hide compensating errors in cloud amount at different levels and does not document the thickness of the cloud. Recent literature has shown the importance of knowing the vertical structure of low clouds to better understand how clouds may respond to climate change 11,12 . Moreover, in addition to other information (e.g., horizontal extent), the vertical structure of low clouds could also help discriminate cumulus clouds from stratocumulus clouds, the former typically having a higher cloud top 13 . This emphasizes the need for further evaluation of the vertical structure of clouds in the present day and how it will evolve in a warmer climate. Thus, active remote sensing instruments can potentially provide important information about the dominant low cloud regimes and their responses to perturbations. In addition to providing detailed information on the vertical structure of clouds, the horizontal resolution of the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO 14 ) satellite's lidar is typically finer than that of space-borne passive instruments (90 m footprint vs. a few hundred meters to kilometers), allowing a better detection of the fractional cover of cumulus clouds, which are radiatively dominant in many of the subsiding regions of the tropics.
On the other hand, CALIPSO is limited to a 2-dimensional swath and thus produces a much smaller sample of clouds than passive instruments. Thus, active and passive techniques are complementary.
Here we propose to characterize and evaluate the response of tropical low clouds to SST forcings in two generations of the GISS ModelE general circulation model (GCM), with a focus on its vertical structure, using 10 years of CALIPSO satellite measurements. To put this into a larger context, we also assess this relationship for a large sample of other climate models. Finally, we identify the best models, based on how well they replicate the observed vertical structure relationship between tropical low cloud and SST, and compare the cloud cover response to SSTs of these models against the others.

Observations
We use the GCM-oriented CALIPSO Cloud Product (CALIPSO-GOCCP) version 2.9 15 for the LCC and the cloud fraction from 2007 to 2016 over a 2.5˚x2.5˚ grid and for 40 levels with 480 m spacing from 0 to 19.2 km. CALIPSO-GOCCP was developed to facilitate the evaluation of cloud properties in GCMs when combined with a lidar simulator 16 that uses the same cloud definitions and ensures a consistent comparison between observations and simulations. The caveats of this dataset are discussed in Cesana et al. 15 and Cesana and Waliser 7 ; e.g., strong attenuation by liquid-topped low clouds may lead to an underestimation of the cloud fraction underneath, close to the surface (0 to 960 m), although it does not affect the cloud cover. To avoid daytime noise contamination of the lidar signal, we only use nighttime data; however, the results obtained using both nighttime and daytime data are similar, with a slightly larger amplitude (10% to 15% larger).
To derive an uncertainty estimate of the relationship between interannual cloud amount change and SST anomalies, we use four different datasets for the SST: ERAI, Extended Reconstructed SST version 5 (ERSSTv5 17 ), NOAA Optimum Interpolation (OI) SST version 2 (NOAA-OI SSTv2 18 ) and Centennial in situ Observation-Based Estimates -SST version 2 (COBE-SST2 19 ). The uncertainty related to clouds is due to the cloud threshold and/or the attenuation of the lidar beam. However, this is reproduced in the model via the use of the lidar simulator and therefore does not necessitate further investigation here. For radiative fluxes, we use the Clouds and the Earth's Radiant Energy System (CERES) Energy Balanced and Filled (EBAF) edition 4 dataset (CERES-EBAF 4.0 20 ) over the same time period as CALIPSO-GOCCP. The large-scale circulation comes from the ERA-interim reanalysis 21 .
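As an illustration of the uncertainty estimate described above, the following minimal sketch computes the mean and standard deviation of a regression slope obtained with four SST datasets. The slope values are hypothetical placeholders, not results from this study, and the use of the sample (n-1) standard deviation is an assumption, since the text does not specify the estimator.

```python
import statistics

# Hypothetical dLCC/dSST slopes (%/K), one per SST dataset.
# These numbers are placeholders for illustration only.
slopes_by_sst_dataset = {
    "ERAI": -3.3,
    "ERSSTv5": -3.5,
    "NOAA-OI SSTv2": -3.7,
    "COBE-SST2": -3.9,
}

values = list(slopes_by_sst_dataset.values())

# Central estimate: mean over the four SST datasets.
slope_mean = statistics.mean(values)

# Spread: sample standard deviation (assumed estimator).
slope_std = statistics.stdev(values)

print(f"dLCC/dSST = {slope_mean:.2f} +/- {slope_std:.2f} %/K")
```

With only four datasets the spread is a rough indicator of the SST-related uncertainty rather than a formal confidence interval.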

Simulations
In this study, we analyze prescribed-SST (Atmospheric Model Intercomparison Project, AMIP) monthly outputs from two generations of the GISS ModelE GCM: the GISS-E2 22 model that was used for the 5th Coupled Model Intercomparison Project (CMIP5), and a developmental version of the GISS-E3 model that will be submitted to CMIP6. E3 and E2 differ in many ways that can potentially affect low clouds; these differences will be described in detail in a future paper. To summarize, the dry turbulence scheme has been replaced by the moist scheme of Bretherton and Park 23 . The Sundqvist-type diagnostic cloud fraction scheme of E2 24 is replaced for E3 by a diagnostic scheme that uses a triangular probability density function (PDF) to compute cloud fraction and cloud liquid water mixing ratio 25 for water clouds, and for ice clouds the inversion of that PDF scheme 26 to compute cloud fraction. The Sundqvist-type prognostic cloud water parameterization used in E2 24 is replaced in E3 by a two-moment microphysics scheme with prognostic precipitation modified from Gettelman and Morrison 27 . The convective cloud microphysics has been updated with field experiment-based observations 28 . The double plume parameterization of E2 is still used in E3, but with increased entrainment and convective rain evaporation and a new parameterization of downdraft cold pools 29 .
To ensure a fair evaluation, we compare modeled and observed cloud fields through the use of the lidar simulator 30 , although the relationships found in this study are very similar (in terms of sign and shape) when the original cloud fractions are used in GISS-E3. The model outputs are monthly means of the CALIPSO low-level cloud fraction and CALIPSO cloud fraction, so-called cllcalipso and clcalipso, respectively.

Definition of low cloud regions.
In this work, we focus on the low-level clouds that form over the tropical oceans (between 35˚S and 35˚N) in subsidence regimes defined as having a large-scale pressure vertical velocity at 500 hPa (⍵ 500 ) greater than 10 hPa/day. This captures most of the stratocumulus and stratocumulus-to-shallow-cumulus transition regions, which are located climatologically within the magenta contour in Fig. 2a-b-d-f. In the literature, some studies use a 0 hPa/day ⍵ 500 threshold 4,9 . Here we chose a more conservative ⍵ 500 threshold to minimize areas where high cirrus clouds are ubiquitous and may mask the detection of underlying low-clouds in the observations.

Cloud-SST relationship and observational constraint
One goal of our study is to investigate the interannual variation of the vertical cloud fraction (CF) and LCC in response to a change in SST in both the observations and the models; and to use the observed relationship to evaluate the models. Capturing the mechanisms that govern the change of clouds in response to a surface warming is an essential condition -although not the only one -to predict the future climate. Thus, we select the GCMs that produce the most realistic change in cloud profile per K of SST warming. We refer to these as "constrained" models, in the sense that they are separated from other models in our analysis using an observational constraint; we emphasize though that the models have not been changed in response to the observations. We compare the cloud fraction and short-wave (SW), long-wave (LW) and net cloud radiation effect (CRE) changes of these models to the others, which we refer to as "unconstrained" models.
To calculate the relationship between SST and cloud amount, we compute the monthly mean of CF and LCC and monthly anomalies of SST after having filtered out all grid boxes wherein ⍵ 500 is lower than 10 hPa/day, referred to as CF sub , LCC sub and SST sub,anom . These can be seen as dynamically-based means and anomalies, as opposed to spatially-based anomaly/mean studies that focus on particular regions 3,5 . Hence, the cloud response is dominated by the local component rather than the large-scale component (dynamics). It is therefore complementary to uniform +4 K SST and abrupt 4×CO2 experiments, which are also significantly affected by dynamical changes. We then linearly regress CF sub and LCC sub against SST sub,anom to obtain the change (∆) in cloud fraction and low cloud cover per K of SST warming, ∆C = dC sub /dSST sub,anom , where C is either the CF or LCC. Using a centered finite-differencing scheme as in Myers and Norris 4 instead of a linear regression does not impact the results (not shown).
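The regression procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in this study: the function names are hypothetical, the spatial mean is unweighted (a real analysis would weight grid boxes by area), anomalies are taken relative to the mean annual cycle, and the input series is synthetic.

```python
def subsidence_mean(field, omega500, threshold=10.0):
    """Mean of `field` over grid boxes where omega500 > threshold (hPa/day).

    Unweighted for brevity; area weighting is omitted in this sketch.
    """
    kept = [f for f, w in zip(field, omega500) if w > threshold]
    return sum(kept) / len(kept) if kept else float("nan")


def monthly_anomalies(series):
    """Subtract the mean annual cycle from a monthly series (>= 12 months)."""
    climatology = [sum(series[m::12]) / len(series[m::12]) for m in range(12)]
    return [x - climatology[i % 12] for i, x in enumerate(series)]


def regression_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


# Synthetic two-year example: SST averaged over subsidence grid boxes (K),
# and an LCC series (%) constructed to lose 3.5 % per K of SST anomaly.
sst_sub = [20.0 + 0.5 * ((7 * i) % 5) for i in range(24)]
sst_anom = monthly_anomalies(sst_sub)
lcc_sub = [30.0 - 3.5 * a for a in sst_anom]

# Change in LCC per K of SST warming (%/K).
delta_lcc = regression_slope(sst_anom, lcc_sub)
print(f"dLCC/dSST = {delta_lcc:.2f} %/K")
```

In practice `subsidence_mean` would be applied to each monthly map of CF, LCC and SST to build the time series before the regression; the same slope function applies level by level for the CF profile.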

Assumptions and caveats
By using this method, we make some assumptions that generate some caveats. For example, we assume that the relationship between SST and low cloud amount is time-scale invariant, i.e., the same regardless of the time-scale over which anomalies are calculated. This seems to be supported by several previous studies 3, 31 , but we note that any such relevance to cloud feedback in the regions we study does not necessarily have broader implications for the global equilibrium climate sensitivity 32 . Moreover, we analyze the effect of the SST on the cloud by assuming that the cloud effect on the SST is negligible on a monthly time-scale based on previous studies 3, 31,33 . The relatively short period of the time-record is another caveat here. However, the standard deviation (STD) computed using the four SST datasets (or the 5-95 % confidence intervals when using a single SST dataset, not shown) is far smaller than the multimodel mean STD and bias, as shown in section 4. In addition, using a smaller period of time does not change the sign and shape of the results but may change its magnitude (not shown).
Other environmental factors may cause low cloud changes such as the estimated inversion strength or ⍵ 500 5,9 . When these factors are held constant the variation of the cloud amount as a function of the SST becomes a partial derivative. Past studies have shown that computing the partial derivative may decrease the magnitude of ∆LCC 4,5 .
As stated earlier, our ⍵ 500 filter aims at stratocumulus and stratocumulus to shallow cumulus transition regions. Such a definition of low clouds -while extensively used in the literature -does not permit us to distinguish between the two most common low-cloud types, that is to say trade cumulus and stratocumulus, and it also excludes parts of the trade cumulus regimes that have been argued to be important to overall cloud feedback (weak convective regimes 13 ). As a consequence, our results do not target a specific type of cloud but rather represent the regional-only averaged effect of all types of low clouds. Figure 1a shows the averaged cloud fraction profiles over the tropical oceans (35˚S to 35˚N) in subsidence regimes (⍵ 500 > 10 hPa/day). In the low levels (z<3.36 km), both GISS models underestimate the CF. Although GISS-E2's peak (purple line with stars) is slightly larger than E3's (blue line with stars), the shape of the GISS-E3 profile is in better agreement with the observations (two large values at 1.2 km and 1.68 km). In addition, GISS-E3's CF values are in very good agreement with the observations at 2.16 km and above while they are overestimated in GISS-E2, suggesting an excess of trade cumulus type of clouds. Most of the other models (9/12) also underestimate the CF, making the multi-model mean peak ~43 % smaller than observed (triangle green line, 11.2 %, vs. circled orange line, 19.6 %, Fig. 1a). In addition, the models' behavior is relatively diverse, which highlights the large uncertainty around the simulation of low clouds. The observed shape of the cloud fraction profile -a single peak around 1.2 km -is not captured by all models. Some simulate a double-peak shape, which is likely the result of the distinct contribution of stratocumulus and trade cumulus clouds, the latter having typically smaller CF and higher cloud top (usually defined in different parameterizations). 
Other models show a single peak as in the observations but with a far smaller CF. This could be explained by several factors: a boundary layer (BL) that is too shallow, a general lack of low clouds for a given thermodynamic state, a strong masking effect by overlying high clouds, or a larger influence of the convection/shallow convection parameterization over that of the large-scale cloud and turbulence parameterizations that determine stratocumulus clouds. In Figure 1b, we show the interannual change in CF per K of SST warming (∆CF = dCF/dSST) based on a linear regression between SST anomalies and CF, as described in Section 3.2. As for the mean cloud profiles, the models' responses are quite diverse, generating a very large variability compared to the observed STD. A group of models predicts a very small change, which can be an increase, a decrease or both at different heights. Other models simulate a large increase of the CF at the cloud top and a large decrease below, i.e., an upward shift rather than a cloud cover change. Finally, the remaining models reproduce the shape of the observed change well, that is to say a large decrease below 2 km.

Constraining the vertical response of low-level cloud fraction
In this study, we assume that i) the physical mechanisms that control the subtropical low cloud response to warmer surface temperature remain identical across all time scales and ii) those mechanisms are essential to predict the correct subtropical low cloud change in the future, although they may not necessarily be the only ones (e.g., current climate variability does not include the radiative effect of increased CO 2 on cloud-top turbulence). Additional phenomena, e.g., large-scale dynamical feedbacks that differ on interannual and centennial time scales, could also mitigate or amplify the change. However, we believe that the present-day interannual change in the cloud fraction (∆CF) is one important test that a model must pass for us to have confidence in its prediction of future climate. We therefore isolate the change of the low cloud cover due to a surface warming as well as the related top-of-atmosphere radiative impact for the subset of models that best reproduce the observed cloud fraction change, i.e., a large CF decrease (< -1 %/K) and no significant CF cloud top increase (< +0.5 %/K). In the remainder of the manuscript, we will call this category the "constrained models" (6/14, CanAM4, CESM1-CAM5, GFDL, GISS-E3, HadGEM2A and IPSL5B), represented in blue, and the other models the "unconstrained models" (8/14, BCC, CCSM4-CAM4, CNRM, GISS-E2, IPSL5A, MIROC5, MPI and MRI), represented in purple. The two GISS models fall into different categories: GISS-E2 is unconstrained, whereas the newest version, GISS-E3, is constrained.
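The selection criterion above can be expressed as a simple test on a model's ∆CF profile. The sketch below is an interpretation, not the procedure used in this study: it assumes the two thresholds are applied to the minimum and maximum of the low-level ∆CF profile, and the function name and example profiles are hypothetical.

```python
def is_constrained(dcf_profile, decrease_threshold=-1.0, increase_threshold=0.5):
    """Return True if a low-level dCF/dSST profile (%/K) passes both tests.

    Assumed reading of the criterion:
      - a large decrease somewhere in the profile (minimum < -1 %/K), and
      - no significant increase anywhere, e.g. at cloud top (maximum < +0.5 %/K).
    """
    has_large_decrease = min(dcf_profile) < decrease_threshold
    no_large_increase = max(dcf_profile) < increase_threshold
    return has_large_decrease and no_large_increase


# Hypothetical low-level profiles, ordered from the surface upward (%/K).
profiles = {
    "large decrease, no rise": [-0.8, -2.1, -1.6, -0.2, 0.1],
    "very small change": [-0.2, 0.1, -0.3, 0.2, 0.0],
    "upward shift": [-1.8, -2.4, -0.5, 1.3, 0.4],
}

classification = {name: is_constrained(p) for name, p in profiles.items()}
for name, passed in classification.items():
    print(f"{name}: {'constrained' if passed else 'unconstrained'}")
```

The three hypothetical profiles mirror the three behaviors described for Figure 1b: only the first passes both tests.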
Overall, the constrained models simulate a larger cloud amount in the low-levels, in better agreement with CALIPSO, than the unconstrained models (Fig. 1c). In addition to underestimating the low-level cloud amount and its decrease with respect to a surface warming, some unconstrained models predict low-level cloud top rising, either because of a deepening of the boundary-layer or due to an increase of the upper cloud fraction peak (Fig. 1d). This may imply an excess of trade cumuli in the present-day climate in the models having a dual-peak cloud fraction in the low levels (e.g., CCSM4-CAM4, MIROC, MRI, GISS-E2 and MPI, not shown): one large peak close to the surface (stratocumulus type) and another smaller peak above (trade cumulus type).

Consequences for low cloud cover
In the remainder of the manuscript, we use star shapes in our plots to distinguish the GISS models from the other models and emphasize the effect of cloud parameterization changes with respect to interannual LCC and cloud radiative effect (CRE) changes in a GCM. Based on this observational constraint, we now investigate how well the models simulate the LCC in present-day climate and with respect to a surface warming. Figure 2 shows the LCC maps for the observations and for the two model categories as well as their bias. Although the LCC global means of the GISS models are almost identical (LCC E2 =28.5 % and LCC E3 =28.6 %), their spatial patterns (Fig. 2b-d) are completely different (E2 failing to produce any stratocumulus clouds), which results in a very poor correlation factor for E2 (r=0.11, the smallest of all 14 models) as opposed to a very good one for E3 (r=0.86, the largest of all 14 models). The reader should also bear in mind that E3 cloud fraction and cloud cover are slightly underestimated in the present study because the simulator is run offline (at daily frequency), which generates lower cloud fractions and cloud covers than the inline version (not shown). On the other hand, the constrained models simulate larger LCC global and tropical means (LCC=30.5 %, r=0.92), closer to the observations (LCC=37 %), and also better reproduce the observed LCC pattern than the unconstrained models (LCC=25.7 %, r=0.86) and the multimodel mean (LCC=27.8 %, r=0.90).
Table 1: CRE, LCC and CRE/LCC changes depending on the cloud regime for the models and the observations in subsidence regimes defined as ⍵ 500 > 10 hPa/day. The constrained models and the observations are represented in bold. The star means that the models include moist processes in the PBL (either due to turbulence parametrization, shallow convection or both). The numbers in parentheses correspond to the standard deviation, computed based on four different SST datasets in the observations.
We then applied the same method as in Section 3.2 to calculate the interannual change in LCC per K of surface warming (Figure 3a and Table 1 first column, ∆LCC/∆SST). Consistent with the cloud fraction profiles, GISS-E3, the only model within the observational uncertainty, predicts a decrease of the LCC in response to a local 1 K surface warming (-3.55 % K -1 ), like most models (12/14), as opposed to a small increase for GISS-E2 (0.22 % K -1 ). As with the difference between GISS-E2 and E3, the multimodel spread is large (5.4 % K -1 , Table 1), about two and a half times the absolute value of the multimodel mean (-2.25 % K -1 , Table 1). However, the constrained models simulate a ∆LCC/∆SST slightly smaller than the observation but within the observational uncertainty (-3.59 % K -1 +/-0.28 % K -1 ) and with a much-reduced spread (-3.49 % K -1 +/-1.01 % K -1 ). The observed ∆LCC/∆SST is significant as its amplitude is more than three times larger than the LCC annual standard deviation in the same dynamical regimes (1 %).
It is plausible that ∆LCC could depend on the initial amount of LCC in a model 34 . While the difference between GISS-E2 and GISS-E3 is not significant, comparing this relationship for multiple versions of the GISS-E3 model (run over the course of its development) confirms a relationship between ∆LCC and the present-day LCC in subsidence regions (Fig. 3b). This relationship holds regardless of whether the simulator is used or not. Except for MIROC5, which simulates a present-day LCC almost as large as the observations, the constrained models simulate a larger present-day LCC in subsidence regions than the unconstrained models (consistent with what was found in Fig. 2). With MIROC5 set aside, the correlation between the LCC and ∆LCC in Fig. 3 becomes more obvious (r = -0.57 vs. r = -0.40 for all models). One should note that the present-day LCC could be biased low in some models, due to an overly strong shielding effect by overlying high clouds compared to the observations, possibly affecting the relationship between the present-day LCC and ∆LCC. In the GISS-E3 model, the simulator does not affect ∆LCC (Fig. 3; compare red and black versions of the same symbols), despite its significant impact on the present-day LCC as hypothesized before. In addition, the relationship may be different depending on the type of clouds since Fig. 3 does not separate trade cumulus from stratocumulus.

Consequences for interannual low cloud feedbacks
In this section, we further examine the impact of cloud changes on the radiative budget, using CREs, defined as the difference between the all-sky flux and the clear-sky flux at the top of the atmosphere (TOA). Figure 4 shows the change in the SW, LW, and net CREs per K of surface warming, referred to as ∆CRE/∆SST (i.e., dCRE/dSST). A positive ∆CRE/∆SST indicates a warming effect at the TOA due to clouds when the SST increases; conversely, a negative ∆CRE/∆SST indicates a cooling effect. This quantity may be used as a proxy to characterize cloud feedbacks at the TOA 35,36 . All observed ∆CRE SW /∆SST, ∆CRE LW /∆SST and ∆CRE NET /∆SST are positive, a feature particularly well captured by GISS-E3, which is in almost perfect agreement with the data for both the SW and LW components of the interannual feedback, while GISS-E2 gets the sign of the SW component wrong. Both constrained and unconstrained multimodel means (colored triangles) get the correct sign of all three feedbacks, although the sign and the magnitude of ∆CRE NET /∆SST vary significantly among the models, mostly driven by the SW component, in agreement with previous studies 35,36 . Overall, the constrained models perform better than the unconstrained models for all three components, in terms of absolute value and variability. In particular, the unconstrained models largely underestimate the ∆CRE SW /∆SST (0.73 Wm -2 K -1 , Table 1 second column), compared to the observations (3.05 +/-0.28 Wm -2 K -1 ), whereas the constrained models almost fall within the observed uncertainty (2.60 Wm -2 K -1 ).
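To make the sign convention concrete, the sketch below computes SW, LW and net CRE from TOA upward fluxes. It assumes the common convention in which CRE is the clear-sky minus all-sky upward flux, which is equivalent to the all-sky minus clear-sky net downward flux because the incoming solar flux is the same in both cases. The function name and flux values are hypothetical and purely illustrative.

```python
def cloud_radiative_effect(sw_up_all, sw_up_clear, lw_up_all, lw_up_clear):
    """Return (CRE_SW, CRE_LW, CRE_NET) in W/m2 from TOA upward fluxes.

    Positive values mean that clouds warm the system; negative values
    mean that clouds cool it (assumed convention, see lead-in).
    """
    cre_sw = sw_up_clear - sw_up_all  # clouds reflect more SW: negative
    cre_lw = lw_up_clear - lw_up_all  # clouds emit less LW to space: positive
    return cre_sw, cre_lw, cre_sw + cre_lw


# Hypothetical TOA upward fluxes (W/m2) for a low-cloud scene.
cre_sw, cre_lw, cre_net = cloud_radiative_effect(
    sw_up_all=100.0, sw_up_clear=50.0, lw_up_all=270.0, lw_up_clear=275.0
)
print(cre_sw, cre_lw, cre_net)
```

Under this convention, a loss of low cloud with warming makes CRE SW less negative, so ∆CRE SW /∆SST is positive; regressing monthly CRE against SST anomalies then yields the ∆CRE/∆SST discussed above.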
Because of the optical properties of their spherical droplets, tropical low-level liquid clouds strongly interact with solar radiation by reflecting back to space most of the incoming shortwave radiation. As a result, any change in the LCC should affect the CRE SW at TOA and one should expect a good correlation between the two quantities, which is demonstrated in Fig. 4a, with a correlation factor of -0.94 (excluding the outlier from the calculation). There is no particular correlation for the LW component whereas for the NET component, the correlation is also very large (r = -0.94), driven by the shortwave radiation, confirming its crucial role in determining the cloud feedback spread of CMIP models 37 . Once again, both the magnitude and the variability of the three components are better reproduced by the constrained category of models. In addition, we analyzed the sensitivity of ∆CRE SW to ∆LCC by simply computing the ratio between the two quantities (Table 1, third column; as in Klein et al. 31 ). GISS-E2 largely overestimates the magnitude of this ratio (by a factor of 10) as do two other models (IPSL-5A and CNRM), which poorly represent the stratocumulus deck. On the other hand, GISS-E3 stands among the best models and almost perfectly replicates the observed ratio. Like GISS-E2, the unconstrained models largely overestimate the radiative impact of an LCC loss (-3.13 W/m 2 /%) compared to the observations (-0.85 W/m 2 /%) while the constrained models reproduce the observed relationship quite well (-0.74 W/m 2 /%). The inability of the unconstrained models to simulate a sufficient amount of LCC in the present-day climate may generate a lack of outgoing SW radiation at TOA, which is compensated by artificially increasing the reflectivity of the clouds during the tuning process in some modeling centers 38 .
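The sensitivity ratio discussed above can be checked with the values quoted in the text. The sketch below divides the reported ∆CRE SW /∆SST by the reported ∆LCC/∆SST for the observations and for the constrained-model mean. Taking the ratio of group means is a simplification assumed here; it need not equal an average of per-model ratios.

```python
def sw_sensitivity(dcre_sw_dsst, dlcc_dsst):
    """Ratio of dCRE_SW/dSST (W/m2/K) to dLCC/dSST (%/K), in W/m2 per % of LCC."""
    return dcre_sw_dsst / dlcc_dsst


# Values quoted in the text for the observations and the constrained models.
ratio_obs = sw_sensitivity(3.05, -3.59)
ratio_con = sw_sensitivity(2.60, -3.49)

print(f"observations: {ratio_obs:.2f} W/m2/%")
print(f"constrained:  {ratio_con:.2f} W/m2/%")
```

Both ratios round to the values given in the text (-0.85 and -0.74 W/m 2 /%). A negative ratio means that each percent of LCC lost adds SW warming at the TOA.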
The constrained models all generate large stratocumulus decks along with a substantial amount of tropical low clouds in non-stratocumulus regions, which seems key to simulating the correct global response of low clouds to surface warming. This is likely because they simulate moist processes in the planetary boundary layer (PBL) through either the turbulence parameterization (e.g., GISS-E3, CESM1-CAM5, GFDL AM3, HadGEM2A, CanAM4), the convection parameterization (IPSL5B) or both at the same time (HadGEM2A), in addition to having a turbulence scheme that allows stratocumuli to form. This becomes more evident when looking at the evolution of individual models. For example, implementing a more physically-based "moist" turbulence parametrization (based on some aspects of Bretherton and Park 23 ) in the GISS-E3 model changes the sign of ∆LCC/∆SST and ∆CRE SW /∆SST and brings the model results within the range of uncertainty of the observations (in addition to other changes). Similarly, the changes in the IPSL model from version 5A to 5B significantly improved its simulations of the ∆LCC and ∆CRE SW quantities, most likely because its "dry" PBL was turned into a "moist" PBL through the implementation of moist shallow convection within the PBL 39 , which improved its wind profiles and PBL height 40 , combined with a revision of its turbulence scheme, which improved its representation of stratocumulus clouds. However, the MPI "moist-PBL" model does not fall into the constrained category. Even though its results are quite close to the observations, the clear overestimation of the cloud frequency above 2.16 km (not shown, likely trade cumulus clouds) alters its ∆CF and leads to an overly strong sensitivity of ∆CRE SW to ∆LCC. Conversely, the BCC "dry-PBL" model captures ∆LCC and ∆CRE SW variations well (within the range of the constrained models) although its ∆CF is unrealistic.
Therefore, the capacity of the models to replicate the observed response of low-level clouds and radiation to warmer surface temperature seems to be tied to whether or not i) they simulate moist processes in the PBL and ii) their turbulence scheme sustains stratocumulus clouds. Such results also demonstrate that a simple 2D description of the cloud properties is not sufficient to fully understand and predict how cloud may react to surface temperature forcings.

SUMMARY AND DISCUSSION
In response to interannual surface warming, the marine tropical low cloud cover (LCC) as observed by the active sensor on the CALIPSO satellite over a 10-year period significantly decreases (∆LCC/∆SST = -3.59 %/K). This reduction of the LCC is larger than that found using results from passive-sensor satellites (∆LCC = -1 to -2.95 %/K), albeit consistent in terms of sign and magnitude 3,5 . Overall, the ensemble mean of CMIP5 models captures the sign and the shape of the observed interannual low-cloud cover change (∆LCC/∆SST) quite well. However, its magnitude is underestimated and the model variability is large (∆LCC/∆SST = -2.25 ±1.58 %/K), with some models (2/14) even simulating the wrong sign (a gain instead of a loss). When examined as a function of height, the interannual cloud fraction change (∆CF) in the lower levels reveals various behaviors, which depend on the type of cloud and its height. We further show that it is possible to separate the model responses to a surface warming using CALIPSO observations of the vertical cloud fraction (∆CF/∆SST) as a constraint: we select the GCMs that produce the most realistic change in cloud profile per K of SST warming, referred to as "constrained" models. By doing so, we find that the "constrained" models simulate a more realistic behavior of low-level cloud fraction and the associated interannual radiative feedbacks (∆CRE SW /∆SST), together with a smaller variability in response to a surface warming. Their averaged ∆LCC/∆SST is within the observed uncertainty while they slightly underestimate the ∆CRE SW /∆SST. Meanwhile, the "unconstrained" category underestimates the magnitude of both quantities by a factor of 3 to 4. The fact that models that simulate moist processes within the planetary boundary layer produce sustainable stratocumulus decks appears crucial to replicating the observed relationship between cloud/radiation and surface temperature.
Future work will focus on defining a method to discriminate stratocumulus from trade cumulus clouds in observations. By doing so, we will be able to assess the spatial distributions of these clouds and to evaluate the models more precisely. In addition to this, refining the contribution of additional cloud controlling factors may advance our understanding of physical processes driving the change of cloud fraction in response to a warmer climate.

ACKNOWLEDGMENT
GC and AD were supported by a CloudSat-CALIPSO grant at the NASA Goddard Institute for Space Studies. AA, MK, YC and AF were supported by a NASA Modeling, Analysis, and Prediction grant. The GISS-E3 simulations can be made available upon request; the final version of GISS-E3 will be made part of the CMIP6 model archive.