Global Sensitivity Analysis Using Sobol Indices
1 Overview
This report presents results from a variance-based global sensitivity analysis of the SIPNET ecosystem model using Sobol indices. Unlike local sensitivity analysis (which measures rates of change at baseline conditions), variance-based methods decompose output variance into contributions from individual parameters and their interactions.
Sobol indices provide a rigorous framework for:
- Quantifying parameter importance via variance decomposition
- Detecting interactions between parameters (non-additive effects)
- Factor prioritization (which parameters to constrain for uncertainty reduction)
- Factor fixing (which parameters can be fixed without loss of information)
1.1 Theoretical Background
Sobol indices are normalized variance contributions:
\[ S_i = \frac{V_i}{\text{VAR}(y)} \quad \text{(First-order index)} \tag{1}\]
\[ T_i = 1 - \frac{\text{VAR}_{x_{\sim i}}[\mathbb{E}_{x_i}(y | x_{\sim i})]}{\text{VAR}(y)} = \frac{\mathbb{E}_{x_{\sim i}}[\text{VAR}_{x_i}(y | x_{\sim i})]}{\text{VAR}(y)} \quad \text{(Total-order index)} \tag{2}\]
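where \(V_i = \text{VAR}_{x_i}[\mathbb{E}_{x_{\sim i}}(y | x_i)]\) is the variance of the conditional mean of \(y\) given \(x_i\), and \(x_{\sim i}\) denotes all parameters except \(x_i\).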
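To make Equations 1 and 2 concrete, the following is a minimal sketch of the Jansen estimators on a toy function. It uses plain pseudo-random sampling rather than a Sobol sequence, and `toy_model` and `jansen_indices` are illustrative names, not part of sensobol or the SIPNET workflow. The toy model is additive with a third input that has no effect, so the expected indices are roughly 0.2, 0.8, and 0.

```python
import random
from statistics import pvariance

def toy_model(x):
    # Hypothetical test function: additive in x[0] and x[1]; x[2] is a dummy input.
    return x[0] + 2.0 * x[1]

def jansen_indices(model, k, n, seed=0):
    rng = random.Random(seed)
    # Two independent n-by-k sample matrices with uniform(0, 1) columns.
    A = [[rng.random() for _ in range(k)] for _ in range(n)]
    B = [[rng.random() for _ in range(k)] for _ in range(n)]
    y_a = [model(row) for row in A]
    y_b = [model(row) for row in B]
    var_y = pvariance(y_a + y_b)
    first, total = [], []
    for i in range(k):
        # A_B^(i): matrix A with column i taken from B.
        y_ab = [model(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        # Jansen first-order: V_i = VAR(y) - mean((f(B) - f(A_B^(i)))^2) / 2
        v_i = var_y - sum((yb - yab) ** 2 for yb, yab in zip(y_b, y_ab)) / (2 * n)
        # Jansen total-order numerator: mean((f(A) - f(A_B^(i)))^2) / 2
        vt_i = sum((ya - yab) ** 2 for ya, yab in zip(y_a, y_ab)) / (2 * n)
        first.append(v_i / var_y)
        total.append(vt_i / var_y)
    return first, total

S, T = jansen_indices(toy_model, k=3, n=4096)
print([round(s, 2) for s in S])
print([round(t, 2) for t in T])
```

The total-order estimate is a mean of squared differences, so it is non-negative and comes out exactly zero for the input that never changes the output. The first-order estimate has no such guarantee and can be slightly negative for weak inputs.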
1.1.1 Interpretation
- \(S_i\): Direct effect of \(x_i\) on output variance (excluding interactions)
- \(T_i\): Total effect of \(x_i\) including all interactions with other parameters
- \(T_i - S_i\): Interaction strength (purely higher-order effects)
- \(\sum S_i \approx 1\): Model is additive (inputs act independently, although individual main effects may still be nonlinear)
- \(\sum S_i \ll 1\): Model is non-additive (interactions dominate)
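The bounds behind this interpretation follow from the variance decomposition that underlies Sobol indices. As a short derivation, assuming independent inputs, the output variance splits into the main-effect terms \(V_i\) of Equation 1 plus interaction terms \(V_{ij}, \dots\):

\[ \text{VAR}(y) = \sum_i V_i + \sum_{i<j} V_{ij} + \dots + V_{1,2,\dots,k} \]

Dividing by \(\text{VAR}(y)\) gives

\[ \sum_i S_i + \sum_{i<j} S_{ij} + \dots + S_{1,2,\dots,k} = 1, \qquad T_i = S_i + \sum_{j \neq i} S_{ij} + \dots \]

Every term is non-negative, so \(S_i \le T_i\) and \(\sum S_i \le 1 \le \sum T_i\), with equality when the model has no interaction terms. Estimated indices can violate these bounds when the sample is small.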
2 Study Design
Method: Saltelli’s sampling scheme (Saltelli et al. 2010) with Jansen estimators for both first-order and total-order indices, implemented via the sensobol R package (Puy et al. 2022). Bootstrap resampling (R = 500) provides confidence intervals.
Sample Size: \(N = 32\), Total Runs = \(N \times (k + 2)\) = 1440
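The run budget can be checked with a few lines of arithmetic. This is a sketch only, with hypothetical helper names. Under the \(N \times (k + 2)\) formula, the reported 1440 runs at \(N = 32\) correspond to 43 sampled input columns, two more than the \(k = 41\) quoted in Section 10. Whether the extra columns are the dummy input or ensemble selectors is not stated in this report, so that reading is an assumption.

```python
def saltelli_runs(n_base, k_inputs):
    # Matrices A and B plus k hybrid matrices A_B^(i): N * (k + 2) evaluations.
    return n_base * (k_inputs + 2)

def implied_inputs(total_runs, n_base):
    # Invert the budget formula; returns None if the division is not exact.
    cols, rem = divmod(total_runs, n_base)
    return cols - 2 if rem == 0 else None

print(saltelli_runs(32, 43))     # 1440
print(implied_inputs(1440, 32))  # 43
```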
Scope: 17 sites across California row-crop agriculture
3 Data Processing
We process the raw indices to parse namespaced parameters (e.g., grass.SLA), identify the dummy parameter (Sobol validation; its \(T_i\) should be near zero), and calculate interaction components (\(T_i - S_i\)).
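The processing step can be sketched as follows. The record layout (`input`, `Si`, `Ti` keys), the dummy's name, and the numeric values are assumptions for illustration and do not reflect the report's actual data structures.

```python
def split_namespace(name):
    # "grass.SLA" -> ("grass", "SLA"); names without a dot get an empty namespace.
    namespace, _, parameter = name.rpartition(".")
    return namespace, parameter

def add_interaction(rows, dummy_name="dummy"):
    out = []
    for row in rows:
        namespace, parameter = split_namespace(row["input"])
        out.append({
            **row,
            "namespace": namespace,
            "parameter": parameter,
            "is_dummy": parameter == dummy_name,
            # Interaction component; may be slightly negative from estimator noise.
            "interaction": row["Ti"] - row["Si"],
        })
    return out

rows = [
    {"input": "grass.SLA", "Si": 0.30, "Ti": 0.45},  # illustrative values only
    {"input": "dummy", "Si": 0.02, "Ti": 0.0},
]
processed = add_interaction(rows)
for r in processed:
    print(r["namespace"], r["parameter"], r["is_dummy"], round(r["interaction"], 3))
```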
4 Key Findings
Based on a Sobol analysis with 32 base samples and 1440 total runs across 17 sites:
- Top inputs: Initial Conditions, Soil Methane Rate, and N Volatilization Rate explain the most output variance (\(T_i\) = 0.91, 0.81, 0.63)
- Model parameters account for ~71% of the first-order variance contribution; the remaining contribution is distributed across meteorological drivers, initial conditions, and management inputs
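- Model additivity (\(\sum S_i\)) averages 1.93. Values near 1 indicate additive behavior and values \(\ll 1\) indicate that parameter interactions dominate; because \(\sum S_i\) cannot exceed 1 for independent inputs, an average this far above 1 points to estimation error in the first-order indices at this sample size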
- Noise floor: the dummy parameter’s median \(T_i\) = 0, which serves as the screening threshold for identifying non-influential inputs
5 Variance Decomposition (Source Partitioning)
This section summarizes first-order (main-effect) uncertainty contributions by source (model parameters, meteorology, initial conditions, and management). This is done by summing first-order Sobol indices (\(S_i\)) within broad categories (Equation 1). These contributions do not include interaction effects (see \(T_i - S_i\)).
For discrete inputs such as initial condition and meteorological ensembles, Sobol coordinates are used to select among ensemble members / files. The resulting indices quantify variance attributable to uncertainty in which ensemble member is selected, not sensitivity along an ordered or continuous variable.
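One simple way to turn a Sobol coordinate into an ensemble choice is equal-width binning of the unit interval. The sketch below assumes that mapping and uses hypothetical file names; the report does not state the exact selection rule.

```python
import math

def select_member(u, members):
    # Map a coordinate u in [0, 1] to one of len(members) equal-width bins.
    # The min() guards the edge case u == 1.0.
    idx = min(math.floor(u * len(members)), len(members) - 1)
    return members[idx]

met_files = ["met_ens_01.clim", "met_ens_02.clim", "met_ens_03.clim"]  # hypothetical
print(select_member(0.10, met_files))
print(select_member(0.95, met_files))
```

Because the member order is arbitrary, neighboring coordinates do not imply similar forcing, which is why the resulting index describes the choice of member rather than a trend.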
The “Control (Noise)” source is excluded for clarity; it accounts for numerical estimation error from the Sobol design.
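The source partitioning described in this section amounts to a grouped sum of first-order indices. The sketch below assumes the contributions are normalized to shares after the control source is dropped; the input names, category mapping, and values are illustrative only.

```python
from collections import defaultdict

def partition_by_source(first_order, source_of, exclude=("Control (Noise)",)):
    # Sum S_i within each source category.
    totals = defaultdict(float)
    for name, s_i in first_order.items():
        totals[source_of[name]] += s_i
    kept = {src: v for src, v in totals.items() if src not in exclude}
    grand = sum(kept.values())
    if grand == 0:
        return {src: 0.0 for src in kept}
    # Express each category as a share of the retained first-order total.
    return {src: v / grand for src, v in kept.items()}

first_order = {"grass.SLA": 0.4, "met_ensemble": 0.2, "ic_ensemble": 0.3, "dummy": 0.1}
source_of = {
    "grass.SLA": "Model parameters",
    "met_ensemble": "Meteorology",
    "ic_ensemble": "Initial conditions",
    "dummy": "Control (Noise)",
}
shares = partition_by_source(first_order, source_of)
for src, share in shares.items():
    print(src, round(share, 3))
```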
6 Parameter Ranking (First & Total Order)
This plot compares First-Order (\(S_i\), solid bar) and Total-Order (\(T_i\), whisker) indices. The gap between them quantifies interaction strength (Equation 2 minus Equation 1).
This ranking identifies the inputs with the largest total effect (\(T_i\)), indicating which uncertainties most strongly influence model outputs.
- Solid bar (\(S_i\)): Direct (main) effect of the parameter alone
- Gap (\(T_i - S_i\)): Higher-order interaction effects with other parameters
Parameters are ordered by median \(T_i\) across sites. We show the top 15 inputs for each output variable.
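The ordering rule can be sketched in a few lines. The dictionary layout (one list of per-site \(T_i\) values per input) and the example values are assumptions for illustration.

```python
from statistics import median

def rank_by_median_total(ti_by_site, top_n=15):
    # ti_by_site: {input_name: [T_i at site 1, T_i at site 2, ...]}
    medians = {name: median(vals) for name, vals in ti_by_site.items()}
    # Highest median T_i first, truncated to the top_n inputs.
    return sorted(medians.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

ti_by_site = {
    "a": [0.9, 0.8, 0.7],  # illustrative values only
    "b": [0.1, 0.5, 0.2],
    "c": [0.0, 0.0, 0.0],
}
ranking = rank_by_median_total(ti_by_site, top_n=2)
print(ranking)
```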
6.1 Cross-Site Variability
A parameter whose importance varies across sites (wide box) indicates context-dependent sensitivity: its influence depends on local conditions (soil, climate, crop type). These are priorities for spatially explicit calibration.
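Box width corresponds to the interquartile range of \(T_i\) across sites. A minimal sketch of that summary follows, with a hypothetical data layout and illustrative values.

```python
from statistics import quantiles

def cross_site_iqr(ti_by_site):
    # Interquartile range of T_i across sites for each input.
    out = {}
    for name, vals in ti_by_site.items():
        q1, _, q3 = quantiles(vals, n=4)
        out[name] = q3 - q1
    return out

ti_by_site = {
    "context_dependent": [0.1, 0.3, 0.5, 0.7, 0.9],   # illustrative values only
    "stable": [0.48, 0.49, 0.50, 0.51, 0.52],
}
iqr = cross_site_iqr(ti_by_site)
print({name: round(v, 3) for name, v in iqr.items()})
```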
7 Interaction Analysis
Parameters with large differences between \(T_i\) and \(S_i\) interact strongly with other factors. These non-additive effects mean the parameter’s impact on output depends on the values of other inputs.
Interaction strength (\(T_i - S_i\)) indicates how much a parameter’s influence depends on other inputs rather than acting independently.
8 Factor Fixing (Screening)
To identify non-influential parameters, we compare each Total-Order index (\(T_i\)) against that of the dummy parameter, which serves as the noise floor for screening. The dummy parameter is sampled like any other input but has no effect on model output, so its estimated \(T_i\) reflects estimation error alone. We therefore assume that any parameter with median \(T_i\) at or below the dummy’s median can be fixed to a constant value without meaningful loss of variance coverage.
The dashed red line marks the noise floor (median dummy \(T_i\)).
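The screening rule can be sketched as a comparison of medians. The data layout, the dummy's name, and the values below are assumptions for illustration.

```python
from statistics import median

def fixable_inputs(ti_by_site, dummy_name="dummy"):
    # Noise floor: median T_i of the dummy input across sites.
    floor = median(ti_by_site[dummy_name])
    # An input is a fixing candidate if its median T_i is at or below the floor.
    return sorted(
        name for name, vals in ti_by_site.items()
        if name != dummy_name and median(vals) <= floor
    )

ti_by_site = {
    "dummy": [0.0, 0.0, 0.01],             # illustrative values only
    "grass.SLA": [0.4, 0.5, 0.3],
    "N Fertilizer Rate": [0.0, 0.0, 0.0],
}
print(fixable_inputs(ti_by_site))
```

When the noise floor is zero, as in the table that follows, and the total-order estimate cannot be negative, this rule flags only inputs whose median \(T_i\) is zero.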
Fixable parameters (median \(T_i\) at or below the noise floor):
| Variable | Input | Median \(T_i\) | Noise Threshold |
|---|---|---|---|
| Aboveground Biomass | Growing Degree Days | 0 | 0 |
| Aboveground Biomass | Leaf Fall Fraction | 0 | 0 |
| Aboveground Biomass | Growth Resp. Factor | 0 | 0 |
| Aboveground Biomass | Leaf Growth Rate | 0 | 0 |
| Aboveground Biomass | Compost C:N Ratio | 0 | 0 |
| Aboveground Biomass | Compost Application | 0 | 0 |
| Aboveground Biomass | N Fertilizer Rate | 0 | 0 |
| Aboveground Biomass | Anaerobic Decomp. Rate | 0 | 0 |
| Aboveground Biomass | Anaerobic Transition Exp. | 0 | 0 |
| Aboveground Biomass | Fine Root C:N | 0 | 0 |
| Aboveground Biomass | Leaf C:N | 0 | 0 |
| Aboveground Biomass | Wood C:N | 0 | 0 |
| Aboveground Biomass | Anoxia Fraction | 0 | 0 |
| Aboveground Biomass | C:N Decomp. Modifier | 0 | 0 |
| Aboveground Biomass | Litter Methane Rate | 0 | 0 |
| Aboveground Biomass | N Fixation Half-Sat. | 0 | 0 |
| Aboveground Biomass | Max N Fixation Fraction | 0 | 0 |
| Aboveground Biomass | N Leaching Fraction | 0 | 0 |
| Aboveground Biomass | N Volatilization Rate | 0 | 0 |
| Aboveground Biomass | Soil Methane Rate | 0 | 0 |
| Aboveground Biomass | SOM Respiration Rate | 0 | 0 |
| CH₄ Flux | Growing Degree Days | 0 | 0 |
| CH₄ Flux | Leaf Fall Fraction | 0 | 0 |
| CH₄ Flux | Growth Resp. Factor | 0 | 0 |
| CH₄ Flux | Leaf Growth Rate | 0 | 0 |
| CH₄ Flux | N Fertilizer Rate | 0 | 0 |
| CH₄ Flux | N Fixation Half-Sat. | 0 | 0 |
| CH₄ Flux | Max N Fixation Fraction | 0 | 0 |
| CH₄ Flux | N Leaching Fraction | 0 | 0 |
| CH₄ Flux | N Volatilization Rate | 0 | 0 |
| Latent Heat Flux | Growing Degree Days | 0 | 0 |
| Latent Heat Flux | Leaf Fall Fraction | 0 | 0 |
| Latent Heat Flux | Growth Resp. Factor | 0 | 0 |
| Latent Heat Flux | Leaf Growth Rate | 0 | 0 |
| Latent Heat Flux | Compost C:N Ratio | 0 | 0 |
| Latent Heat Flux | Compost Application | 0 | 0 |
| Latent Heat Flux | N Fertilizer Rate | 0 | 0 |
| Latent Heat Flux | Anaerobic Decomp. Rate | 0 | 0 |
| Latent Heat Flux | Anaerobic Transition Exp. | 0 | 0 |
| Latent Heat Flux | Fine Root C:N | 0 | 0 |
| Latent Heat Flux | Leaf C:N | 0 | 0 |
| Latent Heat Flux | Wood C:N | 0 | 0 |
| Latent Heat Flux | Anoxia Fraction | 0 | 0 |
| Latent Heat Flux | C:N Decomp. Modifier | 0 | 0 |
| Latent Heat Flux | Litter Methane Rate | 0 | 0 |
| Latent Heat Flux | N Fixation Half-Sat. | 0 | 0 |
| Latent Heat Flux | Max N Fixation Fraction | 0 | 0 |
| Latent Heat Flux | N Leaching Fraction | 0 | 0 |
| Latent Heat Flux | N Volatilization Rate | 0 | 0 |
| Latent Heat Flux | Soil Methane Rate | 0 | 0 |
| Latent Heat Flux | SOM Respiration Rate | 0 | 0 |
| Net Primary Productivity | Growing Degree Days | 0 | 0 |
| Net Primary Productivity | Leaf Fall Fraction | 0 | 0 |
| Net Primary Productivity | Growth Resp. Factor | 0 | 0 |
| Net Primary Productivity | Leaf Growth Rate | 0 | 0 |
| Net Primary Productivity | Compost C:N Ratio | 0 | 0 |
| Net Primary Productivity | Compost Application | 0 | 0 |
| Net Primary Productivity | N Fertilizer Rate | 0 | 0 |
| Net Primary Productivity | Anaerobic Decomp. Rate | 0 | 0 |
| Net Primary Productivity | Anaerobic Transition Exp. | 0 | 0 |
| Net Primary Productivity | Fine Root C:N | 0 | 0 |
| Net Primary Productivity | Leaf C:N | 0 | 0 |
| Net Primary Productivity | Wood C:N | 0 | 0 |
| Net Primary Productivity | Anoxia Fraction | 0 | 0 |
| Net Primary Productivity | C:N Decomp. Modifier | 0 | 0 |
| Net Primary Productivity | Litter Methane Rate | 0 | 0 |
| Net Primary Productivity | N Fixation Half-Sat. | 0 | 0 |
| Net Primary Productivity | Max N Fixation Fraction | 0 | 0 |
| Net Primary Productivity | N Leaching Fraction | 0 | 0 |
| Net Primary Productivity | N Volatilization Rate | 0 | 0 |
| Net Primary Productivity | Soil Methane Rate | 0 | 0 |
| Net Primary Productivity | SOM Respiration Rate | 0 | 0 |
| N₂O Flux | Growing Degree Days | 0 | 0 |
| N₂O Flux | Leaf Fall Fraction | 0 | 0 |
| N₂O Flux | Growth Resp. Factor | 0 | 0 |
| N₂O Flux | Leaf Growth Rate | 0 | 0 |
| Soil Carbon | Growing Degree Days | 0 | 0 |
| Soil Carbon | Leaf Fall Fraction | 0 | 0 |
| Soil Carbon | Growth Resp. Factor | 0 | 0 |
| Soil Carbon | Leaf Growth Rate | 0 | 0 |
| Soil Carbon | N Fertilizer Rate | 0 | 0 |
| Soil Carbon | N Fixation Half-Sat. | 0 | 0 |
| Soil Carbon | Max N Fixation Fraction | 0 | 0 |
| Soil Carbon | N Leaching Fraction | 0 | 0 |
| Soil Carbon | N Volatilization Rate | 0 | 0 |
| Soil Moisture | Growing Degree Days | 0 | 0 |
| Soil Moisture | Leaf Fall Fraction | 0 | 0 |
| Soil Moisture | Growth Resp. Factor | 0 | 0 |
| Soil Moisture | Leaf Growth Rate | 0 | 0 |
| Soil Moisture | Compost C:N Ratio | 0 | 0 |
| Soil Moisture | Compost Application | 0 | 0 |
| Soil Moisture | N Fertilizer Rate | 0 | 0 |
| Soil Moisture | Anaerobic Decomp. Rate | 0 | 0 |
| Soil Moisture | Anaerobic Transition Exp. | 0 | 0 |
| Soil Moisture | Fine Root C:N | 0 | 0 |
| Soil Moisture | Leaf C:N | 0 | 0 |
| Soil Moisture | Wood C:N | 0 | 0 |
| Soil Moisture | Anoxia Fraction | 0 | 0 |
| Soil Moisture | C:N Decomp. Modifier | 0 | 0 |
| Soil Moisture | Litter Methane Rate | 0 | 0 |
| Soil Moisture | N Fixation Half-Sat. | 0 | 0 |
| Soil Moisture | Max N Fixation Fraction | 0 | 0 |
| Soil Moisture | N Leaching Fraction | 0 | 0 |
| Soil Moisture | N Volatilization Rate | 0 | 0 |
| Soil Moisture | Soil Methane Rate | 0 | 0 |
| Soil Moisture | SOM Respiration Rate | 0 | 0 |
9 Model Additivity
The sum of First-Order indices (\(\sum S_i\)) indicates the degree of model additivity. Values near 1.0 imply the model is additive and parameters contribute independently. Values well below 1 indicate that interactions dominate: the model’s response to one parameter depends on the values of others. Values above 1 cannot arise from the decomposition itself when inputs are independent, so they indicate estimation error in the first-order indices.
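A per-site additivity check can be sketched as below. The tolerance of 0.1 and the label wording are arbitrary choices for illustration, and the site names and values are hypothetical.

```python
def additivity_report(first_order_by_site, tol=0.1):
    # first_order_by_site: {site: {input_name: S_i}}
    report = {}
    for site, s in first_order_by_site.items():
        total = sum(s.values())
        if total > 1 + tol:
            label = "above 1: estimation error likely"
        elif total < 1 - tol:
            label = "interactions present"
        else:
            label = "approximately additive"
        report[site] = (total, label)
    return report

first_order_by_site = {
    "site_a": {"p1": 0.50, "p2": 0.45},  # illustrative values only
    "site_b": {"p1": 0.30, "p2": 0.20},
    "site_c": {"p1": 1.20, "p2": 0.73},
}
report = additivity_report(first_order_by_site)
for site, (total, label) in report.items():
    print(site, round(total, 2), label)
```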
10 Convergence Diagnostics
A Sobol analysis is only as reliable as its sample size allows. With \(N = 32\) base samples and \(k = 41\) parameters, our total model evaluation budget is 1440 runs per site. This section assesses whether that budget was sufficient for stable index estimation, following the diagnostic framework of Nossent et al. (2011).
10.1 Confidence Interval Precision
The width of the bootstrap confidence interval (\(\text{CI}_{95}\)) for each index indicates estimation precision. Parameters with narrow CIs have well-determined importance; wide CIs flag indices that may shift with additional samples.
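As an illustration of how such an interval can be obtained, here is a minimal percentile-bootstrap sketch for one Jansen total-order index. It resamples base-sample rows with replacement and is not sensobol's internal implementation; the function names and toy data are assumptions.

```python
import random
from statistics import pvariance

def jansen_total(y_a, y_ab, idx):
    # Jansen total-order estimate using only the rows listed in idx.
    ya = [y_a[j] for j in idx]
    yab = [y_ab[j] for j in idx]
    num = sum((p - q) ** 2 for p, q in zip(ya, yab)) / (2 * len(idx))
    return num / pvariance(ya)

def bootstrap_ci(y_a, y_ab, n_boot=500, level=0.95, seed=0):
    rng = random.Random(seed)
    n = len(y_a)
    # Recompute the index on n_boot resamples of the row indices.
    stats = sorted(
        jansen_total(y_a, y_ab, [rng.randrange(n) for _ in range(n)])
        for _ in range(n_boot)
    )
    lo = stats[int((1 - level) / 2 * n_boot)]
    hi = stats[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi

# Toy data: y = x0 + 2 * x1, with x0 resampled in the hybrid matrix A_B^(0).
rng = random.Random(1)
n = 32
a0 = [rng.random() for _ in range(n)]
a1 = [rng.random() for _ in range(n)]
b0 = [rng.random() for _ in range(n)]
y_a = [p + 2.0 * q for p, q in zip(a0, a1)]
y_ab = [p + 2.0 * q for p, q in zip(b0, a1)]
lo, hi = bootstrap_ci(y_a, y_ab)
print(round(lo, 3), round(hi, 3))
```

With only 32 rows the interval is typically wide relative to the index itself, which is the pattern this section's plots are meant to expose.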
10.2 Parameter-Output Relationships
These scatterplots show the direct relationship between each top parameter’s sampled value and the model output. They reveal whether the response is roughly linear or nonlinear (threshold behavior or saturation); the Sobol decomposition applies in either case. Each point is one model evaluation from the Sobol design matrix.
- Linear trend with tight spread: parameter has a strong, additive effect, so \(S_i\) will be high
- Nonlinear trend (curve, threshold) with tight spread: the main effect is nonlinear but is still captured by \(S_i\); curvature alone does not imply interactions
- Cloud with no trend: parameter has a low first-order effect for this variable
- Funnel shape (spread changes with x): heteroscedasticity, which suggests the parameter’s effect depends on other inputs, so \(T_i > S_i\)
10.3 Convergence with Sample Size
To assess whether \(N = 32\) is sufficient, we recompute Sobol indices using increasing fractions of the base sample. If indices stabilize at the full sample size, this supports adequate convergence; indices still drifting indicate that a larger N would improve precision.
Our base sample \(N = 32\) is modest by Sobol analysis standards (Nossent et al. 2011 recommend N = 500+ for stable individual indices). It was chosen to balance computational cost (~1,440 model runs per site × 17 sites) against the need for results across many sites. Where indices show ongoing drift at \(N = 32\), a larger ensemble (Phase 4 workplan) would improve precision. The current results are considered reliable for parameter ranking and factor prioritization, which depend only on relative ordering, even though individual index point estimates carry estimation uncertainty.
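The sub-sampling check can be sketched as follows: the same Jansen total-order estimate is recomputed on growing prefixes of the base sample. The fractions, function names, and toy data are assumptions for illustration.

```python
import random
from statistics import pvariance

def total_order_prefix(y_a, y_ab, n_sub):
    # Jansen total-order estimate from the first n_sub base-sample rows.
    ya, yab = y_a[:n_sub], y_ab[:n_sub]
    num = sum((p - q) ** 2 for p, q in zip(ya, yab)) / (2 * n_sub)
    return num / pvariance(ya)

def convergence_curve(y_a, y_ab, fractions=(0.25, 0.5, 0.75, 1.0)):
    n = len(y_a)
    sizes = [max(2, round(f * n)) for f in fractions]
    return [(m, total_order_prefix(y_a, y_ab, m)) for m in sizes]

# Toy data: y = x0 + 2 * x1, with x0 resampled in the hybrid matrix A_B^(0).
rng = random.Random(42)
n = 32
a0 = [rng.random() for _ in range(n)]
a1 = [rng.random() for _ in range(n)]
b0 = [rng.random() for _ in range(n)]
y_a = [p + 2.0 * q for p, q in zip(a0, a1)]
y_ab = [p + 2.0 * q for p, q in zip(b0, a1)]
curve = convergence_curve(y_a, y_ab)
for m, t in curve:
    print(m, round(t, 3))
```

If the last few estimates differ by less than the bootstrap interval width, the index can be treated as stable at the full sample size; larger jumps between successive sizes indicate that more base samples are needed.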
11 Synthesis
What drives uncertainty? The variance decomposition (Section 5) separates biological parameter uncertainty from environmental forcing.
What to calibrate first? Parameter rankings (Section 6) identify the top candidates for reducing predictive uncertainty. Parameters whose importance varies across sites (Figure 3) need spatially explicit calibration.
What can be simplified? Factor fixing (Section 8) uses the dummy parameter as a rigorous noise floor to identify inputs that can be set to nominal values.
Is the model additive? Additivity diagnostics (Section 9) indicate whether parameter effects are independent or interact. Strong interactions (large \(T_i - S_i\) in Section 7) suggest that calibrating parameters independently may be insufficient.
12 References
Puy, A., et al. (2022). sensobol: An R Package to Compute Variance-Based Sensitivity Indices. Journal of Statistical Software, 102(5).
Saltelli, A., et al. (2010). Variance based sensitivity analysis of model output. Design and estimator for the total sensitivity index. Computer Physics Communications, 181(2), 259-270.
Dietze, M. (2017). Ecological Forecasting. Princeton University Press.
Gelman, A., Pasarica, C. & Dodhia, R. (2002). Let’s practice what we preach: Turning tables into graphs. The American Statistician, 56(2), 121-130.
Nossent, J., Elsen, P. & Bauwens, W. (2011). Sobol’ sensitivity analysis of a complex environmental model. Environmental Modelling & Software, 26, 1515-1525.