1. Introduction
Monitoring the growth of agricultural crops during the whole growing season is important for increasing crop yields and reducing costs and input resources for the agricultural sector [
1]. Spatially-explicit knowledge of biophysical variables, such as the leaf area index (LAI) and the chlorophyll content (Chl), is fundamental for the understanding of agricultural ecosystems [
2]. Moreover, the variables as LAI are used as inputs of important agricultural models, such as the adapted FAO-56 Penman-Monteith (FAO-56 PM) model [
3] which derives the reference (ET
o) and potential (ET
c) crop evapotranspiration. Different empirical methods to estimate evapotranspiration (ET) have been developed over the last 50 years by numerous specialists worldwide. Testing the accuracy of these methods in situ is however time-consuming and costly, and yet ET data are frequently needed at short notice for irrigation scheduling design. The Penman-Monteith model (FAO-56 PM) [
3], also referred as the one-step or direct approach, is considered to offer the best results with a minimum possible error. During recent years, there has been a consistent effort to estimate vegetation parameters from remotely sensed data, allowing to adapt the Penman-Monteith equation for direct use with Earth observation (EO) based LAI and surface albedo retrieval [
4], minimizing time and cost. Consequently, nowadays it is the most commonly used method for the estimation of ET [
5,
6,
7].
LAI is defined as the one half of the total leaf area per unit horizontal ground surface area (m
2 leaf per m
2 surface or dimensionless) [
8] and Chl is given by the weight of green pigment per leaf surface (g Chl per m
2 leaf) [
9]. By quantifying and monitoring both parameters, the photosynthetic capacity [
10,
11], nutrient stress [
12,
13] and development stage [
14,
15] of crops can be detected. There are two approaches to remotely estimate Chl in crops. One of them is by the assessment of the leaf chlorophyll content (LCC) [
16] and the other approach is based on canopy chlorophyll content (CCC), the total crop chlorophyll content, which is defined as the product of LCC and LAI [
17]. Different studies have demonstrated that a direct estimation of CCC is more robust and accurate than an estimation based on the product of the individual estimation of LAI and LCC [
18]. Therefore, the remote estimation of CCC has been preferred the one of the LCC.
The direct field measurements of biophysical parameters require continuous updates and can be extremely time-consuming and expensive [
19]. Therefore, remote sensing from satellite, aerial and unmanned aerial vehicle platforms has become a popular technique for monitoring agricultural areas because of its ability to acquire synoptic information at different times and spatial scales [
20,
21]. There are different approaches for estimating biophysical parameters from remotely sensed data, i.e., (1) empirical retrieval methods, which consist of relating the biophysical parameter of interest against spectral data (e.g., vegetation indices—VIs), (2) statistical category, which defines regression functions according to information from remote sensing data (e.g., artificial neural network—ANN) and (3) physically-based retrieval methods, which refers typically to the inversion of radiative transfer models (RTMs) against remote sensing observations.
The empirical approaches provide an acceptable level of accuracy in the estimation of important biophysical parameters and can be calculated without high computational demands, but due to the sensitivity of VIs to vegetation type, the site and sensor characteristics, reliable ground measurements are required for model calibration. Furthermore, VIs are affected by the canopy structure, given different leaf angles, leaf spatial distribution and row orientation, which strongly influences canopy reflectance, which is further influenced by soil optical properties and sun-target-sensor geometry [
15]. Additionally, VIs are generally based on only a few spectral bands and a single-angle observation, leading to an under-exploitation of the full spectral and directional range available from new generation sensors. Despite the above, there are VIs based models that have been improved because they also account for other factors such as the bare soil response affecting the canopy reflectance, as is the case of the Clevers leaf area index by reflectance (CLAIR) model [
22].
The statistical methods can be either parametric or non-parametric [
23]. Parametric models rely on physical knowledge of the problem and build explicit parameterized expressions, assuming a finite set of parameters. Therefore, the complexity of the model is bounded even if the amount of data is unbounded. This makes them quite inflexible. Alternatively, non-parametric models are adjusted to predict a variable of interest using a training dataset of input-output data pairs, building a non-linear regression model using the observed parameters as inputs and without taking into account physical restrictions. This makes them more flexible.
On the other hand, the physically based models of canopy reflectance consider the crop architecture, illumination, soil backgrounds and viewing geometries, making them applicable across multiple operational applications for crop biophysical parameters retrieval [
24]. Nevertheless, other restrictions, such as the intrinsic risk of oversimplifying the architecture of canopy for those RTMs fast enough for operational applications, have to be considered in this context [
25]. They generally correspond to a simple description of canopy architecture which may not represent the actual one, particularly regarding the clumped nature of many vegetation types. Moreover, in these approaches, radiometric measurement uncertainties have to be added to the simulations when building up the training datasets [
26]. The ill-posed problem has to be taken also into account when performing model inversion: The different parameter combinations may produce almost identical spectra, resulting in significant uncertainties in parameter estimation [
27].
For agricultural monitoring by remote sensing, the spatial resolution should be at least 20 m and, preferably, 10 m in order to make site-specific management possible [
28]. A temporal resolution of less than a week would be required to follow-up acute changes in the crop condition and provide a timely response in management practices. Specifically, extensive research has been carried out on ET crop estimation for management strategies using EO data, but to date, one of the major limitations for their applicability and technological transfer was the limited spatial and temporal resolution of the sensors [
29]. In this context, the Sentinel-2 (S2) mission from the European Space Agency (ESA) [
30] fulfills such operational requirements. S2 is a constellation of satellites, Sentinel-2A (S2A) and Sentinel-2B (S2B) at the moment, launched by the ESA on 23 June, 2015 and 7 March, 2017, respectively. They occupy the same sun-synchronous orbit at an altitude ~786 km, but are separated by 180°. Together, they provide better than 5-day revisit of the Earth’s land surfaces under cloud-free conditions with a 10, 20 and 60 m of pixel size. S2A and S2B carry on board a virtually identical sensor, named the multi-spectral imager (MSI), covering the visible, the near-infrared (NIR) and the shortwave-infrared (SWIR) spectral regions. Hence, the S2 mission improves the temporal, spatial and spectral resolution of remote sensing data, compared to other multi-spectral missions, such as Landsat, and offers great opportunities for agricultural monitoring. For S2, the operational biophysical parameter products associated with a quality indicator are provided through the Sentinel Application Platform (SNAP) toolbox and produced through an ANN which has been trained by simulated spectra generated from well-known RTMs [
31]. However, the accuracy of these products has been poorly tested, producing mostly improvable results [
32,
33]. It should be mentioned that both the S2 satellite images and the SNAP program are free of charge.
The present work aims at evaluating empirical (VIs), semi-empirical (CLAIR model) and ANN S2 products derived from SNAP biophysical processor methods for LAI and CCC estimation, the essential parameters to understand the evapotranspiration process.
Section 2.1 describes the four test sites used in this study, a strong point of this research as they are totally independent zones with their specific characteristics.
Section 2.2 details the field measurement protocol followed for the measurement of the LAI and LCC parameters in each of the zones, as well as the field instrument calibration equations.
Section 2.3 specifies the four datasets obtained from the field campaigns, detailing the crop types sampled and the number of samples of each crop type, among other characteristics. From these in situ data and the corresponding S2 spectral information, the ANN S2 products are applied (
Section 2.4), as well as the semi-empirical CLAIR model and the most commonly VIs used by the literature for the estimation of LAI and CCC (
Section 2.5).
Section 2.6 describes the FAO-56 PM equation adapted to remote sensing, specifying where the LAI parameter is within the equation.
Section 3 details the results obtained for the estimation of the LAI and CCC parameters with each of the methodologies, as well as the impact obtained from the LAI parameter on the FAO-56 PM equation. Finally,
Section 4 evaluates the results, highlighting the positive and negative aspects of each method and comparing with other studies.
Section 5 outlines the main conclusions, specifying the most optimal methodology for LAI and CCC retrieval and for the evapotranspiration estimation from the operational point of view.
4. Discussion
The importance of estimating biophysical variables lies in their use as input parameters for various models dedicated to the monitoring of seasonal plant functioning and the optimization of agricultural zones. If the estimation of variables, such as LAI and CCC, is through satellites as S2, information of these key variables can be obtained with high frequency and spatial resolution, increasing the accuracy of the dynamical models fed by their input. This study analyzes different methodologies to estimate LAI and CCC, the essential parameters for the understanding of the evapotranspiration process. Specifically, empirical (VIs), semi-empirical (CLAIR model with fixed and calibrated extinction coefficient) and ANN S2 products are derived from SNAP biophysical processor approaches for the LAI and CCC estimation, using S2 spectral information. Previous S2 retrieval works are commonly based on simulated data [
68] or on in situ data taken only in one test site [
69]. The main strength of this study is the use of four datasets with in situ multi-crop information obtained with different instruments from totally diverse test sites.
The VIs most commonly used by the scientific community for the LAI and CCC remote retrieval were applied to the four study areas. These indices have produced good results in previous studies both in the analysis of a single crop type [
54,
58] and in the studies with a variety of crops [
55,
63,
70]. In this study, one of the best indices, together with the EVI index, for the estimation of LAI in the four datasets i.e., successful in multi-crop data, was the SeLI index. This is a consistent result with a previous work that was done with LAI data taken in a multi-crop area [
33]. SeLI uses the red-edge region as the area of greatest influence of the LAI (B5—705 nm) and the NIR region as the control band (B8a—865 nm). Furthermore, for the estimation of the CCC parameter, one of the VI with the best statistics in each dataset is the CI
red-edge, which uses a two bands ratio located in the red-edge region (B5—705 nm, B7—783 nm). The S2 bands located in the red-edge area have been identified as key bands for the biophysical parameters estimation, mainly the LAI [
52] and CCC [
71]. One of the main critics of VIs is that it does not take into account other parameters that affect reflectance [
16]. Hence, the methodologies based on VIs have been developed and improved by incorporating the calibration factors, such as the influence of bare soil. This is the case of the semi-empirical CLAIR [
22] model composed of the WDVI, soil line concept and a correction coefficient corresponding to the minimum error between the observed and estimated LAI (α*). This model has been widely used for the estimation of the LAI parameter with different satellites such as DEIMOS-1 [
72] or WorldView-2 [
73], and even with the S2 satellite [
74]. In this study, this model is first estimated with a fixed value of α* parameter and after being checked with an optimized α* for each test site. The results with the optimized CLAIR model (CLAIR
opt) are better in the CAS19_IT and TAR16_IT datasets, especially in the latter where RMSE is halved. However, the CLAIR
opt model obtains worse results in the Bahía Blanca and Valencia test areas. This may be due to the fact that the fixed value of α* is 0.41 and the optimized value is approximately 0.30 for the image with the highest number of in situ points. However, if the optimized α* of another image is checked, it returns approximately 0.60, so 0.41 is an intermediate value, obtaining in the end better results with the non-optimized CLAIR model. Other studies have obtained also better results with a fixed α* value than by performing a calibration process [
72]. Therefore, from an operational point of view, the CLAIR model can be used with constant values. In case the area of study is specific, a calibration can be made but it should include the maximum in situ points from several images for an optimal calibration. On the other hand, several authors who have used the CLAIR model in their studies obtained saturation problems with high LAI values when WDVI value tended to
[
75]. This saturation process also occurs in our four datasets with the LAI values higher than 3. Generally, the studies based on the CLAIR model visually select approximately 50–100 bare soil pixels in the satellite image to estimate the soil line slope [
72]. In this case, the soil line slope value has been estimated from all the bare soil pixels in the image, masking the rest of the pixels. In this case, the SCL map derived by Sen2Cor have been used for masking, producing good results.
Regarding the ANN S2 LAI and CCC products, the LAI product has been validated in several studies, obtaining results that can be improved (R
2 = 0.5), both when applied to a single crop type [
76] and when validated with multi-crop data [
33]. Even studies that have carried out an analysis of different retrieval methodologies concluded that VIs produce better results than the ANN S2 LAI product [
77]. In this study, the ANN S2 LAI product obtains good results in the three datasets corresponding to Caserta, Tarquinia and Bahía Blanca test sites (R
2 > 0.7, RMSE < 0.9 g/m
2). The validation of the ANN S2 CCC product, on the other hand, has been done by very few studies and only for the wheat crop type. These studies obtained good results. A recent study [
76] obtained a R
2 = 0.72 with the ANN S2 CCC product and in situ wheat data, and [
78] has defined a relation between the ANN S2 CCC product and the canopy nitrogen, obtaining a R
2 = 0.90. The results of this study support and extend these results, obtaining a R
2 > 0.7 and RMSE < 0.7 g/m
2, for three totally different agricultural areas and with different crop types: Caserta, Tarquinia and Bahía Blanca test sites.
The special case of the Valencia study area should be mentioned. This study site is characterized by very small plots (40–100 m in length) and composed of some crop types (e.g., orange tree and lettuce) with a fraction vegetation cover (FVC) ≈ 50% [
33], leaving the soil an important role. Therefore, the VIs and the ANN S2 products do not present good statistics in this area. The CLAIR model includes a soil correction factor, the soil line slope concept, which is expected to present better statistics. However, the model obtains improvable statistics (R
2 = 0.46, RMSE = 0.84). The FVC has always influenced remote sensing studies, not only in the biophysical retrieval models, also in studies related to evapotranspiration and irrigation management [
79,
80]. Therefore, it is necessary to incorporate the FVC parameter in the biophysical models, mainly when they are applied to areas such as Valencia. A possible solution is to use the ANN S2 FVC product, but this must be validated to a further extent as very few studies are available concerning the quality analysis of this product. For example, a recent study [
77] has validated the ANN S2 FVC product in soybeans, canola, wheat, corn, oats, black beans and alfalfa crop types, obtaining good results.
In general, the Plant Canopy Analyzer (LAI-2000, LAI-2200) is the instrument most commonly used to measure in situ LAI parameter [
81,
82]. However, in many cases, this instrument is not available due to its high cost. In this study, the LAI data from Caserta, Tarquinia and Valencia were taken with the standard LAI instrument but the Bahía Blanca (Argentina) data were obtained with the PocketLAI smartphone application [
83]. This application has been used in several previous works with satisfactory results, mainly on rice crop studies [
84,
85] but also on other crop types such as vineyards [
86]. In this last study [
86], both PocketLAI and the digital hemispherical photography (DHP) instrument were used and compared with destructive LAI values, observing that PocketLAI measurements were closer to the ground-truth (RRMSE = 43.00% for PocketLAI; RRMSE = 99.46% for DHP). Furthermore, in a recent comparative study of different LAI measurement instruments against destructive LAI values [
39], the PocketLAI was found to a good alternative for the LAI operational measurement due to its cost-effective and similar results to destructive LAI values, obtaining a RMSE of < 0.65 for alfalfa, broad bean and maize crop types.
Finally, the influence of the biophysical parameter retrieval methodologies over the biophysical model outputs was analyzed. Crop potential evapotranspiration (ET
c) has been estimated using the FAO-56 PM equation adapted to remote sensing [
3]. In general, the ET
c is estimated using the reference evapotranspiration (ET
o) by a crop specific coefficient (K
c), focusing the different studies on optimizing this coefficient for the different crop types with satellite data. For example, [
5] optimized the K
c coefficient for wheat using SPOT-4 satellite data and [
7] optimized the K
c coefficient for vineyards with Landsat 8—RapidEye satellite data combination. In this study, on the other hand, the standard K
c values established by FAO [
3] for each crop type and growth stage were selected and the impact of the LAI parameter on the model was analyzed. The LAI obtained with the different methodologies was used as an input. As a result, the ET
c estimated with the ANN S2 LAI product is the closest to the ET
c estimated with the in situ LAI. Previous satellite evapotranspiration estimation studies have also used the ANN S2 LAI product as an input, producing satisfactory results in comparison with the soil water balance module in the environmental policy integrated climate (EPIC) model [
64]. In addition, this study found that the satellite retrieved LAI values provided a more pronounced seasonal trend compared to the K
c approach for both wheat and tomato crop types.
In short, the ANN S2 products are the closest match to the in situ data. From the operational point of view, this study provides evidence that ANN S2 LAI and CCC products have great potential to be used for the estimation and understanding of evapotranspiration, being one of the methodologies that provide similar results to ground-truth, without saturation problems. In future works, the authors want to validate the ANN S2 FVC product with the objective of incorporating this parameter in models applied to the study areas with low vegetation cover FVC ≤ 50%, as is the case of Valencia area.