Explainability of end and mid-season cotton yield predictors in CONUS

Abstract

In this study, we examined the effectiveness of integrating satellite-based crop biophysical parameters, meteorological conditions, and soil properties for end-season and mid-season cotton yield prediction in the continental United States (CONUS). We employed six machine learning algorithms: decision tree (DT), random forest (RF), adaptive boosting (AdaBoost), gradient boosting (GB), light gradient boosting machine (LightGBM), and extreme gradient boosting (XGBoost). After hyperparameter tuning based on Bayesian optimization, XGBoost was found to be the best method for both mid-season and end-season cotton yield prediction. Furthermore, we investigated the global importance of temporal and static features using the Shapley Additive Global importancE (SAGE) method to understand the driving factors of cotton yield prediction. The global feature importance analysis identified precipitation (P), enhanced vegetation index (EVI), and leaf area index (LAI) as the most important temporal features, and silt and pH as the most important soil properties.
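
To make the idea of SAGE-style global importance concrete, the following is a minimal sketch, not the pipeline used in the paper. It assumes a hypothetical linear stand-in model with three synthetic, independent features (not cotton data), and it replaces unknown features with their sample mean, which is only a reasonable way to marginalize them under those assumptions. Each feature's importance is its average reduction in mean squared error over all feature orderings, i.e. a Shapley value of the loss-reduction game. All function names here are illustrative.

```python
import itertools
import random


def model(x):
    # Hypothetical stand-in predictor: a fixed linear model.
    # Feature 0 matters most, feature 1 less, feature 2 not at all.
    return 3.0 * x[0] + 1.0 * x[1] + 0.0 * x[2]


def make_data(n=500, seed=0):
    # Synthetic data: independent standard-normal features plus small noise.
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n)]
    y = [model(x) + rng.gauss(0, 0.1) for x in X]
    return X, y


def restricted_loss(X, y, means, subset):
    # Mean squared error when only the features in `subset` are known;
    # every other feature is replaced by its sample mean.
    total = 0.0
    for x, target in zip(X, y):
        masked = [x[j] if j in subset else means[j] for j in range(len(x))]
        total += (model(masked) - target) ** 2
    return total / len(X)


def sage_style_importance(X, y):
    # Average each feature's marginal loss reduction over all orderings.
    d = len(X[0])
    means = [sum(col) / len(X) for col in zip(*X)]
    cache = {}

    def loss(subset):
        key = frozenset(subset)
        if key not in cache:
            cache[key] = restricted_loss(X, y, means, key)
        return cache[key]

    values = [0.0] * d
    perms = list(itertools.permutations(range(d)))
    for perm in perms:
        known = set()
        for j in perm:
            before = loss(known)
            known.add(j)
            values[j] += before - loss(known)
    return [v / len(perms) for v in values]


if __name__ == "__main__":
    X, y = make_data()
    for j, v in enumerate(sage_style_importance(X, y)):
        print(f"feature_{j}: {v:.3f}")
```

Enumerating every ordering is only feasible for a handful of features; with many temporal and static predictors, a sampling-based estimator would be needed instead. By construction, the importances sum to the difference between the loss with no features known and the loss with all features known.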

Publication
In IEEE International Geoscience and Remote Sensing Symposium 2023

Mustafa Serkan Isik
PhD in Geosciences

My research interests include the water cycle, remote sensing and satellite geodesy, and ML/DL algorithms.