individual
What campus are you from?
Daytona Beach
Authors' Class Standing
Leah Oberkehr, Senior
Lead Presenter's Name
Leah Oberkehr
Faculty Mentor Name
Siddharth Parida
Abstract
This study presents a systematic machine learning framework for predicting the compression index (Cc) and recompression index (Cr) of Florida soils, using a database of 497 consolidation tests conducted for the Florida Department of Transportation. Traditional methods for estimating compressibility rely on time-intensive laboratory tests or on simplified empirical correlations that fail to capture the nonlinear, multivariate nature of soil behavior. To overcome these limitations, this research evaluates multiple predictive models, including polynomial regression, ridge regression, support vector regression (SVR), and random forests (RF), to identify the most accurate and robust approach. Of these, the RF model performed best on the database. A systematic feature selection process using Pearson correlation, distance correlation, and Gini importance identified initial void ratio, moisture content, dry unit weight, liquid limit, and plasticity index as the parameters with the greatest influence on model predictions. Feature reduction showed that models using only these key parameters retained comparable predictive accuracy, which makes the approach more efficient and more practical for sparse datasets. The study also examined how soil classification affects model performance by comparing conventional Unified Soil Classification System (USCS) groupings with unsupervised K-means clustering. Data-driven clustering produced more homogeneous subsets, improving model robustness and mitigating the poor performance observed in sparse coarse-grained USCS categories. The proposed framework highlights the potential of machine learning, especially RF models combined with systematic feature selection and clustering, to yield interpretable, location-specific, and accurate predictive tools for geotechnical engineering applications, reducing reliance on time-consuming laboratory testing.
Did this research project receive funding support from the Office of Undergraduate Research?
No
A Machine Learning Based Systematic Framework for Modelling of Compression and Recompression Indices for Florida Soils
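The Pearson-correlation part of the feature screening described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `pearson` and `rank_features`, the column names, and every number below are hypothetical, and the toy values are made up for illustration rather than taken from the FDOT database. The distance correlation and Gini importance steps named in the abstract are not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no linear information
    return sxy / math.sqrt(sxx * syy)


def rank_features(features, target):
    """Return (name, r) pairs sorted by absolute correlation, strongest first."""
    scores = [(name, pearson(col, target)) for name, col in features.items()]
    return sorted(scores, key=lambda s: abs(s[1]), reverse=True)


# Toy, made-up values for illustration only (NOT from the FDOT database).
toy_features = {
    "initial_void_ratio": [0.6, 0.9, 1.2, 1.8, 2.5],
    "moisture_content": [20.0, 31.0, 45.0, 60.0, 95.0],
    "specimen_id": [3, 1, 4, 2, 5],  # hypothetical nuisance column
}
# Toy target, constructed to be exactly linear in the toy void ratio.
toy_cc = [0.15, 0.24, 0.33, 0.51, 0.72]

for name, r in rank_features(toy_features, toy_cc):
    print(f"{name:20s} r = {r:+.3f}")
```

Ranking by absolute value keeps strongly negative predictors near the top of the list. Pearson correlation captures only linear association, which is presumably why the abstract pairs it with distance correlation and Gini importance.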