Is this project an undergraduate, graduate, or faculty project?
Undergraduate
group
Authors' Class Standing
Daniel Oldham, Senior
Lead Presenter's Name
Daniel Oldham
Faculty Mentor Name
Mihhail Berezovski
Abstract
Data mining and statistical analysis software are becoming increasingly widespread in business, where they are used to maximize company efficiency, and the technology has many beneficial industry and real-world applications. Using clinical patient data from a foster care organization based in Gainesville, Florida, this research attempts to gain insights into children’s lives using R statistical analysis software and the Orange Data Mining Suite. In total, 8 years’ worth of data, comprising 250,000 observations and 52 variables on both the children and their parents, was imported, cleaned, and leveraged for predictive analysis using these programs. The goal is to help the organization identify the characteristics of the children and parents who are most at risk for undesirable outcomes. Various insights have been discovered, and more are being found as the research continues. Given the scale of this project’s data, it has become increasingly necessary to filter the data into “clusters” for analysis, which is where the research is currently headed. This research provides a glimpse into data mining for unique industrial purposes and sheds light on the diverse capabilities of state-of-the-art data mining and statistical programs.
Did this research project receive funding support from the Office of Undergraduate Research?
Yes, Ignite Grant
Data Mining to Benefit Foster Care Children and Parents