Author Information

Daniel Oldham

Is this project an undergraduate, graduate, or faculty project?

Undergraduate

group

Authors' Class Standing

Daniel Oldham, Senior

Lead Presenter's Name

Daniel Oldham

Faculty Mentor Name

Mihhail Berezovski

Abstract

Data mining and statistical analysis software are becoming increasingly widespread in business, where they are used to maximize company efficiency, and the same technology has many beneficial industry and real-world applications. Using clinical patient data from a foster care organization based in Gainesville, Florida, this research attempts to gain insights into children’s lives using the R statistical analysis software and the Orange Data Mining Suite. In total, 8 years’ worth of data, comprising 250,000 observations and 52 variables on both the children and their parents, was imported, cleaned, and used for predictive analysis in these programs. The goal is to help the organization identify the characteristics of the children and parents who are most at risk for undesirable outcomes. Various insights have been discovered, and more are being found as the research continues. Given the scale of the data, it has become increasingly necessary to divide the data into “clusters” for analysis, which is where the research is currently headed. This research provides a glimpse into data mining for specialized industrial purposes and sheds light on the diverse capabilities of state-of-the-art data mining and statistical programs.

Did this research project receive funding support (Spark or Ignite Grants) from the Office of Undergraduate Research?

Yes, Ignite Grant

Data Mining to Benefit Foster Care Children and Parents
