Big Data to the Rescue
UofSC researchers crunch data — lots of it — to tackle real-world problems
By Chris Horn, email@example.com, 803-777-3687
Sensors on an Army helicopter gather critical information about daily wear on gears and bearings, providing a cost-saving and life-saving guide for replacing parts before they fail.
In a S.C. farm field, instruments continuously track data points such as humidity, temperature and precipitation, allowing researchers to model the amount of deadly aflatoxin, a toxin produced by fungi, accumulating on corn crops about to be harvested.
And around the state, health care researchers on separate projects are studying patient data to improve utilization rates of HIV medical care and reduce the frequency of hospital readmissions.
A common thread runs through each of these projects being conducted by University of South Carolina scientists: complex data analytics. In each case, USC researchers are finding ways to extract useful information from huge data sets, then apply it in ways that make an extraordinary impact.
Complex data research has been gaining momentum in recent years, and USC has become not only a major player but also a source of data analytics education and training for students.
“I’m of the mind that everyone needs to be able to do data analytics, from STEM fields to economics,” says John Rose, a professor in the Department of Computer Science and Engineering who teaches a two-semester course on the subject. “I’m an evangelist for data analytics.”
‘I can predict what you’re going to do…’
If Rose is an evangelist for big data analytics, Banky Olatosi is among the converted. Olatosi, a clinical associate professor in the Department of Health Services Policy and Management at the Arnold School of Public Health, believes it’s a way to address some of health care’s big challenges.
“Big data analytics has been a passion of mine for the past several years,” Olatosi says. “If you think about it, this is what the Googles, the Amazons already use. It’s proven in terms of, if I get the right data, I can get insight into human behavior. And once I get insight into that behavior, I can predict what you’re going to do even before you know what you’re going to do.”
That might sound a little spooky, but Olatosi and Xiaoming Li, a health promotion, behavior and education professor in the Arnold School, are hoping that predictive analytics can help address the medical care needs of South Carolina’s HIV-positive population. They’re collaborating on a project to gather anonymous data on that population, using sophisticated algorithms to predict when HIV-positive clients might drop out of medical care. Keeping those individuals engaged in regular medical care means better health outcomes for the patients and lowers the risk of the virus being transmitted to others.
In a separate project, nursing professor Ronda Hughes, director of the Center for Nursing Leadership, is partnering with Palmetto Health, one of South Carolina’s leading health care providers, to change practices in ways that can lower the number of patients rehospitalized within 30 days of discharge. Those readmissions, which often occur with elderly patients or those with many health care needs, incur steep penalties for hospitals from Medicare and Medicaid authorities, sometimes totaling millions of dollars.
Hughes’ project is aimed at improving protocols so that patients do not need hospital services so soon after being discharged, sparing them unnecessary and expensive acute care and reducing costly hospital penalties in the process.
“We want to move toward making more data-informed decisions in caring for patients instead of just experience-informed decisions,” Hughes says. “When patients have multiple chronic conditions, data analytics can do the heavy lifting for you in terms of knowing which protocols and interventions are the best to follow for treatment.”
Hughes’ project involves USC faculty from engineering, public health, business and pharmacy and taps into data from electronic health records, state and community databases and even national data.
“A lot of current clinical protocols are essentially checklists, including some diagnostic algorithms that all clinicians use,” Hughes says. “What we’re trying to do is create sophisticated predictive analytic-driven solutions that you can’t do in your head, something close to artificial intelligence.”
‘Like mixing a cake…’
When mechanical engineering professor Abdel Bayoumi and his team began a cost-benefit analysis of military helicopter maintenance in 1998, it became clear that the terabytes of data they were collecting on vibrations and temperature could — if properly interpreted — solve real problems. Twenty years later, the Center for Predictive Maintenance, now part of the McNAIR Center for Aerospace Innovation and Research, has saved the U.S. Army millions of dollars in helicopter maintenance costs and likely saved lives in the process.
“Just as one example, using data analytics we found a design fault on one part in the tail rotor gearbox of the Apache helicopter,” Bayoumi says. “It took the company that made the component several months to agree that there was a problem with the design, but when they redesigned it, they invited us to be part of the process. Fixing that one part saved the Army $52 million annually for the Apache helicopter fleet.”
Bayoumi’s team gathers data from a variety of areas — from sensors that measure vibration and temperature to human input in the form of maintenance records and technical manuals from the crew actually flying the helicopters. “That way you get a nice mix of data, like mixing a cake,” Bayoumi says. “We then develop algorithms to describe the root cause of the problem that might be premature failure of a bearing or some other part.”
Bayoumi’s group is leveraging its expertise in helicopter sensor analytics to work with Safran Engineering Services, whose broad range of aerospace services includes manufacturing aircraft wiring harnesses. The group is also using big data techniques in a collaboration with an Egyptian university to design and build a massive pump that will transfer water from the Red Sea to a desalination plant. Bayoumi’s team will be responsible for creating a digital “twin” that will allow the pump to be a smart system that makes decisions.
In conjunction with the algorithms and models it creates in house, the Center for Predictive Maintenance is tapping into a trove of software that Siemens Corp. donated to the university last year, which Bayoumi believes will be helpful in the Egyptian water pump research. And with a new setup in its two-decade-old helicopter research, center staff will soon be able to apply data analytics more broadly, to the entire helicopter airframe and not just the drive train.
‘You have to gather a lot of data’
It’s virtually unobservable, but that doesn’t lessen aflatoxin’s deadly effects. A toxin produced by fungi that grow on corn, aflatoxin causes cancer and can be fatal. Children who drink milk from cows that consumed aflatoxin-contaminated corn can experience impaired growth. That’s why the U.S. Department of Agriculture keeps a watchful eye on aflatoxin levels across the country.
“You have to gather a lot of data from environmental conditions including moisture and temperature to build a model to predict which fields or which parts of a field are at risk,” says Gabriel Terejanu, an associate professor of computer science and engineering. “The problem is that variability is very high. When you harvest corn, the aflatoxin-affected kernels might end up somewhere in the truck that’s not sampled.”
Terejanu is working with Arnold School of Public Health researchers Buz Kloot and Anindya Chanda to deploy data-gathering instruments in S.C. farm fields to create predictive models. “We have a model, but we’re not happy with the validation so far,” he says. “I would like to have more field measurements (crop insurance claims, things like that) to validate our model.”
Once a valid predictive model is created, farmers could use it to decide the best time to harvest a crop, and grain elevator companies would know which shipments from certain farms need oversampling to check for contamination. Regulatory agencies would know where to deploy more inspectors into the field. Such modeling could be applied anywhere in the world.
Terejanu says aflatoxin is especially problematic in certain parts of Africa where grain isn’t stored properly in low-moisture silos. He and Chanda have submitted a funding proposal to the Gates Foundation for a project focused on aflatoxin modeling there. “Several hundred people died in Kenya several years ago because of aflatoxin exposure,” Terejanu says. “An accurate predictive model could help prevent that.”
Training the next generation
As this sampling of projects at USC indicates, the applications of data analytics reach far and wide. John Rose’s “Big Data Analytics” course attracts not only MBA and statistics students (the types of majors you might expect) but also students from STEM disciplines and even sports management. “Everyone has seen ‘Moneyball,’” Rose quips.
The two sections of the course fill quickly, and they are part of a set of four courses, which include data visualization, another component of big data analytics. A minor in data science, developed cooperatively between the Department of Statistics in the College of Arts and Sciences and the Department of Computer Science and Engineering, is now available, and the two units are working on a certificate and an undergraduate double major. In addition, the Darla Moore School of Business offers a certificate in data analytics for graduate and undergraduate students.
It’s anyone’s guess how far big data research will go from here, but Rose points to deep learning as an obvious example of where it is headed. Sophisticated data sets coupled with computers and real-time sensors have made autonomously driven cars a reality.
“Now we have the computational infrastructure to handle all of this data,” he says. “But just because you have a hammer, not everything is a nail. Sometimes we have to know exactly what tool to use, and that’s what we’re teaching our students.”