So Much Data, So Little Information!

When it comes to the manufacturing world, we have had process control systems and data historians for decades, but we have mostly lagged behind the “simpler” industries, like retail, when it comes to converting data into actionable information. If you look at the ways in which online advertisers and big chain stores use data, you will perhaps feel a bit exposed by how much they can glean from our buying and browsing habits. But setting aside the problems that come with the inescapable monitoring of everyone’s behavior in our society, we have to admit that it is a well-oiled machine that delivers huge value to online and brick-and-mortar retailers. Even beyond marketing and advertising, those same retailers have developed sophisticated, data-based algorithms to understand their supply chains and logistics, predicting problem areas and taking proactive action to avoid those problems before they arise. So why do industries like ours, which have had data for many more years, still struggle to convert that data into something meaningful and valuable?

There are several reasons that have led to our current situation, but diving deep into that history isn’t particularly useful unless it helps us to make progress in our own endeavors to convert data into information. So, instead of focusing on what has happened, let’s focus on what we can do to change the trajectory and deliver value from our data.

The first and most critical action is to organize and contextualize our data. On its face, this shouldn’t be too hard, because we have engineers and process technicians who understand the data and have a general idea of which input variables will affect which output variables. However, mimicking human thought and pattern detection with data models and predictive algorithms is not an easy task. There are a few high-level steps that must be taken with our data to develop this kind of automated pattern detection and subsequent action:

1. Gather: Build a place where all data streams can reside. Automate the ingestion of data from various sources into that single location. Label parameters well and then organize them into simple, easy-to-navigate data tables. Map your process in total and see if there are variables that you know or suspect are important but which are not currently being recorded automatically.

2. Integrate: Most of our data streams are not immediately compatible. Consider an injection molding process where resin and additives are mixed before going to the injection molding machine. 

The resin and additives are received as batches, each with some measure of variation from the last. The resin blend won’t incorporate a new batch of resin at exactly the same time as it switches to a new batch of each additive, so we have timing inconsistencies there. Then, once the resin blend is feeding the machine, that process could run for days or weeks, producing a steady stream of time-based data that is not in a batch-level format. Deciding whether you can reduce the process data to an average for the entire run, or whether some frequency of the time-based data must be preserved, has major consequences for the analysis. Integrating your data streams means carefully thinking through these questions (the first sketch following this list shows one way to line up batch-level and time-based data). You can gather all of the data, but integrating it into a holistic picture that can be converted into decision-quality information is the heart of any data analytics exercise.

3. Correlate: Find connections between your data streams. Always remember that correlation does not equal causation. Ice cream sales and shark attack frequency have a startling correlation, but no one believes that sharks preferentially attack people who have recently eaten ice cream. Instead, both map to another variable: the outside temperature. Likewise, we can develop correlations and links among our data streams, but we have to hold off on any conclusions at this early stage (a simple correlation scan appears in the second sketch after this list).

4. Model: Data modeling has a meaning within the IT function that has to do with how data is gathered, streamed, and organized; in other words, the first two steps we have already discussed. In this context, however, we are talking about how to take simple correlations and test whether there is some level of causation or predictive power. Such correlations could be direct and simple, or they could involve many variables acting together in a complex manner. Principal Component Analysis can help narrow the field of variables down to a manageable number for further analysis. Multivariate regression can then be used to understand which variables best predict a given outcome, whether desirable or undesirable. There are quite a few methods at your disposal for this task, too many to dig into in this article, and many software programs, including Excel, offer built-in regression functions (the third sketch after this list walks through one possible version). One way or another, though, this is the money-making step. This is where you finally convert all your data into nuggets of actionable information.

5. Confirm and Act: Every regression can be overfitted and lead to erroneous conclusions. You must conduct whatever experiments you can to test what your model is telling you. In most cases, the first step is to segregate your data so that the majority is used to build the model, and a separate data set is then run through the model to see whether it predicts the outcomes properly (the third sketch after this list includes such a held-out check). As you continue to gather more data from normal operations or from experimentation, you can refine and correct your predictive model.
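
To make these steps more concrete, here is a minimal sketch of the integration step in Python with pandas. The file names and column names (such as "resin_batch_id" and the historian export) are hypothetical placeholders, not a prescribed schema. It attaches the resin batch that was active at each machine reading and then reduces the time-based readings to hourly averages.

```python
# A minimal sketch of aligning batch-level and time-based data with pandas.
# File names and column names are hypothetical placeholders.
import pandas as pd

# Time-stamped machine readings, e.g. one row per minute from the historian.
machine = pd.read_csv("molding_machine.csv", parse_dates=["timestamp"])

# Batch-level resin records: one row per batch, with the time it went into use.
resin = pd.read_csv("resin_batches.csv", parse_dates=["start_time"])

# Attach the resin batch that was active at each machine reading.
machine = machine.sort_values("timestamp")
resin = resin.sort_values("start_time")
combined = pd.merge_asof(
    machine,
    resin,
    left_on="timestamp",
    right_on="start_time",
    direction="backward",  # use the most recent batch started before the reading
)

# Optionally reduce the time series to hourly averages of the numeric columns,
# keeping the batch identifier so the two granularities stay linked.
hourly = (
    combined.set_index("timestamp")
    .groupby("resin_batch_id")
    .resample("1h")
    .mean(numeric_only=True)
    .reset_index()
)
print(hourly.head())
```

Whether hourly averages are the right resolution is exactly the judgment call described in step 2; the same code works with any resampling frequency.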
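Continuing with the same hypothetical table, a first-pass correlation scan for step 3 can be as simple as a pandas correlation matrix. The quality column "part_weight" is an assumed example, and the result is only a screening tool, not evidence of causation.

```python
# A minimal sketch of a first-pass correlation check with pandas.
# Assumes the "hourly" table from the previous sketch, with a hypothetical
# measured quality column ("part_weight") alongside the process inputs.
numeric = hourly.select_dtypes("number")

# Pairwise Pearson correlations between every variable.
corr_matrix = numeric.corr()

# Rank the inputs by the strength of their correlation with the quality outcome.
# Correlation is not causation; this list only suggests what to model next.
print(
    corr_matrix["part_weight"]
    .drop("part_weight")
    .abs()
    .sort_values(ascending=False)
)
```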
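For steps 4 and 5, one possible sketch, assuming scikit-learn is available and using the same hypothetical column names, reduces the inputs with Principal Component Analysis, fits a regression on the leading components, and then checks the model against data it never saw during fitting.

```python
# A minimal sketch of the Model and Confirm steps using scikit-learn.
# Column names are hypothetical; "hourly" comes from the earlier sketch.
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = hourly.dropna()
X = data[["barrel_temp", "injection_pressure", "cycle_time", "resin_moisture"]]
y = data["part_weight"]

# Hold back a portion of the data so the model can be tested on unseen rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

model = make_pipeline(
    StandardScaler(),      # put variables on a comparable scale
    PCA(n_components=3),   # keep the components explaining most of the variance
    LinearRegression(),    # regress the outcome on those components
)
model.fit(X_train, y_train)

# R^2 on the held-out set is a first guard against overfitting.
print("held-out R^2:", model.score(X_test, y_test))
```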

Once you have confidence in the model, you can put alarms and interlocks into place, just as you would for safety alarms. Only in this case, you won’t be reacting to a single variable or to a quality outcome far downstream. Instead, you will look for a combination of parameter values that your model predicts will be a problem for the eventual product being made. When you see the input variables trending toward that undesirable combination, you can shut down the process or alert an engineer. Thus, you will avoid making a lot of scrap while waiting on the inherently slow feedback loop from a quality failure to corrective action.
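As a rough illustration of that kind of alert, the sketch below reuses the fitted "model" pipeline from the earlier example and flags any combination of input readings whose predicted outcome falls outside spec. The spec limits and reading values are made-up numbers for illustration only.

```python
# A minimal sketch of turning the fitted model into an alert.
# Assumes the "model" pipeline from the earlier sketch; limits are hypothetical.
import pandas as pd

SPEC_LOW, SPEC_HIGH = 24.5, 25.5  # hypothetical limits on predicted part weight

def check_reading(reading: dict) -> None:
    """Predict the quality outcome for one set of input readings and flag it."""
    row = pd.DataFrame([reading])
    predicted = model.predict(row)[0]
    if not (SPEC_LOW <= predicted <= SPEC_HIGH):
        # In practice this could page an engineer or trigger an interlock.
        print(f"ALERT: predicted part weight {predicted:.2f} is outside spec")

check_reading({
    "barrel_temp": 232.0,
    "injection_pressure": 1180.0,
    "cycle_time": 31.5,
    "resin_moisture": 0.08,
})
```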

To be sure, the above steps are a simplification of a sophisticated analysis project. However, it has been my experience that many manufacturing plants don’t have a good understanding of this conceptual plan of attack. They have engineers who can do regression and process modeling, but they don’t have the data well organized and integrated. Maybe not all of the data is available to the right people. Often, the data itself is not being gathered, or at least not all of the key parameters are. These are high-level gaps that must be addressed before anyone, no matter how skilled in statistics or engineering, can hope to develop a good predictive model. Done effectively, this approach can predict machine failures as well as product quality problems, allowing for a much more efficient factory overall.
