
Self-serve analytics streamlines operations for pharma manufacturers

Integrated data and user-based visualizations put timely insights in the hands of experts for more informed decision-making
As pharmaceutical manufacturers strive for greater efficiency and productivity, many are looking for a self-service analytics (SSA) solution to streamline these efforts. But what is self-service analytics, and how would such a solution best benefit our customers? Applied Materials Pharma group has examined the many human, technology and process aspects that need to be considered in developing meaningful SSA. Ultimately, the solution needs to integrate different data types from many different locations with human expertise so that informed decision-making becomes easier.

Defining self-service analytics

Self-service data analytics is essentially a tool that enables users to access and analyze data without relying on IT or system specialists for support. It entails being able to integrate across data sources and easily apply machine learning through a self-service interface. Most often in biopharma manufacturing today, the end user is a business user who wants information to address a specific issue or assess a KPI. This user could be a process development engineer, a manufacturing operator, or a manufacturing manager, among others.

Currently, data is stored in silos: building monitoring systems that capture temperature, humidity, pressure, CO2 levels, the entry and exit of personnel and the like; lab information management systems; and systems that store process data. Until it is integrated, complete data cannot be efficiently processed into information and used for effective modeling. We often help our customers integrate these disparate systems into a data hub, or data lake, where all this information is fully accessible.
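To make that integration step concrete, here is a minimal sketch assuming hypothetical extracts from a building monitoring system, a LIMS, and a process historian; all column names and values are illustrative, not from any specific vendor's schema:

```python
import pandas as pd

# Hypothetical extracts from three siloed systems (illustrative only).
bms = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-01-05 08:00", "2024-01-05 09:00"]),
    "room_temp_c": [20.1, 20.4],
    "humidity_pct": [45.0, 46.2],
})
lims = pd.DataFrame({
    "batch_id": ["B-101", "B-101"],
    "sample_time": pd.to_datetime(["2024-01-05 08:30", "2024-01-05 09:30"]),
    "titer_g_per_l": [1.8, 2.1],
})
process = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-01-05 08:00", "2024-01-05 09:00"]),
    "batch_id": ["B-101", "B-101"],
    "ph": [7.02, 7.05],
})

# Time-align the environmental data to the process records.
hub = pd.merge_asof(
    process.sort_values("timestamp"),
    bms.sort_values("timestamp"),
    on="timestamp",
)

# Attach the nearest-in-time lab result for each process row, per batch.
hub = pd.merge_asof(
    hub.sort_values("timestamp"),
    lims.sort_values("sample_time").rename(columns={"sample_time": "timestamp"}),
    on="timestamp",
    by="batch_id",
    direction="nearest",
)
print(hub)
```

In practice the extracts would come from live connectors rather than in-memory frames, but the core move is the same: everything keyed to a common timeline and batch identity in one place.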

Need for data fusion

Access to the data often requires a data fusion layer to bring it all together in an easy way, contextualize it, align it, and even deal with different types of data. The nature of the data changes across sources; standard time series data differs from spectral data, for example.
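A minimal sketch of such a fusion step, using simulated data, might resample a fast 1 Hz temperature stream and sparse spectral measurements onto a common time grid, reducing each spectrum to features a model can consume; the feature choice (peak absorbance) is purely illustrative:

```python
import numpy as np
import pandas as pd

# Hypothetical raw streams: 1 Hz temperature and sparse NIR-style spectra.
t_index = pd.date_range("2024-01-05 08:00", periods=600,
                        freq=pd.Timedelta(seconds=1))
temperature = pd.Series(37 + 0.2 * np.random.randn(600), index=t_index)

spectra_times = pd.to_datetime(["2024-01-05 08:02", "2024-01-05 08:07"])
# Each spectrum is a vector of absorbances; 5 wavelengths here for brevity.
spectra = pd.DataFrame(np.random.rand(2, 5),
                       index=spectra_times,
                       columns=[f"wl_{i}" for i in range(5)])

# Fusion step 1: bring the fast stream onto a common 1-minute grid.
temp_1min = temperature.resample("1min").mean().rename("temp_c_mean")

# Fusion step 2: reduce each spectrum to a scalar feature, then align it
# to the same grid as the temperature summary.
spec_features = spectra.max(axis=1).rename("peak_absorbance")
spec_1min = spec_features.resample("1min").last()

fused = pd.concat([temp_1min, spec_1min], axis=1)
print(fused.head())
```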

Even what it means to analyze the data can vary: it could mean machine learning models, AI, or neural networks, or it could be calculations, such as a mechanistic relationship you just want an equation for. The software needs to be able to handle these different needs and possibly merge them together.
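As one illustration of merging the two, the sketch below fits a simple data-driven model for a bioreactor mass-transfer coefficient (kLa) and feeds its prediction into the classic mechanistic oxygen transfer relation, OTR = kLa (C* - CL); all numbers are invented for the example:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical batch conditions: [agitation_rpm, airflow_lpm].
X = np.array([[200, 10], [250, 12], [300, 15], [350, 18]])
measured_kla = np.array([12.0, 15.5, 20.1, 25.3])  # 1/h, illustrative

# Data-driven piece: learn kLa from operating conditions.
kla_model = LinearRegression().fit(X, measured_kla)

# Mechanistic piece: oxygen transfer rate, OTR = kLa * (C* - C_L).
def oxygen_transfer_rate(kla_per_h, c_star_mg_l=7.5, c_l_mg_l=3.0):
    return kla_per_h * (c_star_mg_l - c_l_mg_l)  # mg/(L*h)

# Merged: predict kLa for a new condition, then apply the equation.
kla_pred = kla_model.predict(np.array([[275, 14]]))[0]
print(f"Predicted kLa: {kla_pred:.1f} 1/h, "
      f"OTR: {oxygen_transfer_rate(kla_pred):.1f} mg/(L*h)")
```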

Contextualizing data for the purpose

Data provides little information without context. To get the most valuable information from a model, the data needs to be contextualized in terms of the big picture – what it is you are trying to learn from the model. For example, maybe you’re looking to determine the ability of the control system to keep up with the demands of the process. You could use a model that looks at how much the temperature fluctuates around a set point during a certain phase of a sterilization cycle. You wouldn’t use just the raw temperature in your model, because that is constantly streaming. You would instead specify that it’s the temperature during a certain part of the process, and capture not only what the temperature is, but how fast it’s increasing. Models use this type of preprocessed data.
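A short sketch of that preprocessing, using an invented sterilization-cycle log, shows how a phase label turns a raw temperature stream into the two contextualized features just described:

```python
import numpy as np
import pandas as pd

# Hypothetical sterilization-cycle log with a phase label per sample.
idx = pd.date_range("2024-01-05 10:00", periods=8, freq="min")
log = pd.DataFrame({
    "temp_c": [25, 60, 95, 120, 121.2, 120.8, 121.1, 80],
    "phase": ["heat", "heat", "heat", "hold",
              "hold", "hold", "hold", "cool"],
}, index=idx)

SETPOINT_C = 121.0

# Contextualized feature 1: only the hold phase matters for this question.
hold = log[log["phase"] == "hold"]
fluctuation = (hold["temp_c"] - SETPOINT_C).abs().max()

# Contextualized feature 2: not just the value, but how fast it rises.
heat = log[log["phase"] == "heat"]
ramp_rate = np.gradient(heat["temp_c"].to_numpy()).mean()  # deg C / min

print(f"Max deviation from setpoint during hold: {fluctuation:.2f} C")
print(f"Mean heat-up ramp rate: {ramp_rate:.1f} C/min")
```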

The idea of data collection and fusion can sometimes be complex, as in a model we developed for contamination control, illustrated in figure 1 below.

Figure 1: Multiple data types and locations integrated for a contamination control model

We needed to look at the cleaning history of the room, as well as the operations and maintenance personnel coming in and out of the room, alongside particular events. Combining all this data, we were able to determine that there is a contamination risk for a certain unit operation when specific activities take place.
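The sketch below mimics that feature-assembly step with invented cleaning and room-entry records; the threshold rule at the end is a stand-in for a trained risk model, not a description of our actual one:

```python
import pandas as pd

# Hypothetical event streams feeding the contamination-risk features.
cleanings = pd.DataFrame({
    "room": ["A", "A"],
    "cleaned_at": pd.to_datetime(["2024-01-03 22:00", "2024-01-04 22:00"]),
})
entries = pd.DataFrame({
    "room": ["A"] * 4,
    "entered_at": pd.to_datetime([
        "2024-01-05 06:00", "2024-01-05 06:30",
        "2024-01-05 07:10", "2024-01-05 07:40",
    ]),
    "role": ["operations", "maintenance", "operations", "maintenance"],
})

def risk_features(room, unit_op_start):
    """Assemble illustrative risk features for one unit operation."""
    last_clean = cleanings.loc[cleanings["room"] == room, "cleaned_at"].max()
    hours_since_clean = (unit_op_start - last_clean).total_seconds() / 3600
    recent = entries[(entries["room"] == room)
                     & (entries["entered_at"] > last_clean)
                     & (entries["entered_at"] <= unit_op_start)]
    return {
        "hours_since_clean": hours_since_clean,
        "entries_since_clean": len(recent),
        "maintenance_entries": (recent["role"] == "maintenance").sum(),
    }

feats = risk_features("A", pd.Timestamp("2024-01-05 08:00"))
# Illustrative threshold rule standing in for a trained risk model.
at_risk = feats["hours_since_clean"] > 8 and feats["maintenance_entries"] >= 2
print(feats, "-> elevated risk" if at_risk else "-> normal")
```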

Ideally, a self-serve analytics tool would have different models in place and different ways of combining data for those models. It would then have a dashboard to bring relevant information to each user without them having to create the view themselves or request expert input from a data scientist. With a click or a drag, they could see what is happening on a piece of equipment, or why something happened at a particular point in the process. When users don’t have to spend time coding, they can direct their time to more complex tasks or analyses that require their skill sets.
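One simple way to structure such a tool is a registry of named analyses that a dashboard can expose as clickable views; the sketch below is an illustrative pattern, not a description of any particular product:

```python
from typing import Callable, Dict

# Named analyses a dashboard could expose, so a user picks a view
# instead of writing code. Names and functions are illustrative.
ANALYSES: Dict[str, Callable[..., float]] = {}

def register(name: str):
    """Decorator that adds an analysis function to the dashboard menu."""
    def wrap(fn: Callable[..., float]):
        ANALYSES[name] = fn
        return fn
    return wrap

@register("hold-phase temperature deviation")
def hold_deviation(temps, setpoint=121.0):
    return max(abs(t - setpoint) for t in temps)

@register("heat-up ramp rate")
def ramp_rate(temps, minutes_between_samples=1.0):
    return (temps[-1] - temps[0]) / (minutes_between_samples * (len(temps) - 1))

# A dashboard click maps to one lookup; no coding required from the user.
print(ANALYSES["hold-phase temperature deviation"]([120.6, 121.3, 120.9]))
```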

Integrating data and skillsets

There will always be the need for an understanding of the process to develop an accurate model that will meet the end user’s needs. When process experts understand how requested information will be used, they can identify the variables that need to be included. As in the above examples, it’s not a matter of single data points, but data in the context of the process – how different pieces of equipment, people, sensors and such interact – that provides actionable insights. As self-service analytics matures, it provides the potential for not only greater integration of data sources, but of human roles and skill sets.

About the Author

Lucas Vann, PhD – Pharma CTO
Lucas has over 15 years of experience with both lab-scale and pilot-scale bioprocessing equipment, specifically fermenters, and therefore has a thorough knowledge of bioreactor construction and operation, instrumentation, process control, cell culture techniques and fermentation processes. He has extensive automation and PAT experience, including data analytics and chemometric modeling using various software platforms. He leads the Pharma APG Technical Organization. Lucas earned a PhD in Bioprocessing from North Carolina State University.