Machine Learning-Enabled NIR Spectroscopy for Pharmaceutics Data Selection


This study is an important contribution to the field of machine learning-enabled NIR spectroscopy, offering researchers a systematic method for selecting representative subsamples from existing data with quality measures, diagnostic tools, and visualization techniques.

Researchers from Graz University of Technology and Christ University have presented a systematic method for choosing representative subsamples from existing data, supported by an extensive set of quality measures and a visualization strategy. In their article, published in AAPS PharmSciTech, Amrit Paudel, Gobi Ramasamy, and colleagues describe a structured procedure for selecting subsamples from historical data, along with in-depth quality measures, diagnostic tools, and visualization techniques (1).
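To make the idea of picking a representative subsample concrete, here is a minimal sketch using Kennard-Stone sampling, a widely used approach for choosing calibration subsets that cover the data space. It is offered only as an illustration and should not be read as the authors' procedure; the function name, the toy two-dimensional "samples," and the subset size are all hypothetical.

```python
from math import dist


def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen by Kennard-Stone sampling.

    Illustrative sketch only; not the procedure from the cited article.
    """
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")

    # Start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist(samples[pair[0]], samples[pair[1]]),
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]

    # Repeatedly add the sample whose nearest selected neighbor is farthest away.
    while len(selected) < n_select:
        next_index = max(
            remaining,
            key=lambda i: min(dist(samples[i], samples[s]) for s in selected),
        )
        selected.append(next_index)
        remaining.remove(next_index)
    return selected


# Hypothetical toy data: each tuple stands in for a (compressed) spectrum.
toy_samples = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 0.0), (5.1, 0.0)]
chosen = kennard_stone(toy_samples, 3)
```

In practice the distances would usually be computed on pre-processed spectra or on principal component scores rather than on raw coordinates like these.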

Artificial intelligence (AI), machine learning and modern computer technologies concepts. Business, Technology, Internet and network concept. | Image Credit: © putilov_denis - stock.adobe.com


The study used an open-source tablet data set that spans different doses (in milligrams), dosage-form shapes and sizes, slots in tablets, three manufacturing scales (laboratory, pilot, and production), coating differences (coated vs. uncoated), and more. The model was developed on data from one scale, and the researchers investigated how well the top-performing models transfer to new data, such as pilot-scale or production-scale (full-scale) samples.
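One simple way to check transferability is to score the same fitted model separately on each manufacturing scale. The sketch below assumes root mean squared error (RMSE) as the metric; the `predict` function, the scale names used as dictionary keys, and every number are hypothetical placeholders rather than values from the study.

```python
from math import sqrt


def rmse(reference, predicted):
    """Root mean squared error between reference and predicted values."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return sqrt(sum(squared) / len(squared))


def rmse_by_scale(predict, datasets):
    """Score one model on several data sets, keyed by manufacturing scale."""
    return {
        scale: rmse(targets, [predict(spectrum) for spectrum in spectra])
        for scale, (spectra, targets) in datasets.items()
    }


# Hypothetical stand-in for a model calibrated on laboratory-scale data.
def predict(spectrum):
    return 10.0 * spectrum[0]


# Hypothetical toy data: (spectra, reference values) for each scale.
datasets = {
    "laboratory": ([[1.0], [2.0]], [10.0, 20.0]),
    "pilot": ([[1.0], [2.0]], [11.0, 19.0]),
}
errors = rmse_by_scale(predict, datasets)
```

A noticeably larger error on the pilot or production data than on the laboratory data would suggest the model does not transfer well without adjustment.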

The researchers demonstrated how to select appropriate hyperparameters and how those choices affect the performance of the artificial neural network-multilayer perceptron (ANN-MLP) model. For the data under investigation, the article discusses the choice of hyperparameter tuning approach and model performance in relation to available references. The model extension from laboratory scale to pilot scale was successfully demonstrated.
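As an illustration of hyperparameter tuning in general, the sketch below runs an exhaustive grid search, one of several common tuning approaches and not necessarily the one used in the study. The hyperparameter names, the candidate values, and the scoring function are assumptions; in real use the scoring function would train an MLP and return a cross-validated error.

```python
from itertools import product
from math import log10


def grid_search(param_grid, score):
    """Try every combination in param_grid and keep the lowest score."""
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current < best_score:
            best_params, best_score = params, current
    return best_params, best_score


# Hypothetical stand-in for a validation error; a real version would fit
# a model with these hyperparameters and evaluate it on held-out data.
def toy_validation_error(params):
    size_penalty = abs(params["hidden_units"] - 16) / 16
    rate_penalty = abs(log10(params["learning_rate"]) + 2)
    return size_penalty + rate_penalty


param_grid = {
    "hidden_units": [4, 16, 64],
    "learning_rate": [0.1, 0.01, 0.001],
}
best_params, best_error = grid_search(param_grid, toy_validation_error)
```

Grid search is easy to reason about but scales poorly as hyperparameters are added, which is why random or Bayesian search is often preferred for larger search spaces.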

ANN-MLP is a type of artificial neural network that is widely used for supervised learning. It is a feedforward neural network with multiple layers of neurons, including an input layer, one or more hidden layers, and an output layer. Each neuron receives input from the previous layer, performs a mathematical operation on it (typically a weighted sum followed by a nonlinear activation function), and passes the result to the next layer. ANN-MLP is used for a variety of applications, such as database exploration, calibration modeling, image recognition, speech recognition, and natural language processing.
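The layer-by-layer flow described above can be sketched as a forward pass in plain Python. This is a minimal illustration, assuming ReLU activations on the hidden layers and a linear output layer; the weights, biases, and input values are hypothetical and are not taken from the study.

```python
def relu(values):
    """Rectified linear activation: negative values become zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer; weights[j] holds the weights of neuron j."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def mlp_forward(inputs, layers):
    """Pass inputs through each (weights, biases) layer in order."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        # Hidden layers use ReLU; the final (output) layer stays linear.
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations


# Hypothetical toy network: 3 inputs -> 2 hidden neurons -> 1 output.
layers = [
    ([[1.0, -1.0, 0.5], [0.0, 2.0, -1.0]], [0.0, 0.5]),
    ([[1.0, -2.0]], [0.25]),
]
prediction = mlp_forward([0.2, 0.1, 0.4], layers)
```

Training adjusts the weights and biases so that the output matches reference values; this sketch covers only the prediction step.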

Near-infrared (NIR) spectroscopy is non-destructive and non-intrusive, requires little to no sample preparation, and can considerably reduce overall analysis time, making it well suited to real-time analysis. The technique is used primarily in the pharmaceutical, agriculture, food and dairy, cosmetics, pulp and paper, and precision medicine industries.

Derivatives, normalization, scatter correction, and more advanced approaches are among the data pre-processing techniques used to suppress physical effects and retrieve chemically relevant information from NIR data. Modeling follows once the physical and chemical information has been separated. Principal component regression (PCR) and partial least squares (PLS) regression are commonly used multivariate linear models.
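Two of these pre-processing ideas can be sketched briefly. The example below assumes standard normal variate (SNV) as the scatter correction and a plain first difference as a stand-in for a derivative; real workflows usually use a smoothed derivative such as Savitzky-Golay. The absorbance values are hypothetical.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by itself."""
    center = mean(spectrum)
    spread = stdev(spectrum)
    if spread == 0:
        raise ValueError("spectrum is constant; SNV is undefined")
    return [(value - center) / spread for value in spectrum]


def first_difference(spectrum):
    """Crude first derivative: difference between neighboring points."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


# Hypothetical absorbance values for one spectrum.
raw = [0.10, 0.12, 0.18, 0.15, 0.11]

# The same spectrum with a simulated scatter effect (gain and offset).
scattered = [2.0 * value + 0.3 for value in raw]

corrected_raw = snv(raw)
corrected_scattered = snv(scattered)
```

After SNV the two spectra coincide, which shows how the correction removes multiplicative and additive effects that come from the physical state of the sample rather than from its chemistry.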

By pairing subsample selection with quality measures, diagnostic tools, and visualization techniques, the study gives researchers a systematic way to draw representative subsets from existing data. It also provides a framework for choosing appropriate hyperparameters and demonstrates the extension of models from laboratory scale to pilot scale.

Reference

(1) Ali, H.; Muthudoss, P.; Ramalingam, M.; Kanakaraj, L.; Paudel, A.; Ramasamy, G. Machine Learning–Enabled NIR Spectroscopy. Part 2: Workflow for Selecting a Subset of Samples from Publicly Accessible Data. AAPS PharmSciTech. 2023, 24, 34. https://link.springer.com/article/10.1208/s12249-022-02493-5
