This installment and the next one comprise lists of four key explanatory or tutorial references for each of 29 chemometric topics described in a previous article, with the addition of programming platforms often used for chemometrics. Wherever possible, the references were selected as being particularly helpful in explaining the use of each technique with spectroscopic data. Because of space limitations, this reference list cannot be exhaustive, but it is extensive. Included is a series of tables listing the key reference numbers for each chemometric technique.
The August 2020 installment of the “Chemometrics in Spectroscopy” column was entitled, “Survey of Chemometric Methods Used in Spectroscopy” (1). In that article, we delineated 29 common chemometric methods (or techniques) in use today by spectroscopists, and selected a single literature reference for each method. This current installment, and the one that will follow, continue this theme, forming a two-part series. In this two-part series, the respective chemometric methods, with their corresponding literature reference numbers, are given in Tables I through V. The tables for the two-part series include the following topics (in order of appearance); in all, 30 chemometrics topics will be covered.
Part I includes Tables I and II with references. Part II will include Tables III through V with corresponding literature references.
The science of chemometrics has rapidly advanced to now be included in the broader field of data analytics. There are now degree programs in data analytics and all types of data processing related to computer science or specific other data fields, including econometrics, biometrics, biomedical statistics, data mining, chemometrics—and software design and architecture specifically for design engineering, control systems, manufacturing engineering, robotics, instrument control, predictive modeling and learning, and many other fields. This includes artificial intelligence and its subfields of machine learning algorithms (used in data filtering and computer imaging applications), computational statistics and optimization, supervised, semi-supervised, and unsupervised learning, and various forms of predictive modeling. Other machine learning family members include deep learning (deep structured learning), artificial neural networks, and specialized learning algorithms. A menagerie of techniques or combinations of techniques is continuously being introduced, with ever-changing names. However, the basic mathematical concepts behind these changing names are much the same, and advances are mostly in name, the computer processing power used, and in the combinations and applications of the data analysis algorithms used. At some future point, we hope to at least summarize the nomenclature of these fields in this column.
Description of References
General references 1–12 are chemometrics reviews specific to spectroscopy applications. Following these, a set of four references is given in Tables I through IV for each chemometric method named, and five references are given for each computer software platform introduced in Table V. The references for the two-part series are numbered sequentially from 1 to 148 so that the series can be viewed as a single body of work.
Introduction to the Tables
Many excellent videos, technical notes, online sources, and published articles exist for the purpose of instruction and understanding of algorithms and chemometrics topics. Here, we have selected a set of papers from the technical literature that includes chemometrics reviews for spectroscopy (1–12) as well as a set of articles for each of 30 selected chemometrics topics (including software platforms). We included four references for each topic that we considered most applicable to Spectroscopy readers, and we also tried to include references that might be considered “classic” or tutorial papers. As we delve into each topic in future installments, we will include additional references to help the reader understand and use these various chemometric methods.
Here in Part I of this series, Table I presents the references for various signal preprocessing techniques. These data processing methods are often applied before data exploration, or before qualitative or quantitative methods. Table II lists references for component analysis techniques used mostly for data exploration and discovery. In Part II of the series, Table III will give references for quantitative (calibration) methods, which take raw or preprocessed data and compute predictive calibration models for quantitative determination of physical or chemical parameters in a dataset. Table IV will provide references for qualitative (classification) methods, which take raw or preprocessed data and compute predictive models for classifying different groups or types of samples, or physical or chemical parameters, in a dataset. Table V will include references for the most common programming languages or platforms used for general data interpretation with chemometrics or other statistical analysis.
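As a rough illustration of how a preprocessing step typically precedes data exploration, the following minimal Python sketch applies standard normal variate (SNV) scaling and mean centering to synthetic spectra and then computes principal component scores. It is only a sketch under stated assumptions: the function names (snv, mean_center, pca_scores), the synthetic two-band data, and the use of NumPy are illustrative choices made here and are not taken from any of the cited references.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row)
    by its own mean and standard deviation."""
    spectra = np.asarray(spectra, dtype=float)
    row_mean = spectra.mean(axis=1, keepdims=True)
    row_std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - row_mean) / row_std

def mean_center(spectra):
    """Subtract the mean spectrum (column means) from every spectrum."""
    spectra = np.asarray(spectra, dtype=float)
    return spectra - spectra.mean(axis=0, keepdims=True)

def pca_scores(centered, n_components=2):
    """PCA scores of an already mean-centered matrix via the SVD."""
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Hypothetical synthetic data: 10 spectra at 50 points, built from two
# Gaussian bands whose ratio varies, plus a multiplicative gain and an
# additive offset standing in for scatter-like effects.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
band_a = np.exp(-((x - 0.3) ** 2) / 0.005)
band_b = np.exp(-((x - 0.7) ** 2) / 0.005)
ratio = rng.uniform(0.1, 1.0, size=(10, 1))
gain = rng.uniform(0.5, 2.0, size=(10, 1))
offset = rng.uniform(-0.2, 0.2, size=(10, 1))
raw = gain * (band_a + ratio * band_b) + offset

corrected = snv(raw)                # preprocessing step
centered = mean_center(corrected)   # centering before PCA
scores = pca_scores(centered, n_components=2)  # exploration step
```

In this sketch SNV is applied row by row, so a positive gain and a constant offset on any single spectrum are removed before the component analysis, leaving the variation in band ratio as the main source of structure in the scores.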
References
Chemometrics Reviews for Spectroscopy
(1) J. Workman and H. Mark, Spectroscopy 35(8), 9–14 (2020).
(2) J.J. Workman Jr., P.R. Mobley, B.R. Kowalski, and R. Bro, Appl. Spectrosc. Rev. 31(1–2), 73–124 (1996).
(3) P.R. Mobley, B.R. Kowalski, J.J. Workman Jr., and R. Bro, Appl. Spectrosc. Rev. 31(4), 347–368 (1996).
(4) R. Bro, J.J. Workman Jr., P.R. Mobley, and B.R. Kowalski, Appl. Spectrosc. Rev. 32(3), 237–261 (1997).
(5) P. Geladi, Spectrochim Acta Part B At Spectrosc. 58(5), 767–782 (2003).
(6) P. Geladi, B. Sethson, J. Nyström, T. Lillhonga, T. Lestander, and J. Burger, Spectrochim Acta Part B At Spectrosc. 59(9), 1347–1357 (2004).
(7) B. Lavine and J. Workman, Anal. Chem. 80(12), 4519–4531 (2008).
(8) T. Rajalahti and O.M. Kvalheim, Int. J. Pharm. 417(1–2), 280–290 (2011).
(9) R.G. Brereton, J. Jansen, J. Lopes, F. Marini, A. Pomerantsev, O. Rodionova, J.M. Roger, B. Walczak, and R. Tauler, Anal. Bioanal. Chem. 409(25), 5891–5899 (2017).
(10) R.G. Brereton, J. Jansen, J. Lopes, F. Marini, A. Pomerantsev, O. Rodionova, J.M. Roger, B. Walczak, and R. Tauler, Anal. Bioanal. Chem. 410(26), 6691–6704 (2018).
(11) H. Yang, Spectroscopy 34(11), 40–42 (2019).
(12) H. Mark and J. Workman Jr., Chemometrics in Spectroscopy (Elsevier, Academic Press, New York, New York, 2nd ed., 2018).
Signal Preprocessing
1. Baseline Subtraction
(13) C. Rowlands and S. Elliott, J. Raman Spectrosc. 42(3), 363–369 (2011).
(14) A.T. Weakley, P.R. Griffiths, and D.E. Aston, Appl. Spectrosc. 66(5), 519–529 (2012).
(15) A. Jirasek, G. Schulze, M.M.L. Yu, M.W. Blades, and R.F.B. Turner, Appl. Spectrosc. 58(12), 1488–1499 (2004).
(16) J.R. Powell, F.M. Wasacz, and R.J. Jakobsen, Appl. Spectrosc. 40(3), 339–344 (1986).
2. Derivative Preprocessing
(17) M.N. Leger and A.G. Ryder, Appl. Spectrosc. 60(2), 182–193 (2006).
(18) Y.L. Loethen, D. Zhang, R.N. Favors, S.B. Basiaga, and D. Ben-Amotz, Appl. Spectrosc. 58(3), 272–278 (2004).
(19) A.C. Dotto, R.S.D. Dalmolin, A. ten Caten, and S. Grunwald, Geoderma 314, 262–274 (2018).
(20) B. Zimmermann and A. Kohler, Appl. Spectrosc. 67(8), 892–902 (2013).
3. Detrending
(21) K.E. Jang, S. Tak, J. Jung, J. Jang, Y. Jeong, and Y.C. Ye, J. Biomed. Opt. 14(3), 034004 (2009).
(22) B.K. Alsberg, W.G. Wade, and R. Goodacre, Appl. Spectrosc. 52(6), 823–832 (1998).
(23) D. Cozzolino and A. Moron, Anim. Feed Sci. Technol. 111(1–4), 161–173 (2004).
(24) A. Fassio and D. Cozzolino, Ind. Crops Prod. 20(3), 321–329 (2004).
4. Mean Centering
(25) A. Afkhami and M. Bahram, Talanta 66(3), 712–720 (2005).
(26) J.B. Cooper, Chemometr. Intell. Lab. Syst. 46(2), 231–247 (1999).
(27) M.P. Gómez-Carracedo, J.M. Andrade, D.N. Rutledge, and N.M. Faber, Anal. Chim. Acta 585(2), 253–265 (2007).
(28) A. Lorber, K. Faber, and B.R. Kowalski, J. Chemom. 10(3), 215–220 (1996).
5. Multiplicative Signal Correction
(29) Y.P. Du, S. Kasemsumran, K. Maruo, T. Nakagawa, and Y. Ozaki, Anal. Sci. 21(8), 979–984 (2005).
(30) G.E. Fodor, R.A. Mason, and S.A. Hutzler, Appl. Spectrosc. 53(10), 1292–1298 (1999).
(31) H. Martens and E. Stark, J. Pharm. Biomed. 9(8), 625–635 (1991).
(32) A. Kohler, J. Sulé-Suso, G.D. Sockalingum, M. Tobin, F. Bahrami, Y. Yang, J. Pijanka, P. Dumas, M. Cotte, D.G. Van Pittius, and G. Parkes, Appl. Spectrosc. 62(3), 259–266 (2008).
6. Normalization
(33) J. Palacký, P. Mojzeš, and J. Bok, J. Raman Spectrosc. 42(7), 1528–1539 (2011).
(34) Å. Rinnan, F. Van Den Berg, and S.B. Engelsen, Trends Analyt. Chem. 28(10), 1201–1222 (2009).
(35) M.A. Czarnecki, Appl. Spectrosc. 53(11), 1392–1397 (1999).
(36) N.B. Zorov, A.A. Gorbatenko, T.A. Labutin, and A.M. Popov, Spectrochim. Acta B 65(8), 642–657 (2010).
7. Standard Normal Variate
(37) R.J. Barnes, M.S. Dhanoa, and S.J. Lister, Appl. Spectrosc. 43(5), 772–777 (1989).
(38) M.S. Dhanoa, S.J. Lister, R. Sanderson, and R.J. Barnes, J. Near Infrared Spectrosc. 2(1), 43–47 (1994).
(39) Q. Hai-bin, O. Dan-lin, and C. Yi-yu, J. Zhejiang Univ. Sci. B 6(8), 838–843 (2005).
(40) S. Romero-Torres, J.D. Pérez-Ramos, K.R. Morris, and E.R. Grant, J. Pharm. Biomed. 38(2), 270–274 (2005).
8. Successive Projections Algorithm (SPA)
(41) M.C.U. Araújo, T.C.B. Saldanha, R.K.H. Galvao, T. Yoneyama, H.C. Chame, and V. Visani, Chemometr. Intell. Lab. Syst. 57(2), 65–73 (2001).
(42) S.F.C. Soares, A.A. Gomes, M.C.U. Araujo, A.R. Galvão Filho, and R.K.H. Galvão, Trends Analyt. Chem. 42, 84–98 (2013).
(43) R.K.H. Galvao, M.C.U. Araujo, W.D. Fragoso, E.C. Silva, G.E. Jose, S.F.C. Soares, and H.M. Paiva, Chemometr. Intell. Lab. Syst. 92(1), 83–91 (2008).
(44) M.J.C. Pontes, R.K.H. Galvao, M.C.U. Araújo, P.N.T. Moreira, O.D.P. Neto, G.E. Jose, and T.C.B. Saldanha, Chemometr. Intell. Lab. Syst. 78(1–2), 11–18 (2005).
9. Wavelets
(45) B.K. Alsberg, A.M. Woodward, M.K. Winson, J. Rowland, and D.B. Kell, Analyst 122(7), 645–652 (1997).
(46) B. Walczak, E. Bouveresse, and D.L. Massart, Chemometr. Intell. Lab. Syst. 36(1), 41–51 (1997).
(47) J. Trygg and S. Wold, Chemometr. Intell. Lab. Syst. 42(1–2), 209–220 (1998).
(48) P.J. Brown, T. Fearn, and M. Vannucci, J. Am. Stat. Assoc. 96(454), 398–408 (2001).
Component Analysis
10. Classical Least Squares (CLS)
(49) D.M. Haaland and D.K. Melgaard, Vib. Spectrosc. 29(1–2), 171–175 (2002).
(50) D.M. Haaland and D.K. Melgaard, Appl. Spectrosc. 55(1), 1–8 (2001).
(51) D.K. Melgaard, D.M. Haaland, and C.M. Wehlburg, Appl. Spectrosc. 56(5), 615–624 (2002).
(52) T.G. Diaz, A. Guiberteau, J.O. Burguillos, and F. Salinas, Analyst 122(6), 513–517 (1997).
11. Independent Component Analysis (ICA)
(53) J.D. Bayliss, J.A. Gualtieri, and R.F. Cromp, “Analyzing Hyperspectral Data with Independent Component Analysis,” in 26th AIPR Workshop: Exploiting New Image Sources and Sensors, 3240, 133–143, International Society for Optics and Photonics (1998).
(54) J. Chen and X.Z. Wang, J. Chem. Inform. Comput. Sci. 41(4), 992–1001 (2001).
(55) J.M. Nascimento and J.M. Dias, IEEE Transactions on Geoscience and Remote Sensing 43(1), 175–187 (2005).
(56) N. Pasadakis and A.A. Kardamakis, Anal. Chim. Acta 578(2), 250–255 (2006).
12. Inverse Adding Doubling (IAD)
(57) S. Prahl, “Everything I Think You Should Know About Inverse Adding-Doubling,” Oregon Medical Laser Center, St. Vincent Hospital, 1–74 (2011).
(58) J. Yao, “Inverse Adding-Doubling Method for the Determination of Optical Properties of Thermotropic Material,” in 2010 International Conference on Display and Photonics, 7749, 77490V, International Society for Optics and Photonics (2010).
(59) S. Bellini, R. Bendoula, E. Latrille, and J.M. Roger, Appl. Spectrosc. 68(10), 1154–1167 (2014).
(60) W. Wang, C. Li, and R.D. Gitaitis, Trans. ASABE 57(6), 1771–1782 (2014).
13. Multivariate Curve Resolution (MCR)
(61) A. De Juan and R. Tauler, Crit. Rev. Anal. Chem. 36(3–4), 163–176 (2006).
(62) A. de Juan, J. Jaumot, and R. Tauler, Anal. Meth. 6(14), 4964–4976 (2014).
(63) Y. Xie, W. Cao, S. Krishnan, H. Lin, and N. Cauchon, Pharm. Res. 25(10), 2292 (2008).
(64) M. Garrido, F.X. Rius, and M.S. Larrechi, Anal. Bioanal. Chem. 390(8), 2059–2066 (2008).
14. Principal Components Analysis (PCA)
(65) E.K. Kemsley, Chemometr. Intell. Lab. Syst. 33(1), 47–61 (1996).
(66) E.J. Hasenoehrl and P.R. Griffiths, Appl. Spectrosc. 47(5), 643–650 (1993).
(67) R.C. Pereira, V.L. Skrobot, E.V. Castro, I.C. Fortes, and V.M. Pasa, Energy Fuels 20(3), 1097–1102 (2006).
(68) C.W. Chang, D.A. Laird, M.J. Mausbach, and C.R. Hurburgh, Soil Sci. Soc. Am. J. 65(2), 480–490 (2001).
Jerome Workman, Jr. serves on the Editorial Advisory Board of Spectroscopy and is the Senior Technical Editor for LCGC and Spectroscopy. He is also a Certified Core Adjunct Professor at U.S. National University in La Jolla, California. He was formerly the Executive Vice President of Research and Engineering for Unity Scientific and Process Sensors Corporation.
Howard Mark serves on the Editorial Advisory Board of Spectroscopy, and runs a consulting service, Mark Electronics, in Suffern, New York. Direct correspondence to: SpectroscopyEdit@mmhgroup.com