Heavy metal poisoning via food is a leading cause of kidney disease and cancer in many parts of the world. Pollution of staple food products has led to serious health effects across the globe. This research investigates the application of laser-induced breakdown spectroscopy (LIBS) and machine learning (ML) for detecting the elemental composition of food, using rice as an example. Six varieties of rice samples were measured using LIBS. The resulting spectral data were analyzed, and elemental assignments were made based on characteristic wavelengths. The study leveraged the inherent capabilities of LIBS to provide rapid and non-destructive elemental analysis of the samples. Principal component analysis (PCA) was performed to extract meaningful information from the complex LIBS spectral dataset and to obtain a holistic understanding of the variations and similarities among the rice samples. A back propagation artificial neural network (BP-ANN) was trained using the PCA scores, reaching an overall classification accuracy of 88.68% on test data. The correlation between specific wavelengths and elemental content was established, providing an understanding of the elemental variability in different rice varieties.
While modern agricultural practices emphasize ecological, organic, and healthy methods in developed countries, conventional global agriculture, particularly in developing nations, still relies heavily on fertilizers and chemicals. The resulting contamination poses a real threat to the health and well-being of people in many parts of the world. The ingestion of heavy metals, especially through food, has become a leading cause of cancer and kidney diseases (1). The production of low-quality staple foods like rice and wheat has global implications for public health due to their widespread consumption. To address this concern, this study investigates the application of laser-induced breakdown spectroscopy (LIBS) for food quality assessment. Rice serves as a model food, where LIBS detection is used to identify its elemental composition. Machine learning (ML) is then employed to analyze the resulting data.
Rice, a crucial staple food in the world consumed by more than half of the global population (with 90% of its production concentrated in Asia) (2), has provided energy to people in agricultural societies for thousands of years. It also serves as a good source of essential nutrients, including magnesium, phosphorus, manganese, selenium, iron, folic acid, thiamin, and niacin (2). While numerous rice varieties exist, the most commercially available types fall into two main categories: brown rice and white rice. The use of chemical fertilizers and pesticides in rice cultivation can lead to the release of toxic heavy metals into the soil. These metals may then be absorbed by the crops, resulting in heavy metal accumulation (3).
Previous research has employed various techniques for elemental analysis of rice. X-ray fluorescence (XRF) technique has been used to detect elements like Al, As, Br, Cd, Cl, Co, Cs, Cu, Fe, Hg, K, Mg, Mn, Mo, Rb, Se, and Zn in rice varieties (4). It has been further applied to analyze rice husk (5) and rice husk ash samples (6,7). A combination of plasma atomic emission spectrometry (PAES) and instrumental neutron activation analysis (INAA) has been used on Thai Jasmine brown rice, determining concentrations of Ca, K, Mg, and P using PAES and Fe, Mn and Zn using INAA (8). Inductively coupled plasma mass spectrometry (ICP-MS) has been used to determine elemental concentration of three uncooked long rice grains. Scanning electron microscopy-energy dispersive X-ray spectroscopy has been used to compare the structure of cooked and uncooked rice grain (9). In another study, flame atomic absorption spectrometry (FAAS) and ICP atomic emission spectrometry have been used to detect trace metal concentrations of eight different rice samples, determining the presence of Fe, Cd, Cr, and Zn (10). FAAS has been further employed to determine Cd in brown rice and spinach (11).
However, traditional elemental analysis methods suffer from limitations such as lengthy procedures, sample destruction, and high cost. In contrast, LIBS offers several advantages, including minimal sample preparation, short detection time, high accuracy, and the ability to measure multiple elements simultaneously in a non-destructive manner (12). In LIBS, a powerful laser pulse is focused on a microscopic area of the sample, ablating a minute amount of the material and creating an intensely hot plasma composed of ionized matter. The excited atoms and ions within the plasma emit light at characteristic wavelengths corresponding to the different elements present in the sample. A fiber optic detector captures the emitted light and sends it to a spectrometer. The spectrometer separates the light into its component wavelengths and records their intensities, generating a detailed elemental profile (13).
The complex nature of LIBS data presents a challenge for analysis. The spectra contain hundreds of wavelengths and intensities that represent the elemental composition of the samples. Principal component analysis (PCA) is a dimensionality reduction technique that can be used to simplify these data for easier interpretation. By identifying the underlying patterns and correlations within the data, PCA prioritizes the dimensions that explain the most variance, and linearly combines the original variables (the intensities at hundreds of wavelengths) into new axes called principal components (PCs). The PCs form a lower-dimensional score plot that captures the essential structure of the data. This plot can be used to identify trends, outliers, and clusters of samples based on elemental composition. PCA further filters out background noise and irrelevant fluctuations in the spectra, making the results more reliable (14–17).
By applying PCA to the LIBS data, PC scores are generated. These scores serve as the input for a back-propagation artificial neural network (BP-ANN), a powerful ML technique suitable for handling complex datasets. BP-ANN uses an iterative algorithm that minimizes the cost function of the neural network by adjusting its weights and biases in two steps: forward and backward propagation. During forward propagation, input data are fed into the neural network, and the output is computed through the successive neuron layers. During backward propagation, the error between the predicted and actual output is calculated and propagated back through the layers using the chain rule to adjust the weights (18,19). BP-ANNs are versatile, scalable, efficient, and suitable for diverse tasks and datasets.
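The forward and backward steps described above can be illustrated with a minimal sketch. It is not the network used in this study: the single hidden layer, the squared-error cost, the learning rate, and the toy OR-gate data are all assumptions chosen only to keep the chain rule visible.

```python
import math
import random

def train_tiny_network(samples, hidden=3, lr=0.5, epochs=2000, seed=0):
    """Train a one-hidden-layer network by plain gradient descent.

    samples: list of (inputs, target) pairs with target 0 or 1.
    Returns the mean squared error recorded after each epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0
    history = []
    for _ in range(epochs):
        total = 0.0
        for x, t in samples:
            # Forward propagation: tanh hidden layer, sigmoid output.
            h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                 for row, b in zip(w1, b1)]
            z = sum(w * hi for w, hi in zip(w2, h)) + b2
            y = 1.0 / (1.0 + math.exp(-z))
            total += (y - t) ** 2
            # Backward propagation: chain rule from the cost to each weight.
            dz = 2.0 * (y - t) * y * (1.0 - y)            # dE/dz at the output
            dh = [dz * w2[j] * (1.0 - h[j] ** 2)          # dE/d(hidden pre-activation)
                  for j in range(hidden)]
            for j in range(hidden):
                w2[j] -= lr * dz * h[j]
                b1[j] -= lr * dh[j]
                for i in range(n_in):
                    w1[j][i] -= lr * dh[j] * x[i]
            b2 -= lr * dz
        history.append(total / len(samples))
    return history

# Hypothetical toy data (logical OR), used only to exercise the code.
or_gate = [([0.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]
errors = train_tiny_network(or_gate)
print(f"first epoch MSE: {errors[0]:.4f}, last epoch MSE: {errors[-1]:.4f}")
```

The gradients for the hidden layer are computed before any weight is updated, so every update uses the weights that produced the forward pass.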
This study combines LIBS, PCA, and BP-ANN to detect, analyze, and classify six different rice samples: red rice (Sample 1), oat rice (Sample 2), brown rice (Sample 3), scattered black soil pearl rice (Sample 4), black rice (Sample 5), and scattered rice (Sample 6).
The LIBS setup utilizes a high-energy Q-switched Nd:YAG (neodymium-doped yttrium aluminium garnet) nanosecond pulse laser (Surelite II-10, Continuum Co., Ltd.) (20) as the excitation source. The optical system consists of reflectors, a plano-convex lens, a signal detector, a multi-channel spectrometer (AvaSpec ULS2048-4 Channel-usb 2.0, Avantes) (21), a digital delay generator, and a computer (22). The laser operates at a wavelength of 1064 nm, delivering pulses with an energy of 60 mJ and a duration of 10 ns at a repetition frequency of 10 Hz. The laser spot size is approximately 7 mm, and its irradiance is about 1.6 × 10⁷ W/cm². The focusing lens has a 5 cm focal length. The spectrometer detects wavelengths from 200 nm to 890 nm (22–25). A time-delay device synchronizes the spectrometer detection with the laser pulse. The time delay is set to 1.5 μs for improved spectral resolution (26). The integration times of channels 1, 2, 3, and 4 of the spectrometer are 5.25 ms, 10 ms, 5.25 ms, and 10 ms, respectively. For each sample, 1000 data points were gathered. The schematic diagram of the LIBS setup is illustrated in Figure 1.
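As a quick arithmetic check, the quoted irradiance of roughly 1.6 × 10^7 W/cm^2 can be reproduced from the pulse energy, pulse duration, and spot size given above. The sketch assumes that the 7 mm figure is the spot diameter and that the beam is a uniform circular disc. Neither assumption is stated in the text.

```python
import math

pulse_energy_j = 60e-3     # 60 mJ per pulse (from the text)
pulse_duration_s = 10e-9   # 10 ns pulse duration (from the text)
spot_diameter_cm = 0.7     # 7 mm, assumed here to be the spot diameter

# Peak power is the pulse energy divided by the pulse duration.
peak_power_w = pulse_energy_j / pulse_duration_s
# Area of a circular spot, assuming a uniform (top-hat) profile.
spot_area_cm2 = math.pi * (spot_diameter_cm / 2) ** 2
# Irradiance is the peak power per unit area.
irradiance_w_cm2 = peak_power_w / spot_area_cm2

print(f"peak power: {peak_power_w:.2e} W")
print(f"irradiance: {irradiance_w_cm2:.2e} W/cm^2")
```

Under these assumptions the result rounds to 1.6 × 10^7 W/cm^2, consistent with the stated value.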
Figure 1: Schematic diagram of the LIBS setup.
The six types of rice samples were purchased from a local grocery store. All six samples originated from the East Asian region. The samples were ground using a mortar and pestle. Subsequently, six pellets were made from the ground samples using a hydraulic press. Finally, the pellets were placed at the focal point of the laser pulse generated by the LIBS instrument for detection.
The gathered data points were used to plot the spectra of the six rice samples across the four spectrometer channels. Elemental assignments were made using the National Institute of Standards and Technology (NIST) atomic spectra database (27) for reference. The element assignment graphs of the six rice samples are illustrated in Figure 2. For comparison, the LIBS spectrum of ambient air is also included.
Figure 2: LIBS spectra of the six rice samples over four wavelength regions: (a) 240–300 nm, (b) 350–450 nm, (c) 460–680 nm, and (d) 700–800 nm. The spectrum of ambient air is included in cyan color for comparison.
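The assignment step can be sketched as a nearest-line lookup. The reference table below reuses line positions reported in this article rather than querying the NIST database. The 0.1 nm tolerance, the function name `assign_peaks`, and the observed peak positions are hypothetical values chosen for illustration.

```python
# Reference lines (species, wavelength in nm) copied from the assignments
# reported in this article; a real workflow would use the NIST database.
REFERENCE_LINES = [
    ("Mg II", 279.505), ("Mg II", 280.217), ("Mg I", 285.162),
    ("Ca II", 393.178), ("Ca II", 396.666), ("Ca I", 422.547),
    ("Na I", 588.896), ("Na I", 589.485),
    ("K I", 766.486), ("K I", 769.848),
]

def assign_peaks(peak_wavelengths_nm, tolerance_nm=0.1):
    """Match each observed peak to the nearest reference line.

    Returns a dict mapping peak wavelength to a species label, or to None
    when no reference line lies within the tolerance.
    """
    assignments = {}
    for peak in peak_wavelengths_nm:
        species, line = min(REFERENCE_LINES, key=lambda ref: abs(ref[1] - peak))
        assignments[peak] = species if abs(line - peak) <= tolerance_nm else None
    return assignments

# Hypothetical peak positions, not measured values from this study.
observed = [279.52, 393.20, 500.00, 766.45]
result = assign_peaks(observed)
for wavelength, species in result.items():
    print(wavelength, species or "unassigned")
```

A fixed tolerance is the simplest choice. In practice it would depend on the spectrometer resolution in each channel.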
Variations of Mg, Na, Ca, K, and Mn elements are observed in the spectra, along with the presence of C, H, O, and N (due to the ambient air) (28). The elemental profiles of all rice samples are very similar with a few exceptions. Notably, Mn is only observed in black rice, while K is absent from red rice. Rice (brown rice in particular) can be considered as a good source of Mg (29), as evidenced by spectral lines Mg I 285.162 nm, Mg II 279.505 nm and Mg II 280.217 nm.
Four Ca II lines are observed in the spectrum: Ca II 370.546 nm and Ca II 373.598 nm with weak intensities, and Ca II 393.178 nm and Ca II 396.666 nm with very high intensities that vary slightly depending on rice type. Additional Ca I lines were observed at 422.547 nm, 430.058 nm, 430.574 nm, and so on (Figure 2). Different rice varieties naturally absorb Ca from soil in varying amounts. The amount of Ca in the soil itself is influenced by factors such as pH and the presence of minerals such as calcium carbonate. Additionally, the application of calcium-containing fertilizers may have further contributed to the high concentration of Ca observed in these rice samples (30,31).
Two strong Na I spectral lines are observed at 588.896 nm and 589.485 nm. These lines are commonly observed in LIBS analysis of biological samples and likely indicate natural Na uptake from soil by rice plants (32). High levels of dissolved Na in irrigation water can also be absorbed, concentrating Na in rice and potentially explaining the strong Na signals observed. Even in non-saline conditions, rice plants naturally absorb Na from soil through various mechanisms. The amount of Na uptake depends on factors such as soil type, salinity level, and rice variety (33–35).
Mn is observed exclusively in black rice, with six Mn II lines detected at various wavelengths. Trace amounts of Mn may exist in the other rice samples at undetectable levels. Black rice, commonly cultivated in Mn-rich soils, likely absorbs and accumulates more Mn due to its specific plant traits. The six Mn lines suggest a notable Mn presence, although not all are highly intense. Their intensities are significantly lower than those of Mg, highlighting the importance of Mg in rice. Mn might concentrate in specific parts of black rice, potentially explaining the lower spectral line intensities. Mn serves as a vital micronutrient in plant growth and in metabolic processes like photosynthesis, and its uptake is regulated by specialized proteins. Variations in these proteins across rice types can influence Mn uptake. Soil conditions also play a role, leading to Mn deficiency or toxicity (36).
K is an essential macronutrient in plants (35), vital for enzyme activation and ion transport. Spectral lines corresponding to K I 361.806 nm, K I 766.486 nm, and K I 769.848 nm are observed, likely caused by neutral potassium atoms. However, these lines are absent in the spectrum of red rice. Red rice inherently accumulates less K relative to the other rice samples. Nevertheless, considering the mostly similar elemental composition among all rice samples, it is likely that K is still present in red rice, although the signal is too weak to detect. The higher intensities of the K lines indicate a significantly higher K requirement in rice plants than Mn, further suggesting that Mn could be present in a localized manner within the rice grain, resulting in a lower Mn signal.
This analysis suggests that the variations in elemental content between rice samples likely arise from a complex interplay of environmental and human-influenced factors. Soil composition appears to be a key factor, as suggested by the higher Mn content observed in black rice grown in Mn-rich soil. Similarly, irrigation practices may influence the uptake of certain elements. For example, high levels of dissolved Na in irrigation water could contribute to the strong Na signals observed in the spectra. Human intervention through fertilizers may also play a role. The use of Ca-containing fertilizers could explain the observed Ca concentration in the rice samples. While these observations offer valuable insights, further studies are needed to definitively quantify the relative contributions of each factor (soil composition, irrigation practices, fertilizer use) to the elemental composition of rice samples.
PCA is a common dimensionality reduction technique in data science that simplifies complex, high-dimensional data such as LIBS spectra while preserving as much variance as possible. The technique transforms the data into a lower-dimensional space, generating principal components (PCs) that capture the variance. PC1 holds the maximum variance, followed by PC2 and PC3 in decreasing order, with each component orthogonal to the previous ones. These components simplify visualizations and aid in pattern recognition. Two-dimensional (2D) distribution plots project the data onto PC1 and PC2, while three-dimensional (3D) plots include PC3 to reveal additional nuances in the data.
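The properties listed above (decreasing variance, mutually orthogonal components, scores as projections) can be illustrated with a minimal sketch. It is not the Origin or MATLAB routine used in this study. It finds the components by power iteration with deflation on the covariance matrix, a swapped-in method chosen so the example needs only the standard library, and the four-row toy dataset is hypothetical.

```python
import math

def center(data):
    """Subtract each column mean so every variable has zero mean."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in data]

def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def leading_eigenpair(matrix, iterations=500):
    """Power iteration for the largest eigenvalue of a symmetric matrix."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    value = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d))
    return value, v

def pca(data, n_components=2):
    """Return (scores, explained variance ratios, component vectors)."""
    centered = center(data)
    cov = covariance(centered)
    d = len(cov)
    total_variance = sum(cov[i][i] for i in range(d))
    components, ratios = [], []
    for _ in range(n_components):
        value, vector = leading_eigenpair(cov)
        components.append(vector)
        ratios.append(value / total_variance)
        # Deflation removes the component just found, so the next
        # power iteration converges to an orthogonal direction.
        cov = [[cov[i][j] - value * vector[i] * vector[j] for j in range(d)]
               for i in range(d)]
    scores = [[sum(r[j] * c[j] for j in range(d)) for c in components]
              for r in centered]
    return scores, ratios, components

# Hypothetical toy "spectra": 4 samples, 3 intensity channels.
toy_spectra = [[2.0, 1.9, 0.5], [4.0, 4.1, 0.4], [6.0, 5.8, 0.6], [8.0, 8.2, 0.5]]
scores, ratios, components = pca(toy_spectra)
print("explained variance ratios:", [round(r, 4) for r in ratios])
```

For spectra with hundreds of channels, a singular value decomposition from a numerical library would normally replace the power iteration.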
The dataset obtained from the LIBS spectrometer was cleaned and then subjected to PCA on both the Origin and MATLAB platforms. The results are shown in Figure 3a, Figure 3b, Figure 4a, and Figure 4b.
Figure 3: Two-dimensional score plot (a) and three-dimensional score plot (b) generated by PCA performed on Origin platform.
Figure 4: Two-dimensional score plot (a) and three-dimensional score plot (b) generated by PCA performed on MATLAB platform.
Figure 3a, Figure 3b, Figure 4a, and Figure 4b depict 2D and 3D feature distributions of the six rice samples. Clear separations are observed between different sample groups based on their PC scores. These visualizations suggest that the first two principal components capture a significant share of the variance and differentiate between the sample groups. The distinct feature separation among the six rice samples suggests that the LIBS data have captured differences in the elemental compositions, which may reflect their origin, variety, quality, or contamination. Previous studies have highlighted the capabilities of LIBS in distinguishing rice samples from various geographical regions based on the concentrations of elements such as K, Ca, Mg, and Fe (37). The first two principal components (PC1 and PC2) account for 50.5% and 5.0% of the variance, respectively, explaining 55.5% of the variation in total. Notably, samples 1 and 2 cluster closely together in the positive quadrant of PC1, indicating similar spectral characteristics. Sample 3 is distinctly separated along PC2, while samples 4, 5, and 6 are spread out relative to the others.
A back propagation artificial neural network (BP-ANN) was trained on the principal component scores capturing the most variation. The dataset was formed using the PCA scores of the six samples along with their corresponding labels. Stratified shuffle split was employed to create a balanced training and testing dataset with a 50:50 split. The features in the dataset were normalized before feeding them into the neural network. To address class imbalance, over-sampling and under-sampling techniques were applied to achieve an even distribution across all samples.
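The stratified 50:50 split and the normalization step can be sketched as follows. The article does not say which normalization was used, so z-score scaling fitted on the training half is an assumption, as are the function names and the toy feature values. The over-sampling and under-sampling step is not shown.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, test_fraction=0.5, seed=0):
    """Return (train_idx, test_idx), splitting every class by the same fraction."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * test_fraction)
        test_idx.extend(indices[:cut])
        train_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)

def fit_zscore(rows):
    """Column means and standard deviations (a zero deviation is replaced by 1)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [(sum((r[j] - means[j]) ** 2 for r in rows) / n) ** 0.5 or 1.0
            for j in range(d)]
    return means, stds

def apply_zscore(rows, means, stds):
    """Scale rows with previously fitted statistics."""
    return [[(r[j] - means[j]) / stds[j] for j in range(len(r))] for r in rows]

# Hypothetical data: 6 classes with 10 rows each and 2 made-up features.
labels = [cls for cls in range(1, 7) for _ in range(10)]
features = [[float(i), float(label)] for i, label in enumerate(labels)]

train_idx, test_idx = stratified_split(labels)
means, stds = fit_zscore([features[i] for i in train_idx])
train_x = apply_zscore([features[i] for i in train_idx], means, stds)
test_x = apply_zscore([features[i] for i in test_idx], means, stds)
test_counts = Counter(labels[i] for i in test_idx)
print(dict(test_counts))
```

Fitting the scaling statistics on the training half only keeps information from the test half out of the model inputs.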
The feed-forward neural network architecture employed an input layer, two hidden layers and an output layer. The input layer contained 12 neurons, matching the number of features used. The two hidden layers had 10 and 5 neurons, respectively, and utilized tansig (hyperbolic tangent) and logsig (logistic sigmoid) activation functions (38,39). The output layer contained six neurons, corresponding to the six rice sample classes.
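To make the layer sizes concrete, the sketch below builds a 12-10-5-6 network with tansig and logsig hidden activations and runs one forward pass. The output activation is not stated in the article, so the softmax used here is an assumption, as are the random initial weights and the placeholder input vector.

```python
import math
import random

LAYER_SIZES = [12, 10, 5, 6]  # input, hidden 1, hidden 2, output (from the text)

def tansig(x):
    """Hyperbolic tangent activation."""
    return math.tanh(x)

def logsig(x):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(values):
    """Convert raw output scores to class probabilities (assumed output stage)."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def init_layers(sizes, seed=0):
    """Random weights and zero biases for each pair of consecutive layers."""
    rng = random.Random(seed)
    return [
        ([[rng.gauss(0.0, 0.5) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]

def forward(layers, features):
    """Propagate one feature vector through the network."""
    activations = [tansig, logsig, None]  # None: raw scores go to softmax
    out = features
    for (weights, biases), activation in zip(layers, activations):
        out = [sum(w * x for w, x in zip(row, out)) + b
               for row, b in zip(weights, biases)]
        if activation is not None:
            out = [activation(v) for v in out]
    return softmax(out)

layers = init_layers(LAYER_SIZES)
n_parameters = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
probabilities = forward(layers, [0.1] * 12)  # placeholder input, not real PC scores
print("trainable parameters:", n_parameters)
```

With these layer sizes the network has 221 trainable weights and biases (130 + 55 + 36).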
Training was limited to a maximum of 2000 epochs with a target error of 10⁻⁶ and an initial learning rate of 0.01. Regularization was employed using a weight decay parameter to prevent overfitting. The backpropagation algorithm adjusted the weights to optimize the neural network performance during training.
The accuracy of the trained model was evaluated on both the train and test datasets. The model achieved an accuracy of 92.90% on the training dataset (Figure 5b), and 88.68% on the testing dataset (Figure 5a).
Figure 5: Classification accuracy on (a) test data and (b) train data for the BP-ANN.
In the confusion matrices (Figure 6a, Figure 6b), classes 1–6 correspond to samples 1–6, respectively, and represent red rice, oat rice, brown rice, scattered black soil pearl rice, black rice, and scattered rice. The performance of the model is evaluated using accuracy and sensitivity metrics. Sensitivity, also known as recall, is a metric crucial for evaluating classification models where correctly identifying positive instances is essential. Accuracy is a more general metric that assesses the overall correctness of predictions across all classes.
Figure 6: Confusion matrices representing actual classes and predicted classes. Matrix (a) represents the model’s performance on unseen test data, while matrix (b) reflects its performance on the training data it was fitted on. The blue squares on the diagonal represent correct classifications for each of the six classes. Off-diagonal elements indicate misclassified samples. The bottom-horizontal and right-vertical sections show model accuracy and sensitivity per sample.
Confusion matrix for the test data (Figure 6a) reveals excellent performance by the BP-ANN on red rice, achieving 97.4% accuracy and precision. It correctly predicted 451 instances with only 9 misclassifications: 6 as oat rice and 3 as brown rice. The model also demonstrated high reliability, with a sensitivity of 98%.
Scattered rice achieved good results with 92.7% accuracy and precision, correctly identifying 429 instances. However, it misclassified scattered rice in 55 instances, primarily as scattered black soil pearl rice (21 instances) and black rice (33 instances), with a sensitivity of 88.6%.
The model performed well on oat rice, with 91.3% accuracy and precision, correctly classifying 422 instances. The main misclassifications were brown rice (43 instances) and scattered black soil pearl rice (1 instance), with a sensitivity of 88.5%.
Black rice achieved a moderate performance of 85.7% accuracy and precision with 87.3% sensitivity. It was correctly identified in 397 instances, with 58 misclassifications: 30 as scattered black soil pearl rice, 22 as scattered rice, and 6 as brown rice.
Brown rice results were moderate as well, with 82.7% accuracy and precision and 85.8% sensitivity. The model correctly classified 382 instances of brown rice with misclassifications as scattered black soil pearl rice (30 instances), oat rice (29 instances), and black rice (4 instances).
Lastly, scattered black soil pearl rice achieved 82.3% accuracy and precision with a sensitivity of 83.7%. It was correctly classified in 380 instances with multiple misclassified instances: 29 as black rice, 28 as brown rice, 12 as scattered rice and 5 as oat rice.
The model demonstrated the highest classification accuracy of 97.4% for red rice and the lowest classification accuracy of 82.3% for scattered black soil pearl rice.
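The reported sensitivities can be checked directly from the counts quoted above, since sensitivity is the number of correct classifications divided by the total number of instances of that class. The sketch below uses only the test-set counts given in the text. Oat rice is left out because only its main misclassifications are listed.

```python
# (correct, misclassified) test-set counts as quoted in the text.
reported_counts = {
    "red rice": (451, 9),
    "scattered rice": (429, 55),
    "black rice": (397, 58),
    "brown rice": (382, 30 + 29 + 4),
    "scattered black soil pearl rice": (380, 29 + 28 + 12 + 5),
}

def sensitivity(correct, missed):
    """Recall: true positives / (true positives + false negatives)."""
    return correct / (correct + missed)

sensitivities = {
    name: round(100 * sensitivity(correct, missed), 1)
    for name, (correct, missed) in reported_counts.items()
}
for name, value in sensitivities.items():
    print(f"{name}: {value}%")
```

The computed values (98.0%, 88.6%, 87.3%, 85.8%, and 83.7%) agree with the sensitivities reported for these five classes.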
This study demonstrates LIBS as a powerful tool for ensuring food quality in agriculture, and paves the way for using LIBS to detect heavy metal contamination in food samples. Six rice varieties (red rice, oat rice, brown rice, scattered black soil pearl rice, black rice, and scattered rice) were analyzed using LIBS. The resulting spectra were compared to a reference database (27) to identify the elements present. The analysis revealed variations of Mg, K, Mn, Na, and Ca across the samples, along with common atmospheric elements like C, H, N, and O. PCA was performed on the LIBS data to condense the complex spectra into a visually interpretable space and to identify trends, outliers, and clusters based on elemental composition. PCA produced a lower-dimensional score plot showing clear separations of the six rice samples based on their elemental profiles. Next, a BP-ANN was trained using the PCA scores derived from the LIBS data. The trained model reached a prediction accuracy of 88.68% on test data and 92.9% on train data, with the highest classification accuracy of 97.4% for red rice, followed by scattered rice (92.7%), oat rice (91.3%), black rice (85.7%), brown rice (82.7%), and scattered black soil pearl rice (82.3%). This method has the potential to be applied to other crops, such as various beans and grains, and even to processed food items like flour.
This work was supported by the National Natural Science Foundation of China (Grant No. 62375136).
Asiri Iroshan, Jun Feng, Boyuan Han, Ruoyu Zhai, Ziang Chen, and Yuzhu Liu are with the Jiangsu Key Laboratory for Optoelectronic Detection of Atmosphere and Ocean at Nanjing University of Information Science & Technology, in Nanjing, China. Iroshan, Feng, Han, Zhai, Chen, and Liu are also with Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology (CICAEET), in Nanjing, China. Xiangxue Li is with the Chengdu University of Technology, in Chengdu, China. Direct correspondence to Yuzhu Liu at yuzhu.liu@nuist.edu.cn