Using THz-TDS and Machine Learning to Identify Wheat Gluten Types

A recent study from China explored a new, non-destructive method combining terahertz time-domain spectroscopy (THz-TDS) and machine learning to accurately classify wheat gluten strength.

A new study led by Yin Shen of Chongqing Medical University proposes a method for analyzing wheat gluten quality that combines terahertz time-domain spectroscopy (THz-TDS) with machine learning (ML). Published in Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, the research has implications for the agricultural and food industries because it offers a non-destructive, efficient, and accurate way to differentiate wheat based on gluten strength (1).

Ripening wheat in an agricultural field | Image Credit: © Gajus - stock.adobe.com

Wheat ranks third among U.S. field crops in planted acreage (2). Three main classes of wheat are produced in the United States: winter wheat, spring wheat, and durum wheat (3). Wheat varieties are also generally classified as high-gluten, medium-gluten, or low-gluten. Gluten strength is a critical determinant of wheat’s suitability for various culinary and industrial applications, such as bread making, pasta production, and pastry preparation (1). However, gluten analysis is difficult and time-consuming to conduct properly. As a result, there is demand for new technologies that accelerate the process (1).

In this study, the research team used THz-TDS to collect spectral data from wheat samples. The data were then processed to derive frequency-domain spectra, refractive index spectra, and absorption coefficient spectra (1). Notably, the refractive index spectra differed significantly among wheat samples with varying gluten strengths, making them the primary focus for further analysis (1).
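The article does not describe how the authors derived these spectra, so the following Python sketch is only an illustration of one common approach: the thick-sample approximation, in which a pulse measured through the sample is compared with a reference pulse measured without it. The function name `optical_constants`, the synthetic Gaussian pulse, and the values of thickness, refractive index, and absorption coefficient are all assumptions made for the demonstration, not values from the study.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum, m/s


def optical_constants(t, e_ref, e_sam, thickness):
    """Estimate n(f) and alpha(f) from reference and sample THz pulses.

    Thick-sample approximation with an air reference (an assumed method):
      n(f)     = 1 - c * phase(H) / (2*pi*f*d)
      alpha(f) = (2/d) * ln(4n / (|H| * (n+1)^2))
    where H = FFT(sample) / FFT(reference) and d is the thickness in metres.
    The minus sign follows NumPy's FFT convention, in which a delayed
    pulse acquires a negative phase.
    """
    dt = t[1] - t[0]
    freqs = np.fft.rfftfreq(len(t), dt)
    h = np.fft.rfft(e_sam) / np.fft.rfft(e_ref)  # complex transfer function
    phase = np.unwrap(np.angle(h))
    omega = 2 * np.pi * freqs
    # The zero-frequency bin divides by zero; it is left as NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        n = 1.0 - C * phase / (omega * thickness)
        alpha = (2.0 / thickness) * np.log(
            4.0 * n / (np.abs(h) * (n + 1.0) ** 2)
        )
    return freqs, n, alpha


# Synthetic demonstration with hypothetical values.
t = np.arange(4096) * 0.05e-12                      # 0.05 ps time steps
e_ref = np.exp(-((t - 10e-12) / 0.3e-12) ** 2 / 2)  # Gaussian reference pulse
n_true, alpha_true, d = 1.8, 500.0, 1e-3            # alpha in 1/m, d in m

# Build the "sample" pulse by delaying and attenuating the reference.
f = np.fft.rfftfreq(len(t), t[1] - t[0])
fresnel = 4 * n_true / (n_true + 1) ** 2            # two-interface transmission
spectrum = (
    np.fft.rfft(e_ref)
    * fresnel
    * np.exp(-alpha_true * d / 2)
    * np.exp(-1j * 2 * np.pi * f * (n_true - 1) * d / C)
)
e_sam = np.fft.irfft(spectrum, n=len(t))

freqs, n, alpha = optical_constants(t, e_ref, e_sam, d)
band = (freqs >= 0.1e12) & (freqs <= 1.5e12)        # 0.1 to 1.5 THz
print("mean n in band:", n[band].mean())
print("mean alpha in band (1/m):", alpha[band].mean())
```

Real measurements would also need windowing, handling of echoes inside the sample, and a check that the signal stays above the noise floor across the band, none of which this sketch attempts.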

The researchers then used a variable selection technique before applying machine learning algorithms to their data set. For this study, the team used competitive adaptive reweighted sampling (CARS) to identify characteristic frequencies within the 0.1 to 1.5 THz range (1). These frequencies served as the foundation for constructing machine learning models to classify wheat gluten strength (1).

The researchers tested four machine learning models: support vector machines (SVM), back propagation neural networks (BPNN), improved convolutional neural networks (Improved CNN), and sparrow-algorithm-optimized support vector machines (SSA-SVM). Of these, SSA-SVM was the most effective: it was the only model to achieve a 100% accuracy rate, as validated by a confusion matrix (1). The model’s high stability and efficiency underscore its potential for industrial applications in grain quality control.
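For readers unfamiliar with the term, a confusion matrix tabulates true classes against predicted classes, and accuracy is the fraction of samples on its diagonal. The Python sketch below shows the calculation on made-up labels. The class names, the label vectors, and the helper functions are hypothetical and do not reproduce any data from the study.

```python
import numpy as np

CLASSES = ["low-gluten", "medium-gluten", "high-gluten"]  # hypothetical labels


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for true_label, predicted_label in zip(y_true, y_pred):
        cm[true_label, predicted_label] += 1
    return cm


def accuracy(cm):
    """Share of samples on the diagonal, i.e. correctly classified."""
    return np.trace(cm) / cm.sum()


# Made-up test labels: three samples per class.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred_perfect = list(y_true)                 # every sample correct
y_pred_flawed = [0, 0, 1, 1, 1, 1, 2, 2, 1]   # two samples misclassified

cm_perfect = confusion_matrix(y_true, y_pred_perfect, len(CLASSES))
cm_flawed = confusion_matrix(y_true, y_pred_flawed, len(CLASSES))

print(cm_perfect, accuracy(cm_perfect))
print(cm_flawed, accuracy(cm_flawed))
```

A 100% accuracy rate corresponds to a matrix whose off-diagonal entries are all zero, as in the first case here.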

The study faced several challenges. The spectral data were difficult to work with because of their high dimensionality and volume. The researchers addressed this issue by evaluating three spectral feature extraction methods: principal component analysis (PCA), uninformative variable elimination (UVE), and the CARS algorithm. As mentioned above, the researchers chose CARS because of its ability to perform adaptive weighted sampling and meet the demands of efficient multi-objective optimization (1).
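To illustrate what reducing high-dimensional spectra looks like in practice, here is a minimal Python sketch of PCA, one of the three methods the team evaluated but not the one they selected. It does not implement CARS. The data are randomly generated, and the function name `pca_reduce`, the array sizes, and the noise level are assumptions chosen for the example.

```python
import numpy as np


def pca_reduce(spectra, n_components):
    """Project spectra (samples x frequency points) onto the leading
    principal components and report the variance each one explains."""
    centered = spectra - spectra.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained = singular_values**2 / np.sum(singular_values**2)
    return scores, explained[:n_components]


# Hypothetical data: 30 spectra of 200 points driven by 2 hidden factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(30, 2))
loadings = rng.normal(size=(2, 200))
spectra = latent @ loadings + 0.01 * rng.normal(size=(30, 200))

scores, explained = pca_reduce(spectra, n_components=2)
print("reduced shape:", scores.shape)
print("explained variance ratios:", explained)
```

PCA builds new variables as mixtures of all frequencies, whereas variable selection methods such as UVE and CARS keep a subset of the original frequencies, which is why the selected variables can be read as characteristic frequencies.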

Despite these challenges, the researchers showed that the non-destructive nature of THz-TDS leaves samples intact for further use, while the high accuracy of the SSA-SVM model enables reliable differentiation among wheat varieties (1). This has implications for industrial-scale wheat processing because the method could streamline quality control, enhance product consistency, and reduce production costs, while completing gluten analysis much more quickly.

Wheat production remains important for both health and economic reasons. Wheat is used to make many food items that American (and global) consumers rely on. Winter wheat remains the most commonly grown wheat class in the United States, making up 70% of all wheat production (3). However, because of the extreme winter weather in northern states like Minnesota, Montana, North Dakota, and South Dakota, all of which are significant wheat producers, spring and durum varieties of wheat are expected to increase (3).

This research by Yin Shen and colleagues demonstrates the power of combining THz-TDS with machine learning to revolutionize wheat quality analysis. With its potential to transform industrial practices, this method represents a step forward in the quest for smarter, more efficient agricultural solutions.

References

  1. Peng, S.; Wei, S.; Zhang, G.; et al. Discrimination of Wheat Gluten Quality Utilizing Terahertz Time-domain Spectroscopy (THz-TDS). Spectrochimica Acta Part A: Mol. Biomol. Spectrosc. 2025, 328, 125452. DOI: 10.1016/j.saa.2024.125452
  2. U.S. Department of Agriculture, Overview of Wheat. USDA.gov. Available at: https://www.ers.usda.gov/topics/crops/wheat/#:~:text=Wheat%20ranks%20third%20among%20U.S.,Agricultural%20Supply%20and%20Demand%20Estimates. (accessed 2025-01-03).
  3. U.S. Department of Agriculture, Wheat Sector at a Glance. USDA.gov. Available at: https://www.ers.usda.gov/topics/crops/wheat/wheat-sector-at-a-glance/ (accessed 2025-01-03).