Classifying Ink Using Various Spectral Approaches


Scientists from the University of Granada (Spain) recently compared how effective hyperspectral imaging (HSI) and machine learning (ML) methods are in classifying ink found in historical documents. Their findings were published in Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy (1).

Feather and ink bottle isolated on paper background | Image Credit: © Sergey Yarochkin - stock.adobe.com

Identifying the materials used in tangible cultural heritage is vital for selecting appropriate restoration and preservation strategies. Analyzing inks in manuscripts and historical documents can enrich the understanding of their artistic and historical context, supporting efforts to date documents, determine authorship, detect falsifications or undocumented restorations, and identify causes of deterioration. Ink analysis is therefore key for codicologists and historians looking to explore the content and material composition of manuscripts.

To obtain compositional information while preserving an object’s integrity and value, non-invasive analytical techniques are predominantly used. The most widely used are X-ray fluorescence (XRF), X-ray diffraction (XRD), Fourier transform infrared (FTIR) spectroscopy, and Raman spectroscopy. Recently, however, hyperspectral imaging (HSI) has gained prominence in this field. Combining spectroscopy and spatial imaging, HSI captures images at many wavelengths and records the spectral reflectance at each pixel. The result is a hypercube of three-dimensional data: two spatial coordinates plus a spectrum for every pixel of the image. According to the researchers, HSI’s primary advantage over other methods is its ability to provide spatial information, which makes it possible to map how materials are distributed within a document. That capability is critical for historical studies and conservation evaluation (2). Additionally, its non-contact and rapid data acquisition make it suitable for on-site analysis of historical artifacts at locations such as museums or libraries.
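
To make the hypercube idea concrete, the following is a minimal sketch using NumPy. The dimensions and the random values are assumptions chosen only for illustration; they are not data from the study.

```python
import numpy as np

# Hypothetical dimensions: a 4 x 5 pixel image with 10 spectral bands.
rows, cols, bands = 4, 5, 10

# Synthetic reflectance values in [0, 1); not real measurements.
rng = np.random.default_rng(0)
hypercube = rng.random((rows, cols, bands))

# The spectrum recorded at a single pixel (row 2, column 3).
pixel_spectrum = hypercube[2, 3, :]

# The grayscale image captured at a single wavelength band (band index 6).
band_image = hypercube[:, :, 6]

# Flatten the two spatial axes so each row is one pixel's spectrum.
# This (n_pixels, n_bands) layout is a common input shape for classifiers.
pixels = hypercube.reshape(-1, bands)

print(hypercube.shape, pixel_spectrum.shape, band_image.shape, pixels.shape)
```

Slicing along the last axis gives a spectrum, while slicing along the first two gives an image, which is what lets HSI tie a material's spectral signature to its location on the page.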

While HSI has its advantages, the researchers state that no previous studies had investigated the automatic classification of historical inks using machine learning (ML) and HSI data. For this study, six supervised classification models were trained and validated to automatically classify three types of inks: (1) pure metallo-gallate inks (MGP); (2) carbon-containing inks (CC), which include pure carbon-based inks like ivory black or bone black, as well as mixtures of carbon-based and metallo-gallate or sepia inks; and (3) non-carbon-containing inks (NCC), which can be pure sepia or a mixture of MGP and sepia. The six models comprised five traditional algorithms (Support Vector Machines [SVM], K-Nearest Neighbors [KNN], Linear Discriminant Analysis [LDA], Random Forest [RF], and Partial Least Squares Discriminant Analysis [PLS-DA]) and one deep learning (DL)-based model. Principal component analysis (PCA) was also applied before classification, both to visualize the separability of the classes and to reduce dimensionality, and the team compared classification accuracy and running time with and without PCA.
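
As a rough illustration of the general workflow (dimensionality reduction followed by a supervised classifier), the sketch below applies PCA and then a k-nearest-neighbors vote, both written in plain NumPy. It is not the researchers' implementation: the spectra are synthetic curves invented for this example, and the function names, the number of components, and the value of k are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
n_bands, n_per_class = 50, 60
class_names = ["MGP", "CC", "NCC"]  # labels borrowed from the article

# Synthetic "spectra": one invented mean curve per class plus noise.
w = np.linspace(0.0, 1.0, n_bands)
class_means = [0.2 + 0.6 * w, 0.1 + 0.05 * w, 0.3 + 0.4 * w**2]
X = np.vstack([m + rng.normal(0.0, 0.02, (n_per_class, n_bands))
               for m in class_means])
y = np.repeat(np.arange(len(class_names)), n_per_class)

# Random train/test split (120 training spectra, 60 test spectra).
idx = rng.permutation(len(y))
train, test = idx[:120], idx[120:]

def fit_pca(X_train, n_components):
    """Return the training mean and the top principal axes (via SVD)."""
    mean = X_train.mean(axis=0)
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, vt[:n_components]

def transform(X_in, mean, components):
    """Project spectra onto the principal axes."""
    return (X_in - mean) @ components.T

def knn_predict(X_train, y_train, X_test, k=5):
    """Classify each test point by majority vote among its k nearest neighbors."""
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    votes = y_train[nearest]
    return np.array([np.bincount(v, minlength=len(class_names)).argmax()
                     for v in votes])

# Fit PCA on training data only, then classify in the reduced space.
mean, comps = fit_pca(X[train], n_components=3)
Z_train = transform(X[train], mean, comps)
Z_test = transform(X[test], mean, comps)
y_pred = knn_predict(Z_train, y[train], Z_test, k=5)
accuracy = (y_pred == y[test]).mean()
print(f"Test accuracy on synthetic data: {accuracy:.3f}")
```

Fitting PCA on the training spectra alone keeps information from the test set out of the projection. Reducing 50 bands to 3 components also shrinks the distance computations, which is the kind of running-time trade-off the study compared.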

Using both mock-up samples and historical documents, all models achieved micro-averaged accuracy above 90%. The best results came from the DL model, with micro- and macro-averaged accuracy and recall above 99%. Among the traditional models, SVM performed best, with all metrics above 95% and micro- and macro-averaged accuracy and recall above 97%. That said, neither of these two models achieved perfect results. As such, the choice between a traditional and a DL model can largely be based on the available computational resources and on how much a slight gain in accuracy matters.
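
For readers unfamiliar with the distinction, the sketch below shows one common way to compute micro- and macro-averaged recall from a confusion matrix. The counts are invented for illustration, and the article does not specify the exact metric definitions used in the paper, so this should be read as a general example rather than a reproduction of the study's evaluation.

```python
import numpy as np

# Hypothetical confusion matrix (rows: true class, columns: predicted class),
# in the order MGP, CC, NCC. The counts are invented, not from the study.
cm = np.array([
    [95,  3,  2],
    [ 1, 48,  1],
    [ 2,  2, 16],
])

true_pos = np.diag(cm)       # correctly classified samples per class
support = cm.sum(axis=1)     # number of true samples per class

# Macro average: compute recall per class, then take the unweighted mean.
per_class_recall = true_pos / support
macro_recall = per_class_recall.mean()

# Micro average: pool all samples before dividing, so large classes dominate.
micro_recall = true_pos.sum() / cm.sum()

print("Per-class recall:", np.round(per_class_recall, 3))
print(f"Macro recall: {macro_recall:.3f}")
print(f"Micro recall: {micro_recall:.3f}")
```

In this invented example the smallest class has the lowest recall, so the macro average falls below the micro average. Reporting both, as the study does, shows whether strong overall numbers hide weaker performance on a less common ink class.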

Future research will focus on more detailed classification, in which subclasses within the CC and NCC groups can be separated. Applying unmixing techniques could provide more interpretable analyses of individual components and their concentrations in mixtures than DL or ML approaches. Their effectiveness, however, will depend on the choice of mixing model, the accuracy of the extracted endmembers (spectra of pure components), and the availability of a comprehensive reference library.
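
The simplest mixing model is a linear one, in which a measured spectrum is treated as a weighted sum of endmember spectra. The sketch below illustrates that idea with an unconstrained least-squares fit in NumPy. The endmember curves, their names, and the abundances are invented for this example, and the article does not say which mixing model the researchers intend to use.

```python
import numpy as np

n_bands = 40
w = np.linspace(0.0, 1.0, n_bands)

# Hypothetical endmember spectra, one per column. These are invented curves
# standing in for pure components; they are not measured ink spectra.
endmember_names = ["metallo-gallate", "carbon", "sepia"]
endmembers = np.column_stack([
    0.2 + 0.6 * w,
    0.1 + 0.05 * w,
    0.3 + 0.4 * w**2,
])

# Build a noise-free mixed pixel from known (invented) abundances.
true_abundances = np.array([0.7, 0.0, 0.3])
mixed_spectrum = endmembers @ true_abundances

# Linear unmixing: find abundances a that minimize ||endmembers @ a - spectrum||.
# This unconstrained fit is the simplest variant; practical methods usually
# also enforce non-negative abundances that sum to one.
estimated, _, rank, _ = np.linalg.lstsq(endmembers, mixed_spectrum, rcond=None)

for name, value in zip(endmember_names, estimated):
    print(f"{name}: {value:.3f}")
```

The output is a set of per-component fractions rather than a single class label, which is why unmixing can be easier to interpret. The fit is only as good as the endmember matrix, which mirrors the article's point about endmember accuracy and reference libraries.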

References

(1) López-Baldomero, A. B.; Buzzelli, M.; Moronta-Montero, F.; Martínez-Domingo, M. Á.; Valero, E. M. Ink Classification in Historical Documents Using Hyperspectral Imaging and Machine Learning Methods. Spectrochim. Acta – A: Mol. Biomol. Spectrosc. 2025, 335, 125916. DOI: 10.1016/j.saa.2025.125916

(2) Catelli, E.; Randeberg, L. L.; Alsberg, B. K.; Gebremariam, K. F.; Bracci, S. An Explorative Chemometric Approach Applied to Hyperspectral Images for the Study of Illuminated Manuscripts. Spectrochim. Acta – A: Mol. Biomol. Spectrosc. 2017, 177, 69–78. DOI: 10.1016/j.saa.2017.01.015
