Written by: Julia Donovan
Edited by: Madison Fitzgerald, Kapil Shrawankar, Nick Janne, Ryan Schildcrout
When Mount Vesuvius erupted in 79 CE, it destroyed the city of Pompeii while simultaneously preserving it under volcanic ash. Among the items excavated in 1752 was a collection of 1,800 scrolls from the nearby city of Herculaneum. The Herculaneum Scrolls are the only known large-scale library to survive from classical antiquity. Few classical works have outlasted the period (Sophocles wrote 120 plays, but only 7 remain), so there is hope that these scrolls contain unknown works. Some researchers argue that only the best works from antiquity had a chance at survival, meaning the 7 surviving plays of Sophocles were his most popular ones. Evidence for this theory includes the fact that the Iliad was the most copied poem during antiquity, with many private manuscripts of the poem surviving to this day. Other experts argue that the survival of classical works is purely due to chance, supported by the fact that the poems of Catullus survive in only one manuscript. Similarly, the works of Sappho, which were highly regarded in her own time, exist only in fragments. Decoding the contents of the surviving scrolls could add substantially to the body of classical works. However, previous attempts to open the scrolls have led to their destruction. The eruption carbonized the scrolls, leaving the papyrus brittle and prone to breaking, and the ink often fades when exposed to air. As a result, only about 1,000 of the Herculaneum Scrolls remain intact. Researchers began to wonder whether there was a way to see inside the scrolls without opening them, and whether artificial intelligence (AI) could then help decode what was written. In 2023, AI aided in the discovery of the first word from a Herculaneum scroll. The path to this monumental step built on the AI research of many different labs.
With the emergence of AI, custom machine learning models have been developed for a variety of tasks, including parsing ancient inscriptions. In 2022, an AI model named Ithaca was developed to restore unreadable ancient Greek inscriptions. Ithaca was trained on 178,551 transcribed and decoded inscriptions, which are defined as words or phrases written on durable material, such as stone or pottery. Once Ithaca had learned the patterns of the language through repeated exposure to its grammar, syntax, and semantics, researchers gave the model inscriptions that were particularly difficult to read or partially destroyed. Ithaca weighed each character and word based on its surrounding context (other nearby words, style of writing, etc.) to infer what the inscription said. Ithaca was also trained to estimate the period and geographical origin of a text. The success of AI in reconstructing missing pieces of inscriptions led researchers to believe that AI could also read the Herculaneum Scrolls. However, the Herculaneum Scrolls presented another challenge: researchers still had to figure out how to unroll the scrolls without damaging them.
A research team at the University of Kentucky led by Brent Seales postulated that they could use computer imaging to see inside the scrolls. They managed to read ancient scrolls using an algorithm they refer to as virtual unwrapping. They used X-ray micro-computed tomography (micro-CT), which sees inside an object by taking consecutive 2D projections and combining them into a high-resolution 3D reconstruction that can then be virtually unfolded. Seales’ algorithm performed three tasks: segmentation, texturing, and flattening. In segmentation, the unevenly shaped scroll is virtually rebuilt into a symmetric shape to help the computer distinguish between the different layers or pages of the scroll. Texturing assigns an intensity, or brightness, value to each point. In micro-CT, brightness corresponds to density: brighter spots mark denser material, so the brighter the spot, the more likely it is to contain ink. Finally, flattening (a virtual unfolding) allowed Seales and his team to convert the 3D models of the scrolls into readable 2D images. This method succeeded with other ancient scrolls. However, when Seales and his team attempted it on the Herculaneum Scrolls, they encountered another issue. Unlike other ancient scrolls that featured metal-based inks, the Herculaneum Scrolls appeared to contain a carbon-based ink. Metal-based inks are dense enough to show up brightly on micro-CT, but carbon-based inks appear far less bright when scanned, so spots with ink look nearly identical to the raw papyrus. Researchers had suspected that the ink on the Herculaneum Scrolls was carbon-based because it fades quickly on exposure to air. Their hypothesis was confirmed when the ink failed to show up with micro-CT. The new algorithm Seales and his team developed could segment and flatten the scrolls but could not identify the ink.
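The texturing and flattening steps can be illustrated with a toy example. The sketch below is not Seales’ algorithm. It assumes segmentation has already located the rolled layer (modeled here as a hypothetical Archimedean spiral), it uses made-up intensity values for a dense, metal-based ink, and every function name and parameter is invented for illustration.

```python
import math

SCALE = 20                 # voxels per unit length (assumed toy resolution)
PAPYRUS, INK = 0.5, 0.9    # made-up intensities; INK mimics a dense, metal-based ink

def surface_point(i, a=1.0, b=0.5, dtheta=0.1):
    """Voxel (x, y) of the i-th sample along the rolled layer.

    The layer is assumed to follow the spiral r = a + b * theta. In a real
    pipeline, segmentation would have to find this surface in the scan.
    """
    theta = i * dtheta
    r = a + b * theta
    return round(SCALE * r * math.cos(theta)), round(SCALE * r * math.sin(theta))

def make_toy_volume(ink_pattern):
    """Build a sparse 3D 'scan' (voxel -> intensity) with writing on the rolled sheet."""
    volume = {}
    for z, row in enumerate(ink_pattern):
        for i, ch in enumerate(row):
            x, y = surface_point(i)
            volume[(x, y, z)] = INK if ch == "#" else PAPYRUS
    return volume

def unwrap(volume, width, height):
    """Texturing and flattening: sample the volume along the surface into a 2D image."""
    return [[volume.get((*surface_point(i), z), 0.0) for i in range(width)]
            for z in range(height)]

def render(image, threshold=0.7):
    """Show bright (ink-like) pixels as '#' and everything else as '.'."""
    return ["".join("#" if v > threshold else "." for v in row) for row in image]

pattern = ["#.#.###", "###..#.", "#.#.###"]   # what was 'written' before rolling
volume = make_toy_volume(pattern)
flat = unwrap(volume, width=7, height=3)
print("\n".join(render(flat)))
```

In this toy, the flattened image recovers the written pattern because the ink is much brighter than the papyrus. If INK were set equal to PAPYRUS, roughly the situation with carbon-based ink, the same code would return a blank page.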
Consequently, they shifted their focus to obtaining the highest possible resolution X-ray scans, believing that AI might be able to detect subtle differences between the weathered carbon ink and the papyrus. Using high-energy X-rays, Seales obtained high-resolution scans of two full scrolls and multiple fragments. Given the sheer number of scrolls, Seales and his team asked the public for help. Thus, the Vesuvius Challenge was born: researchers from Seales’ group uploaded the scans and let the public determine a way to program AI to differentiate between ink and papyrus. In August 2023, Casey Handmer described the ‘crackle’ technique on his blog. Handmer noticed that scans of the scrolls had dark and light channels. He hypothesized that the dark channels contained ink (see image below).
Left: scan of papyrus showing dark and light channels (‘crackles’). Right: annotated ink locations. Source: Casey Handmer’s blog
This approach of focusing on the dark channels, or ‘crackles,’ became known as the crackle technique. Using it, another Vesuvius Challenge contestant, Luke Farritor, began training AI to detect and classify dark channels. Although the channels are difficult to see with the human eye, the AI was able to classify them as letters. In October of 2023, Farritor discovered 13 letters, among them the word “ΠΟΡΦΥΡΑϹ,” which translates to “purple.” A third contestant, Youssef Nader, discovered two words, ανυοντα (“achieving”) and Ομοιων (“similar”), using his own technique to identify ink. Nader didn’t focus on the crackle but instead identified what appeared to be letters and labeled them. He repeated this process and then developed a method based on his labeling to identify the inked letters. Seales remains hopeful that the first complete scrolls will be deciphered in 2024. In early February of 2024, Farritor and Nader received $700K for their work on the scrolls, having decoded four passages with at least 85% of the characters legible. On X, Elon Musk announced his intention to donate an unspecified amount to help decode the Herculaneum Scrolls. Nat Friedman, a sponsor of the Vesuvius Challenge, later confirmed the amount to be $2.1 million.
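The idea behind the ink-detection work, that a program can learn to separate ink from papyrus using texture even when brightness is no help, can be shown with a toy example. The sketch below is not Farritor’s or Nader’s model. It assumes synthetic one-dimensional strips with made-up intensities, in which ink adds an alternating ‘crackle’ that leaves the average brightness unchanged, and it learns a single variance cutoff from labeled examples. All names are invented for illustration.

```python
import random

def make_patch(has_ink, rng, size=16):
    """Synthetic strip of scan intensities (made-up numbers, not real scan data)."""
    patch = []
    for k in range(size):
        value = 0.5 + rng.uniform(-0.01, 0.01)      # papyrus with faint noise
        if has_ink:
            value += 0.1 if k % 2 == 0 else -0.1    # alternating 'crackle', no net brightness change
        patch.append(value)
    return patch

def mean(values):
    return sum(values) / len(values)

def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def fit_threshold(patches, labels):
    """Learn a texture cutoff halfway between the two labeled classes."""
    ink = [variance(p) for p, y in zip(patches, labels) if y]
    bare = [variance(p) for p, y in zip(patches, labels) if not y]
    return (max(bare) + min(ink)) / 2

rng = random.Random(0)

# Labeled training strips, standing in for hand-labeled regions of a scan.
train_labels = [k % 2 == 0 for k in range(40)]
train = [make_patch(y, rng) for y in train_labels]
threshold = fit_threshold(train, train_labels)

# Unseen strips: classify by texture, then compare against the true labels.
test_labels = [k % 3 == 0 for k in range(30)]
test = [make_patch(y, rng) for y in test_labels]
predicted = [variance(p) > threshold for p in test]
accuracy = mean([p == y for p, y in zip(predicted, test_labels)])

# Average brightness barely differs between ink and bare papyrus.
brightness_gap = abs(
    mean([mean(p) for p, y in zip(test, test_labels) if y])
    - mean([mean(p) for p, y in zip(test, test_labels) if not y])
)
print(f"accuracy={accuracy:.2f}, brightness gap={brightness_gap:.4f}")
```

Real scans are three-dimensional and far noisier, and the contestants trained machine learning models rather than a single cutoff, but the principle is the same: labeled examples teach the program which subtle patterns indicate ink.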
The use of AI has made it possible for researchers to read ancient scrolls without damaging them. Ithaca showed that it was possible to train AI to read texts once thought to be damaged beyond repair, and Seales’ virtual unwrapping and high-resolution scans made it possible to look inside the Herculaneum Scrolls without opening them. Although parsing the scrolls is slow work, even gleaning a few words is a major advance. Given the small number of classical works that survive, researchers hope that the Herculaneum Scrolls will offer more insight into the lives and thoughts of ancient people.
Julia Donovan is a first-year PhD student in the Chemistry Program at the University of Michigan. This is her first time writing for MiSciWriters.

