this post was submitted on 25 Sep 2023

Linguistics


without the filler:

Excavations have been taking place at Boğazköy-Hattusha for more than a century under the direction of the German Archaeological Institute (DAI).

Around 30,000 clay tablets have been found at the site to date, which have shed light on various aspects of life during the Hittite period, according to the Julius-Maximilians-Universität Würzburg. The tablets contain inscriptions in cuneiform—what is generally considered to be the oldest known writing system. Developed by the ancient Sumerians of Mesopotamia more than 5,000 years ago, cuneiform is a script that was used to write several languages of the ancient Near East.

Most of the inscriptions found at Boğazköy-Hattusha record the extinct Hittite language, which is the oldest attested member of the Indo-European family. Other languages, such as Luwian and Palaic, are also represented at the site.

However, excavations conducted this year, led by Professor Dr. Andreas Schachner of the DAI's Istanbul Department, unexpectedly uncovered a recitation in a previously unknown extinct language. The passage was embedded in a cuneiform tablet containing a ritual text written in Hittite. The Hittite ritual text refers to the lost tongue as the language of the land of Kalašma, an area that likely corresponds to where the towns of Bolu or Gerede in northern Turkey are located today.

"The new language was written in cuneiform," Schachner told Newsweek. "It is the same writing system the Hittites used. The text is part of a longer text starting in Hittite. As it continues it says at one point: 'Continue in the language of the Land [of] Kalašma.'"

"The Hittites were uniquely interested in recording rituals in foreign languages," Daniel Schwemer, head of the Chair of Ancient Near Eastern Studies at Julius-Maximilians-Universität Würzburg, said in a press release.

The recently discovered language remains largely incomprehensible. However, Professor Elisabeth Rieken with the Philipps University of Marburg, Germany, a specialist in Anatolian languages, has confirmed that the Kalasmaic tongue belongs to the Indo-European family, according to Julius-Maximilians-Universität Würzburg.

EDIT: a more readable article with some other details here - https://www.uni-wuerzburg.de/en/news-and-events/news/detail/news/new-indo-european-language-discovered/

[–] antonim@lemmy.dbzer0.com 7 points 1 year ago (2 children)

Putting aside all the other issues... How do you expect to train a large language model on what is probably one clay tablet of text?

[–] Tvkan@feddit.de 7 points 1 year ago* (last edited 1 year ago)

Dude the blockchain will literally fix it all. Sprinkle some federated protocol on there and it'll cure cancer by tomorrow.

[–] notfromhere@lemmy.one 2 points 1 year ago (1 children)

My bad, I thought it said 30,000 tablets were found in the unknown language.

[–] antonim@lemmy.dbzer0.com 2 points 1 year ago* (last edited 1 year ago) (2 children)

Ouch, ok.

Also, as far as I know, LLMs require parallel corpora (i.e. the same text in different languages) to learn to translate. Otherwise I don't see how they could establish connections between the languages.

[–] notfromhere@lemmy.one 2 points 1 year ago

Interesting. I wonder if we can eventually figure out how to train against an unknown language and map relationships to a known language without parallel corpora.

[–] notfromhere@lemmy.one 1 points 1 year ago

Wouldn’t similar patterns emerge that could be correlated, using alignment techniques, to approximate which symbols match?
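
For what it's worth, the simplest version of that idea can be sketched as frequency-rank matching: count how often each symbol occurs in the unknown text and in a known one, then pair them up by rank. This is only a toy, and it assumes things that almost certainly don't hold for Kalašma: that the two texts say the same thing under a one-to-one symbol substitution, and that there is enough text for the counts to mean anything. The function names and sample strings below are made up for illustration, and this is not how an LLM learns to translate.

```python
from collections import Counter


def rank_by_frequency(symbols):
    """Return the distinct symbols, most frequent first (ties broken by symbol)."""
    counts = Counter(symbols)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [symbol for symbol, _ in ordered]


def match_by_frequency_rank(unknown, known):
    """Guess a symbol mapping by pairing the n-th most frequent symbols of each text."""
    return dict(zip(rank_by_frequency(unknown), rank_by_frequency(known)))


# Toy data (hypothetical): a "known" text and the same text under a secret substitution.
known = "aaaaabbbbcccdde"
secret_key = {"a": "X", "b": "Q", "c": "Z", "d": "M", "e": "K"}
unknown = "".join(secret_key[ch] for ch in known)

guess = match_by_frequency_rank(unknown, known)
print(guess)
```

With a single tablet the counts would be far too noisy for this to work, and two different languages don't share a one-to-one symbol mapping in the first place, so antonim's point about needing much more data still stands.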