Japan Geoscience Union Meeting 2023

Presentation Information

[J] Online Poster Presentation

Session Symbol M (Multidisciplinary and Interdisciplinary) » M-IS Intersection

[M-IS22] History × Earth and Planetary Science

Monday, May 22, 2023, 09:00 - 10:30, Online Poster Zoom Room (2) (Online Poster)

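Conveners: Yasuyuki Kano (Earthquake Research Institute, The University of Tokyo), Kei Yoshimura (Institute of Industrial Science, The University of Tokyo), Kiyomi Iwahashi (Kokugakuin University), Harufumi Tamazawa (Kyoto City University of Arts)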

On-site poster presentation date and time (2023/5/21 17:15-18:45)

09:00 〜 10:30

[MIS22-P04] Open Data Generated by "Minna de Honkoku"

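*Yasuyuki Kano1,2, Yuta Hashimoto3 (1. Earthquake Research Institute, The University of Tokyo; 2. Collaborative Research Organization for Historical Materials on Earthquakes and Volcanoes, The University of Tokyo; 3. National Museum of Japanese History)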

Keywords: Minna de Honkoku, citizen participation, open data

"Minna de Honkoku" (https://honkoku.org/) is a crowdsourced, online collaborative project to transcribe historical materials written in old Japanese. It was launched as an online citizen science project to transcribe earthquake-related historical materials held by the Earthquake Research Institute Library, the University of Tokyo. In July 2019, the "Minna de Honkoku" system was upgraded to support IIIF (International Image Interoperability Framework), so that a broader range of manuscripts in digital archives that adopt IIIF can be registered for transcription. The scope of the project was extended from earthquake-related materials to a wide variety of historical materials. AI-assisted transcription was also implemented. More than 3,500 documents are registered in the system, and the total number of characters transcribed is about 28 million.
The transcribed text data are shared under a Creative Commons license (CC BY-SA). The data are used, for example, to edit bibliographic information at libraries, museums, and similar institutions. The text is also used to publish e-books that translate classical literature. An NDL Lab experiment on OCR text conversion of digitized materials of the National Diet Library, Japan (NDL) used the transcribed text of "Minna de Honkoku." NDL Lab has also published the OCR training dataset.