5:15 PM - 7:15 PM
[STT41-P09] Research on improving the effectiveness of data ubiquitous on the Internet
- Automatic generation of disaster training scenarios using generative AI -

Keywords: disaster training scenarios, generative AI, NER model, BERT
1. Introduction
We aim to increase the value of information by integrating and accumulating sensor data, action histories, location information, and SNS posts, and by extracting and distributing the useful parts. As one application, we are studying the automatic generation of disaster training scenarios, using generative AI to produce realistic and detailed scenarios. The generated scenarios will be compared with the training scenarios used by local governments to evaluate their effectiveness.
2. Proposed Method
2.1 Flow of the Proposed Method
This paper proposes a method for building a disaster database by extracting useful information from collected text with a BERT model fine-tuned for the named entity recognition (NER) task. We also propose a method for creating training scenarios by inputting the resulting disaster database, together with training scenarios used by actual municipalities, into a generative AI.
The proposed method consists of steps (1) through (4) below. The overall flow is shown in Figure 1.
(1) Collection of disaster information from SNS and Web news
In this study, a disaster database was created for past disasters that occurred in Chiba Prefecture. Data were collected with Octoparse, and about 6,000 items related to “disaster” were obtained from SNS and NHK news sites. As preprocessing, unnecessary characters were removed and full-width numbers were converted to half-width characters.
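The preprocessing step can be illustrated with a minimal sketch. The abstract does not say which characters count as "unnecessary", so the URL and @mention patterns below are assumptions, and the function name preprocess is hypothetical.

```python
import re

# Assumption: "unnecessary characters" means URLs, @mentions and extra whitespace.
# The abstract does not list the actual rules that were applied.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")

# Map full-width digits (U+FF10 to U+FF19) to half-width ASCII digits.
FULLWIDTH_DIGITS = str.maketrans("０１２３４５６７８９", "0123456789")


def preprocess(text: str) -> str:
    """Remove assumed noise and convert full-width digits to half-width."""
    text = URL_RE.sub("", text)
    text = MENTION_RE.sub("", text)
    text = text.translate(FULLWIDTH_DIGITS)
    text = re.sub(r"\s+", " ", text)  # collapse whitespace runs
    return text.strip()
```

Using str.translate limits the conversion to digits only. A broader normalization such as Unicode NFKC would also alter other characters, which may or may not be what the authors intended.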
(2) Methodology for creating a disaster information extraction model
To extract useful information from SNS posts and news articles, we fine-tuned the pre-trained Tohoku BERT model to create a model specialized for named entity extraction. Training data were created from the roughly 6,000 collected items, and the extraction model was built from them.
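A token-classification NER model typically emits one tag per token, and these tags must be merged into entity strings. The sketch below assumes a BIO tagging scheme and hypothetical label names (PLACE, DISASTER); the abstract specifies neither, so this illustrates the general technique rather than the authors' implementation.

```python
from typing import List, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge per-token BIO tags into (label, text) entity pairs.

    Tokens are joined without spaces, which suits Japanese text.
    Subword markers such as "##" are not handled in this sketch.
    """
    entities: List[Tuple[str, str]] = []
    label = None
    buffer: List[str] = []

    def flush() -> None:
        # Store the entity collected so far, if any.
        if label is not None:
            entities.append((label, "".join(buffer)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            label, buffer = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            buffer.append(token)
        else:
            # "O" tags and I- tags that do not continue the open entity
            # close it; a stray I- token itself is dropped.
            flush()
            label, buffer = None, []
    flush()
    return entities
```

For example, the tokens ["千葉", "県", "で", "台風"] with the tags ["B-PLACE", "I-PLACE", "O", "B-DISASTER"] decode to one PLACE entity and one DISASTER entity.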
(3) Creating a disaster database from extracted information
Using the extraction model created in (2), information on “time, place name, disaster type, victims, and damage details” was extracted. Specific details such as the number of victims and the scale of damage, which were considered useful for generating training scenarios, were organized into a disaster database.
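One way to store the extracted fields is a single relational table. The sketch below uses Python's built-in sqlite3 module; the table name, column names, and the sample record are hypothetical, since the abstract does not describe the database schema or storage engine.

```python
import sqlite3

# Hypothetical schema mirroring the five extracted items.
SCHEMA = """
CREATE TABLE IF NOT EXISTS disaster_events (
    id             INTEGER PRIMARY KEY AUTOINCREMENT,
    occurred_at    TEXT,
    place_name     TEXT,
    disaster_type  TEXT,
    victims        INTEGER,
    damage_details TEXT
)
"""


def create_database(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def insert_event(conn: sqlite3.Connection, record: dict) -> int:
    """Insert one extracted record; missing fields are stored as NULL."""
    cur = conn.execute(
        "INSERT INTO disaster_events "
        "(occurred_at, place_name, disaster_type, victims, damage_details) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            record.get("time"),
            record.get("place_name"),
            record.get("disaster_type"),
            record.get("victims"),
            record.get("damage_details"),
        ),
    )
    conn.commit()
    return cur.lastrowid


# Made-up example record, for illustration only.
conn = create_database()
row_id = insert_event(conn, {
    "time": "2000-01-01 12:00",
    "place_name": "Example City",
    "disaster_type": "flood",
    "victims": 3,
    "damage_details": "road closed",
})
```

Allowing NULL values lets a record be stored even when the extraction model fails on some items, which matters given the lower accuracy reported for damage details and victim counts.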
(4) Automatic generation of training scenarios
The disaster database created in (3) and some of the scenarios used in actual training were input into the generative AI to generate scenarios automatically. Figure 2 shows an example of a generated training scenario.
3. Evaluation of the Proposed Method and Disaster Database
To create an accurate disaster database, the performance of the extraction model was evaluated. Information was extracted for five items (location, date-time, type of disaster, nature of damage, and number of victims), and the percentage of correct extractions was calculated by visual inspection. Figure 3 shows the number of successful and failed extractions for each item over 100 test items.
We plan to evaluate the generated training scenarios by comparing them with training scenarios actually used by municipalities, but establishing the evaluation method remains future work.
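The per-item tally behind a chart like Figure 3 can be sketched as follows. The function name tally and the input format (pairs of item name and a manually judged correctness flag) are assumptions, and the sample judgments are toy values, not the reported results.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def tally(judgments: Iterable[Tuple[str, bool]]) -> Dict[str, dict]:
    """Count successes and failures per item and compute the accuracy.

    judgments: (item_name, is_correct) pairs from manual checking.
    """
    success: Counter = Counter()
    total: Counter = Counter()
    for item, is_correct in judgments:
        total[item] += 1
        if is_correct:
            success[item] += 1
    return {
        item: {
            "success": success[item],
            "failure": total[item] - success[item],
            "accuracy": success[item] / total[item],
        }
        for item in total
    }


# Toy judgments for illustration only.
sample = [
    ("location", True),
    ("location", True),
    ("location", False),
    ("number of victims", False),
]
result = tally(sample)
```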
4. Conclusion
In this paper, we created a disaster database and studied the automatic generation of training scenarios using a generative AI, but two issues remain. The first is to improve the accuracy of the disaster information extraction model. Time, place, and type of disaster can be extracted, but the accuracy for damage details and the number of victims is not yet at a practical level, and the model appears unable to handle the variety of expressions used. The second is to establish a method for evaluating the generated scenarios.
In the future, we will improve the extraction model by raising the quality of the training data, with the aim of constructing a more accurate disaster database. We will also establish evaluation methods and improve the scenarios with a view to practical use by local governments.