10:30 AM - 12:10 PM
[3Rin2-41] Official document simplification using neural machine translation approach
Keywords: Text simplification
Official documents are documents distributed at public facilities such as city halls, hospitals, and schools. They contain a great deal of information that is important for daily life. However, they are hard for non-native speakers to read because they use difficult vocabulary and expressions, so they need to be simplified. We attempt to simplify official documents with a machine translation approach. We use a parallel corpus that pairs each original document with three kinds of simplified versions: a literal translation, a free translation, and a summary. The simplified versions were written by 40 Japanese teachers. We apply several methods for low-resource machine translation, such as pre-trained embeddings and sharing the encoder, decoder, and output embeddings (tied embeddings). The results show that a Transformer using pre-trained embeddings and tied embeddings can simplify official documents despite the limited training data. The performance gains obtained with these low-resource methods also suggest that the Transformer would benefit more than other methods from additional training data.
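As a rough illustration of what tied embeddings mean, the sketch below uses one embedding table for the encoder input, the decoder input, and the output projection. This is only possible when source and target share a vocabulary, which is natural here because both sides are in the same language. It is a minimal pure-Python sketch and not the authors' implementation: the class `TiedEmbeddings`, the helpers `make_embedding` and `embedding_parameter_count`, and the sizes used are all hypothetical names and values chosen for illustration.

```python
import random


def make_embedding(vocab_size, dim, seed=0):
    """Randomly initialised embedding table (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(vocab_size)]


class TiedEmbeddings:
    """One table shared by encoder input, decoder input and output layer."""

    def __init__(self, vocab_size, dim, pretrained=None):
        # Assumption: pre-trained vectors, if given, simply replace the
        # random initialisation of the single shared table.
        if pretrained is not None:
            self.table = pretrained
        else:
            self.table = make_embedding(vocab_size, dim)

    def embed_source(self, token_ids):
        # Encoder-side lookup.
        return [self.table[i] for i in token_ids]

    def embed_target(self, token_ids):
        # Decoder-side lookup reads the same rows as the encoder.
        return [self.table[i] for i in token_ids]

    def output_logits(self, hidden):
        # Output projection reuses the table: one dot product per word.
        return [sum(h * w for h, w in zip(hidden, row))
                for row in self.table]


def embedding_parameter_count(vocab_size, dim, tied):
    """Parameters spent on embeddings with and without tying."""
    tables = 1 if tied else 3
    return tables * vocab_size * dim
```

With a hypothetical vocabulary of 8,000 subwords and 512-dimensional vectors, `embedding_parameter_count(8000, 512, tied=True)` gives 4,096,000 parameters against 12,288,000 untied. Having fewer parameters to estimate is one reason tying is commonly used when training data is scarce.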