International Display Workshops General Incorporated Association

3:30 PM - 3:50 PM

[DES2/AIS2-2(Invited)] Challenges of Integrating Vision and Language

*Yoshitaka Ushiku 1,2 (1. OMRON SINIC X Corp. (Japan), 2. Ridge-i Inc. (Japan))

Deep Learning, Vision and Language, Computer Vision, Natural Language Processing, Encoder-Decoder

https://doi.org/10.36463/idw.2021.0867

The benefits of deep learning are not limited to advanced recognition and generation of data in different modalities, such as images and acoustic signals. Because these methods are now implemented with commoditized tools based on deep learning, approaches developed for one modality can be imported quickly to understand data in another.