11:00 〜 12:10
[3A2-PS-3-01] Learning Beyond 2D Images
Current 2D convolutional networks have shown super-human capabilities on images, for both discriminative and generative models. In this talk, we will present our recent attempts at visual cognitive computing beyond 2D images. We will first demonstrate the opportunities opened up by augmenting learning with temporal cues, 3D (point cloud) data, raw data, audio, and other signals in emerging domains such as entertainment, security, healthcare, and manufacturing. In an explainable manner, we will justify how to design neural networks that leverage these novel and diverse modalities, and we will demystify the pros and cons of these signals. We will showcase a few tangible applications, including video QA, robotic object referring, situation understanding, and autonomous driving. Finally, we will review the lessons we learned in designing advanced neural networks that accommodate multimodal signals in an end-to-end manner.