3:20 PM - 3:40 PM
[1J2-01] A Linear-Time Kernelized Co-occurrence Norm for Sparse Linguistic Expressions
Keywords: NLP, Machine Learning
Estimating pointwise mutual information (PMI), a well-known co-occurrence measure between linguistic expressions, involves a trade-off between learning time and robustness to data sparsity. We propose a new kernel-based co-occurrence measure, named pointwise HSIC (PHSIC). Intuitively, PHSIC is a ``smoothed PMI'' obtained with kernels, so it is robust to data sparsity; furthermore, its estimator reduces to an efficient linear-time matrix calculation. In our experiments, we apply PHSIC to a dialogue response selection task using sparse language data. The results show that learning is about $100$ times faster than with a recurrent neural network-based PMI estimator; moreover, when the data set is small, the predictive performance of PHSIC hardly deteriorates compared to PMI.
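As a rough illustration of what a "linear-time matrix calculation" for a kernelized co-occurrence score could look like, the sketch below scores a pair by passing its centered feature vectors through an empirical cross-covariance matrix. The abstract does not state the estimator, so this form is an assumption, as are the explicit finite-dimensional feature vectors (equivalent to a linear kernel), the synthetic data, and the hypothetical names `fit_phsic` and `phsic_score`.

```python
import numpy as np

def fit_phsic(X, Y):
    """Fit from n observed pairs. X: (n, dx) and Y: (n, dy) feature vectors.

    Assumed form: a centered empirical cross-covariance matrix.
    Cost is O(n * dx * dy), i.e. linear in the number of pairs n.
    """
    mu_x = X.mean(axis=0)
    mu_y = Y.mean(axis=0)
    C = (X - mu_x).T @ (Y - mu_y) / X.shape[0]
    return mu_x, mu_y, C

def phsic_score(x, y, mu_x, mu_y, C):
    """Co-occurrence score of one pair; larger means more strongly associated."""
    return float((x - mu_x) @ C @ (y - mu_y))

# Synthetic stand-in for sentence features (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                                      # e.g. utterance features
Y = X @ rng.normal(size=(5, 4)) + 0.1 * rng.normal(size=(200, 4))  # e.g. response features

mu_x, mu_y, C = fit_phsic(X, Y)

# Response selection: pick the candidate with the highest score for a given utterance.
candidates = Y[:10]
best = max(range(len(candidates)),
           key=lambda j: phsic_score(X[0], candidates[j], mu_x, mu_y, C))
```

Under these assumptions, the mean score over the training pairs equals the squared Frobenius norm of `C`, an empirical dependence measure for the linear kernel, which mirrors how mutual information is the expectation of PMI.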