[3Xin4-74] Proposal for Collaborative Clustering on Distributed Data
On the Subject of Lifestyle Data of Municipal Residents
Keywords: Data collaboration analysis, Distributed data, Privacy preserving, Clustering, Dimensionality reduction
The purpose of this paper is to extend Data Collaboration (DC) analysis to clustering. Our proposed method enables collaborative clustering on distributed data without sharing private data. The data used in the experiment come from a lifestyle survey of municipal residents, in which each sample belongs to one of 11 existing communities within a municipality (n=2763). In the experimental evaluation, we compare the classification accuracy of the following three methods: (i) centralized clustering, in which private data is shared among all communities, (ii) individual clustering, in which private data cannot be shared across communities, and (iii) our proposed DC clustering, in which only dimensionality-reduced data is shared. Our method significantly improves classification accuracy compared with individual clustering and achieves accuracy comparable to that of centralized clustering. These results suggest that DC clustering is a useful approach for clustering distributed data while preserving privacy.
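The idea of sharing only dimensionality-reduced data can be illustrated with a minimal sketch. The abstract does not specify how the reduced representations are made comparable across communities, so everything below is an assumption: it follows the general data collaboration recipe in which each party applies its own private linear map to its data and to a shared, non-private anchor dataset, and the anchor representations are then used to align the parties. The data are synthetic, the clustering step is a plain k-means chosen for illustration, and the names `local_map`, `collaborate`, and `kmeans` are hypothetical.

```python
import numpy as np

def local_map(X, m, rng):
    """Hypothetical private map of one party: its top-m principal directions,
    mixed by a random orthogonal matrix so that maps differ between parties."""
    _, _, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
    Q, _ = np.linalg.qr(rng.standard_normal((m, m)))
    return Vt[:m].T @ Q  # shape (d, m)

def collaborate(anchor_reps, m):
    """Assumed alignment step: find a common target Z from the stacked
    anchor representations, then fit one m-by-m matrix per party."""
    U, _, _ = np.linalg.svd(np.hstack(anchor_reps), full_matrices=False)
    Z = U[:, :m]
    return [np.linalg.pinv(A_i) @ Z for A_i in anchor_reps]

def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd iterations; stands in for whatever clustering is used."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

rng = np.random.default_rng(0)
d, m, k, n_parties = 6, 2, 2, 3  # illustrative sizes, not from the paper

# Synthetic private data: each party holds 100 samples from two groups.
means = np.array([[4.0] * d, [-4.0] * d])
parties = []
for _ in range(n_parties):
    y = rng.integers(0, k, size=100)
    parties.append(means[y] + rng.standard_normal((100, d)))

# Shared anchor data: random, carries no private information.
anchor = rng.uniform(-6, 6, size=(200, d))

# Each party shares only reduced representations, never X or its map F.
maps = [local_map(X, m, rng) for X in parties]
shared = [(X @ F, anchor @ F) for X, F in zip(parties, maps)]

# Align the shared representations, stack them, and cluster jointly.
G = collaborate([A for _, A in shared], m)
collab = np.vstack([Xr @ G_i for (Xr, _), G_i in zip(shared, G)])
labels = kmeans(collab, k, rng)
```

In this sketch the raw matrices in `parties` and the private maps in `maps` never leave their owners; only the pairs in `shared` are exchanged. The alignment matrices are needed because each party's reduced coordinates are expressed in a different basis and cannot be clustered together directly.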