10:00 AM - 10:20 AM
[2H1-GS-3a-04] Shared-Memory Parallelization of FP-growth with Dynamic Load Estimation and Balancing
Keywords:Frequent Pattern Mining, Parallelization
Although FP-growth is known to be an efficient frequent pattern mining algorithm, it is still a problem how to make it work for huge transactional databases. In this paper, we propose a shared-memory, task parallelization method for FP-growth. In the proposed method, each task receives conditional transactions from the current FP-tree, builds a conditional FP-tree, and generates the next tasks for the successive branches in the search tree. Then, we dynamically estimate the loads of such next tasks based on the corresponding conditional FP-trees and balance them among CPU cores in the manner of work-stealing. Furthermore, in order to exploit computational resources in a light-weight way, we implement the proposed method in Rust, a compiler language that can handle memory safely without garbage collection. In the most of benchmark datasets we tested, a performance improvement in parallelization was generally observed in comparison with Zaiane et al.'s simple method.
Authentication for paper PDF access
A password is required to view paper PDFs. If you are a registered participant, please log on the site from Participant Log In.
You could view the PDF with entering the PDF viewing password bellow.