Research group members publish four invited long-form survey papers in an SCI-indexed review journal and an open-access journal

Source: Wensheng Gan / Jinan University

2017-12-07 19:20:02

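During 2016-2017, members of Prof. Jerry Chun-Wei Lin's research group were invited by the editors-in-chief to contribute survey articles. After 6-12 months of peer review, four high-quality long-form survey papers were published in the JCR Q2, SCI-indexed review journal Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery (SCI, JCR Q2, IF: 2.111) and in the new open-access data science and pattern recognition journal Ubiquitous International Data Science and Pattern Recognition (Open access, ISSN 2520-4165). The papers cover several classic research and application subfields of data mining: frequent itemset mining and association rule mining, sequential pattern/rule mining, distributed data mining, and incremental high utility pattern mining. Harbin Institute of Technology (Shenzhen) is the first-author affiliation on all four papers.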

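Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery (WIRES DMKD; SCI, JCR Q2, IF: 2.111) is a well-known review journal in the data mining field. Its 2016 impact factor was 2.111, and it publishes only a little over twenty papers a year, generally written by invitation by researchers with an established record in their fields. It aims to provide historical summaries, explanations of principles, analyses of the current state, and forecasts of trends for the classic and emerging problems in data mining and knowledge discovery research. Manuscripts are first invited by the editor-in-chief and then go through 6-12 months of peer review.

Ubiquitous International Data Science and Pattern Recognition (DSPR) is a new open-access journal in data science and pattern recognition. It accepts survey articles and original research articles across artificial intelligence, machine learning, data mining, pattern recognition, and related areas. Its ISSN is 2520-4165.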

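Details of the four long-form survey papers that the group was invited to write in 2017 are given below. Fellow researchers are welcome to download, read, critique, cite, and build on them. In particular, we warmly welcome colleagues who are interested in these research areas, or who intend to extend this work, to get in touch for discussion and collaboration on papers.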

Survey paper 1

Title: A survey of itemset mining

Article link: http://onlinelibrary.wiley.com/doi/10.1002/widm.1207/full

Authors: Philippe Fournier-Viger 1*, Jerry Chun-Wei Lin 2, Bay Vo 3,4, Tin Truong Chi 5, Ji Zhang 6, and Hoai Bac Le 7

Journal: Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery (SCI, JCR Q2, IF: 2.111)

Abstract:

Itemset mining is an important subfield of data mining, which consists of discovering interesting and useful patterns in transaction databases. The traditional task of frequent itemset mining is to discover groups of items (itemsets) that appear frequently together in transactions made by customers. Although itemset mining was designed for market basket analysis, it can be viewed more generally as the task of discovering groups of attribute values frequently cooccurring in databases. Because of its numerous applications in domains such as bioinformatics, text mining, product recommendation, e-learning, and web click stream analysis, itemset mining has become a popular research area. This study provides an up-to-date survey that can serve both as an introduction and as a guide to recent advances and opportunities in the field. The problem of frequent itemset mining and its applications are described. Moreover, main approaches and strategies to solve itemset mining problems are presented, as well as their characteristics are provided. Limitations of traditional frequent itemset mining approaches are also highlighted, and extensions of the task of itemset mining are presented such as high-utility itemset mining, rare itemset mining, fuzzy itemset mining, and uncertain itemset mining. This study also discusses research opportunities and the relationship to other popular pattern mining problems, such as sequential pattern mining, episode mining, subgraph mining, and association rule mining. Main open-source libraries of itemset mining implementations are also briefly presented.
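To make the frequent itemset mining task in this abstract concrete, here is a minimal Python sketch. It is not taken from the survey: the function name `frequent_itemsets`, the sample baskets, and the threshold are all illustrative assumptions, and the level-wise search is a simplified Apriori-style procedure rather than any specific algorithm the paper reviews.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Start from every distinct item; later levels keep only items that stay frequent.
    candidate_items = sorted({item for t in transactions for item in t})
    frequent = {}
    size = 1
    while candidate_items:
        level = {}
        for candidate in combinations(candidate_items, size):
            itemset = frozenset(candidate)
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_support:
                level[itemset] = count
        if not level:
            break
        frequent.update(level)
        # A larger itemset can only be frequent if its items occur in frequent smaller ones.
        candidate_items = sorted({item for itemset in level for item in itemset})
        size += 1
    return frequent


# Hypothetical market-basket data, used only for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = frequent_itemsets(baskets, min_support=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With a minimum support of 2, {bread, milk} and {bread, butter} are frequent, while {milk, butter} appears in only one basket and is discarded.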

Survey paper 2

Title: A Survey of Sequential Pattern Mining

Article link: http://www.ikelab.net/dspr-pdf/vol1-1/dspr-paper5.pdf

Authors: Philippe Fournier-Viger 1*, Jerry Chun-Wei Lin 2, Rage Uday Kiran 3, Yun Sing Koh 4, and Rincy Thomas 5

Journal: Ubiquitous International Data Science and Pattern Recognition (Open access, ISSN 2520-4165)

Abstract:

Discovering unexpected and useful patterns in databases is a fundamental data mining task. In recent years, a trend in data mining has been to design algorithms for discovering patterns in sequential data. One of the most popular data mining tasks on sequences is sequential pattern mining. It consists of discovering interesting subsequences in a set of sequences, where the interestingness of a subsequence can be measured in terms of various criteria such as its occurrence frequency, length, and profit. Sequential pattern mining has many real-life applications since data is encoded as sequences in many fields such as bioinformatics, e-learning, market basket analysis, text analysis, and webpage click-stream analysis. This paper surveys recent studies on sequential pattern mining and its applications. The goal is to provide both an introduction to sequential pattern mining, and a survey of recent advances and research opportunities. The paper is divided into four main parts. First, the task of sequential pattern mining is defined and its applications are reviewed. Key concepts and terminology are introduced. Moreover, main approaches and strategies to solve sequential pattern mining problems are presented. Limitations of traditional sequential pattern mining approaches are also highlighted, and popular variations of the task of sequential pattern mining are presented. The paper also presents research opportunities and the relationship to other popular pattern mining problems. Lastly, the paper also discusses open-source implementations of sequential pattern mining algorithms.

Keywords: Sequential pattern mining, Sequences, Frequent pattern mining, Itemset mining, Data Mining
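The occurrence-frequency criterion mentioned in this abstract can be sketched as follows. This is an illustrative simplification, not code from the paper: each sequence element is a single item here, whereas the general task allows an itemset at each position, and the names `is_subsequence`, `support`, `frequent_patterns`, and the click-stream data are assumptions.

```python
def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in the same order (gaps allowed)."""
    remaining = iter(sequence)
    # `item in remaining` consumes the iterator up to the first match, which enforces order.
    return all(item in remaining for item in pattern)


def support(pattern, database):
    """Number of sequences in the database that contain the pattern."""
    return sum(1 for sequence in database if is_subsequence(pattern, sequence))


def frequent_patterns(candidates, database, min_support):
    """Keep the candidate patterns whose support reaches min_support."""
    counted = {tuple(p): support(p, database) for p in candidates}
    return {p: s for p, s in counted.items() if s >= min_support}


# Hypothetical click-stream sequences, used only for illustration.
clicks = [
    ["a", "b", "c"],
    ["a", "c"],
    ["b", "a", "c"],
    ["c", "a"],
]
patterns = frequent_patterns([["a", "c"], ["c", "a"], ["b", "c"]], clicks, min_support=2)
print(patterns)
```

Order matters: the pattern (a, c) is supported by three sequences, while (c, a) is supported only by the last one.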

Survey paper 3

Title: Data mining in distributed environment: a survey

Article link: http://onlinelibrary.wiley.com/doi/10.1002/widm.1216/full

Authors: Wensheng Gan 1, Jerry Chun-Wei Lin 1*, Han-Chieh Chao 1,2, and Justin Zhan 3

Journal: Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery (SCI, JCR Q2, IF: 2.111)

Abstract:

Due to the rapid growth of resource sharing, distributed systems are developed, which can be used to utilize the computations. Data mining (DM) provides powerful techniques for finding meaningful and useful information from a very large amount of data, and has a wide range of real-world applications. However, traditional DM algorithms assume that the data is centrally collected, memory-resident, and static. It is challenging to manage the large-scale data and process them with very limited resources. For example, large amounts of data are quickly produced and stored at multiple locations. It becomes increasingly expensive to centralize them in a single place. Moreover, traditional DM algorithms generally have some problems and challenges, such as memory limits, low processing ability, and inadequate hard disk, and so on. To solve the above problems, DM on distributed computing environment [also called distributed data mining (DDM)] has been emerging as a valuable alternative in many applications. In this study, a survey of state-of-the-art DDM techniques is provided, including distributed frequent itemset mining, distributed frequent sequence mining, distributed frequent graph mining, distributed clustering, and privacy preserving of distributed data mining. We finally summarize the opportunities of data mining tasks in distributed environment.
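One simple way to picture distributed frequent itemset mining is count distribution: each site counts itemsets over its own data, and only the counts are merged. The sketch below simulates this in a single process. It is an assumption-laden illustration rather than a technique quoted from the survey: `local_counts`, `merge_counts`, the two "sites", and the cap on itemset size are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_counts(partition, max_size=2):
    """Count every itemset of up to max_size items within one site's transactions."""
    counts = Counter()
    for transaction in partition:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    return counts


def merge_counts(per_site_counts):
    """Sum the local counts; the raw transactions never leave their site."""
    total = Counter()
    for counts in per_site_counts:
        total.update(counts)
    return total


# Two hypothetical sites, each holding part of the data.
site_a = [["bread", "milk"], ["bread", "butter"]]
site_b = [["bread", "milk", "butter"], ["milk"]]

global_counts = merge_counts([local_counts(site_a), local_counts(site_b)])
frequent = {itemset: n for itemset, n in global_counts.items() if n >= 2}
print(frequent)
```

Only the per-site counters cross site boundaries here, which is the point of the approach when centralizing the transactions themselves is too expensive.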

Survey paper 4

Title: A Survey of Incremental High-Utility Itemset Mining

Authors: Wensheng Gan 1, Jerry Chun-Wei Lin 1*, Philippe Fournier-Viger 2, Han-Chieh Chao 1,3, Tzung-Pei Hong 4, and Hamido Fujita 5

Journal: Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery (SCI, JCR Q2, IF: 2.111)

Abstract:

Traditional association rule mining has been widely studied. But it is unsuitable for real-world applications where factors such as unit profits of items and purchase quantities must be considered. High-utility itemset mining (HUIM) is designed to find highly profitable patterns by considering both the purchase quantities and unit profits of items. However, most high-utility itemset mining algorithms are designed to be applied on static databases. But in real-world applications such as market basket analysis and business decision making, databases are often dynamically updated by inserting new data such as customer transactions. Several researchers have proposed algorithms to discover high-utility itemsets in dynamically updated databases. Unlike batch algorithms, which always process a database from scratch, incremental HUIM algorithms incrementally update and output HUIs, thus reducing the cost of discovering HUIs. This paper provides an up-to-date survey of the state-of-the-art incremental high-utility itemset mining algorithms, including Apriori-based, tree-based, and utility-list-based approaches. The paper also identifies several important issues and research challenges for incremental high-utility itemset mining.
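The contrast between batch and incremental processing in this abstract can be illustrated with a deliberately naive sketch. It is not one of the Apriori-based, tree-based, or utility-list-based algorithms the survey covers: the class `IncrementalUtilityTable`, the unit profits, and the transactions are hypothetical, and enumerating every subset of a transaction is only workable for tiny examples. The utility of an itemset in a transaction is taken as the sum of quantity times unit profit over its items.

```python
from collections import defaultdict
from itertools import combinations


class IncrementalUtilityTable:
    """Keeps running itemset utilities so that inserts do not require a rescan."""

    def __init__(self, unit_profit):
        self.unit_profit = unit_profit
        self.utility = defaultdict(int)

    def insert(self, transaction):
        """Add one transaction, given as a mapping item -> purchased quantity."""
        items = sorted(transaction)
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                # Utility of the itemset in this transaction: sum of quantity * unit profit.
                self.utility[itemset] += sum(
                    transaction[item] * self.unit_profit[item] for item in itemset
                )

    def high_utility_itemsets(self, min_utility):
        return {s: u for s, u in self.utility.items() if u >= min_utility}


# Hypothetical unit profits and customer transactions.
table = IncrementalUtilityTable({"a": 5, "b": 2, "c": 1})
table.insert({"a": 1, "b": 2})
table.insert({"b": 3, "c": 4})
before = table.high_utility_itemsets(min_utility=9)

# A new transaction arrives; only its own itemsets are touched.
table.insert({"a": 2, "c": 1})
after = table.high_utility_itemsets(min_utility=9)
print(before)
print(after)
```

Before the third insert, item a alone has utility 5 while {a, b} reaches 9, so a superset can be high-utility even when its subset is not. This lack of a downward-closure property is one reason high-utility itemset mining is harder to prune than frequent itemset mining.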

Attachments

[1] 2017 WIRES DMKD_Fournier-Viger et al_A survey of itemset mining.pdf
[2] 2017 DSPR_A Survey of Sequential Pattern Mining.pdf
[3] 2017 WIRES DMKD_Gan et al_Survey of distributed data mining1.pdf
[4] 2017 WIRES DMKD_Gan et al_Survey of iHUIM1.pdf