
Volume 4 Issue 4
Published: 05 December 2021

Changjie Wang, Zhihua Li, Benjamin Sarpong

2021, 4(4): 223-232.   doi:10.26599/BDMA.2021.9020006

Identity-recognition technologies require assistive equipment, yet they suffer from poor recognition accuracy and high cost. To overcome this deficiency, this paper proposes several gait feature identification algorithms. First, gait information of individuals collected from triaxial accelerometers on smartphones is preprocessed, and multimodal fusion is used with the existing standard datasets to yield a multimodal synthetic dataset; then, wi...

Jintao Zhang, Quan Xu

2021, 4(4): 233-241.   doi:10.26599/BDMA.2021.9020008

As a powerful tool for elucidating the embedding representation of graph-structured data, Graph Neural Networks (GNNs), which are built on homogeneous networks, have been widely used in various data mining tasks. Applying a GNN to the embedding of a Heterogeneous Information Network (HIN) is a huge challenge, mainly because HINs contain many different types of nodes and different types of relationships between nodes. An HIN contains rich semant...

Xueting Liao, Danyang Zheng, Xiaojun Cao

2021, 4(4): 242-251.   doi:10.26599/BDMA.2021.9020010

The COVID-19 pandemic has hit the world hard. Reactions to pandemic-related issues have been pouring into social platforms such as Twitter. Many public officials and governments use Twitter to make policy announcements. People keep close track of the related information and express their concerns about the policies on Twitter. It is beneficial yet challenging to derive important information or knowledge from such Twitter data. In this paper, we propose a Tripartite Graph Clustering f...

Xiaohan Li, Bowen Yu, Guanyu Feng, Haojie Wang, Wenguang Chen

2021, 4(4): 252-265.   doi:10.26599/BDMA.2021.9020009

In recent years, Apache Spark has become the de facto standard for big data processing. SparkSQL is a module that supports relational analysis on Spark with Structured Query Language (SQL) and provides convenient data processing interfaces. Despite its efficient optimizer, SparkSQL still suffers from the inefficiency of Spark, which results from the Java Virtual Machine and from unnecessary data serialization and deserialization. Adopting native languages such as C++ could help to avoid su...

Chenyu Hou, Jiawei Wu, Bin Cao, Jing Fan

2021, 4(4): 266-278.   doi:10.26599/BDMA.2021.9020011

Time series forecasting has attracted wide attention in recent decades. However, some time series are imbalanced and show different patterns between special and normal periods, which degrades the prediction accuracy for special periods. In this paper, we aim to develop a unified model to alleviate the imbalance and thus improve the prediction accuracy for special periods. This task is challenging for two reasons: (1) the temporal dependency of series, and (2) the tradeoff betw...

Sudhir Kumar Patnaik, C. Narendra Babu, Mukul Bhave

2021, 4(4): 279-297.   doi:10.26599/BDMA.2021.9020012

Data, which are collected using advanced web scraping technologies, are crucial to the growth of e-commerce in today’s world of highly demanding, hyper-personalized consumer experiences. However, core data extraction engines fail because they cannot adapt to the dynamic changes in website content. This study investigates an intelligent and adaptive web data extraction system with convolutional and Long Short-Term Memory (LSTM) networks to enable automated web page detection using the You only l...