TY  - JOUR
AU  - Pham, Duc-Tinh
AU  - Nguyen, Minh-Tan
AU  - Nguyen, Ha-Nam
AU  - Tran, Tien-Dzung
PY  - 2021
TI  - Analyzing cancer data in North Vietnam by complex network technique
JF  - Journal of Science and Technology: Issue on Information and Communications Technology
VL  - 19
IS  - 12.2
DO  - 10.31130/ict-ud.2021.140
N2  - Data-clustering tools can be employed to generate new knowledge for the diagnosis and treatment of cancer. However, traditional clustering methods, such as the K-means approach, often require input parameters, such as the number of clusters and the initial centers, to be specified in advance. In this study, we present a network science-based clustering method that requires fewer parameters and use it to mine a cancer-screening dataset containing over 177,000 records. We propose an algorithm that computes the similarity between pairs of records to create a complex network in which each node represents a record, and two nodes are connected by an edge if their similarity is greater than a threshold determined by experimental observation. We then apply a network modularity optimization algorithm to the resulting network to detect modules (clusters) within it. Each cluster contains records that are similar to one another in terms of some attributes; therefore, we could derive rules from the clusters for insights into the cancer situation in Vietnam. These rules reveal that some cancer types are more widespread in specific families and living environments in Vietnam. Clustering data based on network science can therefore be a good option for large-scale relational data-mining problems in the future.
UR  - http://ict.jst.udn.vn/index.php/jst/article/view/140
ER  - 