Saidah Saad, and Zikun, Huang (2024) Leveraging transfer learning and label optimization for enhanced traditional Chinese medicine NER performance. Asia-Pacific Journal of Information Technology and Multimedia, 13 (1). pp. 47-60. ISSN 2289-2192
Official URL: https://www.ukm.my/apjitm
Abstract
Named Entity Recognition (NER) is a crucial component in domains such as medicine and finance, as it identifies text fragments belonging to predefined categories in unstructured text. Over time, NER algorithms have evolved from dictionary-based approaches to machine learning and deep learning techniques. Transfer learning, a novel deep learning method, has shown impressive results in NER tasks. However, transfer learning models still face challenges, such as limited entity labels and the impact of noisy datasets. To address these challenges, this research aims to optimise the application of deep learning models for NER. The research first applied the BERT+CRF model to the WanChuang dataset, obtaining an F1 measure of 89.1%. This established the feasibility of using transfer learning models for NER on Chinese medical data and served as the baseline for comparison. To address label-related issues in the baseline model, a scheme was proposed to adjust the learning rate of the CRF layer, which raised the F1 measure to 91.0%. Additionally, to mitigate the impact of noisy training data, a 10-fold retraining scheme was introduced to optimise the training set. Retraining the model on the optimised training set achieved the best F1 measure of 92.7%. The experiments demonstrated that the transfer learning model enhances entity extraction, while the optimised CRF layer effectively captures the internal relationships among entity tags, thus improving overall performance. This research contributes to advancing NER techniques and their application in various domains.
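The F1 figures quoted in the abstract can be read against the usual entity-level definition, F1 = 2PR / (P + R), where P is precision and R is recall over predicted entity spans. The following is a minimal sketch of that calculation, assuming BIO-style tags and exact-match scoring; the abstract does not state the tagging scheme or the scoring tool used, and the tag names `HERB` and `SYM` as well as the helper names `extract_entities` and `entity_f1` are hypothetical, chosen only for illustration.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Entity = Tuple[str, int, int]


def extract_entities(tags: List[str]) -> Set[Entity]:
    """Collect entity spans from one BIO tag sequence (assumed scheme)."""
    entities: Set[Entity] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any span that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        continues_span = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues_span:
            entities.add((etype, start, i))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged exact-match F1 over all sentences."""
    true_pos = n_gold = n_pred = 0
    for gold_tags, pred_tags in zip(gold, pred):
        gold_ents = extract_entities(gold_tags)
        pred_ents = extract_entities(pred_tags)
        true_pos += len(gold_ents & pred_ents)
        n_gold += len(gold_ents)
        n_pred += len(pred_ents)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example with hypothetical labels: one of two gold entities is found.
gold = [["B-HERB", "I-HERB", "O", "B-SYM"]]
pred = [["B-HERB", "I-HERB", "O", "O"]]
score = entity_f1(gold, pred)  # precision 1.0, recall 0.5
```

Under exact-match scoring a predicted entity counts only if both its boundaries and its type agree with the gold annotation, which is why noisy training labels can depress the score.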
Item Type: | Article |
---|---|
Keywords: | Named Entity Recognition; Traditional Chinese medicine; Transfer learning; BERT; CRF |
Journal: | Asia-Pacific Journal of Information Technology and Multimedia (Formerly Jurnal Teknologi Maklumat dan Multimedia) |
ID Code: | 23992 |
Deposited By: | Mr. Mohd Zukhairi Abdullah |
Deposited On: | 09 Aug 2024 07:52 |
Last Modified: | 12 Aug 2024 03:26 |