Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition

Published in the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2018), 2018

Recommended citation: Zhenghui Wang, Yanru Qu, Liheng Chen, Jian Shen, Weinan Zhang, Shaodian Zhang, Yimei Gao, Gen Gu, Ken Chen, and Yong Yu. "Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition." The 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. NAACL 2018.

Oral presentation 6.73%

[PDF] [Slide] [Video]

Abstract

We study the problem of named entity recognition (NER) from electronic medical records, one of the most fundamental and critical problems in medical text mining. Medical records written by clinicians from different specialties usually contain quite different terminologies and writing styles. The differences between specialties and the cost of human annotation make it particularly difficult to train a universal medical NER system. In this paper, we propose a label-aware double transfer learning framework (La-DTL) for cross-specialty NER, so that a medical NER system designed for one specialty can be conveniently applied to another with minimal annotation effort. The transferability is guaranteed by two components: (i) we propose label-aware maximum mean discrepancy (MMD) for feature representation transfer, and (ii) we perform parameter transfer with a theoretical upper bound that is also label-aware. We annotate a new medical NER corpus and conduct extensive experiments on 12 cross-specialty NER tasks. The experimental results demonstrate that La-DTL provides consistent accuracy improvement over strong baselines. In addition, promising experimental results on non-medical NER scenarios indicate that La-DTL has the potential to be seamlessly adapted to a wide range of NER tasks.
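To make the idea of label-aware MMD more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a Gaussian kernel, a biased MMD estimator, and an unweighted sum over the labels that appear in both domains; the function names (`gaussian_kernel`, `mmd_squared`, `label_aware_mmd`) and the `bandwidth` parameter are hypothetical and chosen only for this example. The point it shows is that source and target token features are compared only within the same label, rather than across the two domains as a whole.

```python
import math
from collections import defaultdict


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two feature vectors (kernel choice is an assumption)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two samples of feature vectors."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def label_aware_mmd(src_feats, src_labels, tgt_feats, tgt_labels, bandwidth=1.0):
    """Sum of per-label squared MMDs over labels present in both domains.

    Assumption: every shared label contributes with equal weight.
    """
    src_by_label = defaultdict(list)
    tgt_by_label = defaultdict(list)
    for feat, label in zip(src_feats, src_labels):
        src_by_label[label].append(feat)
    for feat, label in zip(tgt_feats, tgt_labels):
        tgt_by_label[label].append(feat)

    shared_labels = set(src_by_label) & set(tgt_by_label)
    return sum(
        mmd_squared(src_by_label[label], tgt_by_label[label], bandwidth)
        for label in shared_labels
    )
```

In a transfer setting, a quantity like this would typically be added to the training loss as a regularizer, so that token representations carrying the same label are pulled together across specialties. How the per-label terms are weighted and which kernel is used are design choices left open by this sketch.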