Interpretable and Low-Resource Entity Matching via Decoupling Feature Learning from Decision Making

Author: Z. Yao, C. Li, T. Dong, X. Lv, J. Yu, L. Hou, J. Li, Y. Zhang, Z. Dai
Journal: ACL IJCNLP
Year: 2021

Citation information

Z. Yao, C. Li, T. Dong, X. Lv, J. Yu, L. Hou, J. Li, Y. Zhang, Z. Dai,
ACL IJCNLP,
2021,
https://doi.org/10.48550/arXiv.2106.04174

Entity Matching (EM) aims at recognizing entity records that denote the same real-world object. Neural EM models learn vector representations of entity descriptions and match entities end-to-end. Though robust, these methods require many annotated resources for training and lack interpretability. In this paper, we propose a novel EM framework that consists of Heterogeneous Information Fusion (HIF) and Key Attribute Tree (KAT) Induction to decouple feature representation from matching decision. Using self-supervised learning and the mask mechanism in pre-trained language modeling, HIF learns the embeddings of noisy attribute values by inter-attribute attention with unlabeled data. Using a set of comparison features and a limited amount of annotated data, KAT Induction learns an efficient decision tree that can be interpreted by generating entity matching rules whose structure is advocated by domain experts. Experiments on 6 public datasets and 3 industrial datasets show that our method is highly efficient and outperforms SOTA EM models in most cases. We will release the code upon acceptance.
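
To make the decoupling idea concrete, the following is a minimal sketch of the decision-making half only: per-attribute comparison features feed a tiny tree whose split reads as a matching rule. It is not the authors' implementation. Token-level Jaccard similarity stands in for the learned HIF embeddings, a depth-1 tree (a stump) stands in for KAT Induction, and the names `jaccard`, `compare`, `fit_stump`, and the toy product records are all hypothetical.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two attribute values."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def compare(left: Record, right: Record, attrs: List[str]) -> List[float]:
    """Build one comparison feature per attribute for a record pair."""
    return [jaccard(left.get(k, ""), right.get(k, "")) for k in attrs]


def fit_stump(X: List[List[float]], y: List[int]) -> Tuple[int, float]:
    """Pick the (attribute index, threshold) with the fewest training errors.

    The learned rule is: predict "match" when feature >= threshold.
    """
    best, best_err = (0, 0.0), len(y) + 1
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            err = sum((row[j] >= t) != bool(label) for row, label in zip(X, y))
            if err < best_err:
                best, best_err = (j, t), err
    return best


# Hypothetical labeled pairs (1 = same real-world object, 0 = different).
ATTRS = ["title", "brand"]
pairs = [
    ({"title": "apple iphone 12 64gb", "brand": "apple"},
     {"title": "iphone 12 64gb apple", "brand": "apple"}, 1),
    ({"title": "apple iphone 12 64gb", "brand": "apple"},
     {"title": "apple iphone 11 case", "brand": "apple"}, 0),
    ({"title": "sony wh-1000xm4 headphones", "brand": "sony"},
     {"title": "sony wh-1000xm4 wireless headphones", "brand": "sony"}, 1),
    ({"title": "sony wh-1000xm4 headphones", "brand": "sony"},
     {"title": "bose qc35 headphones", "brand": "bose"}, 0),
]

X = [compare(left, right, ATTRS) for left, right, _ in pairs]
y = [label for _, _, label in pairs]

attr_idx, threshold = fit_stump(X, y)
rule = f"IF sim({ATTRS[attr_idx]}) >= {threshold:.2f} THEN match"
print(rule)  # IF sim(title) >= 0.75 THEN match
```

On this toy data the stump selects the title attribute, which illustrates how a tree over comparison features can be read directly as a matching rule from only a handful of labeled pairs. A deeper tree would chain several such attribute tests into one rule.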