Chinese Computational Linguistics: 20th China National Conference, CCL 2021, Hohhot, China, August 13–15, 2021, Proceedings (Lecture Notes in Artificial Intelligence) 3030841855, 9783030841850

This book constitutes the proceedings of the 20th China National Conference on Computational Linguistics, CCL 2021, held

137 28 31MB

English Pages 504 [488] Year 2021

Report DMCA / Copyright

DOWNLOAD PDF FILE

Table of contents :
Preface
Organization
Contents
Machine Translation and Multilingual Information Processing
Reducing Length Bias in Scoring Neural Machine Translation via a Causal Inference Method
1 Introduction
2 Related Work
3 Approach
3.1 Correcting Length Bias via Half-Sibling Regression
3.2 Re-scoring Translation Candidates
3.3 Discussion
4 Experiments
4.1 Datasets and Evaluation Metric
4.2 Length Normalization Baselines
4.3 Model Setups
4.4 Main Results
4.5 Performance on Wider Beam Size
5 Conclusion and Future Work
References
Low-Resource Machine Translation Based on Asynchronous Dynamic Programming
1 Introduction
2 Background
2.1 Neural Machine Translation
2.2 RL Based NMT
3 Approach
3.1 Priority Acquisition of Experience
3.2 Asynchronous Strategy
3.3 Training
4 Experiment and Analysis
4.1 Datasets and Preprocessing
4.2 Setting
4.3 Main Results and Analysis
5 Conclusion
References
Minority Language Information Processing
Uyghur Metaphor Detection via Considering Emotional Consistency
1 Introduction
2 Related Work
2.1 Metaphor Detection
2.2 Uyghur Metaphor
3 Our Proposed Model
3.1 Basic Idea
3.2 Model Structure
4 Experiment
4.1 Dataset
4.2 Experimental Details
4.3 Result Analysis
5 Conclusion
References
Incorporating Translation Quality Estimation into Chinese-Korean Neural Machine Translation
1 Introduction
2 Related Work
3 Methodology
3.1 Model Overview
3.2 Generate Rewards Through Sentence-Level Quality Estimation
3.3 Reward Computation
3.4 The Training of Reinforcement Learning
4 Experiments
4.1 Datasets
4.2 Preprocessing
4.3 Setting
4.4 Main Results and Analysis
4.5 Performance Verification About QE
4.6 Example of Translation Results
5 Conclusion
References
Social Computing and Sentiment Analysis
Emotion Classification of COVID-19 Chinese Microblogs Based on the Emotion Category Description
1 Introduction
2 Related Work
3 Methods
3.1 Definition and Strategy of Emotion Category Description
3.2 Emotion Classification Fine-Tuning Based on a Question Answering (QA) Method
4 Experiments
4.1 Experimental Dataset
4.2 Baseline Models
4.3 Implementation Details
4.4 Experimental Results
5 Conclusion
References
Multi-level Emotion Cause Analysis by Multi-head Attention Based Multi-task Learning
1 Introduction
2 Related Work
3 Methodology
3.1 Task Definition
3.2 Model Description
3.3 Training and Parameter Learning
4 Experiments and Results
4.1 Dataset and Settings
4.2 Comparison of Different Methods
4.3 Ablation Study
4.4 Case Study
4.5 Error Analysis
5 Conclusions
References
Text Generation and Summarization
Using Query Expansion in Manifold Ranking for Query-Oriented Multi-document Summarization
1 Introduction
2 Related Work
3 Method
3.1 Manifold Ranking
3.2 Query Expansion
3.3 Sentence Similarity Calculation
4 Experimental Setup
4.1 Datasets and Evaluation
4.2 Parameter Settings
5 Results
5.1 Comparison with Query Expansion Methods
5.2 Comparison with Related Methods
5.3 Influence of Sentence Similarity Parameter Tuning
5.4 Query Relevance Performance
6 Conclusion
References
Jointly Learning Salience and Redundancy by Adaptive Sentence Reranking for Extractive Summarization
1 Introduction
2 Related Work
2.1 Salience Learning
2.2 Redundancy Learning
3 Method
3.1 Overall Architecture
3.2 Extraction Summarization as Ranking
4 Experiments
4.1 Datasets
4.2 Implementation Details
4.3 Evaluation Metric
4.4 Experimental Results
5 Qualitative Analysis
5.1 Ablation Study
5.2 Effect of Summary Length
5.3 Analysis of N-Gram Frequency
5.4 Human Evaluation
5.5 Case Study
6 Conclusions
References
Incorporating Commonsense Knowledge into Abstractive Dialogue Summarization via Heterogeneous Graph Networks
1 Introduction
2 Heterogeneous Dialogue Graph Construction
2.1 Graph Notation
2.2 Utterance-Knowledge Bipartite Graph Construction
2.3 Speaker-Utterance Bipartite Graph Construction
2.4 Heterogeneous Dialogue Graph Construction
3 Dialogue Heterogeneous Graph Network
3.1 Node Encoder
3.2 Graph Encoder
3.3 Pointer Decoder
3.4 Training
4 Experiments
4.1 Automatic Evaluation
4.2 Human Evaluation
4.3 Ablation Study
4.4 Zero-Shot Setting
4.5 Visualization
4.6 Case Study
5 Related Work
6 Conclusion
References
Information Retrieval, Dialogue and Question Answering
Enhancing Question Generation with Commonsense Knowledge
1 Introduction
2 Related Work
3 Knowledge Extraction
4 Model Description
4.1 Multi-task Learning Framework
4.2 Iterative Training Framework
5 Experimental Settings
5.1 Dataset and Metrics
5.2 Baseline Models
5.3 Implementation Details
6 Results
6.1 Main Results
6.2 Ablation Study
6.3 Analysis of Auxiliary Tasks
6.4 Human Evaluation
6.5 Case Study
7 Conclusion
References
Topic Knowledge Acquisition and Utilization for Machine Reading Comprehension in Social Media Domain
1 Introduction
2 Related Work
3 Method
3.1 Knowledge Acquisition
3.2 Topic Knowledge Reader
4 Experiment
4.1 TweetQA Dataset
4.2 Implement Detail
4.3 Baselines
4.4 Main Results
4.5 Different Number of Concepts
4.6 Ablation Study
4.7 Extractive vs. Generative
4.8 Weakly Supervised Training
5 Conclusion
References
Category-Based Strategy-Driven Question Generator for Visual Dialogue
1 Introduction
2 Related Work
3 Proposed Model
3.1 The Design and Implement of Category Info to Questions
3.2 Multi-modality Encoder
3.3 Category Predictor
3.4 Generation Decoder
4 Experiments
4.1 Implementation Details on Guesser and Oracle
4.2 Contrast Experiments
4.3 Ablation Experiments
4.4 Good Case Study
5 Conclusions
References
From Learning-to-Match to Learning-to-Discriminate: Global Prototype Learning for Few-shot Relation Classification
1 Introduction
2 Background
3 Global Transformed Prototypical Networks
3.1 Instance Encoder
3.2 Relation Marker
3.3 Multi-view Global Transformation
3.4 Prototype-Based Classification
3.5 Model Learning
4 Experiments
4.1 Experimental Settings
4.2 Overall Results
4.3 Detailed Analysis
5 Related Work
6 Conclusions
References
Multi-strategy Knowledge Distillation Based Teacher-Student Framework for Machine Reading Comprehension
1 Introduction
2 Related Work
2.1 Machine Reading Comprehension
2.2 Knowledge Distillation
3 Methods
3.1 MRC Model
3.2 Knowledge Distillation Strategies
3.3 Overall Training Procedure
4 Experiments
4.1 Datasets
4.2 Baseline
4.3 Experimental Setup and Evaluation Metrics
4.4 Results
5 Conclusion
References
LRRA:A Transparent Neural-Symbolic Reasoning Framework for Real-World Visual Question Answering
1 Introduction
2 Related Work
3 Approach
4 Experiments
4.1 Dataset
4.2 Implementation Details
4.3 Results
4.4 Example Analysis
5 Conclusion
References
Linguistics and Cognitive Science
Meaningfulness and Unit of Zipf's Law: Evidence from Danmu Comments
1 Introduction
1.1 Zipf's Law
1.2 Remaining Questions in Zipf's Law
1.3 Goal of the Current Study
1.4 Danmu Comments
2 Previous Work
2.1 Meaningfulness of Zipf's Law
2.2 Units in Zipf's Law
3 Research Method
3.1 Data
3.2 Hypotheses
3.3 Predictions for Danmu Comments
4 Results and Discussion
4.1 Danmu
4.2 Clauses
4.3 Words
4.4 N-grams
5 Conclusion
References
Language Resource and Evaluation
Unifying Discourse Resources with Dependency Framework
1 Introduction
2 Background
3 Unifying Discourse Corpora
3.1 Conversion of HIT-CDTB
3.2 Conversion of SU-CDTB
3.3 Corpus Statistics
4 Dependency Discourse Parsing
5 Conclusions
References
A Chinese Machine Reading Comprehension Dataset Automatic Generated Based on Knowledge Graph
1 Introduction
2 Related Work
2.1 English Machine Reading Comprehension Datasets
2.2 Chinese Machine Reading Comprehension Datasets
2.3 Machine Reading Comprehension Models
3 CMedRC: A Chinese Medical MRC Dataset
3.1 Question and Answer Generation
3.2 Synonymous Sentences Generation
3.3 Answer Matching and Document Linkage
3.4 Dataset Cleaning
3.5 Data Statistics
4 Experiment
4.1 Evaluation Metric
4.2 Baselines
4.3 Evaluation
5 Conclusion and Further Research
5.1 Conclusion
5.2 Further Research
References
Morphological Analysis Corpus Construction of Uyghur
1 Introduction
2 Related Work
2.1 Agglutinative Language and Morphological Analysis
2.2 Morphological Analysis Corpus
3 Construction of Uyghur Morphological Analysis Corpus
3.1 Data Preparation and Preprocessing
3.2 The Annotation Scheme
3.3 Corpus Construction Process
4 Corpus Information Statistics and Evaluation
5 Summary
References
Knowledge Graph and Information Extraction
Improving Entity Linking by Encoding Type Information into Entity Embeddings
1 Introduction
2 Background and Related Works
2.1 Local and Global Entity Linking Models
2.2 Related Works
3 Our Method
3.1 Entity Labeling
3.2 Word and Type Vectors Pre-training
3.3 Entity Embeddings Training
3.4 Entity Embeddings Fusing
4 Experiments
4.1 Datasets and Evaluation Metric
4.2 Experimental Settings
4.3 Results
4.4 Analysis
5 Conclusion
References
A Unified Representation Learning Strategy for Open Relation Extraction with Ranked List Loss
1 Introduction
2 Methodology
2.1 Neural Encoders
2.2 Ranked List Loss
2.3 Virtual Adversarial Training
3 Experiment
3.1 Datasets
3.2 Settings
3.3 Main Results
3.4 Visualization Analysis
3.5 Other Empirical Studies
4 Conclusion
References
NS-Hunter: BERT-Cloze Based Semantic Denoising for Distantly Supervised Relation Classification
1 Introduction
2 Related Work
3 Model
3.1 Source Entity and Target Entity
3.2 Discrimination on Dependency Features
3.3 Denoising and Classification
4 Experiments
4.1 Dataset and Evaluation
4.2 Implementation Details
4.3 Baselines
4.4 Main Results
4.5 Denoising
4.6 Effects of Our MASK-lhs Feature
4.7 Apply NS-Hunter to CNN
4.8 Effect of Entity on Denoising
5 Conclusion
References
A Trigger-Aware Multi-task Learning for Chinese Event Entity Recognition
1 Introduction
2 The Proposed Algorithm
2.1 Event Trigger Dictionary Construction
2.2 Trigger Detection Network
2.3 Event-Featured Transformer-CRF
2.4 Loss Function
3 Experiments
3.1 Experimental Setup
3.2 Results and Discussion
4 Related Work
5 Conclusion
References
Improving Low-Resource Named Entity Recognition via Label-Aware Data Augmentation and Curriculum Denoising
1 Introduction
2 Related Work
2.1 Low-Resource NER
2.2 Curriculum Learning
3 Proposed Method
3.1 Overview
3.2 Data Augmentation via Pre-trained BERT
3.3 Denoising via Curriculum Learning
4 Experimental Setups
4.1 Datasets
4.2 Implementations
4.3 Experimental Results
4.4 Discussion
5 Conclusion
References
Global Entity Alignment with Gated Latent Space Neighborhood Aggregation
1 Introduction
2 Related Work
2.1 KG Embedding
2.2 Embedding-Based Entity Alignment
3 Problem Formulation
4 The Proposed Model
4.1 Basic Embedding Module
4.2 Topological Neighborhood and Latent Space Neighborhood
4.3 Neighborhood Aggregation
4.4 Gated Topological and Latent Space Neighborhood Aggregation
4.5 Global Entity Alignment Strategy
5 Experiments
5.1 Datasets and Experiment Settings
5.2 Entity Alignment Results
5.3 Effectiveness of Latent Space Neighborhood
5.4 Impact of Distance When Generating Latent Space Neighborhood
5.5 Effectiveness of Global Entity Alignment Strategy
6 Conclusion
References
NLP Applications
Few-Shot Charge Prediction with Multi-grained Features and Mutual Information
1 Introduction
2 Related Work
2.1 Charge Predication
2.2 Few-Shot Text Classification
3 Methodology
3.1 Task Formalization
3.2 Coarse-Grained Features
3.3 Fine-Grained Features
3.4 Capsule Layer
3.5 Attribute Features
3.6 Aggregation and Prediction
3.7 Optimization
4 Experiments
4.1 Datasets
4.2 Baselines
4.3 Evaluation Metrics
4.4 Experiment Settings
4.5 Experimental Results
4.6 Ablation Study
4.7 Efficiency Analysis
5 Conclusion
References
Sketchy Scene Captioning: Learning Multi-level Semantic Information from Sparse Visual Scene Cues
1 Introduction
2 Related Works
3 A New Dataset for Sketchy Scene Captioning
3.1 SketchyScene Dataset without Descriptions
3.2 Description Collection for SketchyScene Dataset
3.3 Dataset Analysis
4 Multi-level Sketchy Scene Captioning Through Sequence Learning
4.1 Sketchy Scene Encoder for Deep Visual Features
4.2 Sketchy Scene Decoder with Spatial Visual Attention
5 Experiment and Analysis
5.1 Dataset and Preprocessing
5.2 Model Learning and Inference
5.3 Quantitative Evaluation
5.4 Qualitative Evaluation
6 Conclusion and Future Work
References
BDCN: Semantic Embedding Self-explanatory Breast Diagnostic Capsules Network
1 Introduction
2 Model
2.1 Check the Report Semantic Segmentation Layer
2.2 Sem-Bert Layer
2.3 Muti-Cap Layer
2.4 Model Interpretability
3 Experiment
3.1 Dataset
3.2 Parameter Settings
3.3 Different Stage Selection
4 Conclusion
References
GCN with External Knowledge for Clinical Event Detection
1 Introduction
2 Related Work
2.1 Clinical Event Detection
2.2 Stanford CoreNLP
2.3 BioBERT
3 Model
3.1 Character-Word Embedding
3.2 Semantic GCN
3.3 CRF Decoding
4 Experiment
4.1 Dataset
4.2 Evaluation Metrics
4.3 Settings
4.4 Evaluation on CED
4.5 Ablation Study
4.6 Type Analysis
5 Conclusion
References
A Prompt-Independent and Interpretable Automated Essay Scoring Method for Chinese Second Language Writing
1 Introduction
2 Related Work
2.1 Prompt-Specific vs. Prompt-Independent
2.2 Interpretability and Feedback
2.3 Automated Essay Scoring of Chinese Essays
3 The Proposed Method
3.1 The Interpretable Representations of Essay Features
3.2 The Ordinal Logistic Regression Model
4 Experiments
4.1 Dataset and Preprocessing
4.2 Feature Selection
4.3 Models, Parameters and Evaluation Metrics
4.4 Results
5 Discussion
5.1 Analysis on Confusion Matrix
5.2 Revisiting Linear Regression: Interpretability and Potential of Providing Feedback
6 Conclusion and Future Work
A The Linguistic Complexity and Writing Error Features
B The Example Essays
References
A Robustly Optimized BERT Pre-training Approach with Post-training
1 Introduction
2 The Proposed Model: PPBERT
2.1 Pre-training
2.2 Post-training
2.3 Fine-Tuning
3 Experiments
3.1 Tasks
3.2 Datasets
3.3 Experimental Results
4 Ablation Study and Analyses
4.1 Cooperation with Other Pre-trained LMs
5 Conclusion
References
Author Index
Recommend Papers

Chinese Computational Linguistics: 20th China National Conference, CCL 2021, Hohhot, China, August 13–15, 2021, Proceedings (Lecture Notes in Artificial Intelligence)
 3030841855, 9783030841850

  • 0 0 0
  • Like this paper and download? You can publish your own PDF file online for free in a few minutes! Sign Up
File loading please wait...
Citation preview

LNAI 12869

Sheng Li · Maosong Sun · Yang Liu · Hua Wu · Liu Kang · Wanxiang Che · Shizhu He · Gaoqi Rao (Eds.)

Chinese Computational Linguistics 20th China National Conference, CCL 2021 Hohhot, China, August 13–15, 2021 Proceedings

123

Lecture Notes in Artificial Intelligence Subseries of Lecture Notes in Computer Science

Series Editors Randy Goebel University of Alberta, Edmonton, Canada Yuzuru Tanaka Hokkaido University, Sapporo, Japan Wolfgang Wahlster DFKI and Saarland University, Saarbrücken, Germany

Founding Editor Jörg Siekmann DFKI and Saarland University, Saarbrücken, Germany

12869

More information about this subseries at http://www.springer.com/series/1244

Sheng Li Maosong Sun Yang Liu Hua Wu Liu Kang Wanxiang Che Shizhu He Gaoqi Rao (Eds.) •













Chinese Computational Linguistics 20th China National Conference, CCL 2021 Hohhot, China, August 13–15, 2021 Proceedings

123

Editors Sheng Li Harbin Institute of Technology Harbin, China

Maosong Sun Tsinghua University Beijing, China

Yang Liu Tsinghua University Beijing, China

Hua Wu Baidu (China) Beijing, China

Liu Kang Chinese Academy of Sciences Beijing, China

Wanxiang Che Harbin Institute of Technology Harbin, China

Shizhu He Chinese Academy of Sciences Beijing, China

Gaoqi Rao Beijing Language and Culture University Beijing, China

ISSN 0302-9743 ISSN 1611-3349 (electronic) Lecture Notes in Artificial Intelligence ISBN 978-3-030-84185-0 ISBN 978-3-030-84186-7 (eBook) https://doi.org/10.1007/978-3-030-84186-7 LNCS Sublibrary: SL7 – Artificial Intelligence © Springer Nature Switzerland AG 2021 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

Welcome to the proceedings of the twentieth China National Conference on Computational Linguistics. The conference and symposium were hosted and co-organized by Inner Mongolia University, China. CCL is an annual conference (bi-annual before 2013) that started in 1991. It is the flagship conference of the Chinese Information Processing Society of China (CIPS), which is the largest NLP scholar and expert community in China. CCL is a premier nationwide forum for disseminating new scholarly and technological work in computational linguistics, with a major emphasis on computer processing of the languages in China such as Mandarin, Tibetan, Mongolian, and Uyghur. The Program Committee selected 108 papers (77 Chinese papers and 31 English papers) out of 303 submissions for publication. The acceptance rate was 35.64%. The 31 English papers cover the following topics: – – – – – – – – –

Machine Translation and Multilingual Information Processing (2) Minority Language Information Processing (2) Social Computing and Sentiment Analysis (2) Text Generation and Summarization (3) Information Retrieval, Dialogue, and Question Answering (6) Linguistics and Cognitive Science (1) Language Resource and Evaluation (3) Knowledge Graph and Information Extraction (6) NLP Applications (6)

The final program for CCL 2021 was the result of intense work by many dedicated colleagues. We want to thank, first of all, the authors who submitted their papers, contributing to the creation of the high-quality program. We are deeply indebted to all the Program Committee members for providing high-quality and insightful reviews under a tight schedule, and we are extremely grateful to the sponsors of the conference. Finally, we extend a special word of thanks to all the colleagues of the Organizing Committee and secretariat for their hard work in organizing the conference, and to Springer for their assistance in publishing the proceedings in due time. We thank the Program and Organizing Committees for helping to make the conference successful, and we hope all the participants enjoyed the first online CCL conference. July 2021

Sheng Li Maosong Sun Yang Liu Hua Wu Kang Liu Wanxiang Che

Organization

Conference Chairs Sheng Li Maosong Sun Yang Liu

Harbin Institute of Technology, China Tsinghua University, China Tsinghua University, China

Program Chairs Hua Wu Kang Liu Wanxiang Che

Baidu, China Institute of Automation, CAS, China Harbin Institute of Technology, China

Program Committee Renfen Hu Xiaofei Lu Zhenghua Li Pengfei Liu Jiafeng Guo Qingyao Ai Rui Yan Jing Li Wenliang Chen Bowei Zou Shujian Huang Hui Huang Yating Yang Chenhui Chu Junsheng Zhou Wei Lu Rui Xia Lin Gui Chenliang Li Jiashu Zhao

Beijing Normal University, China Pennsylvania State University, USA Soochow University, China Carnegie Mellon University, USA Institute of Computing Technology, CAS, China The university of Utah, USA Renmin University of China, China The Hong Kong Polytechnic University, China Soochow University, China Institute for Infocomm Research, Singapore Nanjing University, China University of Macau, China Xinjiang Physics and Chemistry Institute, CAS, China Kyoto University, Japan Nanjing Normal University, China Singapore University of Technology and Design, Singapore Nanjing University of Science and Technology, China University of Warwick, UK Wuhan University, China Wilfrid Laurier University, Canada

Local Arrangement Chairs Guanglai Gao Hongxu Hou

Inner Mongolia University, China Inner Mongolia University, China

viii

Organization

Evaluation Chairs Hongfei Lin Wei Song

Dalian University of Technology, China Capital Normal University, China

Publications Chairs Shizhu He Gaoqi Rao

Institute of Automation, CAS, China Beijing Language and Culture University, China

Workshop Chairs Xianpei Han Yanyan Lan

Institute of Software, CAS, China Tsinghua University, China

Sponsorship Chairs Zhongyu Wei Yang Feng

Fudan University, China Institute of Computing Technology, CAS, China

Publicity Chair Xipeng Qiu

Fudan University, China

Website Chair Liang Yang

Dalian University of Technology, China

System Demonstration Chairs Jiajun Zhang Haofen Wang

Institute of Automation, CAS, China Tongji University, China

Student Counseling Chair Zhiyuan Liu

Tsinghua University, China

Student Seminar Chairs Xin Zhao Libo Qin

Renmin University of China, China Harbin Institute of Technology, China

Finance Chair Yuxing Wang

Tsinghua University, China

Organization

Organizers

at

io n

oc

or

ie t

y of C h in a

C h i n e s e In f m

Pro c e s s i n

gS

Chinese Information Processing Society of China

Tsinghua University, China

Inner Mongolia University, China

ix

x

Organization

Publishers

Lecture Notes in Artificial Intelligence, Springer

Journal of Chinese Information Processing

Science China

Journal of Tsinghua University (Science and Technology)

Sponsoring Institutions Gold

Organization

Silver

Bronze

xi

Contents

Machine Translation and Multilingual Information Processing Reducing Length Bias in Scoring Neural Machine Translation via a Causal Inference Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xuewen Shi, Heyan Huang, Ping Jian, and Yi-Kun Tang

3

Low-Resource Machine Translation Based on Asynchronous Dynamic Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xiaoning Jia, Hongxu Hou, Nier Wu, Haoran Li, and Xin Chang

16

Minority Language Information Processing Uyghur Metaphor Detection via Considering Emotional Consistency . . . . . . . Qimeng Yang, Long Yu, Shengwei Tian, and Jinmiao Song Incorporating Translation Quality Estimation into Chinese-Korean Neural Machine Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Feiyu Li, Yahui Zhao, Feiyang Yang, and Rongyi Cui

31

45

Social Computing and Sentiment Analysis Emotion Classification of COVID-19 Chinese Microblogs Based on the Emotion Category Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xianwei Guo, Hua Lai, Yan Xiang, Zhengtao Yu, and Yuxin Huang

61

Multi-level Emotion Cause Analysis by Multi-head Attention Based Multi-task Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xiangju Li, Shi Feng, Yifei Zhang, and Daling Wang

77

Text Generation and Summarization Using Query Expansion in Manifold Ranking for Query-Oriented Multi-document Summarization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Quanye Jia, Rui Liu, and Jianying Lin

97

Jointly Learning Salience and Redundancy by Adaptive Sentence Reranking for Extractive Summarization . . . . . . . . . . . . . . . . . . . . . . . . . . Ximing Zhang and Ruifang Liu

112

Incorporating Commonsense Knowledge into Abstractive Dialogue Summarization via Heterogeneous Graph Networks . . . . . . . . . . . . . . . . . . . Xiachong Feng, Xiaocheng Feng, and Bing Qin

127

xiv

Contents

Information Retrieval, Dialogue and Question Answering Enhancing Question Generation with Commonsense Knowledge . . . . . . . . . . Xin Jia, Hao Wang, Dawei Yin, and Yunfang Wu Topic Knowledge Acquisition and Utilization for Machine Reading Comprehension in Social Media Domain . . . . . . . . . . . . . . . . . . . . . . . . . . Zhixing Tian, Yuanzhe Zhang, Kang Liu, and Jun Zhao Category-Based Strategy-Driven Question Generator for Visual Dialogue. . . . Yanan Shi, Yanxin Tan, Fangxiang Feng, Chunping Zheng, and Xiaojie Wang From Learning-to-Match to Learning-to-Discriminate: Global Prototype Learning for Few-shot Relation Classification . . . . . . . . . . . . . . . . . . . . . . . Fangchao Liu, Xinyan Xiao, Lingyong Yan, Hongyu Lin, Xianpei Han, Dai Dai, Hua Wu, and Le Sun Multi-Strategy Knowledge Distillation Based Teacher-Student Framework for Machine Reading Comprehension . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xiaoyan Yu, Qingbin Liu, Shizhu He, Kang Liu, Shengping Liu, Jun Zhao, and Yongbin Zhou LRRA:A Transparent Neural-Symbolic Reasoning Framework for Real-World Visual Question Answering . . . . . . . . . . . . . . . . . . . . . . . . Zhang Wan, Keming Chen, Yujie Zhang, Jinan Xu, and Yufeng Chen

145

161 177

193

209

226

Linguistics and Cognitive Science Meaningfulness and Unit of Zipf’s Law: Evidence from Danmu Comments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yihan Zhou

239

Language Resource and Evaluation Unifying Discourse Resources with Dependency Framework . . . . . . . . . . . . Yi Cheng, Sujian Li, and Yueyuan Li A Chinese Machine Reading Comprehension Dataset Automatic Generated Based on Knowledge Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hanyu Zhao, Sha Yuan, Jiahong Leng, Xiang Pan, Zhao Xue, Quanyue Ma, and Yangxiao Liang Morphological Analysis Corpus Construction of Uyghur . . . . . . . . . . . . . . . Gulinigeer Abudouwaili, Kahaerjiang Abiderexiti, Jiamila Wushouer, Yunfei Shen, Turenisha Maimaitimin, and Tuergen Yibulayin

257

268

280

Contents

xv

Knowledge Graph and Information Extraction Improving Entity Linking by Encoding Type Information into Entity Embeddings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tianran Li, Erguang Yang, Yujie Zhang, Jinan Xu, and Yufeng Chen A Unified Representation Learning Strategy for Open Relation Extraction with Ranked List Loss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Renze Lou, Fan Zhang, Xiaowei Zhou, Yutong Wang, Minghui Wu, and Lin Sun

297

308

NS-Hunter: BERT-Cloze Based Semantic Denoising for Distantly Supervised Relation Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tielin Shen, Daling Wang, Shi Feng, and Yifei Zhang

324

A Trigger-Aware Multi-task Learning for Chinese Event Entity Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yangxiao Xiang and Chenliang Li

341

Improving Low-Resource Named Entity Recognition via Label-Aware Data Augmentation and Curriculum Denoising . . . . . . . . . . . . . . . . . . . . . . . . . . Wenjing Zhu, Jian Liu, Jinan Xu, Yufeng Chen, and Yujie Zhang

355

Global Entity Alignment with Gated Latent Space Neighborhood Aggregation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wei Chen, Xiaoying Chen, and Shengwu Xiong

371

NLP Applications Few-Shot Charge Prediction with Multi-grained Features and Mutual Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Han Zhang, Zhicheng Dou, Yutao Zhu, and Jirong Wen

387

Sketchy Scene Captioning: Learning Multi-level Semantic Information from Sparse Visual Scene Cues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lian Zhou, Yangdong Chen, and Yuejie Zhang

404

BDCN: Semantic Embedding Self-explanatory Breast Diagnostic Capsules Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Dehua Chen, Keting Zhong, and Jianrong He

419

GCN with External Knowledge for Clinical Event Detection . . . . . . . . . . . . Dan Liu, Zhichang Zhang, Hui Peng, and Ruirui Han A Prompt-Independent and Interpretable Automated Essay Scoring Method for Chinese Second Language Writing . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yupei Wang and Renfen Hu

434

450

xvi

Contents

A Robustly Optimized BERT Pre-training Approach with Post-training . . . . . Zhuang Liu, Wayne Lin, Ya Shi, and Jun Zhao

471

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

485

Machine Translation and Multilingual Information Processing

Reducing Length Bias in Scoring Neural Machine Translation via a Causal Inference Method Xuewen Shi1,2 , Heyan Huang1,2 , Ping Jian1,2(B) , and Yi-Kun Tang1,2 1

School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China 2 Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Beijing 100081, China {xwshi,hhy63,pjian,tangyk}@bit.edu.cn

Abstract. Neural machine translation (NMT) usually employs beam search to expand the searching space and obtain more translation candidates. However, the increase of the beam size often suffers from plenty of short translations, resulting in dramatical decrease in translation quality. In this paper, we handle the length bias problem through a perspective of causal inference. Specifically, we regard the model generated translation score S as a degraded true translation quality affected by some noise, and one of the confounders is the translation length. We apply a HalfSibling Regression method to remove the length effect on S, and then we can obtain a debiased translation score without length information. The proposed method is model agnostic and unsupervised, which is adaptive to any NMT model and test dataset. We conduct the experiments on three translation tasks with different scales of datasets. Experimental results and further analyses show that our approaches gain comparable performance with the empirical baseline methods. Keywords: Machine translation regression

1

· Causal inference · Half-sibling

Introduction

Recently, with the renaissance of deep learning, end-to-end neural machine translation (NMT) [2,26] has gained remarkable performances [6,27,28]. NMT models are usually built upon an encoder-decoder framework [5]: the encoder reads an input sequence x = {x1 , ..., xTx } into a hidden memory H, and the decoder is designed to model a probability over the translation y = {y1 , ..., yTyˆ } by: P (y|x) =

Ty 

P (yt |y