Exploring mention representations for coreference in dialogue

Research questions: 

- What is the best way to represent mentions? Is it enough to use a concatenation of different embeddings? If so, which embeddings work best for which types of mentions?

- Can we improve the results by encoding/embedding other (linguistic) features?

- To what extent does context affect the choice of markables?

- How should we represent a span consisting of multiple tokens: sum, average, or concatenation?
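As a concrete illustration of the pooling options in the last question, here is a minimal sketch with toy NumPy vectors (the 4-dimensional embeddings are made up for the example). Sum and average collapse the span into one vector of the same dimensionality, while concatenating the boundary tokens, as in Lee et al. (2017), doubles the dimensionality but preserves span-edge information:

```python
import numpy as np

# Toy contextual embeddings for a 3-token mention span
# (hypothetical 4-dimensional vectors; real encoders use 768+ dims).
span = np.array([
    [0.1, 0.2, 0.3, 0.4],   # token 1
    [0.5, 0.6, 0.7, 0.8],   # token 2
    [0.9, 1.0, 1.1, 1.2],   # token 3
])

# Three common strategies for a fixed-size mention representation:
span_sum = span.sum(axis=0)                      # element-wise sum     -> shape (4,)
span_mean = span.mean(axis=0)                    # element-wise average -> shape (4,)
span_concat = np.concatenate([span[0], span[-1]])  # first + last token -> shape (8,)

print(span_sum.shape, span_mean.shape, span_concat.shape)
```

Note that sum and average are invariant to span length in output size but not in scale (the sum grows with longer spans), which is one reason averaging is often preferred for variable-length mentions.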

Data: we can use (part of) the data from the CODI-CRAC Anaphora Resolution Shared Task 2021 and/or the OneCommon dataset (https://github.com/Alab-NII/onecommon/tree/master/aaai2020).   

Paper suggestions:  

  1. Improving Coreference Resolution by Learning Entity-level Distributed Representations, Clark and Manning, 2016. Link: https://arxiv.org/pdf/1606.01323
  2. End-to-end Neural Coreference Resolution, Lee et al., 2017. Link: https://arxiv.org/pdf/1707.07045
  3. Integrating Knowledge Graph Embeddings to Improve Mention Representation for Bridging Anaphora Resolution, Pandit et al., 2020. Link: https://aclanthology.org/2020.crac-1.7.pdf
  4. CorefQA: Coreference Resolution as Query-based Span Prediction, Wu et al., 2020. Link: https://aclanthology.org/2020.acl-main.622/
  5. Pre-training Mention Representations in Coreference Models, Varkel and Globerson, 2020. Link: https://aclanthology.org/2020.emnlp-main.687.pdf
  6. Improving Span Representation for Domain-adapted Coreference Resolution, Gandhi et al., 2021. Link: https://aclanthology.org/2021.crac-1.13.pdf