NLPatVCU CLEF 2020 ChEMU Shared Task System Description

Abstract

This paper describes our team’s participation in Tracks 1 and 2 of the ChEMU challenge at the Conference and Labs of the Evaluation Forum (CLEF 2020), organized by Cheminformatics Elsevier Melbourne University, on extracting information about chemical reactions from patents. We discuss our two systems: MedaCy, a Python-based supervised multi-class entity recognition system, and RelEx, a Python-based relation extraction system that includes rule-based and supervised learning pipelines. Our best model for Task 1 obtained an overall relaxed precision of 0.95 and exact precision of 0.87, relaxed recall of 0.99 and exact recall of 0.86, and relaxed F-1 score of 0.97 and exact F-1 score of 0.87. Our best model for Task 2 obtained an overall precision of 0.80, recall of 0.54, and F-1 score of 0.65.

Publication
Conference and Labs of the Evaluation Forum (CLEF) 2020