Semantic Software Lab
Concordia University
Montréal, Canada

An Automatic Workflow for Formalization of Scholarly Articles' Structural and Semantic Elements

Title: An Automatic Workflow for Formalization of Scholarly Articles' Structural and Semantic Elements
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Sateli, B., and R. Witte
Refereed Designation: Refereed
Editors: Sack, H., S. Dietze, A. Tordai, and C. Lange
Conference Name: The 13th Extended Semantic Web Conference (The Semantic Publishing Challenge 2016)
Tertiary Title: Third SemWebEval Challenge at ESWC 2016, Heraklion, Crete, Greece, May 29 - June 2, 2016, Revised Selected Papers
Date Published: 06/2016
Publisher: Springer International Publishing
Conference Location: Heraklion, Crete, Greece
Type of Work: Paper
ISBN Number: 978-3-319-46565-4

We present a workflow for the automatic transformation of scholarly literature into a Linked Open Data (LOD) compliant knowledge base, addressing Task 2 of the Semantic Publishing Challenge 2016. In this year's task, we aim to extract various kinds of contextual information from full-text papers using a text mining pipeline that integrates LOD-based Named Entity Recognition (NER) and triplification of the detected entities. In our proposed approach, we leverage an existing NER tool to ground named entities, such as geographical locations, to their LOD resources. Combined with a rule-based approach, we demonstrate how we can extract both the structural elements (e.g., floats and sections) and the semantic elements (e.g., authors and their respective affiliations) of the documents in the provided dataset. Finally, we integrate LODeXporter, our flexible exporting module, to represent the results as semantic triples in RDF format. As a result, we generate a scalable, TDB-based knowledge base that is interlinked with the LOD cloud, and a public SPARQL endpoint for the task’s queries. Our submission won second place in Task 2 of the SemPub2016 challenge, with an average F-score of 0.63.
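
To make the triplification step concrete, the following is a minimal sketch of how a detected entity, once grounded to an LOD resource, might be serialized as N-Triples. It is an illustration only and not the LODeXporter implementation: the `Entity` class, the `triplify` function, the `http://example.org/sempub/` namespace, and the choice of `rdfs:seeAlso` for the link to the LOD cloud are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Standard W3C vocabulary terms.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDFS_SEEALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"

# Hypothetical namespace for this sketch; not the one used in the paper.
EX = "http://example.org/sempub/"


@dataclass
class Entity:
    """A hypothetical annotation produced by the text mining pipeline."""
    local_id: str                  # e.g. "paper1/location1"
    kind: str                      # e.g. "Author", "Affiliation", "Location"
    text: str                      # surface form found in the document
    lod_uri: Optional[str] = None  # LOD resource the NER tool grounded it to


def escape_literal(text: str) -> str:
    """Escape backslashes, quotes, and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def triplify(entity: Entity) -> List[str]:
    """Turn one entity into a list of N-Triples lines."""
    subject = f"<{EX}{entity.local_id}>"
    triples = [
        f"{subject} <{RDF_TYPE}> <{EX}{entity.kind}> .",
        f'{subject} <{RDFS_LABEL}> "{escape_literal(entity.text)}" .',
    ]
    # Only grounded entities get a link into the LOD cloud.
    if entity.lod_uri:
        triples.append(f"{subject} <{RDFS_SEEALSO}> <{entity.lod_uri}> .")
    return triples


if __name__ == "__main__":
    location = Entity(
        local_id="paper1/location1",
        kind="Location",
        text="Heraklion",
        lod_uri="http://dbpedia.org/resource/Heraklion",
    )
    for line in triplify(location):
        print(line)
```

Triples of this shape could then be loaded into a triple store and queried over SPARQL; the actual vocabularies and mapping rules are defined in the paper.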


Copyright © Springer International Publishing Switzerland 2016. This is the author's version of the work. It is posted here by permission of Springer for your personal use. Not for redistribution.

Attachment: sempub_challenge2016_extended.pdf (668.56 KB)