Gianmaria Silvello

Information Management Systems

Department of Information Engineering

University of Padua

Data Citation

A video tutorial: RDA Webinar on "Automatically generating citation text from queries"


"Learning to cite" system for XML

You can browse the software at

http://ims-svn.dei.unipd.it/repos/datacitation/

Username: guest - Password: guest

You can check it out using Subversion

$ svn checkout --username guest --password guest http://ims-svn.dei.unipd.it/repos/datacitation/ datacitation

Documentation

The JavaDoc is available at the URL:

http://www.dei.unipd.it/~silvello/datacitation/learningtocite

Data citation test collection

We built the experimental collection from the Library of Congress digital finding aids collection, which is encoded in the EAD (Encoded Archival Description) format and publicly available at the following URL: http://findingaids.loc.gov/.

To build the training and validation set, we selected 25 EAD files at random and, from each of these files, randomly extracted 4 citable units, obtaining a set of 100 XPaths that identify 100 distinct citable units. For each citable unit (i.e., XML element), we manually created a human-readable citation, used to train the citation system, and a machine-readable citation, used as the ground truth for validation.
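To make the notion of an XPath identifying a citable unit concrete, here is a minimal Python sketch. The XML fragment is an invented, heavily simplified EAD-like example (it is not taken from the Library of Congress collection), and the XPath is hypothetical. Real EAD files may declare namespaces and are far larger, and the standard-library ElementTree module supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Invented, simplified EAD-like fragment (assumption: not a real finding aid).
EAD_SNIPPET = """
<ead>
  <archdesc>
    <did><unittitle>Example Family Papers</unittitle></did>
    <dsc>
      <c01><did><unittitle>Correspondence</unittitle></did></c01>
      <c01><did><unittitle>Photographs</unittitle></did></c01>
    </dsc>
  </archdesc>
</ead>
""".strip()

root = ET.fromstring(EAD_SNIPPET)

# Hypothetical XPath for one citable unit: the second <c01> component.
unit_xpath = "./archdesc/dsc/c01[2]"
unit = root.find(unit_xpath)

# Read the title of the selected citable unit.
title = unit.findtext("./did/unittitle")
print(title)  # Photographs
```

Each such XPath, paired with its manually written citations, would then form one entry of the training and validation set.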

The test set was built by following a similar procedure: from the whole EAD collection minus the 25 files selected for the training and validation set, we randomly selected 50 EAD files and, from each one, selected a single citable unit at random. We then manually created a ground-truth machine-readable citation for each of these randomly sampled citable units. We created the ground-truth citations by following the guidelines provided by the Purdue University archives, which follow the Modern Language Association (MLA) citation style.
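The sampling procedure described above can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the script actually used to build the collection: the function names, the fixed seed, and the synthetic stand-in collection (200 hypothetical files with 10 candidate XPaths each) are all invented for the example.

```python
import random

def sample_units(files, units_by_file, n_files, units_per_file, rng):
    """Pick n_files at random, then units_per_file citable units from each."""
    chosen_files = rng.sample(sorted(files), n_files)
    sampled = []
    for name in chosen_files:
        for xpath in rng.sample(units_by_file[name], units_per_file):
            sampled.append((name, xpath))
    return chosen_files, sampled

def build_splits(units_by_file, seed=0):
    """Return (training/validation units, test units) as (file, xpath) pairs."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    all_files = set(units_by_file)
    # Training and validation set: 25 files, 4 citable units per file.
    train_files, train_units = sample_units(all_files, units_by_file, 25, 4, rng)
    # Test set: 50 files drawn from the remaining files, 1 unit per file.
    remaining = all_files - set(train_files)
    _, test_units = sample_units(remaining, units_by_file, 50, 1, rng)
    return train_units, test_units

# Synthetic stand-in for the collection (hypothetical file names and XPaths).
collection = {
    f"file_{i:03d}.xml": [f"/ead/archdesc/dsc/c01[{j}]" for j in range(1, 11)]
    for i in range(200)
}

train_units, test_units = build_splits(collection)
print(len(train_units), len(test_units))  # 100 50
```

Removing the training files before drawing the test files is what keeps the two sets disjoint at the file level, as the procedure requires.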

You can browse the test collection at

http://ims-svn.dei.unipd.it/repos/datacitation_collections/

Username: guest - Password: guest

You can check it out using Subversion

$ svn checkout --username guest --password guest http://ims-svn.dei.unipd.it/repos/datacitation_collections/ datacitation_collections

Rule-Based Citation System for XML

The system is presented in the following paper:

Peter Buneman and Gianmaria Silvello. A Rule-Based Citation System for Structured and Evolving Datasets. IEEE Bulletin of the Technical Committee on Data Engineering, Vol. 33, No. 3, IEEE Computer Society, pp. 33-41, September 2010.

The JavaDoc is available at the URL:

http://www.dei.unipd.it/~silvello/datacitation/rulebasedsystem