Emerging practices for mapping and linking life sciences data using RDF: a case series

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Emerging practices for mapping and linking life sciences data using RDF: a case series. / Marshall, M. Scott; Boyce, Richard; Deus, Helena F. et al.
In: Journal of Web Semantics, Vol. 14, 07.2012, p. 2-13.

Harvard

Marshall, MS, Boyce, R, Deus, HF, Zhao, J, Willighagen, EL, Samwald, M, Pichler, E, Hajagos, J, Prud’hommeaux, E & Stephens, S 2012, 'Emerging practices for mapping and linking life sciences data using RDF: a case series', Journal of Web Semantics, vol. 14, pp. 2-13. https://doi.org/10.1016/j.websem.2012.02.003

APA

Marshall, M. S., Boyce, R., Deus, H. F., Zhao, J., Willighagen, E. L., Samwald, M., Pichler, E., Hajagos, J., Prud’hommeaux, E., & Stephens, S. (2012). Emerging practices for mapping and linking life sciences data using RDF: a case series. Journal of Web Semantics, 14, 2-13. https://doi.org/10.1016/j.websem.2012.02.003

Vancouver

Marshall MS, Boyce R, Deus HF, Zhao J, Willighagen EL, Samwald M et al. Emerging practices for mapping and linking life sciences data using RDF: a case series. Journal of Web Semantics. 2012 Jul;14:2-13. doi: 10.1016/j.websem.2012.02.003

Author

Marshall, M. Scott ; Boyce, Richard ; Deus, Helena F. et al. / Emerging practices for mapping and linking life sciences data using RDF : a case series. In: Journal of Web Semantics. 2012 ; Vol. 14. pp. 2-13.

Bibtex

@article{d8011e2419a44a05bee46d51c7ff9e5b,
title = "Emerging practices for mapping and linking life sciences data using RDF: a case series",
abstract = "Members of the W3C Health Care and Life Sciences Interest Group (HCLS IG) have published a variety of genomic and drug-related data sets as Resource Description Framework (RDF) triples. This experience has helped the interest group define a general data workflow for mapping health care and life science (HCLS) data to RDF and linking it with other Linked Data sources. This paper presents the workflow along with four case studies that demonstrate the workflow and addresses many of the challenges that may be faced when creating new Linked Data resources. The first case study describes the creation of linked RDF data from microarray data sets while the second discusses a linked RDF data set created from a knowledge base of drug therapies and drug targets. The third case study describes the creation of an RDF index of biomedical concepts present in unstructured clinical reports and how this index was linked to a drug side-effect knowledge base. The final case study describes the initial development of a linked data set from a knowledge base of small molecules. This paper also provides a detailed set of recommended practices for creating and publishing Linked Data sources in the HCLS domain in such a way that they are discoverable and usable by people, software agents, and applications. These practices are based on the cumulative experience of the Linked Open Drug Data (LODD) task force of the HCLS IG. While no single set of recommendations can address all of the heterogeneous information needs that exist within the HCLS domains, practitioners wishing to create Linked Data should find the recommendations useful for identifying the tools, techniques, and practices employed by earlier developers. In addition to clarifying available methods for producing Linked Data, the recommendations for metadata should also make the discovery and consumption of Linked Data easier.",
keywords = "Linked Data, Semantic web, Health care, Life sciences, Data integration",
author = "Marshall, {M. Scott} and Richard Boyce and Deus, {Helena F.} and Jun Zhao and Willighagen, {Egon L.} and Matthias Samwald and Elgar Pichler and Janos Hajagos and Eric Prud{\textquoteright}hommeaux and Susie Stephens",
year = "2012",
month = jul,
doi = "10.1016/j.websem.2012.02.003",
language = "English",
volume = "14",
pages = "2--13",
journal = "Journal of Web Semantics",
issn = "1570-8268",
publisher = "Elsevier",
}

RIS

TY - JOUR

T1 - Emerging practices for mapping and linking life sciences data using RDF

T2 - a case series

AU - Marshall, M. Scott

AU - Boyce, Richard

AU - Deus, Helena F.

AU - Zhao, Jun

AU - Willighagen, Egon L.

AU - Samwald, Matthias

AU - Pichler, Elgar

AU - Hajagos, Janos

AU - Prud’hommeaux, Eric

AU - Stephens, Susie

PY - 2012/7

Y1 - 2012/7

N2 - Members of the W3C Health Care and Life Sciences Interest Group (HCLS IG) have published a variety of genomic and drug-related data sets as Resource Description Framework (RDF) triples. This experience has helped the interest group define a general data workflow for mapping health care and life science (HCLS) data to RDF and linking it with other Linked Data sources. This paper presents the workflow along with four case studies that demonstrate the workflow and addresses many of the challenges that may be faced when creating new Linked Data resources. The first case study describes the creation of linked RDF data from microarray data sets while the second discusses a linked RDF data set created from a knowledge base of drug therapies and drug targets. The third case study describes the creation of an RDF index of biomedical concepts present in unstructured clinical reports and how this index was linked to a drug side-effect knowledge base. The final case study describes the initial development of a linked data set from a knowledge base of small molecules. This paper also provides a detailed set of recommended practices for creating and publishing Linked Data sources in the HCLS domain in such a way that they are discoverable and usable by people, software agents, and applications. These practices are based on the cumulative experience of the Linked Open Drug Data (LODD) task force of the HCLS IG. While no single set of recommendations can address all of the heterogeneous information needs that exist within the HCLS domains, practitioners wishing to create Linked Data should find the recommendations useful for identifying the tools, techniques, and practices employed by earlier developers. In addition to clarifying available methods for producing Linked Data, the recommendations for metadata should also make the discovery and consumption of Linked Data easier.

AB - Members of the W3C Health Care and Life Sciences Interest Group (HCLS IG) have published a variety of genomic and drug-related data sets as Resource Description Framework (RDF) triples. This experience has helped the interest group define a general data workflow for mapping health care and life science (HCLS) data to RDF and linking it with other Linked Data sources. This paper presents the workflow along with four case studies that demonstrate the workflow and addresses many of the challenges that may be faced when creating new Linked Data resources. The first case study describes the creation of linked RDF data from microarray data sets while the second discusses a linked RDF data set created from a knowledge base of drug therapies and drug targets. The third case study describes the creation of an RDF index of biomedical concepts present in unstructured clinical reports and how this index was linked to a drug side-effect knowledge base. The final case study describes the initial development of a linked data set from a knowledge base of small molecules. This paper also provides a detailed set of recommended practices for creating and publishing Linked Data sources in the HCLS domain in such a way that they are discoverable and usable by people, software agents, and applications. These practices are based on the cumulative experience of the Linked Open Drug Data (LODD) task force of the HCLS IG. While no single set of recommendations can address all of the heterogeneous information needs that exist within the HCLS domains, practitioners wishing to create Linked Data should find the recommendations useful for identifying the tools, techniques, and practices employed by earlier developers. In addition to clarifying available methods for producing Linked Data, the recommendations for metadata should also make the discovery and consumption of Linked Data easier.

KW - Linked Data

KW - Semantic web

KW - Health care

KW - Life sciences

KW - Data integration

U2 - 10.1016/j.websem.2012.02.003

DO - 10.1016/j.websem.2012.02.003

M3 - Journal article

VL - 14

SP - 2

EP - 13

JO - Journal of Web Semantics

JF - Journal of Web Semantics

SN - 1570-8268

ER -