E-Prints to Data Centre
The emergence of a Semantic Web of data opens new possibilities for knowledge discovery, and there are already clear indications that it will deliver on this potential.
A good example from the biological sciences is NextBio (http://www.nextbio.com/b/home/home.nb), a biotech company that is developing a framework to connect highly heterogeneous data with textual information. Integrating experimental data and textual information in this way offers a new tool for tackling complex problems. For example, identifying associations between different pathologies at both the clinical and the genetic/biochemical levels provides a means to establish common disease pathways.
For engineering materials, a similarly complex case exists in the composition-property relationships of advanced alloys. A better understanding of these relationships will facilitate improved alloy design, materials selection and lifing for a wide range of applications. Integration of experimental data and textual information is especially promising for inter-disciplinary knowledge discovery, which often leads to breakthroughs in science and technology simply by connecting complementary activities in different domains. With a Semantic Web of data, discovery of these inter-disciplinary connections will no longer be a matter of chance. However, to realize this potential, efforts are needed to conserve experimental data together with the publications to which they correspond.
The objective of EP2DC is to develop a prototype module that enables the EPrints repository (presently deployed at more than 250 institutes) to support the submission of XML-formatted experimental data together with the manuscript to which they correspond. The module will be tested and refined through integration with the JISC-funded 'Materials Data Centre' (MDC), which is presently under development at the University of Southampton. By promoting the capture of experimental data and their cross-referencing to the corresponding scientific publications, this work is aligned with the Semantic Web / linked data priority.