Genomics and Biological Big Data: Facing Current and Future Challenges around Data and Software Sharing and Reproducibility

Gesing, Sandra; Connor, Thomas Richard; Taylor, Ian

arXiv:1511.02689 (cs)

[Submitted on 9 Nov 2015]

Abstract: Novel technologies in genomics allow data to be created at exascale with relatively minor human and laboratory effort, and thus at low monetary cost, compared to capabilities only a decade ago. While the availability of this data helps to answer research questions that would not have been feasible to answer, or perhaps even to ask, before, the amount of data creates new challenges that require new software and data management systems. Such new solutions have to take integrative approaches that consider not only the effectiveness and efficiency of data processing but also improve reusability, reproducibility and usability, tailored especially to the target user communities of genomic big data. In our opinion, current solutions each have their strengths and tackle part of the challenges, but none provides a complete solution. In this paper we present the key challenges and the characteristics that cutting-edge developments should possess to fulfil the needs of the user communities and to allow for seamless sharing and data analysis on a large scale.

Comments:	Position paper at BDAC-15 (Big Data Analytics: Challenges and Opportunities), workshop in cooperation with ACM/IEEE SC15, November 16, 2015, Austin, TX, USA
Subjects:	Distributed, Parallel, and Cluster Computing (cs.DC); Computers and Society (cs.CY)
Cite as:	arXiv:1511.02689 [cs.DC]
	(or arXiv:1511.02689v1 [cs.DC] for this version)
	https://doi.org/10.48550/arXiv.1511.02689

Submission history

From: Sandra Gesing
[v1] Mon, 9 Nov 2015 14:17:13 UTC (154 KB)
