Empirical Evaluation of Decentralized Genomic Data Computation Using Bacalhau and IPFS
Abstract
Large-scale genomic analysis typically relies on centralized infrastructure, which creates a conflict between the need to collaborate and data sovereignty regulations. This study addresses that conflict by evaluating a decentralized architecture designed to support secure, inter-institutional genomic computation without moving raw data. We integrated Bacalhau for orchestration and IPFS Cluster with CRDT consensus for storage, and encrypted data with AES-256. A quantitative evaluation was conducted on AWS using five t3.medium nodes to simulate a resource-constrained hospital network. We tested three scenarios: a centralized baseline (SSH+SCP), an ideal decentralized workflow, and a "chaos" scenario with active network fault injection. The centralized baseline was the fastest (mean = 37.69 s), while the decentralized architecture incurred a manageable overhead of approximately 30% under ideal conditions (mean = 49.22 s, SD = 1.58 s). Under chaos fault injection, execution time increased to 90.67 s (SD = 17.84 s), but the system still achieved a 100% job completion rate, which was superior to that of the fragile baseline. This research quantifies the trade-off between execution speed and system resilience in a healthcare context. We demonstrate that the architecture prioritizes data sovereignty and high availability over raw speed, offering an empirically evaluated model for privacy-critical Decentralized Science (DeSci) collaborations.
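The overhead figures in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the means and standard deviations reported above, while the function names and the choice to also compute a coefficient of variation are assumptions made here, not part of the study's method. The chaos-scenario ratios are derived from the reported means and are not stated in the abstract.

```python
# Reported summary statistics from the abstract (seconds).
BASELINE_MEAN = 37.69                 # centralized SSH+SCP baseline
IDEAL_MEAN, IDEAL_SD = 49.22, 1.58    # decentralized, ideal conditions
CHAOS_MEAN, CHAOS_SD = 90.67, 17.84   # decentralized, fault injection


def relative_overhead(baseline_s: float, observed_s: float) -> float:
    """Extra time relative to the baseline, as a fraction (0.30 == 30%)."""
    return (observed_s - baseline_s) / baseline_s


def coefficient_of_variation(mean_s: float, sd_s: float) -> float:
    """Run-to-run variability relative to the mean (hypothetical helper)."""
    return sd_s / mean_s


ideal_overhead = relative_overhead(BASELINE_MEAN, IDEAL_MEAN)
chaos_overhead = relative_overhead(BASELINE_MEAN, CHAOS_MEAN)
chaos_vs_ideal = CHAOS_MEAN / IDEAL_MEAN

print(f"Ideal overhead vs. baseline: {ideal_overhead:.1%}")
print(f"Chaos overhead vs. baseline: {chaos_overhead:.1%}")
print(f"Chaos slowdown vs. ideal:    {chaos_vs_ideal:.2f}x")
print(f"CV ideal: {coefficient_of_variation(IDEAL_MEAN, IDEAL_SD):.1%}")
print(f"CV chaos: {coefficient_of_variation(CHAOS_MEAN, CHAOS_SD):.1%}")
```

The first result, about 30.6%, matches the "approximately 30%" overhead quoted in the abstract. The much larger relative spread under fault injection suggests that chaos conditions affect run-to-run predictability as well as mean execution time.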
Copyright (c) 2025 Journal of Information Systems and Informatics

This work is licensed under a Creative Commons Attribution 4.0 International License.