Data curation preservation issues (budgets, costs, staffing and skills)
Every day, governments, researchers, businesses, and communities generate vast amounts of digital data. However, the value of these data does not lie in their creation alone; rather, it lies in their ability to remain accessible, understandable, and reusable long after their original purpose has passed. Preserving digital information is neither automatic nor inexpensive. While technological advancements have made data storage increasingly affordable, the long-term sustainability of data curation remains constrained by organisational realities such as limited budgets, rising operational costs, shortages of skilled personnel, and inadequate institutional capacity. These issues are often overshadowed by discussions of technology, despite being among the most significant barriers to successful preservation. As organisations continue to embrace data-driven decision-making, the question is no longer whether data should be preserved, but whether institutions possess the financial and human resources required to preserve it effectively. This blog argues that budgets, costs, staffing, and skills are not peripheral concerns in data curation; rather, they determine whether preservation efforts succeed or fail.
A common misconception is that declining storage prices have solved the financial burden of data preservation. While technological advances have reduced storage costs, the most expensive aspect of curation remains the human effort required to create, manage, and maintain high-quality metadata, governance frameworks, and preservation workflows (Jeffery, 2020). Data do not become reusable simply because they are stored; they require contextual information that enables future users to understand their origin, quality, and relevance. Consequently, organisations often underestimate the true cost of preservation by focusing on infrastructure while neglecting the ongoing investment required for curation activities.
Budget constraints are particularly problematic because data curation generates long-term benefits rather than immediate returns. As Jeffery (2020) notes, institutions must continually decide which assets justify preservation and for how long they should be retained. This creates difficult cost-benefit considerations, especially in environments where resources are scarce. A useful example is climate and environmental research, where observations collected at a specific place and time are often impossible to reproduce. Contemporary understanding of issues such as climate change, biodiversity loss, and atmospheric change depends on decades of carefully curated datasets. Had these records not been preserved because of budgetary limitations, much of today's evidence base for environmental policy and scientific decision-making would be significantly weaker (Jeffery, 2020). Therefore, funding for curation should be viewed as an investment in future knowledge rather than a discretionary operational cost.
Staffing challenges further complicate preservation efforts. Effective curation requires collaboration among researchers, archivists, records managers, information professionals, and information technology specialists. However, many organisations lack dedicated personnel responsible for data stewardship. The consequence is often inconsistent metadata creation, weak governance, and poor preservation outcomes. According to the Digital Curation Centre (2024), sustainable preservation programmes depend on clearly defined roles and institutional commitment to data stewardship. Without adequate staffing structures, even sophisticated technological systems struggle to achieve their intended objectives.
Closely linked to staffing shortages is the growing skills gap in digital curation. Modern preservation environments require expertise in metadata standards, digital preservation strategies, research data management, data governance, and emerging technologies. The rapid evolution of digital systems means that many professionals must continuously update their competencies. Research by Cox et al. (2023) demonstrates that skills shortages remain one of the greatest barriers to implementing effective research data management and preservation services. In my view, organisations frequently underestimate the strategic importance of human capital in curation. While advances in artificial intelligence and automation can assist with metadata generation and routine preservation tasks, they cannot replicate the professional judgment required to evaluate data value, authenticity, and long-term relevance. Consequently, investment in staff development should be regarded as a strategic necessity rather than an optional expenditure.
Addressing these challenges requires a shift in organisational thinking. Institutions should integrate data curation into strategic planning, allocate dedicated budgets for preservation activities, invest in staff development, and promote cross-disciplinary collaboration. Incremental metadata collection, workflow automation, and the adoption of recognised frameworks such as Data Management Plans can also reduce costs while improving preservation outcomes (Jeffery, 2020). Furthermore, systematic appraisal practices can help organisations identify datasets with the greatest long-term value, ensuring that limited resources are directed toward preserving information that offers the highest potential for future reuse and impact (Whyte & Wilson, 2022). Most importantly, organisations must recognise that sustainable preservation depends as much on people and governance as it does on technology.
Data curation is frequently discussed as a technological challenge, yet the evidence suggests that its greatest obstacles are organisational rather than technical. Sustainable preservation depends on adequate funding, skilled professionals, and institutional commitment to managing data as a strategic asset. Although automation, metadata standards, and preservation frameworks can improve efficiency, they cannot replace the expertise required to make informed curation decisions. In my view, organisations that continue to treat data curation as an optional support function risk undermining the long-term value of their information resources. As the volume and importance of digital data continue to grow, investment in budgets, staffing, and professional skills must be regarded not as a cost burden but as a prerequisite for preserving knowledge, supporting innovation, and ensuring that today's data remain tomorrow's evidence.
References
Cox, A. M., Kennan, M. A., Lyon, L., & Pinfield, S. (2023). Developments in research data management and the roles of information professionals. Journal of Librarianship and Information Science, 55(1), 3–18.
Digital Curation Centre. (2024). Digital Curation Centre Guidance and Resources. Edinburgh: Digital Curation Centre.
Higgins, S. (2008). The DCC Curation Lifecycle Model. International Journal of Digital Curation, 3(1), 134–140.
Jeffery, K. G. (2020). Data curation and preservation. In Z. Zhao & M. Hellström (Eds.), Towards Interoperable Research Infrastructures for Environmental and Earth Sciences (pp. 123–139). Springer.
Whyte, A., & Wilson, A. (2022). How to appraise and select research data for curation. International Journal of Digital Curation, 17(1), 1–15.
Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J. W., da Silva Santos, L. B., Bourne, P. E., et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, 160018.