Rethinking Data Selection and Appraisal in a Data-Driven World



In an era defined by data abundance, the challenge is no longer simply how to store information, but how to decide what deserves to be preserved. The selection and appraisal of data have become central to effective data management, shaping not only what is retained but also what knowledge remains accessible for future use. These processes are far from straightforward: they are deeply influenced by institutional priorities, resource limitations, and evolving user needs.

Traditionally, appraisal has been framed as a technical or archival function, primarily concerned with identifying records of long-term value. Eastwood (2004) describes appraisal as a professional judgment grounded in evidential and accountability requirements. While this perspective is valuable, it tends to position appraisal as a static decision-making process. In contrast, Hodge and Frangakis (2004) argue that appraisal must align with broader data management policies and anticipated patterns of reuse. Building on these perspectives, I argue that appraisal should be understood as a dynamic and iterative process that evolves alongside changing data environments and user expectations.

One of the central tensions in appraisal lies in balancing preservation with practicality. Data are generated in large volumes, often in diverse formats, making it neither feasible nor desirable to retain everything indefinitely. Borgman (2015) supports this view, emphasizing that data curation is constrained by financial, technical, and human resources. However, reducing appraisal decisions to cost efficiency risks overlooking the potential long-term value of data. What may appear insignificant today could hold substantial importance in future contexts. Therefore, appraisal frameworks must move beyond immediate utility and consider the broader, and sometimes unpredictable, value of data over time.

Equally important is the role of context in ensuring that preserved data remain meaningful and reusable. Data without adequate metadata and documentation quickly lose their interpretability. This is reinforced by Tenopir et al. (2020), who demonstrate that data reuse is strongly influenced by the quality of documentation and accessibility. Despite this, documentation and related data management practices often remain underprioritized. I contend that this reflects a structural weakness in current approaches, where appraisal is treated as an afterthought rather than as an integral component of the data lifecycle.

Another challenge lies in the diversity of data practices across different domains. Variations in data types, standards, and institutional policies make it difficult to establish universal appraisal criteria. While frameworks such as the Open Archival Information System (OAIS) reference model provide useful guidance, their application must remain flexible to accommodate contextual differences. In my view, effective appraisal should be guided by shared principles, such as value, usability, and sustainability, while allowing for adaptation to specific environments.

Ultimately, the selection and appraisal of data are not merely technical exercises but strategic decisions that shape what knowledge is preserved and made accessible. Engaging critically with existing approaches makes it evident that appraisal must be reimagined as a collaborative and forward-looking practice. It is not just about deciding what to keep, but about determining what will matter in the future.

https://youtu.be/BEf6gDNPPs0?si=5qMPL7KRDmb42bUA

References

Borgman, C. L. (2015). Big data, little data, no data: Scholarship in the networked world. MIT Press.
Eastwood, T. (2004). Appraising digital records for long-term preservation. Data Science Journal, 3, 214–220.
Hodge, G., & Frangakis, E. (2004). Digital preservation and permanent access to scientific information. CENDI.
Tenopir, C., et al. (2020). Data sharing, management, use, and reuse: Practices and perceptions of scientists worldwide. PLOS ONE, 15(3).
