Increasingly, scholars across disciplines and throughout the research life cycle are using a wide variety of online portals such as GitHub, FigShare, Publons, and SlideShare to conduct aspects of their research and to communicate research outcomes.
However, these portals, whether dedicated to scholarly use or general purpose, exist outside of the traditional scholarly publishing system, and no infrastructure exists to systematically and comprehensively archive the deposited artifacts. We have shown in previous work that without adequate infrastructure, scholarly artifacts will vanish from the web in much the same way and with similar frequency as “regular” web resources do.
In the “Scholarly Orphans” project, we assume that research institutions are interested in collecting scholarly artifacts created by their researchers. As such, we are designing an institutional pipeline to track, capture, and archive these artifacts. The tracking part is crucial, as institutions are usually not even aware of the existence of artifacts created by their researchers in online portals. Regarding capture, we are developing a novel framework we call Memento Tracer that plays a crucial role in creating high-fidelity Mementos of artifacts. With Memento Tracer, a human curator interacts with a web-based artifact to establish its essential components, and these interactions are recorded as Traces. A Trace can then serve as instructions for automatic web archiving frameworks to capture artifacts of the same class. In addition, Traces can be shared with a community of practice, enabling a new level of collaboration among artifact archiving institutions. These characteristics give Memento Tracer the potential to bring about significant progress for high-quality web archiving at scale.
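To make the idea of a Trace as reusable capture instructions more concrete, the following Python sketch models a Trace as a URL pattern for a class of artifacts plus an ordered list of recorded interactions, and expands it into a capture plan for one artifact. It is only an illustration under stated assumptions: the `Step` and `Trace` structures, their field names, the `plan_capture` function, and the `slides.example.org` portal are all hypothetical and do not reflect Memento Tracer's actual Trace format or API.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Step:
    """One recorded curator interaction (hypothetical structure)."""
    action: str            # e.g. "click"
    selector: str          # CSS selector of the element the curator interacted with
    repeat: bool = False   # repeat the action until the element is no longer present


@dataclass
class Trace:
    """Recorded interactions for one class of artifacts (hypothetical structure)."""
    portal: str
    url_pattern: str       # regular expression describing the artifact class
    steps: list = field(default_factory=list)

    def applies_to(self, url):
        # A Trace only applies to artifacts of the class it was recorded for.
        return re.fullmatch(self.url_pattern, url) is not None


def plan_capture(trace, url):
    """Expand a Trace into an ordered list of actions for a capture tool."""
    if not trace.applies_to(url):
        raise ValueError(f"Trace for {trace.portal} does not apply to {url}")
    plan = [f"load {url}"]
    for step in trace.steps:
        suffix = " (repeat until absent)" if step.repeat else ""
        plan.append(f"{step.action} {step.selector}{suffix}")
    return plan


# A Trace a curator might record once for a (made-up) slide-sharing portal.
slides_trace = Trace(
    portal="example-slides",
    url_pattern=r"https://slides\.example\.org/[^/]+/[^/]+",
    steps=[
        Step("click", "#next-slide", repeat=True),  # page through every slide
        Step("click", "a.show-notes"),              # expand the speaker notes
    ],
)

for line in plan_capture(slides_trace, "https://slides.example.org/alice/talk"):
    print(line)
```

In this sketch the Trace is recorded once and then applied to any artifact whose URL matches the pattern, which is the property that would let institutions share Traces rather than re-curate each portal independently.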
To demonstrate the potential of this approach, we have established a pilot that is available at https://myresearch.institute/. We shared a few insights gained from this pilot at the CNI 2019 Spring meeting.