With crowdsourcing and citizen science projects growing in popularity, a team of literary researchers needed a way to improve their tools, methods, and online presence. They also hoped to expand their community of contributors in order to increase productivity. To make this happen, we needed to create an online environment where new communities of contributors could access digitized manuscript collections and assist project leaders in transcribing and editing them.
The team's existing editing tools presented several challenges: licensing costs, a highly specialized interface that had proven difficult for non-technical users in the past, and a lack of collaborative capabilities, which made the transcription and editing processes costly in time and effort. Another major concern was the quality of the work that inexperienced contributors would produce.

I began by looking into collaborative and crowdsourcing projects to understand existing tools, how workload could be managed, and what drives participant motivation. I researched existing in-browser editors and how to implement them, and worked on recovering user input as XML specific to each individual project. This produced an initial proof of concept (POC) that later led to further work on the platform architecture, both to improve the robustness of the tool and to incorporate the needs and ideas of project leaders who wanted to implement their editorial process within this online platform.
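To give a concrete sense of what handling project-specific XML might involve, below is a minimal sketch in PHP that validates the XML produced by the in-browser editor against a per-project schema before it is stored. The function name, the return shape, and the assumption that each project supplies an XSD schema are illustrative only; they are not the platform's actual implementation.

<?php
// Minimal sketch (hypothetical): validate the XML submitted by the in-browser
// editor against the schema defined for a given project before storing it.

function validateTranscriptionXml(string $xmlInput, string $projectSchemaPath): array
{
    // Collect libxml errors instead of emitting warnings.
    libxml_use_internal_errors(true);
    $errors = [];

    $document = new DOMDocument();
    if (!$document->loadXML($xmlInput)) {
        foreach (libxml_get_errors() as $error) {
            $errors[] = trim($error->message);
        }
        libxml_clear_errors();
        return ['valid' => false, 'errors' => $errors];
    }

    // Each project defines its own schema describing the elements
    // contributors are allowed to use in their transcriptions.
    if (!$document->schemaValidate($projectSchemaPath)) {
        foreach (libxml_get_errors() as $error) {
            $errors[] = trim($error->message);
        }
        libxml_clear_errors();
        return ['valid' => false, 'errors' => $errors];
    }

    return ['valid' => true, 'errors' => []];
}

Invalid submissions could then be reported back to the contributor with the collected error messages rather than being silently discarded.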
We had two main user types: literary researchers (transcription specialists) who would act as project leaders, and contributing users who were literature enthusiasts and avid readers. To understand what these two groups would be able to accomplish with the platform, I needed to get into the details of the actions users would perform. I researched the editorial process by engaging transcription specialists and by reviewing the literature on existing projects, and I created initial site maps to begin work on the platform's information architecture.
The three user journeys that were identified.
An example of a user flow from the perspective of a transcription contributor.
I initially designed the platform's architecture to allow users to browse manuscripts from a catalogue, transcribe them, and edit and publish them. The catalogue view showed each document's completion status and the last user who had worked on it. When consulting a transcription, users could also see the full list of contributors to a page. Finally, users could view published pages, the result of the editing and review process we put in place for our transcription specialists.
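To illustrate the kind of information the catalogue tracked for each page, here is a simplified, hypothetical data model in PHP. The class and property names are invented for this sketch; the actual model held more detail.

<?php
// Hypothetical, simplified model of a catalogue entry.

class CataloguePage
{
    // Completion statuses surfaced in the catalogue view.
    const STATUS_NEW = 'new';
    const STATUS_IN_TRANSCRIPTION = 'in_transcription';
    const STATUS_IN_REVIEW = 'in_review';
    const STATUS_PUBLISHED = 'published';

    /** @var string Current completion status of the page. */
    public $status = self::STATUS_NEW;

    /** @var string|null Last user who worked on the page, shown in the catalogue. */
    public $lastContributor;

    /** @var string[] Full list of contributors, visible when consulting the page. */
    public $contributors = [];

    // Record a contribution and update the catalogue metadata accordingly.
    public function recordContribution($username)
    {
        if (!in_array($username, $this->contributors, true)) {
            $this->contributors[] = $username;
        }
        $this->lastContributor = $username;
    }
}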
For development, the Symfony PHP framework was chosen to provide a robust and flexible architecture that would allow for future development and additions to the platform. I developed the transcription tool within this framework, which allowed me to add new capabilities for collaboration and crowdsourcing. I put in place a user authentication system based on existing open-source components and an editing-review cycle, and created a comments section where users could reach out to others about their work.
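The editing-review cycle can be pictured as a small set of allowed status transitions between transcription, review, and publication. The sketch below is a deliberately simplified, hypothetical version of that cycle, not the workflow configuration actually used on the platform; in a Symfony application the same idea can also be expressed with the framework's Workflow component.

<?php
// Hypothetical simplification of the editing-review cycle: which status
// changes are allowed, and a guard that enforces them.

const REVIEW_CYCLE_TRANSITIONS = [
    'new'              => ['in_transcription'],
    'in_transcription' => ['in_review'],
    'in_review'        => ['in_transcription', 'published'], // reviewers can send a page back
    'published'        => [],
];

function canTransition(string $currentStatus, string $nextStatus): bool
{
    $allowed = REVIEW_CYCLE_TRANSITIONS[$currentStatus] ?? [];
    return in_array($nextStatus, $allowed, true);
}

// Example: publishing is only possible once a page has been reviewed.
var_dump(canTransition('in_review', 'published'));        // bool(true)
var_dump(canTransition('in_transcription', 'published')); // bool(false)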
At this stage I worked closely with project leaders to define a user protocol for one of our test projects, one that reflected the transcription work they expected to be accomplished. Their feedback ensured that these instructions would be well adapted to their audience.
User testing helped identify pain points in the interface and allowed us to improve the platform on both the front end and the back end. We iterated through several versions of the onboarding instructions until they were easy for novice users to assimilate. We refined the look and feel of the editor and its embedded stylesheets to facilitate transcription work, and added user-generated tooltips to help contributors understand the purpose of the elements they would be working with.
I put in place a test framework based on experiment planning and ran experiments with users to observe the relationship between page difficulty and transcription quality. I introduced a method for evaluating page difficulty from observable features such as the number of modifications and the character size. The results showed that more difficult pages produced more variable transcription quality. More importantly, the findings collected from user contributions generated new ideas for categorizing pages by difficulty, which on a broader scale could help match users to pages based on their level of expertise. I also demonstrated how users' contributions could be evaluated against target transcription samples provided by transcription specialists, and suggested that with such target samples, project leaders could evaluate contributors' work and offer them a fun scoring system to track their progress.
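As a rough illustration of scoring a contribution against a target sample, the sketch below uses PHP's built-in similar_text() to compute a percentage match and convert it into points. The similarity measure, thresholds, and points scale are assumptions made for this example and do not reflect the evaluation method used in the actual experiments.

<?php
// Illustrative only: compare a contributor's transcription with a target
// sample provided by a transcription specialist and award points.

function scoreContribution(string $contribution, string $target): array
{
    // similar_text() fills $percent with the similarity as a percentage.
    similar_text($contribution, $target, $percent);

    // Translate the similarity percentage into a simple points scale.
    if ($percent >= 95) {
        $points = 10;
    } elseif ($percent >= 80) {
        $points = 5;
    } else {
        $points = 1;
    }

    return [
        'similarity' => round($percent, 1),
        'points'     => $points,
    ];
}

// Example: a near-identical transcription lands at the top of the points scale.
$result = scoreContribution(
    'The quick brown fox jumps over the lazy dog',
    'The quick brown fox jumps over the lazy dog.'
);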