We got word earlier today that the JISC Mini-Projects program is going to fund CaPRéT!
CaPRéT is one of the tools I’ve proposed as part of my Plagiarism is Good idea. Though we dropped a T and made it sound French in what we proposed to the British!
This program was a bit of a departure from the normal bid method that JISC uses. Phil Barker describes the mini-projects program in his blog.
Mini project grants will be awarded as a fixed fee of £10,000 payable on receipt of agreed deliverables. Funding is not restricted to UK Higher and Further Education Institutions.
We submitted essentially a draft proposal that was openly discussed on the oer-discuss mailing list.
Below is the text of our proposal.
CaPRéT—Cut and Paste Reuse Tracking
Brandon Muramatsu, Massachusetts Institute of Technology, Office of Educational Innovation and Technology, mura-at-mit-dot-edu, 77 Massachusetts Ave. NE48-308, Cambridge, Massachusetts 02139, United States
- Brandon Muramatsu, MIT OEIT
- Justin Ball and Joel Duffin, Tatemae
A Brief Note: We look forward to the discussion about this project idea and your suggestions for improving and refining it over the next week. We’d like to work with projects in the UKOER community to implement CaPRéT for testing on your sites!
Our Basic Idea
Can we better understand the use/reuse of OERs by developing a tool that allows collection and content providers to track the use of individual text snippets from their webpages? We think it’s common practice for a student or faculty member to cut a section of OER text and paste it into an assignment or report, into a lesson plan, or onto another web page. They may use it as is, or they may remix it with other content. Let’s support this activity, but at the same time improve the likelihood of proper attribution, propagate the license, and most importantly track what content is being used/reused.
The development and use of Open Educational Resources, in the UK and beyond, has increased dramatically in the last few years. Despite the number of resources being developed, we still know relatively little about the use, reuse or remixing of the resources. Recently, projects in the UK and the United States have begun to focus on open education practice. A key distinguishing feature of OERs is their ability to be used, reused and remixed. OERs typically permit educators and students to freely use materials as long as they provide attribution to the source. But, how do we know if OERs are being used, reused and remixed?
In many ways, the OER movement can be viewed as an extension of the educational repository movement dating back to the mid-1990s. It has inherited many of the processes used by those repositories; relevant to the topic at hand is the reliance on web server logs as proxies for use. The reliance on web server logs means that analytics are focused on access instead of use (McMartin & Muramatsu, 2007). Web server logs can say which pages were accessed and for how long, and what might have been downloaded. These analytics also focus on the larger objects within a collection, whole pages and files; very little is known about the use of individual snippets of text within the pages (Muramatsu & Caswell, 2010). We’ve discussed the possibility of running automated web searches for the individual text snippets; however, we believe it’s more effective to collect this data at the point of use. The work proposed here is designed to capture use and reuse directly from the text snippets, though it could be enhanced by correlating the results with web searches for reuse.
There has been recent research and development in areas that focus on better understanding use, reuse and remixing of OERs. For example, there have been a number of published studies looking at reuse and remix in Connexions (Ochoa, 2010; Duncan, 2009). Yet most OERs don’t exist in the kind of environment that supports authoring, publishing and use, so it doesn’t provide a great model to build upon for everyone else. And absent a tracking mechanism, such as that proposed by Scott Leslie and worked on as part of his OLNet fellowship (Leslie, 2010), most collections are left to report statistics based on access and inference gleaned from web server logs. [Or perhaps a project will be proposed for the JISC CETIS OER Technical Mini Projects that can go beyond access metrics!]
For Project 3 we will focus on one aspect of the problem of reuse and tracking of OER content. We will develop a Cut and Paste Reuse Tracking (CaPRéT) tool that facilitates direct linking to content for users while providing OER providers with a method of tracking use (based on cut and paste text) and that ensures that appropriate citation/attribution and license information is automatically included as the text is pasted.
How might it work?
- As a user selects text on a CaPRéT-enabled site and pastes it into a rich-text-enabled editor, the highlighted text is pasted along with a link back to the original site and full citation/attribution information (to support the attribution clause of most OER Creative Commons licensed content).
- The OER resource provider receives notification that text from a particular page was cut and (probably) reused. If the user then clicks on the link, the OER resource provider receives confirmation that the text was indeed used elsewhere, and can track continued reuse based on the additional clicks to the link.
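The first step above can be sketched in a few lines. This is only an illustration under stated assumptions, not the CaPRéT implementation: the `Source` fields, the `build_attributed_html` function and the `capret` tracking parameter are all hypothetical names chosen for the example. It shows how copied text might be wrapped with a link back to the source and with license information before it reaches a rich text editor.

```python
from dataclasses import dataclass
from html import escape
from urllib.parse import urlencode


@dataclass
class Source:
    """Attribution details a provider might publish for a page (hypothetical shape)."""
    title: str
    author: str
    url: str
    license_name: str
    license_url: str


def build_attributed_html(selected_text: str, source: Source, snippet_id: str) -> str:
    """Return the copied text followed by a citation with a trackable link back."""
    # "capret" is an assumed query parameter name; a click on this link would
    # tell the provider that the pasted snippet is in use somewhere.
    separator = "&" if "?" in source.url else "?"
    link = source.url + separator + urlencode({"capret": snippet_id})
    return (
        f"<blockquote>{escape(selected_text)}</blockquote>"
        f'<p>Source: <a href="{escape(link)}">{escape(source.title)}</a> '
        f"by {escape(source.author)}, licensed under "
        f'<a rel="license" href="{escape(source.license_url)}">'
        f"{escape(source.license_name)}</a>.</p>"
    )


if __name__ == "__main__":
    page = Source(
        title="Introductory Mechanics",
        author="Example Author",
        url="https://example.org/oer/mechanics",
        license_name="CC BY 3.0",
        license_url="https://creativecommons.org/licenses/by/3.0/",
    )
    print(build_attributed_html("Force equals mass times acceleration.", page, "abc123"))
```

In a browser this logic would presumably run in a copy event handler on the provider's page; the sketch keeps it as a pure function so the attribution format can be discussed separately from the event plumbing.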
The CaPRéT concept is based on the commercial service available from Tynt Insight, but differs in that it will be developed to meet the needs of the OER community (attribution and licensing).
We fully expect that CaPRéT will draw upon and integrate with the existing work of OpenAttribute to extract attribution and Creative Commons licensing. The current OpenAttribute tool enables end users to better cite Creative Commons licensed content when no other attribution information is provided. It is a browser extension (for Chrome, Firefox and Opera) that searches for Creative Commons RDF metadata and creates citation information for a web page. In contrast, CaPRéT is a tool for content providers to enable them to send their attribution and licensing information along with text that is pasted from their site.
Some of the challenges for the project will be: creating a rich citation/license propagation if Creative Commons RDF metadata doesn’t exist, addressing privacy concerns with data collection, ensuring simplicity of use from all perspectives, and identifying appropriate usage metrics that can be collected and reported back to content providers.
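To make the first challenge concrete, here is a minimal sketch of looking for license markup on a page and falling back when none is found. It assumes the license is expressed as a link carrying `rel="license"`, which is one common way Creative Commons licenses are marked up; it does not parse full RDF metadata, and `LicenseLinkFinder` and `find_license` are hypothetical names, not part of OpenAttribute or CaPRéT.

```python
from html.parser import HTMLParser


class LicenseLinkFinder(HTMLParser):
    """Collect href values of <a> or <link> elements marked rel="license"."""

    def __init__(self):
        super().__init__()
        self.license_urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, or be missing entirely.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href")
        if "license" in rel_tokens and href:
            self.license_urls.append(href)


def find_license(page_html, fallback=None):
    """Return the first license URL found in the page, or the fallback value."""
    finder = LicenseLinkFinder()
    finder.feed(page_html)
    return finder.license_urls[0] if finder.license_urls else fallback


if __name__ == "__main__":
    page = '<p>Notes <a rel="license" href="https://creativecommons.org/licenses/by/3.0/">CC BY</a></p>'
    print(find_license(page))
    print(find_license("<p>No license markup here</p>", fallback="license unknown"))
```

What the fallback should be (a provider-supplied default, a plain link back, or nothing at all) is exactly the open question the challenge describes; the sketch only shows where that decision would plug in.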
We propose a six-month project running from May 1, 2011 to October 31, 2011 to develop CaPRéT.
- CaPRéT Server: Server to track usage (source URL, copied text, number of return clicks, etc.) and notify the resource provider. The server will likely use MySQL, Java and/or Ruby.
- Tested Implementation: Initial testing at Project Greenfield mirror of MIT OpenCourseWare, OERGlue and 1-3 UKOER projects. (We’ve been discussing this concept with our colleagues at the Open University Institute of Educational Technology and Knowledge Media Institute.)
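As a rough picture of the data the server deliverable describes (source URL, copied text, number of return clicks), the following sketch keeps the records in memory. It is an assumption-laden illustration: the proposal mentions MySQL, Java and/or Ruby for the real server, and the `SnippetUsage` and `UsageTracker` names and fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SnippetUsage:
    """One tracked snippet: where it came from and how often it was used."""
    source_url: str
    copied_text: str
    copies: int = 0
    return_clicks: int = 0


class UsageTracker:
    """In-memory stand-in for the tracking server's storage."""

    def __init__(self):
        self._snippets = {}  # maps snippet_id -> SnippetUsage

    def record_copy(self, snippet_id, source_url, copied_text):
        """Called when a user copies text from an enabled page."""
        usage = self._snippets.setdefault(
            snippet_id, SnippetUsage(source_url, copied_text)
        )
        usage.copies += 1
        return usage

    def record_return_click(self, snippet_id):
        """Called when someone follows the link embedded in pasted text."""
        usage = self._snippets.get(snippet_id)
        if usage is None:
            return None  # unknown snippet; nothing to count
        usage.return_clicks += 1
        return usage

    def report(self, source_url):
        """Return every tracked snippet copied from the given page."""
        return [u for u in self._snippets.values() if u.source_url == source_url]


if __name__ == "__main__":
    tracker = UsageTracker()
    url = "https://example.org/oer/mechanics"
    tracker.record_copy("abc123", url, "Force equals mass times acceleration.")
    tracker.record_return_click("abc123")
    for usage in tracker.report(url):
        print(usage)
```

A copy with no return clicks corresponds to text that was "(probably) reused", while return clicks are the stronger signal that the text was actually published elsewhere.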
We plan to develop CaPRéT as a service. The initial implementation will be hosted by MIT OEIT/Tatemae and then after initial testing and refinement we plan to make the software available via an open source code repository (likely with an MIT License via Github).
We will post regular short progress updates and all deliverables including a final report to the oer-discuss list. We welcome an ongoing dialog with the UKOER and broader OER communities.
Experience of the Team
MIT Office of Educational Innovation and Technology: OEIT has been deeply involved on a policy and strategy level in Open Education worldwide through its work over the last decade. Previously, OEIT developed the Open Knowledge Initiative, a technical framework for system-level interoperability for plug-and-play educational projects. Currently, Brandon Muramatsu at OEIT is working on Project Greenfield to extend the reach and usefulness of MIT OpenCourseWare through new tools and services. Greenfield will serve as a testbed for CaPRéT.
Duncan, S. M. (2009). Patterns of Learning Object Reuse in the Connexions Repository. Ph.D. Dissertation. Retrieved August 8, 2010 from: http://www.archive.org/details/PatternsOfLearningObjectReuseInTheConnexionsRepository
Leslie, S. (2010, July 12). OLNet Fellowship Week 2 – Initial Thoughts on Tracking Downloaded OERs. Retrieved April 8, 2011 from EdTechPost Web site: http://www.edtechpost.ca/wordpress/2010/07/12/olnet-tracking-oer-first-stab/
Ochoa, X. (2010). Connexions: a Social and Successful Anomaly among Learning Object Repositories. Journal of Emerging Technologies in Web Intelligence 2, no. 1: 11. Retrieved August 8, 2010 from: http://ojs.academypublisher.com/index.php/jetwi/article/viewArticle/2459
McMartin, F. & Muramatsu, B. (2007, June). Use versus Access: Design and Use in Educational Digital Libraries. Proceedings of the 7th ACM/IEEE-CS Joint Conference on Digital Libraries, Vancouver, BC.
Muramatsu, B. & Caswell, T. (2010, November 3). Plagiarism is Good: Moving from Access to Use as Metrics for OCW/OER Use and Reuse. Presentation at Open Education 2010 Conference: Barcelona, Spain, November 3, 2010. Retrieved April 8, 2011 from: http://www.slideshare.net/bmuramatsu/plagiarism-is-good-moving-from-access-to-use-as-metrics-for-ocwoer-use-and-reuse