Why DOI?

Image 1: Oprah with text “You get a DOI, everything gets a DOI”

While many researchers have heard of Digital Object Identifiers (DOIs), some may not know why and when they should be used. The single most important characteristic of DOIs is that they can be attached to just about any digital, online research output. If something has a URL, or a specific location on the web, it can be assigned a DOI. The versatility of DOIs means they can be tied to journal articles, datasets, supplemental material, and addenda; to video, audio, streaming media, and 3D objects; to theses, dissertations, technical reports, and visualizations. More recently, DOIs have been assigned to pre-prints of articles, acknowledging that in some disciplines the pre-print is as valuable as the published version.

Why does this matter? As the APA Style Blog explains,

The DOI is like a digital fingerprint: Each article receives a unique one at birth, and it can be used to identify the article throughout its lifespan, no matter where it goes. (https://shar.es/1VECYv)

This digital fingerprint grows in importance as we move into an era that scholar Péter Jacsó has described as a “metadata mega mess.” Keyword searches by title or author in Google, for example, and even Google Scholar, which relies on mechanisms rather than unique IDs, often return inaccurate information: titles are attributed to the wrong authors, especially those with common names; citations of articles are mistaken for the original article; publication years become volume numbers; and a score of other inaccuracies. Researchers who rely on Google Scholar often quip that the service provides an easy way to begin a citation search, but that sources must be verified by DOI through Crossref and other registries. An article with a DOI reduces its risk of becoming lost in this “metadata mega mess” (Péter Jacsó, “Metadata mega mess in Google Scholar”, Online Information Review 2010: 34.1: 175-191, https://doi.org/10.1108/14684521011024191).

The second essential feature of the DOI is that it is persistent. As a unique identifier, it enables digital objects to be found anywhere, anytime, with one simple click on a link. This means that a paper or dataset is accessible and discoverable without requiring a separate search. Incorporated into a citation, the DOI becomes a guaranteed location for the item cited because it will always resolve to the right web address (URL). When attached to a resource, the DOI is also machine-readable, supporting online discovery as well as targeted aggregations and indexes.

The Anatomy of a DOI
Every DOI has three parts:

anatomy of a doi diagram
Source: http://www.ands.org.au/online-services/doi-service/doi-policy-statement. CC-BY
  • Resolving Web Address. Like web addresses (URLs), DOIs enable research output to be discoverable and accessible. Online publishing and digital archiving have made them almost a necessity for scholarship, and they have become the de facto standard for identifying research output.
  • Prefix. The prefix is the beginning of a unique, alphanumeric ID that represents a digital object, and as such it creates an actionable, interoperable, persistent link to the work. The prefix is almost always associated with the entity or organization that registered the DOI, and can allow users to trace the digital material back to its source.
  • Suffix. The final part of the alphanumeric ID is unique to its assigned object.

The integrity of DOIs is guaranteed because they do not rely solely on URLs and the web’s DNS (Domain Name System) servers for resolution. A DOI, then, is both an online location and a unique name and description of a specific digital object. Moreover, while the DOI base infrastructure is built on the Handle System, DOIs run on a managed global network dedicated to their resolution.

A recent data DOI created for a data set in the IUScholarWorks repository (https://doi.org/10.5967/K8SF2T3M) illustrates one of our unique prefix “shoulders” (10.5967/K8) and a randomly generated alphanumeric string that is unique to this object (SF2T3M). Our open access journal system, on the other hand, is configured to create DOIs that are more semantic and tell us more about the object. This DOI (https://doi.org/10.14434/v17i3.21306) also has a unique prefix for Indiana University’s open journal system (10.14434). What’s more, the rest of the ID tells us that it is from Volume 17, Issue 3, article number 21306 of its originating journal.
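The anatomy described above can be checked mechanically. The following is a minimal Python sketch, not part of any IU or Crossref tooling: the names split_doi, DoiParts, and RESOLVER are hypothetical and exist only for illustration. It splits a DOI link into its resolving address, prefix, and suffix, using the two example DOIs from this post.

```python
from typing import NamedTuple

RESOLVER = "https://doi.org/"  # resolving web address used in the links above


class DoiParts(NamedTuple):
    resolver: str
    prefix: str
    suffix: str


def split_doi(doi_url: str) -> DoiParts:
    """Split a DOI link into resolver, prefix, and suffix.

    Hypothetical helper for illustration only.
    """
    if not doi_url.startswith(RESOLVER):
        raise ValueError(f"not a {RESOLVER} link: {doi_url!r}")
    name = doi_url[len(RESOLVER):]
    # The prefix ends at the first slash; everything after it is the suffix.
    prefix, sep, suffix = name.partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"malformed DOI: {name!r}")
    return DoiParts(RESOLVER, prefix, suffix)


data_doi = split_doi("https://doi.org/10.5967/K8SF2T3M")
article_doi = split_doi("https://doi.org/10.14434/v17i3.21306")

print(data_doi.prefix, data_doi.suffix)
print(article_doi.prefix, article_doi.suffix)
```

Note that this split treats everything after the first slash as the suffix, so the “shoulder” K8 appears at the start of the data DOI’s suffix; the shoulder is a convention for grouping identifiers, not a separate syntactic part.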

So, Why DOI?
The short answer is that DOIs increase the reach and impact of your work. Publishers, repositories, aggregators, indexers, and providers of research and academic profiles are now relying on DOIs to identify specific works accurately, which in turn more reliably links that work to its authors and creators. Furthermore, metadata and information about individual works are increasingly tied to DOIs.

Crossref, one of the largest providers of DOIs for publications and the provider of DOIs for our open journal program, continues to expand the metadata that can be tied to DOIs, thereby increasing what your work can do in the world. The Scholarly Communication Department plans to deploy two specific Crossref programs that use DOIs to improve the accuracy and accessibility of usage data, bibliometrics, research profiles, and altmetric impact. Cited-by uses an object’s DOI to track where and how a digital publication or dataset has been cited; the results can be displayed alongside an article with other metadata, such as authors’ bios (https://www.crossref.org/services/cited-by). Event Data, a program currently being rolled out by Crossref, goes even further. It will leverage the increasing ubiquity of DOIs to enhance the metrics available to scholars for their work. Event Data will collect what are commonly known as altmetrics: a publication’s appearances on social media and in online communities, such as Wikipedia, Reddit, Twitter, Stack Exchange, and blogs (https://www.crossref.org/services/event-data).

Furthermore, for any research products (from software and datasets to technical reports and presentations) created and authored by IU faculty, staff, and students that do not already have a DOI, the IUScholarWorks Repository can mint one free of charge.

How to Create a Data Management Plan

Grant proposal season is upon us. Increasingly, writing a grant proposal also means writing a data management plan that details how data will be managed, preserved, and shared after a funded project ends. The Scholarly Communication Department offers a Data Management Planning service and works directly with PIs, grant writers, and administrators to create plans that align with funder requirements.

Why are data management plans required?

In February of 2013, the White House Office of Science and Technology Policy released a memo entitled “Increasing Access to the Results of Federally Funded Scientific Research.” This memo mandated that all federal agencies with over $100 million in annual conduct of research and development expenditures develop a plan for public access to research output. Data management plans, previously required only in some circumstances by some federal agencies, became widespread. By October 2016, all federal agencies meeting these criteria had implemented public access policies. These public access policies hinge on the precept that research funded by taxpayer dollars should be made available to the public, industry, and research community.

Why can’t I preserve data with my funding agency?

The 2013 OSTP memo was an unfunded mandate. This contributed to a landscape of distributed solutions provided by many stakeholders in academic research. Commercial publishers, universities, nonprofits, and government data centers all worked to help researchers comply with the new data sharing guidelines. In some cases, individual directorates or divisions provide or endorse a data repository: for example, the Arctic Data Center for NSF-funded science on the Arctic, or GenBank, the NIH genetic sequence database. In other cases, researchers are expected to use their discretion in selecting an appropriate data sharing solution.

Where do I find data management plan requirements?

Indiana University is a member of the DMPTool, a tool that walks users through creating, reviewing, and sharing a data management plan.

Image 1: Data Management Planning Tool (dmptool.org) homepage
https://dmptool.org/

The tool has pre-fabricated templates for each directorate/division across funding organizations. To browse requirements for a specific funder, navigate to the DMP Requirements section and search for or select a funder from the list provided. To create a data management plan using one of these templates, log in to the tool using IU credentials and select the relevant funder from the list provided.

How do I choose a repository for my data?

This question is best answered on a case-by-case basis, but there are general guidelines that researchers can use to make the best choice. If in doubt, get in touch.

  1. If a repository is mandated by a funding organization, researchers must use this repository for sharing data.
  2. If there is a widely used disciplinary repository in your domain, consider choosing that repository. If you aren’t sure, check author guidelines for the top three journals in your field. Do they all recommend the same repository for sharing data? Alternatively, take a look at www.re3data.org/ to see a registry of disciplinary repositories.
  3. If you have no appropriate disciplinary repository, would rather not pay fees to deposit data, or prefer to keep your data with your institution, consider Indiana University’s institutional repository, IUScholarWorks. It is completely free, operated by the Libraries, and designed to support funder requirements.
  4. If none of the above solutions are appropriate for your data and you need unique or specific features, look for an established, well-supported, open repository like Zenodo (integrates with GitHub!) or Harvard’s Dataverse (APIs! Maps geospatial files!).

I want to use IUScholarWorks to preserve and share my data. What do I say in my plan?

Language for data management plans will differ depending on the project and the funder. However, many researchers have found the following statement to be a useful starting point in describing IUScholarWorks:

To increase access to the published research that has been funded, the researchers will deposit peer-reviewed or pre-print manuscripts (with linked supporting data where possible) in the IUScholarWorks institutional repository. A DOI will be created for the data and used in all publications to facilitate discovery.
These data will be preserved according to the current digital preservation standards in place for content within IU’s institutional repository infrastructure. This includes a duplicate copy within the IU Scholarly Data Archive (SDA) and eventual deposit into the Digital Preservation Network preservation platform.
The combination of these systems provides mirroring, redundancy, media migration, access control, file integrity validation, embargoes, and other security-based services that ensure the data are appropriately archived for the life of the project and beyond.

I have a lot of data – can I still put it in IUScholarWorks?

Yes. In almost all cases, we are able to provide free data archiving to IU-affiliated researchers through our partnership with the UITS Scholarly Data Archive. Large datasets live in the Scholarly Data Archive and are made accessible through IUScholarWorks by way of a persistent URL. Here is an example of a weather dataset published in IUScholarWorks.

Pro tip: You can drop off your dataset in the departmental staging area and send us an email with contextual information – we’ll do the heavy lifting and make sure it gets into IUScholarWorks.

Who can help me with my data management plan?

We can. Contact us for assistance creating or implementing a data management plan. The Scholarly Communication Department can help to connect PIs with free campus-supported services to preserve and share data.

When Can I Deposit What? Everything You Need to Know about Permissions and Versions When Submitting to the Repository

Every time you submit an item to the IUScholarWorks repository, you must accept the IUScholarWorks License. By accepting our non-exclusive license, you acknowledge that you either own the copyright to the work you are depositing, or you have been granted permission by the copyright holder to deposit it. If you are depositing material that has already been published, you will first need to find out if you hold the copyright.

When you publish an article in a journal, copyright is typically transferred to the publisher (this will be indicated in your original publishing agreement). If the publisher owns the copyright to your work, you will need to check whether they allow you to deposit it in the institutional repository. Fortunately, most publishers have developed explicit policies that speak to this, so you often won’t need to contact them directly. You can search for a publisher’s copyright policy on their website, or use the SHERPA/RoMEO database.

When publishers do allow you to deposit your work in an institutional repository, they frequently impose restrictions, such as an embargo period and/or the type of version permitted.

Embargoes

Publisher embargo periods can range anywhere from 6 to 24 months (and sometimes longer). If a publisher requires you to embargo your work, you can still deposit it in the institutional repository now and designate the amount of time after which it can be made openly available.
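As a rough illustration of the arithmetic, the sketch below computes the date on which an embargoed deposit could become openly available. It is a hypothetical helper, not a feature of IUScholarWorks: the function name embargo_end, the choice to count from the publication date, and the rule for clamping to the end of a shorter month are all assumptions. Check your publisher’s policy for the actual start date and length.

```python
import calendar
from datetime import date


def embargo_end(published: date, months: int) -> date:
    """Return the date `months` calendar months after `published`.

    Assumption: a day that does not exist in the target month is
    clamped to that month's last day (31 August + 6 months gives
    the last day of February).
    """
    if months < 0:
        raise ValueError("embargo length must not be negative")
    index = published.month - 1 + months  # zero-based month count
    year = published.year + index // 12
    month = index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(published.day, last_day))


# Example publication dates below are invented for illustration.
print(embargo_end(date(2023, 8, 31), 6))
print(embargo_end(date(2024, 1, 15), 24))
```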

Version types

There are three types of versions that a publisher may or may not allow you to submit to the institutional repository:

Pre-print – a draft of an article before peer review

Post-print – the final, peer-reviewed article submitted for publication

Publisher PDF – the final, peer-reviewed article in the publisher’s typesetting and formatting

It’s important to note that content-wise, the post-print and the publisher PDF versions are identical. Many more publishers allow authors to deposit the post-print version in the repository than they do the publisher PDF version.

If you are ever unsure about what work you can or can’t deposit, please contact the IUScholarWorks Team.

Copyright and IUScholarWorks

So you want to submit a published or unpublished article to the IUScholarWorks (IUSW) repository? Here’s what you’ll need to know about copyright.

If you are submitting an unpublished article, no worries – you are the rightsholder, so go ahead and submit it to IUSW. If you are submitting an article that has been previously published, though, you (the author) are probably not the rightsholder. If this is the case, you will need to do a little extra research before depositing into IUSW.

Generally, copyright transfers over to a publisher upon publication of an article, so you will need to check with the publisher prior to depositing it. If you still have your signed publishing agreement this should indicate what your rights are. If you don’t have this document, here are some suggestions to move forward.

  1. Your first step is to search SHERPA/RoMEO, a freely available online database of publisher copyright policies. Simply type in the name of your journal and you should receive information on what you can submit to an institutional repository such as IUSW. (For those new to S/R, this helpful video should clarify the search process and terminology.)
  2. If you cannot find information through SHERPA/RoMEO, you will want to check to see if the journal has a website. If so, copyright information may be located there.
  3. The final way to check copyright of an article is to contact the editor of the journal, not the publisher, which usually oversees many journals. It is helpful for the author of the work in question to write the message; we’ve found that this usually helps expedite the process. You can use a format like this sample letter to the editor.

After completing these steps, you should now know what exactly can be deposited into IUSW: pre-print, post-print, or the publisher’s version of your article.

One easy way to save yourself this trouble moving forward is to complete the SPARC Author Addendum prior to signing your copyright over to a publisher. This legal document ensures that you keep the rights that you want, including the ability to archive your work in an institutional repository like IUSW. Read about the addendum to determine if it’s right for you!

Open Access, Copyright, Licensing, and IUScholarWorks

When most people hear the term “open access,” they typically think of information that is freely accessible on the web; however, that encompasses only half of what open access stands for. Open access is not only about being able to obtain information for free, but also about being able to reuse that information freely, i.e. how that information is subsequently distributed, linked to, and built upon.

Image 1: Example of a derivative work, with green arrows pointing between the original and the new
Example of a derivative work. Retrieved from http://commons.wikimedia.org/wiki/File%3ADerivative-work-icon.svg

By default, you, the author, hold the copyright for every new work you create, meaning you alone have the right to distribute and create derivatives from it. The good news is you can grant others some of these rights by adding a Creative Commons license to your work, which explains to users what they may or may not do with it. For example, a CC-BY license tells users that they may distribute and create derivative works, as long as they attribute the original work to you.

Adding a Creative Commons license to your work in IUScholarWorks is a simple step. When you submit an item to the repository, you have the opportunity to specify the name of a license in the Rights field during the submission process. Remember, leaving this field blank means that you reserve all rights to your work!

To learn more about licensing options, check out the Creative Commons website (http://creativecommons.org/licenses/) or contact the IUScholarWorks team.

Free Tools to Visualize Your Data

Data visualization has grown in popularity as datasets have become larger and tools have become more user-friendly. This area is eagerly being explored by researchers in a variety of disciplines. Although many people think of numbers when they consider types of data, data comes in many forms–including text! In fact, for many researchers, especially those in the humanities or social sciences, text is their primary data source.

Image 1: network visualization
This example of a network visualization could be created using a tool like Gephi or Sci2. Image: Clickstream Data Yields High-Resolution Maps of Science. Johan Bollen, Herbert Van de Sompel, Aric Hagberg, Luis Bettencourt, Ryan Chute, Marko A. Rodriguez, Lyudmila Balakireva. http://www.plosone.org/article/info:doi/10.1371/journal.pone.0004803

Here is a brief list of freely available tools you can use to explore and visualize both numerical and textual data. This list is by no means comprehensive; to check out additional tools, try the visualization tool list at Bamboo DiRT.

  • D3 – A JavaScript data visualization library. While you would need to invest the time to learn basic JavaScript, this introductory tutorial breaks down steps to learn D3. You can also check out the array of impressive visualizations resulting from its use.
  • Gephi – If you want to invest the time to learn only one visualization tool, this open source software for visualizing networks and complex systems is a great choice. Take a look at one of the many available tutorials to get started.
  • ManyEyes – This tool allows users to easily upload datasets and create basic visualizations. To get a feel for the types of visualizations created, view the ManyEyes gallery.
  • Sci2 Tool – This tool, developed at the Indiana University Cyberinfrastructure for Network Science Center, is billed as “a modular toolset specifically designed for the study of science [that] supports the temporal, geospatial, topical, and network analysis and visualization of scholarly datasets.” Its strength lies in its ability to handle network data, similar to Gephi.
  • Tableau Public – This free, limited-functionality version of the popular software Tableau simplifies the act of creating charts and graphs.
  • Voyant – This is a browser-based platform for analysis and visualization of texts. It is a beginner-friendly tool with modest functionality: visualizations created within Voyant are limited to charts and graphs, though it would be easy to plug the data generated by the program into another platform with greater capacity for visualization, such as Gephi.
  • WordSeer – WordSeer is a textual analysis and visualization tool comparable to Voyant. The latest version, 3.0, has not yet been released publicly.

Lastly, I would be remiss if I failed to mention the important role that data management plays in data visualization. Poorly managed data may hinder your ability to create effective visualizations, so learn a few simple steps to manage your data more effectively. For more information, contact Stacy Konkiel, Science Data Management Librarian, at skonkiel@indiana.edu to schedule a consultation!

Open Access: 7 Things You Need to Know

Stacy Konkiel, Science Data Management Librarian, @skonkiel, and I, Jen Laherty, Digital Publishing Librarian, @jlaherty, were asked to provide the Bloomington Library Faculty Council with an overview of Open Access. Here is our quick presentation, given December 4, 2013.

On a related note, the Bloomington Faculty Council Library Committee is leading a discussion on whether to recommend that an open access deposit policy be adopted by Indiana University Bloomington faculty. The committee is co-chaired by Jason Baird Jackson, Director of the Mathers Museum of World Cultures and Associate Professor of Folklore and Ethnomusicology, and Ted Striphas, Associate Professor and Director of Graduate Studies in the Department of Communication and Culture. A similar conversation is happening at the IUPUI campus.

Librarians at IUB may wish to discuss an open access deposit policy for their scholarly outputs ahead of a campus policy – akin to those described in the seventh ‘thing to know’ in the linked presentation.

17 More Essential Altmetrics Resources (the Library Version)

As promised, I have compiled some “required reading” related specifically to altmetrics and their use in libraries. These articles and blog posts actually comprise a majority of the writing out there on altmetrics in libraries–there’s surprisingly little that librarians have written to date on how our profession might use altmetrics to enhance our work.

Ironically enough (given librarians’ own OA advocacy), some of the articles linked below have been published in toll access library science journals. Apologies in advance for any paywalls you may encounter. (Though if you do find barriers to access, you should tell OA Button about it!)

General

Collection Development

Research Data Curation

  • Weber, N. M., Thomer, A. K., Mayernik, M. S., Dattore, B., Ji, Z., & Worley, S. (2013). The Product and System Specificities of Measuring Curation Impact. International Journal of Digital Curation, 8(2).  doi:10.2218/ijdc.v8i2.286

Institutional Repositories

  • Day, M. (2004). Institutional repositories and research assessment. Project Report (pp. 1–30). Bath: UKOLN, University of Bath. Retrieved from http://opus.bath.ac.uk/23308/
  • Frank Scholz, S. D. (2006). International Workshop on Institutional Repositories and Enhanced and Alternative Metrics of Publication Impact. CERN. Retrieved from http://edoc.hu-berlin.de/series/dini-schriften/2006-8/PDF/8.pdf
  • Konkiel, S., & Scherer, D. (2013). New opportunities for repositories in the age of altmetrics. Bulletin of the American Society for Information Science and Technology, 39(4), 22–26. doi:10.1002/bult.2013.1720390408
  • Merceur, F., Le Gall, M., & Salaün, A. (2011). Bibliometrics: a new feature for institutional repositories. In 14th Biennal EURASLIC Meeting (pp. 1–21). Lyon. Retrieved from http://archimer.ifremer.fr/doc/00031/14253/11886.pdf
  • Organ, M. K. (2006). Download Statistics – What Do They Tell Us? The Example of Research Online, the Open Access Institutional Repository at the University of Wollongong, Australia. D-Lib Magazine. Retrieved February 13, 2012, from http://ro.uow.edu.au/asdpapers/44/

Do you have “must read” articles relating to libraries and altmetrics that didn’t make it on this list? Leave ’em in the comments below!

Want to read some general altmetrics-related research? Check out the original list of 17 Essential Altmetrics Resources.

Simple Steps to Manage Your Data More Effectively

Data management can be an intimidating topic. However, learning how to manage your data can improve your research processes and therefore your life! Not to mention the fact that many grant funding agencies now require data management plans to be submitted with proposals. Is your interest piqued yet? Read below for some easy first steps toward managing your data.

Consider your current data practices

Here are some preliminary questions to ask yourself.

  • What data do I collect?
  • Do I follow a process for collecting and documenting my data?
  • Who contributes data–just me or others, too?
  • What format is the data in?
  • Where is the data stored?
  • Is the data being backed up?

Determine areas to improve

Compare the following suggestions to your own data practices. If you can start taking steps to improve the weaker areas, you’ll be all set.

  • Documentation – Document the processes and workflows you follow when collecting and managing your data in a README file. It is also important to follow standards within your field for documenting contextual information about your data. In library jargon, this is known as metadata. To search for a metadata standard in your discipline, try the Digital Curation Centre’s helpful search tool.
  • Formats – Ideally, data should be stored in open, non-proprietary formats. This will ensure that it can be accessed well into the future. The Open Data Handbook gives a good overview of open formats. This can be as simple as saving a file as CSV instead of an Excel spreadsheet, or as a text file instead of a Microsoft Word document.
  • Storage – IU offers several options for data storage. You can store your data on the cloud through IU Box, which also provides excellent versioning and collaborative functionality. For sensitive or large data sets, you can use the Scholarly Data Archive. Whatever you do, just make sure that you are backing up your data and not just relying on your hard drive to keep your data safe. Also note that these options do not ensure long-term preservation. For this, you should consider adding completed data sets to the IU institutional repository, IUScholarWorks (IUSW).
  • Sharing and Access – Opening up your data won’t be appropriate for all researchers, but those whose research is complete should consider storing their data in IUSW to promote discoverability and access to their data.
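
The documentation and format suggestions above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not an IU-provided tool: the function name save_open_formats, the file names, and the sample rows are all invented placeholders. It writes tabular data as CSV and the accompanying notes as a plain-text README in the same folder.

```python
import csv
import tempfile
from pathlib import Path


def save_open_formats(folder: Path, rows: list[dict], notes: str) -> None:
    """Write rows as CSV and notes as a plain-text README (hypothetical helper)."""
    if not rows:
        raise ValueError("no rows to write")
    folder.mkdir(parents=True, exist_ok=True)
    # Column order follows the keys of the first row.
    fieldnames = list(rows[0].keys())
    with open(folder / "observations.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    (folder / "README.txt").write_text(notes, encoding="utf-8")


# Invented sample data, used only to show the output layout.
sample_rows = [
    {"site": "A", "temp_c": 21.5},
    {"site": "B", "temp_c": 19.0},
]
sample_notes = "Columns: site (station label), temp_c (temperature in Celsius).\n"

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "dataset"
    save_open_formats(out, sample_rows, sample_notes)
    csv_text = (out / "observations.csv").read_text(encoding="utf-8")
    readme_text = (out / "README.txt").read_text(encoding="utf-8")

print(csv_text)
```

The example writes to a temporary folder so that it leaves nothing behind; in practice you would point it at your project directory and describe your collection process, units, and any field-specific metadata standard in the README.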

Get help

Data management advice is nearly impossible to generalize, especially in a short blog post! Contact Stacy Konkiel, Science Data Management Librarian, at skonkiel@indiana.edu with questions, comments, or to schedule a one-on-one consultation about how the IU Libraries Data Management Service can help you manage your data.

Predatory Publishers and IUScholarWorks

My name is Brianna Marshall and I am the Scientific Data Curation Assistant in the Scholarly Communication Department. While my responsibilities primarily pertain to helping researchers manage their data, I also work with IUScholarWorks (IUSW) quite frequently. Making your work available in IUSW ensures that it is preserved and made available to researchers around the world. Unfortunately, individuals submitting work to IUSW and other institutional repositories may find themselves targeted by predatory open access publishers.

What is a predatory publisher?

Often, predatory publishers do not offer traditional editorial services, such as peer review (although they may claim that they do). Many of these journals will accept an article then let the author know that they owe an exorbitant publication fee.

These predatory publishers can seem legitimate – they may have fully functional websites and authors’ rights statements that are similar to those of well-respected publishers, but this is no guarantee of their quality. The rise of online publishing has made it easier for these groups to masquerade as legitimate publishers.

How can I identify a predatory publisher?

Predatory publishers pose little risk to researchers who can identify them and discount them as an option for disseminating their work.

Predatory publishers are seeking to make a large profit, so they are known to aggressively seek out new authors or editors. Receiving a form email that requests your submission to a particular publisher should be your first clue. Some publishers are bold enough to find authors who have submitted to institutional repositories: a librarian within our department experienced this firsthand after submitting her work into IUSW.

Don’t be fooled by these publishers. If you have any suspicions about the publisher, we recommend that you consult Beall’s List of Predatory Publishers. Jeffrey Beall, a librarian at the University of Colorado-Denver, publishes a list of “potential, possible, or probable predatory scholarly open-access publishers” on his website. If after consulting his list you still have questions or concerns, consult your local librarian.

How can I avoid unwanted reuse of my work?

Clearly licensing your work with a non-commercial Creative Commons license is a possible way to thwart unwanted reuse of your work, but it’s not fool-proof. The rise of predatory publishers means that scholars need to be more vigilant than ever about researching where they choose to publish and what rights they have over that work.