How to Create a Data Management Plan

Grant proposal season is upon us. Increasingly, writing a grant proposal also means writing a data management plan that details how data will be managed, preserved, and shared after a funded project ends. The Scholarly Communication Department offers a Data Management Planning service and works directly with PIs, grant writers, and administrators to create plans that align with funder requirements.

Why are data management plans required?

In February of 2013, the White House Office of Science and Technology Policy released a memo entitled “Increasing Access to the Results of Federally Funded Scientific Research.” This memo mandated that all federal agencies with over $100 million in annual conduct of research and development expenditures develop a plan for public access to research output. Data management plans, previously required only in some circumstances by some federal agencies, became widespread. By October 2016, all federal agencies meeting these criteria had implemented public access policies. These public access policies hinge on the precept that research funded by taxpayer dollars should be made available to the public, industry, and research community.

Why can’t I preserve data with my funding agency?

The 2013 OSTP memo was an unfunded mandate. This contributed to a landscape of distributed solutions provided by many stakeholders in academic research. Commercial publishers, universities, nonprofits, and government data centers all stepped in to support researchers trying to comply with the new data sharing guidelines. In some cases, individual directorates/divisions provide or endorse a data repository, for example the Arctic Data Center for NSF-funded science on the Arctic, or GenBank, the NIH genetic sequence database. In other cases, researchers are expected to use their discretion in selecting an appropriate data sharing solution.

Where do I find data management plan requirements?

Indiana University is a member of the DMPTool, a tool that walks users through creating, reviewing, and sharing a data management plan.

screenshot of DMPTool
https://dmptool.org/

The tool has pre-fabricated templates for each directorate/division across funding organizations. To browse requirements for a specific funder, navigate to the DMP Requirements section and search for or select a funder from the list provided. To create a data management plan using one of these templates, log in to the tool using IU credentials and select the relevant funder from the list provided.

How do I choose a repository for my data?

This question is best answered on a case-by-case basis, but there are general guidelines that researchers can use to make the best choice. If in doubt, get in touch.

  1. If a repository is mandated by a funding organization, researchers must use this repository for sharing data.
  2. If there is a widely-used disciplinary repository in your domain, consider choosing that repository. If you aren’t sure, check author guidelines for the top three journals in your field. Do they all recommend the same repository for sharing data? Alternately, take a look at www.re3data.org/ to see a registry of disciplinary repositories.
  3. If you have no appropriate disciplinary repository, would rather not pay fees to deposit data, or prefer to keep your data with your institution, consider Indiana University’s institutional repository IUScholarWorks. It is completely free, operated by the Libraries, and designed to support funder requirements.
  4. If none of the above solutions are appropriate for your data and you need unique or specific features, look for an established, well-supported, open repository like Zenodo (Integrates with GitHub!) or Harvard’s Dataverse (APIs! Maps geospatial files!).

I want to use IUScholarWorks to preserve and share my data. What do I say in my plan?

https://scholarworks.iu.edu/

Language for data management plans will differ depending on the project and the funder. However, many researchers have found the following statement to be a useful starting point in describing IUScholarWorks:

To increase access to the published research that has been funded, the researchers will deposit peer-reviewed or pre-print manuscripts (with linked supporting data where possible) in the IUScholarWorks institutional repository. A DOI will be created for the data and used in all publications to facilitate discovery.
These data will be preserved according to the current digital preservation standards in place for content within IU’s institutional repository infrastructure. This includes a duplicate copy within the IU Scholarly Data Archive (SDA) and eventual deposit into the Digital Preservation Network preservation platform.
The combination of these systems provides mirroring, redundancy, media migration, access control, file integrity validation, embargoes, and other security-based services that ensure the data are appropriately archived for the life of the project and beyond.

I have a lot of data – can I still put it in IUScholarWorks?

Yes. In almost all cases, we are able to provide free data archiving to IU-affiliated researchers through our partnership with the UITS Scholarly Data Archive. Large datasets live in the Scholarly Data Archive and are made accessible through IUScholarWorks by way of a persistent URL. Here is an example of a weather dataset published in IUScholarWorks.

Pro tip: You can drop off your dataset in the departmental staging area and send us an email with contextual information – we’ll do the heavy lifting and make sure it gets into IUScholarWorks.
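
If you are handing off a large dataset, it can help to include a list of checksums so that anyone receiving the files can confirm nothing was altered or truncated in transit. The following is only a minimal sketch in Python, not a description of how IUScholarWorks or the Scholarly Data Archive validates files; the function names and the manifest file name (manifest-sha256.txt) are hypothetical choices made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(dataset_dir: str) -> dict[str, str]:
    """Map each file's path (relative to dataset_dir) to its checksum."""
    root = Path(dataset_dir)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def write_manifest(dataset_dir: str, manifest_name: str = "manifest-sha256.txt") -> Path:
    """Write one 'checksum  relative/path' line per file and return the manifest path."""
    root = Path(dataset_dir)
    manifest = build_manifest(dataset_dir)
    # Skip a manifest left over from an earlier run (hypothetical file name).
    manifest.pop(manifest_name, None)
    out = root / manifest_name
    lines = [f"{checksum}  {name}" for name, checksum in manifest.items()]
    out.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return out
```

Calling write_manifest("my_dataset") on a folder of your own would leave a small text file alongside the data. Re-running build_manifest later and comparing the results is one simple way to check that the files are unchanged.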

Who can help me with my data management plan?

We can. Contact iuswdata@indiana.edu for assistance creating or implementing a data management plan. The Scholarly Communication Department can help to connect PIs with free campus-supported services to preserve and share data.

Free Tools to Visualize Your Data

Data visualization has grown in popularity as datasets have become larger and tools have become more user-friendly. This area is eagerly being explored by researchers in a variety of disciplines. Although many people think of numbers when they consider types of data, data comes in many forms–including text! In fact, for many researchers, especially those in the humanities or social sciences, text is their primary data source.

This example of a network visualization could be created using a tool like Gephi or Sci2. Image: Clickstream Data Yields High-Resolution Maps of Science. Johan Bollen, Herbert Van de Sompel, Aric Hagberg, Luis Bettencourt, Ryan Chute, Marko A. Rodriguez, Lyudmila Balakireva. http://www.plosone.org/article/info:doi/10.1371/journal.pone.0004803

Here is a brief list of freely available tools you can use to explore and visualize both numerical and textual data. This list is by no means comprehensive; to check out additional tools, try the visualization tool list at Bamboo DiRT.

  • D3 – A JavaScript data visualization library. While you would need to invest the time to learn basic JavaScript, this introductory tutorial breaks down steps to learn D3. You can also check out the array of impressive visualizations resulting from its use.
  • Gephi – If you only wanted to invest the time to learn one visualization tool, this open source software for visualizing networks and complex systems is a great choice. Take a look at one of the many available tutorials to get started.
  • ManyEyes – This tool allows users to easily upload datasets and create basic visualizations. To get a feel for the types of visualizations created, view the ManyEyes gallery.
  • Sci2 Tool – This tool, developed at the Indiana University Cyberinfrastructure for Network Science Center, is billed as “a modular toolset specifically designed for the study of science [that] supports the temporal, geospatial, topical, and network analysis and visualization of scholarly datasets.” Its strength lies in its ability to handle network data, similar to Gephi.
  • Tableau Public – This free, limited-functionality version of the popular software Tableau simplifies the act of creating charts and graphs.
  • Voyant – This is a browser-based platform for analysis and visualization of texts. It is a beginner-friendly tool with modest functionality: visualizations created within Voyant are limited to charts and graphs, though it would be easy to plug the data generated by the program into another platform with greater capacity for visualization, such as Gephi.
  • WordSeer – WordSeer is a textual analysis and visualization tool comparable to Voyant. The latest version, 3.0, has not yet been released publicly.
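
For readers wondering what it looks like to move textual data into a network tool, here is a rough sketch in Python. It does not use Voyant's own export; instead it assumes you have plain text on hand, counts which words appear together in the same sentence, and writes a CSV edge list with Source, Target, and Weight columns, which Gephi's spreadsheet import can read. The function names, the minimum word length, and the sentence-splitting rule are all illustrative assumptions.

```python
import csv
import re
from collections import Counter
from itertools import combinations


def cooccurrence_edges(text: str, min_length: int = 4) -> Counter:
    """Count how often two words (of at least min_length letters) share a sentence."""
    edges = Counter()
    # Naive sentence split on ., !, and ? -- good enough for a sketch.
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = sorted(
            {w for w in re.findall(r"[a-z']+", sentence) if len(w) >= min_length}
        )
        # Each unordered pair of distinct words in the sentence is one edge.
        edges.update(combinations(words, 2))
    return edges


def write_edge_list(edges: Counter, path: str) -> None:
    """Write the edges, heaviest first, as a Source/Target/Weight CSV."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Source", "Target", "Weight"])
        for (source, target), weight in edges.most_common():
            writer.writerow([source, target, weight])
```

Running cooccurrence_edges on a chapter or an abstract and passing the result to write_edge_list gives you a file you can open in Gephi as an undirected edge table. A real project would likely want stop-word removal and a better tokenizer.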

Lastly, I would be remiss if I failed to mention the important role that data management plays in data visualization. Poorly managed data may hinder your ability to create effective visualizations, so learn a few simple steps to manage your data more effectively. For more information, contact Stacy Konkiel, Science Data Management Librarian, at skonkiel@indiana.edu to schedule a consultation!

17 More Essential Altmetrics Resources (the Library Version)

As promised, I have compiled some “required reading” related specifically to altmetrics and their use in libraries. These articles and blog posts actually comprise a majority of the writing out there on altmetrics in libraries–there’s surprisingly little that librarians have written to date on how our profession might use altmetrics to enhance our work.

Ironically enough (given librarians’ own OA advocacy), some of the articles linked below have been published in toll access library science journals. Apologies in advance for any paywalls you may encounter. (Though if you do find barriers to access, you should tell OA Button about it!)

General

Collection Development

Research Data Curation

  • Weber, N. M., Thomer, A. K., Mayernik, M. S., Dattore, B., Ji, Z., & Worley, S. (2013). The Product and System Specificities of Measuring Curation Impact. International Journal of Digital Curation, 8(2).  doi:10.2218/ijdc.v8i2.286

Institutional Repositories

  • Day, M. (2004). Institutional repositories and research assessment. Project Report. UKOLN, University of Bath. (pp. 1–30). Bath: University of Bath. Retrieved from http://opus.bath.ac.uk/23308/
  • Frank Scholz, S. D. (2006). International Workshop on Institutional Repositories and Enhanced and Alternative Metrics of Publication Impact. CERN. Retrieved from http://edoc.hu-berlin.de/series/dini-schriften/2006-8/PDF/8.pdf
  • Konkiel, S., & Scherer, D. (2013). New opportunities for repositories in the age of altmetrics. Bulletin of the American Society for Information Science and Technology, 39(4), 22–26. doi:10.1002/bult.2013.1720390408
  • Merceur, F., Le Gall, M., & Salaün, A. (2011). Bibliometrics: a new feature for institutional repositories. In 14th Biennial EURASLIC Meeting (pp. 1–21). Lyon. Retrieved from http://archimer.ifremer.fr/doc/00031/14253/11886.pdf
  • Organ, M. K. (2006). Download Statistics – What Do They Tell Us? The Example of Research Online, the Open Access Institutional Repository at the University of Wollongong, Australia. D-Lib Magazine. Retrieved February 13, 2012, from http://ro.uow.edu.au/asdpapers/44/

Do you have “must read” articles relating to libraries and altmetrics that didn’t make it on this list? Leave ’em in the comments below!

Want to read some general altmetrics-related research? Check out the original list of 17 Essential Altmetrics Resources.

Simple Steps to Manage Your Data More Effectively

Data management can be an intimidating topic. However, learning how to manage your data can improve your research processes and therefore your life! Not to mention the fact that many grant funding agencies now require data management plans to be submitted with proposals. Is your interest piqued yet? Read below for some easy first steps toward managing your data.

Consider your current data practices

Here are some preliminary questions to ask yourself.

  • What data do I collect?
  • Do I follow a process for collecting and documenting my data?
  • Who contributes data–just me or others, too?
  • What format is the data in?
  • Where is the data stored?
  • Is the data being backed up?

Determine areas to improve

Compare the following suggestions to your own data practices. If you can start taking steps to improve the weaker areas, you’ll be all set.

  • Documentation – Document the processes and workflows you follow when collecting and managing your data in a README file (click here for a good example). It is also important to follow standards within your field for documenting contextual information about your data. In library jargon, this is known as metadata. To search for a metadata standard in your discipline, try the Digital Curation Centre’s helpful search tool.
  • Formats – Ideally, data should be stored in open, non-proprietary formats. This helps ensure that it can be accessed well into the future. The Open Data Handbook gives a good overview of open formats. This can be as simple as saving files as a CSV instead of an Excel spreadsheet, or as a text file instead of a Microsoft Word document.
  • Storage – IU offers several options for data storage. You can store your data on the cloud through IU Box, which also provides excellent versioning and collaborative functionality. For sensitive or large data sets, you can use the Scholarly Data Archive. Whatever you do, just make sure that you are backing up your data and not just relying on your hard drive to keep your data safe. Also note that these options do not ensure long-term preservation. For this, you should consider adding completed data sets to the IU institutional repository, IUScholarWorks (IUSW).
  • Sharing and Access – Opening up your data won’t be appropriate for all researchers, but those whose research is complete should consider storing their data in IUSW to promote discoverability and access to their data.
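
To make the documentation and format suggestions above a little more concrete, here is a small Python sketch that inventories a data folder by file extension and drafts a README from the result. It is only a starting point under stated assumptions: the README fields, the function names, and the short OPEN_FORMATS set are illustrative choices, not a metadata standard or an IU requirement.

```python
from collections import Counter
from datetime import date
from pathlib import Path

# Assumption: a short, illustrative set of open formats; adjust for your field.
OPEN_FORMATS = {".csv", ".tsv", ".txt", ".json", ".xml"}


def inventory(data_dir: str) -> Counter:
    """Count the files under data_dir by lower-cased extension."""
    return Counter(
        path.suffix.lower() or "(none)"
        for path in Path(data_dir).rglob("*")
        if path.is_file()
    )


def readme_text(title: str, creator: str, counts: Counter) -> str:
    """Draft README text with placeholders to fill in and a format summary."""
    lines = [
        f"Title: {title}",
        f"Creator: {creator}",
        f"Date: {date.today().isoformat()}",
        "Collection methods: (describe how the data were gathered)",
        "Processing steps: (describe cleaning and workflows)",
        "",
        "Files by format:",
    ]
    for ext, n in sorted(counts.items()):
        # Flag anything outside the illustrative open-format set.
        note = "" if ext in OPEN_FORMATS else "  <- consider an open format"
        lines.append(f"  {ext}: {n}{note}")
    return "\n".join(lines) + "\n"
```

For example, Path("README.txt").write_text(readme_text("Survey data", "Your Name", inventory("data"))) would draft a README for a folder named data. The placeholders still need a human to fill them in, which is the part that matters most.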

Get help

Data management advice is nearly impossible to generalize, especially in a short blog post! Contact Stacy Konkiel, Science Data Management Librarian, at skonkiel@indiana.edu with questions, comments, or to schedule a one-on-one consultation about how the IU Libraries Data Management Service can help you manage your data.

What Open Access Means to Me

On October 21-25, IU celebrated International Open Access Week with a series of events to reflect on and educate the IU community about open access, including workshops, presentations, and round table discussions on topics ranging from data management to student publishing. As part of this series, we asked faculty and students to answer the question “What does open access mean to you?” and compiled their responses here.

To wrap up the discussion, I thought I would use this post to share my own response:

Open access offers something for everyone. For librarians and users, it creates a sustainable model of scholarly communication that fosters equal access to information. For universities and funding agencies, it accelerates research, supporting the mission to advance knowledge creation. For researchers and their home institutions, it creates an unparalleled opportunity for impact.

As a graduate student in the Department of Information and Library Science, I am excited by the ways libraries are playing an increasing role in the open access movement by providing open access publishing services, supporting institutional repositories, preserving open access materials through LOCKSS, and more. I strongly believe that the principles of open access align with the core values of librarianship, and it is something that I am proud to be a part of.

If you are interested in learning more about open access, the following list of resources is a great place to get started:

Analytics for IUScholarWorks is now here!

Hi, I am Pallavi Murthy, and I will be working for IUScholarWorks under Scholarly Communications as a Graduate Assistant through fall 2013 and spring 2014. This is my first blog post, and I am very excited to start off my job with multiple duties at hand! Let me put it this way: I am very happy to work on something I really love – data and analytics!

To say that data analysis is important to business would be an understatement. In fact, no business can survive without analyzing the data available to it; data analysis is the lifeline of any business. Whether one wants to arrive at a marketing decision or fine-tune a new product launch strategy, data analysis is key. Merely analyzing data, however, is not sufficient for making a decision. How one interprets the analyzed data matters more. Thus, data analysis is not a decision-making system but a decision-support system. Data analysis can offer the following benefits:

  • Structuring the findings from survey research or other means of data collection
  • Breaking a macro picture into a micro one
  • Acquiring meaningful insights from the dataset
  • Basing critical decisions on the findings
  • Ruling out human bias through proper statistical treatment. [1]

Libraries can use data analysis for many of the same functions that businesses do. My first project for IUScholarWorks was to perform such an analysis on their website.

To that end, I am using Google Analytics to analyze the Libraries website, specifically website traffic related to data management. I am also analyzing web traffic for journals published under IU Libraries, including JoSoTL and JoTLT, which the IUScholarWorks team has put a great deal of time and effort into maintaining.

Stacy asked me to review the data management guide so she can better understand which parts of her guide are the most accessed. The Data Management Guide has seen significant improvement since it was created. Describing Data with Metadata ranks highest as the most accessed sub-page under Data Management at IU, followed by Storage and preservation and Funder Requirements and Data Management Plans.

Next, Stacy asked me to run the analytics for the Publication and Data Services webpage, so she could understand whether people were using that webpage to find their way to IUScholarWorks. The report shows that IUScholarWorks Repository, Journal Publishing and Data Management Services is one of the most popular links on that page.

It is both interesting and delightful to see how hard IUScholarWorks is working to make scholarly research and journals open access. I feel that, as important as those efforts are, it is equally important to know whether they are actually reaching users. I would like to analyze more reports and bring some valuable “data wonder” (as I call it!) to my upcoming blog posts.

To learn more about IUScholarWorks, including depositing your scholarly research, publishing an online journal, and archiving and increasing the availability of your research data, visit http://scholarworks.iu.edu/.

For more information contact the IUScholarWorks team.

Sources: [1] http://www.migindia.biz/data.html

October 25th data visualization & management workshop for beginners

Gephi screenshot from https://gephi.org/

Oct 25 2013
9:00am to 12:00pm
Wells Library Information Commons Instruction Cluster 1

Interested in using data visualization to enhance your research but don’t know where to begin? Learn how to use basic data visualization techniques and tools including Voyant, OpenRefine, Gephi, and Sci2 at our workshop, where we’ll give users the chance to test their skills using data from a variety of open data sources. Experts will also cover the best ways to manage your data throughout its lifecycle. No data visualization experience needed, but attendees should have a working knowledge of Microsoft Excel.

Register here: http://libprod.lib.indiana.edu/tools/workshops/workshop-listings/series-view/182/series

This workshop is part of Open Access Week 2013.

White House OSTP creates Open Access policy for federal agencies

OSTP Director John Holdren talks to President Obama in this undated White House photo.

One day after we posted big news about dual Open Access bills in the US and Illinois Senates, the Office of Science and Technology Policy issued a policy memorandum that will essentially enact an Open Access policy similar to the NIH Public Access policy for all federal agencies with more than $100 million in their R&D budget. This policy will not only affect publications, but also the data resulting from funded research.

Many in the Open Access advocacy community are celebrating the announcement as proof of the success of the #OAMonday/Access2Research movement and the resulting “We the People” petition, which elicited a positive (if long-overdue) response from Holdren.

Researcher Joe Hourcle, on the RDAP listserv, has distilled the policy into these essential points:

  • Must give a plan in 6 months on how they’re going to improve public access to publications & data
  • Can have an embargo after publication (baseline is 12 months)
  • No charges for access to the article metadata
  • Grants can include costs for data management & access

The Dryad repository blog explores in a bit more detail exactly what this might mean for data sharing and publication.

It remains to be seen how this surprising and groundbreaking new policy will take effect.

A Guide to Text and Data Mining at Indiana University Bloomington

Kim D, Yu H (2011) Figure Text Extraction in Biomedical Literature. PLoS ONE 6(1): e15338. doi:10.1371/journal.pone.0015338

Text and data mining of academic databases are becoming increasingly popular ways to conduct research. They can allow scholars to make connections not previously discovered, or find solutions more quickly and efficiently. Such research has also gotten some researchers into trouble for alleged copyright and contract violations, when practiced without due diligence into existing legal restrictions.

For IU researchers interested in accessing the Libraries’ digital journals, databases, special collections (specifically, HathiTrust), and other subscription content for the purposes of text or data mining, we’ve put together a quick-and-dirty guide to text and data mining at IUB. Check it out and let us know what you think in the comments.