Latest CADRE updates

The CADRE team is hard at work developing a platform that will do what academic libraries have long been trying to achieve. 

We are gearing up for ISSI 2019 in September, where CADRE will hold a workshop and tutorial. Our hands-on CADRE tutorial at ISSI will offer an option to use assisted programming to access Microsoft Academic Graph (MAG), as well as a second option that will allow access to the dataset using the CADRE Query Builder, which uses a graphical user interface.

A bulletin board covered in post-it notes and pictures that explain some of the CADRE infrastructure.
Visualizing the CADRE infrastructure. Photo by Abigail Godwin.

But building a platform that allows novice coders to easily query massive datasets with a GUI is a lengthy process of trial and error—and that is only one component of CADRE.

Building CADRE is a complex and fluid task: Along with the Web of Science (WoS) and MAG datasets, CADRE will include U.S. patent and trademark data. And more datasets will be added to the platform as different types of researchers request access.

IUNI Lead Software Engineer Ben Serrette says that because CADRE may take on more datasets, its software solutions must be as generic and adaptable as possible. Like fitting and refitting the pieces of a complex puzzle that keeps changing shape, the IUNI IT team is solving multi-faceted problems with a flexible approach on an enormous scale.

Find out how they’re doing it below. 

Latest updates
  • Ruling out what doesn’t work: The IT team narrowed down the many serverless technology options that cloud-computing platform Amazon Web Services (AWS) offers by eliminating the ones that don’t fit the bill in terms of cost or ability to interface with other CADRE technical components.
  • Designing fundamental cloud architecture: CADRE’s infrastructure of cloud-based virtual machines and AWS services has been developed.
  • Integrating Jupyter Notebooks & file storage: Advanced users can write their own code to create data-analysis tools in CADRE’s notebook. Jupyter Notebooks is up and running with a working file system for storing code.
  • Testing the query builder: One service essential to CADRE is the GUI that lets users easily query massive datasets. The IT team is testing the combined powers of a relational database and various graph databases with MAG data to create a more efficient query builder.
  • Building CADRE’s website: The IT team is finishing the front end of some pages of the CADRE website, including the homepage and the event page for ISSI 2019. They are preparing to make the website live in a couple of weeks.

A screenshot of a website that says "CADRE Fellows" and shows the pictures of three fellows and a short description of their research project.
A preview of the CADRE website.

If you want to stay updated on what CADRE is doing, be sure to follow us on Twitter.

Meet CADRE’s first class of fellows

The Collaborative Archive & Data Research Environment (CADRE) has accepted its first class of CADRE Fellows.

These seven fellowship teams span disciplines and offer compelling research that incorporates big data and bibliometrics. Each fellowship team will access CADRE’s Web of Science (WoS) and Microsoft Academic Graph (MAG) datasets to achieve its research goals.

Our fellows will present their research at the International Society for Scientometrics and Informetrics (ISSI) 2019 Conference in Rome at either the workshop or tutorial that CADRE is hosting on Sept. 2.

Not only will these fellows show how CADRE helped advance their work, they will also serve as integral use cases for how we develop our platform to suit the needs of every type of academic researcher.

We plan to accept fellows on a rolling basis in the future, as spots become available. If you are interested in applying, email us at cadre@iu.edu.

Now, let’s meet the research teams!

Utilizing Data Citation for Aggregating, Contextualizing, and Engaging with Research Data in STEM Education Research from Purdue University

A combined photo of three researchers: one man and two women.
Witt, Carleton Parker, and Bessenbacher.

Researchers: 

  • Michael Witt, associate professor of library science, Purdue Libraries and School of Information Studies, Purdue University
  • Loran Carleton Parker, associate director & senior evaluation and research associate, Evaluation Learning Research Center, Purdue University 
  • Ann Bessenbacher, research associate and data scientist, STEMEd HUB, Purdue University

Researchers will characterize citation of data from the literature in the field of STEM education research. A sample of relevant publication venues in the field will be identified from WoS and MAG. Digital Object Identifiers (DOIs) of datasets registered with DataCite will be used to query and associate datasets with publications. The team will assess rates of citation for datasets that are cited using DataCite DOIs for each publication venue. It will also analyze a sample of data citations and publications to determine whether they can provide initial context that helps a researcher who is unfamiliar with the data decide whether to use the dataset.

Understanding citation impact of scientific publications through ego-centered citation networks from Indiana University Bloomington, Nanjing University
A combined photo of three researchers: two men and one woman.
Bu, Min, and Ding.

Researchers:

  • Yi Bu, Ph.D. candidate in informatics, Indiana University Bloomington
  • Chao Min, research assistant professor in information management, Nanjing University in China
  • Ying Ding, professor of informatics, Indiana University

The research team seeks to find the “deeper” and “broader” impact of network-based citation measurements in the scientific community. This project will determine the citation impact of scientific publications using an ego-centered citation network, which contains the citing relationships between a publication and its citing publications, as well as the relationships within its citing publications. Researchers will use the entirety of the WoS and MAG data to establish empirical evidence in this project. 
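
The ego-centered citation network described above can be illustrated with a minimal Python sketch. It is not the research team's implementation: the function name, the (citing, cited) pair format, and the toy paper IDs are assumptions made for illustration only.

```python
from collections import defaultdict


def ego_citation_network(focal, citations):
    """Return the edges of the ego-centered citation network of `focal`.

    `citations` is an iterable of (citing_id, cited_id) pairs
    (a hypothetical input format chosen for this sketch).
    """
    citations = list(citations)  # allow generators; we iterate twice

    # Map each cited paper to the set of papers that cite it.
    cited_by = defaultdict(set)
    for citing, cited in citations:
        cited_by[cited].add(citing)

    citing_papers = cited_by.get(focal, set())

    # Relationships between the focal publication and its citing publications.
    ego_edges = {(paper, focal) for paper in citing_papers}

    # Relationships within the citing publications themselves.
    inner_edges = {
        (citing, cited)
        for citing, cited in citations
        if citing in citing_papers and cited in citing_papers
    }
    return ego_edges | inner_edges


# Toy data: B, C, and D cite A; C also cites B, and D also cites C.
toy_citations = [
    ("B", "A"), ("C", "A"), ("D", "A"),
    ("C", "B"), ("D", "C"),
    ("E", "B"),  # E does not cite A, so this edge is excluded
    ("A", "Z"),  # A's own reference, not part of the ego network
]
ego_edges = ego_citation_network("A", toy_citations)
print(sorted(ego_edges))
```

On real WoS or MAG data the same idea would more likely run as a database query than as in-memory Python; the sketch only shows which edges belong to the network.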

MCAP: Mapping Collaborations and Partnerships in SDG Research from Michigan State University 
A combined photo of four researchers: two women and two men.
Payumo, Higgins, Calvert, and He.

Researchers:

  • Jane Payumo, academic specialist and research and data evaluation manager, MSU AgBioResearch, Michigan State University 
  • Devin Higgins, digital library programmer, MSU Libraries, Michigan State University
  • Scout Calvert, data librarian, MSU Libraries, Michigan State University
  • Guangming He, information management analyst, MSU Innovation Center, Michigan State University

This project will build on the WoS report “Navigating the Structure of Research on Sustainable Development Goals (SDG),” as the researchers search for patterns of global collaboration and support the United Nations’ SDG call for action. Researchers will design a prototype to analyze and visualize the input-output of partnerships over time in SDG-supportive research. They also plan to create a scoring measure or partnership index that defines and conducts partnership analytics for SDGs by using data sourced from WoS and MAG.

The global network of air links and scientific collaboration – a quasi-experimental analysis from Indiana University Bloomington and University of Warsaw 
A combined photo of four researchers: two men and two women.
Börner, Ploszaj, Record, and Herr.

Researchers:

  • Katy Börner, Victor H. Yngve distinguished professor of engineering & information science, Indiana University Bloomington
  • Adam Ploszaj, assistant professor at the Centre for European Regional and Local Studies, University of Warsaw
  • Lisel Record, associate director, Cyberinfrastructure for Network Science Center
  • Bruce Herr II, senior system architect and project manager, Cyberinfrastructure for Network Science Center

Researchers plan to determine the impact of the introduction and availability of long-distance flights on international scientific collaboration. The team will measure collaboration through co-authorship and co-affiliation. They will also geocode publication affiliations from WoS and MAG from 1998 through 2017. This quasi-experimental research will apply state-of-the-art causal modeling techniques and explore how data-driven causality can enhance the policy relevance of the science of science.

Measuring and Modeling the Dynamics of Science Using the CADRE Platform from University of Minnesota, New York University, Boston University, University of Pennsylvania, University of Arizona
A combined photo of three researchers who are men.
Funk, Gebhart, and Park.
A combined photo of three researchers: two men and one woman.
Lane, Murciano-Goroff, and Ross.
A combined photo of three researchers who are women.
Glennon, Leahey, and Lee.

Researchers:

  • Russell Funk, assistant professor of strategic management & entrepreneurship, University of Minnesota
  • Thomas Gebhart, Ph.D. student in computer science and engineering, University of Minnesota 
  • Michael Park, Ph.D. student in strategic management and entrepreneurship, University of Minnesota
  • Julia Lane, professor at Wagner Graduate School of Public Service, New York University 
  • Raviv Murciano-Goroff, assistant professor at Questrom School of Business, Boston University 
  • Matthew Ross, research assistant professor at Wagner Graduate School of Public Service, New York University 
  • Britta Glennon, assistant professor at Wharton School, University of Pennsylvania
  • Erin Leahey, professor and director of sociology, University of Arizona 
  • Jina Lee, Ph.D. student in sociology, University of Arizona

This research team wants to better characterize scientific influence of papers, typically measured by how many times papers are cited, by distinguishing between papers that destabilize existing knowledge with novel concepts and papers that consolidate existing knowledge. In a separate but closely related aim, the researchers also plan to create a novel unsupervised machine learning technique for author-name disambiguation by pulling abstract, title, and citation data from WoS and MAG. For both aims, the CADRE platform will provide essential infrastructure in terms of large-scale data storage and high performance computational resources.
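
One way to score whether a paper destabilizes or consolidates existing knowledge is a disruption-style index. The minimal Python sketch below assumes that kind of measure; the text above does not say which measure the team will use, so the formula, function name, and toy data are illustrative assumptions. Papers that cite the focal paper but none of its references count toward destabilizing, and papers that cite both count toward consolidating.

```python
def disruption_score(focal, references, citations):
    """Score a focal paper between -1 (consolidating) and +1 (destabilizing).

    `references` is the set of papers the focal paper cites.
    `citations` is an iterable of (citing_id, cited_id) pairs and is assumed
    to contain only papers published after the focal paper, plus the focal
    paper's own outgoing citations (which are ignored).
    """
    citations = list(citations)

    # Later papers that cite the focal paper.
    cites_focal = {citing for citing, cited in citations if cited == focal}
    # Later papers that cite at least one of the focal paper's references.
    cites_refs = {
        citing
        for citing, cited in citations
        if cited in references and citing != focal
    }

    only_focal = len(cites_focal - cites_refs)  # ignore the older work
    both = len(cites_focal & cites_refs)        # build on focal and older work
    only_refs = len(cites_refs - cites_focal)   # bypass the focal paper

    total = only_focal + both + only_refs
    if total == 0:
        return 0.0
    return (only_focal - both) / total


# Toy data: F cites R1 and R2; A and B cite only F; C cites F and R1; D cites only R2.
toy_refs = {"R1", "R2"}
toy_cites = [
    ("F", "R1"), ("F", "R2"),
    ("A", "F"), ("B", "F"),
    ("C", "F"), ("C", "R1"),
    ("D", "R2"),
]
score = disruption_score("F", toy_refs, toy_cites)
print(score)  # 0.25
```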

Comparative analysis of legacy and emerging journals in mathematical biology from University of Michigan and University of Michigan Medical School
A combined photo of four researchers: one woman, two men, and one person who uses they/them pronouns.
Conte, Hansen, Martin, and Schnell.

Researchers:

  • Marisa Conte, assistant director of research & informatics, Taubman Health Sciences Library, University of Michigan
  • Samuel Hansen, mathematics and statistics librarian, Shapiro Science Library, University of Michigan
  • Scott Martin, biological sciences librarian, Shapiro Science Library, University of Michigan
  • Santiago Schnell, John A. Jacquez collegiate professor of physiology, University of Michigan Medical School

Researchers will perform a comparative analysis on papers published in four mathematical biology legacy journals and on newer journals with different publication models and disciplinary scope. The team will use the CADRE datasets to develop methodologies for comparative bibliometrics and content analyses; provide insight into publication trends in theoretical and applied domains; give authors new factors to consider when trying to publish; and help editors in similar disciplines use informatics to distinguish their journals.

Systematic over-time study of the similarities and differences in research across mathematics and the sciences from University of Michigan
A photo of a researcher who is wearing a hat that says "Library" on it.
Hansen.

Researcher:

  • Samuel Hansen, mathematics and statistics librarian, Shapiro Science Library, University of Michigan

Samuel’s project uses reference and citation aging, bibliographic coupling, and network breadth and depth to find similarities and differences between research fields in mathematics and the sciences. Specifically, they will find how information ages differently across disciplines, generate data about changes in the development of these research fields, and study how actively collaborative the disciplines are. Samuel will use WoS data from 1900 to 2017 to perform these analyses, which have typically only been done on a smaller scale in a single discipline.
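
Of the measures listed above, bibliographic coupling is the simplest to sketch: two papers are coupled when their reference lists overlap, and the coupling strength is the number of references they share. The Python below is a hedged illustration, not Samuel's code; the function name, input format, and toy IDs are assumptions.

```python
from itertools import combinations


def coupling_strengths(references):
    """Return {(paper_a, paper_b): shared_reference_count} for coupled pairs.

    `references` maps each paper ID to the set of IDs it cites
    (a hypothetical input format chosen for this sketch).
    """
    strengths = {}
    # Compare every unordered pair of papers once, in sorted ID order.
    for a, b in combinations(sorted(references), 2):
        shared = len(references[a] & references[b])
        if shared:
            strengths[(a, b)] = shared
    return strengths


# Toy data: P1 and P2 share two references; P3 shares none.
toy_references = {
    "P1": {"R1", "R2", "R3"},
    "P2": {"R2", "R3", "R4"},
    "P3": {"R5"},
}
coupled = coupling_strengths(toy_references)
print(coupled)
```

The pairwise loop grows quadratically with the number of papers, so an analysis at the scale of WoS would need an indexed or database-side approach; the sketch only shows the definition.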

Decide how CADRE advances your research with User Stories

Research needs are varied and dynamic. While CADRE can ease the many technical and financial obstacles researchers face when working with large data resources, our platform also has the flexibility to fit more specific research needs. To better understand those needs, we collect User Stories.

Whether you are an academic researcher, librarian, or technical provider, CADRE’s ongoing call for User Stories allows you to tell us what you want from CADRE.

Maybe you are a researcher who requires the CADRE platform to apply multiple graph visualization tools, such as Gephi or VOSviewer, to the same query result so you can explore your data more easily.

Or you might be a researcher with multiple affiliations, who wants to ensure the code you store in your repository will live in a cloud and not become lost if you change affiliations.

Perhaps you like that CADRE allows you to save space by running queries on the cloud server, but you also want the ability to download data locally to share findings with outside collaborators.

Then tell us—the CADRE team took each of the stories above into serious consideration when we began developing our platform.

User Stories give insight into the functionality requirements of different types of researchers, which will help us build a platform that better suits the needs of every researcher who uses CADRE. Any request ranging from specific interface design elements to a general computational environment will assist in our future development decisions.

How to submit stories

An example of text box questions that include the questions, "What type of user are you?" and "What features or functionality would you like to see?".
CADRE wants to hear what you need from our platform.

No user story is too trivial to share, and you can submit as many stories as you want.

Simply visit our User Story Collection page and tell us about who you are. Explain your project goals and what CADRE functionalities could help you achieve those goals.

If you are also interested in taking advantage of CADRE’s current resources and capabilities, consider applying to the CADRE Fellowship Program. The deadline for the first round of CADRE Fellows is June 25. Similar to User Stories, this fellowship will help CADRE better understand the needs of its researchers.

How will CADRE advance your research? Tell us your story today.
