The CADRE team is hard at work developing a platform that will do what academic libraries have long been trying to achieve.
We are gearing up for ISSI 2019 in September, where CADRE will hold a workshop and tutorial. Our hands-on CADRE tutorial at ISSI will offer an option to use assisted programming to access Microsoft Academic Graph (MAG), as well as a second option that will allow access to the dataset using the CADRE Query Builder, which uses a graphical user interface.
But building a platform that allows novice coders to easily query massive datasets with a GUI is a lengthy process of trial and error—and that is only one component of CADRE.
Building CADRE is a complex and fluid task: Along with the Web of Science (WoS) and MAG datasets, CADRE will include U.S. patent and trademark data. And more datasets will be added to the platform as different types of researchers request access.
IUNI Lead Software Engineer Ben Serrette says that because CADRE may take on more datasets, software solutions must be as generic and adaptable as possible. Like fitting and refitting the pieces of a complex puzzle that keeps changing shape, the IUNI IT team is solving multi-faceted problems with a flexible approach on an enormous scale.
Find out how they’re doing it below.
Ruling out what doesn’t work: The IT team narrowed down the many serverless technology options that cloud-computing platform Amazon Web Services (AWS) offers by eliminating the ones that don’t fit the bill in terms of cost or ability to interface with other CADRE technical components.
Designing fundamental cloud architecture: CADRE’s infrastructure of cloud-based virtual machines and AWS services has been developed.
Integrating Jupyter Notebooks & file storage: Advanced users can write their own code to create data-analysis tools in CADRE’s notebook. Jupyter Notebooks is up and running with a working file system for storing code.
Testing the query builder: One service essential to CADRE is the GUI that lets users easily query massive datasets. The IT team is testing a relational database in combination with various graph databases on MAG data to create a more efficient query builder.
Building CADRE’s website: The IT team is finishing the front end of some pages of the CADRE website, including the homepage and the event page for ISSI 2019. They are preparing to make the website live in a couple of weeks.
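To give a feel for what a GUI query builder does behind the scenes, here is a minimal sketch that turns form-style filter choices into a parameterized SQL query. It is an illustration only, not CADRE's implementation: the `papers` and `citations` tables, their columns, and the helper names are all hypothetical, and SQLite stands in for whichever relational database the team ends up using.

```python
import sqlite3


def build_query(filters):
    """Translate GUI-style filter choices into SQL text plus parameters.

    `filters` is a hypothetical dict such as {"year": 2015, "min_citations": 2}.
    """
    clauses, params = [], []
    if "year" in filters:
        clauses.append("p.year = ?")
        params.append(filters["year"])
    where = ("WHERE " + " AND ".join(clauses)) if clauses else ""
    sql = (
        "SELECT p.paper_id, p.title, COUNT(c.citing_id) AS n_citations "
        "FROM papers p LEFT JOIN citations c ON c.cited_id = p.paper_id "
        f"{where} GROUP BY p.paper_id, p.title "
        "HAVING COUNT(c.citing_id) >= ? "
        "ORDER BY n_citations DESC, p.paper_id"
    )
    params.append(filters.get("min_citations", 0))
    return sql, params


def make_demo_db():
    """Create a tiny in-memory database with made-up papers and citations."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE papers (paper_id INTEGER, title TEXT, year INTEGER)")
    conn.execute("CREATE TABLE citations (citing_id INTEGER, cited_id INTEGER)")
    conn.executemany(
        "INSERT INTO papers VALUES (?, ?, ?)",
        [(1, "Paper A", 2015), (2, "Paper B", 2015),
         (3, "Paper C", 2016), (4, "Paper D", 2017)],
    )
    conn.executemany(
        "INSERT INTO citations VALUES (?, ?)",
        [(3, 1), (4, 1), (4, 2), (4, 3)],
    )
    return conn


def run_query(conn, filters):
    """Build the query from the filters and return all matching rows."""
    sql, params = build_query(filters)
    return conn.execute(sql, params).fetchall()
```

Calling `run_query(make_demo_db(), {"year": 2015, "min_citations": 2})` returns the 2015 papers in the toy data that have at least two citations. The user never sees the SQL, and passing values as parameters keeps user input out of the query text.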
If you want to stay updated on what CADRE is doing, be sure to follow us on Twitter.
Ma says CADRE is a project designed to make valuable and complex large library data resources approachable to any academic researcher, regardless of their skill level in handling such data.
An owl is not just beautiful; it is also a symbol of wisdom and a powerful predator. Ma says this is exactly how CADRE resonates with her: a powerful system filled with complex big data, wrapped in a user-friendly interface that allows even inexperienced coders to tame the data.
Not to go too deep, she adds that owls are also pretty cute. A friendly mascot like the CADRE Owl helps balance a project that seems very technical and complicated to those who aren’t familiar with it.
The CADRE Owl has come a long way. The original version was a complex illustration that included the owl with glasses, hard drives to perch on, and a network backdrop. But after months of brainstorming and refining, Ma’s end product became a sharp, playful owl profile.
Even though she streamlined the logo to make it cleaner and easier to stamp across any outreach materials, Ma included details that would make the CADRE Owl feel a bit more connected to the project, including the owl’s network necklace and her creative take on an owl’s facial disc. All in all, Ma and the rest of the CADRE team are excited to be represented by the CADRE Owl.
July has been a busy month for CADRE: not only have we created our visual identity, but we also selected our first class of CADRE Fellows. We have much more news to come, including our new website, which launches soon.
We’ll also be presenting a webinar about CADRE on Aug. 8 for the OCLC Research Works in Progress Webinar Series. Don’t forget to register.
The Collaborative Archive & Data Research Environment (CADRE) accepted its first class of CADRE Fellows.
These seven fellowship teams span disciplines and offer compelling research that incorporates big data and bibliometrics. Each fellowship team will access CADRE’s Web of Science (WoS) and Microsoft Academic Graph (MAG) datasets to achieve its research goals.
Our fellows will present their research at the International Society for Scientometrics and Informetrics (ISSI) 2019 Conference in Rome at either the workshop or tutorial that CADRE is hosting on Sept. 2.
Not only will these fellows show how CADRE helped advance their work, they will also serve as integral use cases that guide how we develop our platform to suit the needs of every type of academic researcher.
We plan to accept fellows on a rolling basis in the future, as spots become available. If you are interested in applying, email us at email@example.com.
Now, let’s meet the research teams!
Utilizing Data Citation for Aggregating, Contextualizing, and Engaging with Research Data in STEM Education Research from Purdue University
Michael Witt, associate professor of library science, Purdue Libraries and School of Information Studies, Purdue University
Loran Carleton Parker, associate director & senior evaluation and research associate, Evaluation Learning Research Center, Purdue University
Ann Bessenbacher, research associate and data scientist, STEMEd HUB, Purdue University
Researchers will characterize how data are cited in the literature of STEM education research. A sample of relevant publication venues in the field will be identified from WoS and MAG. Digital Object Identifiers (DOIs) of datasets registered with DataCite will be used to query and associate datasets with publications. The team will assess citation rates for datasets cited with DataCite DOIs in each publication venue. It will also analyze a sample of data citations and publications to determine whether they provide enough initial context to help a researcher who is unfamiliar with the data decide whether to use the dataset.
Understanding citation impact of scientific publications through ego-centered citation networks from Indiana University Bloomington, Nanjing University
Yi Bu, Ph.D. candidate in informatics, Indiana University Bloomington
Chao Min, research assistant professor in information management, Nanjing University in China
Ying Ding, professor of informatics, Indiana University
The research team seeks to find the “deeper” and “broader” impact of network-based citation measurements in the scientific community. This project will determine the citation impact of scientific publications using an ego-centered citation network, which contains the citing relationships between a publication and its citing publications, as well as the relationships within its citing publications. Researchers will use the entirety of the WoS and MAG data to establish empirical evidence in this project.
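The ego-centered citation network described above can be sketched in a few lines. This is a minimal illustration, assuming citations are available as (citing, cited) pairs; the function name and the toy paper IDs are hypothetical, and the fellows' actual method may differ.

```python
def ego_citation_network(citations, focal):
    """Return (citing_papers, edges) for the ego network of `focal`.

    `citations` is an iterable of (citing, cited) pairs.
    The network keeps two kinds of edges:
      1. links from each citing paper to the focal paper, and
      2. links among the citing papers themselves.
    """
    citations = list(citations)
    # Papers that cite the focal publication directly.
    citing = {src for src, dst in citations if dst == focal}
    # Edges from every citing paper to the focal paper.
    ego_edges = {(src, focal) for src in citing}
    # Citing relationships within the set of citing papers.
    among_citing = {
        (src, dst) for src, dst in citations if src in citing and dst in citing
    }
    return citing, ego_edges | among_citing
```

For example, if papers B and C both cite A, and C also cites B, the network for A contains the edges B→A, C→A, and C→B. A paper that cites only C, and not A itself, is left out.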
MCAP: Mapping Collaborations and Partnerships in SDG Research from Michigan State University
Jane Payumo, academic specialist and research and data evaluation manager, MSU AgBioResearch, Michigan State University
Devin Higgins, digital library programmer, MSU Libraries, Michigan State University
Scout Calvert, data librarian, MSU Libraries, Michigan State University
Guangming He, information management analyst, MSU Innovation Center, Michigan State University
This project will build on the WoS report “Navigating the Structure of Research on Sustainable Development Goals (SDG),” as the researchers search for patterns of global collaboration and support the United Nations’ SDG call for action. Researchers will design a prototype to analyze and visualize the input-output of partnerships over time in SDG-supportive research. They also plan to create a scoring measure or partnership index that defines and conducts partnership analytics for SDGs by using data sourced from WoS and MAG.
The global network of air links and scientific collaboration – a quasi-experimental analysis from Indiana University Bloomington and University of Warsaw
Katy Börner, Victor H. Yngve distinguished professor of engineering & information science, Indiana University Bloomington
Adam Ploszaj, assistant professor at the Centre for European Regional and Local Studies, University of Warsaw
Lisel Record, associate director, Cyberinfrastructure for Network Science Center
Bruce Herr II, senior system architect and project manager, Cyberinfrastructure for Network Science Center
Researchers plan to determine the impact of the introduction and availability of long-distance flights on international scientific collaboration. The team will measure collaboration through co-authorship and co-affiliation. They will also geocode publication affiliations from WoS and MAG from 1998 through 2017. This quasi-experimental research will apply state-of-the-art causal modeling techniques and explore how data-driven causality can enhance science of science policy relevance.
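As a rough illustration of measuring collaboration through co-affiliation, the sketch below counts how many papers link each pair of locations in each year. It assumes the affiliations have already been geocoded to location labels; the record layout and field names are hypothetical, and the causal modeling itself is out of scope here.

```python
from collections import Counter
from itertools import combinations


def collaboration_counts(papers):
    """Count papers linking each pair of locations, per year.

    `papers` is an iterable of dicts with hypothetical keys:
      "year"      -> publication year
      "locations" -> geocoded affiliation locations on the paper
    Returns a Counter keyed by (year, location_a, location_b),
    with each location pair stored in alphabetical order.
    """
    counts = Counter()
    for paper in papers:
        # De-duplicate so several authors in one place count once per paper.
        locations = sorted(set(paper["locations"]))
        for a, b in combinations(locations, 2):
            counts[(paper["year"], a, b)] += 1
    return counts
```

Yearly counts like these could then be compared before and after a new long-distance flight connects two locations.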
Measuring and Modeling the Dynamics of Science Using the CADRE Platform from University of Minnesota, New York University, Boston University, University of Pennsylvania, University of Arizona
Russell Funk, assistant professor of strategic management & entrepreneurship, University of Minnesota
Thomas Gebhart, Ph.D. student in computer science and engineering, University of Minnesota
Michael Park, Ph.D. student in strategic management and entrepreneurship, University of Minnesota
Julia Lane, professor at Wagner Graduate School of Public Service, New York University
Raviv Murciano-Goroff, assistant professor at Questrom School of Business, Boston University
Matthew Ross, research assistant professor at Wagner Graduate School of Public Service, New York University
Britta Glennon, assistant professor at Wharton School, University of Pennsylvania
Erin Leahey, professor and director of sociology, University of Arizona
Jina Lee, Ph.D. student in sociology, University of Arizona
This research team wants to better characterize scientific influence of papers, typically measured by how many times papers are cited, by distinguishing between papers that destabilize existing knowledge with novel concepts and papers that consolidate existing knowledge. In a separate but closely related aim, the researchers also plan to create a novel unsupervised machine learning technique for author-name disambiguation by pulling abstract, title, and citation data from WoS and MAG. For both aims, the CADRE platform will provide essential infrastructure in terms of large-scale data storage and high performance computational resources.
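One published way to separate destabilizing from consolidating papers is the CD index of Funk and Owen-Smith. The description above does not say which measure the team will use, so treat the following as an assumed example rather than their method. The sketch uses the unweighted form and assumes the input holds only the focal paper's own references plus citations made by papers published after it.

```python
def cd_index(citations, focal):
    """Unweighted CD index of `focal`, or None if nothing cites it or its references.

    `citations` is an iterable of (citing, cited) pairs. For each later
    paper i that cites the focal paper and/or its references:
      f_i = 1 if i cites the focal paper, else 0
      b_i = 1 if i cites any of the focal paper's references, else 0
    CD = mean over i of (-2 * f_i * b_i + f_i), ranging from -1 to 1.
    """
    citations = list(citations)
    # References of the focal paper (its predecessors).
    refs = {dst for src, dst in citations if src == focal}
    # What every other paper cites.
    cited_by = {}
    for src, dst in citations:
        if src != focal:
            cited_by.setdefault(src, set()).add(dst)

    total, n = 0, 0
    for targets in cited_by.values():
        f = 1 if focal in targets else 0
        b = 1 if targets & refs else 0
        if f or b:  # Only papers touching the focal paper or its references count.
            n += 1
            total += -2 * f * b + f
    return total / n if n else None
```

A value near 1 means later work cites the focal paper while ignoring what it built on (destabilizing). A value near -1 means later work cites the paper together with its references (consolidating).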
Comparative analysis of legacy and emerging journals in mathematical biology from University of Michigan and University of Michigan Medical School
Marisa Conte, assistant director of research & informatics, Taubman Health Sciences Library, University of Michigan
Samuel Hansen, mathematics and statistics librarian, Shapiro Science Library, University of Michigan
Scott Martin, biological sciences librarian, Shapiro Science Library, University of Michigan
Santiago Schnell, John A. Jacquez collegiate professor of physiology, University of Michigan Medical School
Researchers will perform a comparative analysis on papers published in four mathematical biology legacy journals and on newer journals with different publication models and disciplinary scope. The team will use the CADRE datasets to develop methodologies for comparative bibliometrics and content analyses; provide insight into publication trends in theoretical and applied domains; give authors new factors to consider when trying to publish; and help editors in similar disciplines use informatics to distinguish their journals.
Systematic over-time study of the similarities and differences in research across mathematics and the sciences from University of Michigan
Samuel Hansen, mathematics and statistics librarian, Shapiro Science Library, University of Michigan
Samuel’s project uses reference and citation aging, bibliographic coupling, and network breadth and depth to find similarities and differences between research fields in mathematics and the sciences. Specifically, they will find how information ages differently across disciplines, generate data about changes in the development of these research fields, and study how actively collaborative the disciplines are. Samuel will use WoS data from 1900 to 2017 to perform these analyses, which have typically only been done on a smaller scale in a single discipline.
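Two of the measures named above, bibliographic coupling and reference aging, are simple to state in code. The sketch below is a minimal illustration on made-up inputs; the function names and data layout are assumptions, not the project's actual pipeline.

```python
from itertools import combinations


def bibliographic_coupling(references):
    """Coupling strength for each pair of papers: the number of shared references.

    `references` maps a paper ID to the list of IDs it cites.
    Pairs with no shared references are omitted from the result.
    """
    strengths = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(set(references[a]) & set(references[b]))
        if shared:
            strengths[(a, b)] = shared
    return strengths


def mean_reference_age(citing_year, reference_years):
    """Average age, in years, of the works a paper cites, or None if it cites nothing."""
    if not reference_years:
        return None
    ages = [citing_year - year for year in reference_years]
    return sum(ages) / len(ages)
```

Comparing average reference age across fields gives one view of how quickly information ages in each discipline, while coupling strengths show which papers draw on the same prior work.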
Research needs are varied and dynamic. While CADRE can ease the many technical and financial obstacles researchers face when working with large data resources, our platform also has the flexibility to fit more specific research needs. To better understand those needs, we collect User Stories.
Whether you are an academic researcher, librarian, or technical provider, CADRE’s ongoing call for User Stories allows you to tell us what you want from CADRE.
Maybe you are a researcher who requires the CADRE platform to apply multiple graph visualization tools, such as Gephi or VOSviewer, to the same query result so you can explore your data more easily.
Or you might be a researcher with multiple affiliations, who wants to ensure the code you store in your repository will live in a cloud and not become lost if you change affiliations.
Perhaps you like that CADRE allows you to save space by running queries on the cloud server, but you also want the ability to download data locally to share findings with outside collaborators.
Then tell us—the CADRE team took each of the stories above into serious consideration when we began developing our platform.
User Stories give insight into the functionality requirements of different types of researchers, which will help us build a platform that better suits the needs of every researcher who uses CADRE. Any request ranging from specific interface design elements to a general computational environment will assist in our future development decisions.
How to submit stories
No user story is too trivial to share, and you can submit as many stories as you want.
Simply visit our User Story Collection page and tell us about who you are. Explain your project goals and what CADRE functionalities could help you achieve those goals.
If you are also interested in taking advantage of CADRE’s current resources and capabilities, consider applying to the CADRE Fellowship Program. The deadline for the first round of CADRE Fellows is June 25. Similar to User Stories, this fellowship will help CADRE better understand the needs of its researchers.
How will CADRE advance your research? Tell us your story today.