Presentation Abstracts

The PowerPoints from the conference are available through the KU ScholarWorks collection "NADDI" http://kuscholarworks.ku.edu/dspace/handle/1808/11005

Links to individual presentations follow each abstract below.

 

Author(s): Ingo Barkow, David Schiller (IAB - Institute for Employment Research)
Title: Rogatus – a planned open source toolset to cover the whole lifecycle
Length: 50 min
Abstract: In recent years, several tools for DDI Lifecycle have been published, but none of them covers the full lifecycle from beginning to end. This presentation gives a first look at Rogatus, an open source toolset currently in development at DIPF with support from GESIS, TBA21, OPIT, Colectica, Alerk Amin, and IAB. Rogatus consists of several DDI-compliant applications (e.g. Qbee – Questionnaire Builder, Cbee – Case Builder, Tbee – Translation Builder, Mbee – Metadata Builder, and the Rogatus Portal), and some of its components are reused in other software products (e.g. the IAB Metadata Management Portal). The presentation will also show how a final version of Rogatus could be combined with other well-known tools such as Colectica or Questasy, using DDI as the standard for data exchange, so that each step of the survey process (creating a study from scratch, designing the instruments, performing the data collection, handling the administrative processes, curating the data, disseminating the data, publishing it, and finally archiving it for secondary use) could be handled with individual tools. http://hdl.handle.net/1808/11063



Author(s): Amber Leahey
Title: Collaborative markup of library & researcher data: Examples from OCUL
Length: 20 min
Abstract: This presentation will focus on collaborative efforts to capture, store, and disseminate social science survey data and researcher data across all of Ontario's university libraries. Through shared platforms and practices, collaborative markup of data using the Data Documentation Initiative (DDI) standard makes it possible to deliver rich discovery services to users of library and researcher data. An overview of Scholars Portal's data services, including the Ontario Data Documentation, Extraction Service and Infrastructure (ODESI) and Dataverse, will highlight effective collaborative markup strategies for data. http://hdl.handle.net/1808/11050



Author(s): William Block, Jeremy Williams, Lars Vilhuber, Carl Lagoze, Warren Brown, John Abowd
Title: Improving User Access to Metadata for Public and Restricted Use US Federal Statistical Files
Length: 20 min
Abstract: The US federal statistical system produces prodigious amounts of public-use and restricted-use data. The restricted-use data can be difficult to work with due to poor documentation. The documentation that has been produced across agencies, and across the public-use/restricted-use divide, does not adhere to a single standard, making the metadata useful but insular. The Data Documentation Initiative (DDI) is an emerging metadata standard used internationally to describe data in the social sciences. It has the potential to unify the metadata managed by separate organizations into a comprehensive, searchable set. Researchers from the Labor Dynamics Institute, in collaboration with the Cornell Institute for Social and Economic Research (CISER), received funding from the National Science Foundation to improve the documentation of federal statistical system data, with the goal of making it more discoverable, accessible, and understandable for scientific research. This paper covers a subset of the overall project, reporting on the development of the web interface for user searches and the search API. The primary data model used in this application is DDI 2.5 (Codebook), which contains elements and attributes to describe the contents of a data set. http://hdl.handle.net/1808/11093



Author(s): Jeremy Iverson, Libbie Stephenson, Dan Smith
Title: DDI-Lifecycle and Colectica at the UCLA Social Science Data Archive
Length: 20 min
Abstract: The UCLA Social Science Data Archive’s mission is to provide a foundation for social science research involving original data collection or the reuse of publicly available studies. Archive staff and researchers work as partners throughout all stages of the research process: when a hypothesis or area of study is being developed, during grant and funding activities, while data collection and/or analysis is ongoing, and finally in the long-term preservation of research results. Three years ago SSDA began searching for a better repository solution to manage its data, make the data more visible, and support the organization’s disaster plan. SSDA wanted to make it easier for researchers to find data, document their data, and use data online. Since the goal is to document the entire lifecycle of a data product, the DDI-Lifecycle standard plays a key role in the solution. This paper explores how DDI-Lifecycle and Colectica can help a data archive with limited staff and resources deliver a rich data documentation system that integrates with other tools to allow researchers to discover and understand the data relevant to their work. The paper will discuss how SSDA and Colectica staff worked together to implement the solution. http://hdl.handle.net/1808/11049



Author(s): Barry Radler, Jeremy Iverson and Dan Smith
Title: Applying the DDI to a longitudinal study of aging
Length: 20 min
Abstract: Midlife in the United States (MIDUS) is a large, multi-disciplinary longitudinal study of aging conducted by the University of Wisconsin. MIDUS researchers want to provide a comprehensive, canonical source of documentation for the research project. To accomplish this, the team took the diverse set of sources that previously documented the MIDUS study and created a standardized, DDI 3-based set of documentation that better enables researchers to discover and use the MIDUS data. This talk will outline the process used to create the DDI 3 documentation, and will demonstrate the resulting documentation and dissemination tools provided by Colectica. The project is a joint effort between MIDUS and Colectica. http://hdl.handle.net/1808/11053



Author(s): Dan Smith
Title: Colectica for Excel: Using DDI Lifecycle with Spreadsheets
Length: 20 min
Abstract: Colectica is a suite of modern metadata management software used to document statistical datasets, public opinion and survey research methodologies, and data collection. This demonstration will introduce the new Colectica for Microsoft Excel software, a free tool for documenting statistical data using leading open standards, including the Data Documentation Initiative (DDI) Lifecycle version 3 and ISO 11179. Using this software allows organizations both to better educate sponsors and the public on their methodology and to strengthen their reputation for credible scientific research. The free Colectica for Excel tool allows researchers to document their data directly in Microsoft Excel. Variables, code lists, and the datasets themselves can be globally identified and described in a standard format. Data can also be imported directly from SPSS and Stata files and documented. The standardized metadata is stored within the Excel files, so it will be available to anyone receiving the documented dataset. Codebooks can also be customized and generated by the tool, with output in PDF, Word, HTML, and XSL-FO formats. http://hdl.handle.net/1808/11054



Author(s): Thérèse Lalor, Steven Vale, Arofan Gregory
Title: Generic Statistical Information Model and DDI
Length: 50 min
Abstract: Across the world, statistical organizations undertake similar activities, and each of these activities uses and produces similar information (for example, all agencies use classifications, create data sets, and publish products). Although this information is at its core the same, organizations tend to describe it slightly differently (and often in different ways within a single organization); there is no common means to describe it. GSIM is a conceptual model that provides a set of standardized, consistently described information objects, which are the inputs and outputs in the design and production of statistics. DDI is a key standard both in the development of GSIM itself and as an implementation tool for organizations using GSIM. Beyond that, GSIM is also expected to influence the future direction of DDI development, attracting a larger number of data producers into the DDI community. This presentation introduces GSIM, looks at the interaction between GSIM and DDI (and other related standards), and provides an update on a rapidly evolving vision for the use of DDI within statistical institutes in Europe and elsewhere. It will cover the direct interaction between DDI and GSIM and also provide a broader context for understanding what that dynamic may mean in the future. http://hdl.handle.net/1808/11045



Author(s): David Schiller, Ingo Barkow (DIPF)
Title: Administrative Data in the IAB Metadata Management System
Length: 50 min
Abstract: The Research Data Centre (FDZ) of the German Federal Employment Agency (BA) at the Institute for Employment Research (IAB) prepares and provides access to research data. Besides survey data, the IAB provides data deriving from the administrative processes of the BA. These data are very complex and not easy to understand and use, so good data documentation is crucial for users. DDI provides a documentation standard that makes documentation and data sharing easier; the latter is especially important for providers of administrative data because more and more other data types are merged with administrative data. Nevertheless, there are also some drawbacks to using the DDI standard: data collection for administrative data differs from data collection for survey data, yet DDI was established for survey data. At the same time, the description of complex administrative data should be as simple as possible. IAB and TBA21 are currently carrying out a project to build a Metadata Management System for IAB. The presentation will highlight the documentation needs for administrative data and show how they are covered in the Management System. In addition, the need for DDI profiles, comprehensive software tools, and future-proof data documentation for multiple data sources will be discussed. http://hdl.handle.net/1808/11064



Author(s): Mary Vardigan
Title: Collaborative Research: A Metadata Portal for the Social Sciences
Length: 20 min
Abstract: The Inter-university Consortium for Political and Social Research (ICPSR), NORC at the University of Chicago, and the American National Election Studies program in the Center for Political Studies at the University of Michigan’s Institute for Social Research are currently engaged in a new collaborative effort to create a common metadata portal for two of the most important data collections in the U.S. – the American National Election Studies (ANES) and the General Social Survey (GSS). Technical support is provided by Metadata Technology and Integrated Data Management Services. This pilot project, funded by the National Science Foundation, will produce a combined library of machine-actionable DDI metadata for these collections, and demonstrate DDI-based tools for advanced searching, dynamic metadata presentation, and other functions intended to facilitate discovery and analysis of these data. The project will also lay a foundation for developing new metadata-driven workflows for both ANES and GSS. This presentation will describe the major phases and deliverables of the project and present our plan of action, with an emphasis on how the project will benefit the wider community. http://hdl.handle.net/1808/11057



Author(s): Alerk Amin
Title: Data and Metadata Harmonization for the RAND Survey Meta Data Repository
Length: 20 min
Abstract: The RAND Survey Meta Data Repository aims to help researchers use data and metadata from the HRS family of surveys on aging, including studies from the US, UK/Europe, and Asia. The project consists of three major parts: 1) importing the metadata for each wave of each survey (in various formats such as DDI, Excel, and Word/PDF) and linking the various modules/items to a single hierarchy of concepts; 2) creating the RAND Harmonized datasets by combining data across different waves of different studies, to facilitate easier comparison across years and countries; and 3) the Repository website, which provides researchers with a single point of access to browse and search the metadata across all of the different surveys. Currently, only one of the studies provides metadata in DDI format to simplify the import process; for the other studies, custom scripts and a great deal of manual effort are required. This presentation will discuss how DDI could be used to improve the process of importing the metadata and creating the RAND Harmonized datasets, as well as the benefits for researchers who access the Repository. http://hdl.handle.net/1808/11040



Author(s): Arofan Gregory, J Gager, Pascal Heus
Title: DataForge: A DDI-Enabled Toolkit for Researchers and Data Managers
Length: 20 min
Abstract: Statistical data exist in many different shapes and forms, such as proprietary software files (SAS, Stata, SPSS), ASCII text (fixed, CSV, delimited), databases (Microsoft, Oracle, MySQL), or spreadsheets (Excel). Such a wide variety of formats presents producers, archivists, analysts, and other users with significant challenges in terms of data usability, preservation, and dissemination. These files also commonly contain essential information, such as the data dictionary, that can be extracted and leveraged for documentation purposes, task automation, or further processing. In mid-2013 Metadata Technology will launch "DataForge", a new software utility suite for reading and writing data across packages, producing various flavors of DDI metadata, and performing other useful operations on statistical datasets, in support of data management, dissemination, and analysis activities. DataForge will initially be made available as desktop-based products under both freeware and commercial licenses, with a web-based version to follow. IASSIST 2013 will mark the initial launch of the product. This presentation will provide an overview of DataForge's capabilities and describe how to get access to the software. http://hdl.handle.net/1808/11044



Author(s): Wendy Thomas, Chris Brown and Ron Nakao
Title: PANEL: DDI and Metadata from the Researcher's Perspective
Length: 50 min
Abstract: (preliminary abstract) This will be a panel discussion on lifecycle metadata issues from the researcher's perspective.
A lot of focus has been placed on how to integrate DDI into large data collection processes in the world of official statistics, research centers, and long-term projects. In these areas it makes sense to talk about the payoff of metadata reuse, developing processes and tools to harvest metadata along a production process, and the value of a software-neutral means of capturing and transporting metadata. The question facing academic data libraries and archives is how to integrate DDI into smaller, limited-time-frame research projects. What are the payoffs for the individual researcher? What tools can be provided to support researchers? This panel is designed to gather input from attendees to help answer the following questions:

  1. How can the use of DDI throughout the research process help researchers during the process?
    1. What needs to be there (tools, processes, informational materials, etc.)?
    2. What can data libraries/archives/services do to promote and support DDI use?
    3. What is needed from others (Funding agencies, academic departments, computing services, etc.)?
  2. What can DDI do to increase the use of DDI within the academic environment? http://hdl.handle.net/1808/11052




Author(s): Arofan Gregory et al. (see: http://www.ddialliance.org/alliance/working-groups#qdewg)
Title: DDI Extensions for Qualitative Data
Length: 50 min
Abstract: DDI's origins lie in structuring metadata for quantitative data: data represented by numbers or by a set of values that can be tabulated. Many researchers, though, generate or analyze unstructured data: text documents, images, video, sound recordings, and more. In 2010 the DDI Alliance formed a working group (http://www.ddialliance.org/alliance/working-groups#qdewg) charged with "developing a robust XML-based schema for qualitative data exchange (compliant with DDI) and encourage tools development based upon these needs". This presentation will report on the progress of that group and describe the state of the DDI Qualitative Data Model as of the working group meeting in Bergen, Norway in November 2012. A discussion period for feedback on the model will follow the presentation. http://hdl.handle.net/1808/11046



Author(s): Ingo Barkow, William Block, Jay Greenfield, Marcel Hebing, Larry Hoyle, Wendy Thomas
Title: Generic Longitudinal Business Process Model
Length: 20 min
Abstract: This presentation will describe a model for the processes involved in a longitudinal study. The model was developed at a symposium-style workshop held at Dagstuhl in September of 2011 (http://www.dagstuhl.de/11382). The Generic Longitudinal Business Process Model (GLBPM) emulates the Generic Statistical Business Process Model (GSBPM) (http://www1.unece.org/stat/platform/download/attachments/8683538/GSBPM+Final.pdf?version=1) which, in turn, was developed with DDI Lifecycle in mind. The GLBPM is intended as a generic model that can serve as the basis for informing discussions across organizations conducting longitudinal data collections, and other data collections repeated across time. The model is not intended to drive implementation directly, but may prove useful for those planning a study. An introductory presentation on the model will be followed by a panel discussion. http://hdl.handle.net/1808/11051



Author(s): Mary Vardigan & Joachim Wackerow
Title: DDI - A Metadata Standard for the Community
Length: 20 min
Abstract: This presentation gives an overview of the primary benefits of DDI, such as rich content, metadata reuse across the life cycle, and machine-actionability in a global network. Examples of successful adoption will be described, and the barriers and challenges of using DDI will be discussed on multiple levels. The presentation closes with an outlook on the future of DDI. http://hdl.handle.net/1808/11056



Author(s): Philip A. Wright
Title: Using SAS to generate DDI-C XML from Information Managed in Excel Spreadsheets
Length: 50 min
Abstract: At ICPSR, DDI-C compliant files play two distinct roles in generating variable documentation from information managed in Excel spreadsheets by the data producer. For completed studies, DDI-C compliant files are used to generate codebooks that include unweighted frequencies. For data in production, DDI-C is used to bulk-load questions and variable attributes into a browser-based variable editor. This presentation will describe in moderate detail how SAS is used to generate the major DDI-C XML elements. http://hdl.handle.net/1808/11058
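The presentation itself uses SAS; as a language-neutral sketch of the same idea, the fragment below builds a minimal DDI-C (DDI Codebook) fragment from spreadsheet-style rows. The element names (codeBook, dataDscr, var, labl, qstn, qstnLit) come from the DDI Codebook schema, but the row layout and the sample values are invented for illustration and are not ICPSR's actual workflow.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows as a data producer might manage them in a spreadsheet:
# (variable name, variable label, question text)
ROWS = [
    ("AGE", "Respondent age", "How old are you?"),
    ("SEX", "Respondent sex", "What is your sex?"),
]

def build_ddi_c(rows):
    """Build a minimal DDI Codebook fragment with one <var> per spreadsheet row."""
    codebook = ET.Element("codeBook")
    data_dscr = ET.SubElement(codebook, "dataDscr")
    for name, label, question in rows:
        var = ET.SubElement(data_dscr, "var", {"name": name})
        ET.SubElement(var, "labl").text = label          # variable label
        qstn = ET.SubElement(var, "qstn")
        ET.SubElement(qstn, "qstnLit").text = question   # literal question text
    return codebook

xml_text = ET.tostring(build_ddi_c(ROWS), encoding="unicode")
print(xml_text)
```

A real export would also carry document- and study-level sections (e.g. stdyDscr) and the unweighted frequencies mentioned above; this sketch covers only the variable-level skeleton.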



Author(s): Larry Hoyle, Ada Van Roekel
Title: REDCap and DDI: an update
Length: 20 min
Abstract: The REDCap (Research Electronic Data Capture) consortium is a group of over 450 institutions supporting a web application for data capture in research studies (see http://project-redcap.org/). The application allows interactive survey instrument development and data collection. Data, along with scripts for SPSS, SAS, and R, can be exported from REDCap; survey metadata, including question text and flow control, can also be exported as a CSV file. This paper describes code in the R language to convert the REDCap survey metadata from CSV to DDI 3.1, including a discussion of which REDCap instrument attributes can be represented by DDI 3.1 elements other than Note. The presentation also includes information on the REDCap API and on metadata for mapping data entry forms to events in a longitudinal study. http://hdl.handle.net/1808/11047
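The conversion described above was written in R; as an illustrative sketch in Python, the fragment below maps rows of a REDCap-style data dictionary CSV to DDI 3.1 QuestionItem elements. The column headers follow REDCap's data dictionary export and "ddi:datacollection:3_1" is the DDI 3.1 data collection namespace, but the sample rows and the exact element mapping are simplified assumptions, not the paper's actual code.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Tiny sample in REDCap data-dictionary CSV layout (rows invented for illustration).
SAMPLE = """Variable / Field Name,Field Type,Field Label
age,text,How old are you?
consent,yesno,Do you consent to participate?
"""

DC = "ddi:datacollection:3_1"  # DDI 3.1 data collection namespace

def redcap_to_questions(csv_text):
    """Map each REDCap field to a DDI 3.1 QuestionItem carrying its QuestionText."""
    scheme = ET.Element(f"{{{DC}}}QuestionScheme")
    for row in csv.DictReader(io.StringIO(csv_text)):
        item = ET.SubElement(scheme, f"{{{DC}}}QuestionItem",
                             {"id": row["Variable / Field Name"]})
        ET.SubElement(item, f"{{{DC}}}QuestionText").text = row["Field Label"]
    return scheme

scheme = redcap_to_questions(SAMPLE)
print(ET.tostring(scheme, encoding="unicode"))
```

A full conversion would also have to handle response domains (the REDCap "Choices" column), branching logic, and the instrument attributes that the paper maps to DDI elements other than Note.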



Author(s): Gina-Qian Cheung
Title: MQDS - the Michigan Questionnaire Documentation System
Length: 20 min
Abstract: The Michigan Questionnaire Documentation System (MQDS) is a powerful tool used to help create questionnaire documentation, with or without summary statistics, and other documentation based on the Blaise data model for a study. MQDS works by:
  1. Analyzing the data model and its associated files, then importing the Blaise metadata and data into the MQDS database;
  2. Exporting that information to an eXtensible Markup Language (.xml) file; and
  3. Rendering the needed elements via eXtensible Stylesheet Language (.xsl), then generating HyperText Markup Language (HTML), Rich Text Format (.rtf), and Portable Document Format (.pdf) output.
MQDS output is used for testing instruments, reviewing questionnaires, preparing documentation, and comparing questionnaires across data models or across studies. MQDS is also capable of providing summary statistics by reading in a Blaise database file and a corresponding Blaise data model and outputting the questionnaire with data file contents into an HTML, RTF, or PDF file format. http://hdl.handle.net/1808/11043



Sponsored by:

Alfred P. Sloan Foundation
 
Institute for Policy & Social Research
KU Libraries
The DDI Alliance