Data Management

Data management is the organization and planning of data throughout the research cycle. It encompasses a set of activities that are essential to the short- and long-term access and use of research data. It involves planning for the creation, storage, use, security, and continued access to data. Data are the raw, analyzed, or derived results of observations, experiments, and simulations. Data can be either analog or digital and can exist in different formats, including but not limited to text, numerical, multimedia, and instrument-specific. Management of data throughout the research life cycle not only increases the efficiency of a research project but also meets expectations for the ethical conduct of research, and it is rapidly becoming mandatory practice for many funding agencies.

Three Things You Can Do Today to Help Manage Your Data

1. Back up your data. Think of what it would take to reproduce your data. To make sure you don't lose it, strive to have three copies: the original master file, a local backup (e.g., on an external hard drive), and an external backup (e.g., on a networked drive or on a web-based storage service).

2. Organize your data. Plan the directory structure and file naming conventions before creating your data, taking into consideration the potential need to track versions of data sets and documents. Follow any existing project-specific conventions or disciplinary standards or best practices.

3. Document your data. Data documentation, also known as metadata, will help you use and understand your research data into the future. If you plan to share your data, it will also help others find, use, and properly cite it. At a minimum, create a readme.txt file that includes basic documentation such as title, creator, identifier, rights/access information, dates, location, and methodology.
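
As a minimal sketch of the readme.txt suggestion above, the following Python snippet writes the basic documentation elements to a plain-text file. The function names (`build_readme`, `write_readme`), the field labels, and the sample values are illustrative assumptions, not a required standard; follow any project-specific or disciplinary metadata conventions where they exist.

```python
from pathlib import Path

# Field labels follow the basic documentation elements listed above.
# The exact labels and their order are an assumption for illustration.
README_FIELDS = [
    "Title", "Creator", "Identifier", "Rights/Access",
    "Dates", "Location", "Methodology",
]


def build_readme(metadata: dict) -> str:
    """Return readme text with one 'Field: value' line per basic element."""
    lines = []
    for field in README_FIELDS:
        # Mark missing fields explicitly so gaps are visible to later readers.
        value = metadata.get(field, "NOT PROVIDED")
        lines.append(f"{field}: {value}")
    return "\n".join(lines) + "\n"


def write_readme(directory: str, metadata: dict) -> Path:
    """Write readme.txt into an existing data directory and return its path."""
    path = Path(directory) / "readme.txt"
    path.write_text(build_readme(metadata), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical example values.
    example = {
        "Title": "Example survey temperature profiles",
        "Creator": "Example Researcher",
        "Dates": "2013-01-01 to 2013-06-30",
    }
    print(build_readme(example))
```

Writing "NOT PROVIDED" for absent fields, rather than omitting them, is a design choice that makes incomplete documentation easy to spot.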

Assistance with Creating Data Management Plans

NOAA is in the process of forming policies to comply with the Office of Science and Technology Policy (OSTP) memorandum, Increasing Access to the Results of Federally Funded Scientific Research (February 22, 2013).

Data Citation

Why cite data? We believe that you should cite data in just the same way that you can cite other sources of information, such as articles and books.

Data citation can help by:

  • enabling easy reuse and verification of data
  • allowing the impact of data to be tracked
  • creating a scholarly structure that recognises and rewards data producers.

There is an increasing recognition of the importance of properly citing data in publications:

  • to ensure scientific transparency
  • to ensure reasonable accountability for authors and stewards
  • to encourage the replication of scientific results
  • to improve research standards
  • to give proper credit to data producers
  • to encourage continuity in grant funding for researchers
  • to enhance performance metrics
  • to aid in tracking the impact of data sets through references in the scientific literature

Examples of data citation

We recognize that the challenges associated with data publication vary across disciplines and encourage research communities to develop citation systems that work well for them. Our recommended format for a data citation is as follows:

  • Creator (PublicationYear): Title. Publisher. Identifier

It may also be desirable to include information from two optional properties, Version and ResourceType (as appropriate). If so, the recommended form is as follows:

  • Creator (PublicationYear): Title. Version. Publisher. ResourceType. Identifier
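
The two recommended forms can be sketched in code as follows. This is a minimal illustration, assuming a hypothetical `DataCitation` record and `format_citation` helper (neither is part of any DataCite tool); the optional Version and ResourceType properties are simply left out of the output when they are not supplied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """Hypothetical record holding the citation properties named above."""
    creator: str
    publication_year: int
    title: str
    publisher: str
    identifier: str
    version: Optional[str] = None        # optional property
    resource_type: Optional[str] = None  # optional property


def format_citation(c: DataCitation) -> str:
    """Build 'Creator (PublicationYear): Title. Version. Publisher. ResourceType. Identifier'."""
    parts = [c.title, c.version, c.publisher, c.resource_type, c.identifier]
    # Skip optional properties that were not supplied, and drop a trailing
    # period on any part so the joined string never contains "..".
    body = ". ".join(p.rstrip(".") for p in parts if p)
    return f"{c.creator} ({c.publication_year}): {body}"


if __name__ == "__main__":
    # All values below are made up for illustration.
    basic = DataCitation(
        creator="Doe, J",
        publication_year=2013,
        title="Example data set",
        publisher="Example Data Center",
        identifier="doi:10.5555/example",
    )
    print(format_citation(basic))
```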

For citation purposes, DataCite recommends that DOI names be displayed as linkable, permanent URLs.

  • Irino, T; Tada, R (2009): Chemical and mineral compositions of sediments from ODP Site 127-797. Geological Institute, University of Tokyo. PANGAEA
  • Geofon operator (2009): GEFON event gfz2009kciu (NW Balkan Region). GeoForschungsZentrum Potsdam (GFZ). GEOFON
  • Denhard, Michael (2009): dphase_mpeps: MicroPEPS LAF-Ensemble run by DWD for the MAP D-PHASE project. World Data Center for Climate

The assignment of persistent identifiers enables accurate data citation.

DOI® names are assigned to any entity for use on digital networks. They are used to provide current information, including where they (or information about them) can be found on the Internet. Information about a digital object may change over time, including where to find it, but its DOI name will not change.
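
Because the DOI name itself does not change, a citation can store the bare name and build the link at display time. The sketch below assumes the public resolver at https://doi.org/ as the link base; the helper name `doi_to_url`, the accepted input forms, and the loose syntax check are illustrative assumptions, not an official validation rule.

```python
import re
from urllib.parse import quote

# Assumption: links are built on the public DOI resolver.
RESOLVER = "https://doi.org/"

# Leading forms commonly written in front of a DOI name.
_PREFIXES = re.compile(r"^(?:doi:\s*|https?://(?:dx\.)?doi\.org/)", re.IGNORECASE)

# Loose heuristic only: "10.", a numeric prefix, a slash, then a suffix.
_DOI_NAME = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Turn 'doi:10.x/y', a resolver URL, or a bare DOI name into a link."""
    name = _PREFIXES.sub("", doi.strip())
    if not _DOI_NAME.match(name):
        raise ValueError(f"does not look like a DOI name: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path.
    return RESOLVER + quote(name, safe="/:;()")


if __name__ == "__main__":
    # Hypothetical DOI name, used only for illustration.
    print(doi_to_url("doi:10.5555/example.data"))
```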

The DOI System provides a framework for persistent identification, managing intellectual content, managing metadata, linking customers with content suppliers, facilitating electronic commerce, and enabling automated management of media.

NOAA is in the process of developing a pilot DOI project (January 2013 to present).

The DOI System and Registries

DOI System


CrossRef

The ESIP Federation has put together these guidelines.

Important Policies on Data Management

The National Science Foundation has released a requirement for proposal submissions regarding the management of data generated using NSF support. Starting in January 2011, all proposals must include a data management plan (DMP). The plan should be short, no more than two pages, and will be submitted as a supplementary document. The plan will thus not count toward the 15-page limit for proposals. The plan will need to address two main topics: What data are generated by your research? What is your plan for managing the data? See the Data Management Plan Policy.

Office of Science and Technology Policy: Increasing Access to the Results of Federally Funded Scientific Research, February 22, 2013. On February 22, 2013, OSTP directed Federal agencies with more than $100 million in R&D expenditures to develop plans to make the published results of federally funded research freely available to the public within one year of publication and to require researchers to better account for and manage the digital data resulting from federally funded research. The final policy reflects substantial input from scientists and scientific organizations, publishers, members of Congress, and other members of the public. Office of Science and Technology Policy


National Oceanographic Data Center

Some representative data journals

