Good data guidelines


Before depositing your digital data with NGDC, there are a number of things to consider to ensure your dataset is well organised, robust and accessible in the future.  

To comply with good data guidelines, your deposited data should: 

  • consistently include header rows and scientific units relating to any measurements 
  • document details about any scientific standards, instrumentation, software, code or data collection methodologies used 
  • be provided as a final version and therefore complete 
  • ideally be ‘open’ data, allowing it to be discovered and re-used by other interested parties
  • use one of the preferred formats
  • have all acronyms explained and use consistent naming conventions, as described in the file naming strategy
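As an illustration of the first point, the following minimal Python sketch checks that a CSV file starts with a header row and reports which column labels carry no unit. It assumes units are written in parentheses at the end of each label, for example "Depth (m)"; that convention, the function names and the sample data are hypothetical and are not an NGDC requirement.

```python
import csv
import io
import re

# Assumed convention: the unit sits in parentheses at the end of the label, e.g. "Depth (m)".
UNIT_PATTERN = re.compile(r"\(([^()]+)\)\s*$")


def _is_number(cell: str) -> bool:
    """Return True if the cell parses as a number."""
    try:
        float(cell)
        return True
    except ValueError:
        return False


def columns_missing_units(csv_text: str) -> list[str]:
    """Return the header labels that do not end with a unit in parentheses."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        raise ValueError("file is empty, so it has no header row")
    if any(_is_number(cell) for cell in header):
        raise ValueError("first row looks like data, not a header row")
    return [label for label in header if not UNIT_PATTERN.search(label)]


# Hypothetical sample data for illustration only.
sample = "SampleID,Depth (m),Temperature (degC)\nA1,12.5,8.3\n"
print(columns_missing_units(sample))  # prints ['SampleID']
```

Columns such as identifiers legitimately have no unit, so the result is a list to review rather than a pass or fail verdict.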

File naming strategy  

Spending time designing how samples and data will be named avoids problems with duplicate names and identity confusion, and saves renaming and sorting tasks later. Naming schemes should be descriptive and unique, and should reflect the content of the data or sample.

Your file and folder names should be meaningful and as brief as possible.  

Start the file name with the most important component so that your data can be organised by the most meaningful parameter, for example location, data type or researcher.

Use acronyms to keep the names short and include a ‘readme’ note with a list of acronyms and their explanations in full.  

Create ‘readme’ notes of other relevant information such as file type and version information, software information, researchers’ initials, or repeated metadata relating to all the files (for example, the data collection location). 

Use capital letters to delimit words instead of spaces, for example FileNameExample; this saves one character for every space removed.

Avoid special characters (such as @#£$%&) in file names, as other operating systems may not handle them correctly.

For dates, use the ISO 8601 basic format YYYYMMDD. If a shorter form is needed, use YYMMDD or YYMM.
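The naming rules above can be sketched in a few lines of Python. This is only one possible reading of the guidance: the function names, the underscore used to join components and the example values (location, data type, initials) are assumptions made for illustration, not part of the NGDC guidance.

```python
import re
from datetime import date


def to_camel_case(text: str) -> str:
    """Delimit words with capital letters instead of spaces."""
    # Keeping only runs of ASCII letters and digits also drops special characters.
    words = re.findall(r"[A-Za-z0-9]+", text)
    return "".join(word[0].upper() + word[1:] for word in words)


def build_file_name(components: list[str], collected: date, extension: str) -> str:
    """Put the most important component first and a YYYYMMDD date last."""
    parts = [to_camel_case(component) for component in components]
    parts.append(collected.strftime("%Y%m%d"))
    # Assumption: an underscore separates components; the guidance does not specify one.
    return "_".join(parts) + "." + extension.lstrip(".")


# Hypothetical example: location first, then data type, then researcher initials.
print(build_file_name(["Loch Ness", "borehole log", "JS"], date(2021, 3, 9), "csv"))
# prints LochNess_BoreholeLog_JS_20210309.csv
```

Because the date is written as YYYYMMDD, files that share the same leading components sort chronologically when sorted by name.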

Use folder names to classify broad types of files.  

Think!  

  • Is it possible for future users to know the content of the file without opening it, using just the file name and any ‘readme’ notes provided?  
  • Can they identify the content if you move the file to another folder?  
  • Can they retrieve and filter the data quickly using the search or filter function on their computer?  

Bulk renaming

If you have a lot of files that you need to rename to comply with a naming convention, here are a few applications you can try:  

  • Bulk Rename Utility (Windows, free) 
  • Total Commander (Windows, free) 
  • Renamer 4 (Mac, not free) 
  • PSRenamer (Linux, Mac, or Windows, free and open source) 
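If you prefer a script to one of these applications, a minimal Python sketch of a bulk rename might look like the following. It assumes the convention of capital letters instead of spaces and no special characters; the function names are hypothetical, and the script is a starting point rather than a tested tool. It defaults to a dry run so that the planned changes can be reviewed before any file is touched.

```python
import re
from pathlib import Path


def clean_name(stem: str) -> str:
    """Drop spaces and special characters, capitalising each word instead."""
    words = re.findall(r"[A-Za-z0-9]+", stem)
    return "".join(word[0].upper() + word[1:] for word in words)


def bulk_rename(folder: Path, dry_run: bool = True) -> list[tuple[str, str]]:
    """Rename the files in one folder; return (old name, new name) pairs."""
    planned = []
    claimed = set()
    for old in sorted(folder.iterdir()):
        if not old.is_file() or old.name.startswith("."):
            continue  # skip sub-folders and hidden files
        stem = clean_name(old.stem)
        if not stem:
            continue  # nothing usable left in the name, so leave it for a person
        new = old.with_name(stem + old.suffix)
        if new.name == old.name:
            continue  # already follows the convention
        clash = new.exists() and not new.samefile(old)
        if clash or new.name in claimed:
            raise FileExistsError(f"{old.name} -> {new.name} clashes with another file")
        claimed.add(new.name)
        planned.append((old, new))
    # Nothing is renamed until every planned name has been checked for clashes.
    for old, new in planned:
        if not dry_run:
            old.rename(new)
    return [(old.name, new.name) for old, new in planned]
```

Calling bulk_rename(Path("SurveyData")), where SurveyData is a hypothetical folder, returns the planned changes without renaming anything; passing dry_run=False applies them. Multi-part extensions such as .tar.gz are not handled, and it is sensible to work on a copy of the data first.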

Further guidance