Metadata

Metadata can be described as data about data. With robust, well-described metadata, users can find content, navigate to related information or share it with others. It also allows prospective users of the data to assess what the dataset contains and whether it is appropriate for their needs.  

NGDC will use the metadata you provide to make the dataset available via the NGDC deposited data search and further data catalogues for maximum exposure to data users, including the BGS Discovery Metadata catalogue, the data.gov.uk data search and the NERC Data Catalogue Service.  

NGDC develops a catalogue of metadata records compliant with the ISO 19139 XML implementation schema for the ISO 19115 geographic information metadata standard. NGDC further enriches the submitted metadata for selected datasets and creates Discovery Metadata records that are fully compliant with the UK standard for spatial metadata (Gemini 2.3) and the INSPIRE directive. These are made publicly accessible and searchable through standards-based searches using ISO 19139 and the OGC Catalogue Service for the Web.

Core boxes awaiting processing. BGS © UKRI.

Why is metadata important? 

Metadata is vital for ensuring that data remains understandable, accessible, and usable over time, benefiting both the creators who want their data to be influential and data users who rely on well-documented data for reliable research outcomes. 

Good quality metadata can help both data creators and data users. Data creators can share their data with others, maximise reach, receive credit when the data is cited and understand their data better when it is re-used in the future. Data users can discover, evaluate and re-use created data. 

What metadata do I need?  

The key metadata fields that must be included are the following.

Title 

The title gives a clear and concise indication of the content of the dataset, not the project or activity that produced it. You might consider a title that answers the questions ‘what, where, when?’

Avoid using acronyms or abbreviations. If they are used, they should be explained in the title or in the abstract/data description if more appropriate. 

Abstract and data description 

This is a brief summary of the content of the dataset. A good abstract will help the user decide whether the dataset or model will be of interest. Care should be taken in writing your abstract, as it is the key to ‘selling’ your data to other data users. 

Lineage statement 

The lineage statement describes how the data came into existence and the stages it has passed through. This is not a detailed methodology but a brief summary. 
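As an illustration only, the three key fields above could be held in a small record and checked for gaps before submission. This is a minimal sketch: the `DatasetMetadata` class and the `missing_fields` helper are hypothetical names invented for this example and are not part of any NGDC tool or schema.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetMetadata:
    """Hypothetical container for the three key fields named above."""

    title: str = ""
    abstract: str = ""
    lineage: str = ""


def missing_fields(record: DatasetMetadata) -> list[str]:
    """Return the names of key fields that are empty or whitespace-only."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


record = DatasetMetadata(title="Meteorological records from three weather stations, south-east Iceland")
print(missing_fields(record))  # ['abstract', 'lineage']
```

An empty result means that all three fields contain some text; it says nothing about whether that text is well written.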

Writing a good abstract 

The abstract is an ‘executive summary’ that allows the reader to determine the relevance and usefulness of the resource. The text should be concise but still contain sufficient detail to allow the reader to rapidly ascertain the scope and limitations of the resource. 

Write for readers: the abstract should be in plain English. Be wary of using jargon; geologists might understand it but an external audience may not. Write in complete sentences rather than fragments or bullet points. 

Avoid using acronyms or abbreviations. If they are used, they should be explained. 

Be aware that the first one or two sentences are used by search engines such as data.gov.uk and Google to present search results to users. It is vital your key information is in the first 100 characters or the length of this bold sentence. 

Whilst links to published papers, websites, blogs, etc. are useful, the abstract needs to describe the dataset fully without users having to refer to other papers, websites, etc. 

The maximum length of an abstract is 4000 characters but it can be, and where possible should be, much shorter. The minimum is 100 characters.
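The length limits can be checked before submission. The following is a minimal sketch, assuming that characters are counted after trimming leading and trailing whitespace; how NGDC counts characters is not stated here, and the function and constant names are hypothetical.

```python
MIN_ABSTRACT_CHARS = 100   # minimum length given in this guidance
MAX_ABSTRACT_CHARS = 4000  # maximum length given in this guidance
PREVIEW_CHARS = 100        # leading characters this guidance asks you to prioritise


def check_abstract(abstract: str) -> list[str]:
    """Return a list of length problems; an empty list means the length is acceptable."""
    text = abstract.strip()  # assumption: surrounding whitespace is not counted
    problems = []
    if len(text) < MIN_ABSTRACT_CHARS:
        problems.append(f"too short: {len(text)} characters (minimum {MIN_ABSTRACT_CHARS})")
    if len(text) > MAX_ABSTRACT_CHARS:
        problems.append(f"too long: {len(text)} characters (maximum {MAX_ABSTRACT_CHARS})")
    return problems


def search_preview(abstract: str) -> str:
    """Return the first 100 characters, to review what a search result might lead with."""
    return abstract.strip()[:PREVIEW_CHARS]


print(check_abstract("Weather data from Iceland."))
# ['too short: 26 characters (minimum 100)']
```

Reading the output of `search_preview` on its own is a quick way to judge whether the key information really does come first.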

A good abstract should answer these questions.

  • What? Describe what has been recorded and what form the data takes. Include keywords and data type, volume, structure and format.
  • Where? Indicate where the dataset was collected and if the coverage is even or variable. Include geo-location, depth, altitude and site name.
  • When? Include when the dataset was collected (single date or time period), when it will be made available and what restrictions on use the dataset has.
  • How? This covers any instrumentation or software used to collect the data, including the version number, methodology, quality control and data resolution.
  • Why? For what purpose was the data collected? What audience is likely to find the data useful? 
  • Who? Include the party or parties responsible for data collection and interpretation. You should include the full name(s), affiliation(s) and ORCID(s) of the authors and a nominated contact person, and the funding reference if applicable.
  • Is it complete? If any data is absent from the dataset, explain which and why. 

Examples of dataset metadata

What: electron microprobe analyses of Mn-oxyhydroxide phases as elemental percentages per point analysis.

Where: Mn-oxyhydroxide phases were within limonites from Acoje (Philippines), Caldag (Turkey), Nkamouna (Cameroon), Piaui (Brazil) and Shevchenko (Kazakhstan) laterite deposits.

When: the data was acquired during the NERC SoS Minerals CoG3 project between 2015 and 2018 using a Cameca SX100 electron microprobe at the Natural History Museum, London, UK.

How: point analyses were performed on samples set within epoxy resin blocks, polished and coated with carbon. All elements were analysed using wavelength dispersive X-ray spectrometers. 

Why: the data was used to identify the Co- and Ni-bearing host minerals within each natural resource and to assess the amount and variability of these elements within specific Mn-oxyhydroxide phases. This may be useful to the mining sector for resource assessment, processing or prospecting, and to geoscientists, materials scientists, processing engineers and metallurgists. 

Who: the data was acquired in the Core Research Laboratories, Natural History Museum, by the NHM CoG3 team.

What: this dataset presents meteorological records from three weather stations around a glacier in south-east Iceland from 2009 to 2020. The weather stations were installed as part of BGS’s Glacier Observatory project and were positioned at different altitudes close to the ice to record glacier weather. The data is in text format and records key meteorological parameters including temperature, relative humidity, atmospheric pressure, precipitation, wind speed and direction, and solar irradiance.

Where and when: the weather stations were placed around Virkisjökull-Falljökull, an outlet glacier of the Öraefajökull ice cap in south-east Iceland (AWS1 at 16°48’19″W, 63°57’53″N; AWS3 at 16°47’5.64″W, 63°58’12.78″N, and AWS4 at 16°48’7″W, 63°59’52″N). AWS1 was installed in September 2009, with AWS3 installed in September 2010, and AWS4 in September 2011. AWS3 and AWS4 were decommissioned in August 2018 and AWS1 in May 2020. AWS1 was located 100 m from the current glacier margin at 156 masl; AWS3 sat 50 m from the icefall at 444 masl, and AWS4 was situated on a clifftop overlooking the glacier at 858 masl, close to the equilibrium line altitude of the glacier. They were positioned at different altitudes to determine changes in weather parameters with height, thus producing, for example, temperature or humidity gradients.

How: the three stations were wirelessly linked, allowing data from the upper stations to be offloaded to the datalogger on the lower station. On-site downloads were completed using Campbell Scientific LoggerNet 4.x series software. AWS1 maintained mobile phone telemetry enabling automatic remote downloads of data from all stations on a daily basis and remote access for software updates and health checks. Each AWS supported a slightly different sensor array depending on the requirements of the site, and was mounted on a 1.5 m Campbell Scientific tripod. All of the stations were designed around a Campbell Scientific CR800 datalogger, and were solar powered, using combinations of photovoltaic panels up to 100 W supplying a Campbell Scientific 25 Ah battery mounted on the tripod, plus a 110 Ah gel cell battery back-up in a separate housing.

Why: the data will be of use to researchers and students interested in the weather of south-east Iceland, glacier climate, local influence of glaciers on more regional synoptic weather systems, glacier climate modellers, glacier hydrologists and hydrogeologists.

Who: the BGS project was led and coordinated by Dr Jez Everest, with technical support and implementation by Heiko Buxel and data quality assurance and checking by Dr Jon Mackay.

Completeness: any periods where equipment malfunction, testing or replacement meant that no or unreliable data were collected are indicated by a ‘NAN’ value in the datasets.

What: this set of data is the second set of impact interviews conducted with the target communities of the BRAVE project. The interviews are transcriptions in Microsoft Word.

Where: the communities involved in the data collection were from Tomo and Poa in Burkina Faso and Jawani and Tariganga in Ghana. There are 32 interviews from Burkinabe community members and 23 from the Ghanaian communities. Individuals were selected based on their participation in the BRAVE field activity of the Farmer Voice Radio.

When: the data was collected between October 2019 and February 2020 by the local researchers.

How: the data methodology was built on the initial vulnerability assessments and included questions around behaviour change and income change based on the BRAVE communities’ activities of groundwater measurement and water management strategies.

Why: the data shows behaviour and livelihood change within the communities due to these activities. This is the final qualitative impacts dataset from the BRAVE project. Previous linked datasets include the baseline vulnerability assessments and the first round of impact interviews.

Who: BRAVE (Building understanding of climate variability into planning of groundwater supplies from low storage aquifers in Africa) is a consortium research project that is part of the Unlocking the Potential of Groundwater for the Poor (UPGro) programme.