Metadata Management

One powerful feature of the RDMS is that it gives you the option to enrich your files and folders with metadata. The term metadata refers to “data that provides information about other data”. In other words, it is “data about data”. In layman's terms, think of the information attached to a song that tells you who the singer is and which album it is from. In terms of research, this information could include the instrument that generated the data, as well as the project or experiment that the data was generated for.

The RDMS is based on iRODS, short for “integrated Rule-Oriented Data System”. This system treats all files and folders as data with metadata directly attached to them. The minimum level of metadata attached to data (a file or folder) is the metadata generated by the system itself, the so-called system metadata. This metadata contains, for example, the date when the data first entered the system, the size of the data, and its owner.

The second type of metadata is metadata that is added by a user, the so-called user-defined metadata. As a user, you can freely decide which metadata tags to use and how many of them to attach to the data, as long as you have the required permissions.

For example, you might want to record the name of the machine that produced the data, the internal settings of that machine used to record the data, the experiment number for which the data was produced, the tools and software used, or any tag that you find relevant to locate that particular dataset later in time. This freedom to decide on what metadata to record allows for many extra possibilities of data handling. For example, it allows you to search much more accurately for data on the basis of metadata.

While you can also manage metadata from the command line using iCommands, the web portal offers extensive features to add, remove, and change metadata and to use this metadata to search for data. Furthermore, the portal allows you to define metadata templates and new metadata types, which helps standardize your own and your group's metadata.

Before we jump into the metadata functionality of the RDMS, one important note: there is no limit to how much metadata you can attach to a file. The system is set up so that it does not limit the number of entries, the number of characters per tag or value, or the number of tags in a metadata template. Finally, metadata attached to a collection is not directly inherited by the objects contained in the collection. The objects, however, will be displayed in the search results if the collection that contains them has metadata attached that fits the parameters of the search query.
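The difference between inheriting metadata and merely showing up in search results can be illustrated with a small sketch. The Collection class, the search function, and all names and values below are hypothetical and only model the behaviour described above in memory; they are not the RDMS or iRODS implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    """Hypothetical in-memory stand-in for an RDMS collection (folder)."""
    name: str
    metadata: dict = field(default_factory=dict)  # tags on the collection itself
    objects: dict = field(default_factory=dict)   # object name -> the object's own tags


def search(collections, tag, value):
    """Return paths of objects whose own metadata, or whose parent
    collection's metadata, has tag == value."""
    hits = []
    for coll in collections:
        coll_matches = coll.metadata.get(tag) == value
        for obj_name, obj_meta in coll.objects.items():
            if coll_matches or obj_meta.get(tag) == value:
                hits.append(f"{coll.name}/{obj_name}")
    return hits


experiment = Collection(
    name="experiment_42",
    metadata={"Instrument": "microscope_A"},
    objects={"scan_001.tif": {}, "scan_002.tif": {"Operator": "jdoe"}},
)

# The object carries no "Instrument" tag of its own (nothing is inherited) ...
own_tags = experiment.objects["scan_001.tif"]
# ... yet a search on the collection's tag still lists the contained objects.
results = search([experiment], "Instrument", "microscope_A")
print(own_tags)
print(results)
```

In this model the tag lives only on the collection, so inspecting an object shows none of the collection's tags, while a search that matches the collection still returns every object inside it.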

Using the web portal you will have access to basic metadata functionality using a GUI. Specifically, you will be able to create, read, update, and delete (CRUD) metadata attached to your files. The web portal also allows for more complex handling of your metadata, but we will describe this in a different section when we talk about metadata templates. Below we show the data browser of the web portal and where you can see the metadata generated by the system.

When you first upload a file to the RDMS, all the metadata attached to it will be related to system information, such as upload time and file size, as shown above. You can easily attach new metadata to your data by right-clicking the object or collection you wish to add metadata to and then selecting Add Metadata. If you click any of the icons highlighted in the screenshot below, the web portal will open a side menu containing all information about the selected data. There you can easily check what metadata is already attached to the data you selected, as described in the Read Metadata section.

Once you select Add Metadata, the menu on the left will open as shown below. This way of inserting metadata is very much free form. The red boxes are mandatory, whereas the blue box can be used to provide additional information that did not fit into the Value* box. The Name* box is your metadata tag or label, while the Value* box is the value you are entering for the tag. This value can be text, as shown in the example, or it can be a number. Once again, this method of entering metadata is free form, and the decision of what the different values are is entirely up to you. The input for Extra value can likewise be text or a number.

An example of the usage of the three different fields could be:

  • Name*: Distance
  • Value*: 1000
  • Extra value: nautical miles

which would add Distance: 1000 nautical miles as a metadata entry for the selected data.
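The three fields can be thought of as one small record. The sketch below models such a record and renders it the way the example above reads. The MetadataEntry class and the object path are hypothetical. The last method builds an argument list for the iCommands tool imeta, under the assumption that the portal's Extra value corresponds to the optional units field of an iRODS metadata entry; the command is only assembled, not executed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MetadataEntry:
    """Hypothetical model of one entry from the Add Metadata form."""
    name: str                          # the Name* field (tag or label)
    value: str                         # the Value* field
    extra_value: Optional[str] = None  # the optional Extra value field

    def display(self) -> str:
        """Render the entry as 'Name: Value [Extra value]'."""
        text = f"{self.name}: {self.value}"
        if self.extra_value:
            text += f" {self.extra_value}"
        return text

    def imeta_add_args(self, object_path: str) -> list:
        """Build (but do not run) an 'imeta add -d' command for a data object.
        Assumption: Extra value maps to the optional units argument."""
        args = ["imeta", "add", "-d", object_path, self.name, self.value]
        if self.extra_value:
            args.append(self.extra_value)
        return args


entry = MetadataEntry("Distance", "1000", "nautical miles")
print(entry.display())
# The path below is a made-up example, not a real RDMS location.
print(entry.imeta_add_args("/zone/home/user/data.csv"))
```

Keeping the number in Value* and the unit in Extra value, rather than typing "1000 nautical miles" into a single field, keeps the value itself clean for later searches.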

When you are done inputting the metadata tag and description, you can click the blue button to commit your metadata entry.

Note:
Pressing the blue button will save the metadata and clear the window in order for you to input the next set of metadata. You can close the window when you are done. If you want to check if your metadata has been saved, click the icon (as shown in the first screenshot of this section) and the window shown below will appear. The blue box shows you the tab where the metadata attached to your data is visible, while the red box highlights the metadata entry you just saved.

Once you have created metadata for your data, you will always be able to display that information by clicking on the icon next to the respective entry in the web portal. In the screenshot below, clicking on the icon in the red box will display the metadata for the selected object, while clicking on the icon in the blue box will display the information of the current collection, in this case the user's home collection.

When you click the icon, a menu will appear on the right side of the portal. The first tab contains the metadata information of the object or collection you selected. Depending on the amount of metadata you recorded for this object, you might need to scroll down to find the entry you are interested in.

Note:
The metadata entries are ordered by the time they were created. If you enter Test_meta_0 after Test_meta_1, as is shown in the figure, the system will not reorder them numerically.
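A minimal sketch of this ordering, using a plain Python list as a hypothetical stand-in for the metadata tab (the values are made up):

```python
# Entries are kept in the order in which they were created.
entries = []


def add_entry(name, value):
    """Append a new entry; nothing is re-sorted on insertion."""
    entries.append((name, value))


add_entry("Test_meta_1", "entered first")
add_entry("Test_meta_0", "entered second")

creation_order = [name for name, _ in entries]
sorted_by_name = sorted(creation_order)

print(creation_order)  # order as displayed: by creation time
print(sorted_by_name)  # the order you would get only by sorting on the tag name
```

If you rely on a particular order when reading the list, it may help to create the entries in that order.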

You may not have all the details about certain data at the time you store it in the RDMS, or you may spot an error later. In either case, you can change or update metadata that you attached to a file as described above.

To modify metadata, simply click on the icon as described above to open up the information menu. Select the left-most metadata tab, and finally click on the icon beside the metadata entry you wish to modify.

This will open up a window similar to the window you would get when adding metadata for the first time, as displayed below. Make the desired changes, then click the blue Add/Edit metadata button to save your changes. The window will close and the system will then reload the right-side menu and display updated metadata.

During your data's life cycle, errors can occur or metadata tags can become irrelevant. For those cases, you can easily delete a metadata tag by selecting the icon once again and clicking on the button next to a metadata entry in the metadata tab in the right-side window.

The system will then display a window to ask you for confirmation of the delete action. Clicking No cancels the action, while clicking Yes deletes the metadata entry you selected. Please make sure you do not delete metadata that should be kept!

Metadata templates

Project managers will see a menu item to create metadata templates. To create a metadata template, you first define the metadata fields and then create a template from these fields.

The DCC can help you with setting up these templates in your research group or project.

Automatic extraction of metadata

What we described above is the functionality of the web portal that lets the user add metadata entries by hand. There are data objects or collections, however, which require a large amount of metadata to be properly described. Such data objects often have the information needed to describe them stored in their file header. The web portal offers the user the possibility to extract the metadata contained in the file header and import it directly into the RDMS. In contrast to manual input, this action generates a single metadata entry for multiple values, similar to what you would get by entering values into a template.

To access this functionality, right-click the file you wish to extract metadata from, then scroll down the menu to Extract metadata. Once you click on the menu item, the system will display the message below to inform you of the formats that are currently supported for automatic metadata extraction.

N.B.: The extract function of the web portal extracts information contained in the file header. If your file does not have a header, no information will be extracted, even when the system reports successfully finishing the extraction.

Below you can find the list of file extensions for which automatic extraction is supported:

  • dcm, dcm30: DICOM (Digital Imaging and Communications in Medicine) image; reference: dicomstandard.org
  • doc, docx: Microsoft Word document; reference: microsoft.com
  • fits, fit: FITS (Flexible Image Transport System) image; reference: fits.gsfc.nasa.gov
  • hdr, nii: NIfTI (Neuroimaging Informatics Technology Initiative) header; reference: nifti.nimh.nih.gov
  • jpeg, jpg: Exif (Exchangeable image file format); reference: en.wikipedia.org/wiki/Exif
  • pdf: PDF (Portable Document Format) file; reference: adobe.com/acrobat/about-adobe-pdf.html
  • ppt, pptx: Microsoft PowerPoint document; reference: microsoft.com
  • tif, tiff: Exif (Exchangeable image file format); reference: en.wikipedia.org/wiki/Exif
  • txt: ASCII text file containing key=value pairs
  • xls, xlsx: Microsoft Excel document; reference: microsoft.com
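For the txt format, the list only states that the file contains key=value pairs. The sketch below shows one way such pairs could be collected into a single set of values. The function name, the sample text, and the parsing rules (one pair per line, surrounding whitespace stripped, lines without "=" skipped) are assumptions made for illustration, not the portal's actual extraction code.

```python
def parse_key_value_text(text):
    """Collect key=value pairs from text, one pair per line.
    Lines without '=' or with an empty key are skipped."""
    pairs = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip():
            pairs[key.strip()] = value.strip()
    return pairs


# Hypothetical file content with two usable pairs and one line of plain prose.
sample = "instrument = microscope_A\nexposure_ms=250\nthis line has no pair\n"
extracted = parse_key_value_text(sample)

# A file without any key=value lines yields nothing, even though parsing "succeeds".
nothing = parse_key_value_text("just some prose\n")

print(extracted)
print(nothing)
```

The second call mirrors the note above: when a file holds no recognisable header information, the extraction finishes without error but produces no metadata.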

If the file type you wish to extract metadata from has a readable file structure and is not present in the list above, we can probably add that file type to the list of supported extensions. Please send an e-mail to RDMS support to let us know that you wish for a file type to be added to the list.