This blog is a tutorial on some of the tools and features of HDF5 files. More specifically, we will cover the command-line tools that come with HDF5 and some basic use of h5py, a Python package for working with HDF5 files. You can download a VirtualBox image with the required materials for the exercises from the following link (the username and password are both feeg6003).
Also, you can find the presentation slides here.
If you do not currently have access to HDF5, you can download it for free using MacPorts or from the HDF Group website. Acquiring the h5py module is straightforward if you have an Anaconda distribution installed, and we recommend installing it this way.
Note that this installs the h5py module and the HDF5 tools at the same time.
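Once the installation has finished, a quick way to confirm that Python can see the module is to print the version information that h5py exposes. This is only a minimal check and assumes h5py was installed into the Python environment you are currently running.

```python
import h5py

# Version of the h5py package itself.
h5py_version = h5py.version.version

# Version of the HDF5 library that h5py was built against.
hdf5_version = h5py.version.hdf5_version

print("h5py version:", h5py_version)
print("HDF5 library version:", hdf5_version)
```

If the import fails, the module is not available in the active environment.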
What is HDF5?
HDF5 is a hierarchical file format that allows various types of data and metadata to be stored in the same file. It is structurally very versatile and can also be used in parallel applications.
The three main building blocks of an HDF5 file are groups, datasets and attributes. Groups can contain other groups and datasets, and are used in a similar way to folders in a directory structure. Data is stored in datasets. Attributes contain the metadata and can be attached to either groups or datasets.
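To make the relationship between groups, datasets and attributes concrete, here is a minimal sketch using h5py. The file name, group names, dataset name and attribute values are all made up for illustration, and the file is written to a temporary directory so nothing in your working folder is touched.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file name, placed in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "structure_example.h5")

with h5py.File(path, "w") as f:
    # Groups behave like folders and can be nested.
    experiment = f.create_group("experiment")
    run = experiment.create_group("run_01")

    # Datasets hold the actual data, here six evenly spaced values.
    temperature = run.create_dataset("temperature", data=np.linspace(20.0, 25.0, 6))

    # Attributes hold metadata and can be attached to groups or datasets.
    experiment.attrs["operator"] = "example user"
    temperature.attrs["units"] = "degC"

with h5py.File(path, "r") as f:
    # Objects are addressed with a path, much like files in a directory tree.
    dset = f["experiment/run_01/temperature"]
    group_names = list(f["experiment"].keys())
    values = dset[...]  # read the whole dataset into a NumPy array
    units = dset.attrs["units"]
    operator = f["experiment"].attrs["operator"]

print(group_names, values.shape, units, operator)
```

Note how the dataset is reached with a path-like string, which mirrors the folder analogy used above.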
Part 2: HDF5 and h5py
For those who are familiar with Python, the h5py module provides various commands for the creation and manipulation of HDF5 files. Extensive documentation for this module can be found on this website.
Its features include creating files, reading and writing data, and manipulating the structure within HDF5 files. To use this module, do not forget to import it first with the "import h5py" command.
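As a rough illustration of these features, the sketch below creates a file, overwrites part of a dataset, and then rearranges the file structure. It is not taken from the slides or the exercise notebook; the file name and the object names ("raw", "processed", "scratch", "counts") are hypothetical.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file name, placed in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "manipulation_example.h5")

# Creation: mode "w" makes a new file (overwriting any existing one).
with h5py.File(path, "w") as f:
    raw = f.create_group("raw")
    raw.create_dataset("counts", data=np.arange(10))
    f.create_dataset("scratch", data=np.zeros(3))

# Writing and restructuring: mode "a" opens the file for reading and writing.
with h5py.File(path, "a") as f:
    # Overwrite the first three values of the dataset in place.
    f["raw/counts"][0:3] = [100, 101, 102]

    # Move the dataset into a new group, then delete an unwanted dataset.
    f.create_group("processed")
    f.move("raw/counts", "processed/counts")
    del f["scratch"]

# Reading: mode "r" opens the file read-only.
with h5py.File(path, "r") as f:
    names = []
    f.visit(names.append)  # collect the name of every group and dataset
    first_five = f["processed/counts"][:5]

print(sorted(names))
print(first_five)
```

Opening the file in a with block means it is closed automatically, even if an error occurs part way through.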
See the slides for details on the covered commands.
Exercise 2: Create your own file!
In the second part, you learned about using h5py and some of its features. This exercise is about creating your own HDF5 file. In ~/Documents/ inside the VirtualBox image, you will find an IPython Notebook. Open this notebook from the terminal using the commands
$ cd ~/Documents/
$ ipython notebook exercise2.ipynb
Run through the exercises found in this file.