Lustre is a very popular open-source distributed parallel file system used in high-performance computing. However, in my experience of using it, I could not find good, easy-to-understand documentation.
Prerequisites
- RockyLinux 8.x (CentOS / RHEL)
- ConnectX-5 (or newer InfiniBand Adapter)
- dkms
Pre-Installation
Each Lustre version usually targets a particular kernel version and distro. At the time of writing this blog post, Lustre 2.12.8 was the latest LTS Lustre release available to the public. You can find more information about the kernels and distros supported by Lustre either in the changelog posted in the wiki or on the support matrix page.
I strongly advise running the exact kernel supported by Lustre, down to the patch number.
You can skip to the next part if you have already installed the kernel supported by Lustre.
I use a nifty little plugin for dnf called versionlock, which allows me to freeze a package so that its version is preserved whenever dnf update is run.
You can install versionlock with the following command.
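A minimal sketch of that step, assuming the plugin is packaged as python3-dnf-plugin-versionlock (the name used on EL8 systems) and that you have sudo rights. The function name is my own; wrapping the command in a function lets you review it before running it:

```shell
# Install the dnf versionlock plugin (package name assumed for EL8).
install_versionlock() {
    sudo dnf install -y python3-dnf-plugin-versionlock
}
```

Call install_versionlock once on each node.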
Once you have installed versionlock, you can freeze package versions using versionlock add. For example, I want to freeze my kernel package at 4.18.0-348.2.1.el8_5, which is the version officially supported by Lustre.
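As a sketch, freezing the kernel at that release could look like this. The helper name freeze_kernel is hypothetical, and I am assuming the package spec is simply kernel-<release>:

```shell
# Kernel release named in the text above.
KVER="4.18.0-348.2.1.el8_5"

# Freeze the kernel package at the given release.
freeze_kernel() {
    sudo dnf versionlock add "kernel-$1"
}
```

Run it as freeze_kernel "$KVER".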
Lustre depends on several kernel packages.
Install all the required packages:
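The exact package set depends on your setup. The sketch below assumes the usual five kernel packages (kernel, kernel-core, kernel-modules, kernel-devel, kernel-headers), all pinned to the same release; both the list and the function names are assumptions, so adjust them to what your Lustre build needs:

```shell
# Print one versioned package name per line (the package set is an assumption).
kernel_pkgs() {
    for name in kernel kernel-core kernel-modules kernel-devel kernel-headers; do
        printf '%s-%s\n' "$name" "$1"
    done
}

# Install every package from kernel_pkgs at the given release.
install_kernel_pkgs() {
    # Word splitting of the list is intended here.
    sudo dnf install -y $(kernel_pkgs "$1")
}
```

For the kernel used in this post you would call install_kernel_pkgs 4.18.0-348.2.1.el8_5.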
After installing the packages, I suggest you freeze them to prevent dnf from updating them when a new kernel becomes available. You can freeze the packages using versionlock:
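A possible way to freeze them in one go, again assuming the same five kernel packages as an illustrative set (the function name is hypothetical):

```shell
# Add a versionlock entry for each kernel package at the given release.
freeze_kernel_pkgs() {
    for name in kernel kernel-core kernel-modules kernel-devel kernel-headers; do
        sudo dnf versionlock add "${name}-$1"
    done
}
```

For example: freeze_kernel_pkgs 4.18.0-348.2.1.el8_5.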
Check the currently frozen packages with dnf versionlock list.
You can clear all frozen packages with dnf versionlock clear, or unfreeze a single package with dnf versionlock delete <package name>.
Once you have installed the kernel, reboot your system.
Confirm your kernel version with uname -r.
You are now ready to begin installing Mellanox InfiniBand drivers.
Installing the MOFED drivers
By default, the drivers shipped with the distro are a bit unreliable, and you might need to uninstall them before proceeding. Once done, you can download the official MOFED drivers from the Mellanox website.
Select the Downloads tab and scroll down to see the latest version of MOFED available. Select RHEL/CentOS and then select RHEL/CentOS 8.5. Select x86_64 or the architecture you are running on, and then click on the ISO link. You need to accept the terms and conditions before downloading.
Save the file somewhere you can access later. For this example we have downloaded MLNX_OFED_LINUX-5.5-1.0.3.2-rhel8.5-x86_64.iso.
The above steps might differ from user to user. Please change accordingly.
Create a temporary mount point /mnt and mount the ISO file.
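A sketch of the mount step, assuming a read-only loop mount is enough for running the installer. The function name and its optional second argument are my own additions:

```shell
# Mount the MOFED ISO read-only on a mount point (defaults to /mnt).
mount_mofed_iso() {
    iso="$1"
    mnt="${2:-/mnt}"
    sudo mkdir -p "$mnt" && sudo mount -o ro,loop "$iso" "$mnt"
}
```

With the file from this post: mount_mofed_iso MLNX_OFED_LINUX-5.5-1.0.3.2-rhel8.5-x86_64.iso.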
Install the InfiniBand drivers:
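Roughly, the installer call could look like the sketch below. It assumes the ISO is mounted at /mnt and uses the installer's --distro option to name the distro explicitly; the function name is hypothetical, and your setup may need further installer options that are not shown here:

```shell
# Run the installer from the mounted ISO, telling it to treat the OS as RHEL 8.5.
install_mofed() {
    mnt="${1:-/mnt}"
    sudo "$mnt/mlnxofedinstall" --distro rhel8.5
}
```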
Note that I specified the distro as rhel8.5. MOFED drivers don’t support RockyLinux by default. Since RockyLinux is designed to be 1:1 bug-for-bug compatible with Red Hat Enterprise Linux (RHEL), you can force the installer to assume the distro is RHEL.
You might need to install some dependency packages before the installer can proceed. The installer will print the command needed to install them.
Once the installation has finished, reboot the system to load the InfiniBand drivers.
Configuring InfiniBand for IPoIB
Run ibstat to check the state of your InfiniBand adapter.
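ibstat prints many fields per port. A small filter like the one below (a sketch of my own, not part of the InfiniBand tools) keeps only the port headers and the two state lines, which is usually all you need at this point:

```shell
# Keep only the "Port N:", "State:" and "Physical state:" lines from ibstat output.
ib_port_states() {
    grep -E '^[[:space:]]*(Port [0-9]+:|State:|Physical state:)'
}
```

Use it as ibstat | ib_port_states.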
You can see that the port is stuck in the Initializing state, which means no subnet manager has configured it yet. Enable opensm (Subnet Manager) so that the link comes up and the port state changes to Active.
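A sketch for enabling the subnet manager with systemd. The unit name is an assumption: MOFED installations typically ship it as opensmd, while the distro package calls it opensm, so pass whichever exists on your node:

```shell
# Enable and start the subnet manager service (unit name defaults to opensmd).
enable_subnet_manager() {
    unit="${1:-opensmd}"
    sudo systemctl enable --now "$unit"
}
```

Run ibstat again afterwards to confirm that the port state has changed.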
You now need to configure the InfiniBand interface like a typical Ethernet interface.
You can use nmtui (NetworkManager) to configure the interface (usually called ib0).
Configure a static or dynamic IP for your InfiniBand adapter.
You may notice a parameter called Transport Mode. Mellanox recommends Datagram Mode for better scalability and performance and defaults to it (except for Connect-IB cards).
You may read more about it in the official Mellanox documentation and Linux Kernel documentation.
For the sake of simplicity we will keep the default settings. However, it is worth investigating the other option when optimizing the performance of your hardware.
Once done, you can verify that the IP has been assigned properly with:
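One way to check, assuming the interface is called ib0 and carries an IPv4 address (the function name is my own):

```shell
# Show the IPv4 address assigned to the InfiniBand interface (defaults to ib0).
show_ib_addr() {
    ip -4 addr show dev "${1:-ib0}"
}
```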
Once verified, we can move on to the installation of Lustre.
Installing Lustre Client
Lustre requires the Extra Packages for Enterprise Linux (EPEL) repository to be enabled because it depends on the dkms package.
Install the EPEL repository:
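On RockyLinux this is usually a single package. The sketch assumes the epel-release package is available from the enabled repositories; the function name is hypothetical:

```shell
# Enable EPEL by installing its release package.
install_epel() {
    sudo dnf install -y epel-release
}
```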
Alternatively, you can install a binary kernel module (kmod), in which case you can skip the installation of dkms.
Now we need to add the repository containing Lustre packages.
Using nano or any editor, create a file with the following content:
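As an illustration only, a repo definition could be generated like this. The section id, the gpgcheck=0 setting and the file name lustre-client.repo are my assumptions, and the base URL is deliberately left as an argument: take it from the Lustre download site for your release and distro.

```shell
# Print a dnf repo definition for the Lustre client packages.
# $1: base URL of the client repository (you must supply this).
lustre_repo_body() {
    printf '[lustre-client]\n'
    printf 'name=Lustre Client\n'
    printf 'baseurl=%s\n' "$1"
    printf 'enabled=1\n'
    printf 'gpgcheck=0\n'   # assumption: packages are not GPG-signed
}
```

You could then write it out with lustre_repo_body "$LUSTRE_REPO_URL" | sudo tee /etc/yum.repos.d/lustre-client.repo, where LUSTRE_REPO_URL is a variable you set yourself.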
Clear your dnf cache and update the repository metadata:
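A short sketch using the standard dnf subcommands for this (the function name is my own):

```shell
# Drop cached metadata and packages, then rebuild the metadata cache.
refresh_dnf_metadata() {
    sudo dnf clean all && sudo dnf makecache
}
```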
Install Lustre Client along with DKMS kernel module (Recommended)
I recommend this way since it ensures that the Lustre module is built and installed properly.
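Assuming the repository provides the usual lustre-client and lustre-client-dkms packages, the DKMS route could be sketched as follows (hypothetical function name):

```shell
# Install the Lustre client tools together with the DKMS-built kernel module.
install_lustre_client_dkms() {
    sudo dnf install -y lustre-client-dkms lustre-client
}
```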
Install Lustre Client along with its binary kernel module (Alternative)
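For the binary module route, a sketch under the assumption that the prebuilt module is packaged as kmod-lustre-client and matches your running kernel (hypothetical function name):

```shell
# Install the Lustre client tools together with the prebuilt kernel module.
install_lustre_client_kmod() {
    sudo dnf install -y kmod-lustre-client lustre-client
}
```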
Once done, we need to configure the Lustre Network (LNet). This step is required because LNet is the network layer Lustre uses to route metadata and file I/O.
There are two ways to write the configuration for LNet. We shall create a static configuration. With Lustre version 2.7.0 and above, you can also define the routing dynamically using a utility called lnetctl.
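For comparison, the dynamic route might look roughly like the sketch below. The network name o2ib0 and interface ib0 are assumptions matching the rest of this post, and the function name is my own. As far as I know, settings made this way are not persistent by themselves, which is one reason this post continues with the static file.

```shell
# Load LNet, bring it up, and attach an o2ib network to an InfiniBand interface.
configure_lnet_dynamic() {
    net="${1:-o2ib0}"
    iface="${2:-ib0}"
    sudo modprobe lnet &&
        sudo lnetctl lnet configure &&
        sudo lnetctl net add --net "$net" --if "$iface" &&
        sudo lnetctl net show
}
```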
Create a lustre.conf modprobe file:
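A minimal sketch of the file's single line, assuming the LNet network is named o2ib0 and the file lives in /etc/modprobe.d (the helper name is hypothetical):

```shell
# Print the modprobe options line that binds an LNet network to an interface.
# $1: LNet network name, $2: interface name.
lnet_modprobe_line() {
    printf 'options lnet networks="%s(%s)"\n' "$1" "$2"
}
```

Write it out with lnet_modprobe_line o2ib0 ib0 | sudo tee /etc/modprobe.d/lustre.conf.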
Here the InfiniBand interface we are using is ib0.
For multi-rail setup:
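For multi-rail, one common form lists several interfaces under the same LNet network. The sketch below assumes a second interface called ib1 and a network named o2ib0; both names, like the helper itself, are illustrative:

```shell
# Print a modprobe options line with several interfaces on one LNet network.
# $1: LNet network name, remaining arguments: interface names.
lnet_multirail_line() {
    net="$1"
    shift
    # Join the interface names with commas.
    ifaces="$(IFS=,; printf '%s' "$*")"
    printf 'options lnet networks="%s(%s)"\n' "$net" "$ifaces"
}
```

For two rails: lnet_multirail_line o2ib0 ib0 ib1 | sudo tee /etc/modprobe.d/lustre.conf.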
Once done reboot the machine.
Mounting Lustre
Let’s say you have a scratch filesystem you created on a Lustre server and would like to mount it at /scratch:
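A sketch of the mount, where the MGS NID (for example 10.0.0.1@o2ib0) is a made-up address you need to replace with your own server's, and the function name is hypothetical:

```shell
# Mount a Lustre filesystem.
# $1: MGS NID, $2: filesystem name, $3: local mount point.
mount_lustre() {
    mgs_nid="$1"
    fsname="$2"
    mnt="$3"
    sudo mkdir -p "$mnt" && sudo mount -t lustre "${mgs_nid}:/${fsname}" "$mnt"
}
```

For this example: mount_lustre 10.0.0.1@o2ib0 scratch /scratch.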
If you have multiple MGS nodes, you can specify the primary, secondary, and other MGS nodes for Lustre to connect to.
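The MGS NIDs are joined with colons in front of the filesystem name. A small helper (my own, with made-up addresses in the usage line) that builds such a mount source:

```shell
# Build a Lustre mount source from a filesystem name and one or more MGS NIDs.
# $1: filesystem name, remaining arguments: MGS NIDs in failover order.
lustre_mount_spec() {
    fsname="$1"
    shift
    # Join the NIDs with colons.
    nids="$(IFS=:; printf '%s' "$*")"
    printf '%s:/%s\n' "$nids" "$fsname"
}
```

Usage: sudo mount -t lustre "$(lustre_mount_spec scratch 10.0.0.1@o2ib0 10.0.0.2@o2ib0)" /scratch.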
I recommend the flock option while mounting Lustre filesystems. This enables support for coherent POSIX file locks on open files. This is the default mode from Lustre 2.13 and above.
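Passing the option explicitly could be sketched like this (the function name is hypothetical; the first argument is the mount source and the second the mount point):

```shell
# Mount a Lustre filesystem with the flock option enabled.
# $1: mount source (MGS NIDs and filesystem name), $2: local mount point.
mount_lustre_flock() {
    sudo mount -t lustre -o flock "$1" "$2"
}
```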
Verify mount points with:
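Two checks I find useful here, sketched under the assumption that findmnt (from util-linux) and the lfs tool from the Lustre client package are installed; the function name is my own:

```shell
# List mounted Lustre filesystems, then show per-target usage.
show_lustre_mounts() {
    findmnt -t lustre
    lfs df -h
}
```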
Congrats! You have successfully installed, configured, and mounted Lustre on your client nodes.
If you want to read further I suggest going through the Lustre documentation.