How To Install and Configure ZFS on Ubuntu

The ZFS file system delivers huge capacities, RAID mirroring, and anti-corruption mechanisms right out of the box. Learn how to install Ubuntu with ZFS, and how to create a mirrored pool.
ZFS is an advanced file system that originated at Sun Microsystems for use with their Solaris operating system. Following Oracle’s acquisition of Sun in 2009, ZFS is now under Oracle Corporation ownership.
However, in a typical act of altruism, from 2005 onwards Sun released an open source version of ZFS. Inevitably, this was ported to Linux, where it gained wider exposure. The open source version of ZFS, OpenZFS, is managed and maintained by the OpenZFS project.
ZFS is a high-capacity, fault-tolerant file system. ZFS originally stood for Zettabyte File System. The ZFS architecture is based on 128 bits instead of the more common 64 bits of other file systems. Being able to work with larger numeric values is one of the factors that made ZFS capable of handling zettabytes of storage. To give you an idea of what that means, a zettabyte is a billion terabytes.
Nowadays, ZFS supports file storage of up to 256 zebibytes. A zebibyte (2⁷⁰ bytes) is larger than a zettabyte (10²¹ bytes), but not by an order of magnitude. There’s much more to ZFS than sheer capacity, as mind-boggling as that is. ZFS functions as its own volume manager and RAID controller. It has built-in functions such as true copy on write that protect your data from corruption. It natively combines features such as file system pooling, cloning and copying, and RAID-like functionality.
Ubuntu has offered ZFS for some years now, but always with warnings and caveats. In Ubuntu 20.10 the warnings were removed. Canonical officially supports ZFS but only in full disk setups. To get that support you’ll need to install ZFS as you install Ubuntu. The ZFS options are still tucked away, but they’re there and no longer just for the intrepid or foolhardy.
With Ubuntu 21.10 due in October 2021, it’s a good time to see how the ZFS offering in Ubuntu is maturing.
RELATED: How to Install and Use ZFS on Ubuntu (and Why You’d Want To)
During the Ubuntu install the “Installation Type” screen lets you choose to erase the disk you’re installing Ubuntu on or to do something else. Click the “Advanced Features” button.
Advanced Features button on the Installation Type screen
The “Advanced Features” dialog appears.
ZFS options in the Advanced Features dialog
Select the “Erase Disk and Use ZFS” radio button, and click the “OK” button.
The “Installation Type” screen will display a “ZFS Selected” notification to show that you’ve chosen to use ZFS.
"ZFS selected" notification on the Installation Type scrren
Click the “Continue” button and complete the installation as usual.
If you have several hard drives installed in your computer you’ll be able to choose how you want them to be used by ZFS. Ubuntu will offer a suggested configuration, but you can adjust things to suit yourself.
But what if you add some hard drives once you’ve installed Ubuntu? How do you configure ZFS to use the new storage? That’s what we’ll look at next.
We installed Ubuntu with ZFS on the single hard drive of the test machine we used to research this article. We added two more hard drives, giving the computer three hard drives in total. One hard drive had Ubuntu installed on it, and the two new drives were blank, unformatted, and unmounted.
The first thing we need to do is identify how Ubuntu is referring to the new hard drives. The lsblk command lists all block devices installed in your computer. We can be specific about which columns of output we want to see in the results.

The -o (output) option is followed by a comma-separated list of the columns we want to see.
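The exact column selection is up to you; a set that covers the name, size, file system, device type, and mount point of each device looks like this:

lsblk -o NAME,SIZE,FSTYPE,TYPE,MOUNTPOINT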
There are a bunch of squashfs loopback devices, numbered loop0 through loop6. Each time you install a snap application, one of these pseudo-devices is created. It is part of the encapsulation and sandboxing that snap wraps around each snap application.
The first hard drive is listed as /dev/sda. It’s a 32 GB drive with five partitions on it, listed as /dev/sda1 through /dev/sda5. They’re formatted in different ways. This is the drive that was in the computer when we installed Ubuntu.
Our two new hard drives are listed as /dev/sdb and /dev/sdc. They’re 32 GB drives too, but they’re not formatted and they’re not mounted.
To utilize the new hard drives we add them to a pool. You can add as many drives to a pool as you like. There are two ways to do this. You can configure the pool so that you can use all of the storage space of each hard drive in a RAID 0 configuration, or you can configure them so that the pool only offers the amount of storage space of the smallest hard drive in the pool, in a RAID 1 configuration.
The advantage of RAID 0 is space. But the preferred—and the very highly recommended—configuration is RAID 1. RAID 1 mirrors the data across all the drives in the pool. That means you can have a hard drive failure and the file system and your data are still safe and your computer is still functional. You can replace the stricken drive and add the new drive to your pool.
By contrast, with RAID 0 a single hard drive failure renders your system inoperable until you replace the stricken drive and perform a restore from your backups.
The more drives you have in a RAID 1 pool the more robust it is. The minimum you need for RAID 1 is two drives. A failure in either drive would be an inconvenience, but not a disaster. But a failure of both hard drives at the same time would be a bigger problem, of course. So the answer would appear to be pooling as many hard drives as you can spare.
But of course, in practice, there is a limit to how many drives you’ll want—or can afford to—allocate to a single pool. If you have eight spare hard drives, setting up two four-drive RAID 1 pools is probably a better use of the hardware than a single eight-drive pool. And remember, a RAID 1 pool can only offer the storage of the smallest hard drive in the pool, so always try to use drives of the same size in a single pool.
We’ve identified our new hard drives as /dev/sdb and /dev/sdc. To create a ZFS RAID 1 pool, we use this command:
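With the example names used in this article, that’s:

sudo zpool create cloudsavvyit mirror /dev/sdb /dev/sdc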
The components of the command are the zpool create action that builds the new pool, the name we’re giving the pool (cloudsavvyit), the mirror keyword that gives us the RAID 1 behavior, and the identifiers of the two hard drives that will make up the pool.
Replace “cloudsavvyit” with the name you want to call your pool, and replace /dev/sdb and /dev/sdc with the identifiers of your new hard drives.
Adding the new hard drives to a new RAID 1 pool
Creating a pool is a little anti-climactic. If all goes well you’re unceremoniously returned to the command prompt. We can use the status action with the zpool command to see the status of our new pool.
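For our example pool, that’s:

sudo zpool status cloudsavvyit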
Checking the status of the new pool
Our new pool has been created, it is online, our two new drives are in the pool, and there are no errors. That all looks great. But where is the pool? Let’s see if lsblk will show us where it has been mounted.
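We ran lsblk again, with the same representative set of columns as before:

lsblk -o NAME,SIZE,FSTYPE,TYPE,MOUNTPOINT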
Status of the new hard drives in the ZFS pool
We can see that our new hard drives /dev/sdb and /dev/sdc have been partitioned with two partitions each, but no mount point is listed for them. Pools aren’t mounted like regular hard drives. For example, there’s no entry in the /etc/fstab file for ZFS pools. By default, a mount point is created in the root directory. It has the same name as the pool.
The ZFS pool mount point in the root directory
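If you prefer to check from the command line, the zfs list command is another way to see each dataset in the pool along with where it is mounted:

sudo zfs list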
If you want to have the mount point created somewhere else, use the -m (mount point) option when you’re creating the pool, and provide the path to where you’d like the mount point to be created. You can also give the mount point a different name.
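For example, to create the same mirrored pool with its mount point at /mnt/mypool (the path here is only an illustration), the command would look like this:

sudo zpool create -m /mnt/mypool cloudsavvyit mirror /dev/sdb /dev/sdc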
The pool exists, but only the root user can store data in it. That’s not what we need, of course. We want other users to be able to access the pool.
To achieve this we will create a user group, add our regular user to that group, create a data storage directory in the pool, and give that group ownership of the directory, with the SGID bit set so files created inside it inherit the group.
This scheme provides great flexibility. We can create as many data storage directories as we need, with different groups owning them. Giving users access to the different storage areas is as simple as adding them to the appropriate groups.
We’ll use groupadd to create a user group. Our group is called “csavvy1”. We’ll then use the usermod command to add a user called “dave” to the new group. The -a (append) option adds the new group to the list of existing groups that the user is in. Without this option, the user is removed from all existing groups and added to the new one. That’ll cause problems, so make sure you use the -a option.
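With the example group and user names, the two commands are:

sudo groupadd csavvy1
sudo usermod -a -G csavvy1 dave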
Creating a user group and adding a user to it
So that their new group membership becomes effective, the user must log out and back in again.
Now we’ll create a directory in the pool, called “data1.”
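With the pool mounted at its default location in the root directory, that’s:

sudo mkdir /cloudsavvyit/data1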
The chgrp command lets us set the group owner of the directory.
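We hand the directory to the group we created earlier:

sudo chgrp csavvy1 /cloudsavvyit/data1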
Finally, we’ll set the group permissions using chmod. The “s” is the SGID special bit. It means that files and directories created within the “data1” directory will inherit the group owner of this directory.
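Granting the group read, write, and execute permissions along with the SGID bit looks like this (the exact mode string is one of several equivalent ways to write it):

sudo chmod g+rwxs /cloudsavvyit/data1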
Creating a directory and setting its group ownership and permissions
Our user has logged out and back in. Let’s try to create a file in the new data storage directory in our new RAID 1 ZFS pool.
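Any file name will do for the test; “file1.txt” here is just a placeholder:

touch /cloudsavvyit/data1/file1.txt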
And let’s check that it was created.
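A listing of the directory shows the new file:

ls -lh /cloudsavvyit/data1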
Creating a new file in the data storage directory of the ZFS pool
Success. What if we try to create another file outside of our data1 storage area?
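Trying to create a file in the root of the pool itself, where our user has no write permission, should be refused with a permission denied error (again, the file name is just a placeholder):

touch /cloudsavvyit/file2.txt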
The user cannot create files outside of their allocated data storage area
This fails as expected. Our permissions are working. Our user is only able to manipulate files in the data storage directory that he has been given permission to access.
RELATED: How to Use SUID, SGID, and Sticky Bits on Linux
If you decide you no longer need a pool, you can destroy it. Be careful with this command. Make sure you have backups before you proceed. If you’re sure you really want to, and you’ve verified you have other copies of the data in the pool, you can destroy a pool with this command:
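Using the example pool from this article, the command is:

sudo zpool destroy cloudsavvyit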
Replace “cloudsavvyit” with the name of the pool you’re going to destroy.
If you only have one hard drive, or if your computer has multiple hard drives but their sizes vary too much to form a useful pool, you can still use ZFS. You won’t get RAID mirroring, but the built-in anti-corruption and data protection mechanisms are still worthwhile and persuasive features.
But remember, no file system—with or without RAID mirroring—means you can ignore backups.
RELATED: Backups vs. Redundancy: What’s the Difference?