Creating ZFS pools on NetBSD

JuvenalUrbino

■ ZFS & NetBSD

ZFS support became mature and stable enough in NetBSD 9 for safe everyday use. There have nevertheless been non-negligible fixes since 9.0_RELEASE (see the CHANGES files on the NetBSD CDN), so if you plan on using ZFS you should run at least 9.2 (even better, 9_STABLE or HEAD).

Given the limited number of active users reporting bugs, it's reasonable to avoid ZFS on NetBSD in production, at least for the moment.

The current branch has initial support for ZFS root, if you want to experiment with that. The boot sequence involves loading an initial ramdisk and mounting the pool on /altroot, similarly to CGD disks. This is on my todo list, so you may (or may not) see a 'ZFS root' post popping up here in the future.

■ ZFS for storage

I needed some additional storage for my RPi4 server, so I bought a 2 TB Western Digital HDD for a very affordable price. The disk hits a 100 Mb/s write speed through the USB 3.0 bus, which is more than enough compared to my expectations.

I wanted to format the disk as a ZFS pool, have it automatically mounted at /zfs upon boot, and create task-specific datasets inside it.
Turns out this is perfectly feasible.

■ Preparing the disk

First of all, it's advisable to delete any partition(s) and the MBR/GPT table present on the disk, so that you start from scratch for full compatibility.
If you attempt to write a ZFS pool on top of a different kind of partition, you'll end up with a screwed up disk layout.

Most (all?) external HDDs nowadays are sold as GPT disks with a single NTFS partition. After plugging in the disk, gpt show sd0 will identify the underlying filesystem as 'MS Basic Data'. Refer to gpt_uuid.h under src/sbin/gpt for more information on NetBSD's partition UUIDs.

In such a scenario, all we have to do to prepare the disk is:

$ gpt remove -i 1 sd0
$ gpt destroy sd0

1 is the index of the partition to remove, as shown by gpt show. The gpt(8) man page is a fine piece of writing: I recommend always keeping it at hand when messing with partitions on GPT disks.

If you are starting from an MBR disk instead, after deleting its partitions, do:

$ gpt migrate sd0

And now we're ready to create our ZFS disk.

$ gpt create sd0
$ gpt add -a 2m -t zfs -l "zfs-data" sd0
$ gpt show sd0

   start        size  index  contents
       0           1         PMBR
       1           1         Pri GPT header
       2          32         Pri GPT table
      34        4062         Unused
    4096  3906957312      1  GPT part - ZFS
  3906961408        2015         Unused
  3906963423          32         Sec GPT table
  3906963455           1         Sec GPT header

2m is the partition alignment and 'zfs-data' is the label assigned to the partition.
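
As a quick sanity check on the numbers in the gpt show output, here is a small arithmetic sketch. It assumes 512-byte logical sectors (check what your own disk reports); the variable names are just for illustration.

```shell
# Assumption: 512-byte logical sectors.
sector_size=512
align_bytes=$((2 * 1024 * 1024))             # the "2m" passed to gpt add -a
start_sector=$((align_bytes / sector_size))  # first sector on a 2 MiB boundary
echo "first aligned sector: $start_sector"

part_sectors=3906957312                      # size column from gpt show
part_gib=$((part_sectors * sector_size / 1024 / 1024 / 1024))
echo "partition size: ${part_gib} GiB"
```

The first value should match the start column of the ZFS partition (4096), which is why the 4062 sectors before it are left unused.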

The kernel has assigned the dk2 wedge device node to our zfs-data partition, as revealed by dmesg:

[     1.825546] sd0 at scsibus0 target 0 lun 0: <WD, Elements 2621, 1026> disk fixed
[     1.835547] sd0: fabricating a geometry
[     1.835547] sd0: 1862 GB, 1907697 cyl, 64 head, 32 sec, 512 bytes/sect x 3906963456 sectors
[     1.845547] sd0: fabricating a geometry
[     1.845547] sd0: GPT GUID: 397a8a3f-c526-47d6-9f41-54010cfb4a01
[     1.845547] dk2 at sd0: "zfs-data", 3906957312 blocks at 4096, type: zfs
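
The wedge number is not guaranteed to be the same on every machine, so it can be handy to look it up by label rather than by eye. A minimal sketch, assuming the dmesg line format shown above; wedge_for_label is a hypothetical helper name, not a NetBSD tool.

```shell
# Hypothetical helper: print the dkN device whose wedge label is $1,
# reading dmesg-style lines on stdin.
wedge_for_label() {
    sed -n "s/^.*\(dk[0-9][0-9]*\) at [a-z0-9]*: \"$1\",.*\$/\1/p"
}

# Intended usage (on the NetBSD box itself):
#   dmesg | wedge_for_label zfs-data
```

dkctl sd0 listwedges gives the same information straight from the kernel, if you prefer not to parse dmesg.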

■ Enabling ZFS support

First load the driver and make it a default module to load at boot:

$ modload zfs
$ echo zfs >> /etc/modules.conf

Enable the service:

$ echo zfs=YES >> /etc/rc.conf
$ service zfs start
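
One caveat with echo >> is that running it twice leaves duplicate lines in the file. A small sketch of an idempotent variant follows; append_once is a hypothetical helper name, and the commented usage lines need root like the commands above.

```shell
# Hypothetical helper: append line $1 to file $2 only if that exact
# line is not already present.
append_once() {
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Intended usage (as root):
#   append_once zfs /etc/modules.conf
#   append_once zfs=YES /etc/rc.conf
```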

■ Create a ZFS pool

Running:

$ zpool create zfs /dev/dk2

will create a ZFS pool and mount it under /zfs. The mountpoint corresponds to the pool name (in our example 'zfs').

$ zpool status

  pool: zfs
 state: ONLINE
  scan: scrub in progress since Wed Mar 16 12:06:00 2022
        10.8G scanned out of 115G at 36.1M/s, 0h49m to go
        0 repaired, 9.42% done
config:

	NAME        STATE     READ WRITE CKSUM
	zfs         ONLINE       0     0     0
	  dk2       ONLINE       0     0     0

errors: No known data errors
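
If you want to check pool health from a script instead of reading the full report, something like the following may do. It is only a sketch that assumes the "pool:" / "state:" layout shown above; pool_states is a hypothetical helper name.

```shell
# Hypothetical helper: read `zpool status` text on stdin and print
# one "<pool> <state>" pair per pool.
pool_states() {
    awk '$1 == "pool:" { pool = $2 } $1 == "state:" { print pool, $2 }'
}

# Intended usage:
#   zpool status | pool_states
```

For a quick manual check, zpool status -x only reports pools that have problems.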

■ Create a dataset

$ zfs create -o mountpoint=/zfs/data zfs/data

This creates a ZFS dataset named 'data' at the specified mountpoint.

In my case, I now have:

$ zfs list

NAME       USED  AVAIL  REFER  MOUNTPOINT
zfs        115G  1.64T    23K  /zfs
zfs/data  5.23G  1.64T  5.23G  /zfs/data
zfs/p2p    107G  1.64T   107G  /zfs/p2p
zfs/snap  7.81G  1.64T  7.81G  /zfs/snap

$ mount | grep zfs
zfs on /zfs type zfs (local)
zfs/data on /zfs/data type zfs (local)
zfs/p2p on /zfs/p2p type zfs (local)
zfs/snap on /zfs/snap type zfs (local)

So, we're finally ready to write files to our ZFS dataset. The zfs rc.d service will automatically mount the pool and its datasets at their usual mount points at each boot. No need to deal with fstab.

■ Dataset tunables

There are several tunable options to play with. Refer to zfs(8). For a dataset foreseeably hosting large amounts of sensitive files, we could add some redundancy and enable gzip compression and deduplication:

$ zfs set compression=gzip zfs/data
$ zfs set dedup=on zfs/data
$ zfs set copies=2 zfs/data

I won't go into adding flash devices as ZFS Log (LOG vdev) and L2ARC Cache (CACHE vdev) for the moment. Maybe next time 🙂
Remember to scrub your zpool once in a while (suggestion: cron jobs).
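
For the cron suggestion, here is a rough sketch of a wrapper script. The script path and schedule in the comment are made up for illustration, and it assumes zpool status prints "scrub in progress" while a scrub is running, as in the output earlier in this post. cron's PATH may not include the directory holding zpool, so you may need to set PATH or use the full path.

```shell
# Sketch of a scrub wrapper meant to be called from root's crontab, e.g.:
#   0 3 * * 0 /root/zpool-scrub.sh zfs     (hypothetical path and schedule)

# Succeeds if the `zpool status` text on stdin reports a running scrub.
scrub_in_progress() {
    grep -q 'scrub in progress'
}

# Start a scrub on pool $1 unless one is already running.
scrub_pool() {
    if zpool status "$1" | scrub_in_progress; then
        echo "scrub already running on $1"
    else
        zpool scrub "$1"
    fi
}

# Only act when a pool name was given on the command line.
if [ $# -gt 0 ]; then
    scrub_pool "$1"
fi
```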

danboid

JuvenalUrbino Current branch has initial support for ZFS root

Really? I fired up the latest amd64 installer today and I couldn't see ZFS as an option in the installer.

bsduck

danboid It's not yet implemented in the installer, you need to set it up manually, see https://wiki.netbsd.org/wiki/RootOnZFS/

Mosfet

bsduck

IMHO, it should be implemented in the installer, otherwise few(er) people will bother using ZFS.

nia

Mosfet Be the change you want to see in the world!

Mosfet

nia
Is this an "official" proposal? 😊

Jay

http://netbsd0.blogspot.com/2022/05/the-journey-to-zfs-raidz1-with.html
By @abs