|--|--------------|--------------|------------------------------|
|Levels| Mirror/1, RAIDZ1/5, RAIDZ2/6, RAIDZ3, dRAID | Mirror/1, RAID5, RAID6 | Mirror/1, RAID5, RAID6 |
|Data checksums in FS| Yes | Yes | probably not |
|Corruption detection for individual files | Yes | Yes | probably not |
|Automatic corruption fix from redundancy data | Yes | only on RAID6 | only on RAID6, can go undetected |
  
## ZFS specific terms

* **Resilvering**: RAID rebuild or drive initialization.
* **RAIDZx**: Number of drives that can fail without data loss. Up to three at the moment.
* **dRAID**: RAID stripes are distributed among the drives, including the hot spare, which leads to faster repair after a disk swap (see the pool creation sketch below).
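As an illustration of these levels, here is a minimal sketch of creating pools with them. The pool name `tank` and the `/dev/sd*` device names are placeholders, and the dRAID example relies on the default data-stripe width:

```bash
# Mirror (RAID1-like): every block is stored on both drives
sudo zpool create tank mirror /dev/sdb /dev/sdc

# RAIDZ2 (RAID6-like): any two of the six drives may fail
sudo zpool create tank raidz2 /dev/sd{b,c,d,e,f,g}

# dRAID2 with one distributed spare: spare capacity is spread over all drives,
# which makes resilvering after a disk swap faster than with a dedicated hot spare
sudo zpool create tank draid2:1s /dev/sd{b,c,d,e,f,g,h,i}
```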
  
## Why no hardware RAID with ZFS on it
  
On Fedora there is a `resource-agents` package that contains an agent for use with ZFS. The same version is available on CentOS Stream 10, but there it does not contain the ZFS agent.
The [packages for Fedora 42 can be rebuilt](https://github.com/simar0at/resource-agents/releases) and work on Rocky Linux 9.

There is an [official repository for ZFS packages](https://zfsonlinux.org/epel) built for EL9 (and EL8). The "stable" option is the ZFS 2.1.x series, which is still "supported" at 2.1.16 as of Dec 2024. TrueNAS, however, is on the 2.2.x series, which sits in the `testing` branch at 2.2.6 as of Dec 2024.
There is also the option to use a prebuilt kernel module for EL 9, but it is not signed for use with EFI Secure Boot. The DKMS build signs its artifacts automatically, though it of course pulls in a lot of dependencies.

To gain additional safety for the data stored on ZFS it is vital to `scrub` all pools regularly. There are timer units for monthly or weekly scheduling shipped with ZFS; a sketch of enabling them follows below.
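Outside the Pacemaker setup below (which starts these timers through `pcs`), the shipped timer units can be enabled directly. A sketch, using the pool name `nfsshare` from later on this page:

```bash
# Enable a monthly scrub of the pool "nfsshare" (there is also zfs-scrub-weekly@.timer)
sudo systemctl enable --now zfs-scrub-monthly@nfsshare.timer

# Check when the last scrub ran and whether it found errors
zpool status nfsshare
```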
To make use of the snapshot and send/receive feature for backups, another piece of software is needed. There are some options, but I will use the part of TrueNAS that is available separately as `zettarepl` and [that can be rebuilt for EL 9](https://github.com/simar0at/zettarepl/releases). Note, however, that the dependencies don't translate correctly.
  
Install instructions:
  
```bash
sudo dnf install -y https://zfsonlinux.org/epel/zfs-release-2-3$(rpm --eval "%{dist}").noarch.rpm
sudo dnf config-manager --enable zfs-testing
```
In the MOK manager, enroll the key using the password you just set.
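After enrolling and booting back into the OS, the state can be checked with `mokutil`. A sketch; the key path `/var/lib/dkms/mok.pub` is an assumption (the default location DKMS uses for its signing key) and may differ on your system:

```bash
# Confirm that Secure Boot is enabled
mokutil --sb-state

# Check whether the DKMS signing key is enrolled
# (key path is an assumption; adjust it to wherever your MOK public key lives)
mokutil --test-key /var/lib/dkms/mok.pub
```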
It is possible to set the size of the [ARC](https://openzfs.readthedocs.io/en/latest/performance-tuning.html#adaptive-replacement-cache) using module parameters in `/etc/modprobe.d/zfs.conf`:
```
options zfs zfs_arc_max=8589934592 zfs_arc_min=8589934592
```
This caps the ARC at 8 GB and keeps it from shrinking below 8 GB once the module is loaded.
If not set, the automatic sizing usually does a good job but will use up to half of the system's memory for the ARC.
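To check which limits the module actually picked up, something like this works (a sketch; `arc_summary` is part of the ZFS userland tools):

```bash
# Current size and configured min/max of the ARC as reported by the kernel module
awk '$1 == "size" || $1 == "c_min" || $1 == "c_max" {print $1, $3}' /proc/spl/kstat/zfs/arcstats

# Or the human-readable report
arc_summary | head -n 40
```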
```bash
sudo modprobe zfs # this will not work on an EFI Secure Boot system without the MOK enrollment
  
# Disable services meant for automatic import on startup -> we do this with pcs
sudo systemctl disable --now zfs-share.service zfs-import-cache.service zfs-mount.service

sudo dnf install -y https://github.com/simar0at/resource-agents/releases/download/4.16.0-1/resource-agents-4.16.0-1.el9.x86_64.rpm

pcs resource create nfsshare ocf:heartbeat:ZFS pool=nfsshare op monitor OCF_CHECK_LEVEL="0" timeout="30s" interval="5s" --group nfsgroup
pcs resource create zfs-scrub-monthly-nfsshare systemd\:zfs-scrub-monthly@nfsshare.timer --group nfsgroup
pcs resource create nfsshare-mirror ocf:heartbeat:ZFS pool=nfsshare-mirror op monitor OCF_CHECK_LEVEL="0" timeout="30s" interval="5s" --group nfsgroup
pcs resource create zfs-scrub-monthly-nfsshare-mirror systemd\:zfs-scrub-monthly@nfsshare-mirror.timer --group nfsgroup
pcs resource group add nfsgroup nfsshare --before clustered-nfs
pcs resource group add nfsgroup zfs-scrub-monthly-nfsshare --after nfsshare
pcs resource group add nfsgroup nfsshare-mirror --before clustered-nfs
pcs resource group add nfsgroup zfs-scrub-monthly-nfsshare-mirror --after nfsshare-mirror
```
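To check that Pacemaker imported the pools and started the scrub timers, a quick look can be taken like this (a sketch; resource names as created above):

```bash
# State of the cluster resources, including the nfsgroup members
sudo pcs status resources

# On the active node the pools should be imported and healthy
zpool list
zpool status -x
```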
  
```bash
sudo dnf install python3-coloredlogs python3-jsonschema python3-isodate python3-croniter python3-paramiko
sudo dnf install -y https://github.com/simar0at/zettarepl/releases/download/24.10.1/python3-zettarepl-24.10.1-2.noarch.rpm
  
sudo vi /etc/systemd/system/zettarepl.service
```
```ini
[Service]
Environment=PYTHONPATH=/usr/lib/python3/dist-packages/
ExecStart=/usr/bin/zettarepl run /nfsshare/config/zettarepl.yaml
```
Note: there is no `[Install]` section; the service will be launched by Pacemaker.
```sh
sudo mkdir /nfsshare/config
sudo nano /nfsshare/config/zettarepl.yaml
```
```yaml
periodic-snapshot-tasks:
  # Each task in zettarepl must have a unique id so it can be referenced
  nfsshare-qh:
    # Dataset to make snapshots of
    dataset: nfsshare
    # You must explicitly specify if you want recursive or non-recursive
    # snapshots
    schedule:
      minute: "*/15"    # Every 15 minutes
  nfsshare-hour:
    dataset: nfsshare
    recursive: true
    #exclude:
    # - nfsshare/xyz
    lifetime: P1D
    #allow-empty: false
      hour: "*"
replication-tasks:
  nfsshare-nfsshare-mirror:
    # Either push or pull
    direction: push
      type: local
    # Source dataset
    source-dataset: nfsshare
    # Target dataset
    target-dataset: nfsshare-mirror
    # "recursive" and "exclude" work exactly like they work for periodic
    # snapshot tasks
    # exclude all child snapshots that your periodic snapshot tasks exclude
    periodic-snapshot-tasks:
      - nfsshare-qh
      - nfsshare-hour
    # If true, replication task will run automatically either after bound
    # periodic snapshot task or on schedule
```

```bash
sudo systemctl daemon-reload
sudo pcs resource create zettarepl systemd\:zettarepl.service
sudo pcs resource group add nfsgroup zettarepl --after nfsshare-mirror
```
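To see whether zettarepl is running and actually creating and replicating snapshots, a check could look like this (a sketch; dataset names as configured above):

```bash
# Service state and recent log output
sudo pcs status resources | grep zettarepl
sudo journalctl -u zettarepl.service --since "1 hour ago"

# Snapshots should show up on the source and, after replication, on the mirror
zfs list -t snapshot -o name,creation nfsshare | tail
zfs list -t snapshot -o name,creation nfsshare-mirror | tail
```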
## Incompatibility between ZFS's handling of NFS and resource-agents

The `ocf:heartbeat:exportfs` agent configures NFS exports but does not write anything to disk. There are good reasons for this, especially in an active/passive failover setup.
ZFS can also carry NFS export settings and will use `exportfs` to reconfigure NFS. It does, however, write to disk and assumes that every export has an on-disk configuration: it calls `exportfs -ra`, which "re-exports" according to the on-disk state.

This combination kills the NFS connections for a short time on certain ZFS operations, such as taking snapshots.

The quick and dirty fix is to make `ocf:heartbeat:exportfs` persist its exports to disk. Hack:
```diff
--- /usr/lib/ocf/resource.d/heartbeat/exportfs.orig     2025-01-18 20:59:11.511427785 +0100
+++ /usr/lib/ocf/resource.d/heartbeat/exportfs  2025-01-18 21:00:19.526826188 +0100
@@ -339,6 +339,7 @@
        fi
 
        ocf_log info "directory $dir exported"
+        cp /var/lib/nfs/etab /etc/exports.d/heartbeat.exports
        return $OCF_SUCCESS
 }
 exportfs_start ()
@@ -403,6 +404,7 @@
 unexport_one() {
        local dir=$1
        ocf_run exportfs -v -u ${OCF_RESKEY_clientspec}:$dir
+        cp /var/lib/nfs/etab /etc/exports.d/heartbeat.exports
 }
 exportfs_stop ()
 {
```
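A rough way to verify the hack, assuming a dataset on the `nfsshare` pool is exported through ZFS's NFS handling as described above: take a snapshot and confirm the export list stays intact.

```bash
# Exports before
sudo exportfs -v

# Trigger the ZFS-side reconfiguration, then compare
sudo zfs snapshot nfsshare@exportfs-test
sudo exportfs -v

# Clean up the test snapshot
sudo zfs destroy nfsshare@exportfs-test
```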
  
## On ZFS virtual device types
  
[Write-up](https://klarasystems.com/articles/openzfs-understanding-zfs-vdev-types/)
* **LOG**: Maybe useful for NFS exports, which produce many synchronous writes. Can be very small (32 GB) but should have very low latency.
* **L2ARC**: Read cache, rarely useful.
* **SPECIAL**: If enabled, it holds all of the pool's metadata and therefore becomes a point of failure. Can also hold small files. Should be an SSD. A sketch of adding these vdev types follows below.
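Purely as an illustration (not part of this setup), adding such vdevs to an existing pool could look like this; the NVMe device names are placeholders, and the special vdev is mirrored because losing it means losing the pool:

```bash
# Separate intent log (SLOG) for synchronous writes, e.g. from NFS clients
sudo zpool add nfsshare log /dev/nvme0n1

# L2ARC read cache
sudo zpool add nfsshare cache /dev/nvme1n1

# Special vdev for metadata (mirrored, because losing it means losing the pool)
sudo zpool add nfsshare special mirror /dev/nvme2n1 /dev/nvme3n1

# Optionally let the special vdev also hold small blocks/files
sudo zfs set special_small_blocks=32K nfsshare
```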