## Unraid User-Share vs. Disk-Share SMB Performance

This is yet another post on Unraid’s poor SMB performance, but I think I have narrowed the cause of the problem down to the Unraid FUSE filesystem. I discovered this about 2 months ago, but with COVID-19 and no weekend kids’ sporting event duties, I have some time to post.

In this round of testing I compared the performance of “User” shares vs. “Disk” shares. An Unraid “User” share is a volume backed by Unraid’s FUSE filesystem, while a “Disk” share is a volume directly backed by the disk’s native filesystem.

Per suggestions from Unraid I also tested enabling DirectIO and disabling SMB case sensitivity.

As before, I used my DiskSpeedTest utility to automate the testing.

The case-insensitive SMB and DirectIO options made no discernible difference.

But we can see that disk share performance is nearly identical to what we get from Ubuntu. This points to the Unraid FUSE code, which sits in the IO path of every user share, as the cause of the performance problem.

One may expect some performance degradation from the FUSE code performing disk parity operations, but this level of impact is unacceptable compared to other software-based RAID systems. Worse, the test was performed on the SSD cache volume, where no parity computation is required.

The Unraid FUSE code is proprietary, so code inspection is not possible, but I suspect the code path is less than optimized. In my experience the performance and quality demands of filesystem code require extremely competent and diligent developers. Beyond the obvious performance degradation, I’ll offer two other examples of questionable code behavior:

1. All IO is halted while waiting for a disk to spin up, even if the spinning-up disk has nothing to do with servicing IO backed by another disk. This suggests an overly simplified locking or synchronization model, instead of locking scoped to the IO path.
2. The cache volume is not backed by parity, yet IO performance is still severely degraded. This directly shows the degradation is caused by code, not IO, and could be avoided by direct IO passthrough, or file handle remapping as done in overlay filesystems.

But I’m really just speculating; other than observation I have no substantiation.

I really do like the flexibility of Unraid as an all-in-one storage plus docker plus virtualization host. But the “proprietary” Unraid RAID implementation is proving to be the weakest link, not just in performance but also in being limited to 28 data + 2 parity drives. I am leaning towards adding my support to the growing number of users who would like to see native ZFS support in Unraid.

Unfortunately still no word from Unraid as to a performance fix.

## Unraid vs. Ubuntu Bare Metal SMB Performance

In my last test I compared Unraid SMB performance with an Ubuntu VM running on Unraid, and Ubuntu outperformed Unraid. I was wondering if the VM disk image synthetically improved performance, perhaps through IO caching, so this time I tested Ubuntu on the same bare metal hardware that runs Unraid.

I configured the system to boot from either the Unraid USB stick, or an Ubuntu Server USB stick. In both cases the hardware was exactly the same, and the SMB share was on the same 4 x 1TB Samsung 860 Pro SSD BTRFS volume. I mounted the BTRFS volume using the same mount options that Unraid uses. The Ubuntu Samba server used default options; the only change I made was to register the share.

Samba config:

```ini
[cstshare]
comment = Samba on Ubuntu
path = /mnt/cache/CacheSpeedTest
browsable = yes
```

BTRFS mount:

```shell
mount -t btrfs -o noatime,nodiratime -U 89d1ad3a-83f3-4086-9006-5f0931370d36 /mnt/cache
```

I ran the same tests as before, and the results again showed that the Unraid SMB ReadWrite and Write performance is much worse than Ubuntu’s. It was interesting to note that the Ubuntu ReadWrite performance was higher than the theoretical 1Gbps limit at 1MB and 2MB block sizes. I re-tested twice and got the same results; my assumption is that the DiskSpd options to disable local and remote caching were not fully effective.
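As a sanity check on those above-wire-speed numbers: the ceiling of a 1Gbps link is easy to compute, and anything sustained above it has to be coming from a cache somewhere. (The 94% payload efficiency figure below is a rough rule-of-thumb assumption, not a measurement.)

```python
# Theoretical payload ceiling of a 1Gbps link.
# Sustained throughput above this implies caching somewhere in the path.
link_bps = 1_000_000_000                    # 1 Gbps line rate
raw_limit_mbs = link_bps / 8 / 1_000_000    # 125.0 MB/s, ignoring protocol overhead

# Ethernet + IP + TCP + SMB framing eats a few percent; ~94% payload
# efficiency is a common rule of thumb for large frames (assumption).
practical_limit_mbs = raw_limit_mbs * 0.94

print(f"raw: {raw_limit_mbs:.1f} MB/s, practical: ~{practical_limit_mbs:.0f} MB/s")
```

So any ReadWrite result meaningfully above roughly 117 MB/s over a 1Gbps link is being served from memory, not the wire.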

I have now tested Unraid vs. W2K19 VM, Ubuntu VM, and now Ubuntu Bare Metal, and Unraid ReadWrite and Write performance is always abysmal.

I have again reported my findings in the Unraid SMB performance issue thread, and we continue to wait for a fix.

## Unraid vs. Ubuntu SMB Performance

In my last round of testing I found that Unraid v6.8 SMB still underperforms compared to Windows Server 2019, but I was wondering if it is a Linux Samba problem, or an Unraid problem.

I installed an Ubuntu Server 18.04.3 LTS VM on Unraid: bridged network, 16GB RAM, and a 128GB raw disk located on the BTRFS cache volume, which consists of 4 x Samsung 860 Pro SSD drives. This is exactly the same configuration I use for the W2K19 test VM. I installed Samba on Ubuntu using default options.

I created an SMB share backed by the VM disk image, and a second share mapped directly to an Unraid share located on the cache volume. For both shares the Ubuntu VM and Samba server handle the SMB network traffic, but the first share writes to the Ubuntu EXT4 volume backed by the VM disk image, and the second writes through to the underlying Unraid BTRFS cache volume using VirtFS.

I ran a series of tests using my DiskSpeedTest utility, and the results are below.

Note that the VirtFS mapped share exhibited some problems that appear to be caching related, e.g. the file iteration test would create 14000 files, but iterating over the just-created files would read back only 3080.

My conclusion is that the Linux Samba SMB performance is on par with that of Windows Server 2019, and that the performance problems are attributable to Unraid’s file write performance. The Windows test used NTFS and Ubuntu used EXT4, so it could be BTRFS or XFS related, but it is more likely something Unraid does. Maybe the next step could be to test a bare metal Ubuntu SMB server on XFS and BTRFS.

## Unraid SMB Performance: v6.7.2 vs. v6.8.1

I previously wrote about the poor SMB performance I experienced in Unraid v6.7.2. Unraid v6.8 supposedly addressed SMB performance issues for concurrent read and write operations, and after waiting for the first bugfix release of v6.8, I re-tested using v6.8.1.

In my last test I used a combination of batch files and copy and paste; this time I wrote a tool to make repeat testing easy. I am not going to describe the usage here, see the instructions at the GitHub repository. There are new reports in v6.8 of poor SMB performance when a folder contains large numbers of files, so I added a test to try to simulate that behavior, by creating a large number of files, then reading each file, then deleting each file.
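The create/read/delete pattern is roughly the following (a minimal Python sketch to illustrate the test shape, not the actual DiskSpeedTest code; the counts and sizes here are made up):

```python
import os
import tempfile
import time

def churn_test(path, count=1000, size=1024):
    """Create `count` small files, read them all back, then delete
    them, timing each phase. A minimal stand-in for the many-files
    test described above (the real DiskSpeedTest tool does more)."""
    data = b"x" * size
    timings = {}

    t0 = time.perf_counter()
    for i in range(count):
        with open(os.path.join(path, f"f{i:06d}.dat"), "wb") as f:
            f.write(data)
    timings["create"] = time.perf_counter() - t0

    t0 = time.perf_counter()
    names = os.listdir(path)          # iterate what was just created
    for name in names:
        with open(os.path.join(path, name), "rb") as f:
            f.read()
    timings["read"] = time.perf_counter() - t0

    t0 = time.perf_counter()
    for name in names:
        os.remove(os.path.join(path, name))
    timings["delete"] = time.perf_counter() - t0

    return len(names), timings

with tempfile.TemporaryDirectory() as d:
    n, t = churn_test(d, count=200)
    print(n, {k: round(v, 3) for k, v in t.items()})
```

Pointing `path` at an SMB-mounted share instead of a local directory is what makes the per-file metadata round-trips, and thus the slowdown, visible.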

I configured my Unraid server with three test SMB shares: one pointing to the cache, one pointing to a single spinning disk, and one backed by a Windows Server 2019 VM running on the cache disk. The cache consists of 4 x 1TB Samsung 860 Pro drives in a BTRFS volume, and the spinning disk is a Seagate IronWolf 12TB disk formatted XFS, protected by a single similar model parity disk.

I upgraded the server from v6.7.2 to v6.8.1, verified operation, and then restored it back to v6.7.2. I ran the first set of tests with v6.7.2, upgraded to v6.8.1, and re-ran the same set of tests. Both tests used exactly the same hardware configuration and environment, and were run back to back.

Here are the results in graph form:

What did we learn?

• Windows Server 2019 SMB performance is still far superior to Unraid’s.
• I don’t know if the Linux SMB implementation is just that much slower than Windows, or if the performance degradation is attributable to Unraid.
• TODO: Test SMB performance between a Linux VM and Windows VM.
• The cache performance in v6.8.1 is worse than in v6.7.2.
• There is no noticeable SMB performance improvement in v6.8.1.

## Unraid repeat parity errors on reboot

This post started as a quick experiment, but after hardware incompatibilities forced me to swap SSD drives, and I subsequently lost a data volume, it turned into a much bigger effort.

My two Unraid servers have been running nonstop without any issues for many months; last I looked, the uptime on v6.7.2 was around 240 days. We recently experienced an extended power failure, and after the servers were restarted I noticed 5 parity errors on both servers.

```
Jan 1 06:09:23 Server-1 kernel: md: recovery thread: PQ corrected, sector=1962934168
Jan 1 06:09:23 Server-1 kernel: md: recovery thread: PQ corrected, sector=1962934176
Jan 1 06:09:23 Server-1 kernel: md: recovery thread: PQ corrected, sector=1962934184
Jan 1 06:09:23 Server-1 kernel: md: recovery thread: PQ corrected, sector=1962934192
Jan 1 06:09:23 Server-1 kernel: md: recovery thread: PQ corrected, sector=1962934200

Jan 1 04:42:39 Server-2 kernel: md: recovery thread: P corrected, sector=1962934168
Jan 1 04:42:39 Server-2 kernel: md: recovery thread: P corrected, sector=1962934176
Jan 1 04:42:39 Server-2 kernel: md: recovery thread: P corrected, sector=1962934184
Jan 1 04:42:39 Server-2 kernel: md: recovery thread: P corrected, sector=1962934192
Jan 1 04:42:39 Server-2 kernel: md: recovery thread: P corrected, sector=1962934200
```

I initially suspected that a dirty shutdown caused the corruption, but my entire rack is on a large UPS, and the servers are configured, and tested, to cleanly shut down on a low battery condition. Unfortunately Unraid does not persist logs across reboots, so it was impossible to verify the shutdown behavior from logs. Unraid logs to memory rather than to the USB flash drive to prevent flash wear, but I think this needs to be at least configurable: no logs means troubleshooting after an unexpected reboot is near impossible. Yes, I know I can enable the Unraid syslog server, and I can redirect syslog to write to the flash drive, but syslog is not as reliable or complete as native logging, especially during a shutdown scenario. More importantly, syslog was not enabled, so there were no shutdown logs.

I could not entirely rule out a dirty shutdown, but I could test a clean reboot scenario. I restarted from within Unraid, ran a parity check, and the exact same 5 parity errors were back; I ran the parity check again, and it came back clean. It takes more than a day to run a single parity check, so this is a cumbersome and time-consuming exercise. It is very suspicious that it is exactly the same 5 sectors, every time.

```
Jan  3 10:03:07 Server-2 kernel: md: recovery thread: P corrected, sector=1962934168
Jan  3 10:03:07 Server-2 kernel: md: recovery thread: P corrected, sector=1962934176
Jan  3 10:03:07 Server-2 kernel: md: recovery thread: P corrected, sector=1962934184
Jan  3 10:03:07 Server-2 kernel: md: recovery thread: P corrected, sector=1962934192
Jan  3 10:03:07 Server-2 kernel: md: recovery thread: P corrected, sector=1962934200
```

I searched the Unraid forums, and I found other reports of similar repeat parity errors, in some instances attributed to a Marvell chipset, a Supermicro AOC-SASLP-MV8 controller, or the SASLP2 driver. My systems use Adaptec RAID cards, 7805Q SAS2 and 81605ZQ SAS3, in HBA mode, so no Marvell chipset and no SASLP2 driver, but the same symptoms.

An all too common forum reply to storage problems is to switch to an LSI HBA, and I got the same reply when I reported the parity problem with my Adaptec hardware.

I was sceptical: causation vs. correlation. As an example, take the SQLite corruption bug introduced in v6.7: for the longest time it was blamed on hardware or 3rd party apps, but it eventually turned out to be an Unraid bug.

Arguing my case on a community support forum is not productive, and I just want the parity problem resolved, so I decided to switch to LSI HBA cards. I really do have a love-hate relationship with community support, especially when I pay for a product, like Unraid or Plex Pass, but have no avenue to dedicated support.

I am no stranger to LSI cards, and the problems flashing from IR to IT mode firmware, so I got my LSI cards preflashed with the latest IT mode firmware at the Art of Server eBay store. My systems are wired with miniSAS HD SFF-8643 cables, and the only cards offered with miniSAS HD ports were LSI SAS9340-8i ServeRAID M1215 cards. I know the RAID hardware is overkill when using IT mode, and maybe I should have gone for vanilla LSI SAS 9300-8i cards, especially when the Unraid community was quick to comment that a 9340 is not a “true” HBA.

I replaced the 7805Q with the SAS9340 in Server-2, and noticed that none of my SSD drives showed up in the LSI BIOS utility; only the spinning disks showed up. I put the 7805Q card back, and all the drives, including the SSDs, showed up in the Adaptec BIOS utility. I replaced the 81605ZQ with the SAS9340 in Server-1, and this time some of the SSDs showed up: none of my Samsung 840 EVOs, but the Samsung 850 Pro and 860 Pro drives did. I again replaced the 7805Q in Server-2 with the SAS9340, but this time I added a Samsung 850 Pro, and it did show up.

The problem seemed to be limited to my Samsung EVO drives. I reached out to Art of Server for help, and although he was very responsive, he had not seen or heard of this problem. I looked at the LSI hardware compatibility list, and the EVO drives were listed. Some more searching turned up an LSI KB article about TRIM problems with Samsung 850 Pro drives. It seems that the LSI HBAs need drives whose TRIM implements DRAT (Deterministic Read After TRIM, reported as “Data Set Management TRIM supported (limit 8 blocks)”) and RZAT (Deterministic Read Zeros After TRIM). The Wikipedia article on TRIM lists specific drives with faulty TRIM implementations, including the Samsung 840 and 850 (without specifying Pro or EVO), and the Linux kernel has special handling for Samsung 840 and 850 drives.

```c
/* devices that don't properly handle queued TRIM commands */
{ "Micron_M500IT_*",		"MU01",	ATA_HORKAGE_NO_NCQ_TRIM |
					ATA_HORKAGE_ZERO_AFTER_TRIM, },
{ "Micron_M500_*",		NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
					ATA_HORKAGE_ZERO_AFTER_TRIM, },
{ "Crucial_CT*M500*",		NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
					ATA_HORKAGE_ZERO_AFTER_TRIM, },
{ "Micron_M5[15]0_*",		"MU01",	ATA_HORKAGE_NO_NCQ_TRIM |
					ATA_HORKAGE_ZERO_AFTER_TRIM, },
{ "Crucial_CT*M550*",		"MU01",	ATA_HORKAGE_NO_NCQ_TRIM |
					ATA_HORKAGE_ZERO_AFTER_TRIM, },
{ "Crucial_CT*MX100*",		"MU01",	ATA_HORKAGE_NO_NCQ_TRIM |
					ATA_HORKAGE_ZERO_AFTER_TRIM, },
{ "Samsung SSD 840*",		NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
					ATA_HORKAGE_ZERO_AFTER_TRIM, },
{ "Samsung SSD 850*",		NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
					ATA_HORKAGE_ZERO_AFTER_TRIM, },
{ "FCCT*M500*",			NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
					ATA_HORKAGE_ZERO_AFTER_TRIM, },
```

This is all still circumstantial, as it does not explain why the LSI controller would not show the 840 EVO drives but would show the 850 Pro drive, when both are listed as problematic, and both are included on the LSI hardware compatibility list. I do not have 850 EVOs to test with, so I cannot confirm whether the problem is limited to the 840 EVO.

I still had the original parity problem to deal with, and to verify that an LSI HBA would resolve it I needed a working Unraid system with an LSI HBA. Server-1 had two 840 EVOs, an 850 Pro, and an 860 Pro for the BTRFS cache volume. I pulled an 850 Pro and an 860 Pro drive from another system, and proceeded to replace the two 840 EVOs. Per the Unraid FAQ, I should be able to replace the drives one at a time, waiting for the BTRFS volume to rebuild. I replaced the first disk, and it took about a day to rebuild. I replaced the second disk using the same procedure, but something went wrong, and my cache volume would not mount, reporting corruption.

```
Jan  6 07:25:41 Server-1 kernel: BTRFS info (device sdf1): allowing degraded mounts
Jan  6 07:25:41 Server-1 kernel: BTRFS info (device sdf1): disk space caching is enabled
Jan  6 07:25:41 Server-1 kernel: BTRFS info (device sdf1): has skinny extents
Jan  6 07:25:41 Server-1 kernel: BTRFS warning (device sdf1): devid 4 uuid 94867179-94ed-4580-ace4-f026694623f6 is missing
Jan  6 07:25:41 Server-1 kernel: BTRFS error (device sdf1): failed to verify dev extents against chunks: -5
Jan  6 07:25:41 Server-1 root: mount: /mnt/cache: wrong fs type, bad option, bad superblock on /dev/sdr1, missing codepage or helper program, or other error.
Jan  6 07:25:41 Server-1 emhttpd: shcmd (7033): exit status: 32
Jan  6 07:25:41 Server-1 emhttpd: /mnt/cache mount error: No file system
Jan  6 07:25:41 Server-1 emhttpd: shcmd (7034): umount /mnt/cache
Jan  6 07:25:41 Server-1 kernel: BTRFS error (device sdf1): open_ctree failed
Jan  6 07:25:41 Server-1 root: umount: /mnt/cache: not mounted.
Jan  6 07:25:41 Server-1 emhttpd: shcmd (7034): exit status: 32
Jan  6 07:25:41 Server-1 emhttpd: shcmd (7035): rmdir /mnt/cache
```

In retrospect I should have known something was wrong when Unraid reported the array as stopped while I still saw lots of disk activity on the SSD drive bay lights. I suspect the BTRFS rebuild was still ongoing, or the volume still mounted, even though Unraid reported the array stopped. No problem, I thought: I make daily data backups to Backblaze B2 using Duplicacy, and weekly Unraid (appdata and docker) backups that are then backed up to B2. I recreated the cache volume and got the server started again, but my Unraid data backups were missing.

It was an oversight and a configuration mistake: I had configured my backup share to be cached, ran daily backups of the backup share to B2 at 2am, and weekly Unraid backups to the backup share on Mondays at 3am. The last B2 backup ran Monday morning at 2am; the last Unraid backup ran Monday morning at 3am. When the cache died, all data on the cache was lost, including the last Unraid backup, which never made it to B2. My last recoverable Unraid backup on B2 was a week old.

So a few key lessons: do not use the cache for backup storage, schedule offsite backups to run after onsite backups complete, and if the lights are still blinking, don’t pull the disk.

Once I had all the drives installed, I tested for TRIM support.

Samsung Pro 860, supports DRAT and RZAT:

```
root@Server-1:/mnt# hdparm -I /dev/sdf | grep TRIM
   * Data Set Management TRIM supported (limit 8 blocks)
   * Deterministic read ZEROs after TRIM
```

Samsung Pro 850, supports DRAT:

```
root@Server-2:~# hdparm -I /dev/sdf | grep TRIM
   * Data Set Management TRIM supported (limit 8 blocks)
```

Samsung EVO 840, supports DRAT, but does not work with the LSI HBA:

```
root@Server-2:~# hdparm -I /dev/sdc | grep TRIM
   * Data Set Management TRIM supported (limit 8 blocks)
```
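To classify drives quickly, the `hdparm -I` output can be checked with a small script. This is a sketch of my own; the matched strings are taken from the captures above and may differ between hdparm versions, so treat them as assumptions.

```python
def trim_support(hdparm_output: str) -> dict:
    """Classify TRIM support from `hdparm -I | grep TRIM` output.

    The 'Data Set Management TRIM supported' line indicates TRIM
    support, and 'Deterministic read ZEROs after TRIM' indicates RZAT.
    String matching is based on the hdparm output captured on my
    drives; other hdparm versions may word this differently.
    """
    return {
        "trim": "Data Set Management TRIM supported" in hdparm_output,
        "rzat": "Deterministic read ZEROs after TRIM" in hdparm_output,
    }

pro_860 = """\
* Data Set Management TRIM supported (limit 8 blocks)
* Deterministic read ZEROs after TRIM"""
pro_850 = "* Data Set Management TRIM supported (limit 8 blocks)"

print(trim_support(pro_860))  # {'trim': True, 'rzat': True}
print(trim_support(pro_850))  # {'trim': True, 'rzat': False}
```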

The BTRFS volume consisting of 4 x 860 Pro drives reported trimming what looks like all disks, 3.2 TiB:

```
root@Server-1:~# fstrim -v /mnt/cache
/mnt/cache: 3.2 TiB (3489240088576 bytes) trimmed
```

The BTRFS volume consisting of 2 x Pro 860 + 2 x Pro 850 drives reported trimming what looks like only 2 disks, 1.8 TiB:

```
root@Server-2:~# fstrim -v /mnt/cache
/mnt/cache: 1.8 TiB (1946586398720 bytes) trimmed
```
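Quick arithmetic on the fstrim byte counts backs up that reading, with each 1TB drive contributing a bit under 1 TiB:

```python
# Back-of-the-envelope check of the fstrim output above.
TIB = 2**40

server1 = 3489240088576  # 4 x 1TB 860 Pro volume
server2 = 1946586398720  # 2 x 860 Pro + 2 x 850 Pro volume

print(round(server1 / TIB, 2))  # ~3.17 TiB: consistent with all 4 drives trimmed
print(round(server2 / TIB, 2))  # ~1.77 TiB: consistent with only 2 drives trimmed
```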

In summary: the Samsung 840 EVO is no good, avoid the 850 Pro, and the 860 Pro is OK.

Server-2 uses SFF-8643 to SATA breakout cables with sideband SGPIO connectors, controlling the drive bay lights. With the Adaptec controller the drive bay lights worked fine, but with the LSI the lights do not appear to work. I am really tempted to replace the chassis with a SAS expander, alleviating the need for the breakout cables, but that is a project for another day.

After I recreated the cache volume, I reinstalled the Duplicacy web container and tried to restore my now week-old backup file. I could not get the web UI to restore the 240GB backup file; either the session timed out or the network connection was dropped. I reverted to using the CLI, and with a few retries, eventually restored the file. It was disappointing to learn that the web UI must remain open during the restore, and that the CLI does not automatically retry on network failures. Fortunately Duplicacy does block-based restores and can resume restoring large files.

```
2020/01/06 07:59:01 Created restore session 1o8nqw
2020/01/06 07:59:01 Running /home/duplicacy/.duplicacy-web/bin/duplicacy_linux_x64_2.3.0 [-log restore -r 101 -storage B2-Backup -overwrite -stats --]
2020/01/06 07:59:01 Set current working directory to /cache/localhost/restore
2020/01/06 09:37:35 Deleted listing session jnji7l
2020/01/06 09:37:41 Invalid session
2020/01/06 12:07:57 Stopping the restore operation in session 1o8nqw
2020/01/06 12:07:57 Failed to restore files for backup B2-Backup-Backup revision 101 in the storage B2-Backup: Duplicacy was aborted
2020/01/06 12:07:57 closing log file restore-20200106-075901.log
2020/01/06 12:08:17 Deleted restore session 1o8nqw
Downloaded chunk 34683 size 13140565, 15.05MB/s 01:20:44 70.1%
1ff9d2c082d06226b0d81019338d048bf5a4428827a3fc0d3f6f337d66fd7fa9: read tcp 192.168.1.113:49858->206.190.215.16:443: wsarecv: An existing connection was forcibly closed by the remote host.
...
Files: 1 total, 243982.58M bytes
Total running time: 01:30:23
```

I did lose my DW Spectrum IPVMS running on an Ubuntu Server VM. I’ve known for a while that I don’t have a VM backup solution, but the video footage is on the storage server, not in the VM; video backups go to B2; and it is reasonably easy to recreate the VM. I am still working on a DW Spectrum docker solution for Unraid, but as of today the VMS does not recognize Unraid mapped storage volumes.

After all this trouble, I could finally test a parity check after reboot with the LSI HBA. With the system up I ran a parity check: all clear. I rebooted, ran the parity check again, and … no errors. I performed this operation on both servers, with no problems.

I was really sceptical that the LSI would work where the Adaptec failed. This does not rule out Unraid as the cause, but it does show that Unraid with the LSI HBA does not exhibit the dirty parity on reboot problem.

## Unraid in production, a bit rough around the edges, and terrible SMB performance

In my last two posts I described how I migrated from W2K16 and hardware RAID6 to Unraid. Now that I’ve had two Unraid servers in production for a while, I’ll describe some of the good and not so good I experienced.

Running Docker on Unraid is orders of magnitude easier than getting Docker to work on Windows. Docker allowed me to move all but one of my workloads from VMs to containers, simplifying updates, reducing the memory footprint, and improving performance.

For my IP security camera NVR software I switched from Milestone XProtect Express running on a W2K16 VM to DW Spectrum running on an Ubuntu Server VM. DW Spectrum is the US brand name under which the Nx Witness product is sold. I chose Nx Witness, now DW Spectrum, over XProtect because it is lighter on resource consumption, easier to deploy, easier to update, has perpetual licenses, includes native remote viewing, and an official Docker release is forthcoming.

I have been a long-time user of CrashPlan, and I switched to CrashPlan Pro when they stopped offering a consumer product. I tested CrashPlan Pro and Duplicati containers on Unraid, with Duplicati backing up to Backblaze B2. Duplicati was the clear winner: its backups were very fast and completed in about 3 days, whereas I stopped CrashPlan after 5 days, when it estimated another 18 days to complete the same backup operation and showed the familiar out of memory error. My B2 storage cost will be a few \$ higher than a single seat license for CrashPlan Pro, but the Duplicati plus B2 functionality and speed is superior.

When the Unraid 6.7.0 release went public, I immediately updated, but soon realized my mistake when several plugins stopped working. It took several weeks before plugin updates were released that restored full functionality. It is worth mentioning, again, that I find it strange that Unraid is really not that usable without community-provided plugins; that functionality lives in the plugins, not in Unraid itself. Next time I will wait a few weeks for the dust to settle in the plugin community before updating.

Storage and disk management is reasonably easy, and much more flexible than hardware RAID management. But adding and removing disks is still mostly a manual process, and doing it without invalidating parity is very cumbersome and time-consuming. Several times I gave up on the convoluted steps required to add or remove disks without invalidating parity, and just reconfigured the array and rebuilt parity, hoping nothing went wrong during the rebuild. This is, in my opinion, a serious shortcoming; maybe not in technology, but in the lack of an easy-to-use and reliable workflow that helps retain redundant protection at all times.

In order to temporarily free up enough storage space in my secondary server, I removed all the SSD cache drives and replaced them with 12TB Seagate IronWolf drives. I moved all the data that used to be on the cache to regular storage, including the docker appdata folder. This should not have been a big deal, but I immediately started getting SQLite DB corruption errors in apps like Plex that store data in SQLite on the appdata share. After some troubleshooting I found many people complaining about this issue, which seems to have been exacerbated by the recent Unraid 6.7.0 update. Apparently this is a known problem with the FUSE filesystem used by Unraid: FUSE dynamically spans shares and folders across disks, but apparently breaks the file and file-region locking required by SQLite. The recommended workaround is to put all files that require locking on the cache, or on a single disk, effectively bypassing FUSE. If it is FUSE that breaks file locking behavior, I find it troubling that this is not considered a critical bug.
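A quick way to spot-check whether a given path can host a SQLite database at all is to run a small write transaction against it. This is a diagnostic sketch of my own, not an Unraid tool, and the example Unraid paths in the comments are illustrative; note that a clean run exercises SQLite's locking only in the simplest single-writer case, so it cannot prove the concurrent-access cases are safe.

```python
import os
import sqlite3
import tempfile

def sqlite_smoke_test(directory: str) -> bool:
    """Create a SQLite DB in `directory` and run a small write
    transaction. Returns True if it completes. On a filesystem with
    broken file or region locking this can fail with 'database is
    locked' or disk I/O errors. (Single-writer sketch only.)"""
    db_path = os.path.join(directory, "locktest.db")
    try:
        con = sqlite3.connect(db_path, timeout=5)
        con.execute("CREATE TABLE IF NOT EXISTS t (v INTEGER)")
        con.execute("INSERT INTO t VALUES (1)")
        con.commit()
        con.close()
        return True
    except sqlite3.Error:
        return False
    finally:
        if os.path.exists(db_path):
            os.remove(db_path)

# e.g. compare a FUSE user share path against a direct cache path:
# sqlite_smoke_test("/mnt/user/appdata")   # FUSE-backed user share
# sqlite_smoke_test("/mnt/cache/appdata")  # direct cache path
with tempfile.TemporaryDirectory() as d:
    print(sqlite_smoke_test(d))
```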

I am quite familiar with VM snapshot management using Hyper-V and VMware; it is a staple of VM management. In Unraid I am using a Docker-based Virt-Manager, which seems far less flexible, but more importantly, fails to take snapshots of UEFI-based VMs. Apparently this is a known shortcoming. I have not looked very hard for alternatives, but this seems to be a serious functional gap compared to Hyper-V or VMware’s snapshot capabilities.

As I started using the SMB file shares, now hosted on Unraid, in my regular day-to-day activities, I noticed that under some conditions the write speed becomes extremely slow, often dropping to around 2MB/s. This seems to happen when other file read operations are in progress; even a few KB/s of reads can drastically reduce the array’s SMB write performance. Interestingly, the issue does not appear to affect rsync between Unraid servers, only SMB. I did find at least one other recent report of similar slowdowns where only SMB is affected.

Since the problem appeared to be specific to Unraid SMB, and not general network performance, I compared Unraid’s SMB performance with Windows SMB in a W2K19 VM running on the same Unraid system. With W2K19 running as a VM on the same system, the difference in performance comes down mostly to the SMB stack, not hardware or network.

On Unraid I created a share backed by the SSD cache array; that same array holds the W2K19 VM disk image, so the storage subsystems are similar. I also ran a similar test against an Unraid share backed by a spinning disk instead of the cache.

I found a few references (1, 2) to SMB benchmarking using DiskSpd, and used them as a basis for my test options. I started by creating a 64GB test file on each test share; reusing the file saves a lot of time by not recreating it for every run. Note that we get a warning when creating the file on Unraid, because SetFileValidData() is not supported by Unraid’s SMB implementation, but that should not be an issue.

```
>diskspd.exe -c64G \\storage\testcache\testfile64g.dat
WARNING: Could not set valid file size (error code: 50); trying a slower method of filling the file (this does not affect performance, just makes the test preparation longer)

>diskspd.exe -c64G \\storage\testmnt\testfile64g.dat
WARNING: Could not set valid file size (error code: 50); trying a slower method of filling the file (this does not affect performance, just makes the test preparation longer)

>diskspd.exe -c64G \\WIN-EKJ8HU9E5QC\TestW2K19\testfile64g.dat
```

I ran several tests similar to the following command lines:

```
>diskspd -w50 -b512K -F2 -r -o8 -W60 -d120 -Srw -Rtext \\storage\testcache\testfile64g.dat > d:\diskspd_unraid_cache.txt
>diskspd -w50 -b512K -F2 -r -o8 -W60 -d120 -Srw -Rtext \\storage\testmnt\testfile64g.dat > d:\diskspd_unraid_mnt.txt
>diskspd -w50 -b512K -F2 -r -o8 -W60 -d120 -Srw -Rtext \\WIN-EKJ8HU9E5QC\TestW2K19\testfile64g.dat > d:\diskspd_w2k19.txt
```

For a full explanation of the command line arguments see here. The test does 50% reads and 50% writes, with block sizes varying from 4KB to 2048KB, 2 threads, 8 outstanding IO operations, random aligned IO, a 60s warm-up, a 120s run, and local caching disabled for remote filesystems.

From the results we can see that the Unraid SMB performance for this test is pretty poor. I redid the tests, this time running independent read and write tests, and instead of various block sizes I used just a 512KB block size (I got lazy).

No matter how we look at it, the Unraid SMB write performance is still really bad.

I wanted to validate the synthetic test results with a real-world test, so I collected a folder containing around 65.2GB of fairly large files on SSD, and copied the files up and down using robocopy from my Win10 system. I chose the total size to be about double the memory of the Unraid system, so that the impact of caching is minimized. I made sure to use a RAW VM disk to eliminate any performance impact of growing a QCOW2 image file.

```
>robocopy d:\temp\out \\storage\testmnt\in /mir /fft > d:\robo_pc_mnt.txt
>robocopy d:\temp\out \\storage\testcache\in /mir /fft > d:\robo_pc_cache.txt
>robocopy d:\temp\out \\WIN-EKJ8HU9E5QC\TestW2K19\in /mir > d:\robo_pc_w2k19.txt

>robocopy \\storage\testmnt\in d:\temp\in /mir /fft > d:\robo_mnt_pc.txt
>robocopy \\storage\testcache\in d:\temp\in /mir /fft > d:\robo_cache_pc.txt
>robocopy \\WIN-EKJ8HU9E5QC\TestW2K19\in d:\temp\in /mir > d:\robo_w2k19_pc.txt
```

During the robocopy to Unraid I noticed that sporadically the Unraid web UI, and web browsing in general, became very slow. This never happens while copying to W2K19. I can’t explain it: I see no errors in my Win10 client event log or resource monitor, no unusual errors on the network switches, and no errors in Unraid. I suspect whatever is impacting SMB performance is affecting network performance in general, but without data I am really just speculating.

The robocopy read results are pretty even, but the write results again show inferior Unraid SMB performance. Do note that the W2K19 VM is still not as fast as my previous W2K16 RAID6 setup, where I could consistently saturate the 1Gbps link for reads and writes on the same hardware and using the same disks.

It is very disappointing to discover this poor SMB performance. I reported my findings to the Unraid support forum, and I hope they can do something to improve performance, or maybe invalidate my findings.

## Unraid and Robocopy Problems

In my last post I described how I converted one of my W2K16 servers to Unraid, and how I am preparing for conversion of the second server.

As I’ve been copying all my data from W2K16 to Unraid, I discovered some interesting discrepancies between W2K16 SMB and Unraid SMB. I use robocopy to mirror files from one server to the other, and once the first run completes, any subsequent runs should complete without needing to copy any files again (unless they were modified).

First, you have to use robocopy’s /fft option, for FAT File Times, which allows for 2 seconds of drift in file timestamps, e.g. “robocopy.exe [source] [dest] /mir /fft”.
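The /fft comparison behaves roughly like this (a sketch of the documented 2-second FAT granularity, not robocopy’s actual code):

```python
def fft_equal(ts_src: float, ts_dst: float, drift: float = 2.0) -> bool:
    """Mimic robocopy's /fft comparison: FAT stores modified times at
    2-second granularity, so source and destination timestamps within
    2 seconds of each other are treated as the same file."""
    return abs(ts_src - ts_dst) <= drift

print(fft_equal(1_000_000.0, 1_000_001.9))  # True: within FAT granularity, skipped
print(fft_equal(1_000_000.0, 1_000_003.0))  # False: treated as changed, re-copied
```

Without /fft, the sub-2-second timestamp rounding between NTFS and the Samba-backed share makes robocopy re-copy files that have not changed.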

I found a large number of files that would copy over and over with no changes to the source files. I also found a particular folder that would “magically” show up on Unraid, and could not be deleted from the Unraid share by robocopy.

After some troubleshooting, I discovered that files with old timestamps, and folder names that end in a dot, do not copy correctly to Unraid.

I looked at the files that would not copy, and I discovered that the file modified timestamps were all set to “1 Jan 1970 00:00”. I experimented by changing the modified timestamp to today’s date, and the files copied correctly. It seems that if the modified timestamp on the source file is older than 1 Jan 1980, the modified timestamp on Unraid for the newly created file will always be set to 1 Jan 1980. When running robocopy again, the source files are always reported as older, and the files are copied again.
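The re-copy loop can be modeled in a few lines of Python. The clamping function below is my model of the observed behavior, not Unraid’s actual code:

```python
from datetime import datetime, timezone

# Hypothetical model of the observed behavior: Unraid appears to floor
# stored modified times at 1 Jan 1980 (the FAT epoch).
FAT_EPOCH = datetime(1980, 1, 1, tzinfo=timezone.utc)

def unraid_stored_mtime(src_mtime):
    return max(src_mtime, FAT_EPOCH)

def fft_same(src_mtime, dst_mtime):
    # robocopy /fft treats timestamps within 2 seconds as equal
    return abs((src_mtime - dst_mtime).total_seconds()) <= 2.0

src = datetime(1970, 1, 1, tzinfo=timezone.utc)   # pre-1980 source file
dst = unraid_stored_mtime(src)                    # clamped to 1 Jan 1980
print(fft_same(src, dst))  # False -> every /mir run copies the file again
```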

Below is an example of a folder of test files with a created date of 1 Jan 1970 UTC; I copy the files using robocopy, and then copy them again. The second run of robocopy again copies all the files, instead of reporting them as similar. One can see that the destination timestamp is set to 1 Jan 1980, not 1 Jan 1970 as expected.

The second set of problem files occurs in folders whose names end in a dot. Unraid ignores the dots at the end of the folder names, and when another folder exists without the dots, the copy operation uses the wrong folder.

Below is an example of a folder that contains two directories, one named “LocalState”, and one named “LocalState..”. I robocopy the folder contents, and when running robocopy again, it reports an extra folder. That extra folder gets “magically” created in the destination directory, but the “LocalState..” folder is missing.

The same robocopy operations to the W2K16 server over SMB work as expected.
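The collision can be modeled by assuming the server strips trailing dots from names; this is my assumption based on the observed behavior, not confirmed from Unraid’s code:

```python
# Hypothetical model: if the server strips trailing dots, two distinct
# source names collide on the destination.
def strip_trailing_dots(name):
    return name.rstrip('.')

names = ["LocalState", "LocalState.."]
mapped = {strip_trailing_dots(n) for n in names}
print(mapped)  # both names map to the single name 'LocalState'
```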

From what I researched, the timestamp range for NTFS is 1 January 1601 to 14 September 30828, for FAT it is 1 January 1980 to 31 December 2107, and for EXT4 it is 1 January 1970 to 19 January 2038 with classic 32-bit timestamps, extended by roughly 408 years (to 2446) with ext4’s extra epoch bits. I could not create files with a date earlier than 1 Jan 1980, but I could set file modified timestamps to dates greater than 2106, so I do not know what the Unraid timestamp range is.
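The FAT and Unix limits follow directly from the on-disk encodings; a quick derivation (my own arithmetic, not from any Unraid documentation):

```python
from datetime import datetime, timedelta, timezone

# FAT/DOS dates pack the year into 7 bits as an offset from 1980.
fat_first, fat_last = 1980, 1980 + (2**7 - 1)   # 1980 .. 2107

# Classic signed 32-bit Unix time runs out on 19 Jan 2038 ...
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
t_2038 = epoch + timedelta(seconds=2**31 - 1)

# ... and ext4's two extra epoch bits add three more 2**32-second
# spans, pushing the limit out roughly 408 years.
t_max = epoch + timedelta(seconds=(2**31 - 1) + 3 * 2**32)
print(fat_last, t_2038.year, t_max.year)
```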

Creating and accessing directories with trailing dots requires special care on Windows using the NT-style path notation, e.g. CreateDirectoryW(L"\\\\?\\C:\\Users\\piete\\Unraid.Badfiles\\TestDot..", NULL), but robocopy does handle that correctly on W2K16 SMB.
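On Linux filesystems, by contrast, trailing-dot names are ordinary names that need no special handling, which is presumably why the mismatch only shows up through SMB. A quick check (the directory names below are my own test names):

```python
import os, tempfile

# On ext4/Linux a trailing dot is just another character in the name.
d = tempfile.mkdtemp()
os.mkdir(os.path.join(d, "LocalState"))
os.mkdir(os.path.join(d, "LocalState.."))  # legal, distinct directory
print(sorted(os.listdir(d)))  # ['LocalState', 'LocalState..']
```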

I don’t know if the observed behavior is specific to Unraid SMB, or if it would apply to Samba on Linux in general. But, it posed a problem as I wanted to make sure I do indeed have all files correctly backed up.

I decided to write a quick little app to find problem files and folders. The app iterates through all files and folders, fixes timestamps that are out of range, and reports files or folders whose names end in a dot. I ran it through my files, it fixed the timestamps for me, and I deleted the folders ending in a dot by hand. Multiple robocopy runs now complete as expected.
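In outline, the app does something like this; a simplified sketch with my own names and a 1 Jan 1980 floor, not the actual code:

```python
import os, tempfile
from datetime import datetime, timezone

# Floor for "sane" modified times: the FAT epoch, 1 Jan 1980 UTC.
MIN_MTIME = datetime(1980, 1, 1, tzinfo=timezone.utc).timestamp()

def scan_and_fix(root):
    """Clamp out-of-range modified times; report trailing-dot names."""
    dotted = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if name.endswith('.'):
                dotted.append(path)
            st = os.stat(path)
            if st.st_mtime < MIN_MTIME:
                os.utime(path, (st.st_atime, MIN_MTIME))
    return dotted

# Demo on a throwaway tree: one 1970-dated file, one trailing-dot folder.
root = tempfile.mkdtemp()
old = os.path.join(root, "old.txt")
open(old, "w").close()
os.utime(old, (0, 0))                    # mtime = 1 Jan 1970
os.mkdir(os.path.join(root, "TestDot.."))
dotted = scan_and_fix(root)
```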

## Moving from W2K16 to Unraid

I have been happy with my server rack running my UniFi network equipment and two Windows Server 2016 (W2K16) instances. I use the servers for archiving my media collection and running Hyper-V for all sorts of home projects and work related experiments. But time moves on, one can never have enough storage, and technology changes. So I set about a path that led to me replacing my W2K16 servers with Unraid.

I currently use Adaptec 7805Q and 81605ZQ RAID cards, with a mixture of SSD for caching, SSD RAID1 for boot and VM images, and HDD RAID6 for the large media storage array. The setup has been solid, and although I’ve had both SSD and HDD failures, the hot spares kicked in, and I replaced the failed drives with new hot spares, no data lost.

For my large RAID6 media array I used lots of HGST 4TB Ultrastar (enterprise) and Deskstar (consumer) drives, but I am out of open slots in my 24-bay 4U case, so adding more storage has become a problem. I can replace the 4TB drives with larger drives, but in order to expand the RAID6 volume without losing data, I need to replace all disks in the array, one by one, rebuilding parity between every drive upgrade, and then expand the volume. This will be very expensive, take a very long time, and risk the data during every drive rebuild.

I have been looking for more flexible provisioning solutions, including Unraid, FreeNAS, OpenMediaVault, Storage Spaces, and Storage Spaces Direct. I am not just looking for dynamic storage, I also want a system that can run VMs and Docker containers, I want it to work with consumer and/or small-business hardware, and I do not want to spend all my time messing around in a CLI.

I have tried Storage Spaces with limited success, but that was a long time ago. Storage Spaces Direct offers significant improvements, but with more stringent enterprise hardware requirements, that would make it too costly and complicated for my home use.

FreeNAS offers the best storage capabilities, but I found the VM and Docker ecosystem to be an afterthought and still lacking.

OpenMediaVault (OMV) is a relative newcomer; the web front-end is modern (think of OMV as Facebook, and FreeNAS and Unraid as MySpace), and support for VMs and Docker is growing. Compared to FreeNAS and Unraid the OMV community is still very small, and I was reluctant to entrust my data to it.

Unraid offered a good balance between storage, VM, and Docker, with a large support community. Unlike FreeNAS and OMV, Unraid is not free, but the price is low enough.

An ideal solution would have been the storage flexibility offered by FreeNAS, the docker and VM app ecosystem offered by Unraid, and the UI of OMV. Since that does not exist, I opted to go with Unraid.

Picking a replacement OS was one problem, but moving the existing systems to run on it, without losing data or workloads, was quite another. I decided to convert the two servers one at a time, so I moved all the Hyper-V workloads from Server-2 with the 8-bay chassis to Server-1 with the 24-bay chassis. This left Server-2 unused, and I could go about converting it to Unraid. I not only had to install Unraid, I also had to provision enough storage in the 8-bay chassis to hold all the data from the 24-bay chassis, so that I could then move the data on Server-1 to Server-2, convert Server-1 to Unraid, and move the data back to Server-1. And I had to do this without risking the data, and without an extended outage.

To get all the data from Server-1 to fit on Server-2, I pruned the near 60TB set down to around 40TB. You know how it works, no matter how much storage you have it will always be filled. I purchased 4 x 12TB Seagate IronWolf ST12000VN0007 drives, which combined with 2 x 4TB HGST drives gave me around 44TB of usable storage space, enough to copy all the important data from Server-1 to Server-2.

While I was at it, I decided to upgrade the IPMI firmware, motherboard BIOS, and RAID controller firmware. I knew it was possible to upgrade the SuperMicro BIOS through IPMI, but you have to buy a per-motherboard-locked Out-of-Band feature key from SuperMicro to enable this, something I had never bothered doing. While looking for a way to buy a code online, I found an interesting post describing a method of creating my own activation keys, and it worked.

IPMI updated, motherboard BIOS updated, RAID firmware updated, I set about converting the Adaptec RAID controller from RAID to HBA mode. Unlike the LSI controllers that need to be re-flashed with IR or IT firmware to change modes, the Adaptec controller allows this configuration via the controller BIOS. In order to change modes, all drives have to be uninitialized, but there were two drives that I could not uninitialize. After some troubleshooting I discovered that it is not possible to delete MaxCache arrays from the BIOS. I had to boot using the Adaptec bootUSB utility, a bootable Linux image that runs the MaxView storage controller GUI. MaxCache volumes deleted, I could convert to HBA mode.

With the controller in HBA mode, I set about installing Unraid. Well, it is not really installing in the classic sense: Unraid runs from a USB drive, and all drives in the system are used for storage. There is lots of info online on installing and configuring Unraid, but I found very good info on the Spaceinvader One YouTube channel. I have seen some reports of issues with USB drives, but I had no problems using a SanDisk Cruzer Fit drive.

It took a couple iterations before I was happy with the setup, and here are a few important things I learned:

• Unraid does not support SSD drives as data drives, see the install docs: “Do not assign an SSD as a data/parity device. While unRAID won’t stop you from doing this, SSDs are only supported for use as cache devices due TRIM/discard and how it impacts parity protection. Using SSDs as data/parity devices is unsupported and may result in data loss at this time.” This is one area where FreeNAS and OMV, with e.g. ZFS, offer much better redundancy solutions than Unraid’s parity solution, as do the many commercial solutions that have been using SSDs in drive arrays for years.
• Unraid’s caching solution using SSD drives and BTRFS works just OK. Unlike e.g. Adaptec MaxCache, which seamlessly caches block storage regardless of the file system, the Unraid cache works at the file level. While this does create flexibility in deciding which files from which shares should use the cache, it greatly complicates matters when running out of space on the cache. When a file is created on the cache, and the file is then enlarged to the point it no longer fits in the available space, the file operation will permanently fail. E.g. when copying a large file to a cached share, and the file is larger than the available space, the copy will proceed until the cache runs out of space and then fail; retrying just repeats the failure. To avoid this, one has to set the minimum free space setting to a value larger than the largest file that would ever be created on the cache, which for large files is very wasteful. Imagine a thin-provisioned VM image: it can grow until the cache is out of space, then fail, until it is manually moved to a different drive.
• The cache re-balancing and file moving algorithm is very rudimentary: the operation is scheduled per time period, and will move files from the cache to regular storage. There is no support for flushing the cache in real time as it runs out of space, there are no high-water or low-water mechanisms, and no LRU or MRU file access logic. I installed the Mover Tuning plugin that allows balancing the cache based on consumed space; better, but still not good enough.
• Exhausting the cache space while copying files to Unraid is painfully slow. I used robocopy to copy files from W2K16 to a share on Unraid that had caching set to “preferred”, meaning use the cache if it has space, and as soon as the cache ran out of space, the copy operation slowed down to a crawl. New files were supposed to be written to HDD at that point, but my experience showed that something was not working, and I had to disable the cache and then copy the files. The whole SSD and caching thing is a big disappointment.
• Building parity while copying files is very slow. Copying files using robocopy while the parity was building resulted in about 200Mbps throughput, very slow. I cancelled the parity operation, disabled the parity drive, and copied with no parity protection in place, and got near the expected 1Gbps throughput. I will re-enable parity building after all data is copied across.
• Performing typical disk-based operations, like adding, removing, or replacing a drive, is very cumbersome. The wiki tries to explain, but it is still very confusing. I really expected much easier ways of doing typical disk operations, especially when almost all of them result in the parity becoming invalid, leaving the system exposed to failure.
• It is really easy to use Docker, with containers directly from Docker Hub, or from the Community Applications plugin that acts like an app store.
• It is reasonably easy to create VMs; one has to manually install the KVM/QEMU VirtIO drivers in Windows guests, but this is made easy by the automatic mounting of the VirtIO driver ISO.
• I could not get any Ubuntu Desktop VMs working; they would all hang during install. I had no problems with Ubuntu Server installs. I am sure there is a solution, I just have not tried looking yet, as I only needed Ubuntu Server.
• VM runtime management is lacking; there is no support for snapshots or backups. One can install the Virt-Manager container to help, but it is still rather rudimentary compared to offerings from VMware, Hyper-V, and VirtualBox.
• In order to get things working I had to install several community plugins, I would have expected this functionality to be included in the base installation. Given how active the plugin authors are in the community, I wonder if not including said functionality by default may be intentional?
• Drive power saving works very well, and drives are spun down when not in use. I will have to revisit the file and folder to drive distribution, as common access patterns to common files should be constrained to the same physical drive.
• The community forum is very active and very helpful.
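The cache-full failure mode from the notes above can be reduced to a toy model. The numbers and the placement rule are made up for illustration; they are my reading of the observed behavior, not Unraid’s actual logic:

```python
# Toy model: a file is admitted to the cache because free space is above
# the "minimum free" threshold, then grows past the remaining space and
# fails, instead of falling through to the array.
def placement(free_gb, min_free_gb, file_gb):
    if free_gb <= min_free_gb:
        return "array"   # skipped the cache up front: OK
    if file_gb > free_gb:
        return "fail"    # admitted to the cache, then ran out of space
    return "cache"

print(placement(free_gb=50, min_free_gb=10, file_gb=60))  # fail
print(placement(free_gb=50, min_free_gb=60, file_gb=60))  # array
```

This is why the minimum free space setting has to be larger than the largest file one would ever write to a cached share.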

I still have a few days of file copying left, and I will keep my W2K16 server operational until I am confident in the integrity and performance of Unraid. When I’m ready, I’ll convert the second server to Unraid, and then re-balance the storage, VM, and Docker workloads between the two servers.