Unraid and Robocopy Problems

In my last post I described how I converted one of my W2K16 servers to Unraid, and how I am preparing for conversion of the second server.

As I’ve been copying all my data from W2K16 to Unraid, I discovered some interesting discrepancies between W2K16 SMB and Unraid SMB. I use robocopy to mirror files from one server to the other, and once the first run completes, any subsequent runs should complete without needing to copy any files again (unless they were modified).

First, you have to use robocopy’s /fft option (“robocopy.exe [source] [dest] /mir /fft”), for FAT File Times, which allows for 2 seconds of drift in file timestamps.
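
For reference, the mirroring command I use looks roughly like this (the source path, destination share, and log file are placeholders):

robocopy.exe D:\Media \\unraid\media /mir /fft /r:2 /w:5 /log+:C:\Logs\robocopy-media.log

/mir mirrors the directory tree, including deletions, and /fft assumes FAT file time granularity so that small timestamp rounding differences between file systems do not trigger endless re-copies.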

I found a large number of files that would copy over and over with no changes to the source files. I also found a particular folder that would “magically” show up on Unraid, and could not be deleted from the Unraid share by robocopy.

After some troubleshooting, I discovered that files with old timestamps, and folder names that end in a dot, do not copy correctly to Unraid.

I looked at the files that would not copy correctly, and I discovered that the file modified timestamps were all set to “1 Jan 1970 00:00”. I experimented by changing the modified timestamp to today’s date, and the files copied correctly. It seems that if the modified timestamp on the source file is older than 1 Jan 1980, the modified timestamp on Unraid for the same newly created file will always be set to 1 Jan 1980. When running robocopy again, the source files will always be reported as older, and the files copied again.

Below is an example of a folder of test files with a created date of 1 Jan 1970 UTC; I copy the files using robocopy, and then copy them again. The second run of robocopy copies all the files again, instead of reporting them as unchanged. One can see that the destination timestamp is set to 1 Jan 1980, not 1 Jan 1970 as expected.
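
The behavior can be reproduced with a PowerShell sketch along these lines (the local folder and the Unraid share name are placeholders):

# Create a test file with a pre-1980 modified timestamp (1 Jan 1970 UTC)
$src = "D:\TimestampTest"
New-Item -ItemType Directory -Path $src -Force | Out-Null
$file = New-Item -ItemType File -Path (Join-Path $src "old-file.txt") -Force
$file.LastWriteTimeUtc = [DateTime]::new(1970, 1, 1, 0, 0, 0, [DateTimeKind]::Utc)

# Mirror the folder twice; the second run should skip everything,
# but against the Unraid share it copies the file again
robocopy.exe $src \\unraid\test /mir /fft
robocopy.exe $src \\unraid\test /mir /fft

# Compare the source and destination timestamps
(Get-Item (Join-Path $src "old-file.txt")).LastWriteTimeUtc
(Get-Item \\unraid\test\old-file.txt).LastWriteTimeUtc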

The second set of problem files occurs in folders whose names end in a dot. Unraid ignores the dots at the end of the folder names, and when another folder exists without the dots, the copy operation uses the wrong folder.

Below is an example of a folder that contains two directories, one named “LocalState”, and one named “LocalState..”. I robocopy the folder contents, and when running robocopy again, it reports an extra folder. That extra folder gets “magically” created in the destination directory, but the “LocalState..” folder is missing.

The same robocopy operations to the W2K16 server over SMB work as expected.

From what I researched, the timestamp range for NTFS is 1 January 1601 to 14 September 30828, for FAT it is 1 January 1980 to 31 December 2107, and for EXT4 it is 1 January 1970 to 19 January 2106 (2038 + 408). I could not create files with a date earlier than 1 Jan 1980, but I could set file modified timestamps to dates greater than 2106, so I do not know what the Unraid timestamp range is.

Creating and accessing directories with trailing dots requires special care on Windows using the NT style notation, e.g. CreateDirectoryW(L"\\\\?\\C:\\Users\\piete\\Unraid.Badfiles\\TestDot..", NULL), but robocopy does handle that correctly on W2K16 SMB.
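
If you need to create or remove one of these trailing-dot folders by hand, the same \\?\ prefix works from the command line; a sketch, with an example path (behavior may vary by Windows version, so treat this as a starting point):

# A plain "md C:\Temp\TestDot.." would silently strip the trailing dots
cmd /c 'md "\\?\C:\Temp\TestDot.."'

# The \\?\ prefix is also the usual way to delete a stray trailing-dot folder
cmd /c 'rd /s /q "\\?\C:\Temp\TestDot.."'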

I don’t know if the observed behavior is specific to Unraid SMB, or if it would apply to Samba on Linux in general. But, it posed a problem as I wanted to make sure I do indeed have all files correctly backed up.

I decided to write a quick little app to find problem files and folders. The app iterates through all files and folders, fixes timestamps that are out of range, and reports files or folders whose names end in a dot. I ran it through my files, it fixed the timestamps for me, and I deleted the folders ending in a dot by hand. Multiple robocopy runs now complete as expected.
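
I have not published the app, but a rough, simplified PowerShell equivalent of what it does would look something like this (the root path and the replacement date are placeholders, and this sketch only handles the too-old case):

# Simplified sketch: clamp too-old modified timestamps and report trailing-dot names
$root  = "D:\Media"    # placeholder path
$floor = [DateTime]::new(1980, 1, 2, 0, 0, 0, [DateTimeKind]::Utc)

Get-ChildItem -Path $root -Recurse -Force | ForEach-Object {
    if ($_.LastWriteTimeUtc -lt $floor) {
        Write-Host "Fixing timestamp: $($_.FullName)"
        $_.LastWriteTimeUtc = $floor
    }
    if ($_.Name.EndsWith(".")) {
        Write-Host "Name ends in a dot: $($_.FullName)"
    }
}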


Moving from W2K16 to Unraid

I have been happy with my server rack running my UniFi network equipment and two Windows Server 2016 (W2K16) instances. I use the servers for archiving my media collection and running Hyper-V for all sorts of home projects and work related experiments. But, time moves on, one can never have enough storage, and technology changes. So I set about a path that led to me replacing my W2K16 servers with Unraid.

I currently use Adaptec 7805Q and 81605ZQ RAID cards, with a mixture of SSD for caching, SSD RAID1 for boot and VM images, and HDD RAID6 for the large media storage array. The setup has been solid, and although I’ve had both SSD and HDD failures, the hot spares kicked in, I replaced the failed drives with new hot spares, and no data was lost.

For my large RAID6 media array I used lots of HGST 4TB Ultrastar (enterprise) and Deskstar (consumer) drives, but I am out of open slots in my 24-bay 4U case, so adding more storage has become a problem. I can replace the 4TB drives with larger drives, but in order to expand the RAID6 volume without losing data, I need to replace all disks in the array, one-by-one, rebuilding parity in between every drive upgrade, and then expand the volume. This would be very expensive, take a very long time, and risk the data during every drive rebuild.

I have been looking for more flexible provisioning solutions, including Unraid, FreeNAS, OpenMediaVault, Storage Spaces, and Storage Spaces Direct. I am not just looking for dynamic storage; I also want a system that can run VM’s and Docker containers, that works with consumer and/or small business hardware, and that does not require me to spend all my time messing around in a CLI.

I have tried Storage Spaces with limited success, but that was a long time ago. Storage Spaces Direct offers significant improvements, but with more stringent enterprise hardware requirements, that would make it too costly and complicated for my home use.

FreeNAS offers the best storage capabilities, but I found the VM and Docker ecosystem to be an afterthought and still lacking.

OpenMediaVault (OMV) is a relative newcomer with a modern web front-end and growing support for VM’s and Docker; think of OMV as Facebook, and of FreeNAS and Unraid as MySpace. Compared to FreeNAS and Unraid the OMV community is still very small, and I was reluctant to entrust my data to it.

Unraid offered a good balance between storage, VM, and Docker, with a large support community. Unlike FreeNAS and OMV, Unraid is not free, but the price is low enough.

An ideal solution would have been the storage flexibility offered by FreeNAS, the Docker and VM app ecosystem offered by Unraid, and the UI of OMV. Since that does not exist, I opted to go with Unraid.

Picking a replacement OS was one problem, but moving the existing systems to run on it, without losing data or workloads, was quite another. I decided to convert the two servers one at a time, so I moved all the Hyper-V workloads from Server-2 with the 8-bay chassis, to Server-1 with the 24-bay chassis. This left Server-2 unused, and I could go about converting it to Unraid. I not only had to install Unraid, I also had to provision enough storage in the 8-bay chassis to hold all the data from the 24-bay chassis, so that I could then move the data on Server-1 to Server-2, convert Server-1 to Unraid, and move the data back to Server-1. And I had to do this without risking the data, and without an extended outage.

To get all the data from Server-1 to fit on Server-2, I pruned the near 60TB set down to around 40TB. You know how it works, no matter how much storage you have it will always be filled. I purchased 4 x 12TB Seagate IronWolf ST12000VN0007 drives, which, combined with 2 x 4TB HGST drives, gave me around 44TB of usable storage space, enough to copy all the important data from Server-1 to Server-2.

While I was at it, I decided to upgrade the IPMI firmware, motherboard BIOS, and RAID controller firmware. I knew it was possible to upgrade the SuperMicro BIOS through IPMI, but you have to buy a per-motherboard locked Out-of-Band feature key from SuperMicro to enable this, something I had never bothered doing. While looking for a way to buy a code online, I found an interesting post describing a method of creating my own activation keys, and it worked.

IPMI updated, motherboard BIOS updated, RAID firmware updated, I set about converting the Adaptec RAID controller from RAID to HBA mode. Unlike the LSI controllers that need to be re-flashed with IR or IT firmware to change modes, the Adaptec controller allows this configuration via the controller BIOS. In order to change modes, all drives have to be uninitialized, but there were two drives that I could not uninitialize. After some troubleshooting I discovered that it is not possible to delete MaxCache arrays from the BIOS. I had to boot using the Adaptec bootUSB utility, which is a bootable Linux image that runs the MaxView storage controller GUI. MaxCache volumes deleted, I could convert to HBA mode.

With the controller in HBA mode, I set about installing Unraid. Well, it is not really installing in the classic sense; Unraid runs from a USB drive, and all drives in the system are used for storage. There is lots of info online on installing and configuring Unraid, but I found very good info on the Spaceinvader One YouTube channel. I have seen some reports of issues with USB drives, but I had no problems using a SanDisk Cruzer Fit drive.

It took a couple iterations before I was happy with the setup, and here are a few important things I learned:

  • Unraid does not support SSD drives as data drives, see the install docs; “Do not assign an SSD as a data/parity device. While unRAID won’t stop you from doing this, SSDs are only supported for use as cache devices due TRIM/discard and how it impacts parity protection. Using SSDs as data/parity devices is unsupported and may result in data loss at this time.” This is one area where FreeNAS and OMV, using e.g. ZFS, offer much better redundancy solutions than Unraid’s parity scheme, as do the many commercial solutions that have been using SSD’s in drive arrays for years.
  • Unraid’s caching solution using SSD drives and BTRFS works just ok. Unlike e.g. Adaptec MaxCache, which seamlessly caches block storage regardless of the file system, the Unraid cache works at the file level. While this does create flexibility in deciding which files from which shares should use the cache, it greatly complicates matters when the cache runs out of space. When a file is created on the cache, and the file is then enlarged to the point that it no longer fits in the available space, the file operation will permanently fail. E.g. when copying a large file to a cached share, and the file is larger than the available space, the copy will proceed until the cache runs out of space, and then fail; repeat the copy and you get the same result. To avoid this, one has to set the minimum free space setting to a value larger than the largest file that would ever be created on the cache, which for large files is very wasteful. Imagine a thin provisioned VM image: it can grow until the cache is out of space, and then fail, until it is manually moved to a different drive.
  • The cache re-balancing and file moving algorithm is very rudimentary; the operation is scheduled per time period, and will move files from the cache to regular storage. There is no support for flushing the cache in real-time as it runs out of space, there are no high water or low water mechanisms, and no LRU or MRU file access logic. I installed the Mover Tuning plugin that allows balancing the cache based on consumed space, better, but still not good enough.
  • Exhausting the cache space while copying files to Unraid is painfully slow. I used robocopy to copy files from W2K16 to a share on Unraid that had caching set to “preferred”, meaning use the cache if it has space. As soon as the cache ran out of space, new files were supposed to be written to HDD, but in my experience something was not working, the copy operation slowed down to a crawl, and I had to disable the cache and then copy the files. The whole SSD and caching thing is a big disappointment.
  • Building parity while copying files is very slow. Copying files using robocopy while the parity was building resulted in about 200Mbps throughput, very slow. I cancelled the parity operation, disabled the parity drive, and copied with no parity protection in place, and got near the expected 1Gbps throughput. I will re-enable parity building after all data is copied across.
  • Performing typical disk based operations, like adding, removing, or replacing a drive, is very cumbersome. The wiki tries to explain, but it is still very confusing. I really expected much easier ways of doing typical disk based operations, especially when almost all operations result in the parity becoming invalid, leaving the system exposed to failure.
  • It is really easy to use Docker, with containers directly from Docker Hub, or from the Community Applications plugin that acts like an app store.
  • It is reasonably easy to create VM’s; one has to manually install the VirtIO KVM/QEMU drivers in Windows OS’s, but that is made easy by the automatic mounting of the VirtIO driver ISO.
  • I could not get any Ubuntu Desktop VM’s working, they would all hang during install. I had no problems with Ubuntu Server installs. I am sure there is a solution, I just did not try looking yet as I only needed Ubuntu Server.
  • VM runtime management is lacking, there is no support for snapshots or backups. One can install the Virt-Manager container to help, but it is still rather rudimentary compared to offerings from VMWare, Hyper-V, and VirtualBox.
  • In order to get things working I had to install several community plugins, I would have expected this functionality to be included in the base installation. Given how active the plugin authors are in the community, I wonder if not including said functionality by default may be intentional?
  • Drive power saving works very well, and drives are spun down when not in use. I will have to revisit the file- and folder-to-drive distribution, as commonly accessed files should be constrained to the same physical drive so that the other drives can stay spun down.
  • The community forum is very active and very helpful.

I still have a few days of file copying left, and I will keep my W2K16 server operational until I am confident in the integrity and performance of Unraid. When I’m ready, I’ll convert the second server to Unraid, and then re-balance the storage, VM, and Docker workloads between the two servers.

CrashPlan throws in the towel … for home users

Today CrashPlan, my current online backup provider, announced, on Facebook of all places, that they have thrown in the towel and will no longer provide service to home users. The backlash was heated, and I found the CEO’s video message on the blog post rather condescending.

I’ve been a long time user of online backup providers, and many have thrown in the towel, especially when free file sync from Google and Microsoft offers ever expanding capabilities and more and more free storage. Eventually even the cheapest backup storage implementation becomes expensive, when compared to a cloud provider, and not profitable as a primary business.

I’ve been using CrashPlan’s unlimited home plan for quite some time now, they were one of a few, today none, that were reasonably priced, allowed unlimited storage, and supported server class OS’s. But, I could sense the writing was on the wall; they split the home and business Facebook account, they split the website, the home support site has not seen activity in ages, they made major improvements to the enterprise backup agent, switching to a much leaner and faster C++ agent, while the home agent remained the old Java app with its many shortcomings, and there were some vague rumors on the street of a home business selloff attempt.

The transition offered a free switch to the small business plan, for the remaining duration of the home subscription, plus 3 months, and then a 75% discount on next year’s plan. For my account, this means free CrashPlan Pro until 12 June 2018, then $2.50 per month until 12 June 2019, and then $10.00 per month.

I’ve switched to the Pro plan; as promised, the agent updated itself, going from the old Java app to the new C++ agent, the already backed up data was retained without needing to be backed up again, and all seems well, for now…

Rack that server

It’s been a year and a half since we moved into the new house, and I finally have the servers racked in the garage. Looks pretty nice compared to my old setups.

My old setup was as follows:
Two DELL OptiPlex 990 small form factor machines with Windows Server 2008 R2 as Hyper-V servers. One server ran the important 24/7 VM’s, the other was used for testing and test VM’s. The 24/7 VM’s included a W2K8R2 domain controller and a W2K12 file server.
For storage I used a Synology DS2411+ NAS, with 12 x 3TB Hitachi Ultrastar drives, configured in RAID6, and served via iSCSI. The iSCSI drive was mounted in the Hyper-V host, and configured as a 30TB passthrough disk for the file server VM, which served files over SMB and NFS.
These servers stood on a wooden storage rack in the garage, and at the new house they were temporarily housed under the desk in my office.

One of my primary objectives was to move the server equipment to the garage in an enclosed server rack, with enough space for expansion and away from dust. A garage is not really dust free and does get hot in the summer, not an ideal location for a server rack, but better than finding precious space inside the house. To keep dust to a minimum I epoxy coated the floor and installed foam air filters in the wall and door air inlet vents. To keep things cool, especially after parking two hot cars, I installed an extractor fan. I had planned on connecting it to a thermostat, but opted to use a Panasonic WhisperGreen extractor fan rated for 24/7 operation, and I just leave it on all the time. We have ongoing construction next door, and the biggest source of dust is the gaps around the garage door. I’ve considered applying sticky foam strips next to the garage door edges, but have not done so yet.

In retrospect, preparing the garage concrete surface by hand, and applying the Epoxy Coat kit by myself, is not something I would recommend for a novice. If you can, pay a pro to do it for you, or at least get a friend to help, and rent a diamond floor abrasion machine.

I did half the garage at a time, moving everything to one side, preparing the surface by hand, letting it dry, applying the epoxy and flakes, letting it dry, and then repeating the process for the other side. I decided the 7″ roller that came with the kit was too small, and I bought a 12″ roller, big mistake, as soon as I started rolling the epoxy there was lint everywhere. From the time you start applying the epoxy you have 20 minutes working time, no time to go buy the proper type of lint free roller. I did not make the same mistake twice, and used the kit roller for the second half, no lint. With the experience gained from the first half it was much easier the second time round, and the color flake application was also much more even compared to the first half.

To conserve space in the garage I used a Middle Atlantic WR-24-32 WR Series Roll Out Rotating Rack. The roll out and rotate design allowed me to mount the rack right against the wall and against other equipment, as it does not require rear or side panel access. I also used a low noise MW-4QFT-FC thermostatically controlled integrated extractor fan top to keep things cool, and a WRPFD-24 plexiglass front door to make it look nice.

The entire interior cage rolls out on heavy duty castors, and the bottom assembly rotates on ball bearings. The bottom of the enclosure is open in the center with steel plate tracks for the castors, and must be mounted down on a sturdy and level surface. My garage floor is not level and slopes towards the door, and consequently a fully loaded rack wants to roll out the door, and all the servers keep sliding out of the rails.

I had to level the enclosure by placing spacers under the front section, and then bolting it down on the concrete floor. This leaves the enclosure and the rails inside the enclosure level, but as soon as I pull the rack out on the floor, the chassis slide out and the entire rack wants to roll out the door. I had to build a removable wood platform with spacers to provide a level runway surface in front of the rack, that way I can pull the rack out on a level surface, and store the runway when not in use.

The WR-24-32 is 24U high, and accommodates equipment up to 26″ in length, quite a bit shorter than most standard racks. The interior rack assembly pillar bars are about 23″ apart, with equipment extending past the pillar ends. This turned out to be more of a challenge than the 26″ equipment length constraint. When the rack is in its outside rotated position, the 23″ pillars just clear the enclosure, but the 26″ equipment sticking out past the pillars does not, and prevents the rack from rotating. It takes brute force to lift the castors, under a very heavily loaded rack, over the rail edge and pull the enclosure out all the way before the rack will rotate freely.

Another problem with the 23″ pillar spacing is that the minimum adjustable length of the 4U Supermicro chassis rails is about 25″, so they would not fit between the pillars. I had to order a shorter set of adjustable rails, and use the chassis side of the original rails to match the chassis mounting holes, and the rack side of the new rails to clear the pillars; fortunately they fit perfectly into each other, but not on the rack. The WR-24-32 has tapped 10-32 screw holes in all locations, i.e. no square holes anywhere, which meant I had to use my Dremel to cut the quick mount tabs from the rails in order to screw them on instead of hanging them on.

Rather than using another NAS based storage solution I opted for direct attached storage, so I was looking for a 24-bay chassis, less than 26″ in length, with low noise fans. I opted for a Supermicro 4U 24-bay SuperChassis 846BE16-R920B for the main file server, and a 4U 8-bay SuperChassis 745BTQ-R1K28B-SQ for the utility server. It was the SC846’s included rails that were too long to fit between the posts, and I replaced them with a MCP-290-00058-0N short rail set.

I used Supermicro X10SLM+-F Xeon boards with Intel Xeon E3-1270 v3 processors for both systems. Low power and low heat were a higher priority than performance, and the E3 v3 processors were a good balance. I’ve had good experiences with the X9 series SM boards, but I have mixed feelings about the X10 boards. Kingston dropped support for these boards due to memory chip incompatibilities, SM certified memory for this board is very expensive, and I had endless trouble getting the boards to work with an Adaptec 7805Q controller. The 7805Q controller would simply fail to start, and after being bounced around between SM and Adaptec support, SM eventually provided me with a special BIOS build, which has yet to be publicly released, that resolved the problem. I had no such problems with the newer 81605ZQ controller I used in the 24-bay chassis.

For the 24-bay system storage, I used 2 x Samsung 840 Pro 512GB SSD drives in RAID1 for booting the OS and for MaxCache, 4 x Samsung 840 EVO 1TB SSD drives in RAID5 to host VM’s, 16 x Hitachi 4TB Coolspin drives plus 2 x hot spares in RAID6 for main storage. The 56TB RAID6 volume is mounted as a passthrough disk to the file server VM. To save power and reduce heat I host all the VM’s on the SSD array, and opted to use the consumer grade Hitachi Coolspin drives over the more expensive but reliable Ultrastar drives. The 8-bay system has a similar configuration, less the large RAID6 data array.

The SM boards are very easy to manage using the integrated IPMI KVM functionality. Other than configuring the BIOS and IPMI IP settings on the first boot, I rarely have to use the rack mounted KVM console. Each server runs W2K12R2 with the Hyper-V role. I am no longer running a domain controller, the complexity outweighed the benefit, especially with the introduction of Microsoft online accounts used in Windows 8. The main VM is a W2K12R2 storage file server VM, with the RAID6 disk in passthrough, serving data over SMB and NFS. My other VM’s include a system running Milestone XProtect IP security camera network video recorder, a MSSQL and MySQL DB VM, a Spiceworks VM, a Splunk VM, a UniFi Controller VM, and several work related VM’s.

I had Verizon switch my internet connection from Coax to Ethernet, and I now run a Ubiquiti EdgeRouter Pro. I did run a MikroTik Routerboard CCR1009-8G-1S-1S+ for a while, and it is a very nice box, but as I also switched out my EnGenius EAP600 access points for Ubiquiti UniFi AC units, and replaced the problematic TRENDnet TPE-1020WS POE+ switches with Ubiquiti ToughSwitch TS-8-Pro POE units, I preferred to stick to one brand in the hopes of better interoperability. Be wary of the ToughSwitch units though; it seems that under certain conditions mixing 100Mbps and 1Gbps ports causes serious performance problems. I am still on the fence about the UniFi AC units, they are really easy to manage via the UniFi controller, but some devices, like my Nest thermostats, are having problems staying connected. Not sure if it is a problem with the access points or the Nests, as there are many people blaming this problem on a Nest firmware update.

I used an APC Smart-UPS X 1500VA Rack/Tower LCD 120V with Network Card for clean and reliable power, and an ITWatchDogs SuperGoose II Climate Monitor for environmental monitoring and alerting.

After building and configuring everything, I copied all 30TB of data from the DS2411+ to the new server using robocopy with the multithreaded option; it took about 5 days to copy. I continued using the old systems for two weeks while I let the new systems settle in, in case anything broke. I then re-synced the data using robocopy, moved the VM’s over, and pointed clients to the new systems.
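
The copy command looked roughly like this (the paths are placeholders; /mt:32 runs 32 copy threads, which helps most with large numbers of small files):

robocopy.exe \\oldserver\media D:\Storage\media /mir /mt:32 /r:2 /w:5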

VM’s are noticeably more responsive, presumably due to being backed by SSD. I can now have multiple XBMC systems simultaneously watching movies while I copy data to storage without any playback stuttering, something that used to be an issue on the old iSCSI system.

The best part is really the way the storage cabinet looks 🙂

This is the temporary server home under my office desk:
Before

Finished product:
After

The “runway” I constructed to create a level surface:
Runway

Pulled out all the way, notice the cage is clear, but the equipment won’t clear:
Out

To clear the equipment the castors have to be pulled over the edge:
Cleared

Rotated view:
Rotated

The rarely used KVM drawer:
KVM

Extractor fans:
Fans

Night mode:
Night

LSI turns their back on Green

I previously blogged here and here on my research into finding power saving RAID controllers.

I have been using LSI MegaRAID SAS 9280-4i4e controllers in my Windows 7 workstations and LSI MegaRAID SAS 9280-8e controllers in my Windows Server 2008 R2 servers. These controllers work great, my workstations go to sleep and wake up, and in both workstations and servers the drives spin down when not in use.

I am testing a new set of workstation and server systems running Windows 8 and Server 2012, and using the “2nd generation” PCIe 3.0 based LSI RAID controllers. I’m using LSI MegaRAID SAS 9271-8i with CacheVault and LSI MegaRAID SAS 9286CV-8eCC controllers.

I am unable to get any of the configured drives to spin down on either of the controllers, in either Windows 8 or Windows Server 2012.

LSI has not yet published any Windows 8 or Server 2012 drivers on their support site. In September 2012, after the public release of Windows Server 2012, LSI support told me drivers would ship in November, and now they tell me drivers will ship in December. All is not lost as the 9271 and 9286 cards are detected by the default in-box drivers, and appear to be functional.

I had hoped the no spin-down problem was a driver issue, and that it would be corrected by updated drivers, but that appears to be wishful thinking.

I contacted LSI support about the drive spin-down issue, and was referred to this August 2011 KB 16563, pointing to KB 16385 stating:

newer versions of firmware no longer support DS3; the newest version of firmware to support DS3 was 12.12.0-0045_SAS_2108_FW_Image_APP-2.120.33-1197

When I objected to the removal, support replied with this canned quote:

In some cases, when Dimmer Switch with DS3 spins down the volume, the volume cannot spin up in time when I/O access is requested by the operating system.  This can cause the volume to go offline, requiring a reboot to access the volume again.

LSI basically turned their back on green by disabling drive spin-down on all new controllers and new firmware versions.

I have not had any issues with this functionality on my systems, and spinning down unused drives to save power and reduce heat is a basic operational requirement. Maybe there are issues with some systems, but at least give me the choice of enabling it in my environment.

A little bit of searching shows I am not alone in my complaint, see here and here.

And Intel published a November 2012 KB 033877 stating that they have disabled drive power save on all their RAID controllers, maybe not that surprising given that Intel uses rebranded LSI controllers.

After a series of overheating batteries and S3 failures, I long ago gave up on Adaptec RAID controllers, but this situation with LSI is making me take another look at them.

Adaptec advertises Intelligent Power Management as a feature of their controllers, so I ordered a 7805Q controller, and will report my findings in a future post.

Storage Spaces Leaves Me Empty

I was very intrigued when I found out about Storage Spaces and ReFS being introduced in Windows Server 2012 and Windows 8. But now that I’ve spent some time with it, I’m left disappointed, and I will not be trusting my precious data with either of these features, just yet.

 

Microsoft publicly announced Storage Spaces and ReFS in early Windows 8 blog posts. Storage Spaces was of special interest to the Windows Home Server community in light of Microsoft first dropping support for Drive Extender in Windows Home Server 2011, and then completely dropping Windows Home Server, and replacing it with Windows Server 2012 Essentials. My personal interest was more geared towards expanding my home storage capacity in a cost effective and energy efficient way, without tying myself to proprietary hardware solutions.

 

I archive all my CD’s, DVD’s, and BD discs, and store the media files on a Synology DS2411+ with 12 x 3TB drives in a RAID6 volume, giving me approximately 27TB of usable storage. Seems like a lot of space, but I’ve run out of space, and I have a backlog of BD discs that need to be archived. In general I have been very happy with Synology (except for an ongoing problem with “Local UPS was plugged out” errors), and they do offer devices capable of more storage, specifically the RS2212+ with the RX1211 expansion unit offering up to 22 combined drive bays. But, at $2300 plus $1700, this is expensive, capped at 22 drives, and further ties me in with Synology. Compare that with $1400 for a Norco DS24-E or $1700 for a SansDigital ES424X6+BS 24 bay 4U storage unit, an inexpensive LSI OEM branded SAS HBA from eBay, or a LSI SAS 9207-8e if you like the real thing, connected to Windows Server 2012, running Storage Spaces and ReFS, and things look promising.

Arguably I am swapping one proprietary technology for another, but with native Windows support, I have many more choices for expansion. One could make the same argument for the use of ZFS on Linux, and if I were a Linux expert, that may have been my choice, but I’m not.

 

I tested using a SuperMicro SuperWorkstation 7047A-73, with dual Xeon E5-2660 processors and 32GB RAM. The 7047A-73 uses a X9DA7 motherboard, that includes a LSI SAS2308 6Gb/s SAS2 HBA, connected to 8 hot-swap drive bays.

For comparison with a hardware RAID solution I also tested using a LSI MegaRAID SAS 9286CV-8e 6Gb/s SAS2 RAID adapter, with the CacheCade 2.0 option, and a Norco DS12-E 12 bay SAS2 2U expander.

For drives I used Hitachi Deskstar 7K4000 4TB SATA3 desktop drives and Intel 520 series 480GB SATA3 SSD drives. I did not test with enterprise class drives; 4TB models are still excessively expensive, and they defeat the purpose of cost effective storage for home use.

 

I previously reported that the Windows Server 2012 and Windows 8 install will hang when trying to install on a SSD connected to the SAS2308. As such I installed Server 2012 Datacenter on an Intel 480GB SSD connected to the onboard SATA3 controller.

Windows automatically installed the drivers for the LSI SAS2308 controller.

I had to manually install the drivers for the C600 chipset RSTe controller, and as reported before, the driver works, but suffers from dyslexia.

The SAS2308 controller firmware was updated to the latest released SuperMicro v13.0.57.0.

 

Since LSI already released v14.0.0.0 firmware for their own SAS2308 based boards like the SAS 9207-8e, I asked SuperMicro support for their v14 version, and they provided me with an as yet unreleased v14.0.0.0 firmware version for test purposes. Doing a binary compare between the LSI version and the SuperMicro version, the differences appear to be limited to descriptive model numbers, and a few one byte differences that are probably configuration or default parameters. It is possible to cross-flash between some LSI and OEM adapters, but since I had a SuperMicro version of the firmware, this was not necessary.

SuperMicro publishes a v2.0.58.0 LSI driver that lists Windows 8 support, but LSI has not yet released Windows 8 or Server 2012 drivers for their own SAS2308 based products. I contacted LSI support, and their Windows 8 and Server 2012 drivers are scheduled for release in the P15 November 2012 update.

I tested the SuperMicro v14.0.0.0 firmware with the SuperMicro v2.0.58.0 driver, the SuperMicro v14.0.0.0 firmware with the Windows v2.0.55.84 driver, and the SuperMicro v2.0.58.0 driver with the SuperMicro v13.0.57.0 firmware. Any combination that included the SuperMicro v2.0.58.0 driver or the SuperMicro v14.0.0.0 firmware resulted in problems with the drives or controller not responding. The in-box Windows v2.0.55.84 driver and the released SuperMicro v13.0.57.0 firmware was the only stable combination.

Below are some screenshots of the driver versions and errors:

LSI.2.0.55.84  LSI.2.0.58.0

Eventlog.Controller.Error  Eventlog.IO.Retried  Eventlog.Reset.Device  Format.Failed

 

One of the reasons I am not yet prepared to use Storage Spaces or ReFS is the complete lack of decent documentation, best practice guides, or deployment recommendations. As an example, the only documentation on SSD journal drive configuration is in a TechNet forum post from a Microsoft employee, requiring the use of PowerShell, and even then there is no mention of scaling or size ratio requirements. Yes, the actual PowerShell cmdlet parameters are documented on MSDN, but not their use or meaning.

PowerShell is very powerful and Server 2012 is completely manageable using PowerShell, but an appeal of Windows has always been the management user interface, especially important for adoption by SMB’s that do not have a dedicated IT staff. With Windows Home Server being replaced by Windows Server 2012 Essentials, the lack of storage management via the UI will require regular users to become PowerShell experts, or maybe Microsoft anticipates that configuration UI’s will be developed by hardware OEM’s deploying Windows Storage Server 2012 or Windows Server 2012 Essentials based systems.

My feeling is that Storage Spaces will be one of those technologies that matures and becomes generally usable after one or two releases or service packs post the initial release.

 

I tested disk performance using ATTO Disk Benchmark 2.47, and CrystalDiskMark 3.01c.

I ran each test twice, back to back, and report the average. I realize two runs are not statistically significant, but with just two runs it took several days to complete the testing in between regular work activities. I opted to only publish the CrystalDiskMark data as the ATTO Disk Benchmark results varied greatly between runs, while the CrystalDiskMark results were consistent.

Consider the values useful for relative comparison under my test conditions, but not useful for absolute comparison with other systems.

 

Before we get to the results, a word on the tests.

The JBOD tests were performed using the C600 SATA3 controller.
The Simple, Mirror, Triple, and RAID0 tests were performed using the SAS 2308 SAS2 controller.
The Parity, RAID5, RAID6, and CacheCade tests were performed using the SAS 9286CV-8e controller.

The Simple test created a simple storage pool.
The Mirror test created a 2-way mirrored storage pool.
The Triple test created a 3-way mirrored storage pool.
The Parity test created a parity storage pool.
The Journal test created a parity storage pool, with SSD drives used for the journal disks.
The CacheCade test created RAID sets, with SSD drives used for caching.

 

As I mentioned earlier, there is next to no documentation on how to use Storage Spaces. In order to use SSD drives as journal drives, I followed information provided in a TechNet forum post.

Create the parity storage pool using PowerShell or the GUI. Then associate the SSD drives as journal drives with the pool.
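
For completeness, creating the pool itself from PowerShell looks something like the sketch below (the friendly name and the 1TB size filter used to separate the 4TB data drives from the SSDs are placeholders, not the exact commands I ran):

# Pool the poolable HDDs; the SSDs are added afterwards as journal disks (see the listing below)
$hdd = Get-PhysicalDisk -CanPool $True | Where-Object Size -gt 1TB
New-StoragePool -FriendlyName "Pool" -StorageSubSystemFriendlyName "Storage Spaces*" -PhysicalDisks $hdd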

Windows PowerShell
Copyright (C) 2012 Microsoft Corporation. All rights reserved.

PS C:\Users\Administrator> Get-PhysicalDisk -CanPool $True

FriendlyName CanPool OperationalStatus HealthStatus Usage Size
------------ ------- ----------------- ------------ ----- ----
PhysicalDisk4 True OK Healthy Auto-Select 447.13 GB
PhysicalDisk5 True OK Healthy Auto-Select 447.13 GB

PS C:\Users\Administrator> $PDToAdd = Get-PhysicalDisk -CanPool $True
PS C:\Users\Administrator>
PS C:\Users\Administrator> Add-PhysicalDisk -StoragePoolFriendlyName "Pool" -PhysicalDisks $PDToAdd -Usage Journal
PS C:\Users\Administrator>
PS C:\Users\Administrator>
PS C:\Users\Administrator> Get-VirtualDisk

FriendlyName ResiliencySettingNa OperationalStatus HealthStatus IsManualAttach Size
me
------------ ------------------- ----------------- ------------ -------------- ----
Pool Parity OK Healthy False 18.18 TB

PS C:\Users\Administrator> Get-PhysicalDisk

FriendlyName CanPool OperationalStatus HealthStatus Usage Size
------------ ------- ----------------- ------------ ----- ----
PhysicalDisk0 False OK Healthy Auto-Select 3.64 TB
PhysicalDisk1 False OK Healthy Auto-Select 3.64 TB
PhysicalDisk2 False OK Healthy Auto-Select 3.64 TB
PhysicalDisk3 False OK Healthy Auto-Select 3.64 TB
PhysicalDisk4 False OK Healthy Journal 446.5 GB
PhysicalDisk5 False OK Healthy Journal 446.5 GB
PhysicalDisk6 False OK Healthy Auto-Select 3.64 TB
PhysicalDisk7 False OK Healthy Auto-Select 3.64 TB
PhysicalDisk8 False OK Healthy Auto-Select 447.13 GB
PhysicalDisk10 False OK Healthy Auto-Select 14.9 GB

PS C:\Users\Administrator>

I initially added the journal drives after the virtual drive was already created, but then the journal drives would not be used. I had to delete the virtual drive and recreate it, and then the journal drives kicked in. There must be some way to manage this after virtual drives already exist, but again, no documentation.
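
What ended up working was essentially the following (a sketch using the names from the listing above; note that deleting the virtual disk destroys whatever is on it):

# Delete the existing parity virtual disk, then recreate it so the journal disks are used
Remove-VirtualDisk -FriendlyName "Pool" -Confirm:$false
New-VirtualDisk -StoragePoolFriendlyName "Pool" -FriendlyName "Pool" -ResiliencySettingName Parity -UseMaximumSize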

 

In order to test Storage Spaces using the SAS 9286CV-8e RAID controller I had to switch it to JBOD mode using the commandline MegaCli utility.


D:\Install>MegaCli64.exe AdpSetProp EnableJBOD 1 a0

Adapter 0: Set JBOD to Enable success.

Exit Code: 0x00

D:\Install>MegaCli64.exe AdpSetProp EnableJBOD 0 a0

Adapter 0: Set JBOD to Disable success.

Exit Code: 0x00

D:\Install>

 

The RAID and CacheCade disk sets were created using the LSI MegaRAID Storage Manager GUI utility.

 

Below is a summary of the throughput results:

ReadWriteKBPS

ReadWriteIOPS

 

Not surprisingly the SSD drives had very good scores all around for JBOD, Simple, and RAID0. I only had two drives to test with, but I expect more drives to further improve performance.

The Simple, Mirror, and Triple test results speak for themselves, performance halving, and halving again.

The Parity test shows good read performance, and bad write performance. The write performance approaches that of a single disk.

The Parity with SSD Journal disks shows about the same read performance as without journal disks, and the write performance double that of a single disk.

The RAID0 and Simple throughput results are close, but the RAID0 write IOPS are double that of the Simple volume.

The RAID5 and RAID6 read performance is close to Parity, but the write performance is almost tenfold that of Parity. It appears that the LSI card writes to all drives in parallel, while Storage Spaces parity writes to one drive only.

The CacheCade read and write performance is less than without CacheCade, but the IOPS are tenfold higher.

The ReFS performance is about 30% less than the equivalent NTFS performance.

 

 

Until Storage Spaces gets thoroughly documented and improves performance, I’m sticking with hardware RAID solutions.

Synology DS2411+ Performance Review

In my last post I compared the performance of the Synology DS1511+ against the QNAP TS-859 Pro. As I finished writing that post, Synology announced the new Synology DS2411+.
Instead of using a DS1511+ and DX510 extender for 10 disks, the DS2411+ offers 12 disks in a single device. The price difference is also marginal: the DS1511+ is $836, the DX510 is $500, and the DS2411+ is $1700. That is a difference of only $364, and well worth it for the extra storage space, and the reliability and stability of having all drives in one enclosure. I ended up returning my DX510 and DS1511+, and got a DS2411+ instead.

To test the DS2411+, I ran the same performance tests, using the same MPIO setup as I described in my previous post. The only slight difference was in the way I configured the iSCSI LUN; the DS1511+ was configured as SHR2, while the DS2411+ was configured as RAID6. Theoretically both are the same when all the disks are the same size, and SHR2 ends up using RAID6 internally.
iSCSI LUN configuration:
DS2411.iSCSI.LUN

At idle the DS2411+ used 42W power, and under load it used 138W power. The idle power usage is close to the advertised 39W idle power usage, but quite a bit more than the advertised 105W power usage under load.

I use Remote Desktop Manager to manage all my devices in one convenient application. RDM supports web portals, Remote Desktop, Hyper-V, and many more remote configuration options, all in a single tabbed UI. What I found was that the Synology DSM has some problems when running in a tabbed IE browser. When I open the log history, I get a script error, and whenever I focus away and back on the browser window, the DSM desktop windows shift all the way to the left. I assume this is a DSM problem related to absolute and relative referencing. I logged a support case, and I hope they can fix it.
Script error:
DS2411.DSM.Script.Error

Test results:

Device       ATTO Read   ATTO Write   CDM Read   CDM Write
PM810        267.153     260.839      256.674    251.850
DS2411+      244.032     165.564      149.802    156.673
DS1511+      244.032     126.030      141.213    115.032
TS-859 Pro   136.178     95.152       116.015    91.097

(All values in MB/s.)

Chart
DS2411+:
Atto.Synology.MPIO  CDM.Synology.MPIO
DS1511+:
Atto.Synology.MPIO  CDM.Synology.MPIO

The DS2411+ published performance numbers are slightly better than the DS1511+ numbers, and my testing confirms that. So far I am really impressed with the DS2411+.

Synology DS1511+ vs. QNap TS-859 Pro, iSCSI MPIO Performance

I have been very happy with my QNap TS-859 Pro (Amazon), but I’ve run out of space while archiving my media collection, and I needed to expand the storage capacity. You can read about my experience with the TS-859 Pro here, and my experience archiving my media collection here.
My primary objective with this project is storage capacity expansion, and my secondary objective is improved performance.

My choices for storage capacity expansion included:

  • Replace the 8 x 2TB drives with 8 x 3TB drives, to give me 6TB of extra storage. The volume expansion would be very time consuming, but my network setup can remain unchanged during the expansion.
  • Get a second TS-859 Pro with 8 x 3TB drives, to give me 18TB of extra storage. I would need to add the new device to my network, and somehow rebalance the storage allocation across the two devices, without changing the file sharing paths, probably by using directory mount points.
  • Get a Synology DS1511+ (Amazon) and a DX510 (Amazon) expansion unit with 10 x 3TB drives to replace the QNap, to give me 12TB of extra storage, expandable to 15 x 3TB drives for 36TB of total storage. I will need to copy all data to the new device, then mount the new device in place of the old device.

I opted for the DS1511+ with one DX510 expansion unit; I can always add a second DX510 and expand the volume later if needed.
As far as hard drives go, I’ve been very happy with the Hitachi Ultrastar A7K2000 2TB drives I use in my workstations and the QNap, so I stayed with Hitachi and chose the larger Ultrastar 7K3000 3TB drives for the Synology expansion.

For improving performance I had a few ideas:

  • The TS-859 Pro is a bit older than the DS1511+, and there are newer and more powerful QNap models available, like the TS-859 Pro+ (Amazon) with a faster processor, or the TS-659 Pro II (Amazon) with a faster processor and SATA3 support, so it is not totally fair to compare the TS-859 Pro performance against the newer DS1511+. But, the newer QNap models do not support my capacity needs.
  • I use Hyper-V clients and dynamic VHD files located on an iSCSI volume mounted in the host server. I elected this setup because it allowed me great flexibility in creating logical volumes for the VM’s, without actually requiring the space to be allocated. In retrospect this may have been convenient, but it was not performing well in large file transfers between the iSCSI target and the file server Hyper-V client.
    For my new setup I was going to mount the iSCSI volume as a raw disk in the file server Hyper-V client. This still allowed me to easily move the iSCSI volume between hosts, but the performance will be better than fixed size VHD files, and much better than dynamic VHD files.
    Here is a blog post describing some options for using iSCSI and Hyper-V.
  • I used iSCSI thin provisioning, meaning that the logical target has a fixed size, but the physical storage only gets allocated as needed. This is very convenient, but turned out to be slower than instant allocation. The QNap iSCSI implementation is also a file-level iSCSI LUN, meaning that the iSCSI volume is backed by a file on an EXT4 volume.
    For my new setup I was going to use the Synology block-level iSCSI LUN, meaning that the iSCSI volume is directly mapped to a physical storage volume.
  • I use a single LAN port to connect to the iSCSI target, meaning the IO throughput is limited by network bandwidth to 1Gb/s or 125MB/s.
    For my new setup I wanted to use 802.3ad link aggregation or Multi Path IO (MPIO) to extend the network speed to a theoretical 2Gb/s or 250MB/s. My understanding of link aggregation turned out to be totally wrong, and I ended up using MPIO instead.

To create a 2Gb/s network link between the server and storage, I teamed two LAN ports on the Intel server adapter, I created a bond of the two LAN ports on the Synology, and I created two trunks for those connections on the switch. This gave me a theoretical 2Gb/s pipe between the server and the iSCSI target. But my testing showed no improvement in performance over a single 1Gb/s link. After some research I found that the logical link is 2Gb/s, but that the physical network stream going from one MAC address to another MAC address is still limited by the physical transport speed, i.e. 1Gb/s. This means that the link aggregation setup is very well suited to e.g. connect a server to a switch using a trunk, and allow multiple clients access to the server over the switch, each at full speed, but it has no performance benefit when there is a single source and destination, as is the case with iSCSI. Since link aggregation did not improve the iSCSI performance, I used MPIO instead.

I set up a test environment where I could compare the performance of different network and device configurations using readily available hardware and test tools. Although my testing produced reasonably accurate relative results, due to the differences in environments, it can’t really be used for absolute performance comparisons.

Disk performance test tools:

Server setup:

Network setup:

  • HP ProCurve V1810 switch, Jumbo Frames enabled, Flow Control enabled.
  • Jumbo Frames enabled on all adapters.
  • CAT6 cables.
  • All network adapters connected to the switch.

QNap setup:

Synology setup:

To test the performance using the disk test tools I mounted the iSCSI targets as drives in the server. I am not going to cover details on how to configure iSCSI, you can read the Synology and QNap iSCSI documentation, and more specifically the MPIO documentation for Windows, Synology and QNap.
A few notes on setting up iSCSI:

  • The QNap MPIO documentation shows that LAN-1 and LAN-2 are in a trunked configuration. As far as I could tell the best practices documentation from Microsoft, DELL, Synology, and other SAN vendors, say that trunking and MPIO should not be mixed. As such I did not trunk the LAN ports on the QNap.
  • I connected all LAN cables to the switch. I could have used direct connections to eliminate the impact of the switch, but this is not how I will install the setup, and the switch should be sufficiently capable of handling the load without adding any performance degradation.
  • Before trying to enable MPIO on Windows Server, first connect one iSCSI target and map the device, then add the MPIO feature. If you do not have a mapped device, the MPIO iSCSI option will be greyed out.
  • The server’s iSCSI target configuration explicitly bound the source and destination devices based on the adapters’ IP addresses, i.e. server LAN-1 would bind to NAS LAN-1, etc. This ensured that traffic would only be routed to and from the specified adapters.
  • I found that the best MPIO load balance policy was the Least Queue Depth option (see the sketch after this list).
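
On current Windows versions the same configuration can be scripted with the iSCSI and MPIO PowerShell modules; below is a rough sketch (the IQN and the IP addresses are placeholders, and at the time I did all of this through the iSCSI Initiator and MPIO control panel applets):

# Add the MPIO feature and have MPIO claim iSCSI devices (a reboot is needed afterwards)
Add-WindowsFeature Multipath-IO
Enable-MSDSMAutomaticClaim -BusType iSCSI

# Default the load balance policy to Least Queue Depth
Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy LQD

# Register both NAS portals, one per subnet
New-IscsiTargetPortal -TargetPortalAddress 192.168.1.21
New-IscsiTargetPortal -TargetPortalAddress 192.168.2.21

# One session per path, explicitly binding each initiator IP to the matching target IP
Connect-IscsiTarget -NodeAddress "iqn.2000-01.com.synology:ds1511.target-1" -IsMultipathEnabled $true -IsPersistent $true -InitiatorPortalAddress 192.168.1.10 -TargetPortalAddress 192.168.1.21
Connect-IscsiTarget -NodeAddress "iqn.2000-01.com.synology:ds1511.target-1" -IsMultipathEnabled $true -IsPersistent $true -InitiatorPortalAddress 192.168.2.10 -TargetPortalAddress 192.168.2.21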

During my testing I encountered a few problems:

  • The DX510 expansion unit would sometimes not power on when the DS1511+ is powered on, or would sometimes fail to initialize the RAID volume, or would sometimes go offline while powered on. I RMA’d the device, and the replacement unit works fine.
  • During testing of the DS1511+, the write performance would sometimes degrade by 50% and never recover. The only solution was to reboot the device. Upgrading to the latest 3.1-1748 DSM firmware solved this problem.
  • During testing of the DS1511+, when one of the MPIO network links would go down, e.g. when I unplugged a cable, ghost iSCSI connections would remain open, and the iSCSI processes would consume 50% of the NAS CPU time. The only solution was to reboot the device. Upgrading to the latest 3.1-1748 DSM firmware solved this problem.
  • I could not get MPIO to work with the DS1511+, yet no errors were reported. It turns out that LAN-1 and LAN-2 must be on different subnets for MPIO to work.
  • Both the QNap and Synology exhibit weird LAN traffic behavior when both LAN-1 and LAN-2 are connected, and the server generates traffic directed at LAN-1 only. The NAS resource monitor would show high traffic volumes on LAN-1 and LAN-2, even with no traffic directed at LAN-2. I am uncertain why this happens, maybe a reporting issue, maybe a switching issue, but to avoid it influencing the tests, I disconnected LAN-2 while not testing MPIO.

My test methodology was as follows:

  • Mount either the QNap or Synology iSCSI device, and power off the other device while it is not being tested.
  • Connect the iSCSI target using LAN-1 only and unplug LAN-2, or connect using MPIO with LAN-1 and LAN-2 active.
  • Run all CDM tests with iterations set at 9, and a 4GB file-set size.
  • Run ATTO with the queue depth set to 8, and a 2GB file-set size.
  • As a baseline, I also tested the Samsung PM810 SSD drive using ATTO and CDM.

Test result summary:

Device            ATTO Read   ATTO Write   CDM Read   CDM Write   Total
PM810             267.153     260.839      256.674    251.850     1,036.516
DS1511+ MPIO      244.032     126.030      141.213    115.032     626.307
TS-859 Pro MPIO   136.178     95.152       116.015    91.097      438.442
DS1511+           122.294     120.172      89.258     105.618     437.342
TS-859 Pro        119.370     99.864       76.529     89.752      385.515

(All values in MB/s.)

image

Detailed results:
PM810:
Atto.P810 CDM.P810
DS1511+ MPIO:
Atto.Synology.MPIO CDM.Synology.MPIO
TS-859 Pro MPIO:
Atto.Qnap.MPIO CDM.Qnap.MPIO
DS1511+:
Atto.Synology CDM.Synology
TS-859 Pro:
Atto.Qnap CDM.Qnap

Initially, I was a little concerned about the DX510 being in a separate case connected with an eSATA cable to the main DS1511+. Especially after I had to RMA my first DX510 because of what appeared to be connectivity issues. I was also concerned that there would be a performance difference between the 5 drives in the DS1511+ and the 5 drives in the DX510. Testing showed no performance difference between a 5 drive volume and a 10 drive volume, and the only physically noticeable difference was that the drives in the DX510 ran a few degrees hotter compared to the drives in the DS1511+.

As you can see from the results, the DS1511+ with MPIO performs really well. Especially the 244MB/s ATTO read performance, which gets close to the theoretical maximum of 250MB/s over a 2Gb/s link.

But technology moves quickly, and as I was compiling my test data for this post, Synology released two new NAS units, the DS3611xs and the DS2411+. The DS2411+ is very appealing, it is equivalent in performance to the DS1511+, but supports 12 drives in the main enclosure.
I may just have to exchange my DS1511+ and DX510 for a DS2411+…

[Update: 25 July 2011]
I returned the DS1511+ and DX510 in exchange for a DS2411+.
Read my performance review here.

Archiving my CD, DVD and BD collection

I am about two thirds done archiving my entire CD, DVD, and BD collection to network storage. I have been ripping on a part time basis for about 5 months, and so far I’ve ripped over 700 discs.

I have considered archiving my media collection for some time, but just never got around to it. Recently our toddler discovered how to open discs and use them as toys, so storing the discs safely quickly became a priority. I’d like to give you some insight into what I’ve learned and what process I follow.

 

After ripping, I store the discs in aluminum storage cases that hold 600 discs in hanging sleeves. There are similar cases with a larger capacity, but the dimensions of the 600-disc case allow for easy manipulation and storage in my garage. I download or scan the cover images as part of the ripping process, so I had no need to keep the covers, and I, reluctantly, threw them away. If I could I would have kept the covers, but I found no convenient way to store them.

Below is a picture of the storage case:

StorageCase

 

All the ripped content is saved on my home server, and the files are accessible over wired Gigabit Ethernet and 802.11n Wireless. My server setup is probably excessive, but it serves a purpose. I run a Windows 2008 R2 Hyper-V Server. In the Hyper-V host I run two W2K8R2 guests, one being a Domain Controller, DHCP server, and DNS Server, and the other being a File Server. The file server storage is provided by 2 x QNAP TS-859 Pro iSCSI targets, each with 8 x 2TB drives in RAID6. This gives the file server about 24TB of usable disk space.

24TB may sound like a lot of storage, but considering that I store my documents, my pictures of which most are RAW, my home movies of which most are HD, and all my ripped media in uncompressed format, I really need that much storage.

 

I am currently using Boxee Boxes for media playback. The Boxee Box does not have all the features of XBMC, and I sometimes have to hard boot it to become operational again, but it plays most file types, it runs the Netflix app, and is reasonably maintenance free.

Although Boxee is derived from XBMC, I really miss some of the XBMC features, specifically the ability to set the type of content in a directory, and to sort by media meta-data. Like XBMC, Boxee expects directories and video files to be named a specific way, and the naming is used to lookup the content details. Unlike XBMC, Boxee treats all media sources the same way, so when I add a folder with TV episodes and another folder with movies Boxee often incorrectly classifies the content, and I have to spend time correcting the meta-data. What makes it worse is that I have to apply the same corrections on each individual Boxee Box, it would have been much more convenient if my Boxee account allowed my different Boxee Boxes to share configurations.

 

Ripping and storing the discs is part of the intake process, but I also need a searchable catalog of the disc information, where the ripped files are stored, and where the physical disc is stored. I use Music Collector and Movie Collector to catalog and record the disc information. Unlike other tools I've tested, the Music Collector Connect and Movie Collector Connect online services allow me to access my catalog content anywhere using a web browser. The Connect service does allow you to add content online, theoretically negating the need for the desktop products, but I found the desktop products to be much more effective for intake, and I then export the content online.

To catalog a CD I take the following steps: I start the automatic add feature, which computes the disc fingerprint and uses it to look up the disc details online. In most cases the disc is correctly identified, including album, artist, track names, etc. In many cases the front disc cover image is available, but it is rare that both the front and back covers are available. If either cover is not available, I scan my own covers and add them to the record. I found that many of the barcode numbers (UPC) do not match the barcode of my version of the disc; if they do not match, I scan my barcode and update the record. If I made any corrections, or added missing covers, I submit the updated data so that other users can benefit from my corrections and additions.

To catalog a DVD or BD I take the following steps: I start the automatic add feature, scan the barcode with a barcode scanner, and the barcode is used to look up the disc details online. In most cases the disc is correctly identified, including name, release year, etc. In some cases my discs do not have barcodes; this is especially true for box sets where the box may have a barcode but the individual movies in the box do not, or where I threw away the part of the box that had the barcode.

Since I buy most of my movies from Amazon, I can use my order history to find the Amazon ASIN number of the item I purchased. I then use IMDB to look up the UPC code associated with the ASIN number. To do this, search for the movie by name in IMDB, click on the “dvd details” dropdown in the “quick links” section, search the page for the ASIN number, and copy the associated UPC code. Alternatively, you can just search Google for “[ASIN number] UPC”, which is sometimes successful. I don’t know why Amazon, which owns IMDB, does not display UPC codes on the product details page.

If I still do not have a UPC code, I search for the movie by name, look at the results, and pick the movie with the cover matching my disc. In most cases the disc front and back covers are available. If either cover is not available, I scan my own covers and add them to the record. If I made any corrections, or added missing covers, I submit the updated data so that other users can benefit from my corrections and additions.

Below are screenshots of Music Collector and Music Collector Online:

[Screenshots: Music Collector and Music Collector Online]

Below are screenshots of Movie Collector and Movie Collector Online:

[Screenshots: Movie Collector and Movie Collector Online]

 

In terms of the ripping process, ripping CDs is by far the most problematic and time consuming. Unlike BDs, which are very resilient, CDs scratch easily, resulting in read errors. Sometimes I had to re-rip the same disc multiple times, across multiple drives, before all tracks ripped accurately. I want accurate and complete meta-data for the ripped files. Sometimes automatic meta-data detection did not work, and I had to manually find and enter the artist, album, song title, etc. This is especially problematic when there are multiple variants of the same logical disc, such as different pressings or regional track content or track order, and I have to match the online meta-data against my particular version of the disc. BDs and DVDs typically have only one movie per disc, whereas each CD has multiple tracks, and the correct meta-data has to be set for the album and each track. So although a CD may physically rip much faster than a BD, it takes a lot more time and manual effort to accurately rip, tag, and catalog a CD.

I use dBpoweramp for ripping CDs; it has two advantages over other tools I've tested: AccurateRip and PerfectMeta.

Unlike data CDs, audio CD track data cannot be read 100% accurately using a data CD drive. If the CD drive reads a data track and encounters a read failure, it reports the failure to the reading software. If the CD drive reads an audio track and encounters a read failure, it may ignore the error, interpolate the data, or replace the data with silence, all without telling the reading software that there was an error. As a result, the saved file may contain pops, inaccurate data, or silence. In order to rip a CD track accurately, the ripping software needs to read the same track several times, compare the results, and keep re-reading the track until the same result has been obtained a number of times. This makes ripping CDs accurately a very time consuming process. Even if you do get the same result with every read, you are still not guaranteed that what you read is accurate; you may just have read the same bad data multiple times. You can read more about the technicalities of ripping audio CDs accurately here.
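To make the re-read-and-compare idea concrete, here is a minimal Python sketch, assuming a hypothetical read_track() callable that returns the raw audio bytes from the drive (this is not how dBpoweramp is implemented, just an illustration of the logic):

import hashlib

def rip_track_verified(read_track, required_matches=2, max_attempts=8):
    # Keep re-reading the track until the same data has been returned
    # required_matches times, or give up after max_attempts reads.
    counts = {}          # digest -> number of identical reads seen
    data_by_digest = {}  # digest -> the actual audio bytes
    for _ in range(max_attempts):
        data = read_track()
        digest = hashlib.sha1(data).hexdigest()
        counts[digest] = counts.get(digest, 0) + 1
        data_by_digest[digest] = data
        if counts[digest] >= required_matches:
            return data_by_digest[digest]
    raise IOError("track did not read consistently; the disc may be damaged")

As noted above, a consistent result can still be consistently wrong; that is the remaining gap that AccurateRip addresses.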

AccurateRip solves this problem by creating an online database of disc and track fingerprints. A track is read at full speed, the track's fingerprint is computed and compared against the fingerprints of the same track in the online database, and if the fingerprint matches, the track is known to be good and there is no need to re-read it. This allows CDs to be ripped very fast and very accurately.

I use the Free Lossless Audio Codec (FLAC) format for archiving my CDs. FLAC reduces the file size but retains the original audio quality. FLAC also supports meta-data, allowing the track artist, album, title, CD cover image, etc. to be stored in the file. Unlike the very common MP3 format, FLAC playback is not supported by Windows Media Player (WMP) by default. To make WMP, and Windows, play FLAC, you need to install the Xiph FLAC DirectShow filters, or use Media Player Classic Home Cinema (MPC-HC) instead. A typical audio CD rips to about 400MB of FLAC files.
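As an illustration of the kind of meta-data a FLAC file can carry, here is a small sketch using the third-party Python library mutagen (not part of the dBpoweramp workflow; the file name and tag values are made up):

from mutagen.flac import FLAC

# Open a hypothetical ripped track and write basic Vorbis comment tags.
audio = FLAC("01 - Example Track.flac")
audio["artist"] = "Example Artist"
audio["album"] = "Example Album"
audio["title"] = "Example Track"
audio["tracknumber"] = "1"
audio.save()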

Just like a CD track can be identified using a fingerprint, an entire CD can also be identified using a fingerprint. When the same CD is manufactured in different batches or different factories, it results in different track fingerprints for the same logical CD. The same logical CD may also contain different tracks or track orders when released in different regions, also resulting in different CD fingerprints. CDDB ID is the classic fingerprint, but it has uniqueness problems; the more modern Disc ID algorithm does not suffer from such problems, and it allows nearly unique fingerprints to be created by just looking at the track layout, i.e. there is no need to read the track data.
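To give an idea of how a fingerprint can be computed from nothing but the track layout, below is a rough Python sketch of the classic CDDB-style disc ID (simplified; real implementations work from frame offsets in the table of contents, and the newer MusicBrainz Disc ID hashes the full layout to avoid the collision problems mentioned above):

def cddb_style_disc_id(track_starts_seconds, disc_length_seconds):
    # track_starts_seconds: start time of each track, in seconds from the disc start.
    # disc_length_seconds: total playing time of the disc, in seconds.
    def digit_sum(n):
        return sum(int(d) for d in str(n))

    checksum = sum(digit_sum(start) for start in track_starts_seconds) % 255
    playing_time = disc_length_seconds - track_starts_seconds[0]
    track_count = len(track_starts_seconds)
    return (checksum << 24) | (playing_time << 8) | track_count

# Made-up 3-track disc: tracks start at 2s, 185s, and 412s, total length 600s.
print(hex(cddb_style_disc_id([2, 185, 412], 600)))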

CD meta-data providers match CD fingerprints against logical album details. Some of this information is freely available, from providers such as freeDB, Discogs, and MusicBrainz, and some is commercially available, from providers such as Gracenote, GD3, and AMG. Free providers are typically community driven, while commercial providers may have more accurate data.

PerfectMeta makes the tagging process easy, fast, detailed, and accurate. By integrating with a variety of meta-data providers, including the commercial GD3 and AMG services, PerfectMeta automatically selects the track meta-data from the most reliable provider, or the most consistent data.

Below are screenshots of dBpoweramp ripping a CD, and reviewing the meta-data:

[Screenshots: dBpoweramp ripping a CD, and reviewing the meta-data]

 

I use MakeMKV for ripping DVDs and BDs; it is fast and easy to use, and it supports extracting multiple audio, subtitle, and video tracks to a single output file.

MakeMKV creates Matroska Media Container (MKV) format output files. MKV supports multiple media streams and meta-data in the same file. MKV is not a compression format, it is just a container file; inside the container can be any type of media stream, such as an AVC video stream, a DTS-HD audio stream, a PGS subtitle stream, chapter markers, etc. MKV playback is not supported by WMP or Windows Media Center (WMC) by default. One solution is to install codec packs such as the K-Lite Codec Pack, but I prefer to use standalone players such as Boxee, XBMC, or MPC-HC.
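If you ever want to see exactly which streams ended up inside a given container, one option (not something this workflow depends on) is to ask MKVToolNix's mkvmerge to identify the file, for example from Python; "movie.mkv" is a placeholder and mkvmerge must be on the PATH:

import subprocess

result = subprocess.run(["mkvmerge", "--identify", "movie.mkv"],
                        capture_output=True, text=True, check=True)
print(result.stdout)  # lists the video, audio, and subtitle tracks in the container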

MakeMKV does not perform any recompression of the streams found on the DVD or BD, it simply reads them from the source and writes them to the MKV file. This means that the playback quality is unaltered and equivalent to that of the source material. This also means that the MKV file is normally the same size as the original DVD or BD disc, typically 7GB for a DVD and 35GB for a BD.

I hate starting a BD or DVD and having to sit there watching one trailer after the next, especially when the disc prohibits skipping the clip and the kids are getting impatient. I paid good money for the disc, so why am I forced to watch advertising on a disc I own? MakeMKV solves this problem by allowing me to rip only the main movie, and when I start playing the MKV file, I immediately see the main movie start. The downside to ripping only the main movie is that the disc extras are not available, and the downside to ripping in general is that BD+ interaction is also not available. Some people prefer to rip a disc to an ISO and then play the ISO with a software player that still allows menu navigation; I have no such need, and ripping only the main movie satisfies my requirements.

When I make my stream selection I pick the main movie, the main English audio track, the English subtitles, and the English forced subtitles. If a movie contains an HD audio track, such as DTS-HD, TrueHD, or LPCM, I also select the non-HD audio track. I do this in case the playback hardware does not support HD audio, or the player software cannot down-convert the HD audio to a format supported by the playback hardware. Some discs include both an HD audio and a non-HD audio track, but if not, MakeMKV can automatically extract DTS from DTS-HD and AC3 from TrueHD.

On some discs there are many subtitle streams of the same language, and selection gets very complicated; this is especially true when the disc contains forced subtitles. Forced subtitles are the subtitles that are displayed when there is dialog in a language other than the main audio language, such as when aliens are talking to each other, but when people talk there are no subtitles. On DVDs the forced subtitles are normally in a separate subtitle stream; on BDs the subtitle stream includes a forced-bit for specific sentences. MakeMKV can automatically extract the forced subtitles as a separate stream from a subtitle stream that contains both normal and forced subtitles. When I encounter a disc where I cannot make out which video, audio, or subtitle streams to extract, I use EAC3TO to extract the individual tracks, view, listen to, or read them, and then decide which tracks to select in MakeMKV.

Ripping television series on DVD or BD has its own challenges. In order for players like Boxee and XBMC to correctly identify the shows, the files and folders must be properly organized and named. A disc typically contains a few episodes of the series, and some discs contain extras. When you make the track selections you need to include the episodes but exclude the extras. MakeMKV creates a folder for every disc and names each file according to its track number on that disc. This results in multiple folders, one per disc, with duplicate file names in each folder. In order to re-assemble the series in one folder, you need to rename the episodes from each folder according to the correct season and episode number, such as S01E01.mkv, and then move all the files to one folder. What makes this very complicated is when the episode order on the disc is different from the aired episode order. The TV scrapers use community television series websites, such as TheTVDB and TVRage, to retrieve show information. The season and episode numbers must match the aired episode order, not the disc order. It is a real pain to manually match the disc episodes to the aired episode numbers, and I don't know why discs would use a different episode order than the aired order. Once you have your episodes named, such as S01E01.mkv, it is very easy to correctly name the files and folders by using an application called TVRename. Point TVRename to your ripped television show folder and it will try to automatically match show names to TheTVDB show names; you can manually search and correct mappings, and it will then automatically rename the show, season, and filenames according to your preference, in a format that Boxee and XBMC recognize.
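The final rename-and-move step that TVRename automates looks roughly like the Python sketch below; the disc-to-aired mapping is made up, and building that mapping is exactly the manual part described above:

import os
import shutil

# Hypothetical mapping from MakeMKV output files to aired season/episode numbers.
disc_to_aired = {
    os.path.join("Disc1", "title00.mkv"): (1, 1),
    os.path.join("Disc1", "title01.mkv"): (1, 2),
    os.path.join("Disc2", "title00.mkv"): (1, 4),  # disc order differs from aired order
    os.path.join("Disc2", "title01.mkv"): (1, 3),
}

show = "Example Show"
season_dir = os.path.join(show, "Season 01")
os.makedirs(season_dir, exist_ok=True)

for source, (season, episode) in disc_to_aired.items():
    new_name = "{0} - S{1:02d}E{2:02d}.mkv".format(show, season, episode)
    shutil.move(source, os.path.join(season_dir, new_name))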

Below are screenshots of MakeMKV with the stream selection screen for a DVD and a BD:

[Screenshots: MakeMKV stream selection for a DVD (The Wrath of Khan) and a BD (Iron Man 2)]

 

When I started ripping my collection I had no idea it would take this long. If I were to dedicate my time to ripping and ripping only, I would have been done a long time ago, but I typically rip only a few discs per week, in between regular work activities: get to the office, insert disc, start working, swap disc, continue working, swap disc, go to meeting, rip a few discs while having lunch at my desk, rip a few discs during the weekend, repeat. The time it takes to rip a disc is important when you stare at the screen, but less so when you have other things to do.

Over the months I've used a variety of BD readers; some worked well for BDs but were really bad for CDs, some were fast and some were slow. To illustrate the performance, I selected a BD, a DVD, and a CD, and I ripped them all using the same settings, on the same machine, but with a variety of drive models.

Some drive models incorporate a feature called riplocking, which limits the read speed when reading video discs in order to reduce drive noise. A riplocked drive will read a video BD or DVD much slower than a data BD or DVD, and this results in slow rip times. I used an application called Media Code Speed Edit (MCSE) to remove the riplock restriction on some of the drives.

All drives include Regional Playback Control (RPC), which restricts the media that can be played in that drive by region. There are different regions for DVD and BD discs. RPC-1 drives leave region protection enforcement to software; RPC-2 drives enforce the region protection in drive hardware. Most new drives are RPC-2 drives. Drive region protection is not an issue for MakeMKV, and it can rip any region disc on any region drive. RPC-1 firmware versions are available for many drives at the RPC-1 Database.

I tested the following drives:

Drive               Firmware  Notes
LG BH12LS35         1.00
LG BH12LS35         1.00      Riplock removed
LG UH10LS20         1.00
LG UH10LS20         1.00      Riplock removed
LG UH10LS20         1.00      RPC-1
Plextor PX-B940SA   1.08      Rebranded Pioneer BDR-205
Sony BD-5300S       1.04      Rebranded Lite-On iHBS112
Lite-On iHBS212     5L09
Pioneer BDR-206     1.05

I measured the rip speed in Mbps, computed by dividing the output file size by the rip time in seconds. The file size is the size of the MKV file for DVDs and BDs, and the size of all files in the album folder for CDs. The rip time is computed by subtracting the file create time from the file modified time. This is not a standard test methodology, and the results should not be used for absolute comparisons, but they are valid for relative comparisons. For more standard testing and reviews, visit CDFreaks.
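For reference, the calculation itself is trivial; a Python sketch of the per-disc speed (assuming the create and modified timestamps bracket the rip, which holds on Windows where st_ctime is the creation time, and using 1 Mbps = 1,000,000 bits per second) looks like this:

import os

def rip_speed_mbps(mkv_path):
    # Speed of a single DVD/BD rip: file size over (modified time - create time).
    info = os.stat(mkv_path)
    seconds = info.st_mtime - info.st_ctime
    return (info.st_size * 8) / seconds / 1_000_000

def album_rip_speed_mbps(album_folder):
    # Rough equivalent for a CD rip: total size of the album folder over the
    # span from the earliest create time to the latest modified time.
    paths = [os.path.join(album_folder, name) for name in os.listdir(album_folder)]
    stats = [os.stat(p) for p in paths]
    total_bits = sum(s.st_size for s in stats) * 8
    seconds = max(s.st_mtime for s in stats) - min(s.st_ctime for s in stats)
    return total_bits / seconds / 1_000_000

The album-folder variant is my own approximation of the same idea; the per-file version is exactly the size-over-timestamp-delta calculation described above.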

Test results:

[Chart: rip speed test results per drive]
From the results we can see that the Sony BD-5300S (a rebranded Lite-On iHBS112) and the Lite-On iHBS212 are the fastest ripping drives overall: the fastest BD ripping drives and the fastest DVD ripping drives, but the second slowest CD ripping drives. It is also interesting to note that the stock Lite-On drives were still faster than the riplock-removed LG drives. The Lite-On drives also have the smallest AccurateRip drive correction offsets of all the drives.

 

I still have quite a way to go before all my discs are ripped, but at least I have the process down; rip, swap, repeat.

Unlimited online backup providers becoming extinct

I just received an email from ElephantDrive informing me that my legacy unlimited storage account will be terminated in 30 days, and that I must select a new plan.

In July 2009 ElephantDrive announced that they are no longer offering their $100 per year unlimited storage plan. ElephantDrive now offers a 500GB plan for $200 per year.
In February 2011 Mozy announced that they are no longer offering their $55 per year unlimited storage plan. Mozy now offers a 125GB plan for $120 per year.
In February 2011 Trend Micro SafeSync announced that they are bandwidth throttling large accounts, and in March 2011 they announced that they are no longer offering their $35 per year unlimited storage plan. SafeSync now offers a 150GB plan for $150 per year.
Carbonite offers an unlimited storage plan for $55 per year, but they throttle accounts over 35GB to 512Kbps and accounts over 200GB to 100Kbps access speeds.
AVG LiveKive offers an unlimited storage plan for $80 per year, but the terms of service define unlimited as 500GB.
BackBlaze offers an unlimited storage plan for $60 per year.
CrashPlan offers an unlimited storage plan for $50 per year.
Neither BackBlaze nor CrashPlan supports their unlimited plan on server-class machines.

I currently have 2.1TB of data backed up online with ElephantDrive running on my Windows Server 2008 R2 machine. Needless to say, none of their new plans are affordable for that amount of storage. I either need to significantly trim down what I back up, or I need to find a new unlimited storage provider that also allows installs on Windows Server.
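To put a number on “not affordable”, here is a back-of-the-envelope Python calculation of what 2.1TB would cost per year at the new prices listed above, assuming the per-GB price simply scales to 2.1TB (real tiered pricing would differ):

# Rough annual cost of storing 2.1TB at the new per-GB prices quoted above.
data_gb = 2.1 * 1000
plans = {
    "ElephantDrive (500GB for $200/yr)": 200 / 500,
    "Mozy (125GB for $120/yr)": 120 / 125,
    "SafeSync (150GB for $150/yr)": 150 / 150,
}
for name, dollars_per_gb in plans.items():
    print("{0}: ~${1:,.0f} per year".format(name, dollars_per_gb * data_gb))

That works out to roughly $840, $2,016, and $2,100 per year respectively.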
For now, I’m uninstalling ElephantDrive.

[Update]
CrashPlan’s new v3 software installs and runs fine on Windows Server 2008 R2, and I have switched to using CrashPlan for my backup needs.

Here is an example snippet of the status emails I receive from CrashPlan:

Source → Target                 Selected      Files    Backed Up %   Last Connected   Last Backup
VM-STORAGE → CrashPlan Central  2.1TB ↑1KB    423k 0   100.0%        2.5 hrs          4.3 hrs