I am about two-thirds done archiving my entire CD, DVD, and BD collection to network storage. I have been ripping on a part-time basis for about 5 months, and so far I’ve ripped over 700 discs.
I have considered archiving my media collection for some time, but just never got around to it. Recently our toddler discovered how to open discs and use them as toys, so storing the discs safely quickly became a priority. I’d like to give you some insight into what I’ve learned and what process I follow.
After ripping, I store the discs in aluminum storage cases that hold 600 discs in hanging sleeves. There are similar cases with a larger capacity, but the dimensions of the 600-disc case allow for easy handling and storage in my garage. I download or scan the cover images as part of the ripping process, so I had no need to keep the printed covers, and I reluctantly threw them away. I would have kept the covers if I could, but I found no convenient way to store them.
Below is a picture of the storage case:
All the ripped content is saved on my home server, and the files are accessible over wired Gigabit Ethernet and 802.11n Wireless. My server setup is probably excessive, but it serves a purpose. I run a Windows 2008 R2 Hyper-V Server. In the Hyper-V host I run two W2K8R2 guests, one being a Domain Controller, DHCP server, and DNS Server, and the other being a File Server. The file server storage is provided by 2 x QNAP TS-859 Pro iSCSI targets, each with 8 x 2TB drives in RAID6. This gives the file server about 24TB of usable disk space.
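As a quick check of the 24TB figure, here is a minimal sketch of the RAID6 capacity arithmetic. It assumes decimal terabytes and ignores filesystem and iSCSI overhead, so the space actually reported by the file server will be somewhat lower; the function name is just for illustration.

```python
def raid6_usable_tb(drives: int, drive_tb: float) -> float:
    """RAID6 spends two drives' worth of space on parity,
    so the usable capacity is that of (drives - 2) drives."""
    if drives < 4:
        raise ValueError("RAID6 needs at least 4 drives")
    return (drives - 2) * drive_tb


# Two NAS units, each with 8 x 2TB drives in RAID6.
units = 2
total_tb = units * raid6_usable_tb(drives=8, drive_tb=2.0)
print(f"Usable capacity: {total_tb:.0f}TB")
```

Each unit contributes 6 data drives of 2TB, or 12TB, which is where the combined 24TB comes from.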
24TB may sound like a lot of storage, but considering that I store my documents, my pictures of which most are RAW, my home movies of which most are HD, and all my ripped media in uncompressed format, I really need that much storage.
I am currently using Boxee Boxes for media playback. The Boxee Box does not have all the features of XBMC, and I sometimes have to hard boot it to become operational again, but it plays most file types, it runs the Netflix app, and is reasonably maintenance free.
Although Boxee is derived from XBMC, I really miss some of the XBMC features, specifically the ability to set the type of content in a directory, and to sort by media meta-data. Like XBMC, Boxee expects directories and video files to be named a specific way, and the naming is used to look up the content details. Unlike XBMC, Boxee treats all media sources the same way, so when I add a folder with TV episodes and another folder with movies, Boxee often classifies the content incorrectly, and I have to spend time correcting the meta-data. What makes it worse is that I have to apply the same corrections on each individual Boxee Box. It would have been much more convenient if my Boxee account allowed my different Boxee Boxes to share configurations.
Ripping and storing the discs is part of the intake process, but I also need a searchable catalog of the disc information, where the ripped files are stored, and where the physical disc is stored. I use Music Collector and Movie Collector to catalog and record the disc information. Unlike other tools I’ve tested, the Music Collector Connect and Movie Collector Connect online services allow me to access my catalog content anywhere using a web browser. The Connect service does allow you to add content online, theoretically negating the need for the desktop products, but I found the desktop products to be much more effective for intake, after which I export the content online.
To catalog a CD I take the following steps: I start the automatic add feature, which computes the disc fingerprint and uses the fingerprint to look up the disc details online. In most cases the disc is correctly identified, including album, artist, track names, etc. In many cases the front cover image is available, but it is rare that both the front and back covers are available. If either cover is not available, I scan my own covers, and add them to the record. I found that many of the barcode numbers (UPC) do not match the barcode of my version of the disc. If they do not match, I scan my barcode and update the record. If I made any corrections, or added missing covers, I submit the updated data, so that other users can benefit from my corrections and additions.
To catalog a DVD or BD I take the following steps: I start the automatic add feature and scan the barcode with a barcode scanner, and the barcode is used to look up the disc details online. In most cases the disc is correctly identified, including name, release year, etc. In some cases my discs do not have barcodes. This is especially true for box sets, where the box may have a barcode but the individual movies in the box do not, or where I threw away the part of the box that had the barcode.
Since I buy most of my movies from Amazon, I can use my order history to find the Amazon ASIN number of the item I purchased. I then use IMDB to look up the UPC code associated with the ASIN number. To do this, search for the movie by name in IMDB, click on the “dvd details” dropdown in the “quick links” section, search the page for the ASIN number, and copy and paste the associated UPC code. Alternatively you can just use Google and search for “[ASIN number] UPC”, which is sometimes successful. I don’t know why Amazon, who owns IMDB, does not display UPC codes on the product details page.
If I still do not have a UPC code, I search for the movie by name, look at the results, and pick the movie with the cover matching my disc. In most cases the front and back covers are available. If either cover is not available, I scan my own covers, and add them to the record. If I made any corrections, or added missing covers, I submit the updated data, so that other users can benefit from my corrections and additions.
Below are screenshots of Music Collector and Music Collector Online:
Below are screenshots of Movie Collector and Movie Collector Online:
In terms of the ripping process, ripping CD’s is really the most problematic and time consuming. Unlike BD’s, which are very resilient, CD’s scratch easily, resulting in read errors. Sometimes I had to re-rip the same disc multiple times, on multiple drives, before all tracks ripped accurately. I also want accurate and complete meta-data for the ripped files. Sometimes automatic meta-data detection did not work and I had to manually find and enter the artist, album, song title, etc. This is especially problematic when there are multiple variants of the same logical disc, such as different pressings or regional differences in track content or track order, and I have to match the online meta-data against my particular version of the disc. BD’s and DVD’s typically have only one movie per disc, whereas each CD has multiple tracks, and the correct meta-data has to be set for the album and each track. So although a CD may physically rip much faster than a BD, it takes a lot more time and manual effort to accurately rip, tag, and catalog a CD.
Unlike data CD’s, audio CD track data cannot be guaranteed to read 100% accurately using a data CD drive. If the CD drive reads a data track and encounters a read failure, it reports the failure to the reading software. If the CD drive reads an audio track and encounters a read failure, it may ignore the error, it may interpolate the data, or it may replace the data with silence, all without telling the reading software that there was an error. As a result the saved file may contain pops, inaccurate data, or silence. In order to rip a CD track accurately, the ripping software needs to read the same track several times, compare the results, and keep re-reading the track until the same result has been obtained a number of times. This makes ripping CD’s accurately a very time consuming process. Even if you do get the same results with every read, you are still not guaranteed that what you read is accurate; you may just have read the same bad data multiple times. You can read more about the technicalities of ripping audio CD’s accurately here.
AccurateRip solves this problem by creating an online database of disc and track fingerprints. A track is read at full speed, and the track’s fingerprint is computed and compared against the fingerprints in the online database for the same track. If the fingerprint matches, the track is known to be good, and there is no need to re-read the track. This allows CD’s to be ripped very fast and very accurately.
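The re-read-and-compare idea can be sketched roughly as follows. This is only an illustration of the principle, not how any particular ripper implements it: `read_track` is a hypothetical callable standing in for the drive, and the thresholds are arbitrary assumptions.

```python
import hashlib
from collections import Counter
from typing import Callable, Dict, Optional


def secure_read(read_track: Callable[[], bytes],
                required_matches: int = 2,
                max_reads: int = 10) -> Optional[bytes]:
    """Re-read a track until the same data has been seen
    `required_matches` times, or give up after `max_reads` reads."""
    seen: Counter = Counter()
    data_by_digest: Dict[str, bytes] = {}
    for _ in range(max_reads):
        data = read_track()  # hypothetical: one full read of the track
        digest = hashlib.sha256(data).hexdigest()
        seen[digest] += 1
        data_by_digest[digest] = data
        if seen[digest] >= required_matches:
            return data_by_digest[digest]
    return None  # no result repeated often enough


# Simulated drive: the second read silently returns damaged data.
reads = iter([b"AAAA", b"AAxA", b"AAAA"])
result = secure_read(lambda: next(reads))
print(result)
```

Note that this only proves the reads are consistent with each other; as described above, a drive that returns the same bad data every time would still pass.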
I use the Free Lossless Audio Codec (FLAC) format for archiving my CD’s. FLAC reduces the file size, but retains the original audio quality. FLAC also supports meta-data, allowing the track artist, album, title, CD cover image, etc. to be stored in the file. Unlike the very common MP3 format, FLAC playback is by default not supported by Windows Media Player (WMP). To make WMP, and Windows, play FLAC, you need to install the Xiph FLAC DirectShow filters, or use Media Player Classic Home Cinema (MPC-HC). A typical audio CD rips to about 400MB in FLAC files.
Just like a CD track can be identified using a fingerprint, an entire CD can also be identified using a fingerprint. When the same CD is manufactured in different batches, or in different factories, it results in different track fingerprints for the same logical CD. The same logical CD may also contain different tracks or track orders when released in different regions, also resulting in different CD fingerprints. CDDB ID is the classic fingerprint, but it has uniqueness problems. The more modern Disc ID algorithm does not suffer from such problems, and allows far more distinctive fingerprints to be created by just looking at the track layout, i.e. there is no need to read the track data.
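To show how a fingerprint can come from the track layout alone, and why the classic one is weak, here is a sketch of the CDDB ID calculation as it is commonly documented. It assumes track offsets are given in frames (75 per second) and include the standard 150-frame lead-in; the example layouts are made up.

```python
from typing import List

FRAMES_PER_SECOND = 75


def digit_sum(n: int) -> int:
    """Sum of the decimal digits of n."""
    return sum(int(d) for d in str(n))


def cddb_disc_id(track_offsets: List[int], leadout_offset: int) -> str:
    """Classic CDDB disc ID from the track layout only (no audio data).

    Offsets are frame positions, assumed to include the 150-frame lead-in.
    """
    start_seconds = [o // FRAMES_PER_SECOND for o in track_offsets]
    checksum = sum(digit_sum(s) for s in start_seconds) % 0xFF
    length = leadout_offset // FRAMES_PER_SECOND - start_seconds[0]
    disc_id = (checksum << 24) | (length << 8) | len(track_offsets)
    return f"{disc_id:08x}"


# Two made-up three-track layouts with a different second track start.
disc_a = cddb_disc_id([150, 15150, 30150], leadout_offset=45150)
disc_b = cddb_disc_id([150, 8400, 30150], leadout_offset=45150)
print(disc_a, disc_b, disc_a == disc_b)
```

The two layouts differ, yet they produce the same ID, because only a digit-sum checksum, the total length, and the track count go into the 32-bit value. That is the uniqueness problem mentioned above.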
CD meta-data providers match CD fingerprints against logical album details. Some of this information is freely available, such as freeDB, Discogs, and MusicBrainz, and some information is commercially available, such as Gracenote, GD3, and AMG. Free providers are typically community driven, while commercial providers may have more accurate data.
dBpoweramp’s PerfectMeta feature makes the tagging process easy, fast, detailed, and accurate. It integrates with a variety of meta-data providers, including the commercial GD3 and AMG, and automatically selects the track meta-data based on the most reliable provider, or the most consistent data.
Below are screenshots of dBpoweramp ripping a CD, and reviewing the meta-data:
I use MakeMKV for ripping DVD’s and BD’s. It is fast and easy to use, and supports extracting multiple audio, subtitle, and video tracks to a single output file.
MakeMKV creates Matroska Media Container (MKV) format output files. MKV supports multiple media streams and meta-data in the same file. MKV is not a compression format, it is just a container file. Inside the container can be any type of media stream, such as an AVC video stream, a DTS-HD audio stream, a PGS subtitle stream, chapter markers, etc. MKV playback is by default not supported by WMP or Windows Media Center (WMC). One solution is to install codec packs such as the K-Lite Codec Pack, but I prefer to use standalone players such as Boxee, XBMC, or MPC-HC.
MakeMKV does not perform any recompression of the streams found on the DVD or BD, it simply reads them from the source and writes them to the MKV file. This means that the playback quality is unaltered and equivalent to that of the source material. This also means that the MKV file is normally the same size as the original DVD or BD disc, typically 7GB for a DVD and 35GB for a BD.
I hate starting a BD or DVD and having to sit there watching one trailer after the next, especially when the disc prohibits skipping the clip and the kids are getting impatient. I paid good money for the disc, so why am I forced to watch advertising on a disc I own? MakeMKV solves this problem by allowing me to rip only the main movie, and when I start playing the MKV file, I immediately see the main movie start. The downside to ripping only the main movie is that disc extras are not available, and the downside to ripping in general is that BD+ interaction is also not available. Some people prefer to rip a disc to an ISO, and then play the ISO with a software player that still allows menu navigation. I have no such need, and ripping only the main movie satisfies my requirements.
When I make my stream selection I pick the main movie, the main English audio track, the English subtitles, and the English forced subtitles. If a movie contains an HD audio track, such as DTS-HD, TrueHD, or LPCM, I also select the non-HD audio track. I do this in case the playback hardware device does not support HD audio, or the player software cannot down-convert the HD audio to a format supported by the playback hardware. Some discs include both an HD and a non-HD audio track, but if not, MakeMKV can automatically extract DTS from DTS-HD and AC3 from TrueHD.
On some discs where there are many subtitle streams of the same language, selection gets very complicated, especially when the disc contains forced subtitles. Forced subtitles are the subtitles that are displayed when there is dialog in a language other than the main audio language, such as when aliens are talking to each other, while there are no subtitles when people talk. On DVD’s the forced subtitles are normally in a separate subtitle stream, whereas on BD’s the subtitle stream includes a forced-bit for specific sentences. MakeMKV can automatically extract forced subtitles as a separate stream from a subtitle stream that contains normal and forced subtitles. When I encounter a disc where I cannot make out which video, audio, or subtitle streams to extract, I use EAC3TO to extract the individual tracks, view, listen to, or read them, and then decide which tracks to select in MakeMKV.
Ripping television series on DVD or BD has its own challenges. In order for players like Boxee and XBMC to correctly identify the shows, the files and folders must be properly organized and named. A disc typically contains a few episodes of the series, and some discs contain extras. When you make the track selections you need to include the episodes but exclude the extras. MakeMKV creates a folder for every disc, and names each file according to its track number on that disc. This results in multiple folders, one per disc, with duplicate file names in each folder. In order to re-assemble the series in one folder, you need to rename the episodes from each folder according to the correct season and episode number, such as S01E01.mkv, then move all the files to one folder. What makes this very complicated is when the episode order on disc differs from the aired episode order. The TV scrapers use community television series websites, such as TheTVDB and TVRage, to retrieve show information. The season number and episode number must match the aired episode number, not the disc order number. It is a real pain to manually match the disc order to the aired episode numbers, and I don’t know why discs would use a different order from the aired order. Once you have your episodes named, such as S01E01.mkv, it is very easy to correctly name the files and folders by using an application called TVRename. Point TVRename to your ripped television show folder, and it will try to automatically match show names to the TheTVDB show names. You can manually search and correct mappings, and it will then automatically rename the show, season, and file names according to your preference, in a format that Boxee and XBMC recognize.
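The manual step of collecting per-disc files into one folder with SxxEyy names could be scripted along these lines. This is a rough sketch: the disc folder names, the `title00.mkv` style file names, and the show paths are hypothetical, and the disc-to-aired-order mapping still has to be worked out by hand.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def episode_name(season: int, episode: int) -> str:
    """Build a name such as S01E01.mkv."""
    return f"S{season:02d}E{episode:02d}.mkv"


def plan_renames(rip_root: Path, series_dir: Path,
                 aired_order: Dict[Tuple[str, str], Tuple[int, int]]
                 ) -> List[Tuple[Path, Path]]:
    """Map (disc folder, track file) to its aired (season, episode) target."""
    plan = [(rip_root / disc / track, series_dir / episode_name(s, e))
            for (disc, track), (s, e) in aired_order.items()]
    targets = [dst for _, dst in plan]
    if len(set(targets)) != len(targets):
        raise ValueError("two tracks map to the same episode")
    return plan


def apply_renames(plan: List[Tuple[Path, Path]]) -> None:
    """Move every ripped file to its final name in the series folder."""
    for src, dst in plan:
        dst.parent.mkdir(parents=True, exist_ok=True)
        src.rename(dst)


# Hypothetical mapping, entered by hand from the aired episode list.
aired_order = {
    ("DISC1", "title00.mkv"): (1, 1),
    ("DISC1", "title01.mkv"): (1, 3),  # disc order differs from aired order
    ("DISC2", "title00.mkv"): (1, 2),
}
plan = plan_renames(Path("rips/MyShow"), Path("tv/MyShow"), aired_order)
for src, dst in plan:
    print(src, "->", dst)
```

Printing the plan before calling `apply_renames(plan)` gives a chance to check the mapping, and the duplicate check catches two tracks being assigned the same episode.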
Below are screenshots of MakeMKV with the stream selection screen for a DVD and a BD:
When I started ripping my collection I had no idea it would take this long. If I were to dedicate my time to ripping and ripping only, I would have been done a long time ago, but I typically rip only a few discs per week, in between regular work activities: get to the office, insert disc, start working, swap disc, continue working, swap disc, go to meeting, rip a few discs while having lunch at my desk, rip a few discs during the weekend, repeat. The time it takes to rip a disc is important when you stare at the screen, but less so when you have other things to do.
Over the months I’ve used a variety of BD readers, some worked well for BD’s, but were really bad for CD’s, some were fast and some were slow. To illustrate the performance, I selected a BD, a DVD, and a CD, and I ripped them all using the same settings, on the same machine, but using a variety of drive models.
Some drive models incorporate a feature called riplocking, which limits the read speed when reading video discs in order to reduce drive noise. A riplocked drive will read a video BD or DVD much slower than a data BD or DVD, and this results in slow rip times. I used an application called Media Code Speed Edit (MCSE) to remove the riplock restriction on some of the drives.
All drives include Regional Playback Control (RPC), which restricts the media that can be played in that drive by region. There are different regions for DVD and BD discs. RPC-1 drives allow software to enforce the region protection, while RPC-2 drives perform the region protection in drive hardware. Most new drives are RPC-2 drives. Drive region protection is not an issue for MakeMKV, and it can rip any region disc on any region drive. RPC-1 versions of firmware are available for many drives at the RPC-1 Database.
I tested the following drives:
| Drive | Firmware | Notes |
|---|---|---|
| LG BH12LS35 | 1.00 | Riplock removed |
| LG UH10LS20 | 1.00 | Riplock removed |
| Plextor PX-B940SA | 1.08 | Rebranded Pioneer BDR-205 |
| Sony BD-5300S | 1.04 | Rebranded Lite-On iHBS112 |
I measured the rip speed in Mbps, computed by dividing the output file size by the rip time in seconds. The file size is the size of the MKV file for DVD’s and BD’s, and the size of all files in the album folder for CD’s. The rip time is computed by subtracting the file create time from the file modified time. The test methodology is not a standard test, and the results should not be used in absolute comparisons, but they are valid for relative comparisons. For more standard testing and reviews visit CDFreaks.
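The calculation can be sketched as below. It assumes "Mbps" means decimal megabits per second (bytes times 8, divided by 1,000,000), which is my reading rather than something stated above, and the example disc size and rip time are made-up numbers, not measured results.

```python
def rip_speed_mbps(size_bytes: int, created: float, modified: float) -> float:
    """Rip speed in megabits per second.

    `created` and `modified` are file timestamps in seconds; the rip time
    is taken as the difference between them.
    """
    seconds = modified - created
    if seconds <= 0:
        raise ValueError("modified time must be later than create time")
    megabits = size_bytes * 8 / 1_000_000  # assumption: decimal megabits
    return megabits / seconds


# Hypothetical example: a 35GB MKV file written over 30 minutes.
speed = rip_speed_mbps(35_000_000_000, created=0.0, modified=30 * 60)
print(f"{speed:.1f} Mbps")
```

On Windows the two timestamps can be read from the file properties or from `os.stat`. Because the create time is only set once the first output file is opened, any spin-up or disc analysis time before that is not counted.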
From the results we can see that the Sony BD-5300S (a rebranded Lite-On iHBS112) and the Lite-On iHBS212 drives are the fastest overall ripping drives, the fastest BD ripping drives, and the fastest DVD ripping drives, but the second slowest CD ripping drives. It is also interesting to note that the stock Lite-On drives were still faster than the riplock-removed LG drives. The Lite-On drives also have the smallest AccurateRip drive correction offsets of all the drives.
I still have quite a way to go before all my discs are ripped, but at least I have the process down: rip, swap, repeat.