Geeky storage ideas, but also of use to non-geeks

20 Feb 2005

Pontification 

I've had an idea about better ways to do "consumer friendly" storage. It's been kicking around in my head for a while, and I've explained it to a few others, but I've only just managed to nail down how I plan to do it. Big words and chewable version first, then the geeky details below a cut. The basic concept is as follows: as we get more and more stuff on our computers we need more hard drive space, but there's only so much room inside your computer, and drives do fail every so often (more often if you're unlucky like me). There are already a couple of ways to stop you losing all your data when a drive dies, but most of them involve a set of identical drives, with one of them acting as a "backup" drive. This is a) complicated to set up, and b) a pain to upgrade (i.e. you still haven't got enough space, no matter how big a set of drives you get).

So what if we could make it simple? You'd have one big drive showing up in "My Computer" (or equivalent) to store all of your data, and all of it would be protected against hardware failure. This would be provided by a series of "storage blocks": a set of up to 4 or 5 easily replaceable modules (switch off computer, pull out module, slide in new module, switch on) in the front of your computer. If you need more space, or something goes wrong (you'd see a helpful blinking light on the front of a dead module), you simply replace the module and everything gets done for you. Replacing a small module with a bigger one (not physically bigger; they'd have capacities marked on the side) would get you more space automatically, without you having to think about copying data off the old drive or anything like that. Sounds good, doesn't it? Hence why I'm putting some thought into this.

This is definitely getting implemented on Linux first, because a) it's a heck of a lot easier to modify, b) I can use things like UML (User Mode Linux) to test it without trashing real drives, and c) I want to know more about the kernel, and have just ordered "Linux Device Drivers" from Amazon.

The basic concept revolves around bog-standard RAID 1 and 5, but done differently to usual. Instead of RAIDing the drives as one honking big array, we build several smaller ones. In general, we make the biggest RAID-5 array possible across as many drives as possible (maybe limited to 4 or 5 drives to avoid multiple simultaneous failures), and then build RAID-1 across the remaining space. Any space left over after that is kept for future expansion. For example, if we have a pair of 80GB drives and a 60GB, we use the first 60GB of all three drives to make a RAID-5, and the last 20GB of the 80s to make a RAID-1. Then, if we later replace the 60 with another 80, we increase the size of the RAID-5 incrementally, replacing the RAID-1 a section at a time. If we'd replaced the 60 with a 100GB drive instead, the last 20GB of that drive would be left empty, as there's no way to use it without either gaining less available storage or breaking the 'one-drive failure is not fatal' promise.
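
To make that carve-up rule concrete, here's a rough sketch in Python of how I imagine the space on a mixed set of drives being divided. It's only a back-of-envelope model (the real thing would live in a kernel driver), and the function name and structure are my own placeholders, not anything that exists yet:

    def plan_layout(drives):
        """Sketch of the carve-up rule: one RAID-5 slice across every drive,
        sized by the smallest drive, then RAID-1 pairs over whatever is left,
        with anything unpairable reserved for future expansion.

        drives: list of capacities in GB, e.g. [80, 80, 60]
        Returns a list of (kind, size_per_drive_gb, drive_indices) tuples.
        """
        layout = []
        remaining = list(drives)

        # RAID-5 slice: spans all drives, limited by the smallest one
        if len(drives) >= 3:
            slice_size = min(remaining)
            layout.append(("raid5", slice_size, list(range(len(drives)))))
            remaining = [c - slice_size for c in remaining]

        # RAID-1 slices: pair off the two drives with the most space left,
        # and repeat until fewer than two drives have anything spare
        while True:
            spare = sorted((c, i) for i, c in enumerate(remaining) if c > 0)
            if len(spare) < 2:
                break
            size = spare[-2][0]              # limited by the smaller of the pair
            a, b = spare[-1][1], spare[-2][1]
            layout.append(("raid1", size, sorted([a, b])))
            remaining[a] -= size
            remaining[b] -= size

        # anything left over can't be used without breaking the
        # "one drive failure is not fatal" promise, so reserve it
        for i, c in enumerate(remaining):
            if c > 0:
                layout.append(("reserved", c, [i]))
        return layout

    # plan_layout([80, 80, 60])  -> 60GB RAID-5 over all three, 20GB RAID-1 over the 80s
    # plan_layout([80, 80, 100]) -> 80GB RAID-5 over all three, last 20GB of the 100 reserved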

I'd been considering various ways of implementing this (device-mapper fun and games in user-space, extensions to the existing RAID drivers, etc.), and the problems came down to three operations (there's a toy sketch of the last two just after the list):

  1. Atomic swap - make a backup copy of data somewhere else and atomically swap the original backing storage for the backup, in an easy way.
  2. Splitting arrays - take an existing RAID 1/5 array and turn it into 2 RAID arrays, so we can manipulate one without altering the other.
  3. Merging arrays - take 2 RAID 1/5 arrays with similar configurations (same RAID level, mapped over the same set of drives, using contiguous storage) and merge them into 1.
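
To pin down what I mean by operations 2 and 3, here's a toy Python model of splitting and merging at the level of extents on the member drives. None of the names correspond to real md structures; it's just the bookkeeping, with none of the actual data shuffling:

    from dataclasses import dataclass, replace

    @dataclass(frozen=True)
    class Array:
        """Toy model of one RAID set: its level, the drives it spans, and the
        extent (start offset and length, in MB) it occupies on each of them."""
        level: int      # 1 or 5
        drives: tuple   # e.g. ("sda", "sdb", "sdc")
        start: int      # offset on each member drive, MB
        length: int     # length on each member drive, MB

    def split(a, at):
        """Operation 2: cut one array into two at offset 'at' (MB into the
        array) so the halves can be manipulated independently."""
        assert 0 < at < a.length
        return (replace(a, length=at),
                replace(a, start=a.start + at, length=a.length - at))

    def merge(a, b):
        """Operation 3: glue two arrays back into one, provided they share a
        RAID level, span the same drives, and are contiguous on each of them."""
        assert a.level == b.level and a.drives == b.drives
        assert a.start + a.length == b.start
        return replace(a, length=a.length + b.length)

    # low, high = split(Array(5, ("sda", "sdb", "sdc"), 0, 61440), 40960)
    # merge(low, high) == Array(5, ("sda", "sdb", "sdc"), 0, 61440)  -> True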

2 & 3 needed support from the existing drivers, but because of the whole problem that you need some config data for the array at the beginning of it, I think that's going to be hard to hack into the existing system. However, I'm considering a way with defining a RAID array as a series of blocks (of 10mb or so size, maybe adaptable later on) and having config info at the beginning of every block. The overhead is a bit more than standard RAID, but I'm not too concerned about it. This makes splitting and merging (certainly at a block-by-block level) damn easier. I'd been considering using device-mapper on top of this possible driver to do atomic swap, but I think it's going to be easier to implement within the drivers themselves, mainly 'cause device-mapper can't do this yet. Does make things like making accurate backups of a block easier.

I'm going to have a go at this at some point, despite not having enough spare time. I suspect I'll find some somewhere.
