If you follow the evolution of hard drive technology at all, you’re probably aware of something called “AF” (Advanced Format, not to be confused with “af” as the kids say in text messages or tweets). Basically, the 512-byte sectors we’ve had since the 60s are deprecated in favor of sectors that are 8x as long (4096 bytes, or 4k).

There’s overhead associated with writing a sector on the disk - a header at the beginning and an ECC at the end. The ECC is useful for correcting errors where a few bits got nailed by an imperfection on the disk surface. As the physical size of the blocks got smaller over time (more density), the size of potential imperfections didn’t get smaller at the same rate, leading to a requirement for longer ECCs to match the size of potential errors.

Meanwhile, the size of individual drives in logical terms (the number of sectors that have to be addressed) was getting out of hand, while tiny files (less than 4k) were becoming less prevalent over time.

Remember disk controllers that couldn’t talk to drives bigger than 2TB? That’s because 2TB is 2^32 512-byte sectors, which is as far as a 32-bit sector address can reach. Who would have ever imagined we’d have drives that big, let alone the 14TB behemoths that are the biggest you can get at this writing?
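To make the 2TB arithmetic concrete, here's a quick sketch in Python. It assumes a 32-bit sector address (which is where 2^32 comes from) and uses binary units, so "2TB" here really means 2 TiB. The function name is mine, purely for illustration.

```python
# Largest capacity reachable with a fixed-width sector address.
# Assumption: the controller uses a 32-bit sector address.

TIB = 2 ** 40  # one tebibyte in bytes


def max_capacity_bytes(sector_size, address_bits=32):
    """Bytes addressable with `address_bits`-wide sector numbers."""
    return (2 ** address_bits) * sector_size


for size in (512, 4096):
    cap = max_capacity_bytes(size)
    print(f"{size}-byte sectors: {cap // TIB} TiB")
```

Same number of addressable sectors, 8x the bytes per sector: the ceiling moves from 2 TiB to 16 TiB without widening the address.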

Until recently, my understanding of how to deal with AF disks has been limited to “make sure that ashift=12 is set on your zpools so the right thing happens alignment-wise” and “don’t mix 512n and 512e/4kn disks in the same vdev (or maybe even zpool?) or you’ll be sorry”.
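For what it's worth, ashift is just the base-2 logarithm of the sector size, so ashift=12 means 2^12 = 4096 bytes. Here's a rough Python sketch of that relationship, plus a check of whether a partition's starting sector lands on a 4k boundary. The helper names are made up for this post, and the check assumes a drive that reports 512-byte logical sectors on top of 4096-byte physical ones (i.e. 512e).

```python
def ashift_for(sector_size):
    """Return log2 of a power-of-two sector size (4096 gives 12)."""
    if sector_size <= 0 or sector_size & (sector_size - 1):
        raise ValueError("sector size must be a power of two")
    return sector_size.bit_length() - 1


def is_aligned(start_lba, logical_sector=512, physical_sector=4096):
    """True if the byte offset of start_lba falls on a physical sector boundary."""
    return (start_lba * logical_sector) % physical_sector == 0


print(ashift_for(512), ashift_for(4096))
# Sector 63 is the classic old-style partition start; 2048 is the
# 1 MiB offset most modern partitioning tools default to.
print(is_aligned(63), is_aligned(2048))
```

A partition that starts on sector 63 begins partway through a 4k physical sector, so every 4k block in it is shifted relative to the disk's physical sectors. A start at sector 2048 lines up cleanly.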

But what’s actually going on under the hood? Dell wrote a white paper on this (local mirror). It provides a good overview of what’s happening behind the scenes and how things might go off the rails if you’re unlucky or insufficiently cautious. If you ever run into unexpectedly crummy performance on a RAID, you’ll be glad you read it.