It's not easy being green


  • Thu 21 January 2016
  • misc

About two and a half years ago, when Gaige and I were evaluating SmartOS in the datacenter, we got some 3TB WD Green drives. Gaige had used them in other service with good results. I'd heard that using them in a RAID, and in particular under ZFS, was not such a hot idea. The decision we came to was to run them on the lab machine as an experiment. For two and a half years they caused us no problems...

Earlier this week I went to rebuild/upgrade an AS112 node VM on Tarros, the aforementioned lab machine. I noticed it was misbehaving - slow and taking entirely too long to do "normal" stuff. Emailed Gaige, who had also noticed similar behavior in the past 24 hours but had decided to investigate later due to travel constraints. We were both puzzled by the lack of error messages in any logs, a zpool status that appeared (mostly, see below) OK, etc. Decided to do a zpool scrub to see if that fixed it. The scrub took almost 3x as long as its previous run (over a similar amount of data in the zpool). Clearly there was a problem, but what?
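
(For reference, the scrub itself is a one-liner, and the elapsed time shows up in zpool status once it finishes - a minimal sketch against our "zones" pool:)

zpool scrub zones      # kick off a scrub of the pool
zpool status zones     # the "scan:" line shows progress, then the elapsed time when done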

iostat -E showed one particular physical drive throwing hard errors. Some number of illegal requests is business as usual; the 56 hard errors on sd3 (= c1t2d0), not so much. "zpool offline zones c1t2d0" cleared the sluggish behavior. Afterwards, "vmadm create" and ansible scripts against SmartMachines ran at the expected speed. A raidz2 with one drive offline is essentially a raidz1 (== raid5), so we are now degraded to a "probably ought to fix that sometime soon" state.
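
(If you're wondering how to get from the sdN names that iostat -E prints to the cXtYdZ names that zpool wants, adding -n has iostat use the descriptive device names. A quick sketch - the serial numbers in the output are also handy for figuring out which physical drive to pull:)

iostat -En             # same error counters, reported per cXtYdZ device instead of sdN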

The problem here is that "green" drives are designed for desktop use in a non-RAID arrangement. If the drive detects a bad read, the firmware has it read again and again and again until it gets a successful read or throws an unrecoverable error. You can imagine that this is undesirable behavior when one has a RAID that could easily recover the data from an alternate copy, mark the sector "bad", and avoid using it in the future.

Drives that are intended for use in a NAS (which implies RAID) have different firmware that exhibits "fail fast" behavior. Didn't get the data on the first try? It might retry once, but that's it - return an error and let the data integrity layer of the RAID hardware or software sort it out.
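
On drives that support it, that retry timeout is the SCT Error Recovery Control setting (WD's marketing name is TLER), and smartmontools can query and set it. A hedged sketch - many Green drives simply refuse the command, and the device path here is illustrative:

smartctl -l scterc /dev/rdsk/c1t2d0          # query current read/write recovery timeouts
smartctl -l scterc,70,70 /dev/rdsk/c1t2d0    # set both to 7.0s (units of 100 ms), typical NAS-drive behavior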

We are replacing this drive with a 4TB WD "Red". We've had good luck with 3TB "Red" drives elsewhere in our installation. This is probably the first step in upgrading the entire raidz2 pool to 4TB drives (which can be done on the fly under ZFS by iteratively swapping out drives).

More on replacing disks in a zpool at http://docs.oracle.com/cd/E19253-01/819-5461/gbcet/
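
The replacement itself should look roughly like this (a sketch of what we expect to run; the device name assumes the new drive goes into the same slot as the old one):

zpool replace zones c1t2d0     # resilver the new 4TB Red into the offlined slot
zpool status zones             # watch the resilver progress
zpool set autoexpand=on zones  # once *all* drives are 4TB, let the pool grow into the new space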

Here's some terminal I/O showing me making it "better" by knocking the offending drive offline.

[root@f4-ce-46-bc-29-92 ~]# zpool status
  pool: zones
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
    still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
    the pool may no longer be accessible by software that does not support
    the features. See zpool-features(5) for details.
  scan: scrub repaired 652K in 2h43m with 0 errors on Wed Jan 20 14:05:32 2016
config:

[SHORT PAUSE HERE WHILE PRINTING OUT COMMAND RESULTS]
    NAME        STATE     READ WRITE CKSUM
    zones       ONLINE       0     0     0
      raidz2-0  ONLINE       0     0     0
        c1t0d0  ONLINE       0     0     0
        c1t1d0  ONLINE       0     0     0
[LONG PAUSE HERE WHILE PRINTING OUT COMMAND RESULTS]
        c1t2d0  ONLINE       0     0     0
        c1t3d0  ONLINE       0     0     0


errors: No known data errors
[root@f4-ce-46-bc-29-92 ~]#
[root@f4-ce-46-bc-29-92 ~]# iostat -E
sd0       Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SanDisk Product: Cruzer Fit Revision: 1.00 Serial No:  Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 50 Predictive Failure Analysis: 0
sd1       Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product: WDC WD30EZRX-00D Revision: 0A80 Serial No: WD-WMC1T1865558
Size: 3000.59GB <3000592982016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 169 Predictive Failure Analysis: 0
sd2       Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product: WDC WD30EZRX-00D Revision: 0A80 Serial No: WD-WMC1T1974059
Size: 3000.59GB <3000592982016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 128 Predictive Failure Analysis: 0
sd3       Soft Errors: 0 Hard Errors: 56 Transport Errors: 0
Vendor: ATA      Product: WDC WD30EZRX-00D Revision: 0A80 Serial No: WD-WCC1T0148335
Size: 3000.59GB <3000592982016 bytes>
Media Error: 50 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 135 Predictive Failure Analysis: 0
sd4       Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product: WDC WD30EZRX-00D Revision: 0A80 Serial No: WD-WMC1T1793225
Size: 3000.59GB <3000592982016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 128 Predictive Failure Analysis: 0
[root@f4-ce-46-bc-29-92 ~]#
[root@f4-ce-46-bc-29-92 ~]# zpool offline zones c1t2d0
[root@f4-ce-46-bc-29-92 ~]# zpool status
  pool: zones
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
    Sufficient replicas exist for the pool to continue functioning in a
    degraded state.
action: Online the device using 'zpool online' or replace the device with
    'zpool replace'.
  scan: scrub repaired 652K in 2h43m with 0 errors on Wed Jan 20 14:05:32 2016
config:

    NAME        STATE     READ WRITE CKSUM
    zones       DEGRADED     0     0     0
      raidz2-0  DEGRADED     0     0     0
        c1t0d0  ONLINE       0     0     0
        c1t1d0  ONLINE       0     0     0
        c1t2d0  OFFLINE      0     0     0
        c1t3d0  ONLINE       0     0     0


errors: No known data errors
[root@f4-ce-46-bc-29-92 ~]#