Sometimes a hard disk hints at an upcoming failure. Some disks start to make unexpected sounds; others stay silent and only cause some noise in your syslog. In most cases the disk will automatically reallocate one or two damaged sectors, and you should start planning to buy a new disk while your data is still safe. Sometimes, however, the disk won't reallocate these sectors automatically and you'll have to do it yourself. Luckily, that doesn't involve any rocket science.
In this case, a disk reported some problems in syslog while rebuilding a RAID5 array:
Jan 29 18:19:54 dragon kernel: [66774.973049] end_request: I/O error, dev sdb, sector 1261069669
Jan 29 18:19:54 dragon kernel: [66774.973054] raid5:md3: read error not correctable (sector 405431640 on sdb6).
Jan 29 18:19:54 dragon kernel: [66774.973059] raid5: Disk failure on sdb6, disabling device.
Jan 29 18:20:11 dragon kernel: [66792.180513] sd 3:0:0:0: [sdb] Unhandled sense code
Jan 29 18:20:11 dragon kernel: [66792.180516] sd 3:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jan 29 18:20:11 dragon kernel: [66792.180521] sd 3:0:0:0: [sdb] Sense Key : Medium Error [current] [descriptor]
Jan 29 18:20:11 dragon kernel: [66792.180547] sd 3:0:0:0: [sdb] Add. Sense: Unrecovered read error - auto reallocate failed
Jan 29 18:20:11 dragon kernel: [66792.180553] sd 3:0:0:0: [sdb] CDB: Read(10): 28 00 4b 2a 6c 4c 00 00 c0 00
Jan 29 18:20:11 dragon kernel: [66792.180564] end_request: I/O error, dev sdb, sector 1261071601
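As an aside, the CDB in that last group of messages tells you exactly which LBA range the failing read covered: for a Read(10) command, bytes 2-5 are the big-endian start LBA and bytes 7-8 are the transfer length in sectors. A quick sanity check with shell arithmetic (my own decoding, not from the original log):

```shell
# Read(10) CDB from the log: 28 00 4b 2a 6c 4c 00 00 c0 00
#                            ^op    ^--- LBA ---^   ^len^
lba=$((16#4b2a6c4c))   # start LBA: 1261071436
len=$((16#00c0))       # transfer length: 192 sectors
echo "failing read covered sectors $lba..$((lba + len - 1))"
# The reported bad sector 1261071601 falls inside this range.
```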
Modern hard disk drives are equipped with a small number of spare sectors to replace damaged ones. However, a sector only gets reallocated when a write operation fails; a failing read operation will, in most cases, just throw an I/O error. In the unlikely event a second read does succeed, some disks perform an auto-reallocation and the data is preserved. In this case, the second read failed miserably ("Unrecovered read error - auto reallocate failed").
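The drive itself keeps count of such unreadable-but-not-yet-remapped sectors in SMART attribute 197 (Current_Pending_Sector). A small sketch that pulls the raw value out of smartctl output; the sample line below is illustrative, with hypothetical values, not output from the failing disk:

```shell
# On real hardware you would run:
#   smartctl -A /dev/sdb | awk '$2 == "Current_Pending_Sector" { print $NF }'
# Illustrative smartctl attribute line (hypothetical values):
sample='197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       2'
awk '$2 == "Current_Pending_Sector" { print $NF }' <<< "$sample"   # → 2
```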
The read errors were triggered by the initial sync of a new RAID5 array, which was running in degraded mode (on /dev/sdb and /dev/sdc, with /dev/sdd missing). Naturally, mdadm kicked sdb out of the already degraded array, leaving nothing but sdc.
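Note that the two sector numbers in the log are consistent with each other: the md layer reports the sector relative to the start of sdb6, while the lower-level I/O error reports the absolute sector on sdb. Assuming sdb6 starts at sector 855638029 (simply the difference between the two logged numbers; on a live system you would read it from /sys/block/sdb/sdb6/start), the arithmetic checks out:

```shell
part_start=855638029    # assumed start of sdb6, derived from the two log lines
part_sector=405431640   # "read error not correctable (sector 405431640 on sdb6)"
echo $((part_start + part_sector))   # → 1261069669, the absolute sector on sdb
```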
The only solution to this problem was to force sdb to reallocate the damaged sectors, so that mdadm wouldn't encounter the read errors and the initial sync of the array would succeed. A tool like hdparm can force a disk to reallocate a sector by simply issuing a write command to the damaged sector. First, check the number of reallocated sectors on the disk:
$ smartctl -a /dev/sdb | grep -i reallocated
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
The zeroes at the end of the lines indicate that there are no reallocated sectors on /dev/sdb. Let’s check whether sector 1261069669 is really damaged:
$ hdparm --read-sector 1261069669 /dev/sdb
/dev/sdb:
Input/Output error
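If you suspect more sectors in the same area are damaged, a small loop around hdparm --read-sector can map them out before you start writing. A rough sketch; scan_sectors and the READER override are my own names, not part of hdparm (the override only exists so the loop can be exercised without real hardware):

```shell
# scan_sectors DISK START COUNT — print each sector in the range that
# fails to read. By default this calls hdparm --read-sector (run as
# root on real hardware); set READER to substitute another command.
scan_sectors() {
    local disk=$1 start=$2 count=$3 s
    for ((s = start; s < start + count; s++)); do
        if ! ${READER:-hdparm --read-sector} "$s" "$disk" >/dev/null 2>&1; then
            echo "bad sector: $s"
        fi
    done
}
# Example on real hardware: scan_sectors /dev/sdb 1261069600 2048
```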
Now, issue the write command to the damaged sector(s). Note that hdparm completely bypasses the regular block-layer read/write mechanisms, and that the data in these sectors will be lost forever!
$ hdparm --write-sector 1261069669 /dev/sdb
/dev/sdb:
Use of --write-sector is VERY DANGEROUS.
You are trying to deliberately overwrite a low-level sector on the media
This is a BAD idea, and can easily result in total data loss.
Please supply the --yes-i-know-what-i-am-doing flag if you really want this.
Program aborted.

$ hdparm --write-sector 1261069669 --yes-i-know-what-i-am-doing /dev/sdb
/dev/sdb:
re-writing sector 1261069669: succeeded

$ hdparm --write-sector 1261071601 --yes-i-know-what-i-am-doing /dev/sdb
/dev/sdb:
re-writing sector 1261071601: succeeded
Now, use hdparm again to verify that the reallocated sectors can be read:
$ hdparm --read-sector 1261069669 /dev/sdb
/dev/sdb:
reading sector 1261069669: succeeded
(a lot of zeroes should follow)
And using SMART we can check whether the disk has registered two reallocated sectors:
$ smartctl -a /dev/sdb | grep -i reallocated
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       2
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       2
It’s actually quite simple to force mdadm to continue using sdb as if nothing ever happened:
$ mdadm --assemble --force /dev/md3 /dev/sdb6 /dev/sdc6
(mdadm will complain about being forced to increase the event counter of sdb6)
$ mdadm /dev/md3 --add /dev/sdd6
And a few minutes later, the array is as good as new!
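You can keep an eye on the resync with cat /proc/mdstat (or watch cat /proc/mdstat). A tiny helper that pulls the percentage out of a recovery line; the sample line below is illustrative, with hypothetical numbers, not from this array:

```shell
# Extract the progress percentage from a /proc/mdstat recovery line.
mdstat_progress() {
    grep -o '[0-9.][0-9.]*%' <<< "$1"
}
# Illustrative /proc/mdstat line (hypothetical numbers); on a live
# system you'd feed it the output of: cat /proc/mdstat
sample='      [=>...................]  recovery =  7.9% (12345678/155061952) finish=123.4min speed=10000K/sec'
mdstat_progress "$sample"   # → 7.9%
```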
Based on http://www.sj-vs.net/forcing-a-hard-disk-to-reallocate-bad-sectors