ZFS checksum error

Post by cah

When I was checking zfs pool status today, I saw the following:

Code:

% zpool status
  pool: rpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
  scan: scrub repaired 47.5K in 1h3m with 0 errors on Sun May 20 04:03:55 2012
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c5d0s0  ONLINE       0     0     0
            c6d0s0  ONLINE       0     0     3

errors: No known data errors
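I then read the article at http://www.sun.com/msg/ZFS-8000-9P for details.
From the counters alone it is hard to tell whether the disk is failing or whether this was a transient fault. The last scrub reports 47.5K repaired with 0 errors, and there are no known data errors, so I cleared the error counters first.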

Code:

% zpool clear rpool c6d0s0
That reset the checksum error counter for c6d0s0.

Code:

% zpool status
  pool: rpool
 state: ONLINE
  scan: scrub repaired 47.5K in 1h3m with 0 errors on Sun May 20 04:03:55 2012
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c5d0s0  ONLINE       0     0     0
            c6d0s0  ONLINE       0     0     0

errors: No known data errors
The -x option, which shows only pools that have problems, confirms it.

Code:

% zpool status -x
all pools are healthy
If the checksum errors come back, I will replace the disk (c6d0s0).
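
Since zpool clear resets the counters, it helps to note which devices had non-zero counts before clearing. Here is a rough sketch of one way to do that, not something from my production setup: a small shell function (the name nonzero_counters is made up for illustration) that reads zpool status output on standard input and prints every device whose READ, WRITE, or CKSUM column is not 0. It assumes the five-column config table shown above. On older Solaris releases the default awk is limited, so nawk may be the safer choice there.

```shell
# Sketch: list devices whose READ, WRITE, or CKSUM counter is non-zero.
# Reads `zpool status` text on stdin. Assumes the config table layout
# shown above: NAME STATE READ WRITE CKSUM.
nonzero_counters() {
  awk '
    # The header row marks the start of the device table.
    $1 == "NAME" && $5 == "CKSUM" { in_config = 1; next }
    # The first blank line after the header ends the table.
    in_config && NF == 0 { in_config = 0 }
    # Inside the table, report rows where any counter is not "0".
    in_config && NF >= 5 && ($3 != "0" || $4 != "0" || $5 != "0") {
      printf "%s read=%s write=%s cksum=%s\n", $1, $3, $4, $5
    }
  '
}

# Usage (needs a real pool): /sbin/zpool status rpool | nonzero_counters
```

Fed the first status output above, this should print a single line for c6d0s0 and nothing for the other devices.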

I am setting up a ZFS status-checking script to run regularly from cron. It emails me whenever zpool status -x reports a problem, so that I can tell how often this happens.

Script:

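Code:

#!/bin/ksh
# Mail the full pool status whenever any pool is not healthy.

ZPOOL=/sbin/zpool
MAILX=/usr/bin/mailx

# "zpool status -x" prints exactly this line when nothing is wrong.
status=`$ZPOOL status -x`
if [ "$status" != "all pools are healthy" ]
then
  # Send the full status as the message body instead of an empty mail.
  $ZPOOL status | $MAILX -s "Check ZFS Pool Status" [recipient]
fi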
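
Mail tells me when something is wrong, but to see how often it happens a small log is easier to count than an inbox. The following is only a sketch under my own assumptions: the log path /var/tmp/zfs_check.log and the function name log_status are hypothetical, and the function expects to be passed the text that zpool status -x returned (the $status variable in the script above).

```shell
# Sketch: append one timestamped line per run so recurrences can be counted.
# LOG is a hypothetical location; override it in the environment if needed.
LOG=${LOG:-/var/tmp/zfs_check.log}

log_status() {
  # $1 = output of `zpool status -x`
  if [ "$1" = "all pools are healthy" ]
  then
    state=healthy
  else
    state=unhealthy
  fi
  # Example line: 2012-05-20 04:00:00 unhealthy
  printf '%s %s\n' "`date '+%Y-%m-%d %H:%M:%S'`" "$state" >> "$LOG"
}

# Usage inside the cron script:  log_status "$status"
# Count unhealthy runs later:    grep -c ' unhealthy$' "$LOG"
```

Keeping the log to one short line per run means the count of unhealthy days is a single grep away, while the mail still carries the details.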
Crontab:

Code:

# ZFS status checking
0 4 * * * /export/home/cah/bin/script/zfs_check.sh