
Isilon: How to identify if a boot flash drive has failed on 108NL, NL400, S200, X200, or X400 nodes

  • Writer: Balasubramani Ramamurthy
  • Dec 2, 2017

Instructions

Introduction

This article provides a procedure to determine if a boot flash drive has failed on 108NL, NL400, S200, X200, or X400 nodes.

Procedure

Identify which boot flash drives have failed and then replace them.

  1. Open an SSH connection to the node and log on using the "root" account.

  2. Run the following command to view boot flash drive information:

     atacontrol list

     The following output appears:

ATA channel 0:
    Master:      no device present
    Slave:       no device present
ATA channel 1:
    Master:      no device present
    Slave:       no device present
ATA channel 2:
    Master:  ad4 <SanDisk SSD P4 8GB/SSD 8.10> Serial ATA v1.0 II
    Slave:       no device present
ATA channel 3:
    Master:      no device present
    Slave:   ad7 <SanDisk SSD P4 8GB/SSD 8.10> Serial ATA v1.0 II
ATA channel 4:
    Master:      no device present
    Slave:       no device present
ATA channel 5:
    Master:      no device present
    Slave:       no device present

The boot flash drives are listed under ATA channel 2 (Master) and ATA channel 3 (Slave). In the previous example, both boot drives are healthy. If a boot flash drive has failed, the display reads "no device present" for that drive.

NOTE

  • In 108NL nodes, the ATA channel 2 (Master) entry is prefixed by ad2 and the ATA channel 3 (Slave) entry is prefixed by ad3.

  • In newer nodes, such as the X410, the slot assignment has changed. Always check the guide for the correct assignment.
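The check described above can be sketched as a small shell function. This is a minimal sketch, not an Isilon-supplied tool: the function name missing_boot_devices is hypothetical, and it assumes atacontrol list prints one "ATA channel N:" line followed by one Master: and one Slave: line per channel. The default positions (channel 2 Master, channel 3 Slave) come from this article; on newer nodes the positions may differ and would need to be passed in explicitly.

```shell
# Hypothetical helper: reads `atacontrol list` output on stdin and prints each
# expected boot-drive position that reports "no device present".
# $1 (optional) = space-separated "channel:Role" positions to check.
# Example on a node:  atacontrol list | missing_boot_devices
missing_boot_devices() {
    awk -v expected="${1:-2:Master 3:Slave}" '
        BEGIN {
            # Build a lookup of the positions where a boot drive should be.
            n = split(expected, item, " ")
            for (i = 1; i <= n; i++) watch[item[i]] = 1
        }
        # Remember the channel number from lines such as "ATA channel 2:".
        /^ATA channel/ { ch = $3; sub(/:$/, "", ch); next }
        # Check the Master/Slave line against the watched positions.
        $1 == "Master:" || $1 == "Slave:" {
            role = $1; sub(/:$/, "", role)
            key = ch ":" role
            if ((key in watch) && /no device present/) print key
        }
    '
}
```

No output means both expected positions report a device. A line such as 2:Master means that position shows no device present.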

  3. Make note of whether the failed boot drive is the ATA channel 2 (Master) or ATA channel 3 (Slave) device, and then use the following table to determine the location of the boot drive inside the node.

Boot order    OneFS drive ID            Board drive slot inside node
Master        ad4 (or ad2 for 108NL)    J3
Slave         ad7 (or ad3 for 108NL)    J4

For new nodes (S210, X210, X410, NL410, HD400)

Boot order    OneFS drive ID    Board drive slot inside node
Master        ad3               J3
Slave         ad4               J4
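The two tables can also be expressed as a lookup, which makes it clear that the same drive ID maps to different slots depending on the node generation (ad4 is J3 on older nodes but J4 on newer ones). The following is only an illustrative sketch: the function name boot_drive_slot and the "classic"/"new" labels are made up for this example, and the mappings are copied from the tables above.

```shell
# Hypothetical helper: prints the board slot (J3 or J4) for a boot drive ID.
# $1 = OneFS drive ID (for example ad7)
# $2 = "new" for S210, X210, X410, NL410, HD400; omit for the older nodes.
boot_drive_slot() {
    case "${2:-classic}:$1" in
        classic:ad4|classic:ad2) echo J3 ;;   # Master (ad2 on 108NL)
        classic:ad7|classic:ad3) echo J4 ;;   # Slave (ad3 on 108NL)
        new:ad3) echo J3 ;;                   # Master on newer nodes
        new:ad4) echo J4 ;;                   # Slave on newer nodes
        *) echo "unknown drive ID '$1' for '${2:-classic}' nodes" >&2
           return 1 ;;
    esac
}
```

For example, boot_drive_slot ad7 prints J4, while boot_drive_slot ad4 new also prints J4.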

  4. Make note of the board drive slot that contains the failed boot drive.

CAUTION! If both drives appear to have failed, do not continue. Contact Isilon Technical Support immediately.

  5. If both drives appear to be healthy, one of the drives may have partially failed. To identify a partially failed drive, check the status of the individual partition mirrors by running the following command:

     gmirror status

     From left to right, the output displays the name of each mirror, the status of the mirror relationship, and the component IDs for each boot drive. The following example shows the boot drive partition layout in a healthy node. The mirrors for each partition show:

  • A value of COMPLETE in the Status column.

  • The component IDs for both boot drives in the Components column. The component IDs are a combination of the OneFS Drive ID, and the partition number (the number following the letter p). Both boot drives are listed for each mirror with the exception of the var-crash mirror, which only lists the slave drive.

NOTE

  • The partition numbers in the display may differ from the following example.

  • The /var/crash partition may show COMPLETE with either 1 or 2 components, depending on the boot drive type used in the node. This is normal.

Name                     Status    Components
mirror/root0             COMPLETE  ad7p4
                                   ad4p4
mirror/var-crash         COMPLETE  ad7p10
mirror/mfg               COMPLETE  ad7p9
                                   ad4p10
mirror/journal-backup    COMPLETE  ad7p8
                                   ad4p8
mirror/var1              COMPLETE  ad7p7
                                   ad4p7
mirror/var0              COMPLETE  ad7p6
                                   ad4p6
mirror/root1             COMPLETE  ad7p5
                                   ad4p5

The following example shows the boot drive partition layout as it appears in the event of a failed boot drive. A failed boot drive forces the mirrors for a partition to show:

  • A value of DEGRADED in the Status column.

  • Only the component ID of the healthy boot drive in the Components column. The failed boot drive does not appear.

IMPORTANT! DEGRADED does not refer to a specific drive, but to the mirror relationship between the drives. If a drive appears in the Components column next to the DEGRADED status, it is healthy and should not be removed.

Name                     Status    Components
mirror/root0             DEGRADED  ad4p4
mirror/var-crash         COMPLETE  ad7p10
mirror/mfg               COMPLETE  ad7p9
                                   ad4p10
mirror/journal-backup    COMPLETE  ad7p8
                                   ad4p8
mirror/var1              COMPLETE  ad7p7
                                   ad4p7
mirror/var0              DEGRADED  ad4p6
mirror/root1             COMPLETE  ad7p5
                                   ad4p5

In the previous example, ad7p4 is missing from the degraded partition mirror/root0, and ad7p6 is missing from the degraded partition mirror/var0. The missing drive, ad7, is the partially failed drive.
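To pick out the degraded mirrors without reading the whole listing, a filter along the following lines may help. It is a sketch under assumptions: the function name degraded_mirrors is hypothetical, and it assumes gmirror status prints each mirror name and status on one line, with any additional components on continuation lines beneath it. The components it prints are the surviving (healthy) ones, so the failed drive is the one that does not appear.

```shell
# Hypothetical helper: reads `gmirror status` output on stdin and prints each
# DEGRADED mirror followed by the components that are still present.
# Example on a node:  gmirror status | degraded_mirrors
degraded_mirrors() {
    awk '
        # Print the mirror collected so far, but only if it is degraded.
        function report() {
            if (name != "" && status == "DEGRADED") print name ":" comps
        }
        # A line starting with "mirror/" begins a new mirror entry.
        $1 ~ /^mirror\// {
            report()
            name = $1; status = $2; comps = ""
            for (i = 3; i <= NF; i++) comps = comps " " $i
            next
        }
        # Any other line after the first mirror is a continuation line
        # that lists further components of the current mirror.
        name != "" { for (i = 1; i <= NF; i++) comps = comps " " $i }
        END { report() }
    '
}
```

Run against the degraded example above, this would list mirror/root0 with ad4p4 and mirror/var0 with ad4p6, which points to ad7 as the drive that dropped out.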

  6. Determine which drive has failed. Use the table from step 3 to determine which board drive slot contains the failed boot drive and make a note of the slot (J3 or J4).

  7. Contact Isilon Technical Support for assistance in obtaining and installing a replacement boot drive.

Verification

To verify that the replaced boot flash drive was mirrored from the other boot flash drive:

  1. Open an SSH connection to the node and log on using the "root" account.

  2. Run the following command to verify the mirroring process:

     isi_bootdisk_status

     Output will look similar to the following:

root0             ad4p4   ad7p4
root1             ad4p5   ad7p5
var0              ad4p6   ad7p6
var1              ad4p7   ad7p7
journal-backup    ad4p8   ad7p8
kernelsdump       ad4p9
mfg               ad4p10  ad7p9
var-crash         ad7p10
kerneldump        ad4p11

NOTE: The /var/crash partition may show COMPLETE with either 1 or 2 components, depending on the boot drive type used in the node. This is normal.
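One way to confirm that the replaced drive now participates in the mirrors is to list the partitions that include a component on that drive. The sketch below is illustrative only: the function name partitions_on_drive is hypothetical, and it assumes isi_bootdisk_status prints one partition per line, with the partition name first and the component IDs after it.

```shell
# Hypothetical helper: reads `isi_bootdisk_status` output on stdin and prints
# the partitions that list a component on the given drive.
# $1 = OneFS drive ID of the replaced drive (for example ad7)
# Example on a node:  isi_bootdisk_status | partitions_on_drive ad7
partitions_on_drive() {
    awk -v drive="$1" '
        {
            # Component IDs are the drive ID, the letter p, and a number.
            for (i = 2; i <= NF; i++)
                if ($i ~ ("^" drive "p[0-9]+$")) { print $1; break }
        }
    '
}
```

If a mirrored partition such as root0 or var0 is absent from the list for the replaced drive, that partition has not been mirrored onto it.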


 
 
 


SanNasAdmin © Copyright 2014, All Rights Reserved
