   Linux Driver for Mylex DAC960/AcceleRAID/eXtremeRAID PCI RAID Controllers

                        Version 2.2.11 for Linux 2.2.19
                        Version 2.4.11 for Linux 2.4.12

                              PRODUCTION RELEASE

                                11 October 2001

                               Leonard N. Zubkoff
                               Dandelion Digital
                               lnz@dandelion.com

         Copyright 1998-2001 by Leonard N. Zubkoff <lnz@dandelion.com>


                                 INTRODUCTION

Mylex, Inc. designs and manufactures a variety of high performance PCI RAID
controllers. Mylex Corporation is located at 34551 Ardenwood Blvd., Fremont,
California 94555, USA and can be reached at 510.796.6100 or on the World Wide
Web at http://www.mylex.com. Mylex Technical Support can be reached by
electronic mail at mylexsup@us.ibm.com, by voice at 510.608.2400, or by FAX at
510.745.7715. Contact information for offices in Europe and Japan is available
on their Web site.

The latest information on Linux support for DAC960 PCI RAID Controllers, as
well as the most recent release of this driver, will always be available from
my Linux Home Page at URL "http://www.dandelion.com/Linux/". The Linux DAC960
driver supports all current Mylex PCI RAID controllers including the new
eXtremeRAID 2000/3000 and AcceleRAID 352/170/160 models which have an entirely
new firmware interface from the older eXtremeRAID 1100, AcceleRAID 150/200/250,
and DAC960PJ/PG/PU/PD/PL. See below for a complete controller list as well as
minimum firmware version requirements. For simplicity, in most places this
documentation refers to DAC960 generically rather than explicitly listing all
the supported models.

Driver bug reports should be sent via electronic mail to "lnz@dandelion.com".
Please include with the bug report the complete configuration messages reported
by the driver at startup, along with any subsequent system messages relevant to
the controller's operation, and a detailed description of your system's
hardware configuration.
Driver bugs are actually quite rare; if you encounter
problems with disks being marked offline, for example, please contact Mylex
Technical Support as the problem is related to the hardware configuration
rather than the Linux driver.

Please consult the RAID controller documentation for detailed information
regarding installation and configuration of the controllers. This document
primarily provides information specific to the Linux support.


                                DRIVER FEATURES

The DAC960 RAID controllers are supported solely as high performance RAID
controllers, not as interfaces to arbitrary SCSI devices. The Linux DAC960
driver operates at the block device level, the same level as the SCSI and IDE
drivers. Unlike other RAID controllers currently supported on Linux, the
DAC960 driver is not dependent on the SCSI subsystem, and hence avoids all the
complexity and unnecessary code that would be associated with an implementation
as a SCSI driver. The DAC960 driver is designed for as high a performance as
possible with no compromises or extra code for compatibility with lower
performance devices. The DAC960 driver includes extensive error logging and
online configuration management capabilities. Except for initial configuration
of the controller and adding new disk drives, most everything can be handled
from Linux while the system is operational.

The DAC960 driver is architected to support up to 8 controllers per system.
Each DAC960 parallel SCSI controller can support up to 15 disk drives per
channel, for a maximum of 60 drives on a four channel controller; the fibre
channel eXtremeRAID 3000 controller supports up to 125 disk drives per loop for
a total of 250 drives. The drives installed on a controller are divided into
one or more "Drive Groups", and then each Drive Group is subdivided further
into 1 to 32 "Logical Drives". Each Logical Drive has a specific RAID Level
and caching policy associated with it, and it appears to Linux as a single
block device.
Logical Drives are further subdivided into up to 7 partitions
through the normal Linux and PC disk partitioning schemes. Logical Drives are
also known as "System Drives", and Drive Groups are also called "Packs". Both
terms are in use in the Mylex documentation; I have chosen to standardize on
the more generic "Logical Drive" and "Drive Group".

DAC960 RAID disk devices are named in the style of the obsolete Device File
System (DEVFS). The device corresponding to Logical Drive D on Controller C
is referred to as /dev/rd/cCdD, and the partitions are called /dev/rd/cCdDp1
through /dev/rd/cCdDp7. For example, partition 3 of Logical Drive 5 on
Controller 2 is referred to as /dev/rd/c2d5p3. Note that unlike with SCSI
disks the device names will not change in the event of a disk drive failure.
The DAC960 driver is assigned major numbers 48 - 55 with one major number per
controller. The 8 bits of minor number are divided into 5 bits for the Logical
Drive and 3 bits for the partition.


          SUPPORTED DAC960/AcceleRAID/eXtremeRAID PCI RAID CONTROLLERS

The following list comprises the supported DAC960, AcceleRAID, and eXtremeRAID
PCI RAID Controllers as of the date of this document.
It is recommended that
anyone purchasing a Mylex PCI RAID Controller not in the following table
contact the author beforehand to verify that it is or will be supported.

eXtremeRAID 3000
            1 Wide Ultra-2/LVD SCSI channel
            2 External Fibre FC-AL channels
            233MHz StrongARM SA 110 Processor
            64 Bit 33MHz PCI (backward compatible with 32 Bit PCI slots)
            32MB/64MB ECC SDRAM Memory

eXtremeRAID 2000
            4 Wide Ultra-160 LVD SCSI channels
            233MHz StrongARM SA 110 Processor
            64 Bit 33MHz PCI (backward compatible with 32 Bit PCI slots)
            32MB/64MB ECC SDRAM Memory

AcceleRAID 352
            2 Wide Ultra-160 LVD SCSI channels
            100MHz Intel i960RN RISC Processor
            64 Bit 33MHz PCI (backward compatible with 32 Bit PCI slots)
            32MB/64MB ECC SDRAM Memory

AcceleRAID 170
            1 Wide Ultra-160 LVD SCSI channel
            100MHz Intel i960RM RISC Processor
            16MB/32MB/64MB ECC SDRAM Memory

AcceleRAID 160 (AcceleRAID 170LP)
            1 Wide Ultra-160 LVD SCSI channel
            100MHz Intel i960RS RISC Processor
            Built in 16M ECC SDRAM Memory
            PCI Low Profile Form Factor - fit for 2U height

eXtremeRAID 1100 (DAC1164P)
            3 Wide Ultra-2/LVD SCSI channels
            233MHz StrongARM SA 110 Processor
            64 Bit 33MHz PCI (backward compatible with 32 Bit PCI slots)
            16MB/32MB/64MB Parity SDRAM Memory with Battery Backup

AcceleRAID 250 (DAC960PTL1)
            Uses onboard Symbios SCSI chips on certain motherboards
            Also includes one onboard Wide Ultra-2/LVD SCSI Channel
            66MHz Intel i960RD RISC Processor
            4MB/8MB/16MB/32MB/64MB/128MB ECC EDO Memory

AcceleRAID 200 (DAC960PTL0)
            Uses onboard Symbios SCSI chips on certain motherboards
            Includes no onboard SCSI Channels
            66MHz Intel i960RD RISC Processor
            4MB/8MB/16MB/32MB/64MB/128MB ECC EDO Memory

AcceleRAID 150 (DAC960PRL)
            Uses onboard Symbios SCSI chips on certain motherboards
            Also includes one onboard Wide Ultra-2/LVD SCSI Channel
            33MHz Intel i960RP RISC Processor
            4MB Parity EDO Memory

DAC960PJ    1/2/3 Wide Ultra SCSI-3 Channels
            66MHz Intel i960RD RISC Processor
            4MB/8MB/16MB/32MB/64MB/128MB ECC EDO Memory

DAC960PG    1/2/3 Wide Ultra SCSI-3 Channels
            33MHz Intel i960RP RISC Processor
            4MB/8MB ECC EDO Memory

DAC960PU    1/2/3 Wide Ultra SCSI-3 Channels
            Intel i960CF RISC Processor
            4MB/8MB EDRAM or 2MB/4MB/8MB/16MB/32MB DRAM Memory

DAC960PD    1/2/3 Wide Fast SCSI-2 Channels
            Intel i960CF RISC Processor
            4MB/8MB EDRAM or 2MB/4MB/8MB/16MB/32MB DRAM Memory

DAC960PL    1/2/3 Wide Fast SCSI-2 Channels
            Intel i960 RISC Processor
            2MB/4MB/8MB/16MB/32MB DRAM Memory

DAC960P     1/2/3 Wide Fast SCSI-2 Channels
            Intel i960 RISC Processor
            2MB/4MB/8MB/16MB/32MB DRAM Memory

For the eXtremeRAID 2000/3000 and AcceleRAID 352/170/160, firmware version
6.00-01 or above is required.

For the eXtremeRAID 1100, firmware version 5.06-0-52 or above is required.

For the AcceleRAID 250, 200, and 150, firmware version 4.06-0-57 or above is
required.

For the DAC960PJ and DAC960PG, firmware version 4.06-0-00 or above is required.

For the DAC960PU, DAC960PD, DAC960PL, and DAC960P, either firmware version
3.51-0-04 or above is required (for dual Flash ROM controllers), or firmware
version 2.73-0-00 or above is required (for single Flash ROM controllers).

Please note that not all SCSI disk drives are suitable for use with DAC960
controllers, and only particular firmware versions of any given model may
actually function correctly. Similarly, not all motherboards have a BIOS that
properly initializes the AcceleRAID 250, AcceleRAID 200, AcceleRAID 150,
DAC960PJ, and DAC960PG because the Intel i960RD/RP is a multi-function device.
If in doubt, contact Mylex RAID Technical Support (mylexsup@us.ibm.com) to
verify compatibility.
Mylex makes available a hard disk compatibility list at
http://www.mylex.com/support/hdcomp/hd-lists.html.


                              DRIVER INSTALLATION

This distribution was prepared for Linux kernel version 2.2.19 or 2.4.12.

To install the DAC960 RAID driver, you may use the following commands,
replacing "/usr/src" with wherever you keep your Linux kernel source tree:

  cd /usr/src
  tar -xvzf DAC960-2.2.11.tar.gz (or DAC960-2.4.11.tar.gz)
  mv README.DAC960 linux/Documentation
  mv DAC960.[ch] linux/drivers/block
  patch -p0 < DAC960.patch (if DAC960.patch is included)
  cd linux
  make config
  make bzImage (or zImage)

Then install "arch/i386/boot/bzImage" or "arch/i386/boot/zImage" as your
standard kernel, run lilo if appropriate, and reboot.

To create the necessary devices in /dev, the "make_rd" script included in
"DAC960-Utilities.tar.gz" from http://www.dandelion.com/Linux/ may be used.
LILO 21 and FDISK v2.9 include DAC960 support; also included in this archive
are patches to LILO 20 and FDISK v2.8 that add DAC960 support, along with
statically linked executables of LILO and FDISK. This modified version of LILO
will allow booting from a DAC960 controller and/or mounting the root file
system from a DAC960.

Red Hat Linux 6.0 and SuSE Linux 6.1 include support for Mylex PCI RAID
controllers. Installing directly onto a DAC960 may be problematic from other
Linux distributions until their installation utilities are updated.


                              INSTALLATION NOTES

Before installing Linux or adding DAC960 logical drives to an existing Linux
system, the controller must first be configured to provide one or more logical
drives using the BIOS Configuration Utility or DACCF. Please note that since
there are only at most 6 usable partitions on each logical drive, systems
requiring more partitions should subdivide a drive group into multiple logical
drives, each of which can have up to 6 usable partitions.
Also, note that with
large disk arrays it is advisable to enable the 8GB BIOS Geometry (255/63)
rather than accepting the default 2GB BIOS Geometry (128/32); failing to do so
will cause the logical drive geometry to have more than 65535 cylinders which
will make it impossible for FDISK to be used properly. The 8GB BIOS Geometry
can be enabled by configuring the DAC960 BIOS, which is accessible via Alt-M
during the BIOS initialization sequence.

For maximum performance and the most efficient E2FSCK performance, it is
recommended that EXT2 file systems be built with a 4KB block size and 16 block
stride to match the DAC960 controller's 64KB default stripe size. The command
"mke2fs -b 4096 -R stride=16 <device>" is appropriate. Unless there will be a
large number of small files on the file systems, it is also beneficial to add
the "-i 16384" option to increase the bytes per inode parameter thereby
reducing the file system metadata. Finally, on systems that will only be run
with Linux 2.2 or later kernels it is beneficial to enable sparse superblocks
with the "-s 1" option.


                      DAC960 ANNOUNCEMENTS MAILING LIST

The DAC960 Announcements Mailing List provides a forum for informing Linux
users of new driver releases and other announcements regarding Linux support
for DAC960 PCI RAID Controllers. To join the mailing list, send a message to
"dac960-announce-request@dandelion.com" with the line "subscribe" in the
message body.


                CONTROLLER CONFIGURATION AND STATUS MONITORING

The DAC960 RAID controllers running firmware 4.06 or above include a Background
Initialization facility so that system downtime is minimized both for initial
installation and subsequent configuration of additional storage. The BIOS
Configuration Utility (accessible via Alt-R during the BIOS initialization
sequence) is used to quickly configure the controller, and then the logical
drives that have been created are available for immediate use even while they
are still being initialized by the controller.
The primary need for online
configuration and status monitoring is then to avoid system downtime when disk
drives fail and must be replaced. Mylex's online monitoring and configuration
utilities are being ported to Linux and will become available at some point in
the future. Note that with a SAF-TE (SCSI Accessed Fault-Tolerant Enclosure)
enclosure, the controller is able to rebuild failed drives automatically as
soon as a drive replacement is made available.

The primary interfaces for controller configuration and status monitoring are
special files created in the /proc/rd/... hierarchy along with the normal
system console logging mechanism. Whenever the system is operating, the DAC960
driver queries each controller for status information every 10 seconds, and
checks for additional conditions every 60 seconds. The initial status of each
controller is always available for controller N in /proc/rd/cN/initial_status,
and the current status as of the last status monitoring query is available in
/proc/rd/cN/current_status. In addition, status changes are also logged by the
driver to the system console and will appear in the log files maintained by
syslog. The progress of asynchronous rebuild or consistency check operations
is also available in /proc/rd/cN/current_status, and progress messages are
logged to the system console at most every 60 seconds.

Starting with the 2.2.3/2.0.3 versions of the driver, the status information
available in /proc/rd/cN/initial_status and /proc/rd/cN/current_status has been
augmented to include the vendor, model, revision, and serial number (if
available) for each physical device found connected to the controller:

***** DAC960 RAID Driver Version 2.2.3 of 19 August 1999 *****
Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com>
Configuring Mylex DAC960PRL PCI RAID Controller
  Firmware Version: 4.07-0-07, Channels: 1, Memory Size: 16MB
  PCI Bus: 1, Device: 4, Function: 1, I/O Address: Unassigned
  PCI Address: 0xFE300000 mapped at 0xA0800000, IRQ Channel: 21
  Controller Queue Depth: 128, Maximum Blocks per Command: 128
  Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33
  Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63
  SAF-TE Enclosure Management Enabled
  Physical Devices:
    0:0  Vendor: IBM       Model: DRVS09D           Revision: 0270
         Serial Number: 68016775HA
         Disk Status: Online, 17928192 blocks
    0:1  Vendor: IBM       Model: DRVS09D           Revision: 0270
         Serial Number: 68004E53HA
         Disk Status: Online, 17928192 blocks
    0:2  Vendor: IBM       Model: DRVS09D           Revision: 0270
         Serial Number: 13013935HA
         Disk Status: Online, 17928192 blocks
    0:3  Vendor: IBM       Model: DRVS09D           Revision: 0270
         Serial Number: 13016897HA
         Disk Status: Online, 17928192 blocks
    0:4  Vendor: IBM       Model: DRVS09D           Revision: 0270
         Serial Number: 68019905HA
         Disk Status: Online, 17928192 blocks
    0:5  Vendor: IBM       Model: DRVS09D           Revision: 0270
         Serial Number: 68012753HA
         Disk Status: Online, 17928192 blocks
    0:6  Vendor: ESG-SHV   Model: SCA HSBP M6       Revision: 0.61
  Logical Drives:
    /dev/rd/c0d0: RAID-5, Online, 89640960 blocks, Write Thru
  No Rebuild or Consistency Check in Progress

To simplify the monitoring process for custom software, the special file
/proc/rd/status returns "OK" when all DAC960 controllers in the system are
operating normally and no failures have occurred, or "ALERT" if any logical
drives are offline or critical or any non-standby physical drives are dead.

Configuration commands for controller N are available via the special file
/proc/rd/cN/user_command. A human readable command can be written to this
special file to initiate a configuration operation, and the results of the
operation can then be read back from the special file in addition to being
logged to the system console.
The shell command sequence

  echo "<configuration-command>" > /proc/rd/c0/user_command
  cat /proc/rd/c0/user_command

is typically used to execute configuration commands. The configuration
commands are:

  flush-cache

    The "flush-cache" command flushes the controller's cache. The system
    automatically flushes the cache at shutdown or if the driver module is
    unloaded, so this command is only needed to be certain a write back cache
    is flushed to disk before the system is powered off by a command to a UPS.
    Note that the flush-cache command also stops an asynchronous rebuild or
    consistency check, so it should not be used except when the system is being
    halted.

  kill <channel>:<target-id>

    The "kill" command marks the physical drive <channel>:<target-id> as DEAD.
    This command is provided primarily for testing, and should not be used
    during normal system operation.

  make-online <channel>:<target-id>

    The "make-online" command changes the physical drive <channel>:<target-id>
    from status DEAD to status ONLINE. In cases where multiple physical drives
    have been killed simultaneously, this command may be used to bring all but
    one of them back online, after which a rebuild to the final drive is
    necessary.

    Warning: make-online should only be used on a dead physical drive that is
    an active part of a drive group, never on a standby drive. The command
    should never be used on a dead drive that is part of a critical logical
    drive; rebuild should be used if only a single drive is dead.

  make-standby <channel>:<target-id>

    The "make-standby" command changes physical drive <channel>:<target-id>
    from status DEAD to status STANDBY. It should only be used in cases where
    a dead drive was replaced after an automatic rebuild was performed onto a
    standby drive.
    It cannot be used to add a standby drive to the controller
    configuration if one was not created initially; the BIOS Configuration
    Utility must be used for that currently.

  rebuild <channel>:<target-id>

    The "rebuild" command initiates an asynchronous rebuild onto physical drive
    <channel>:<target-id>. It should only be used when a dead drive has been
    replaced.

  check-consistency <logical-drive-number>

    The "check-consistency" command initiates an asynchronous consistency check
    of <logical-drive-number> with automatic restoration. It can be used
    whenever it is desired to verify the consistency of the redundancy
    information.

  cancel-rebuild
  cancel-consistency-check

    The "cancel-rebuild" and "cancel-consistency-check" commands cancel any
    rebuild or consistency check operations previously initiated.


               EXAMPLE I - DRIVE FAILURE WITHOUT A STANDBY DRIVE

The following annotated logs demonstrate the controller configuration and
online status monitoring capabilities of the Linux DAC960 Driver. The test
configuration comprises 6 1GB Quantum Atlas I disk drives on two channels of a
DAC960PJ controller. The physical drives are configured into a single drive
group without a standby drive, and the drive group has been configured into two
logical drives, one RAID-5 and one RAID-6. Note that these logs are from an
earlier version of the driver and the messages have changed somewhat with newer
releases, but the functionality remains similar. First, here is the current
status of the RAID configuration:

gwynedd:/u/lnz# cat /proc/rd/c0/current_status
***** DAC960 RAID Driver Version 2.0.0 of 23 March 1999 *****
Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com>
Configuring Mylex DAC960PJ PCI RAID Controller
  Firmware Version: 4.06-0-08, Channels: 3, Memory Size: 8MB
  PCI Bus: 0, Device: 19, Function: 1, I/O Address: Unassigned
  PCI Address: 0xFD4FC000 mapped at 0x8807000, IRQ Channel: 9
  Controller Queue Depth: 128, Maximum Blocks per Command: 128
  Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33
  Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63
  Physical Devices:
    0:1 - Disk: Online, 2201600 blocks
    0:2 - Disk: Online, 2201600 blocks
    0:3 - Disk: Online, 2201600 blocks
    1:1 - Disk: Online, 2201600 blocks
    1:2 - Disk: Online, 2201600 blocks
    1:3 - Disk: Online, 2201600 blocks
  Logical Drives:
    /dev/rd/c0d0: RAID-5, Online, 5498880 blocks, Write Thru
    /dev/rd/c0d1: RAID-6, Online, 3305472 blocks, Write Thru
  No Rebuild or Consistency Check in Progress

gwynedd:/u/lnz# cat /proc/rd/status
OK

The above messages indicate that everything is healthy, and /proc/rd/status
returns "OK" indicating that there are no problems with any DAC960 controller
in the system. For demonstration purposes, while I/O is active Physical Drive
1:1 is now disconnected, simulating a drive failure. The failure is noted by
the driver within 10 seconds of the controller's having detected it, and the
driver logs the following console status messages indicating that Logical
Drives 0 and 1 are now CRITICAL as a result of Physical Drive 1:1 being DEAD:

DAC960#0: Physical Drive 1:2 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02
DAC960#0: Physical Drive 1:3 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02
DAC960#0: Physical Drive 1:1 killed because of timeout on SCSI command
DAC960#0: Physical Drive 1:1 is now DEAD
DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now CRITICAL
DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now CRITICAL

The Sense Keys logged here are just Check Condition / Unit Attention conditions
arising from a SCSI bus reset that is forced by the controller during its error
recovery procedures.
Concurrently with the above, the driver status available
from /proc/rd also reflects the drive failure. The status message in
/proc/rd/status has changed from "OK" to "ALERT":

gwynedd:/u/lnz# cat /proc/rd/status
ALERT

and /proc/rd/c0/current_status has been updated:

gwynedd:/u/lnz# cat /proc/rd/c0/current_status
  ...
  Physical Devices:
    0:1 - Disk: Online, 2201600 blocks
    0:2 - Disk: Online, 2201600 blocks
    0:3 - Disk: Online, 2201600 blocks
    1:1 - Disk: Dead, 2201600 blocks
    1:2 - Disk: Online, 2201600 blocks
    1:3 - Disk: Online, 2201600 blocks
  Logical Drives:
    /dev/rd/c0d0: RAID-5, Critical, 5498880 blocks, Write Thru
    /dev/rd/c0d1: RAID-6, Critical, 3305472 blocks, Write Thru
  No Rebuild or Consistency Check in Progress

Since there are no standby drives configured, the system can continue to access
the logical drives in a performance degraded mode until the failed drive is
replaced and a rebuild operation completed to restore the redundancy of the
logical drives. Once Physical Drive 1:1 is replaced with a properly
functioning drive, or if the physical drive was killed without having failed
(e.g., due to electrical problems on the SCSI bus), the user can instruct the
controller to initiate a rebuild operation onto the newly replaced drive:

gwynedd:/u/lnz# echo "rebuild 1:1" > /proc/rd/c0/user_command
gwynedd:/u/lnz# cat /proc/rd/c0/user_command
Rebuild of Physical Drive 1:1 Initiated

The echo command instructs the controller to initiate an asynchronous rebuild
operation onto Physical Drive 1:1, and the status message that results from the
operation is then available for reading from /proc/rd/c0/user_command, as well
as being logged to the console by the driver.

Within 10 seconds of this command the driver logs the initiation of the
asynchronous rebuild operation:

DAC960#0: Rebuild of Physical Drive 1:1 Initiated
DAC960#0: Physical Drive 1:1 Error Log: Sense Key = 6, ASC = 29, ASCQ = 01
DAC960#0: Physical Drive 1:1 is now WRITE-ONLY
DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 1% completed

and
/proc/rd/c0/current_status is updated:

gwynedd:/u/lnz# cat /proc/rd/c0/current_status
  ...
  Physical Devices:
    0:1 - Disk: Online, 2201600 blocks
    0:2 - Disk: Online, 2201600 blocks
    0:3 - Disk: Online, 2201600 blocks
    1:1 - Disk: Write-Only, 2201600 blocks
    1:2 - Disk: Online, 2201600 blocks
    1:3 - Disk: Online, 2201600 blocks
  Logical Drives:
    /dev/rd/c0d0: RAID-5, Critical, 5498880 blocks, Write Thru
    /dev/rd/c0d1: RAID-6, Critical, 3305472 blocks, Write Thru
  Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 6% completed

As the rebuild progresses, the current status in /proc/rd/c0/current_status is
updated every 10 seconds:

gwynedd:/u/lnz# cat /proc/rd/c0/current_status
  ...
  Physical Devices:
    0:1 - Disk: Online, 2201600 blocks
    0:2 - Disk: Online, 2201600 blocks
    0:3 - Disk: Online, 2201600 blocks
    1:1 - Disk: Write-Only, 2201600 blocks
    1:2 - Disk: Online, 2201600 blocks
    1:3 - Disk: Online, 2201600 blocks
  Logical Drives:
    /dev/rd/c0d0: RAID-5, Critical, 5498880 blocks, Write Thru
    /dev/rd/c0d1: RAID-6, Critical, 3305472 blocks, Write Thru
  Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 15% completed

and every minute a progress message is logged to the console by the driver:

DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 32% completed
DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 63% completed
DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 94% completed
DAC960#0: Rebuild in Progress: Logical Drive 1 (/dev/rd/c0d1) 94% completed

Finally, the rebuild completes successfully.
The driver logs the status of the
logical and physical drives and the rebuild completion:

DAC960#0: Rebuild Completed Successfully
DAC960#0: Physical Drive 1:1 is now ONLINE
DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now ONLINE
DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now ONLINE

/proc/rd/c0/current_status is updated:

gwynedd:/u/lnz# cat /proc/rd/c0/current_status
  ...
  Physical Devices:
    0:1 - Disk: Online, 2201600 blocks
    0:2 - Disk: Online, 2201600 blocks
    0:3 - Disk: Online, 2201600 blocks
    1:1 - Disk: Online, 2201600 blocks
    1:2 - Disk: Online, 2201600 blocks
    1:3 - Disk: Online, 2201600 blocks
  Logical Drives:
    /dev/rd/c0d0: RAID-5, Online, 5498880 blocks, Write Thru
    /dev/rd/c0d1: RAID-6, Online, 3305472 blocks, Write Thru
  Rebuild Completed Successfully

and /proc/rd/status indicates that everything is healthy once again:

gwynedd:/u/lnz# cat /proc/rd/status
OK


                EXAMPLE II - DRIVE FAILURE WITH A STANDBY DRIVE

The following annotated logs demonstrate the controller configuration and
online status monitoring capabilities of the Linux DAC960 Driver. The test
configuration comprises 6 1GB Quantum Atlas I disk drives on two channels of a
DAC960PJ controller. The physical drives are configured into a single drive
group with a standby drive, and the drive group has been configured into two
logical drives, one RAID-5 and one RAID-6. Note that these logs are from an
earlier version of the driver and the messages have changed somewhat with newer
releases, but the functionality remains similar. First, here is the current
status of the RAID configuration:

gwynedd:/u/lnz# cat /proc/rd/c0/current_status
***** DAC960 RAID Driver Version 2.0.0 of 23 March 1999 *****
Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com>
Configuring Mylex DAC960PJ PCI RAID Controller
  Firmware Version: 4.06-0-08, Channels: 3, Memory Size: 8MB
  PCI Bus: 0, Device: 19, Function: 1, I/O Address: Unassigned
  PCI Address: 0xFD4FC000 mapped at 0x8807000, IRQ Channel: 9
  Controller Queue Depth: 128, Maximum Blocks per Command: 128
  Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33
  Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63
  Physical Devices:
    0:1 - Disk: Online, 2201600 blocks
    0:2 - Disk: Online, 2201600 blocks
    0:3 - Disk: Online, 2201600 blocks
    1:1 - Disk: Online, 2201600 blocks
    1:2 - Disk: Online, 2201600 blocks
    1:3 - Disk: Standby, 2201600 blocks
  Logical Drives:
    /dev/rd/c0d0: RAID-5, Online, 4399104 blocks, Write Thru
    /dev/rd/c0d1: RAID-6, Online, 2754560 blocks, Write Thru
  No Rebuild or Consistency Check in Progress

gwynedd:/u/lnz# cat /proc/rd/status
OK

The above messages indicate that everything is healthy, and /proc/rd/status
returns "OK" indicating that there are no problems with any DAC960 controller
in the system. For demonstration purposes, while I/O is active Physical Drive
1:2 is now disconnected, simulating a drive failure.
The failure is noted by
the driver within 10 seconds of the controller's having detected it, and the
driver logs the following console status messages:

DAC960#0: Physical Drive 1:1 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02
DAC960#0: Physical Drive 1:3 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02
DAC960#0: Physical Drive 1:2 killed because of timeout on SCSI command
DAC960#0: Physical Drive 1:2 is now DEAD
DAC960#0: Physical Drive 1:2 killed because it was removed
DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now CRITICAL
DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now CRITICAL

Since a standby drive is configured, the controller automatically begins
rebuilding onto the standby drive:

DAC960#0: Physical Drive 1:3 is now WRITE-ONLY
DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 4% completed

Concurrently with the above, the driver status available from /proc/rd also
reflects the drive failure and automatic rebuild. The status message in
/proc/rd/status has changed from "OK" to "ALERT":

gwynedd:/u/lnz# cat /proc/rd/status
ALERT

and /proc/rd/c0/current_status has been updated:

gwynedd:/u/lnz# cat /proc/rd/c0/current_status
  ...
  Physical Devices:
    0:1 - Disk: Online, 2201600 blocks
    0:2 - Disk: Online, 2201600 blocks
    0:3 - Disk: Online, 2201600 blocks
    1:1 - Disk: Online, 2201600 blocks
    1:2 - Disk: Dead, 2201600 blocks
    1:3 - Disk: Write-Only, 2201600 blocks
  Logical Drives:
    /dev/rd/c0d0: RAID-5, Critical, 4399104 blocks, Write Thru
    /dev/rd/c0d1: RAID-6, Critical, 2754560 blocks, Write Thru
  Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 4% completed

As the rebuild progresses, the current status in /proc/rd/c0/current_status is
updated every 10 seconds:

gwynedd:/u/lnz# cat /proc/rd/c0/current_status
  ...
  Physical Devices:
    0:1 - Disk: Online, 2201600 blocks
    0:2 - Disk: Online, 2201600 blocks
    0:3 - Disk: Online, 2201600 blocks
    1:1 - Disk: Online, 2201600 blocks
    1:2 - Disk: Dead, 2201600 blocks
    1:3 - Disk: Write-Only, 2201600 blocks
  Logical Drives:
    /dev/rd/c0d0: RAID-5, Critical, 4399104 blocks, Write Thru
    /dev/rd/c0d1: RAID-6, Critical, 2754560 blocks, Write Thru
  Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 40% completed

and every minute a progress message is logged on the console by the driver:

DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 40% completed
DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 76% completed
DAC960#0: Rebuild in Progress: Logical Drive 1 (/dev/rd/c0d1) 66% completed
DAC960#0: Rebuild in Progress: Logical Drive 1 (/dev/rd/c0d1) 84% completed

Finally, the rebuild completes successfully. The driver logs the status of the
logical and physical drives and the rebuild completion:

DAC960#0: Rebuild Completed Successfully
DAC960#0: Physical Drive 1:3 is now ONLINE
DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now ONLINE
DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now ONLINE

/proc/rd/c0/current_status is updated:

***** DAC960 RAID Driver Version 2.0.0 of 23 March 1999 *****
Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com>
Configuring Mylex DAC960PJ PCI RAID Controller
  Firmware Version: 4.06-0-08, Channels: 3, Memory Size: 8MB
  PCI Bus: 0, Device: 19, Function: 1, I/O Address: Unassigned
  PCI Address: 0xFD4FC000 mapped at 0x8807000, IRQ Channel: 9
  Controller Queue Depth: 128, Maximum Blocks per Command: 128
  Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33
  Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63
  Physical Devices:
    0:1 - Disk: Online, 2201600 blocks
    0:2 - Disk: Online, 2201600 blocks
    0:3 - Disk: Online, 2201600 blocks
    1:1 - Disk: Online, 2201600 blocks
    1:2 - Disk: Dead, 2201600 blocks
    1:3 - Disk: Online, 2201600 blocks
  Logical Drives:
    /dev/rd/c0d0: RAID-5, Online, 4399104 blocks, Write Thru
    /dev/rd/c0d1: RAID-6, Online, 2754560 blocks, Write Thru
  Rebuild Completed Successfully

and /proc/rd/status indicates that everything is healthy once again:

gwynedd:/u/lnz# cat /proc/rd/status
OK

Note that the absence of a viable standby drive does not create an "ALERT"
status.
Once dead Physical Drive 1:2 has been replaced, the controller must be
told that this has occurred and that the newly replaced drive should become the
new standby drive:

gwynedd:/u/lnz# echo "make-standby 1:2" > /proc/rd/c0/user_command
gwynedd:/u/lnz# cat /proc/rd/c0/user_command
Make Standby of Physical Drive 1:2 Succeeded

The echo command instructs the controller to make Physical Drive 1:2 into a
standby drive, and the status message that results from the operation is then
available for reading from /proc/rd/c0/user_command, as well as being logged to
the console by the driver. Within 60 seconds of this command the driver logs:

DAC960#0: Physical Drive 1:2 Error Log: Sense Key = 6, ASC = 29, ASCQ = 01
DAC960#0: Physical Drive 1:2 is now STANDBY
DAC960#0: Make Standby of Physical Drive 1:2 Succeeded

and /proc/rd/c0/current_status is updated:

gwynedd:/u/lnz# cat /proc/rd/c0/current_status
  ...
  Physical Devices:
    0:1 - Disk: Online, 2201600 blocks
    0:2 - Disk: Online, 2201600 blocks
    0:3 - Disk: Online, 2201600 blocks
    1:1 - Disk: Online, 2201600 blocks
    1:2 - Disk: Standby, 2201600 blocks
    1:3 - Disk: Online, 2201600 blocks
  Logical Drives:
    /dev/rd/c0d0: RAID-5, Online, 4399104 blocks, Write Thru
    /dev/rd/c0d1: RAID-6, Online, 2754560 blocks, Write Thru
  Rebuild Completed Successfully
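

                   APPENDIX - SCRIPTING THE STATUS CHECK

The manual checks in both examples (reading /proc/rd/status, then looking at
/proc/rd/cN/current_status when it reports "ALERT") lend themselves to a small
script. What follows is only a minimal sketch and is not part of the driver
distribution. The function names dac960_summary, dac960_problems, and
dac960_check are hypothetical, as is the "UNKNOWN" result for a missing file.
The file paths and the "Dead", "Critical", and "Write-Only" strings come from
the logs above; matching "Offline" is an assumption about how the driver
spells that state.

```shell
#!/bin/sh
# Sketch of a DAC960 health check built on the /proc/rd special files.
# All function names here are hypothetical, chosen for illustration.

# Print the one-word summary ("OK" or "ALERT") from the status file,
# or "UNKNOWN" if the file cannot be read (e.g. driver not loaded).
dac960_summary() {
    status_file=${1:-/proc/rd/status}
    if [ -r "$status_file" ]; then
        # Strip whitespace so the result is a bare word.
        tr -d ' \t\r\n' < "$status_file"
        echo
    else
        echo "UNKNOWN"
    fi
}

# Print the drive lines of a current_status file that are not healthy.
# "Offline" is an assumed spelling; the other states appear in the logs.
dac960_problems() {
    grep -Ei 'dead|critical|offline|write-only' "$1"
}

# Report the summary; on anything other than "OK", list the problem
# lines for one controller and return a nonzero exit status.
dac960_check() {
    status_file=${1:-/proc/rd/status}
    detail_file=${2:-/proc/rd/c0/current_status}
    summary=$(dac960_summary "$status_file")
    echo "DAC960 status: $summary"
    if [ "$summary" = "OK" ]; then
        return 0
    fi
    [ -r "$detail_file" ] && dac960_problems "$detail_file"
    return 1
}
```

Called with no arguments, dac960_check reads /proc/rd/status and the status
file for controller 0. Both paths can be overridden, which allows the
functions to be tried against saved copies of the status files on a machine
without a DAC960. Since the driver refreshes its status every 10 seconds,
polling more often than that should gain nothing.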
