Complete Overview of All 3ware Event Messages

Note: Please be aware that this article / this category either refers to older software/hardware components or is no longer maintained for other reasons. This page is no longer updated and remains available in the archive purely for reference purposes.

This article lists all possible error and warning messages of a 3ware controller. Until now, this list was only available in the "3ware HTML Bookshelf", which is located on the 3ware Codeset CD.

General Information

In addition to the messages from the "3ware HTML Bookshelf" listed further below, this section first covers messages that are not listed there:

kernel: 3w-9xxx: scsi1: ERROR: (0x03:0x010D): Invalid field in CDB
According to the knowledge base, this message can be ignored.

kernel: 3w-xxxx: scsi3: Command failed: status = 0xc4, flags = 0x43, unit #8
This error is caused by a command such as reading the SMART values. If no hard disk is attached to port 8, this error message can be ignored.

kernel: 3w-xxxx: scsi0: AEN: WARNING: Sector repair occured: port #1
A defective sector was detected on the hard disk at port 1. Because a hard disk has a number of spare sectors for error correction, this message is not a cause for concern as long as it does not occur repeatedly.

kernel: 3w-9xxx: scsi0: AEN: INFO (0x04:0x0053): <NULL>:
This message is not properly documented, but it merely indicates a memory size problem on the server. This message can be ignored.

kernel: 3w-9xxx: scsi1: AEN: ERROR (0x04:0x005F): Cache synchronization failed; some data lost:unit=1.
The file system should be checked for errors and the controller should be replaced.

kernel: 3w-9xxx: scsi2: ERROR: (0x03:0x0101): Invalid command opcode:opcode=0x4D
This error message appears when SMART values are requested from a logical drive instead of a physical drive.
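When scanning system logs for the 3w-9xxx messages quoted above, it can help to split each line into its parts. The following is a minimal sketch, assuming the line layout seen in the examples in this section (optional "AEN:", a severity, and a pair of hexadecimal numbers in parentheses). The function name parse_3ware_line and the field names are hypothetical, and older 3w-xxxx lines without the parenthesised pair are deliberately not matched.

```python
import re

# Layout assumed from the 3w-9xxx example lines in this article:
#   3w-9xxx: scsi<N>: [AEN: ]<SEVERITY>[:] (0xNN:0xNNNN): <text>
LINE_RE = re.compile(
    r"(?P<driver>3w-\w+): scsi(?P<host>\d+): (?:AEN: )?"
    r"(?P<severity>ERROR|WARNING|INFO):? "
    r"\((?P<source>0x[0-9A-Fa-f]+):(?P<code>0x[0-9A-Fa-f]+)\): "
    r"(?P<text>.*)"
)


def parse_3ware_line(line):
    """Return a dict of fields for a matching log line, or None otherwise."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["host"] = int(fields["host"])
    # Both numbers inside the parentheses are hexadecimal.
    fields["source"] = int(fields["source"], 16)
    fields["code"] = int(fields["code"], 16)
    return fields


if __name__ == "__main__":
    sample = ("kernel: 3w-9xxx: scsi2: ERROR: (0x03:0x0101): "
              "Invalid command opcode:opcode=0x4D")
    print(parse_3ware_line(sample))
```

Returning None for unrecognised lines keeps the helper usable as a filter over a whole log file.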

Further Information

Further information on 3ware error messages can be found in the article 3Ware Controller Problem Determination Procedures on twiki.cern.ch.^[1]

0001 Controller reset occurred

Event Type: Information
Cause: The device driver has sent a soft reset to the 3ware RAID controller. The driver does this when the controller has not responded to a command within the allowed time limit (30 sec.). After the soft reset command has been sent, the driver will resend the command.
Action: If this message occurs more than three times a day, collect the system logs and contact Technical Support.
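The "more than three times a day" rule can be checked automatically once the timestamps of the reset events have been extracted from the logs. This is a minimal sketch; the function name and the idea of passing in a list of datetime objects are assumptions for illustration, not part of any 3ware tool.

```python
from collections import Counter
from datetime import datetime

RESET_LIMIT_PER_DAY = 3  # "more than three times a day" from the Action text


def days_exceeding_reset_limit(timestamps, limit=RESET_LIMIT_PER_DAY):
    """Return the sorted dates on which more than `limit` resets occurred."""
    per_day = Counter(ts.date() for ts in timestamps)
    return sorted(day for day, count in per_day.items() if count > limit)


if __name__ == "__main__":
    resets = [datetime(2024, 1, 1, hour) for hour in (1, 5, 9, 13)]
    for day in days_exceeding_reset_limit(resets):
        print(f"{day}: collect system logs and contact Technical Support")
```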

0002 Degraded unit

Event Type: Error
Cause: An error was encountered and the unit is now operating in degraded (non-redundant) mode. This is usually due to a drive failure or the physical removal of a drive from a redundant unit.
Action: Check hardware connections and reseat the drive or drives. Rescan the controller from 3DM or CLI to see if the unit has been restored. If you are able to restore the unit before any data has been written to the unit, a rebuild will not be necessary. If the unit remains degraded, replace the missing or dead drives and initiate a rebuild.

0003 Controller error occurred

Event Type: Error
Cause: The 3ware RAID controller has encountered an internal error.
Action: Please collect log files and contact AMCC Customer Support, as a replacement board may be required.

0004 Rebuild failed

Event Type: Error
Cause: The 3ware RAID controller was unable to complete a rebuild operation. This error can be caused by drive errors on either the source or the destination of the rebuild. However, because ATA drives can reallocate sectors on write errors, the rebuild failure is most likely caused by the source drive of the rebuild detecting a read error.
Action: The default operation of the 3ware RAID controller is to abort a rebuild if an error is encountered. If you want rebuilds to continue when there is a source error, you can set a unit policy to Continue on Source Error When Rebuilding in 3DM or CLI.

The consequence of continuing a rebuild when there is a source error is that there may be corrupt data in your rebuilt unit. In some cases, however, this may be your only alternative for recovering as much data as possible from a unit that has become degraded. To lower the likelihood of getting this error, schedule regular verifications.

0005 Rebuild completed

Event Type: Information
Cause: The 3ware RAID controller has successfully completed a rebuild. The data is now redundant.
Action: None required.

0006 Incomplete unit detected

Event Type: Warning
Cause: The 3ware RAID controller has detected an incomplete unit.

An incomplete unit is a unit in which the 3ware RAID controller is unable to detect one or more drives. The drives may be missing, dead, or improperly connected. A unit that is incomplete is also degraded (although a degraded unit can be complete if all drives are still detected, including the failed drive).

Action: Check hardware connections and reseat the drives. Rescan the controller from 3DM to see if the unit has been restored. If you are able to restore the unit before any data has been written to the unit, a rebuild will not be necessary. If the unit remains incomplete, replace the missing or dead drives and initiate a rebuild.
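The distinction between "incomplete" and "degraded" described above can be expressed as a small classification. This sketch uses three hypothetical drive states ("ok", "failed" for a drive that is detected but faulty, "missing" for a drive that is not detected) and ignores whether the unit is redundant or already inoperable.

```python
def classify_unit(drive_states):
    """Classify a unit from the states of its member drives.

    drive_states: iterable of "ok", "failed" (detected but faulty)
    or "missing" (not detected by the controller).
    """
    states = list(drive_states)
    # Incomplete: the controller cannot detect one or more drives.
    incomplete = any(state == "missing" for state in states)
    # An incomplete unit is also degraded; a complete unit is degraded
    # if a detected drive has failed.
    degraded = incomplete or any(state == "failed" for state in states)
    return {"incomplete": incomplete, "degraded": degraded}


if __name__ == "__main__":
    print(classify_unit(["ok", "failed"]))
    print(classify_unit(["ok", "missing"]))
```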

0007 Initialize completed

Event Type: Information
Cause: The 3ware RAID controller completed the “synching” background initialization sequence of RAID levels 1, 5, 6, 10, or 50. For RAID 5, RAID 6, and RAID 50, the data on the unit was read and the resultant new parity was written. For RAID 1 and 10, one half of the mirror was copied to the other half (mirrors are synchronized).

This message will not appear for a foreground initialization.

Action: None required.

0008 Unclean shutdown detected

Event Type: Warning
Cause: The 3ware RAID controller detected an unclean shutdown of the operating system, either from a power failure or improper shutdown procedure. The controller will force the unit to begin verifying, due to the possibility that data on a redundant unit could be out of synchronization.
Action: Allow the verification to complete. Verifications have little overhead in terms of system performance and keep your units in optimum condition.

To prevent unclean shutdowns, always go through the normal shutdown procedure. It is also recommended to use an uninterruptible power supply (UPS) to prevent unclean shutdowns due to sudden power loss.

0009 Drive timeout detected

Event Type: Error
Cause: A drive has failed to respond to a command from a 3ware RAID controller within the allowed time limit (20 secs.). After sending this error message, the controller will attempt to recover the drive by sending a reset to that drive and retrying the failed command.

Possible causes of drive time-outs (also known as ATA-Port time-outs) include a bad or intermittent disk drive, power cable or interface cable.

Action: If you have checked hardware connections and no cause other than the drive can be found, replace the drive.

You may also want to use the drive manufacturer’s diagnostic and repair utilities on the drive.

000A Drive error detected

Event Type: Error
Cause: A drive has returned an error to the 3ware RAID controller indicating that it is unable to complete a command. The error type is not a time-out (0009) or uncorrected ECC (0026).

This message may be seen as part of a recovery operation initiated by the 3ware RAID controller on the drive. One possible cause is multiple write commands to a sector forcing the drive to remap a defective sector. This message may be seen if error recovery operations initiated by the 3ware RAID controller are unsuccessful.

Action: If you see this message, the drive repairs may lie outside of the 3ware RAID controller’s abilities. Try running the drive manufacturer’s diagnostic and repair utilities on the drive.

If necessary, replace the drive.

000B Rebuild started

Event Type: Information
Cause: The 3ware RAID controller started to rebuild a degraded unit. The rebuild may have been initiated by you, may have started automatically on a hot spare or may have started after drive removal or insertion (due to the Auto Rebuild policy).
Action: Allow the rebuild to complete. This will return the unit to its normal redundant state.

000C Initialize started

Event Type: Information
Cause: The 3ware RAID controller started an initialization. This is always a “synching” background initialization and does not erase user data. Initialization either occurs at unit creation time for larger RAID 5, 6, or 50 units or later during the initial verification of redundant units.
Action: Allow the initialization to complete. This will return the unit to its normal redundant state.

000E Initialize failed

Event Type: Error
Cause: The 3ware RAID controller was unable to complete the initialization. This error can be caused by unrecoverable drive errors.

If this unit was a redundant unit, and the initialization failed because of a problem on a particular disk drive, then the unit will be degraded.

Action: If the unit was degraded, then rebuild the unit. This may necessitate replacing the drive.

Check physical cable and power connections. You can also run the drive manufacturer’s diagnostic and repair utilities on the drive.

000F SMART threshold exceeded

Event Type: Warning
Cause: SMART monitoring is predicting a potential drive failure.

The 3ware RAID controller supports SMART monitoring, whereby the individual drives automatically monitor certain parametric information such as error rates and retry counts. This type of monitoring may be able to predict a drive failure before it happens, allowing you to schedule service of the unit before it becomes degraded. The SMART status of each drive attached to the 3ware RAID controller is monitored daily.

Action: AMCC recommends that you replace any drive that has exceeded the SMART threshold.

If the drive is part of a redundant unit, remove the drive through 3DM 2 or CLI. Replace the drive and start a rebuild. If the drive is not part of a redundant unit, then you will need to back up your data before replacing the drive.

0019 Drive removed

Event Type: Warning
Cause: A drive was physically removed from the controller while the controller was powered on.
Action: If the drive is not part of a redundant unit, return the drive as soon as possible. You may need to rescan the controller to have the drive recognized. If at all possible, do not remove a drive from a non-redundant unit as this may cause data loss or a system hang.

001A Drive inserted

Event Type: Information
Cause: A drive was connected to the controller while the controller was powered on.
Action: The drive is now available for use. If the drive is part of a unit, add the remaining drives and rescan the controller in 3DM or CLI to bring the unit online.

001E Unit inoperable

Event Type: Error
Cause: The 3ware RAID controller is unable to detect sufficient drives for the unit to be operable. Some drives have failed or are missing.

The controller only generates this message if the unit is missing drives for more than 20 seconds. This allows a hot swap of a drive to be completed without generating this error.

Action: The unit is no longer available for use. Return all missing drives to the unit. If the drives are physically present, check all data and power connections.
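The 20-second grace period described above behaves like a simple debounce: the event is only raised once drives have been missing for longer than the limit. A minimal sketch, with a hypothetical function name and times given as plain seconds:

```python
GRACE_SECONDS = 20  # limit stated in the text above


def should_report_inoperable(missing_since, now, grace=GRACE_SECONDS):
    """Return True once drives have been missing for more than `grace` seconds.

    missing_since: time in seconds at which the drives went missing,
    or None if no drives are currently missing.
    """
    if missing_since is None:
        return False
    return (now - missing_since) > grace


if __name__ == "__main__":
    # A hot swap finished after 12 seconds stays silent.
    print(should_report_inoperable(missing_since=100, now=112))
```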

001F Unit Operational

Event Type: Information
Cause: Drive insertion caused a unit that was inoperable to become operational again. Any data that was on that unit will still be there. This message is only sent if the unit was inoperable for more than 20 seconds. That means that if the hot swap of a drive occurred within 20 seconds, messages are not generated.
Action: None required. The unit is available for use.

0021 Downgrade UDMA mode

Event Type: Warning
Cause: The 3ware RAID controller has downgraded the UDMA transfer rate between the controller and the ATA disk drives. This message only applies to parallel ATA and certain legacy serial ATA drives.

Background Information

The 3ware RAID controller communicates to the ATA disk drives through the Ultra DMA (UDMA) protocol. This protocol ensures data integrity across the ATA cable by appending a Cyclical Redundancy Check (CRC) for all ATA data that is transferred. If the data becomes corrupted between the drive and the 3ware RAID controller (because of an intermittent or poor quality cable connection) the 3ware RAID controller detects this as a UDMA CRC or cable error. The 3ware RAID controller then retries the failed command three times at the current UDMA transfer rate. If the error persists, it lowers the UDMA transfer rate (for example, from UDMA 100 to UDMA 66) and retries another three times.

Action: Check for possible causes of UDMA CRC errors such as defective or poor quality interface cables or cable routing problems through electrically noisy environments (for instance, cables are too close to the power supply). Also check for cables which are not standard or exceed the ATA specification.
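The retry behaviour from the Background Information above (three attempts at the current rate, then three more at the next lower rate) can be sketched as follows. The list of rates and the callback try_transfer are assumptions for illustration; the text describes a single downgrade step, so this sketch does not continue to even lower rates.

```python
UDMA_RATES = [133, 100, 66, 33]  # illustrative list of rates, fastest first
RETRIES_PER_RATE = 3             # "retries ... three times" per the text


def transfer_with_downgrade(try_transfer, start_rate=100):
    """Attempt a transfer, downgrading the UDMA rate once on repeated errors.

    try_transfer(rate) -> bool is a hypothetical callback that returns
    True when the command completes without a CRC error.
    Returns the rate that succeeded, or None if all attempts failed.
    """
    start = UDMA_RATES.index(start_rate)
    # Current rate plus at most one lower rate, as described in the text.
    for rate in UDMA_RATES[start:start + 2]:
        for _ in range(RETRIES_PER_RATE):
            if try_transfer(rate):
                return rate
    return None


if __name__ == "__main__":
    # Simulated cable that only works at UDMA 66 or slower.
    print(transfer_with_downgrade(lambda rate: rate <= 66, start_rate=100))
```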

0022 Upgrade UDMA mode

Event Type: Warning
Cause: During a self-test, the controller found that a drive was not in the optimal UDMA mode and upgraded its UDMA transfer rate.
Action: None required. The drive and cable are working in optimal mode.

0023 Sector repair completed

Event Type: Warning
Cause: The 3ware RAID controller moved data from a bad sector on the drive to a new location.

Background Information

The 3ware RAID controller supports a feature called dynamic sector repair that allows the unit to recover from certain drive errors that would normally result in a degraded unit situation. For redundant units such as RAID 1, 5, 6, 10, and 50, the 3ware RAID controller essentially has two copies of your data available. If a read command to a sector on a disk drive results in an error, it reverts to the redundant copy in order to satisfy the host’s request. At this point, the 3ware RAID controller has a good copy of the requested data in its cache memory. It will then use this data to force the failing drive to reallocate the bad sector, which essentially repairs the sector.

Action: Sector repairs are an indication of the presence of grown defects on a particular drive. While typical modern disk drives are designed to allow several hundred grown defects, special attention should be paid to any drive in a unit that begins to indicate sector repair messages. This may be an indication of a drive that is beginning to fail. You may wish to replace the drive, especially if the number of sector repair errors exceeds 3 per month.
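The dynamic sector repair mechanism described in the Background Information can be illustrated with a toy model of a two-drive mirror. The classes and names below are hypothetical and only mimic the sequence of steps (failed read, read from the redundant copy, rewrite to force reallocation); they do not reflect the controller firmware.

```python
class BadSector(Exception):
    """Raised by the toy drive when a read hits a defective sector."""


class ToyDrive:
    def __init__(self, sectors, bad=()):
        self.sectors = dict(sectors)
        self.bad = set(bad)

    def read(self, lba):
        if lba in self.bad:
            raise BadSector(lba)
        return self.sectors[lba]

    def write(self, lba, data):
        # Writing models the drive reallocating the defective sector.
        self.sectors[lba] = data
        self.bad.discard(lba)


def read_with_repair(primary, mirror, lba):
    """Return (data, repaired) for a read from a toy two-drive mirror."""
    try:
        return primary.read(lba), False
    except BadSector:
        data = mirror.read(lba)   # satisfy the request from the good copy
        primary.write(lba, data)  # force the failing drive to remap the sector
        return data, True


if __name__ == "__main__":
    failing = ToyDrive({7: b"data"}, bad={7})
    healthy = ToyDrive({7: b"data"})
    print(read_with_repair(failing, healthy, 7))
```

Each call that returns repaired=True corresponds to one "Sector repair completed" event in this model, which is why a rising count points at a drive with grown defects.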

0024 Buffer integrity test failed

Event Type: Error
Cause: The 3ware RAID controller performs diagnostics on its internal RAM devices as part of its data integrity features. Once a day, a non-destructive test is performed on the cache memory. Failure of the test indicates a failure of a hardware component on the 3ware RAID controller. This message is sent to notify you of the problem.
Action: You should replace the 3ware RAID controller.

If the controller is still under warranty, contact 3ware Technical Support for a replacement controller.

0025 Cache flush failed; some data lost

Event Type: Error
Cause: The 3ware RAID controller was not able to commit data to the drive(s) during a caching operation. This is due to a serious drive failure, possibly from a power outage.

Background Information

The 3ware RAID controller uses caching layer firmware to improve performance. For write commands this means that the controller acknowledges it has completed a write operation before the data is committed to disk. If the 3ware RAID controller cannot commit the data to the drive after it has acknowledged to the host, this message is posted.

Action: To troubleshoot the reasons for the failure, collect the logs for your system and contact 3ware technical support.

0026 Drive ECC error reported

Event Type: Error
Cause: Drive ECC errors are an indication of grown defects on a particular drive. For redundant units, this typically means that dynamic sector repair has been invoked (see message 0023 Sector repair completed). For non-redundant units (Single Disk, RAID 0 and degraded units), which do not have another copy of the data, drive ECC errors result in the 3ware RAID controller returning failed status to the associated host command.
Action: Schedule periodic verifications of all units so that drive ECC errors can be found and corrected. If the unit is non-redundant, a file system check of the unit is recommended.

Under Windows, right-click on your drive icon and choose Properties > Tools > Check Now. Under Linux or FreeBSD use fsck /dev/sda1. If you have more than one SATA device, substitute the correct drive letter and partition number, such as sdb2, for sda1.

0027 DCB checksum error detected

Event Type: Error
Cause: The drive’s Drive Configuration Block (DCB) has been corrupted.

The 3ware RAID controller stores certain configuration parameters on a reserved area of each disk drive called the Drive Configuration Block. As part of power-on initialization, the 3ware RAID controller performs a checksum of the DCB area to ensure consistency.

Action: If this error occurs, please contact 3ware technical support.

0028 DCB version unsupported

Event Type: Error
Cause: The unit that is connected to your 3ware RAID controller was created on a legacy 3ware product that is incompatible with your new controller.

During the evolution of the 3ware product line, the format of the Drive Configuration Block (DCB) has been changed to accommodate new features. The DCB format expected by the 3ware RAID controller and the DCB that is written on the drive must be compatible. If they are not, this message is sent.

Action: Return the drives back to their original controller and contact 3ware technical support.

0029 Verify started

Event Type: Information
Cause: The 3ware RAID controller has started verifying the data integrity of a unit.
Action: Allow verify to complete to identify any possible data integrity issues.

002A Verify failed

Event Type: Error
Cause: Verification of a unit has terminated with an error.
Action: When a verify fails, redundant units will automatically resynchronize user data through a background initialization. The initialize will not erase user data, but will recalculate and rewrite user parity data.

If the unit was non-redundant, any data in the error location is lost. (However, the error could be in a part of the drive that did not contain data.) A unit file system check is recommended. Under Windows, right-click on your drive icon and choose Properties > Tools > Check Now. Under Linux or FreeBSD use fsck /dev/sda1. If you have more than one SATA device, substitute the correct drive letter and partition number, such as sdb2, for sda1.

The resynchronization of data that takes place during a background initialization can slow down access to the unit. Once initialization has begun, it cannot be canceled. You can pause it, however, by scheduling it to take place during off-hours. For more information, see “Scheduling Background Tasks” on page 166. You can also set the initialization process to go slower and use fewer system resources. For more information, see “Setting Background Task Rate” on page 165. (Initialization occurs at the Rebuild rate.)

002B Verify completed

Event Type: Information
Cause: Verification of the data integrity of a unit was completed successfully.

002C Source drive ECC error overwritten

Event Type: Error
Cause: A read error was encountered during a rebuild and the controller is configured to ‘ignore ECC’ or to ‘Force continue on source errors’. The sector in error was reallocated. This will cause uncorrectable blocks to be rewritten, but the data may be incorrect.
Action: It is recommended that you execute a file system check when the rebuild completes.

Under Windows, right-click on your drive icon and choose Properties > Tools > Check Now. Under Linux or FreeBSD use fsck /dev/sda1. If you have more than one SATA device, substitute the correct drive letter and partition number, such as sdb2, for sda1.

002D Source drive error occurred

Event Type: Error
Cause: An error on the source drive was detected during a rebuild operation. The rebuild has stopped as a result.
Action: The controller will report an error, even if the area of the source drive that had the error did not contain data. Scheduling regular verifies will lessen the chance of getting this error.

You can force the rebuild to continue by setting the Overwrite ECC Error policy through 3DM, CLI, or 3BM, and then rebuilding the unit again. This will cause uncorrectable blocks to be rewritten, but the data may be incorrect.

It is recommended that you execute a file system check when the rebuild completes. Under Windows, right-click on your drive icon and choose Properties > Tools > Check Now. Under Linux or FreeBSD use fsck /dev/sda1. If you have more than one SATA device, substitute the correct drive letter and partition number, such as sdb2, for sda1.

002E Replacement drive capacity too small

Event Type: Error
Cause: The storage capacity of the drive you are using as a replacement drive is too small and cannot be used.
Action: Use a replacement drive equal to or larger than the drives already in use.
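Before inserting a replacement drive, the capacity rule above can be checked against the drives already in the unit. This is a minimal sketch that follows the wording of the Action text literally; the function name is hypothetical and capacities are plain byte counts supplied by the caller.

```python
def is_valid_replacement(replacement_bytes, member_bytes):
    """Return True if the replacement is at least as large as every member.

    member_bytes: capacities (in bytes) of the drives already in use.
    """
    return replacement_bytes >= max(member_bytes)


if __name__ == "__main__":
    members = [500_107_862_016, 500_107_862_016]
    print(is_valid_replacement(500_107_862_016, members))
```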

002F Verify not started; unit never initialized

Event Type: Warning
Cause: A verify operation has been attempted by the 3ware RAID controller, but the unit has never been initialized before. The unit will automatically transition to initializing mode and then start a verify.
Action: None required.

This is considered a normal part of operation. Not all types of RAID units need to be initialized in order to have full performance. The initialize will not erase user data, but will calculate and write parity data or mirror data to the drives in the unit.

0030 Drive not supported

Event Type: Error
Cause: 3ware 8000 and 9500S Serial ATA controllers only support UltraDMA-100/133 drives when using the parallel-to-serial ATA converter. This message indicates that an unsupported drive was detected during rollcall or a hot swap. This message could also indicate that the parallel-to-serial converter was jumpered incorrectly.
Action: Use a parallel ATA drive which supports UDMA 100 or 133 and check that the parallel-to-serial converter was correctly jumpered to correspond to UDMA 100 or 133 drives.

0032 Spare capacity too small

Event Type: Warning
Cause: There is a valid hot spare but the capacity is not sufficient to use it for a drive replacement in existing units.
Action: Replace the spare with a drive of equal or larger capacity than the existing drives.

0033 Migration started

Event Type: Information
Cause: The 3ware RAID controller has started the migration of a unit.

0034 Migration failed

Event Type: Error
Cause: The migration of a unit has failed.
Action: Review the list of events on the Alarms page for other entries that may give you an idea of why the migration failed (for example, a drive error on a specific port).

0035 Migration completed

Event Type: Information
Cause: The migrated unit is now ready to be used.

0036 Verify fixed data/parity mismatch

Event Type: Warning
Cause: A verify error was found and fixed by the 3ware RAID controller.
Action: None required.

0037 SO-DIMM not compatible

Event Type: Error
Cause: There is incompatible SO-DIMM memory connected to the 9500S controller.

Note: This message only applies to the 3ware 9500S controller, which has removable memory. Other 3ware controller models do not have memory that can be removed.

Action: Replace the incompatible SO-DIMM with a compatible one.

0038 SO-DIMM not detected

Event Type: Error
Cause: The 3ware 9500S RAID controller is inoperable due to missing SO-DIMM memory.

Note: This message only applies to the 3ware 9500S controller, which has removable memory. Other 3ware controller models do not have memory that can be removed.

Action: Install a compatible SO-DIMM on the controller.

0039 Buffer ECC error corrected

Event Type: Warning
Cause: The controller has detected and corrected a memory ECC error.
Action (according to twiki.cern.ch): This is a serious error which merits a vendor call. [...] The controller should be replaced.^[1]

003A Drive power on reset detected

Event Type: Error
Cause: The controller has detected that a drive has lost power and then restarted. The controller may degrade the unit if it is a redundant unit (non-redundant units cannot be degraded).
Action: If this drive was the only one to lose power, check the cable connections. Also, check that your power supply is adequate for the type and number of devices attached to it.

003B Rebuild paused

Event Type: Information
Cause: The rebuild operation is paused.

Rebuilds are normally paused for two (formerly ten) minutes after a system first boots up and during non-scheduled times when scheduling is enabled. Disabling or modifying the schedule with 3DM or CLI will allow the rebuild to resume.
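The two pause conditions named above (a delay after boot and non-scheduled times) can be combined into one check. This is a minimal sketch; modelling the schedule as a set of allowed hours is a simplification chosen for illustration, and the function name is hypothetical.

```python
BOOT_DELAY_SECONDS = 120  # "two ... minutes after a system first boots up"


def is_task_paused(seconds_since_boot, hour, schedule_enabled, allowed_hours):
    """Return True if a background task would currently be paused.

    allowed_hours: set of hours (0-23) in which the task may run;
    only consulted when scheduling is enabled.
    """
    if seconds_since_boot < BOOT_DELAY_SECONDS:
        return True
    if schedule_enabled and hour not in allowed_hours:
        return True
    return False


if __name__ == "__main__":
    night_window = {1, 2, 3}
    print(is_task_paused(600, hour=14, schedule_enabled=True,
                         allowed_hours=night_window))
```

Disabling the schedule in this model corresponds to passing schedule_enabled=False, after which only the boot delay can pause the task.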

003C Initialize paused

Event Type: Information
Cause: The initialization is paused.

Initializations are normally paused for two (formerly ten) minutes after a system first boots up. Initialization is also paused during non-scheduled times when scheduling is enabled. Initializations follow the rebuild schedule.

Action: If you want the initialize to resume, you can disable or modify the schedule through 3DM or CLI.

003D Verify paused

Event Type: Information
Cause: The verify operation is paused.

Verifies are normally paused for 2 (formerly 10) minutes after a system first boots up. Verifies are also paused during non-scheduled times when scheduling is enabled.

Action: If you want the verification to resume, you can disable or modify the schedule through 3DM or CLI.

003E Migration paused

Event Type: Information
Cause: Migration is paused. Migration follows the rebuild schedule.
Action: If you want the migration to resume, you can disable or modify the schedule through 3DM or CLI.

003F Flash file system error detected

Event Type: Warning
Cause: A corrupted flash file system was found on the 3ware RAID controller during boot-up.

The 3ware RAID controller stores configuration parameters as files in its flash memory. These files can be corrupted when a flash operation is interrupted by events such as a power failure. The controller will attempt to restore the flash files from a backup copy.

Action: Update to the latest firmware, as earlier firmware resets corrupted files to default settings.

We recommend using 3DM, CLI or 3BM to check your settings, in case they were not able to be restored.

0040 Flash file system repaired

Event Type: Information
Cause: A corrupted flash file system has been successfully repaired.

Some of the flash files with insufficient data may have been lost in the operation. The configuration parameters which are lost will then return to their default values.

Action: We recommend using 3DM, CLI or 3BM to check your settings, in case they were not able to be restored.

0041 Unit number assignments lost

Event Type: Warning
Cause: The unit number assignments have been lost.

This may have occurred as a result of a soft reset.

Action: Please contact AMCC 3ware technical support.

0042 Primary DCB read error occurred

Event Type: Warning
Cause: The controller found an error while reading the primary copy of the Drive Configuration Block (DCB).

The controller will attempt to correct the error by reading the back-up copy of the DCB. If a valid DCB is found, the primary DCB is re-written to rectify the errors.

Action: AMCC recommends verifying the unit.

0043 Backup DCB read error detected

Event Type: Warning
Cause: The controller has detected a latent error in the backup Drive Configuration Block (DCB).

The 3ware RAID controller checks the backup DCB, even when the primary DCB is OK. If an error is found, the controller will attempt to correct the error by reading the primary copy. If the primary copy is valid, the backup DCB will be rewritten to rectify the errors.

Action: AMCC recommends verifying the unit.
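Events 0042 and 0043 describe the same recovery idea from two directions: whichever DCB copy is valid is used to rewrite the other. The following sketch illustrates that logic only; the function name and the is_valid callback are hypothetical, and the actual DCB format and checksum are not covered in this article.

```python
def reconcile_dcb(primary, backup, is_valid):
    """Return (primary, backup, status) after repairing from the valid copy.

    is_valid(copy) -> bool is a hypothetical validity check, for example
    a checksum comparison.
    """
    primary_ok = is_valid(primary)
    backup_ok = is_valid(backup)
    if primary_ok and backup_ok:
        return primary, backup, "ok"
    if backup_ok:
        # Case of event 0042: primary unreadable, backup is valid.
        return backup, backup, "primary rewritten from backup"
    if primary_ok:
        # Case of event 0043: latent error in the backup copy.
        return primary, primary, "backup rewritten from primary"
    return primary, backup, "both copies invalid"


if __name__ == "__main__":
    def valid(copy):
        return copy is not None

    print(reconcile_dcb(None, {"unit": 0}, valid))
```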

0044 Battery voltage is normal

Event Type: Information
Cause: The battery pack voltage being monitored by the Battery Backup Unit fell outside of the acceptable range and then came back within the acceptable range.
Action: None required.

0045 Battery voltage is low

Event Type: Warning
Cause: The battery pack voltage being monitored by the Battery Backup Unit has fallen below the warning threshold.
Action: The Battery Backup Unit is presently still able to back up the 3ware RAID controller, but you should replace the battery pack if the warning continues.

0046 Battery voltage is high

Event Type: Warning
Cause: The battery pack voltage being monitored by the Battery Backup Unit has risen above the warning threshold.
Action: The Battery Backup Unit is presently still able to back up the 3ware RAID controller, but you should replace the battery pack if the warning continues.

0047 Battery voltage is too low

Event Type: Error
Cause: The battery pack voltage being monitored by the Battery Backup Unit is too low to back up the 3ware RAID controller.

You may see this message during a battery capacity test. In this case, it is not a sign of battery failure. You may also see this message if the battery pack is plugged in while the computer is on. This is not advised.

Action: Replace the battery pack if none of the above causes apply and the warning continues.

0048 Battery voltage is too high

Event Type: Error
Cause: The battery pack voltage being monitored by the Battery Backup Unit is too high to back up the 3ware RAID controller.
Action: The battery pack must be replaced.

This may be a fault in the BBU control module. If you get this error, do the following:
1. Turn off the computer and remove the 3ware RAID controller.
2. Remove the BBU control module from the 3ware RAID controller and the battery module from the remote card.
3. Unplug the battery from the control module.
4. Return the BBU control module and battery module to 3ware.

For more details on removing the BBU, see the installation guide that came with your 3ware RAID controller.

0049 Battery temperature is normal

Event Type: Information
Cause: The battery pack temperature being monitored by the Battery Backup Unit fell outside of the acceptable range and then came back within the acceptable range.
Action: None required.

004A Battery temperature is low

Event Type: Warning
Cause: The battery pack temperature being monitored by the Battery Backup Unit has fallen below the acceptable range. The most likely cause is ambient temperature.
Action: The Battery Backup Unit is presently still able to back up the 3ware RAID controller, but you should replace the battery pack if the temperature warning persists and is not due to environmental reasons.

004B Battery temperature is high

Event Type: Error
Cause: The battery pack temperature being monitored by the Battery Backup Unit has risen above the acceptable range. However, the BBU is still able to back up the 3ware RAID controller.
Action: Check for sufficient airflow around the card. To increase airflow you can:
Leave the PCI slots next to the controller empty
Add fans to your computer case
Move and bundle wiring that is blocking air circulation

The Battery Backup Unit is presently still able to back up the 3ware RAID controller, but you should replace the battery pack if the temperature warning persists.

004C Battery temperature is too low

Event Type: Error
Cause: The battery pack temperature being monitored by the Battery Backup Unit is too low.

The BBU is unable to back up the 3ware RAID controller.

Action: Contact 3ware technical support.

004D Battery temperature is too high

Event Type: Error
Cause: The battery pack temperature being monitored by the Battery Backup Unit is too high.

The BBU is unable to back up the 3ware RAID controller.

Action: Check for sufficient airflow around the card. To increase airflow you can:
Leave the PCI slots next to the controller empty
Add fans to your computer case
Move and bundle wiring that is blocking air circulation

004E Battery capacity test started

Event Type: Information
Cause: A battery test was started through CLI or 3DM.

Background Information

The test estimates how many hours the Battery Backup Unit will be able to back up the 3ware RAID controller in case of a power failure. This test performs a full battery charge/discharge/re-charge cycle and may take up to 20 hours to complete. During this test the Battery Backup Unit cannot back up the 3ware RAID controller. In addition, all units have their write cache disabled until the test completes.

Action: None required.

004F Cache synchronization skipped

Event Type: Warning
Cause: The cache synchronization that is normally performed when power is restored after a power failure was skipped, and write data is still being backed up in the controller cache. This can occur if a unit was physically removed or became inoperable during the power outage.
Action: Return missing drive(s) to the controller so that the missing write data can be saved.

0050 Battery capacity test completed

Event Type: Information
Cause: The Battery Backup Unit has completed a battery capacity test.

The BBU is again able to back up the 3ware RAID controller and write cache has been re-enabled for all units. (During the test, backup and write cache were disabled.)

0051 Battery health check started

Event Type: Information
Cause: The Battery Backup Unit periodically evaluates the health of the battery and its ability to back up the 3ware RAID controller in case of a power failure. This health check has started.

0052 Battery health check completed

Event Type: Information
Cause: The Battery Backup Unit periodically evaluates the health of the battery and its ability to back up the 3ware RAID controller in case of a power failure. This message is posted to the host when this health check has completed.

0053 Battery capacity test is overdue

Event Type: Information
Cause: No battery capacity test has been run in the last 6 months, which is the maximum recommended interval. This message will be sent once every week until the test is run.
Action: AMCC recommends running the test at least once every 6 months if the measured battery capacity is more than 120 hours. If the measured battery capacity is less than 120 hours, the recommended test interval is 4 weeks.
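
The interval rule above can be written as a small helper, for example for a monitoring script that reminds you when a test is due. This is only a sketch: the function name and constant are hypothetical, "6 months" is approximated as 26 weeks, and a capacity of exactly 120 hours (which the recommendation does not cover) is treated conservatively as the shorter interval.

```python
from datetime import timedelta

# Threshold taken from the recommendation above; the name is hypothetical.
CAPACITY_THRESHOLD_HOURS = 120


def recommended_test_interval(measured_capacity_hours: float) -> timedelta:
    """Return the recommended maximum gap between battery capacity tests."""
    if measured_capacity_hours > CAPACITY_THRESHOLD_HOURS:
        # Assumption: "6 months" is approximated as 26 weeks.
        return timedelta(weeks=26)
    # Less than 120 hours: test every 4 weeks.
    # Assumption: exactly 120 hours also falls into this branch.
    return timedelta(weeks=4)
```

For example, `recommended_test_interval(150)` yields an interval of 26 weeks, while `recommended_test_interval(90)` yields 4 weeks.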

0055 Battery charging started

Event Type: Information
Cause: The Battery Backup Unit has started a battery charge cycle.
Action: None required.

0056 Battery charging completed

Event Type: Information
Cause: The Battery Backup Unit has completed a battery charge cycle.

0057 Battery charging fault

Event Type: Error
Cause: The Battery Backup Unit has detected a battery fault during a charge cycle. The Battery Backup Unit is not ready and is unable to back up the 3ware RAID controller.
Action: Replace the battery pack.

0058 Battery capacity is below warning level

Event Type: Information
Cause: The measured capacity of the battery is below the warning level. The Battery Backup Unit is presently still able to back up the 3ware RAID controller, but it is weakening.
Action: Replace the battery pack if the warnings persist.

0059 Battery capacity is below error level

Event Type: Error
Cause: The measured capacity of the battery is below the error level. The Battery Backup Unit is not ready and is unable to back up the 3ware RAID controller.
Action: Replace the battery pack.

005A Battery is present

Event Type: Information
Cause: A battery pack is connected to the 3ware RAID controller.

005B Battery is not present

Event Type: Error
Cause: The battery pack has been removed from the 3ware RAID controller.
Action: Reinstall the battery pack.

005C Battery is weak

Event Type: Warning
Cause: The Battery Backup Unit periodically evaluates the health of the battery and its ability to back up the 3ware RAID controller in case of a power failure. This message is posted when the result of the health test is below the warning threshold.
Action: Replace the battery pack if warnings persist.

005D Battery health check failed

Event Type: Error
Cause: The Battery Backup Unit is not able to back up the 3ware RAID controller.

The Battery Backup Unit periodically evaluates the health of the battery and its ability to back up the 3ware RAID controller in case of a power failure. This message is posted when the result of the health test is below the fault threshold.

Action: Replace the battery pack. The BBU cannot presently back up the controller.

005E Cache synchronization completed

Event Type: Information
Cause: The 3ware RAID controller performs cache synchronization when system power is restored following a power failure. This message is posted for each unit when the cache synchronization completes successfully.

You will also see this message if drive insertion causes a unit to become operational and retained write cache data was flushed.

005F Cache synchronization failed; some data lost

Event Type: Error
Cause: The 3ware RAID controller performs cache synchronization when system power is restored following a power failure. The cache synchronization did not complete successfully, and some cached write data was lost.
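
The kernel log lines quoted at the beginning of this article show how such an event appears on a Linux host, for example `AEN: ERROR (0x04:0x005F): Cache synchronization failed; some data lost:unit=1.` The following sketch extracts the severity, the event code, and the message text from a line in that format so the code can be looked up in this list. The names `AenEvent`, `AEN_PATTERN`, and `parse_aen` are hypothetical, and the pattern is derived only from the sample lines in this article. The meaning of the first hexadecimal field is not documented here, so it is kept as an opaque `prefix`.

```python
import re
from typing import NamedTuple, Optional


class AenEvent(NamedTuple):
    severity: str  # e.g. "ERROR", "WARNING", "INFO"
    prefix: int    # first hex field; its meaning is not documented here
    code: int      # event code as listed in this article, e.g. 0x005F
    message: str


# Assumption: pattern derived from the sample log lines in this article only.
AEN_PATTERN = re.compile(
    r"AEN: (?P<severity>[A-Z]+) "
    r"\((?P<prefix>0x[0-9A-Fa-f]+):(?P<code>0x[0-9A-Fa-f]+)\): "
    r"(?P<message>.*)"
)


def parse_aen(line: str) -> Optional[AenEvent]:
    """Parse one kernel log line; return None if it is not in this format."""
    match = AEN_PATTERN.search(line)
    if match is None:
        return None
    return AenEvent(
        severity=match["severity"],
        prefix=int(match["prefix"], 16),
        code=int(match["code"], 16),
        message=match["message"].strip(),
    )
```

Lines from the older 3w-xxxx driver, such as `AEN: WARNING: Sector repair occured: port #1`, carry no hexadecimal code and are deliberately not matched by this sketch.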

0062 Enclosure removed

Event Type: Warning
Cause: Applies only to the 9690SA controller.

An enclosure is no longer accessible to the 9690SA RAID controller. The likely cause is that the enclosure has been powered down or that a cable has been unplugged.

0063 Enclosure added

Event Type: Information
Cause: Applies only to the 9690SA controller.

An enclosure is now accessible to the 9690SA RAID controller. The likely cause is that an enclosure connected to the controller has been powered up or that a cable has been plugged in.

0064 Local link up

Event Type: Information
Cause: Applies only to the 9690SA controller.

A cable has been plugged in, restoring a link to a controller phy.

0065 Local link down

Event Type: Warning
Cause: Applies only to the 9690SA controller.

A cable has been unplugged, removing a link to a controller phy.

8000 Enclosure fan normal

Event Type: Information
Cause: Applies only to the 9690SA controller.

The fan’s performance or operation is now back within the acceptable range.

Action: None required.

8001 Enclosure fan error

Event Type: Error
Cause: Applies only to the 9690SA controller.

The enclosure fan is not functioning normally and may be blocked or defective.

Action: Check that the fan or fans are not blocked. If a fan appears defective, replace it as soon as possible.

For information on replacing a fan, see your enclosure documentation or contact your enclosure manufacturer.

8002 Enclosure fan removed

Event Type: Warning
Cause: Applies only to the 9690SA controller.

A fan has either been removed or has become unplugged.

Action: Replace or reseat fan and make sure it is operational. An insufficient number of operating fans may lead to overheating of the components in the enclosure.

8003 Enclosure fan added

Event Type: Information
Cause: Applies only to the 9690SA controller.

A fan has been added to the enclosure or an existing fan has been plugged in.

Action: None required.

8004 Enclosure fan unknown

Event Type: Warning
Cause: Applies only to the 9690SA controller.

The enclosure is unable to recognize the fan. The fan may not be seated correctly or may be malfunctioning.

Action: Reseat the fan.

If it is necessary to replace the fan, see your enclosure documentation or contact your enclosure manufacturer.

8005 Enclosure fan off

Event Type: Warning
Cause: Applies only to the 9690SA controller.

An enclosure fan has been turned off. It is no longer cooling the enclosure.

Action: The enclosure normally controls the on/off function of the fan. If there is no over-heating problem, no action is necessary.

8020 Enclosure temp normal

Event Type: Information
Cause: Applies only to the 9690SA controller.

The enclosure temperature is now back within the acceptable range.

Action: None required.

8021 Enclosure temp low

Event Type: Information
Cause: Applies only to the 9690SA controller.

The enclosure temperature is lower than normal.

Action: In general, cooler operating temperatures are good for enclosure components.

However, if the temperature is very cold, condensation can occur and cause media errors and damage. Take steps to bring the operating environment back within the enclosure manufacturer’s specifications. Allow cold equipment to warm up gradually before powering on.

8022 Enclosure temp high

Event Type: Warning
Cause: Applies only to the 9690SA controller.

The enclosure temperature is higher than normal.

Action: Take steps to lower the enclosure temperature, such as adding fans, clearing enclosure openings of blockages, and increasing ventilation.

Make sure that the enclosure environment does not get any hotter. See your enclosure documentation or contact your enclosure manufacturer to ensure that you are adhering to proper operating conditions and environments.

8023 Enclosure temp below operating

Event Type: Information
Cause: Applies only to the 9690SA controller.

The enclosure temperature is below the enclosure manufacturer’s specified operating temperature.

Action: In general, cooler operating temperatures are good for enclosure components.

However, if the temperature is very cold, condensation can occur and cause media errors and damage. Take steps to bring the operating environment back within the enclosure manufacturer’s specifications. Allow cold equipment to warm up gradually before powering on.

8024 Enclosure temp above operating

Event Type: Error
Cause: Applies only to the 9690SA controller.

The enclosure temperature is above the enclosure manufacturer’s specified operating temperature.

Action: Make sure that the fans are operational. Check for blocked ventilation in the enclosure and the operating environment.

Continued operation of the enclosure at high temperatures may lead to data loss and operational failure.

8025 Enclosure temp removed

Event Type: Warning
Cause: Applies only to the 9690SA controller.

The temperature sensor in the enclosure has been removed or has failed and the enclosure temperature is no longer being monitored.

Action: The temperature sensor should be replaced or repaired. This may require specialized skills. Contact your enclosure manufacturer for more information.

8026 Enclosure temp added

Event Type: Information
Cause: Applies only to the 9690SA controller.

A temperature sensor has been added to the enclosure or an existing sensor has been plugged in. This message can also occur due to a poor connection.

Action: If due to a poor connection, the repair will require specialized skills. Contact your enclosure manufacturer for more information.

8027 Enclosure temp critical

Event Type: Error
Cause: Applies only to the 9690SA controller.

The temperature in the enclosure is dangerously out of the recommended operating range.

Action: Take immediate steps to correct the temperature problem.

Take steps to lower the enclosure temperature, such as adding fans, clearing enclosure openings of blockages, and increasing ventilation of the operating environment. Continued operation of the enclosure at high temperatures may lead to data loss and operational failure. See your enclosure documentation or contact your enclosure manufacturer to make sure you are following proper operating procedures.

8028 Enclosure temp unknown

Event Type: Warning
Cause: Applies only to the 9690SA controller.

The enclosure is reporting that it is unable to determine the temperature of the unit. This may be due to a failed or missing sensor.

Action: Check the operational status of the temperature sensor. If it has failed, replace it.

See your enclosure documentation or contact your enclosure manufacturer for more information.

8030 Enclosure power normal

Event Type: Information
Cause: Applies only to the 9690SA controller.

The enclosure power supply is now back within the acceptable range.

Action: None required.

8031 Enclosure power fail

Event Type: Error
Cause: Applies only to the 9690SA controller.

One of the enclosure power supplies is not working. Either a power supply has failed or a cord is unplugged.

Action: Reseat the power supply cord. Replace any failed power supply as soon as possible.

It is recommended to use an uninterruptible power supply (UPS) to protect against power failures.

8032 Enclosure power removed

Event Type: Warning
Cause: Applies only to the 9690SA controller.

One of the enclosure power supplies has been removed from the enclosure or a power supply is unplugged.

Action: Return or reconnect the power supply as soon as possible.

It is recommended to use an uninterruptible power supply (UPS) to protect against power failures.

8033 Enclosure power added

Event Type: Information
Cause: Applies only to the 9690SA controller.

A power supply has been added to the enclosure or an existing power supply has been plugged in.

Action: None required.

8034 Enclosure power unknown

Event Type: Warning
Cause: Applies only to the 9690SA controller.

There is a power supply in the enclosure, but it is not of a known type.

Action: Make sure the power supply is operational by reseating it, or replace it if it has failed. See your enclosure documentation or contact your enclosure manufacturer for more information.

It is recommended to use an uninterruptible power supply (UPS) to protect against power failures.

8037 Enclosure power off

Event Type: Warning
Cause: Applies only to the 9690SA controller.

An enclosure power supply has been turned off.

Action: If needed for power supply redundancy or to meet power requirements, turn the power supply back on.

It is recommended to use an uninterruptible power supply (UPS) to protect against power failures.

8040 Enclosure voltage normal

Event Type: Information
Cause: Applies only to the 9690SA controller.

The enclosure power supply voltage is now back within the acceptable range.

Action: None required.

8041 Enclosure voltage over

Event Type: Error
Cause: Applies only to the 9690SA controller.

The enclosure power supply voltage is higher than the normal range.

Action: This error is rare. If you see it you may need a UPS or voltage regulator to stay within the recommended voltage range.

8042 Enclosure voltage under

Event Type: Error
Cause: Applies only to the 9690SA controller.

The enclosure power supply voltage is lower than the normal range. This can be due to a failing power supply or an unreliable power source.

Action: If due to a failing power supply, replace it as soon as possible.

It is recommended to use an uninterruptible power supply (UPS) to protect against power failures.

8043 Enclosure voltage unknown

Event Type: Warning
Cause: Applies only to the 9690SA controller.

The enclosure is not reporting voltage data. This can be due to a failing power supply.

Action: If applicable, replace the failed power supply.

Contact your enclosure manufacturer for more information.

8044 Enclosure current normal

Event Type: Information
Cause: Applies only to the 9690SA controller.

The enclosure power supply current is now back within the acceptable range.

Action: None required.

8045 Enclosure current over

Event Type: Error
Cause: Applies only to the 9690SA controller.

The enclosure power supply amperage is higher than normal.

Action: Replace the failing power supply or remove any extra devices. This may be caused by too many hard drives, which have exceeded the enclosure power specifications. See your enclosure documentation or contact your enclosure manufacturer for more details.

Too much current cannot be corrected by a UPS, although a UPS can provide some protection against a power surge.

8046 Enclosure current unknown

Event Type: Warning
Cause: Applies only to the 9690SA controller.

The enclosure’s amperage is unknown. A power supply may have failed.

Action: If applicable, replace the failed power supply.

See your enclosure documentation or contact your enclosure manufacturer for more details.
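
The enclosure events above fall into code blocks by sensor type, which can be useful when routing alerts. The following sketch groups them by numeric range. The names are hypothetical, and the ranges are inferred only from the codes listed in this article. Unlisted codes inside a range (for example 8035 and 8036) are assigned to the same group as an assumption, not as documented behavior.

```python
from typing import Optional

# Assumption: ranges inferred from the event codes listed in this article.
ENCLOSURE_SENSOR_RANGES = {
    "fan": range(0x8000, 0x8006),          # 8000-8005
    "temperature": range(0x8020, 0x8029),  # 8020-8028
    "power": range(0x8030, 0x8038),        # 8030-8037
    "voltage": range(0x8040, 0x8044),      # 8040-8043
    "current": range(0x8044, 0x8047),      # 8044-8046
}


def enclosure_sensor_class(code: int) -> Optional[str]:
    """Return the sensor group for an enclosure event code, or None."""
    for name, codes in ENCLOSURE_SENSOR_RANGES.items():
        if code in codes:
            return name
    return None
```

A code outside these ranges, such as a BBU event like 0x005F, returns `None`.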

References

[1] (twiki.cern.ch) - also in the LSI Knowledge Base

Author: Thomas-Krenn.AG
