INFORMATION RELEASE AB 159

DAC960PL MYLEX RAID CONTROLLER

The purpose of this document is to introduce the DAC960PL Mylex Controller to Field Service.

The Mylex DAC960PL PCI-to-SCSI disk array controller is an intelligent, high-performance controller targeted at the small to mid-range server marketplace.

The DAC960PL supports RAID levels 0, 1, 5 and 0+1 (RAID 10) for single or multiple drive arrays.

Arrays are configured using MS-DOS utilities. These utilities are highly intuitive and have an excellent graphical interface.

The DAC960PL is supplied with 1 to 3 independent Fast/Wide SCSI channels. Using a 32-bit RISC-based microprocessor, ASIC logic arrays and dedicated, battery-backed read/write cache memory, the DAC960PL reduces the host CPU load whilst supporting a disk I/O throughput of up to 20MB/second per channel.

CONTROLLER FUNCTIONS AND FEATURES

Key Features

1. Complete RAID/SCSI disk array configuration and management.

2. Automatic rebuild after disk failure without user intervention.

3. SCSI performance enhancement for faster data transfers.

4. Automatic fault monitoring and recovery increases system availability.

5. Supports all major operating systems and network environments.

Key 1 - Manages RAID/SCSI Disk Arrays

-Supports multiple RAID levels (0, 1, 5, and 0+1), allowing the user to select the desired combination of storage capacity, data availability (redundancy) and I/O transfer performance for any data application.

-Connects up to 21 SCSI drives that can be grouped and managed as a single very large-capacity disk drive (up to 32GB), as multiple large-capacity drive groups (256GB addressable), or as individual disk drives.

-Industry-standard Fast/Wide SCSI-2 interface supports most SCSI drives.

Key 2 - Automates RAID Functions

-Automatic failed-drive detection

-Automatic rebuild of the array using stand-by (hot spare) disk after a drive failure.

-Transparent drive rebuild permits automatic rebuild of failed drives during normal operation without having to take the array off-line.

-Automatic error detection/correction of parity errors, bad blocks, etc.

-Automatic sector re-mapping recovers defective media and corrects data errors.

Key 3 - Enhances SCSI Performance

-Fast/Wide SCSI channels provide high-performance data transfers at up to 20MB/second per channel.

-PCI bus mastering provides up to 132MB/second burst data rates.

-Tag-queuing to the host allows processing of up to 64 simultaneous multi-thread system commands or data requests.

-User-defined performance-tuning through selectable cache write policy, variable stripe width, and rebuild priority to optimize controller performance during rebuild.

-Disconnect/reconnect capability for enhanced performance and SCSI bus optimization.

Key 4 - Increases System Availability

-Built-in diagnostics provide controller and drive fault monitoring during power-on and continuous operation.

-Status alerts notify the administrator or user of critical conditions.

-Supports AEMI protocols for integrated monitoring of enclosing power supplies, fans, and temperature.

-Battery backup option protects data in the controller cache in the event of a power interruption.

Key 5 - Supports Popular Operating Systems

-Novell NetWare 3.1x, 4.0x, 4.1x

-Microsoft Windows NT 3.5x and Advanced Server

-IBM OS/2 2.1, 2.2, 3.0 (WARP), SMP

-SCO UNIX 3.2.4 and SCO ODT

-Novell UnixWare v2.0

-Banyan Vines 6.x

-MS-DOS 5.x, 6.x, and above

SPECIFICATIONS

Controller          DAC960PL
CPU                 Intel i960JF RISC 32-bit microprocessor

Memory
  Module Type       DRAM, 72-pin SIMM, 70ns or faster
  Size              Minimum: 2MB
                    Optional: 4, 8, 16, or 32MB (nx36)
  Cache Type        Write: Selectable, Write Through or Write Back
                    Read: Always enabled

Firmware
  ROM Type          Flash EEPROM, 256K x 8; boot-sectored
  BIOS              Executed from DRAM (Shadow RAM)

PCI
  I/O Processor     Mylex 189206 ASIC
  Bus Type          32-bit, 33 MHz, PCI Local Bus
  Mode              Bus Master
  Transfer Rate     Up to 132 MB/second (burst)

SCSI
  I/O Processor     53C720, one per channel
  Bus Type          8 or 16-bit Fast/Wide SCSI-2 compliant
  Transfer Rate     Up to 20 MB/second per channel
                    Up to 60 MB/second, 3 channels

RAID
  Levels supported  RAID 0, Striping
                    RAID 1, Mirroring
                    RAID 5, Parity
                    RAID 0+1, Striping and Mirroring
                    JBOD, Single-drive control

Electrical requirements
  Input Power       5V ±5% @ 2.5 Amp** (w/4MB memory)
                    5V ±5% @ 3.5 Amp** (w/16MB memory)
                    ** (supply currents assume drives providing term power)

Environmental
  Temperature       Operating: 5 degrees C to 55 degrees C
                    Storage: -60 degrees C to +150 degrees C
  Humidity          Operating: 20% to 90% rh
  (non-condensing)  Non-operating: 20% to 90% rh

Dimensions
  Length            12.5 inches
  Height            4.19 inches

OVERVIEW

The Mylex DAC960PL provides high-performance PCI-to-SCSI disk array control functionality for small to medium-size network servers or work-stations. When properly configured, the DAC960PL delivers a high degree of fault tolerance and advanced disk array management features through the use of RAID technology.

The DAC960PL Disk Array Controller plugs into one of the host system's Peripheral Component Interconnect (PCI) bus slots and connects to internal disk drives and/or external drive enclosures via standard SCSI-2 compliant cabling.

Figure 1 - System Diagram

CONTROLLER COMPONENTS

Key components of the DAC960PL controller, shown in Figure 2, are:

-i960 RISC processor

-Memory subsystem and DRAM cache

-PCI and SCSI I/O subsystems

The i960 Processor

The DAC960PL CPU is a 32-bit Intel i960JF RISC microprocessor. The CPU controls all functions of the DAC960PL, including PCI and SCSI bus transfers, RAID processing, configuration, data striping, error recovery, and drive rebuild.

Memory Subsystem and DRAM Cache

The DAC960PL can be configured with up to 32 megabytes of DRAM cache, depending on the type of memory modules being used. A minimum of 2MB, 70ns DRAM is required for controller operation. Cache write policy is user-selectable for each logical unit in the configuration.

A fast 32-bit interface between the i960 CPU and the cache memory DRAM is provided by the Mylex 189105 Memory Addressing and Control (MAC) unit. In addition to memory control and addressing functions, this ASIC provides the device mapping and decode for the NVRAM (non-volatile memory) and the electrically erasable programmable read-only memory (flash EEPROM).

Controller Firmware

The DAC960PL firmware contains the programs executed by the i960 CPU. The firmware resides in the on-board Flash EEPROM. This memory device retains information even after power is off, and can also be re-written, so the controller firmware can be upgraded without replacing any hardware chips.

The NVRAM stores data on the current configuration of the controller and its attached disk drives, and lists of pending write operations issued to any redundant drives. As the disk drive configurations change (for example, when a drive fails), the NVRAM keeps a record of the changes. This data is checksum protected so that after a power failure, the controller will recall the configuration and will restore consistency for all outstanding writes on restart.

Figure 2 - DAC960PL Controller Block Diagram

PCI Bus Interface

The interface between the host system PCI bus and the i960 processor on the DAC960PL is controlled in hardware by the Mylex 189206 PCU ASIC. The PCI provides fast data transfers without the limitations associated with PCI bridge technology. Interface to the host is by 32-bit, 33 MHz PCI local bus, using the single interrupt line, INTA#. Through PCI Bus Mastering, the DAC960PL supports burst data transfers up to 132 MB/second.

SCSI Bus Interface

The DAC960PL uses the 53C720SE SCSI I/O processor chip on each SCSI channel to allow the controller to simultaneously read or write data on up to seven disk drives per channel. The DAC960PL supports the Fast/Wide (8/16-bit) SCSI-2 standard, which is backward compatible with earlier SCSI standards. The DAC960PL delivers SCSI data transfer rates up to 20MB per second per channel (60 MB/sec, 3-channel).

SCSI FUNCTIONS

The DAC960PL i960 RISC processor and SCSI I/O processor(s) provide intelligent, high-performance SCSI interface and control. The DAC960PL manages and controls the SCSI bus arbitration between the controller and its connected devices, and all SCSI activity of the connected devices.

Multiple SCSI Format Support

The standard DAC960PL provides at least one, and optionally up to three, SCSI channels for connecting disk drives or other devices, such as CD-ROM and tape drives. With the appropriate cabling, these devices may be any combination of Narrow, Fast, or Wide SCSI formats (see Table 1).

SCSI Cabling and Termination Conventions

Disk drives equipped with a SCSI interface should be connected to the controller by means of cables that comply with standard SCSI data-rate, pinout, and cable-length conventions (including all internal wiring). Up to seven SCSI devices can be connected to each of the controller's drive channels. The first and last device on each channel must be terminated. The DAC960PL supports active termination (alternative-2, or ALT-2).

SCSI Address (Target ID) Selection

Each drive or device on a specific SCSI channel must be configured for a target address (or target ID) that is different from all other devices on that channel. The target ID (TID) is a SCSI address number from 0 to 7, and is assigned to each device attached to a SCSI channel during installation.

The default SCSI address for the DAC960PL controller is target ID 7. Each connected disk drive must therefore be assigned a different (unique) SCSI address, typically a target ID number from 0 to 6.
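The addressing rules above can be checked mechanically before a channel is cabled up. The following is a minimal sketch only; the function name check_target_ids and the wording of its messages are illustrative assumptions and are not part of any Mylex utility.

```python
CONTROLLER_ID = 7  # default controller target ID, as stated in the text


def check_target_ids(device_ids, controller_id=CONTROLLER_ID):
    """Return a list of problems found in one channel's target ID assignment.

    device_ids is the list of target IDs set on the drives of one channel
    (hypothetical input; the real IDs are set on the drives themselves).
    """
    problems = []
    seen = set()
    for tid in device_ids:
        if not 0 <= tid <= 7:
            problems.append(f"ID {tid} is outside the range 0-7")
        elif tid == controller_id:
            problems.append(f"ID {tid} is already used by the controller")
        elif tid in seen:
            problems.append(f"ID {tid} is assigned to more than one device")
        seen.add(tid)
    return problems


print(check_target_ids([0, 1, 2, 3]))  # an empty list means no conflicts
print(check_target_ids([0, 0, 7]))
```

An empty result means every device on the channel has a unique ID that does not collide with the controller.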

Table 1 - Supported SCSI Formats

SCSI TYPE               CLOCK RATE   DATA RATE   CONNECTOR            CABLE LENGTH
Wide SCSI-2 (16-bit)    10MHz        20MB/sec    68-pin               3m (10ft)
                        5MHz         10MB/sec    68-pin               6m (20ft)
Narrow SCSI-2 (8-bit)   10MHz        10MB/sec    68-pin or 50-pin*    3m (10ft)
                        5MHz         5MB/sec     68-pin or 50-pin*    6m (20ft)
SCSI-1 (8-bit)          5MHz         5MB/sec     50-pin*              6m (20ft)

*50-pin to 68-pin adapter required
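The data rates in Table 1 follow from the clock rate and the bus width. As a rough sketch, assuming one transfer per clock cycle and a transfer size equal to the bus width (the function name is an illustrative assumption):

```python
def scsi_data_rate_mb(clock_mhz, bus_width_bits):
    """Peak data rate in MB/sec, assuming one bus-width transfer per clock."""
    return clock_mhz * bus_width_bits // 8


# Reproduce the clock rate / data rate pairs listed in Table 1.
for name, clock, width in [
    ("Wide SCSI-2", 10, 16),
    ("Wide SCSI-2", 5, 16),
    ("Narrow SCSI-2", 10, 8),
    ("Narrow SCSI-2", 5, 8),
]:
    print(name, clock, "MHz ->", scsi_data_rate_mb(clock, width), "MB/sec")

# Three Fast/Wide channels running at full rate give the quoted aggregate.
print("3 channels:", 3 * scsi_data_rate_mb(10, 16), "MB/sec")
```

These are peak figures; sustained throughput depends on the drives and the workload.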

DRIVE ORGANISATION

The DAC960PL controller organises the SCSI drives connected to it as physical drives and logical units.

Physical Drives (Drive Groups or Packs)

Using the DAC960PL, up to eight individual disk drives can be combined to form a pack, or drive group, of physical drives that provides the array's logical unit capacity.

Note: If the disks in a drive group are not all the same size, every drive is treated as if it had the capacity of the smallest drive in the group.

To determine the total size of a drive group, multiply the size of the smallest drive in the drive group by the number of disk drives in the group.

For example, if there are four drives of 4GB each, and one drive of 2GB comprising a drive group, the effective capacity available for use is 10GB (5x2), not 18GB.
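The rule above is simple enough to express in a few lines. This is a sketch only; the function name is an illustrative assumption, and the result is the drive group size before any RAID redundancy overhead is taken off.

```python
def effective_group_capacity_gb(drive_sizes_gb):
    """Size of a drive group: smallest drive multiplied by the number of drives."""
    return min(drive_sizes_gb) * len(drive_sizes_gb)


drives = [4, 4, 4, 4, 2]  # four 4GB drives and one 2GB drive, as in the example
print(effective_group_capacity_gb(drives))  # effective capacity of the group
print(sum(drives))                          # raw capacity, for comparison
```

Matching drive sizes within a group avoids the unused capacity on the larger drives.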

The DAC960PL supports up to eight (8) drive groups.

Logical Units (System Drives)

A logical unit (or system drive) is that portion of a drive group (or a combination of up to eight drive groups) seen by the host system as a single logical device. The maximum addressable size of a single logical unit is 32GB.

Each logical unit is identified to the host by its logical unit number (LUN). The DAC960PL supports up to eight (8) LUNs per drive group. For example, on the first channel of the controller, the third logical unit having a SCSI target ID of 1 will be seen by the host computer as CH 0, ID 1, LUN 2 (since LUN numbering begins at 0, and continues 1, 2, 3, etc.).

Note: Use the DACCF software utility to configure the logical units (system drives).

RAID MANAGEMENT

RAID is an acronym for Redundant Array of Independent Disks. The DAC960PL controller implements several different versions of the Berkeley RAID technology, and two special versions that are specific only to the DAC960 family of RAID controllers. Each version (referred to as a RAID Level) that is supported by the DAC960PL controller is shown in Table 2.

An appropriate RAID level is selected when the logical drives are defined or created using the configuration software utility (e.g. DACCF). The choice of RAID level depends on the relative priority of:

-Disk capacity

-Data availability (fault tolerance or redundancy)

-Disk performance

The DAC960PL controller makes the RAID implementation and the disks' physical configuration transparent to the host operating system. This means that the host operating system drivers and software utilities are not affected, regardless of the RAID level selected.

Table 2 - Supported RAID Levels

RAID Level            Drives/Chnl.   Description
                      Min    Max

0                     2      8       Block striping is provided, which yields higher performance than with individual drives. There is no redundancy.

1                     2      2       Drives are paired and mirrored. All data is 100% duplicated on an equivalent drive (fully redundant).

5                     3      8       Data is striped across several physical drives. Parity protection is used for data redundancy.

0+1 (Mylex RAID 6)    3      8       Combination of RAID levels 0 and 1. This level provides striping and redundancy through mirroring.

JBOD (Mylex RAID 7)   1      1       "Just a Bunch of Drives". Each drive can operate independently, as with a common host bus adapter; or multiple drives may be spanned and seen as a single very large drive. No redundancy is provided.

RAID Techniques and Terms

The techniques of disk striping, mirroring, and parity (redundancy) are fundamental elements of RAID technology performed by the DAC960PL.

Mirroring (RAID 1)

Mirroring refers to the 100% duplication of data from one disk drive onto another. Each disk contains the mirror image of the data on the other drive.

Striping (RAID 0)

Striping refers to the storing of a sequential block of incoming data across multiple drives in a drive group. For example, if there are three drives in a drive group (or pack), the data will be separated into blocks. Block one of the data will be stored on drive one, block two on drive two, block three on drive three. Drive one will again be the location of the next block (block four); then, block five is stored on drive two, block six on drive three, and so on. This method can significantly increase disk system throughput, particularly for transferring large, sequential data blocks.
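The block placement described above is a simple round-robin mapping. A minimal sketch, with blocks and drives numbered from one as in the text (the function name is an illustrative assumption, and real controller firmware works in stripe-size units rather than abstract blocks):

```python
def drive_for_block(block_number, stripe_width):
    """Return the drive (1-based) that holds the given block (1-based)."""
    return (block_number - 1) % stripe_width + 1


# Three drives in the group: blocks 1-6 land on drives 1, 2, 3, 1, 2, 3.
for block in range(1, 7):
    print("block", block, "-> drive", drive_for_block(block, 3))
```

Because consecutive blocks sit on different drives, a large sequential transfer keeps all drives in the group busy at once.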

i) Stripe Order

The order in which SCSI drives appear within a drive group is the stripe order. It is critical that the selected stripe order is always maintained, to assure data integrity and the controller's ability to rebuild failed drives.

ii) Stripe Size

The size of the logically contiguous data block recorded on all drives connected to the controller is the stripe size. The default is 8KB. Other choices are 16, 32 or 64KB, which may be selected with the DACCF configuration utility (Advanced Functions menu, Physical Parameters option).

iii) Stripe Width

The number of drives within a drive group is referred to as the stripe width.

Striping with Parity (RAID 5)

Striping with parity (rotated XOR redundancy) is a method of providing complete data redundancy that requires only a fraction of the storage capacity that mirroring needs for storing redundant information.

In a system configured under RAID-5 (which requires at least three SCSI drives), all data and parity blocks are divided between the drives in such a way that if any single drive is removed (or fails), the data on the missing drive can be regenerated using the data on the remaining drives (XOR refers to the Boolean "Exclusive-OR" operator).
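The regeneration step can be illustrated with a few bytes of data. This sketch shows the XOR principle only: the block contents are made up, and the rotation of parity blocks across the drives is not modelled.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus the parity block computed from them.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# The drive holding data[1] fails. XOR of the surviving data blocks and the
# parity block regenerates the missing block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

The same operation serves both purposes: computing parity on write and regenerating a lost block on read or rebuild.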

DRIVE MANAGEMENT

The DAC960PL functions that monitor and control the operation of the physical and logical drives are instrumental to the controller's ability to perform RAID management and automated error recovery tasks.

Controlling Physical Drive States

The state of a physical drive refers to a SCSI drive's current operational status. At any given time, a SCSI drive can be in one of several states: ONLINE, STANDBY, READY, DEAD, REBUILD, or WRITE-ONLY.

The controller stores the state of the attached SCSI drives in its non-volatile memory. This information is retained even after power-off. If a SCSI disk is labeled DEAD in one session, it will stay in the dead state until a change is made either by using a system level utility or after a maintenance/rebuild procedure is performed.

On-line (ONL)

A SCSI drive (physical drive) is online if it:

1. Is powered on

2. Has been defined as a member of a drive group

3. Is operating properly.

Standby (SBY)

A SCSI disk drive is in a standby state if it:

1. Is powered on

2. Is able to operate properly

3. Has not been defined as part of any drive group

4. Has been defined as a standby

Dead (DED)

A drive is dead if it:

1. Is not present

2. Is present, but not powered on

3. Failed to operate properly and was killed by the controller

When the controller detects a failure on a disk, it kills that disk by changing its state to dead. A SCSI drive that is in the dead state does not participate in any I/O activity. No commands are issued to dead drives.

Write-Only (WOL)

A SCSI drive is in a write-only state if it was in the process of being rebuilt and the rebuild was terminated abnormally before it completed. The rebuild process depends on the RAID level:

- During a RAID 1 rebuild process, data is copied from the mirrored drive to the replacement drive.

- During a RAID 5 or RAID 0+1 rebuild, data is regenerated via the XOR redundancy algorithm and written to the replacement drive.

Ready (RDY)

A SCSI disk drive is in a ready state if it:

1. Is powered on

2. Is able to operate properly

3. Has not been defined as part of any drive group

4. Has not been defined as a standby.

Ready is not an actual drive state or command issued by the controller. The drive will change from RDY to SBY (standby) when the configuration is saved to memory.

Controlling Logical Unit States

The state of a DAC960PL logical unit can be ONLINE, CRITICAL, or OFF-LINE. Notice that the same term, online, is used for both physical drives and logical units.

Online

A logical unit is online if all of its participating physical drives are online.

Critical

A logical unit is considered critical when the failure of one more of its physical drives may result in a loss of data.

A logical unit is critical if it meets both of the following conditions:

1. It is configured for RAID 1, RAID 5 or RAID 0+1

2. One (and no more than one) of its physical drives is not online (refer to the description of Off-line below).

Off-line

An off-line logical unit is one on which no data can be read or written. No operations can be performed on off-line logical units. System commands issued to off-line logical units are returned with an error status.

A logical unit can be off-line under one of two conditions:

1. It is configured with a redundant RAID level (1, 5 or 0+1) and two or more of its SCSI drives are not online.

2. It is configured as RAID 0 or JBOD (or in a spanned set) and one or more of its SCSI drives are not online.
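The three logical unit states can be summarised as a small decision function. This is a sketch of the rules as stated above, not of the controller firmware; the function name and the string labels for RAID levels are illustrative assumptions.

```python
REDUNDANT_LEVELS = {"1", "5", "0+1"}  # RAID levels that tolerate one failed drive


def logical_unit_state(raid_level, drives_not_online):
    """Return ONLINE, CRITICAL or OFF-LINE for a logical unit.

    raid_level is "0", "1", "5", "0+1" or "JBOD"; drives_not_online is the
    number of the unit's physical drives that are not in the online state.
    """
    if drives_not_online == 0:
        return "ONLINE"
    if raid_level in REDUNDANT_LEVELS and drives_not_online == 1:
        return "CRITICAL"
    return "OFF-LINE"


print(logical_unit_state("5", 0))
print(logical_unit_state("5", 1))
print(logical_unit_state("5", 2))
print(logical_unit_state("0", 1))
```

A critical unit is still serving data, which is why a drive failure on a redundant array should be treated as urgent even though the host sees no error.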

Controlling Standby Replacement Drives (Hot Spares)

The standby replacement drive, or hot spare, is one of the most important features the DAC960PL controller provides to achieve automatic, non-stop service with a high degree of fault-tolerance. With the standby rebuild function, the controller performs a rebuild operation automatically when a SCSI disk drive fails and both of the following conditions are true:

1. A standby SCSI disk drive of identical or larger size is found attached to the same controller.

2. All of the system drives that are dependent on the failed disk are redundant system drives, e.g. RAID 1, RAID 5, or RAID 0+1.
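The two conditions can be written as a single check. The sketch below is an illustration under stated assumptions: the function name and arguments are hypothetical, and choosing the smallest suitable standby is an assumption of this example, since the text does not say which standby the controller picks when several qualify.

```python
REDUNDANT_LEVELS = {"1", "5", "0+1"}


def pick_standby(failed_size_mb, standby_sizes_mb, dependent_raid_levels):
    """Return the size of a usable standby drive, or None if no automatic rebuild.

    Condition 1: a standby of identical or larger size is attached.
    Condition 2: every system drive that depends on the failed disk is redundant.
    """
    if not all(level in REDUNDANT_LEVELS for level in dependent_raid_levels):
        return None
    candidates = [size for size in standby_sizes_mb if size >= failed_size_mb]
    # Assumption: prefer the smallest standby that is large enough.
    return min(candidates) if candidates else None


print(pick_standby(2048, [1024, 4096], ["5"]))       # a 4096MB standby qualifies
print(pick_standby(2048, [1024], ["5"]))             # standby too small
print(pick_standby(2048, [4096], ["5", "0"]))        # a RAID 0 unit depends on the disk
```

When the function returns None, the failed disk must be replaced and rebuilt manually.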

During the automatic rebuild process, system activity continues as normal. System performance may degrade slightly during a rebuild.

Using Standby Rebuild

To use the automatic standby rebuild feature, it is necessary to always maintain a standby disk in the system.

A standby disk can be created in one of two ways.

1. When the DAC960PL configuration is created or changed using the DACCF software utility, all disks attached to the controller that are online and not assigned to a drive group will be automatically labeled as standby disks.

2. A disk that is not part of any drive group may be made a standby drive by using the DOS-based DAC960 Toolkit utility, DAC960TK.EXE.

Standby Replacement Table

A standby replacement table stores data on up to eight automatic replacement events in any session (from one power-on/reset to the next power-off/reset). When the limit of eight is reached and a disk failure occurs, a standby replacement can take place but is not recorded in the replacement table.

The standby replacement table can be cleared from the DAC960PL by using the DACCF software utility Save Configuration command under either the New Configuration command or the View/Update Configuration command, System Drive Definition menu.

Hot-Swap Drive Replacement

The DAC960PL supports the ability of certain drive enclosures to perform a hot-swap drive replacement while the system is on-line. A disk can be disconnected, removed, or replaced with a different disk without taking the system off-line. The SCSI bus termination must be arranged so that a drive can be removed without disrupting the termination scheme.

Disk Failure Detection

The DAC960PL controller automatically detects SCSI disk failures. A monitoring process running on the controller checks, among other things, elapsed time on all commands issued to disks. A time-out will cause the disk to be reset and the command will be re-tried. If the command time-out occurs again, the disk could be killed by the controller (that is, its state changed to dead).

The DAC960PL controller also monitors SCSI bus parity errors and other potential problems. Any disk with too many errors will be changed to a dead state.

Disk Media Error Management

The DAC960PL controller manages SCSI disk media errors in a manner transparent to the user.

Disks are programmed to report errors. When a disk reports a media error during a read, the controller reads the data from the mirror (RAID 1 or RAID 0+1), or computes the data from the other blocks (RAID 5), and writes the data back to the disk that encountered the error. If the write fails, or the following verify-of-data fails (media error on write), the controller issues a REASSIGN command to the disk, and then writes the data to a new location. Since the problem has been resolved, no error is reported to the system.

When a disk reports a media error during a write, the controller issues a REASSIGN command to the disk, and writes the data out to a new location on the disk.

Checking Disk Parity

A parity check is a process that verifies the integrity of redundant data. For example, performing a parity check of a mirrored drive assures that the data on both drives of the mirrored pair are exactly the same. To verify RAID 5 redundancy, a parity check reads all associated data blocks, computes parity, reads parity, and verifies that the computed parity matches the read parity.
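Both kinds of check come down to a comparison. The following sketch works on single blocks held in memory; the function names are illustrative assumptions, and a real parity check walks every stripe of the logical unit.

```python
def mirror_consistent(block_a, block_b):
    """RAID 1: the two copies of a block must be identical."""
    return block_a == block_b


def raid5_stripe_consistent(data_blocks, stored_parity):
    """RAID 5: parity computed from the data blocks must match the stored parity."""
    computed = bytearray(len(stored_parity))
    for block in data_blocks:
        for i, byte in enumerate(block):
            computed[i] ^= byte
    return bytes(computed) == stored_parity


stripe = [b"\x01\x02", b"\x04\x08"]
print(raid5_stripe_consistent(stripe, b"\x05\x0a"))  # parity matches
print(raid5_stripe_consistent(stripe, b"\x05\x0b"))  # one parity bit is wrong
print(mirror_consistent(b"data", b"data"))
```

A mismatch tells you that the redundant information is inconsistent, but not which block is the bad one.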

CACHE MANAGEMENT

The DAC960PL provides performance enhancement of data transfers through its on-board cache memory. The controller supports cache memory sizes from 2MB (minimum) to 32MB (maximum). Cache memory is allocated by the controller memory management functions for Read Cache and Write Cache. Write cache policy is user-selectable for each logical unit to achieve optimum performance within specific applications.

Read Cache

Read Cache is always enabled by the controller. Its operation is transparent and requires no user intervention.

Write-Back Cache

Write-Back Cache refers to a caching strategy whereby write operations result in a completion status being sent to the host operating system as soon as the cache (not the disk drive) receives the data to be written. The target SCSI Drive will receive the data at a more appropriate time in order to increase controller performance. This is EDP's preferred configuration.

Write-Through Cache

Write-Through Cache refers to a cache writing strategy whereby data is written to the SCSI Drive before a completion status is returned to the host operating system. This caching strategy is considered more secure, since a power failure will be less likely to cause loss of data. However, a Write-Through cache results in slightly lower performance in most applications.
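The difference between the two write policies is the point at which the data reaches the disk relative to the completion status. The toy model below is an illustrative sketch only: the class and attribute names are assumptions, and dictionaries stand in for the cache memory and the disk.

```python
class CachedDisk:
    """Toy model of a controller cache in front of one disk."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.cache = {}     # block number -> data held in cache memory
        self.disk = {}      # block number -> data actually on the disk
        self.dirty = set()  # blocks in cache that are newer than the disk

    def write(self, block, data):
        self.cache[block] = data
        if self.write_back:
            self.dirty.add(block)    # disk is updated later, by flush()
        else:
            self.disk[block] = data  # write-through: disk updated first
        return "complete"            # status returned to the host

    def flush(self):
        for block in sorted(self.dirty):
            self.disk[block] = self.cache[block]
        self.dirty.clear()

    @property
    def write_pending(self):
        return bool(self.dirty)


wb = CachedDisk(write_back=True)
wb.write(0, b"payload")
print(wb.write_pending, 0 in wb.disk)  # data is in cache only at this point
wb.flush()
print(wb.write_pending, 0 in wb.disk)  # data has now reached the disk
```

In the write-back case, the data exists only in cache memory between write() and flush(); losing power in that window loses the data unless the cache is battery backed.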

Cache Battery Backup

An optional Cache Battery Backup is available that can be used to protect against cache data loss in the event of a power failure. This is mandatory for EDP supplied systems.

CONNECTORS AND JUMPERS

Connector and jumper locations on the DAC960PL are shown in Figure 3 and described in Table 3.

Figure 3 - DAC960PL Component Locator

Table 3 - DAC960PL Connectors and Jumpers

Connector   Description

J1          Fast/Wide SCSI Connector, 68-pin, Drive Channel 0
J2          Fast/Wide SCSI Connector, 68-pin, Drive Channel 1*
J3          Fast/Wide SCSI Connector, 68-pin, Drive Channel 2*
J4          AEMI (Array Enclosure Management Interface) Port
J5          Connector, Battery Backup Module*
J6          External Fast/Wide SCSI Connector, 8mm Champ, Drive Channel 0
J7          External Fast/Wide SCSI Connector, 8mm Champ, Drive Channel 1*
JP1         Jumper, SCSI Termination, Drive Channel 0
JP2         Jumper, SCSI Termination, Drive Channel 1*
JP3         Jumper, SCSI Termination, Drive Channel 2*
JP4         Reserved (Factory test use only)
JP5         Connector block, Bus/Drive Activity LEDs

* Optional

External LED Connectors

Jumper JP5 is a six-pin header that provides connection for three status LEDs. The pins are listed in Table 4. In each case, the odd-numbered pin (pins 1, 3 and 5) is the +5V source. Pin 1 is on the left. An external series resistor is not required for connecting LEDs.

Figure 4 - Status LED Connectors (JP5 header, pins 1 to 6)

Table 4 - Status LED Connectors

Connector       Indicator       Meaning if ON

JP5, Pin 1-2    SCSI Activity   One (or more) of the SCSI channels on the controller is transmitting or receiving data.

JP5, Pin 3-4    PCI Activity    The controller is transmitting or receiving data to or from the host.

JP5, Pin 5-6    Write Pending   The cache memory on the DAC960 holds data that is more current than the data on the hard drive(s).

WARNING

DATA WILL BE LOST IF THE SYSTEM EITHER LOSES POWER OR IS RESET WHILE THE WRITE PENDING LED IS ON (indicating the cache contains data not yet written to disk). To prevent data loss, install the optional cache battery backup module.

AEMI Interface

Connector J4 is the AEMI (Array Enclosure Management Interface) connector and provides a set of inputs and outputs that can be used to interface the DAC960PL disk array controller with certain AEMI compliant Disk Array subsystem cabinets.

SCSI Termination

Terminating a SCSI chain is accomplished by adding a terminator (or by terminating the device) at each of the two ends of the SCSI bus.

The DAC960PL has on-board ALT-2 type SCSI terminators on all drive channels. Jumpers JP1, JP2, and JP3 are used to enable or disable the SCSI termination for Channels 0, 1, and 2 respectively.

By default, all three jumpers are installed when shipped from the factory (termination enabled). This is the normal termination required when the controller is installed at one end of the SCSI cable. Whenever JP1, JP2, or JP3 is shunted, the controller provides termination to the corresponding SCSI channel and also powers the SCSI TERMPWR signal for that SCSI bus.

Terminating Internal Disk Arrays

On a disk array system, the termination should be set in such a way that when any drive is removed from the SCSI bus, termination and termination power are left intact.