(SCSI-3) Persistent Reserve on Windows 2000 and clustering.

Hello,

Currently the MS cluster service uses SCSI-2 reserve/release to acquire the
resources in the cluster for the active node. I want to use SCSI-3
Persistent Reserve IN/OUT to acquire and break the reservation instead. The
reason is that each of my hosts has two HBAs, and with the MS approach only
one path can access the storage; the other path from the same (active) host,
via the second HBA, is unable to access the resource. By using Persistent
Reserve with a reservation key, I want to have access from both paths.

I am doing the following:
– I have a multi-path solution which works fine with SCSI-2 reserve/reset.
– When I use SCSI-3 PERSISTENT RESERVE OUT (opcode 0x5F) with a
reservation key to REGISTER/RESERVE the first path on the active node, the
command is processed successfully, after which I REGISTER the remaining
paths using the same reservation key. I can confirm this by querying the
registered keys (PERSISTENT RESERVE IN) from a user-mode SPTI program.

The problem I am seeing is that multiple
SRB_FUNCTION_EXECUTE_SCSI/SCSIOP_RESERVE_UNIT requests arrive after this
(4 after the first one), and the cluster service fails to start. I return
STATUS_SUCCESS (0x00000000) from the driver if the PERSISTENT RESERVE call
is successful. The cluster log file contains the following:

[cluster.log]
::2004/01/14-18:32:13.750 [FM] Initializing resource
b5c544ed-bee4-42c1-8cce-1af9254d5510 from the registry.
00000724.000008c4::2004/01/14-18:32:13.750 [FM] Name for Resource
b5c544ed-bee4-42c1-8cce-1af9254d5510 is ‘Disk G:’.
00000724.000008c4::2004/01/14-18:32:13.750 [FM] FmpAddPossibleEntry: adding
node 1 as possible host for resource b5c544ed-bee4-42c1-8cce-1af9254d5510.
00000724.000008c4::2004/01/14-18:32:13.750 [FM] FmpQueryResTypeInfo: Calling
FmpAddPossibleNodeToList for restype Physical Disk
00000724.000008c4::2004/01/14-18:32:13.750 [FM] FmpAddPossibleNodeToList:
adding node 1 to resource type’s possible node list
00000724.000008c4::2004/01/14-18:32:13.750 [FMX] Found the quorum resource
b5c544ed-bee4-42c1-8cce-1af9254d5510.
00000724.000008c4::2004/01/14-18:32:13.750 [FM] All dependencies for
resource b5c544ed-bee4-42c1-8cce-1af9254d5510 created.
00000724.000008c4::2004/01/14-18:32:13.750 [FM] arbitrate for quorum
resource id b5c544ed-bee4-42c1-8cce-1af9254d5510.
00000724.000008c4::2004/01/14-18:32:13.750 [FM] Initializing resource
b5c544ed-bee4-42c1-8cce-1af9254d5510 from the registry.
00000724.000008c4::2004/01/14-18:32:13.750 [FM] Name for Resource
b5c544ed-bee4-42c1-8cce-1af9254d5510 is ‘Disk G:’.
00000724.000008c4::2004/01/14-18:32:13.750 [FM] FmpRmCreateResource:
creating resource b5c544ed-bee4-42c1-8cce-1af9254d5510 in shared resource
monitor
00000898.00000708::2004/01/14-18:32:13.781 Physical Disk: PnP window created
successfully.
00000724.000008c4::2004/01/14-18:32:13.781 [FM] FmpRmCreateResource: created
resource b5c544ed-bee4-42c1-8cce-1af9254d5510, resid 651704
00000724.000008c4::2004/01/14-18:32:13.781 [MM] MmSetQuorumOwner(1,1), old
owner 0.
00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb]Wait for offline thread to complete…
00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb]------- DisksArbitrate -------.
00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb]DisksOpenResourceFileHandle: Attach successful.
00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb]DisksOpenResourceFileHandle: CreateFile successful.
00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb] Arbitration Parameters (1 9999).
00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb] Issuing GetSectorSize on signature d942732a.
00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb] GetSectorSize completed, status 0.
00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb] ArbitrationInfo.SectorSize is 512
00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb] Issuing GetPartInfo on signature d942732a.
00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb] GetPartInfo completed, status 0.
00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb]Successful read (sector 12) [ST:0] (0,6897c372:01c3daca).
00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb]Successful write (sector 11) [ST:0] (0,babfaa14:01c3dacc).
00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb]Successful read (sector 12) [ST:0] (0,6897c372:01c3daca).
00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb]Successful write (sector 12) [ST:0] (0,babfaa14:01c3dacc).
00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb]Successful read (sector 11) [ST:0] (0,babfaa14:01c3dacc).
00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb] Issuing Reserve on signature d942732a.
00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb] Reserve completed, status 1117.
00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb]Arbitrate returned status 1117.
00000724.000008c4::2004/01/14-18:32:13.781 [MM] MmSetQuorumOwner(0,0), old
owner 1.
00000724.000008c4::2004/01/14-18:32:13.781 [FM] FmGetQuorumResource failed,
error 1117.
00000724.000008c4::2004/01/14-18:32:13.781 [INIT] ClusterForm: Could not get
quorum resource. No fixup attempted. Status = 5086
00000724.000008c4::2004/01/14-18:32:13.781 [INIT] Cleaning up failed form
attempt.
00000724.000008c4::2004/01/14-18:32:13.781 [INIT] Failed to form cluster,
status 5086.
00000724.000008c4::2004/01/14-18:32:13.781 [CS] ClusterInitialize failed
5086
00000724.000008c4::2004/01/14-18:32:13.781 [INIT] The cluster service is
shutting down.
00000724.000008c4::2004/01/14-18:32:13.781 [EVT] EvShutdown
00000724.000008c4::2004/01/14-18:32:13.781 [FM] Shutdown: Failover Manager
requested to shutdown groups.
00000724.000008c4::2004/01/14-18:32:13.781 [FM] FmpCleanupGroups: Entry
00000724.000008c4::2004/01/14-18:32:13.781 [FM] FmpCleanupGroups: Exit
00000724.000008c4::2004/01/14-18:32:13.781 [Dm] DmShutdown
00000724.000008c4::2004/01/14-18:32:13.781 [DM] DmpShutdownFlusher: Entry
00000724.000008c4::2004/01/14-18:32:13.781 [DM] DmpShutdownFlusher: Setting
event
00000724.000008a8::2004/01/14-18:32:13.781 [DM] DmpRegistryFlusher: got 0
00000724.000008a8::2004/01/14-18:32:13.781 [DM] DmpRegistryFlusher: exiting
00000724.000008c4::2004/01/14-18:32:13.796 [MM] MMLeave is called when
rgp=NULL.
00000724.000008c4::2004/01/14-18:32:13.796 [CS] Service Stopped. exit code =
5086

[end cluster.log]

I am using the SCSI_PASS_THROUGH_WITH_BUFFERS/SPTI approach to issue the
Persistent Reserve in the driver.
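
For reference, here is a minimal sketch of how the PERSISTENT RESERVE OUT CDB and its 24-byte parameter list could be filled in before being handed to a pass-through request. It is portable C with no Windows headers; the enum and function names (pr_out_build_cdb, pr_out_build_params, and so on) are hypothetical, and the "Write Exclusive, Registrants Only" type is only an assumption about what suits the two-path goal, since the post does not say which type is used. The byte layout follows SPC-2/SPC-3 as I understand it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative names; values are taken from SPC-2/SPC-3. */
enum {
    PR_OUT_OPCODE          = 0x5F, /* PERSISTENT RESERVE OUT */
    PR_OUT_REGISTER        = 0x00, /* service action: REGISTER */
    PR_OUT_RESERVE         = 0x01, /* service action: RESERVE */
    PR_SCOPE_LU            = 0x0,  /* scope: whole logical unit */
    PR_TYPE_WR_EX_REG_ONLY = 0x5,  /* Write Exclusive, Registrants Only */
    PR_OUT_CDB_LEN         = 10,
    PR_OUT_PARAM_LEN       = 24
};

/* Store a 64-bit key big-endian, as SCSI parameter data requires. */
static void put_be64(uint8_t *dst, uint64_t v)
{
    for (int i = 7; i >= 0; --i) {
        dst[i] = (uint8_t)(v & 0xFF);
        v >>= 8;
    }
}

/* Fill a 10-byte PERSISTENT RESERVE OUT CDB. */
void pr_out_build_cdb(uint8_t cdb[PR_OUT_CDB_LEN], uint8_t service_action,
                      uint8_t scope, uint8_t type)
{
    assert(service_action <= 0x1F && scope <= 0x0F && type <= 0x0F);
    memset(cdb, 0, PR_OUT_CDB_LEN);
    cdb[0] = PR_OUT_OPCODE;
    cdb[1] = service_action;                 /* bits 4..0 */
    cdb[2] = (uint8_t)((scope << 4) | type); /* scope high nibble, type low */
    cdb[7] = 0;                              /* parameter list length, MSB */
    cdb[8] = PR_OUT_PARAM_LEN;               /* parameter list length, LSB */
}

/* Fill the 24-byte parameter list (data-out buffer).
 * REGISTER: reservation_key = 0 (not yet registered),
 *           service_action_key = the key being registered.
 * RESERVE:  reservation_key = the already registered key,
 *           service_action_key = 0 (ignored). */
void pr_out_build_params(uint8_t params[PR_OUT_PARAM_LEN],
                         uint64_t reservation_key,
                         uint64_t service_action_key)
{
    memset(params, 0, PR_OUT_PARAM_LEN);
    put_be64(&params[0], reservation_key);
    put_be64(&params[8], service_action_key);
    /* Bytes 16..23 (obsolete fields and the APTPL flag) stay zero here. */
}
```

The CDB would go in the Cdb field of the pass-through structure with CdbLength 10, and the parameter list is a data-out transfer of 24 bytes, so the direction and length fields have to agree with that.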

Any help in this regard would be highly appreciated.

Regards,

TM

The cluster's original RESERVE completed with an error. The log shows:

00000898.000007bc::2004/01/14-18:32:13.781 Physical Disk :
[DiskArb] Reserve completed, status 1117.

Win32 error 1117 (ERROR_IO_DEVICE) is what any of these NTSTATUS codes maps to:

1117 ERROR_IO_DEVICE <--> 0xc000003c STATUS_DATA_OVERRUN
1117 ERROR_IO_DEVICE <--> 0xc000003d STATUS_DATA_LATE_ERROR
1117 ERROR_IO_DEVICE <--> 0xc00000e9 STATUS_UNEXPECTED_IO_ERROR
1117 ERROR_IO_DEVICE <--> 0xc000015f STATUS_FT_MISSING_MEMBER
1117 ERROR_IO_DEVICE <--> 0xc000016d STATUS_FT_ORPHANING
1117 ERROR_IO_DEVICE <--> 0xc0000183 STATUS_DRIVER_INTERNAL_ERROR
1117 ERROR_IO_DEVICE <--> 0xc0000185 STATUS_IO_DEVICE_ERROR
1117 ERROR_IO_DEVICE <--> 0xc0000186 STATUS_DEVICE_PROTOCOL_ERROR
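
If it helps while tracing, the list above can be turned into a small lookup that names the NTSTATUS your RESERVE path is about to return. This is only a sketch in portable C: the table simply restates the codes listed above, and io_device_status_name is a hypothetical helper, not a Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct status_name {
    uint32_t    ntstatus;
    const char *name;
};

/* NTSTATUS values that surface to user mode as ERROR_IO_DEVICE (1117). */
static const struct status_name io_device_statuses[] = {
    { 0xC000003Cu, "STATUS_DATA_OVERRUN" },
    { 0xC000003Du, "STATUS_DATA_LATE_ERROR" },
    { 0xC00000E9u, "STATUS_UNEXPECTED_IO_ERROR" },
    { 0xC000015Fu, "STATUS_FT_MISSING_MEMBER" },
    { 0xC000016Du, "STATUS_FT_ORPHANING" },
    { 0xC0000183u, "STATUS_DRIVER_INTERNAL_ERROR" },
    { 0xC0000185u, "STATUS_IO_DEVICE_ERROR" },
    { 0xC0000186u, "STATUS_DEVICE_PROTOCOL_ERROR" },
};

/* Return the symbolic name if ntstatus is one of the codes above,
 * or NULL if it would not show up as error 1117. */
const char *io_device_status_name(uint32_t ntstatus)
{
    size_t count = sizeof(io_device_statuses) / sizeof(io_device_statuses[0]);
    for (size_t i = 0; i < count; ++i) {
        if (io_device_statuses[i].ntstatus == ntstatus) {
            return io_device_statuses[i].name;
        }
    }
    return NULL;
}
```

Logging the result at each point where the RESERVE request is completed should narrow down which layer produces the status that the cluster log reports as 1117.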

You should trace your RESERVE code to see what is returning one of these
status codes. First, double-check the structures in the SPTI commands:
direction, lengths, etc. If you are running a trace and actually see the
PERSISTENT RESERVE OUT (PRO) on the wire, make sure you aren’t returning a
SCSI status other than 0 (GOOD) and that you aren’t mangling the sense
buffer or the flags associated with it.
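
As a rough illustration of that last check, the sketch below classifies the SCSI status byte and pulls the sense key, ASC, and ASCQ out of fixed-format sense data. It is portable C written for this reply: the names (SCSI_ST_GOOD, parse_fixed_sense, and so on) are made up here and are not the ones in the DDK's scsi.h, and it assumes fixed-format sense (response code 0x70 or 0x71), not descriptor format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* SCSI status byte values (illustrative names). */
enum {
    SCSI_ST_GOOD                 = 0x00,
    SCSI_ST_CHECK_CONDITION      = 0x02,
    SCSI_ST_RESERVATION_CONFLICT = 0x18
};

struct sense_summary {
    bool    valid;     /* true if the buffer held fixed-format sense data */
    uint8_t sense_key; /* byte 2, low nibble */
    uint8_t asc;       /* byte 12: additional sense code */
    uint8_t ascq;      /* byte 13: additional sense code qualifier */
};

/* Decode fixed-format sense data; returns valid = false if the buffer is
 * missing, too short, or not in fixed format. */
struct sense_summary parse_fixed_sense(const uint8_t *sense, size_t len)
{
    struct sense_summary s = { false, 0, 0, 0 };
    if (sense == NULL || len < 14) {
        return s;
    }
    uint8_t response_code = sense[0] & 0x7F; /* bit 7 is the VALID bit */
    if (response_code != 0x70 && response_code != 0x71) {
        return s;
    }
    s.valid     = true;
    s.sense_key = sense[2] & 0x0F;
    s.asc       = sense[12];
    s.ascq      = sense[13];
    return s;
}

/* Sense data is only meaningful after a CHECK CONDITION status. */
bool scsi_status_has_sense(uint8_t scsi_status)
{
    return scsi_status == SCSI_ST_CHECK_CONDITION;
}
```

A status of 0x18 (RESERVATION CONFLICT) carries no sense data, so if the target answers the cluster's RESERVE that way, the error is coming from the device and not from a damaged sense buffer.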

(FYI: RESERVE and RELEASE are defined as SCSI-3 Primary Commands
through SPC-2.)

-----Original Message-----
From: TeeM [mailto:xxxxx@hotmail.com]
Sent: Thursday, January 15, 2004 11:10 AM
Subject: (SCSI-3)Persistent Reserve on Windows 2000 and clustering.
