Re: [ntdev] about BUS-RESET of SCSI miniport

Thanks for Jerry’s answer; it works.

As you suggested, I added a TimeOutValue of 30 seconds under HKLM\System\CurrentControlSet\Services\Disk and
found that every SRB that SCSIport passed to the miniport then carried a timeout of 30 seconds.
I also found an error in my previous question: the system hang is not caused by the BUS RESET itself,
but by the limit on the number of outstanding SRBs in the MS SCSIport.
At least on my system, if the storage server fails to respond to a disk I/O request
(maybe because the request packet was lost or the storage server is busy), the request blocks
and its SRB stays outstanding (not completed). SCSIport may keep queuing SRBs to the miniport,
so the number of outstanding SRBs grows until it hits 254
(the maximum number of outstanding SRBs that the MS SCSIport supports), and then the system hangs.
In my opinion, a BUS RESET does occur when the blocked disk I/O times out, but the real
reason the system hangs is that the SRB queue reaches its limit of 254.
So, if the storage server eventually satisfies the blocked disk I/O request (maybe the miniport
retransmits its request, or the storage server becomes idle), the system recovers from the hang.
Perhaps I need not handle bus reset at all, as long as the server can satisfy every disk I/O request.
Any comments?

xxxxx@attotech.com wrote:
If you’re talking about a Disk device, the timeout value can be changed by adding a DWORD value under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Disk. The value name is TimeOutValue. The default is 10 seconds. This value gets multiplied by the number of 64k segments or parts thereof in the transfer, so if you’re transferring 129k, it would be 30 seconds (default). Unfortunately, the value affects disk Read/Write commands, except possibly those sent via SCSI_PASS_THROUGH, for ALL disks in the system. (There are some Fibre Channel HBAs that change this value to 60 seconds (!). That means a 1 meg request will time out in 16 MINUTES!)
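
To make the arithmetic above concrete, here is a minimal sketch in plain user-mode C. It only restates the rule as described in this message (TimeOutValue times the number of 64k segments, with a partial segment counted as a whole one); the function name and the constant are hypothetical and are not part of the DDK.

```c
#include <assert.h>
#include <stdint.h>

#define SEGMENT_BYTES (64u * 1024u)   /* the 64k unit mentioned in the message */

/* Hypothetical helper: effective timeout, in seconds, of one disk
 * read/write SRB, following the rule quoted above. */
uint32_t effective_timeout_seconds(uint32_t transfer_bytes,
                                   uint32_t timeout_value_seconds)
{
    assert(transfer_bytes > 0);

    /* Count whole segments, then add one more for any remainder. */
    uint32_t segments = transfer_bytes / SEGMENT_BYTES;
    if (transfer_bytes % SEGMENT_BYTES != 0)
        segments++;

    return segments * timeout_value_seconds;
}
```

With the default of 10 seconds, a 129k transfer spans three segments and gets 30 seconds; with TimeOutValue set to 60, a 1 meg transfer spans sixteen segments and gets 960 seconds, the 16 minutes mentioned above.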

You can avoid the bus reset by timing out the command in your miniport before the TimeOutValue specified in the SRB, aborting the command and setting SrbStatus to SRB_STATUS_TIMEOUT.

Jerry.

xxxxx@lists.osr.com wrote: -----

>To: “Windows System Software Devs Interest List”
>
>From: identifier scorpio
>Sent by: xxxxx@lists.osr.com
>Date: 06/03/2005 10:50PM
>cc: xxxxx@lists.osr.com
>Subject: [ntdev] about BUS-RESET of SCSI miniport
>
>
>Hi, everyone
>
>I wrote a SCSI miniport driver that redirects all disk I/O to a remote
>storage server, and I found that the port driver above me passes down
>SRBs with a timeout of 10 seconds. If I fail to complete an SRB
>within 10 seconds, a BUS RESET occurs, and the driver (and the whole
>system) hangs.
>
>Can anyone tell me how to avoid the BUS RESET? Can I enlarge the 10
>second timeout limit, e.g., to 1 minute or more? Or, how can I
>recover from the BUS RESET state?
>
>Thanks in advance.
>
>Questions? First check the Kernel Driver FAQ at http://www.osronline.com/article.cfm?id=256
>You are currently subscribed to ntdev as: xxxxx@attotech.com
>To unsubscribe send a blank email to xxxxx@lists.osr.com

I also think that a well-programmed miniport should implement HwScsiResetBus.

I want to know: if I implement HwScsiResetBus and complete every
outstanding request with SrbStatus set to SRB_STATUS_BUS_RESET, will disk.sys
re-send all the requests that I just completed with SRB_STATUS_BUS_RESET?
In your experience, is that true? (The DDK never says so.)
As for your “minor clarification on TimeOutValue”, I have observed that too, thanks.

xxxxx@attotech.com wrote:

  1. The Bus Reset operation should cause your miniport to complete all outstanding requests with SrbStatus set to SRB_STATUS_BUS_RESET. From what you are saying, your driver apparently is not doing this, and once the bus reset is invoked you never complete the outstanding requests.

(The DDK documentation says you can call ScsiPortCompleteRequest to complete all outstanding requests. Our drivers never do this; they always call ScsiPortNotification for each outstanding request, so I can’t comment on the ScsiPortCompleteRequest method.)

  2. I think you still need to support HwScsiResetBus. What happens if for some reason requests are lost and don’t complete even in the 30 seconds you have set? Then the system will invoke your ResetBus function, and if there is a problem with it, you are going to see this all over again.

  3. Minor clarification on TimeOutValue: when doing a bus scan, ScsiPort sends Inquiry commands with a fixed TimeOutValue of 5 seconds regardless of device type. Also, as far as I know, the registry entry only applies to disk Read/Write commands.

Jerry.
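
The behaviour described in the first point can be modelled in a few lines of plain C. This is a user-mode sketch only, assuming the miniport tracks its outstanding requests in an array: the struct, the enum, and the function name are hypothetical, and the status values are model constants rather than the DDK's SRB_STATUS_* definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Model-only status codes; the real SRB_STATUS_* values come from the DDK. */
enum model_status { MODEL_PENDING, MODEL_SUCCESS, MODEL_TIMEOUT, MODEL_BUS_RESET };

struct model_srb {
    int in_use;                 /* nonzero while the request is outstanding */
    enum model_status status;
};

/* Complete every outstanding request with the bus-reset status and return
 * how many were completed. In a real miniport this is the point where each
 * request would be reported back to the port driver. */
size_t model_reset_bus(struct model_srb *slots, size_t count)
{
    size_t completed = 0;

    assert(slots != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (slots[i].in_use) {
            slots[i].status = MODEL_BUS_RESET;
            slots[i].in_use = 0;    /* the miniport no longer owns it */
            completed++;
        }
    }
    return completed;
}
```

The property that matters is that nothing is left outstanding when the routine finishes: a second call right after the first completes zero requests.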

xxxxx@lists.osr.com wrote: -----

To: “Windows System Software Devs Interest List”

From: identifier scorpio
Sent by: xxxxx@lists.osr.com
Date: 06/05/2005 05:42AM
Subject: [ntdev] Re: Re: [ntdev] about BUS-RESET of SCSI miniport

Thanks for Jerry’s answer; it works. [...]

As for my miniport driver: in HwScsiStartIo, I insert the incoming SRBs into a request list,
and I create a system thread that fetches SRBs (one at a time) from the request list
and redirects each disk I/O request to the remote server. Until the request is
satisfied, the system thread waits for the response (synchronous I/O).
After the thread is awakened by a disk I/O response, it inserts the SRB it was processing
into a completion list. A soft interrupt handler, HwScsiTimer, keeps
removing SRBs from the completion list and completing them.

In my case, when a bus reset occurs, the SRB whose timeout caused the
bus reset is held by the system thread, and the thread is blocked in
WaitForXXX because the disk I/O response has not been received yet. So the thread
can’t detect the bus reset until WaitForXXX times out.
Below is my plan for the HwScsiResetBus implementation:

  1. Create a boolean flag named ‘BusReseted’, initialized to false.
  2. In HwScsiResetBus, do nothing but set ‘BusReseted’ to true.
  3. In the system thread, when it wakes from WaitForXXX with a return value of STATUS_TIMEOUT,
    it tests whether ‘BusReseted’ is true. If so, it sets the SrbStatus of the SRB it was processing to SRB_STATUS_TIMEOUT and inserts it into
    the completion list, and then sets the SrbStatus of all other outstanding SRBs to SRB_STATUS_BUS_RESET
    and inserts them into the completion list too.
  4. HwScsiTimer removes SRBs from the completion list and completes them.
  5. The thread then monitors the request list, waiting for SCSIport to re-send those abnormally
    completed SRBs.

Before I code blindly, I would like some comments from you.
By the way, does SCSIport have any time limit on bus reset processing? If so, how long?
A bus reset may occur just after the system thread calls WaitForXXX, so it may be
a long time before the thread detects the bus reset (not until it wakes from
WaitForXxx). If such a time limit exists, the long wait may result in another timeout
(a bus reset processing timeout). That’s miserable. :(
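
Steps 1 to 3 of the plan above can be written down as a small user-mode model in plain C, which makes the intended status of each request explicit. This is only a sketch of the plan as posted, not a recommended design and not driver code: all names are hypothetical, request ids are assumed to be unique, the fixed-size lists stand in for the real request and completion lists, and clearing the flag after handling is an assumption the plan does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PLAN_MAX 8   /* hypothetical capacity; request ids are 0..PLAN_MAX-1 */

/* Model-only status codes, not the DDK's SRB_STATUS_* values. */
enum plan_status { PLAN_PENDING, PLAN_TIMEOUT, PLAN_BUS_RESET };

struct plan_list {
    int ids[PLAN_MAX];
    size_t len;
};

struct plan_state {
    bool bus_reseted;                  /* step 1: the 'BusReseted' flag */
    int current;                       /* id held by the worker thread, or -1 */
    struct plan_list request;          /* queued by StartIo, not yet sent */
    struct plan_list completion;       /* drained later by the timer routine */
    enum plan_status status[PLAN_MAX]; /* indexed by request id */
};

static void plan_append(struct plan_list *list, int id)
{
    assert(list->len < PLAN_MAX);
    list->ids[list->len++] = id;
}

/* Step 2: the reset routine only records that a reset happened. */
void plan_reset_bus(struct plan_state *s)
{
    s->bus_reseted = true;
}

/* Step 3: called by the worker thread when its wait times out.
 * Returns how many requests were moved to the completion list. */
size_t plan_after_wait_timeout(struct plan_state *s)
{
    size_t moved = 0;

    if (!s->bus_reseted)
        return 0;

    if (s->current >= 0) {             /* the request that timed out */
        s->status[s->current] = PLAN_TIMEOUT;
        plan_append(&s->completion, s->current);
        s->current = -1;
        moved++;
    }
    for (size_t i = 0; i < s->request.len; i++) {   /* everything else */
        int id = s->request.ids[i];
        s->status[id] = PLAN_BUS_RESET;
        plan_append(&s->completion, id);
        moved++;
    }
    s->request.len = 0;
    s->bus_reseted = false;            /* assumption: the flag is one-shot */
    return moved;
}
```

The model also shows the weakness raised at the end of the message: nothing moves to the completion list until the wait times out, however early the reset arrived.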

xxxxx@attotech.com wrote:

Well, from my looking at the class driver for just a few minutes, it appears that the requests will be resent up to the max retry count. What that retry count is I don’t know, but I think it might be something like 5; I think it depends on who sent the request to the class driver.

In my experience, I have seen commands retried with this status.

Note that you have to be concerned probably about two statuses. I think that the command that timed out causing the bus reset will probably end up getting a status of SRB_STATUS_TIMEOUT, and the other commands that were in process at the time will get status SRB_STATUS_BUS_RESET.

Some of this is speculation on my part.

Jerry.

xxxxx@lists.osr.com wrote on 06/06/2005 02:12:03 AM:

I want to know whether, if I implement HwScsiResetBus and complete every outstanding request with SrbStatus set to SRB_STATUS_BUS_RESET, disk.sys will re-send those requests. [...]

I think it’s time for me to do some experiments first.

  1. I’ll first try to avoid the bus reset by completing the SRB with SRB_STATUS_TIMEOUT.

  2. I don’t yet know how to queue a DPC to signal a thread.

Thanks to Jerry for the detailed explanations over these days.

WeiYu Dong, China

xxxxx@attotech.com wrote:

  1. Why not specify some Timeout slightly less than the Srb’s TimeOutValue on the WaitForXxx call? Then when the Wait times out you can put the request on the completion list with SRB_STATUS_TIMEOUT and you shouldn’t receive the bus reset call. What you do with the rest of the requests on the request list at that point is up to you.

  2. Just as a point of information, about the fastest you can get the HwScsiTimer routine to execute is about 15 ms. The resolution on the timer is something like 15.6 ms.

  3. Are you using per-request extension data (i.e. Srb->SrbExtension)? Once you post a completion, you are not allowed to touch anything in the Srb or its SrbExtension. So if the timed out request eventually completes after you complete it to ScsiPort and something touches the Extension, you will most likely get a blue screen.

  4. It is still possible that you might get a call to reset the bus. At that point you can schedule a DPC, which would signal the Wait to get the in-process request completed.

  5. I think there is some conflicting information on the timing of bus reset. There might be some place that says all requests must be completed before HwScsiResetBus returns, but I can’t seem to find that now. However, in my experience, it’s OK to complete these requests within a short time period (maybe a few milliseconds?) after the reset.

  6. How about a real name?

Jerry.
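
For the first point, the wait timeout can be derived from the SRB's TimeOutValue minus a safety margin. The sketch below is plain user-mode C and assumes the wait in question takes a relative timeout the way kernel waits such as KeWaitForSingleObject do, that is, a negative count of 100-nanosecond units; the function name and the margin are hypothetical. Given the roughly 15.6 ms timer resolution mentioned in the second point, a margin well above that seems prudent, since the completion is only posted when HwScsiTimer next runs.

```c
#include <assert.h>
#include <stdint.h>

#define HUNDRED_NS_PER_MS 10000LL   /* 1 ms = 10,000 units of 100 ns */

/* Hypothetical helper: relative wait timeout (negative, in 100 ns units)
 * that expires margin_ms before the SRB's own timeout would. */
int64_t relative_wait_timeout(uint32_t srb_timeout_seconds, uint32_t margin_ms)
{
    int64_t total_ms = (int64_t)srb_timeout_seconds * 1000;

    /* The margin must leave a positive amount of time to wait. */
    assert(total_ms > (int64_t)margin_ms);

    return -((total_ms - (int64_t)margin_ms) * HUNDRED_NS_PER_MS);
}
```

For a 30 second SRB timeout and a 500 ms margin this gives -295000000, i.e. a 29.5 second relative wait.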


Jerry,

Actually, I’m developing a virtual (software-only) driver and have no
hardware interrupt, so no HwScsiInterrupt is ever called. I use
HwScsiTimer to simulate the interrupt. I think HwScsiTimer runs
at DISPATCH_LEVEL at least (maybe HwScsiTimer itself is a DPC,
I guess), so maybe I can call KeSetEvent in HwScsiTimer to alert
my thread.

Dong.

xxxxx@attotech.com wrote:

WeiYu,

To get a DPC to run, see the DDK documentation for ScsiPortNotification with notification types CallEnableInterrupts and CallDisableInterrupts. The point of running a DPC to process the bus reset is that you can’t call KeSetEvent at IRQL > DISPATCH_LEVEL. I believe that HwScsiResetBus might be called at high IRQL, so to signal your thread, you need to schedule the DPC to do it.

Jerry.
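
The deferral pattern described here (record the work at high IRQL, signal from a routine that runs at DISPATCH_LEVEL) can be illustrated with a user-mode model in plain C. Nothing below is DDK code: the types, the function names, and the MODEL_DEVICE level are hypothetical, and the boolean stands in for the event the worker thread waits on. In a real miniport the deferred routine would be requested through ScsiPortNotification as described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Model-only interrupt levels; MODEL_DEVICE is an arbitrary "high" level. */
enum model_irql { MODEL_PASSIVE = 0, MODEL_DISPATCH = 2, MODEL_DEVICE = 5 };

struct model_adapter {
    bool reset_pending;   /* set by the reset routine, consumed when deferred */
    bool thread_event;    /* stands in for the worker thread's event */
};

/* Stand-in for signalling an event: refuses to run above dispatch level. */
bool model_set_event(struct model_adapter *a, enum model_irql now)
{
    if (now > MODEL_DISPATCH)
        return false;             /* would be a bug in a real driver */
    a->thread_event = true;
    return true;
}

/* Reset routine: may run at high level, so it only records the work.
 * A real miniport would also request the deferred callback here. */
void model_reset_bus_high(struct model_adapter *a)
{
    a->reset_pending = true;
}

/* Deferred routine: runs at dispatch level and does the signalling. */
bool model_deferred_callback(struct model_adapter *a)
{
    if (!a->reset_pending)
        return false;
    a->reset_pending = false;
    return model_set_event(a, MODEL_DISPATCH);
}
```

The reset routine never touches the event itself; the only path that signals it goes through the deferred routine.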
