Virtual Storport Miniport - HwStorStartIo

In a virtual Storport miniport, "No locks are held prior to calling HwStorStartIo", so HwStorStartIo() can be called on multiple CPUs in parallel.

How should the virtual Storport miniport driver queue and process the SRBs in the correct order, i.e. without running into out-of-order processing of SRBs?

Thanks,
nirranjan

Comments

  • HwStorStartIo is called by the Storport driver, and according to the MSDN documentation
    "Storport always calls HwStorStartIo at DISPATCH IRQL by using an internal spin lock to ensure that requests are initiated sequentially." I took this from the description of the HwStorStartIo routine for a Storport miniport driver, but it should work for a virtual miniport driver too. Once you receive a request in HwStorStartIo, you can put it on your own internal queue.

    Igor Sharovar
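
To make the internal-queue suggestion concrete, here is a minimal user-mode sketch in Python, not driver code. `MiniportQueue`, `start_io` and `drain` are hypothetical names, threads stand in for CPUs, and a `threading.Lock` stands in for a driver-owned spin lock. It only shows that a lock-protected queue keeps each submitter's requests in their own submission order; how requests from different submitters interleave is up to the scheduler.

```python
import threading
from collections import deque


class MiniportQueue:
    """Toy stand-in for a driver-private SRB queue (not a Storport API)."""

    def __init__(self):
        self._lock = threading.Lock()  # plays the role of a driver spin lock
        self._pending = deque()

    def start_io(self, srb):
        # May be entered on several threads ("CPUs") at the same time.
        with self._lock:
            self._pending.append(srb)
        return True

    def drain(self):
        # Remove everything queued so far, in the order it was enqueued.
        with self._lock:
            batch = list(self._pending)
            self._pending.clear()
        return batch


def submit_from_cpu(queue, cpu, count):
    # Each "CPU" submits its own numbered sequence of requests.
    for seq in range(count):
        queue.start_io((cpu, seq))


queue = MiniportQueue()
threads = [
    threading.Thread(target=submit_from_cpu, args=(queue, cpu, 100))
    for cpu in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

processed = queue.drain()
```

Filtering `processed` by CPU gives each CPU's requests in ascending sequence order, while the order across CPUs can differ from run to run.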
  • Why would they need to be processed in any particular order? Since the disk could reorder operations, why would you need to.....

    --Mark Cariddi
    OSR, Open Systems Resources, Inc.

    ---
    NTDEV is sponsored by OSR

    For our schedule of WDF, WDM, debugging and other seminars visit:
    http://www.osr.com/seminars

    To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer
  • To back Mark's statement - a scsi or storport miniport is free to issue the I/O operations it receives in any order it wishes. You do not need to maintain order because no component above you in the stack is (or is allowed to) expect you to do so.

    So queue them to the hardware however you see fit.

    -p
  • > To back Mark's statement - a scsi or storport miniport is free to issue the
    > I/O operations it receives in any order it wishes. You do not need to
    > maintain order because no component above you in the stack is (or is allowed
    > to) expect you to do so.
    >
    > So queue them to the hardware however you see fit.

    Surely Windows has the concept of 'write barriers' in its disk I/O
    queues? If I were you I'd do a bit more research before taking the
    advice of "queue them to the hardware however you see fit". To take it
    to a ridiculous extreme (and way beyond what the above text implied),
    how is filesystem integrity going to be maintained if you reorder your
    writes across a 5 minute timespan, and you get a power failure or BSOD
    in that time? Windows might think it has committed the journal to disk
    when in fact it hasn't.

    I'm not sure how tagged SCSI requests will work either if you were to
    get the requests in the order A, B, C, but you delayed the queuing to
    the hardware of request A by an arbitrary amount of time...

    James
  • If you need write A and write B to complete in a particular order you send B after A completes. If you need a barrier you can issue a flush.

    As the storage miniport you may not complete the request for the I/O operation until you've actually sent it to the disk. But you may send them in any order you like.
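
A toy model of "send B after A completes", as a hedged Python sketch. `ToyDisk` is a hypothetical device that completes whatever is in flight in an arbitrary (seeded random) order; it is not a model of any real disk or of Storport. The only ordering guarantee comes from the initiator, which does not issue B until A's completion callback runs. A flush used as a barrier could be chained in the same way.

```python
import random


class ToyDisk:
    """Hypothetical device model: completes queued requests in any order."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._in_flight = []  # (name, on_complete) pairs
        self.completed = []   # request names in completion order

    def submit(self, name, on_complete=None):
        self._in_flight.append((name, on_complete))

    def run(self):
        while self._in_flight:
            # The device picks whichever outstanding request it likes next.
            index = self._rng.randrange(len(self._in_flight))
            name, on_complete = self._in_flight.pop(index)
            self.completed.append(name)
            if on_complete is not None:
                on_complete()


disk = ToyDisk(seed=42)

# Unrelated writes: no ordering requested, so the device may shuffle them.
for name in ("X", "Y", "Z"):
    disk.submit(name)

# B must land after A, so B is only issued from A's completion callback.
disk.submit("A", on_complete=lambda: disk.submit("B"))

disk.run()
```

Whatever order the device chooses for X, Y, Z and A, B cannot complete before A because it does not exist at the device until A is done.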

    -p
  • > If you need write A and write B to complete in a particular order you send B
    > after A completes. If you need a barrier you can issue a flush.
    >
    > As the storage miniport you may not complete the request for the I/O operation
    > until you've actually sent it to the disk. But you may send them in any order
    > you like.

    I thought TCQ solved all these problems years ago, and you could queue
    up write requests like A B C D <barrier> E F G H, and the storage device
    would re-order A-D and E-H however it liked, as long as A-D were
    completed before E-H were started, thus keeping the I/O pipeline full
    and letting the drive reorder the requests to maximise throughput. At
    the driver level, waiting for requests to complete before starting
    others seems pretty poor for performance, especially for a storage
    system with high latencies (e.g. FC over a WAN link).
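
The barrier behaviour described in this post can be sketched as a small Python model. This is only an illustration of the idea as stated here, not a claim about what Windows or any particular device does; `device_order` and the `BARRIER` marker are hypothetical names. The device may shuffle requests inside a group, but every request before a barrier is serviced before any request after it.

```python
import random

BARRIER = "<barrier>"


def device_order(queue, rng):
    """Return one service order a barrier-honouring device could choose."""
    order, group = [], []
    for item in queue + [BARRIER]:  # trailing barrier closes the last group
        if item == BARRIER:
            rng.shuffle(group)      # free reordering inside a group
            order.extend(group)
            group = []
        else:
            group.append(item)
    return order


queue = ["A", "B", "C", "D", BARRIER, "E", "F", "G", "H"]
order = device_order(queue, random.Random(1))
```

The first four entries of `order` are always some permutation of A-D and the last four some permutation of E-H, so the pipeline stays full without the initiator waiting for completions.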

    But maybe you're the head storage guy at Microsoft so I'm a little
    reluctant to argue :)

    James
  • I think waiting for a request to complete before issuing another one only
    has a performance impact if there are no other requests you can issue that are
    not dependent on the ordered requests. Typically a disk has requests coming
    from many sources, so it looks like:

    Source 1 - A1,B1,C1,sync,D1,E1
    Source 2 - A2,sync,B2,C2,D2
    Source 3 - A3,B3,C3,D3,E3,sync

    So the disk request pool looks like:

    Start - A1,B1,C1,A2,A3,B3,C3,D3,E3
    After A2 completes it might look like - C1,B2,C2,D2,A3,B3,C3,D3,E3
    After C1 completes it might look like - D1,E1,C2,D2,D3,E3
    After E3 completes it might look like - E1,D2,...
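
The pool listings above can be reproduced with a short Python sketch. `request_pool` is a hypothetical helper, and the sets of completed requests are assumptions chosen so that the results match the "might look like" lines; the rule it encodes is that a source may not issue anything past a sync until everything before that sync has completed.

```python
SYNC = "sync"


def request_pool(sources, completed):
    """Requests that may be outstanding at the disk right now."""
    pool = []
    for stream in sources:
        blocked = False
        for item in stream:
            if item == SYNC:
                if blocked:
                    break   # earlier work unfinished: stop at this sync
                continue    # everything before the sync is done, move on
            if item not in completed:
                pool.append(item)
                blocked = True
    return pool


sources = [
    ["A1", "B1", "C1", SYNC, "D1", "E1"],
    ["A2", SYNC, "B2", "C2", "D2"],
    ["A3", "B3", "C3", "D3", "E3", SYNC],
]

start = request_pool(sources, set())
after_a2 = request_pool(sources, {"A1", "B1", "A2"})
after_c1 = request_pool(
    sources, {"A1", "B1", "C1", "A2", "B2", "A3", "B3", "C3"}
)
```

As long as some source has requests that are not held behind a sync, the pool stays non-empty and the disk has work to do while another source waits.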

    My guess is you could come up with workloads where TCQ helps, but I can also
    imagine lots of workloads where it doesn't.

    TCQ does seem like it terribly complicates error recovery too. If you have
    A,sync,B I would hope B only executes if A was successful.

    Without being the head storage designer of a major OS and looking at lots of
    workload analysis data, and seeing how real world hardware handles things
    like errors (devices do not always follow the standards), it's hard to say
    if TCQ is a win over just waiting for completions. I suspect file systems
    can be designed such that global synchronization is rarely needed, and as a
    result the benefits of TCQ are less significant. Global sync points would be
    when you need to wait for a completion, and can't issue any new requests. It
    seems like one design goal of a modern file system would be to minimize the
    synchronization points needed.

    Another issue, since a logical disk might be multiple spindles, TCQ does not
    help ordering across multiple spindles. Disk drives do not talk to each
    other, although the logical disk may be an array controller and TCQ is
    really communicating with the array controller about ordering. This implies
    you would need to either have the local storage pool management (i.e. like
    software raid) implement the TCQ synchronization (or report no TCQ support)
    and the file system would need to alter its synchronization behavior based
    on disk characteristics.

    If we look at medium/larger servers, the storage has a good chance of
    actually being sophisticated array controllers, which cache writes in NVRAM or
    something similar, so will return success to many writes immediately anyway.

    What exactly does "ordering" mean if you have multiple initiators to a disk,
    which are common on clusters and NTFS cluster storage?

    Jan
  • Is ordered TCQ used? As far as I know, NT only uses simple TCQ: it
    tosses multiple requests at a device and leaves it to the device to
    worry about how those requests get processed.

    --
    Mark Roddy
  • There is no "in order" or "out of order" concept for requests arriving on
    different CPUs. Likewise, you can't tell whether an event happened earlier
    on Earth or on Mars if the two happen within a certain time window. A request
    issued by an application on one CPU may take an unpredictably shorter or longer
    time to reach the driver than a request issued on another CPU.

    Alex_Grig

  • It's been a while since I decoded the SCSI command packets on Windows, but
    I also believe it doesn't use ordered TCQ at all, so the device is free to
    reorder things as it desires. The OS does set the FUA bit (force unit
    access) on requests if the storage IRP has SL_WRITE_THROUGH in the flags.
    Also note that advanced storage arrays often lie about completion, even when
    the FUA bit is set, as they have ways of caching a write such that it (in
    theory) should not get lost (like storing the write request in NVRAM). You
    would need to read the SCSI spec to decide if FUA causes a write fence in
    relation to other pending requests.
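
For readers who have not looked at the command bytes, here is a hedged Python sketch of where the FUA bit sits in a SCSI WRITE(10) CDB. `build_write10` is a hypothetical helper, not the class driver's actual logic, and the numeric value given for `SL_WRITE_THROUGH` is an assumption quoted from memory of wdm.h that should be checked against the header. The opcode (0x2A) and the FUA position (bit 3 of byte 1) follow the SCSI block command layout.

```python
import struct

SL_WRITE_THROUGH = 0x04  # assumed IRP stack-location flag value; verify in wdm.h
WRITE_10 = 0x2A          # SCSI WRITE(10) opcode
FUA_BIT = 0x08           # bit 3 of CDB byte 1


def build_write10(lba, blocks, irp_flags):
    """Build a 10-byte WRITE(10) CDB, setting FUA for write-through I/O."""
    byte1 = FUA_BIT if irp_flags & SL_WRITE_THROUGH else 0
    # opcode, flags, 32-bit LBA, group number, 16-bit length, control
    return struct.pack(">BBIBHB", WRITE_10, byte1, lba, 0, blocks, 0)


cdb_plain = build_write10(lba=0x1000, blocks=8, irp_flags=0)
cdb_fua = build_write10(lba=0x1000, blocks=8, irp_flags=SL_WRITE_THROUGH)
```

The two CDBs differ only in byte 1. Nothing in this layout says anything about ordering relative to other outstanding commands, which is a separate question for the SCSI spec.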

    The link at http://msdn.microsoft.com/en-us/library/dd979523(VS.85).aspx has
    some curious info about Transactional NTFS's interaction with storage
    devices.

    Jan
  • > It's been a while since I decoded the scsi command packets on Windows,
    > but also believe it doesn't use ordered tcq at all, so the device is free to
    > reorder things as it desires

    Sorry, this design paradigm really bothers me as I have seen too many driver writers doing such things. Just because you haven't seen a particular action in the debugger eons ago is in no way valid reasoning to skip such an important design consideration that impacts data integrity. Any new Windows version or update may, if it has not already, legally choose to use an ordered queue at any time. Could you really sleep well at night knowing such a time bomb is in your code that could trash every single customer disk drive? And by the way FUA has nothing to do with ordering AFAICS.
  • It mostly doesn't affect a Storport miniport at all. Miniports do not
    perform queueing; they pass requests from Storport to the addressed
    device. Any queueing issues and requirements are the domain of the
    actual device, Storport itself, and the initiator. However, I worked
    directly on Windows storage SCSI and Storport drivers and on storage
    stack drivers for ten years up until last year, and I never once saw
    anything other than simple tagged requests, and I am convinced that the
    only mechanisms in use in NT to enforce any ordering on storage devices
    are flush mechanisms. That said, and as I started out, even if NTFS or
    disk.sys suddenly started using ordered tag queues, that would not
    affect miniports, SCSI or Storport, in the slightest. They would
    simply be passing those requests on at the direction of the containing
    port driver.

    Mark Roddy
  • It was more of a calibration/disclaimer on the validity of my comment (i.e.
    do your own research for a better answer). I actually DID look through the
    sources of the WDK 7600 disk class driver, but I have never run the WDK 7600
    disk class driver and don't know if it matches the current shipping driver
    (it used to). I found no evidence of ordered tag queuing happening in those
    sources.

    There are no specifications other than the disk class driver sources that
    might describe this area, so I suspect reverse engineering OS behavior would
    be needed. Reverse engineering never tells you the intent/future direction
    of a component's owner, only its past and current implementation. Other
    options would be to open a support ticket with Microsoft, which may or may
    not get a definitive answer about Windows' use of tagged queuing, or if
    you're one of the fortunate people to have source code access, you could do
    your own research. Neither of these options will likely tell you what's in
    the future product plans, only past/current implementation.

    I would certainly love a design paradigm where EVERYTHING is deeply
    documented, but that is simply not reality for the majority of developers on
    this list.

    I did say you would have to look at the SCSI spec to understand any queuing
    impact setting the FUA bit would have.

    I do like to think 15 years of experience writing Windows drivers gives my
    comments rather more accuracy than some. I was also concurring with Mark's
    comment, also a VERY experienced Windows driver developer. Unless you have
    the Windows 7 source code in front of you, or a SCSI trace showing ordered
    tag queuing happening (i.e. reverse engineering), then it seems like the
    best available evidence from two highly experienced developers says the
    chances are better than 50% that Windows does not use ordered tagged
    queuing. The reality is, YOU as an engineer would have to come to your own
    conclusion about what reality is, and make your engineering decision based
    on your views. Mark and I are just data points to be considered.

    So how do you suggest we solve the problem of writing software in an
    environment where we don't absolutely know all decisions by other developers
    of other components? The most accurate answer about Windows TCQ is, it's not
    specified what the behavior is. The question then becomes, what are the
    boundaries on what software you can write given the unspecified behavior. It
    only takes reading this list for a short time to see many developers have
    little interest in putting boundaries on their product because certain
    behavior is unspecified.

    The sticky problem is that since some developers have no problem making
    assumptions about unspecified behavior, giving their products new features,
    ALL of us have to create products that compete with these products. I've had
    this conversation a million times with management/sales/marketing people:
    "There is no official way to make that feature work, we might find a way
    that bends/breaks the rules, but as an engineer, I have to recommend against
    doing things we can't be sure are stable and work correctly." Frequently,
    the response from management/sales/marketing is "But brand B has that
    feature, and to be competitive we need it too, so do anything you need to do
    to make that feature work, and we'll deal with any problems later". I as a
    developer then am faced with the ethical conflict of: do I refuse to do
    things that I believe are inappropriate or risky, possibly risking my
    employment or client relationship, or do I implement features that I have no
    way of validating as safe (i.e. it seems to work in (often inadequate)
    testing, but who knows what those other components really do)? I actually
    have somewhat of a reputation for sticking with my ethical values, and not
    bending to pressure from peers/authority. It doesn't always make me liked
    (at least initially, but then when I deliver stable products and others
    deliver unstable products, I get a lot of respect). I'm sure this is a
    problem Microsoft grapples with all the time, because the instability caused
    by some developers reflects back on them.

    All engineering is about risk management, whether it's designing a building or
    writing software. Nothing is absolutely risk free, as there are always
    factors you can't control. Engineers make their best judgment call of what
    reality will be like in the future, and design products with the assumption
    that view of reality is accurate. Engineers frequently don't get the last
    word on these risk management tradeoffs.

    Jan
  • <quote>
    I as a developer then am faced with the ethical conflict of: do I refuse to
    do things that I believe are inappropriate or risky, possibly risking my
    employment or client relationship, or do I implement features that I have no
    way of validating as safe (i.e. it seems to work in (often inadequate)
    testing, but who knows what those other components really do). I actually
    have somewhat of a reputation of sticking with my ethical values, and not
    bending to pressure from peers/authority
    </quote>

    In the event of "do it or go home", I would clearly put the
    assumptions/risks in bold in the design specification and honestly spell out
    under what circumstances the design may have unintended effects, what kind,
    and how bad they could be. Somebody must sign off before the ball
    starts rolling, right? As long as my product performs to spec, I'm off the
    hook.

    Calvin Guan

    -----Original Message-----
    From: xxxxx@lists.osr.com
    [mailto:xxxxx@lists.osr.com] On Behalf Of Jan Bottorff
    Sent: Saturday, February 20, 2010 9:17 PM
    To: Windows System Software Devs Interest List
    Subject: RE: [ntdev] Virtual Storport Miniport - HwStorStartIo

    It was more of a calibration/disclaimer on the validity of my comment (i.e
    do your own research for a better answer). I actually DID look though the
    sources in the WDK 7600 disk class driver, but I have never run the WDK 7600
    disk class driver and don't know if it matches the current shipping driver
    (it used to). I found no evidence of ordered tag queuing happening in those
    sources.

    There are no specifications other than the disk class driver sources that
    might describe this area, so suspect reverse engineering OS behavior would
    be needed. Reverse engineering never tells you the intent/future direction
    of a components owner, only it's past and current implementation. Other
    options would be to open a support ticket with Microsoft, which may or may
    not get a definitive answer about Windows use of tagged queuing, or if
    you're one of the fortunate people to have source code access, you could do
    your own research. Neither of these options will likely tell you what's in
    the future product plans, only past/current implementation.

    I would certainly love a design paradigm where EVERYTHING is deeply
    documented, but that is simply not reality for the majority of developers on
    this list.

    I did say you would have to look at the scsi spec to understand any queuing
    impact setting the FUA bit would have.

    I do like to think 15 years of experience writing Windows drivers gives my
    comments rather more accuracy that some. I was also concurring with Mark's
    comment, also a VERY experienced Windows driver developer. Unless you have
    the Windows 7 source code in front of you, or a scsi trace showing ordered
    tag queuing happening (i.e. reverse engineering), then it seems like the
    best available evidence from two highly experienced developers say the
    chances are better than 50% that Windows does not use ordered tagged
    queuing. The reality is, YOU as an engineer would have to come to your own
    conclusion about what reality is, and make your engineering decision based
    on your views. Mark and I are just data points to be considered.

    So how do you suggest we solve the problem of writing software in an
    environment where we don't absolutely know all decisions by other developers
    of other components? The most accurate answer about Windows TCQ is, it's not
    specified what the behavior is. The question then becomes, what are the
    boundaries on what software you can write given the unspecified behavior. It
    only takes reading this list for a short time to see many developers have
    little interest in putting boundaries on their product because certain
    behavior is unspecified.

    The sticky problem is that since some developers have no problem making
    assumptions about unspecified behavior, giving their products new features,
    ALL of us have to create products that compete with these products. I've had
    this conversation a million times with management/sales/marketing people:
    "There is no official way to make that feature work, we might find a way
    that bends/breaks the rules, but as an engineer, I have to recommend against
    doing things we can't be sure are stable and work correctly." Frequently,
    the response from management/sales/marketing is "But brand B has that
    feature, and to be competitive we need it too, so do anything you need to do
    to make that feature work, and we'll deal with any problems later". I as a
    developer am then faced with the ethical conflict of: do I refuse to do
    things that I believe are inappropriate or risky, possibly risking my
    employment or client relationship, or do I implement features that I have no
    way of validating as safe (i.e. it seems to work in (often inadequate)
    testing, but who knows what those other components really do)? I actually
    have somewhat of a reputation for sticking with my ethical values and not
    bending to pressure from peers/authority. It doesn't always make me liked
    (at least initially, but then when I deliver stable products and others
    deliver unstable products, I get a lot of respect). I'm sure this is a
    problem Microsoft grapples with all the time, because the instability caused
    by some developers reflects back on them.

    All engineering is about risk management, whether it's designing a building or
    writing software. Nothing is absolutely risk free, as there are always
    factors you can't control. Engineers make their best judgment call of what
    reality will be like in the future, and design products with the assumption
    that view of reality is accurate. Engineers frequently don't get the last
    word on these risk management tradeoffs.

    Jan

    > -----Original Message-----
    > From: xxxxx@lists.osr.com [mailto:bounce-402297-
    > xxxxx@lists.osr.com] On Behalf Of xxxxx@gmail.com
    > Sent: Saturday, February 20, 2010 5:25 PM
    > To: Windows System Software Devs Interest List
    > Subject: RE:[ntdev] Virtual Storport Miniport - HwStorStartIo
    >
    > > It's been a while since I decoded the scsi command packets on
    > Windows,
    > > but also believe it doesn't use ordered tcq at all, so the device is
    > free to
    > > reorder things as it desires
    >
    > Sorry, this design paradigm really bothers me as I have seen too many
    > driver writers doing such things. Just because you haven't seen a
    > particular action in the debugger eons ago is in no way valid reasoning
    > to skip such an important design consideration that impacts data
    > integrity. Any new Windows versions or updates may, if they have not
    > already, legally choose to use an ordered queue at anytime. Could you
    > really sleep well at night knowing such a time bomb is in your code
    > that could trash every single customer disk drive? And by the way FUA
    > has nothing to do with ordering AFAICS.
  • > It mostly doesn't affect a storport miniport at all.

    Not true. This topic is about virtual, hence queue order is typically handled by driver code that is servicing the requests rather than SBC compliant firmware.

    > I never once saw anything other than simple tagged requests

    You can download 3rd party utilities that exercise every single queue method. Just pray your customers never install any of them or Windows never gets around to performance optimization using advanced queuing.

    > There are no specifications other than the disk class driver sources that might describe this area

    Yes there is: T10.


    I am still unconvinced that browsing sources and looking at things in a debugger is a valid way to pick and choose what should go into driver design. Meet the specifications, don't cut corners. It is not difficult and probably will save time and increase customer satisfaction in the long run.
  • Mark_Roddy Posts: 4,269
    If your virtual storport is emulating a scsi device then yes that
    emulation should be complete. The miniport functionality of such a
    driver remains as I stated: it passes requests to the device emulation
    function which then processes those requests as appropriate. Your
    example does not contradict what I was saying. The miniport
    functionality is in general request-type agnostic and performs no
    queuing.

    Mark Roddy



    On Sun, Feb 21, 2010 at 8:09 PM, <xxxxx@gmail.com> wrote:
    >> It mostly doesn't affect a storport miniport at all.
    >
    > Not true. This topic is about virtual, hence queue order is typically handled by driver code that is servicing the requests rather than SBC compliant firmware.
    >
    >> I never once saw anything other than simple tagged requests
    >
    > You can download 3rd party utilities that exercise every single queue method. Just pray your customers never install any of them or Windows never gets around to performance optimization using advanced queuing.


    Why? The miniport doesn't care what type of requests are sent to it,
    it will just pass them on. The target device, virtual or otherwise,
    might care.

    Again in your contrived example of a virtual storport miniport that
    performs full scsi device emulation, that emulation functionality
    might care, but that is outside the scope of a storport miniport
    functionality, nor was the original question "if I am implementing
    scsi device emulation in a virtual storport miniport do I have to be
    concerned about queue management in that emulation"? The answer to
    that question is 'yes'.
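
    As a rough illustration of what queue management in the emulation
    layer can mean, here is a minimal sketch of the task-attribute rules
    as I understand them from the SCSI architecture model: a head-of-queue
    command may start right away, an ordered command waits for every older
    command to finish, and a simple command waits only for older ordered
    or head-of-queue commands. All names (emu_cmd, emu_may_start, the
    enums) are hypothetical and are not part of Storport or the WDK. The
    sketch is single-threaded; a real emulation would need its own
    locking around the command array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task attributes and command states for an emulated LUN. */
typedef enum { ATTR_SIMPLE, ATTR_ORDERED, ATTR_HEAD_OF_QUEUE } task_attr;
typedef enum { CMD_PENDING, CMD_ACTIVE, CMD_DONE } cmd_state;

typedef struct {
    task_attr attr;
    cmd_state state;
} emu_cmd;

/*
 * cmds[] is kept in arrival order (index 0 is the oldest command).
 * Returns true if cmds[idx] is still pending and may begin executing
 * given the state of every command that arrived before it.
 */
bool emu_may_start(const emu_cmd *cmds, size_t count, size_t idx)
{
    assert(cmds != NULL);

    if (idx >= count || cmds[idx].state != CMD_PENDING)
        return false;

    /* Head-of-queue commands do not wait for anything. */
    if (cmds[idx].attr == ATTR_HEAD_OF_QUEUE)
        return true;

    for (size_t i = 0; i < idx; i++) {
        if (cmds[i].state == CMD_DONE)
            continue;               /* finished commands never block */

        /* An ordered command is blocked by any unfinished older command. */
        if (cmds[idx].attr == ATTR_ORDERED)
            return false;

        /* A simple command is blocked only by an unfinished older
         * ordered or head-of-queue command. */
        if (cmds[i].attr != ATTR_SIMPLE)
            return false;
    }
    return true;
}
```

    The emulation would call something like emu_may_start() each time a
    command arrives or completes, and start whatever has become eligible.
    None of this belongs in the miniport-facing part of the driver, which
    only hands requests over.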

    >
    >> There are no specifications other than the disk class driver sources that might describe this area
    >
    > Yes there is: T10.
    >
    >
    > I am still unconvinced that browsing sources and looking at things in a debugger is a valid way to pick and choose what should go into driver design. Meet the specifications, don't cut corners. It is not difficult and probably will save time and increase customer satisfaction in the long run.

    What are the specifications for a storport miniport?
    Is it your advice then that a storport miniport must provide support
    for ordering requests?

  • > How should the virtual storport miniport driver queue and process the SRBs in correct order.

    By any means of its own.

    Reordering is also permitted; this is what the real hardware does with the queue tags.

    But note that SP_UNTAGGED requests are exclusive.
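
    One way to read "exclusive" is that an untagged request must run alone on its LUN: it starts only when nothing else is outstanding, and nothing else starts until it completes. A minimal sketch of such a gate follows. The names (lun_gate, lun_gate_try_start, lun_gate_complete) are hypothetical, not Storport APIs, and the caller is assumed to hold whatever lock the driver uses for its own queue, since no lock is held on entry to HwStorStartIo in a virtual miniport.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-LUN bookkeeping; protect it with the driver's own lock. */
typedef struct {
    unsigned tagged_outstanding;   /* tagged requests currently in service */
    bool     untagged_outstanding; /* true while an untagged request runs  */
} lun_gate;

/*
 * Try to admit one request. Returns true if it may be serviced now,
 * false if it has to stay on the driver's internal queue for the moment.
 */
bool lun_gate_try_start(lun_gate *g, bool is_untagged)
{
    assert(g != NULL);

    /* A running untagged request owns the LUN; nothing else may start. */
    if (g->untagged_outstanding)
        return false;

    if (is_untagged) {
        /* An untagged request waits until all tagged work has drained. */
        if (g->tagged_outstanding != 0)
            return false;
        g->untagged_outstanding = true;
        return true;
    }

    /* Tagged requests may overlap with each other. */
    g->tagged_outstanding++;
    return true;
}

/* Record a completion so that waiting requests can be retried. */
void lun_gate_complete(lun_gate *g, bool was_untagged)
{
    assert(g != NULL);

    if (was_untagged) {
        assert(g->untagged_outstanding);
        g->untagged_outstanding = false;
    } else {
        assert(g->tagged_outstanding > 0);
        g->tagged_outstanding--;
    }
}
```

    The sketch does nothing about fairness: a steady stream of tagged requests could hold off an untagged one indefinitely, so a real driver would probably stop admitting new tagged requests once an untagged one is waiting.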

    --
    Maxim S. Shatskih
    Windows DDK MVP
    xxxxx@storagecraft.com
    http://www.storagecraft.com
  • >requests are initiated sequentially." I took this reference from description of HwStorStartIo routine of
    >miniport StorPort driver but it should work for virtual miniport driver too.

    No.

    The whole point of "virtual" is "no locks held".

    --
    Maxim S. Shatskih
    Windows DDK MVP
    xxxxx@storagecraft.com
    http://www.storagecraft.com
  • Ordering is enforced by the FSD sending _synchronous_ (send and wait) writes, and never sending the next write to this sector before the previous one was completed.

    >in that time? Windows might think it has committed the journal to disk
    >when in fact it hasn't.

    Journal writes are done with the SCSI FUA bit set.

    It is really a good idea to honor it in any storage port/miniport.
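
    For a virtual miniport that parses CDBs itself, spotting the bit could look like the sketch below. It assumes the usual SBC layout as I recall it, where WRITE(10), WRITE(12) and WRITE(16) carry FUA in bit 3 of byte 1; check the opcodes and bit position against the current T10 documents before relying on them. The helper name and macros are hypothetical, not WDK definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed SBC opcodes and FUA bit position; verify against the T10 spec. */
#define OP_WRITE_10 0x2A
#define OP_WRITE_12 0xAA
#define OP_WRITE_16 0x8A
#define CDB_FUA_BIT 0x08   /* byte 1, bit 3 */

/*
 * Returns true if the CDB is one of the WRITE commands listed above and
 * has FUA set. Any other opcode, or a CDB too short to hold byte 1,
 * yields false.
 */
bool cdb_write_requests_fua(const uint8_t *cdb, size_t len)
{
    assert(cdb != NULL);

    if (len < 2)
        return false;

    switch (cdb[0]) {
    case OP_WRITE_10:
    case OP_WRITE_12:
    case OP_WRITE_16:
        return (cdb[1] & CDB_FUA_BIT) != 0;
    default:
        return false;
    }
}
```

    When this returns true, honoring the bit means not completing the request until the data is on stable storage, for example by writing through any cache the virtual device keeps instead of completing from the cache.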

    --
    Maxim S. Shatskih
    Windows DDK MVP
    xxxxx@storagecraft.com
    http://www.storagecraft.com
  • > > I never once saw anything other than simple tagged requests
    >
    > You can download 3rd party utilities that exercise every single queue
    > method. Just pray your customers never install any of them or Windows
    > never gets around to performance optimization using advanced queuing.

    A SCSI pass through app/driver can pass whatever it wants, and drivers/apps
    that use pass through are responsible for assuring correct operation.

    The original question was about performance optimization being done by the
    standard OS components, specifically the question was "I thought TCQ solved
    all these problems years ago", and a very clear answer on Windows is no, TCQ
    does not solve all the request ordering issues, you need to wait for request
    completion.

    > > There are no specifications other than the disk class driver sources
    > that might describe this area
    >
    > Yes there is: T10.

    That describes how the disk reacts, it says nothing about what requests the
    standard Windows OS will pass down. My guess is the Windows cluster software
    would not include tests for reservation behavior if all storage devices
    perfectly followed the specs.

    > I am still unconvinced that browsing sources and looking at things in a
    > debugger is a valid way to pick and choose what should go into driver
    > design.

    You're free to suggest a better way when there is no other information
    available, really, what do you propose? If you can get Microsoft to send me
    a copy of the source code, I'm sure I could make good use of it at times. My
    guess is no device implements EVERYTHING in T10; they pick and choose the
    stuff that has value or is required for certification testing. De facto
    subsets of standards emerge, which are often not documented anyplace. Even
    something as well documented as the TCP/IP protocol has a lot of variations
    in implementations.

    > Meet the specifications, don't cut corners. It is not difficult
    > and probably will save time and increase customer satisfaction in the
    > long run.

    Most specs have fuzzy areas. The real world has bugs and variations on
    interpretations of specs. It doesn't really help when a whole system doesn't
    work correctly to say "my part follows the spec, everybody else can just fix
    their area". Part of not cutting corners is understanding how reality
    doesn't match the specs, in addition to knowing what's in the specs.

    Just a few weeks ago, I was writing a driver for a hardware device, with the
    chip specs in front of me. The device would not do anything, and after some
    digging I found some information (not in the docs) saying that to make that
    chip work you had to write an undocumented value to an undocumented register during
    initialization. I had the chip vendor documentation with the recommended
    initialization sequence in front of me, and it didn't work.

    A LOT of devices don't exactly follow their specification documents, and
    take some magic to actually make work. I've spent a lot of my life looking
    at a lot of spec documents, and have had some pretty large battles to get
    component owners to fix what I believed was their non-conformance. Some of
    those battles were successful, some not. In the end, it's often the driver
    writer's job to work around hardware bugs. Reality is, spinning a new ASIC
    revision often costs millions of dollars, and a lot of driver workarounds
    can be written for millions of dollars. You should look at the Intel
    processor errata sometime. Even processors don't work quite the way their
    specs say. I'd bet there are a bunch of people reading this list nodding at
    what I'm saying.

    Jan