standby / resume OS init

Correct me if my understanding of the S3 resume path is incorrect:

<1> It all starts with a bios init

<2> The OS gets loaded / initialized ( is this correct ? ) later.

<3> The OS instructs the drivers to initialize themselves through power
IRPs.

My reason for asking is as follows. While running suspend.exe I see
that the system locks up after about 300 iterations of S3. By “locks
up” I mean a hard hang: the debugger is not responsive, and my
driver spew indicates that it has not been invoked at all. I am trying to
find out whether

< a > the system is locking up in the BIOS, or
< b > the OS has loaded but some driver before mine has caused a hard hang.

Any light thrown on this will be very helpful. I can use Catscan, but the
fact that this takes about 300 iterations means leaving the PCIE logic
analyzer running overnight and risking probe damage.
Also, what tools does MS provide for S3/S4 analysis? I’ve already come
across bootviz and suspend.

thanks
banks

Adding to the above:
Maybe it would be easy to tell whether the lockup takes place before the OS
kicks in if there is an nt!method that I can ba w4 on. This is just for
debugging, with no intention of disassembling / hacking :) and with the
caveat that this is highly build specific. Is there any such method?

banks


The answer to your question depends heavily on what kind of machine you’re
working with. Several things are always true:

  1. In S3, the processor is completely off. This means that restarting it
    will cause it to jump to the reset vector in the BIOS. So your BIOS is
    always the thing that gets control first. The BIOS is expected to make the
    machine “coherent” and then jump to the OS’s resume vector. In good
    machines, this means that a few milliseconds pass and the BIOS has done
    little more than setting up the memory controller. In other machines, it
    can be a whole pass through the BIOS, including a VGA BIOS run.

  2. The OS does little more than put the processor back into protected mode
    before it starts sending IRPs to drivers, starting with the root of the PnP
    tree and working outward (with some sorting applied.)

  3. The debugger will start to work as soon as the serial port is turned
    back on. This may happen in the BIOS. It may have never been turned off.
    Or it may happen when the power manager gets to the serial port in the
    (sorted) device tree. This can mean that large parts of S3 resume are not
    debuggable unless you get a special build of the BIOS which turns on the
    serial port for you before passing control to the OS.

  4. NMI/SERR cards that can generate a crashdump for debugging may help, but
    only if the disk stack has been powered-on by the time of your failure.
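
As a rough illustration of points 2 and 3 above, here is a toy Python model of the resume sequence. Everything in it is an assumption made for illustration: the device tree, the device names, and the plain breadth-first walk standing in for the real (sorted) power-up order. It only shows why a hang anywhere before the debug port is powered looks the same from the debugger's side.

```python
from collections import deque

# Hypothetical device tree (parent -> children). Names are illustrative only
# and do not correspond to any real machine.
DEVICE_TREE = {
    "root": ["acpi"],
    "acpi": ["pci"],
    "pci": ["usb_controller", "isa_bridge", "disk_controller", "my_device"],
    "isa_bridge": ["serial_port"],
    "usb_controller": [],
    "disk_controller": [],
    "my_device": [],
    "serial_port": [],
}


def power_up_order(tree, root="root"):
    """Walk the tree breadth-first, from the root outward.

    This is a simplification: the real order has extra sorting applied.
    """
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(tree.get(node, []))
    return order


def resume_stages(tree, bios_enables_debug_port=False, debug_device="serial_port"):
    """Return (stage, debugger_usable) pairs in resume order.

    debugger_usable means the debug transport is powered by the time the
    stage completes.
    """
    usable = bios_enables_debug_port
    stages = [
        ("BIOS: reset vector, memory controller setup", usable),
        ("OS: resume vector, back to protected mode", usable),
    ]
    for dev in power_up_order(tree):
        if dev == debug_device:
            usable = True
        stages.append((f"power IRP: {dev}", usable))
    return stages


def blind_stages(stages):
    """Stages during which a hang cannot be seen from the debugger."""
    return [name for name, usable in stages if not usable]


if __name__ == "__main__":
    for name in blind_stages(resume_stages(DEVICE_TREE)):
        print("not debuggable:", name)
```

In this toy tree my_device sits inside the blind window when the serial port is the debug transport. Choosing a transport that is reached earlier (for example debug_device="usb_controller") shrinks the window but does not remove it; only bios_enables_debug_port=True closes it.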

My suggestion is to try a different motherboard, one with a different brand
of BIOS but largely the same chipset and adapter mix.

Unfortunately, the best I can say is that when I’ve had to debug these
situations (and I used to have to do that more or less endlessly), I would
generally have to resort to an Arium, an ITP or a special build of the BIOS.


Jake Oshins
Windows Kernel Group

The Virtual Machine Team at Microsoft is hiring. Contact
xxxxx@microsoft.com for more information.

This posting is provided “AS IS” with no warranties, and confers no rights.


To follow up on Jake's point 3 - turning the serial port on early is something we often do in our reference BIOS, but we have to turn it off on production boards because of the resume time impact due to WHQL. (Some SIO chips that board manufacturers choose to use are awfully slow about coming back online.)

Jake - what about USB debug, is that turned on any sooner? (Thinking forward here)


From: xxxxx@lists.osr.com on behalf of Jake Oshins
Sent: Tue 2/14/2006 9:02 PM
To: Windows System Software Devs Interest List
Subject: Re:[ntdev] standby / resume OS init


Questions? First check the Kernel Driver FAQ at http://www.osronline.com/article.cfm?id=256

You are currently subscribed to ntdev as: xxxxx@nvidia.com
To unsubscribe send a blank email to xxxxx@lists.osr.com

The man knows of what he speaks. Go see your friends at American Arium.
Their ECM-50 (or something similar) is the only reliable (and generally
the only possible) way to attack this problem (unless simulation is an
option; sort of).




USB won’t be turned on significantly
sooner. It’s likely that USB is encountered slightly earlier in the sorted
device tree. But the fundamental situation is the same.


Jake Oshins
Windows Kernel Group


“Mark Overby” wrote in message news:xxxxx@ntdev…
To follow up with Jake - this is something we often do in our reference
BIOS, but have to turn off on production boards because of the resume time
impact due to WHQL. (Some SIO chips which board manufacturers choose to use
are awfully slow about coming back online)

Jake - what about USB debug, is that turned on any sooner? (Thinking forward
here)
