Is there a kernel api or assembly instruction by which I can get the processor clock ticks from within kernel code?

Hi,

Is there a kernel API or assembly instruction in Windows by which I can get
the number of processor clock ticks from within kernel code?
I want to measure the number of processor cycles that elapse between
two particular points in my code.
OS ticks are not granular enough to be useful for this purpose.

It's fine even if it's an assembly instruction that I can execute from
within my kernel driver.
I need it to work on systems with both AMD and Intel processors.

I have done such measurements on a different OS, so I am trying to figure
out a similar solution on Windows Vista.

Thanks,
-Praveen

KeQueryPerformanceCounter - but search the archives for the repeated
discussions regarding KeQueryPerformanceCounter vs RDTSC.

On Jan 3, 2008 9:39 AM, Praveen Kumar Amritaluru
wrote:

> Hi,
>
> Is there a kernel api or assembly instruction in Windows by which I can
> get
> the number of processor clock ticks from within kernel code?
>
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>


Mark Roddy

Hello,

you can use the __rdtsc intrinsic to read the time stamp counter:
http://msdn2.microsoft.com/en-us/library/twchhe95(VS.80).aspx

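You will have to take care of processor affinity and IRQL for a precise measurement: the two reads should happen on the same processor, and any interrupt or context switch between them will be counted in the result.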
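
For anyone reading this later, here is a rough user-mode sketch of the "read the counter twice and subtract" pattern, not driver code. It assumes GCC or Clang on x86, where `__rdtsc` comes from `<x86intrin.h>` (MSVC exposes the same intrinsic through `<intrin.h>`); on other architectures it falls back to `clock_gettime`. The names `read_ticks`, `sample_work` and `measure_ticks` are made up for illustration.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime in the fallback path */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>  /* GCC/Clang home of __rdtsc */
#endif

/* Read an increasing tick count: the time stamp counter on x86,
 * nanoseconds from a monotonic clock elsewhere. */
static uint64_t read_ticks(void)
{
#if defined(__x86_64__) || defined(__i386__)
    return __rdtsc();
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}

/* Hypothetical workload standing in for the code between the two points. */
static void sample_work(void)
{
    volatile uint32_t sink = 0;
    for (uint32_t i = 0; i < 1000000u; ++i)
        sink += i;
}

/* Return the number of ticks that elapse while fn runs. */
static uint64_t measure_ticks(void (*fn)(void))
{
    assert(fn != NULL);
    uint64_t start = read_ticks();
    fn();
    uint64_t end = read_ticks();
    return end - start;
}
```

Calling `measure_ticks(sample_work)` gives the elapsed count for one run. Nothing here pins the thread to a processor or blocks preemption, so the caveat about affinity and IRQL still applies to anything built on this in a driver.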

Hi, People,

I have a strange crash that occurs once in a blue moon, and I wonder if anyone
out there can point me in the right direction.

My driver has a kernel-side thread that keeps track of dma and render hw
sequence numbers, manages things such as sequencing at chip level, polling
for interrupts in chips that do not interrupt, keeping an eye on runaway or
hung transactions, and so on: it’s a kind of a device level cop. I have one
such thread per i/o device. Once the thread is created, it waits on a timer
tick, does whatever it needs to do, and then waits until the next timer tick
or the next event wakes it up.

Because of diagnostics and other low-level considerations, the thread isn’t
on at all times: we create it at open time and turn it off at close time.
Because we can have multiple opens per device, the first open creates the
thread, and the last close terminates it.

The problem is this: once in a blue moon we get a bugcheck 7e at the time
the thread is being created. It doesn’t happen very often, but when it
happens it’s always at the same point. The crash seems to be inside Windows.

Here’s the stack:

=======================
STACK_TEXT:
f6308d3c e0cec9bd f6308d78 00000000 fa7f694c nt!CcPfBuildDumpFromTrace+0x23a
f6308d7c e0c059bd fa7f6810 00000000 fc9c2da8 nt!CcPfEndTrace+0x67
f6308dac e0c9c84c fa7f6810 00000000 00000000 nt!ExpWorkerThread+0xef
f6308ddc e0c1332e e0c058ce 00000001 00000000 nt!PspSystemThreadStartup+0x34
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16
=======================

Here’s the error information:

=======================
VP2000.0 2008/01/02 15:27:25:0298 @<5.d> p1 vp2000_diag_set_open_mode(440): turned on the timer thread on device VP2000
*** Fatal System Error: 0x0000007e (0xC0000005,0xE0CEBBC8,0xF6308C3C,0xF6308938)
Break instruction exception - code 80000003 (first chance)
A fatal system error has occurred.
Debugger entered on first try; Bugcheck callbacks have not been invoked.
A fatal system error has occurred.
========================

The first line is issued from the thread itself, showing that the Open ioctl
actually tried to start the thread. It contains the device name (Vp2000),
the open number (.0), a date and time stamp, the PCI slot (@<5.d>), the
processor (p1), and the function and source code line number from where it
was issued. There are no ensuing debug messages from the thread, which means
that it never started. The machine is a two-processor Dell 670.

The following information is also dumped:

========================
BugCheck 7E, {c0000005, e0cebbc8, f6308c3c, f6308938}
Probably caused by : ntkrpamp.exe ( nt!CcPfBuildDumpFromTrace+23a )
========================

When I run !analyze -v, this comes out:

========================
FAULTING_IP:
nt!CcPfBuildDumpFromTrace+23a
e0cebbc8 8b5008 mov edx,[eax+0x8]

EXCEPTION_PARAMETER1: f6308c3c

CONTEXT: f6308938 -- (.cxr fffffffff6308938)
eax=00000000 ebx=e63c2008 ecx=fa7f6864 edx=00000000 esi=fa7f6810 edi=e63c30a8
eip=e0cebbc8 esp=f6308d04 ebp=f6308d3c iopl=0 nv up ei pl nz ac po cy
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00210217
nt!CcPfBuildDumpFromTrace+0x23a:
e0cebbc8 8b5008 mov edx,[eax+0x8]
=========================

The reason for the crash is now clear: eax is zero and CcPfBuildDumpFromTrace
tries to dereference a null pointer. It looks like something happened during
the thread creation that caused the OS to blow up. My question is, what is
CcPfBuildDumpFromTrace, and how do I debug it? Any suggestion will be
welcome.

Alberto.

I assume you were able to break into the debugger, or did you load a dump
and run !analyze on it?

Since it happens once in a blue moon, it would perhaps be a good idea to hook
KeBugCheck and its variants (look at the WDK), in case you need to put out
more information.

You do not seem to have looked at the stack after !analyze (it is not in
your post).

A few questions:

  1. How infrequently do you see this? Once a month, or ...?
  2. Is it reproducible on a debug build, or only on release? (Since it is
    x86, you can always have PDB files with good enough information even for
    a release build.)
  3. Is there any way you can reduce the time between occurrences? Do you do
    anything special, or do you leave the systems idle?

All it says is that it is a Cc internal routine. I am not sure where the
breakpoint exception came from. It could be somewhere in the code and you
don't have debug enabled (just a guess).

If you could catch it under the debugger, I guess it would be a bit easier
to look at the stack, trap frames, etc.

BTW, there are a few very nice articles in The NT Insider that cover some
of the advanced techniques you can follow.

-pro

On Jan 4, 2008 5:09 AM, Alberto Moreira wrote:

> Hi, People,
>
> I have a strange crash that occurs once in a blue moon, I wonder if anyone
> out there can point me in the right direction ?

On 1/4/08, Prokash Sinha wrote:
> I would assume you were able to break into the debugger or did you have a
> dump that you loaded and did the !analyze on it?

> All it says is a Cc internal routine. And not sure where the break-point
> execption came from. It could be in the code somewhere and you don’t have
> debug enabled ( just a guess ).
>
> If you could catch under debugger, then I guess it is bit easier to look at
> the stack, trap frames etc.
>

Sorry, this is off-topic and not related to the question, but Prokash, isn't
this an access violation (0xc0000005 is the first parameter)?

The breakpoint line is the standard kd break on a crash caused by the access
violation; kd is just reporting that it was invoked because of a problem, and
it shows the faulting instruction:

e0cebbc8 8b5008 mov edx,[eax+0x8]

eax=00000000
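
To make the reading of those numbers concrete, here is a small sketch that labels the four arguments of bugcheck 0x7E (SYSTEM_THREAD_EXCEPTION_NOT_HANDLED) and works out the address the faulting load touched. The struct and function names (`bugcheck_7e`, `exception_name`, `effective_address`, `describe`) are invented for illustration; only the argument order and the two status codes are standard.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* The four arguments printed with bugcheck 0x7E
 * (SYSTEM_THREAD_EXCEPTION_NOT_HANDLED). */
struct bugcheck_7e {
    uint32_t exception_code;   /* arg 1: code of the unhandled exception */
    uint32_t faulting_ip;      /* arg 2: address where the exception occurred */
    uint32_t exception_record; /* arg 3: address of the exception record */
    uint32_t context_record;   /* arg 4: address of the context record */
};

/* Name the two status codes that come up in this thread. */
static const char *exception_name(uint32_t code)
{
    switch (code) {
    case 0xC0000005u: return "STATUS_ACCESS_VIOLATION";
    case 0x80000003u: return "STATUS_BREAKPOINT";
    default:          return "unknown";
    }
}

/* Effective address of an x86 memory operand of the form [base+disp]. */
static uint32_t effective_address(uint32_t base, uint32_t disp)
{
    return base + disp;
}

/* Print a short summary of the bugcheck and the address the load touched. */
static void describe(const struct bugcheck_7e *bc, uint32_t eax)
{
    assert(bc != NULL);
    printf("exception code : 0x%08X (%s)\n",
           (unsigned)bc->exception_code, exception_name(bc->exception_code));
    printf("faulting ip    : 0x%08X\n", (unsigned)bc->faulting_ip);
    printf(".cxr address   : 0x%08X\n", (unsigned)bc->context_record);
    printf("mov edx,[eax+0x8] reads address 0x%08X\n",
           (unsigned)effective_address(eax, 0x8));
}
```

Filling the struct with the values from the post (c0000005, e0cebbc8, f6308c3c, f6308938) and passing eax = 0 to `describe` reports an access violation on a read of address 8, which is what the register dump shows.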

You are right, and I noticed that :). It is clear that it is trying to
access the low 64 KB, in fact address 8. But for the analysis, I think there
are several approaches:

  1. Offhand I don't remember what the code would be (perhaps 0x80000003) if
    it were an embedded breakpoint.

  2. Often there is more verbose output of the code and the arguments from
    the bugcheck, in this particular case 7e. As you know, this kind of code
    has multiple meanings (bugcodes.h or the WinDbg help gives a bit more
    detail) that depend on all the arguments passed to KeBugCheck. It might
    be a page fault at a raised IRQL, so more detail is needed.

  3. As a rule, a kernel API might not check all of its parameters (because
    of trust, and to avoid slow parameter checking) before processing starts,
    so eax might be an argument passed in here (though usually it holds the
    return value).

  4. Also, the stack is not very interesting; one should try to increase its
    depth and try a stack dump.

Alberto will find it :)

-pro

----- Original Message -----
From: “raj_r”
To: “Windows System Software Devs Interest List”
Sent: Saturday, January 05, 2008 8:19 AM
Subject: Re: RE:[ntdev] Is there a kernel api or assembly instruction by
which I can get the processor clock ticks from within kernel code?

> sorry offtopic nothing related to the question but prokash isnt this
> an AccessViolation (0xc0000005 -> first param)

> On Jan 4, 2008 5:09 AM, Alberto Moreira wrote:
> > Hi, People,

> > My question is, what is
> > CcPfBuildDumpFromTrace, and how do I debug it ? Any suggestion will be
> > welcome.

Google didn't turn up anything, so I pulled a copy of ntkrpamp.exe off the net.
The CcPf routines are present in ntkrpamp.exe:

0:000> .fnent ntkrpamp!CcPfBuildDumpFromTrace
Debugger function entry 02416370 for:
(0059c0ec) ntkrpamp!CcPfBuildDumpFromTrace | (0059c514) ntkrpamp!CcPfUpdateVolumeList
Exact matches:
ntkrpamp!CcPfBuildDumpFromTrace =

OffStart: 0019c0ec
ProcSize: 0x422
Prologue: 0xe
Params: 0n2 (0x8 bytes)
Locals: 0n11 (0x2c bytes)
Non-FPO

It takes two parameters.

The second argument is loaded into esi here:
0059c2c0 8b750c mov esi,dword ptr [ebp+0Ch]

The transfer to +0x23a probably happens here:
ntkrpamp!CcPfBuildDumpFromTrace+0x31d:
0059c409 3bc1 cmp eax,ecx
0059c40b 0f8515ffffff jne ntkrpamp!CcPfBuildDumpFromTrace+0x23a (0059c326) Branch

ecx gets its value here:

0059c319 8d4e54 lea ecx,[esi+54h] <-- possibly the argument's contents
0059c31c 8945dc mov dword ptr [ebp-24h],eax
0059c31f 8b01 mov eax,dword ptr [ecx]
0059c321 e9e3000000 jmp ntkrpamp!CcPfBuildDumpFromTrace+0x31d (0059c409) Branch

WinDbg sorely lacks a way to find references to functions (xrefs), or I can't
find how to make it show who might be calling this function.

regards

raj
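
A note on lining up the two sets of addresses above: the crash dump uses addresses from the running kernel (e0cebbc8), while the disassembly uses addresses from a separately obtained image (0059c0ec and up). A minimal sketch of the symbol+offset arithmetic follows, with made-up helper names `function_base` and `rebase`. It assumes both copies are the same build of the function, which the thread does not confirm.

```c
#include <assert.h>
#include <stdint.h>

/* Start of a function, given an address inside it and the offset
 * reported by the debugger (for example +0x23a). */
static uint32_t function_base(uint32_t address, uint32_t offset)
{
    assert(address >= offset);
    return address - offset;
}

/* Translate an address inside a function loaded at from_base to the
 * matching address in another copy of the same function at to_base. */
static uint32_t rebase(uint32_t address, uint32_t from_base, uint32_t to_base)
{
    assert(address >= from_base);
    return address - from_base + to_base;
}
```

With the numbers from the thread, `function_base(0xE0CEBBC8, 0x23A)` gives e0ceb98e for the live kernel, and rebasing the faulting address onto 0059c0ec gives 0059c326, the jne target shown in the listing.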

> windbg sorely lacks find references to functions or xrefs or i cant
> find how to make it spit who might be calling this function

OK, this seems to be called directly by
PAGE:0059D124 call _CcPfBuildDumpFromTrace@8 ; CcPfBuildDumpFromTrace(x,x)

which is part of
PAGE:0059C0EC ; __stdcall CcPfBuildDumpFromTrace(x, x)
PAGE:0059C0EC xxxxx@8 proc near ; CODE XREF: CcPfEndTrace(x)+72p

which is called by

PAGE:0059D0B2 ; __stdcall CcPfEndTrace(x)
PAGE:0059D0B2 xxxxx@4 proc near ; CODE XREF: CcPfEndTraceWorkerThreadRoutine(x)+6j

which might end up in

PAGE:0059D7E4 ; __stdcall CcPfBeginAppLaunch(x, x)
PAGE:0059D7E4 xxxxx@8 proc near ; CODE XREF: PspUserThreadStartup(x,x)+D8p

PAGE:00548CE8 ; __stdcall PspUserThreadStartup(x, x)
PAGE:00548CE8 xxxxx@8 proc near ; DATA XREF: PspCreateThread(x,x,x,x,x,x,x,x,x,x,x)+248o

PAGE:00548FF0 _PspCreateThread@44 proc near ; CODE XREF: NtCreateThread(x,x,x,x,x,x,x,x)+D8p
PAGE:00548FF0 ; PsCreateSystemThread(x,x,x,x,x,x,x)+2Ep

IDA sure rocks for deadlisting :)

regards

raj

IDA rocks in general. Expensive, but so very worth it, in my opinion.

mm


Hi, Pro,

Since then, things happened, and I have a few clues. We have an issue that a render can hang if we touch the chip the wrong way. I recently added a hang recovery function to the driver, which detects a hang, and tries to dispose of the render and of subsequently enqueued renders and dma transactions, orderly releasing resources and all that good jazz. We found by experimentation that this problem occurs when (1) a render hangs, (2) the recovery kicks in, releasing everything and cleaning up the chip, (3) the device is closed, causing the thread to terminate, (4) another application runs and the device is reopened, causing the creation of a new thread, which then bombs out in the expected way.

Looks like something in my recovery is corrupting memory. Indeed, the idea to hook KeBugCheck is worth trying. Watch this space!

Alberto.
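
For readers trying to picture the lifecycle described in this thread (the first open creates the thread, the last close terminates it, and a later reopen creates a new one), here is a minimal sketch of the open-count bookkeeping. It is only a model: `struct device`, `device_open` and `device_close` are hypothetical names, no real thread is created, and callers are assumed to serialize open and close, which a driver would do with a lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state; callers must serialize access. */
struct device {
    unsigned open_count;     /* number of outstanding opens */
    bool     worker_running; /* stands in for the per-device thread */
    unsigned starts;         /* how many times the worker was started */
};

/* The first open starts the worker; later opens only bump the count. */
static void device_open(struct device *d)
{
    assert(d != NULL);
    if (d->open_count++ == 0) {
        d->worker_running = true;
        d->starts++;
    }
}

/* The last close stops the worker; earlier closes only drop the count. */
static void device_close(struct device *d)
{
    assert(d != NULL && d->open_count > 0);
    if (--d->open_count == 0)
        d->worker_running = false;
}
```

The failing sequence reported above corresponds to open, close, open on the same device: the second open starts the worker again, so any state left behind by the previous run (or by the recovery path) is still around when the new thread is created.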

----- Original Message -----
From: Prokash Sinha
To: Windows System Software Devs Interest List
Sent: Friday, January 04, 2008 5:16 PM
Subject: Re: RE:[ntdev] Is there a kernel api or assembly instruction by which I can get the processor clock ticks from within kernel code?

I would assume you were able to break into the debugger or did you have a dump that you loaded and did the !analyze on it?


Indeed, I will watch in case you find something interesting :)

It seems that you can now make it happen more quickly (sort of simulating it), so that is good news.

-pro
----- Original Message -----
From: Alberto Moreira
To: Windows System Software Devs Interest List
Sent: Saturday, January 05, 2008 5:36 PM
Subject: Re: RE:[ntdev] Is there a kernel api or assembly instruction by which I can get the processor clock ticks from within kernel code?
