strange interrupt behaviour when HAL is Standard PC

James_Harper · December 29, 2008, 9:41pm

>> Glad that FreeBSD is committed to support it in v8.0. Man, on a Windows
>> system with *multiple*

> What is amazing to me is that Windows has had ToE support (to some degree)
> since at least w2k, Linux and FreeBSD still have none, and there are myth
> believers that consider these UNIXen to be faster than Windows.

If by “to some degree” you are talking about “Large Send Offload” (TCP &
UDP, RX & TX) and “Checksum Offload” (RX and TX) then Linux has had it
for quite a while.

In fact (to make it on-topic again, as I really would like to resolve
these problems), Windows 2003’s ToE is limited enough that making
it talk to Linux under Xen is a real pain. I have encountered the
following problems - I believe they are limitations of Windows/NDIS but
maybe the limitation is my understanding of NDIS:

(background) Under Xen, the interfaces that a Linux DomU sees support
checksum offload and large send offload. Linux can therefore send a 64K
TCP packet onto a bridge in Dom0 without calculating the checksum. If
the final destination of the packet lies on the other side of a physical
interface, and the physical interface supports Large Send then the
packet will just be handed to the interface and the interface will do
the offload, otherwise Linux will fake the offload and we are no worse
off than if the packet was checksummed and split up in the DomU (better
off, in fact, as only a single 64K packet had to traverse the bridge
rather than roughly 45 1500-byte packets). If the final destination of the
packet is another DomU, the DomU will accept the packet in its ‘large’
form (even though it is >MTU), and never look at the checksum as it is
flagged as ‘checksum validated’.

(1) Windows refuses to accept a packet from my virtual interface that is
larger than the MTU, which means that my driver has to break up the
packet into MTU sized chunks. Lots of overhead which should be
unnecessary! I think I understand why the problem exists, but (IMHO) it
shows a bit of a lack of foresight on the part of NDIS.
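To put a number on the overhead in (1), here is a minimal sketch of the split a driver has to do when the stack will not accept anything larger than the MTU. It assumes a 1500-byte MTU and 20-byte IP and TCP headers with no options (so 1460 bytes of payload per frame); `segment_payload` is a hypothetical helper, not an NDIS or Xen call, and a real driver would also have to rebuild headers and sequence numbers for every chunk.

```python
MTU = 1500
IP_HEADER = 20    # assumption: IPv4 header without options
TCP_HEADER = 20   # assumption: TCP header without options
MSS = MTU - IP_HEADER - TCP_HEADER  # 1460 payload bytes per frame


def segment_payload(payload: bytes, mss: int = MSS) -> list:
    """Split one large TCP payload into chunks of at most mss bytes."""
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]


# Largest payload that fits in a single 64K IP packet: 65535 - 40 header bytes.
large_payload = bytes(65535 - IP_HEADER - TCP_HEADER)
segments = segment_payload(large_payload)
print(len(segments), "segments, last one", len(segments[-1]), "bytes")
```

Every one of those chunks then needs its own header and its own pass through the receive path, which is the cost the large-packet path avoids.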

(2) Windows appears to be double-checking the checksum, so if I give
Windows a packet and say “the checksum is correct”, Windows still
re-checks it and drops the packet (the checksum is blank because there
is no need to calculate a checksum when the packet has never been out on
the wire) which means I have to calculate the checksum before handing
the packet to NDIS. Again, pointless overhead.
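For anyone wondering what "calculate the checksum" costs in (2): it is a pass over every byte of the packet. The sketch below is the standard Internet checksum (16-bit one's complement sum, as described in RFC 1071), written in Python purely for illustration. It only covers the summation; a real TCP checksum also includes the pseudo-header, and `internet_checksum` is a name made up for this example.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's complement checksum of data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"          # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


# Worked example bytes from RFC 1071.
sample = bytes([0x00, 0x01, 0xF2, 0x03, 0xF4, 0xF5, 0xF6, 0xF7])
checksum = internet_checksum(sample)
```

A receiver that re-verifies runs the same sum over the data plus the stored checksum and expects zero, which is why a blank checksum field gets the packet dropped.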

This is NDIS5.1 (my drivers need to work under XP)… maybe these
problems are resolved in 6.x? If so, I’d consider maintaining both a 5.1
and a 6.x version of the drivers as the performance gains would be
significant.

Thanks

James

James_Harper · December 29, 2008, 9:46pm

>> (substitute ‘TPR write’ instructions with ‘call my code’ instructions)

> Custom HAL with custom KeRaiseIrql would help

Hmmmm… I wonder if it would be as simple as hooking the KeRaiseIrql
vector… or is it actually an inline function? I’m sure that if it were
that simple, others would be doing it that way too.

>> which will require modification to the Xen hypervisor (not sure yet if

> Well, what I see here is the issue is a minor bit of misarchitecture in
> Xen.
>
> Task: implement a generic way of IO request passing from guest to host
> under Xen, which will be able to support different kinds of emulated
> hardware in guests.
>
> Proper solution: add message passing feature to Xen hypervisor using the
> “hypervisor call” primitive, then add message server to the host and
> message-sending client to the guest.
>
> Improper solution as I see it: introduce an emulated PCI device with an
> interrupt to guests to use for this message-passing.
>
> Actually, what is going on is the usage of emulated PCI device with BARs,
> config space, interrupt etc to implement a hypervisor call.
>
> Isn’t this bad? implementing it as an invalid CPU opcode is simpler and
> more natural.

I think the ‘good way’ that you are describing is done under the PV
version of Linux, where the kernel knows that it is being virtualised.

Under ‘fully virtualised’ (eg Windows where the kernel doesn’t know it
is being virtualised, or at least doesn’t have to know), the PCI device
is there as a mechanism to give the driver something to attach to. If
you don’t use an interrupt, I’m not sure how Xen could signal the DomU
that something needed to be done (eg that a network packet had arrived).

‘Messages’ from Windows to Xen are indeed hypercalls - a special call
(MSR or CPUID or something like that) is done to populate a memory page
with call vectors, the internals of which depend on the architecture
etc.

James

James_Harper · December 29, 2008, 9:49pm

> Unless you are doing something particularly clever to hide your tracks,
> that will most likely bugcheck all x64 Windows guests (PatchGuard).

I believe that the TPR write problem (where Windows performs writes to
the TPR register when they are not actually necessary as the value of
the register is not changing, which happens very frequently) is only a
problem under 32 bit guests. I also believe that SP2 of Windows 2003 x32
implements a ‘lazy TPR write’ mechanism which only writes to the
register when it would actually change its value, but users are still
reporting performance issues under 2003sp2 with >1 CPU.
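The "lazy TPR write" idea can be modelled in a few lines. This is only a toy model of the concept as described above (remember the last value, skip the write when nothing changes); `LazyTpr` and its counter are invented for illustration and say nothing about how the Windows HAL actually implements it.

```python
class LazyTpr:
    """Toy model: only touch the (expensive, trapped) register on a real change."""

    def __init__(self, initial: int = 0):
        self.value = initial        # software copy of the register contents
        self.hardware_writes = 0    # how many writes would reach the hardware

    def write(self, new_value: int) -> bool:
        """Request a TPR value; return True if a hardware write was needed."""
        if new_value == self.value:
            return False            # value unchanged: skip the write entirely
        self.value = new_value
        self.hardware_writes += 1
        return True


tpr = LazyTpr()
requested = [0, 0, 2, 2, 2, 0]      # hypothetical sequence of requested values
results = [tpr.write(v) for v in requested]
```

Under a hypervisor each skipped write is a trap that never happens, which is where the saving would come from.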

We’ll see how it works out!

Thanks

James

Maxim_S_Shatskih · December 29, 2008, 10:13pm

> Hmmmm… I wonder if it would be as simple as hooking the KeRaiseIrql vector… or is it actually an inline function?

No, it is HAL-dependent and implemented in HAL, and really sets the value to TPR on an APIC HAL.

–
Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

Calvin_Guan-2 · December 29, 2008, 10:18pm

Just to clarify a bit: when people say TOE, they specifically mean “stateful offload”, or “connection offload”, i.e. full TCP offload or MSFT Chimney-style offload, which is a huge amount of R&D, engineering and marketing for both OS vendors and IHVs.

The first commercially available TOE, IIRC, was in w2k3 with the Scalable Networking Pack, where you have an NDIS 5.2 driver that handles stateful offload packets with NetBufferList and stateless offload and non-offload packets with the old NDIS_PACKET. We’ve been mass shipping TOE chips to server OEMs since then.

OTOH, *stateless* offloads such as checksum and LSO/TSO have been supported by almost all desktop/server OSes for many, many years.

–
Calvin Guan
Broadcom Corp.
Connecting Everything(r)


—
NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer


anton_bassov · December 29, 2008, 11:30pm

> I find the whole “In linux you can…” to be inappropriate. It doesn’t make sense in general, and even discussing it in a Windows group is silly beyond comprehension.

As it happens quite often these days on NTDEV, it is again all about Linux. The only thing that makes it truly unique is that this thread is already 25 posts long…but none of them is mine!!!

> I find the whole “you can change anything” argument fallacious in the extreme.

What about the one “It is Anton who generates all the noise on NTDEV” ??? As we have a chance to see it with our own eyes, this argument seems to be fallacious as well (although I DO contribute quite a lot to it - there is no question about that)…

Now that I’ve got documentary proof that it is not only me who starts discussions of this type, I feel pretty safe joining in…

> Sure, you can, and then you become a maintainer of the only version of linux that can run your driver. From a business viewpoint, this sucks. In the entire history I have been involved with Unix,
> it has never made sense to modify the kernel to run something that was intended for distribution
> beyond the machine so modified. I find the whole “you can change anything” argument fallacious
> in the extreme. The consequences of these actions are profound. I even watched projects crash
> and burn because kernel modifications were made that meant that two different programs/drivers
> could not coexist on the same machine because they required mutually-incompatible changes,
> and they could not run on unmodified kernels at all.

The only argument that is truly fallacious is the one above - it is, indeed, fallacious from beginning to end. Let’s say you have written a program for Windows. Would it be reasonable for you to expect it to run, say, on Mac??? This is exactly the same story - you can think of every UNIX flavor as an OS in its own right (as it is), and, hence, should not expect it to be able to run programs written for other OSes.
This is what you do when you make system calls that are not specified by POSIX.

There is no mistake here - all POSIX standards end at the level of the C library. They don’t specify the system calls that have to be implemented by the OS. Therefore, if you call, say, clone() from your program, you should not expect it to run on any OS other than Linux. If you want it to be able to run on other POSIX-compliant OSes you should call fork() and pthread_create(), i.e. POSIX calls that, on Linux, delegate the job to clone() behind the scenes, instead of calling clone() directly…
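To illustrate the portable route, here is a small sketch that sticks to the portable interfaces. It is written in Python rather than C only to keep it short: `os.fork()` wraps the POSIX fork() call, and the `threading` module is assumed here to sit on top of pthreads on POSIX platforms. The helper names are made up for the example, and nothing in it depends on clone() being the mechanism underneath.

```python
import os
import threading


def run_in_child(exit_code: int) -> int:
    """Fork a child that exits with exit_code; return what the parent observes."""
    pid = os.fork()                  # portable POSIX process creation
    if pid == 0:
        os._exit(exit_code)          # child: exit at once, skipping parent cleanup
    _, status = os.waitpid(pid, 0)   # parent: reap the child
    return os.WEXITSTATUS(status)


def run_in_thread(func, *args):
    """Run func(*args) on a separate thread and return its result."""
    result = []
    worker = threading.Thread(target=lambda: result.append(func(*args)))
    worker.start()
    worker.join()
    return result[0]
```

The same source should run unchanged on any system that provides fork(), whatever the kernel does to implement it.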

Anton Bassov

anton_bassov · December 30, 2008, 12:02am

Maxim,

> and there are myth believers that consider these UNIXen to be faster then Windows.

This is because, apparently, unlike yourself, these “myth believers” do have first-hand user experience with these OSes. I can tell you from my own experience that the performance differences are significant even in human perception, and their significance is directly proportional to the amounts of data involved.

Try an experiment. Install Linux on your Windows machine, mount NTFS (in case it does not get mounted automatically) on the partition where Windows is installed, and try to copy folders on this partition. Then reboot, and copy exactly the same folders, but under Windows. Although you would not notice any difference if the amounts of data involved are just a few K, see what happens when larger amounts are involved. If the amounts of data moved are measured in terms of G, the difference will be measured in terms of minutes (and the number will be not-so-small either). Please note that this is the same machine with the same hardware…
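If anyone wants to run the experiment with a stopwatch that does not depend on human perception, a script along these lines runs unchanged on both systems. It is only a sketch: `timed_copy` and `tree_size` are names invented here, and the result will be skewed by whatever the file-system cache already holds, so a fresh boot before each run is assumed.

```python
import os
import shutil
import time


def tree_size(path: str) -> int:
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def timed_copy(src: str, dst: str):
    """Copy the directory tree src to dst; return (seconds, bytes copied)."""
    start = time.perf_counter()
    shutil.copytree(src, dst)        # dst must not exist yet
    elapsed = time.perf_counter() - start
    return elapsed, tree_size(dst)
```

Dividing the byte count by the elapsed time gives a throughput figure that can be compared directly between the two boots.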

Anton Bassov

OSR_Community_User · December 30, 2008, 12:23am

See below…

>> Sure, you can, and then you become a maintainer of the only version of
>> linux that can run your driver. From a business viewpoint, this sucks.
>> In the entire history I have been involved with Unix, it has never
>> made sense to modify the kernel to run something that was intended for
>> distribution beyond the machine so modified. I find the whole “you can
>> change anything” argument fallacious in the extreme. The consequences
>> of these actions are profound. I even watched projects crash and burn
>> because kernel modifications were made that meant that two different
>> programs/drivers could not coexist on the same machine because they
>> required mutually-incompatible changes, and they could not run on
>> unmodified kernels at all.
>
> The only argument that is truly fallacious is the one above - it is, indeed,
> fallacious from the beginning till the end. Let’s say you have written a
> program for Windows. Would it be reasonable for you to expect it to run,
> say, on Mac??? This is exactly the same story - you can think of every UNIX
> flavor as of an OS in its own right (as it is), and, hence, should not
> expect it to be able to run programs written for another OSes.
> This is what you do when you make system calls that are not specified by
> POSIX.

Of all the people in the world, I’m the last one you have to convince of
this. I spent far too much time getting programs to port on “compatible”
versions of Unix, and even port to “non-compatible” versions. But what that
proved was that there is no portability. The consequence of this, from a
business decision, is that if you create your own version of linux, you must
become a linux distributor of a “non-standard” version (whatever THAT
means). This means your product creates two business divisions of your
company: a presumed for-profit division that creates the product, and a
cost-center (negative profit) division to support the infrastructure
operating system you must maintain and distribute. I watched one company go
under (I was their last outstanding invoice: I’d done the Windows port of
the product) because nobody wanted their custom versions of Unix. They had
to support four or five versions. I also worked for an organization that
supported 30 different versions of Unix. POSIX was uninteresting because
all the product dealt with low-level system calls nearly all the time.

> There is no mistake here - all POSIX standards end at the level of C
> library. They don’t specify the system calls that have to be implemented by
> the OS. Therefore, if you call, say, clone() from your program, you should
> not expect it to run on any OS other than Linux. If you want it to be able
> to run on other POSIX-compliant OSes you should call fork() and
> pthread_create(), i.e. POSIX calls that delegate the job to clone() behind
> the scenes, instead of calling clone() directly…

Which doesn’t change the basic premise I made: that talking about linux is
non-productive because it doesn’t help Windows programmers, and most
especially doesn’t help device driver programmers. End of discussion.

–
This message has been scanned for viruses and dangerous content by
MailScanner, and is believed to be clean.

OSR_Community_User · December 30, 2008, 12:37am

So what? I’m not going to install linux. Why should I care in the slightest
that linux is faster? Linux-is-better discussions should be blocked. It is
an inherently uninteresting topic for this discussion group. Take it to the
linux-is-better-than-Windows discussion group.
joe


anton_bassov · December 30, 2008, 1:16am

> Linux-is-better discussions should be blocked.

Yes, but for this or that reason you seem to spend a fair amount of your time on these discussions, trying to convince us that something the first part of the above sentence suggests does not really hold true…

> It is an inherently uninteresting topic for this discussion group.

Well, judging from this thread (please note that I was not involved in it until post 26 while you had participated in it well before I jumped in) one would not immediately arrive at this conclusion, don’t you think…

Anton Bassov

David_J_Craig · December 30, 2008, 2:14am

Yes, but since this is the last of the Linux off topic times I just had to
jump in. It may be very true, but with both Linux and FreeBSD it is very
hard to make good money on either product. Yes, we have Linux guys but they
get stuck doing FreeBSD, Solaris, and whatever else is not Windows and there
are not very many of them compared to the Windows groups. Maybe this will
change, but when it does then I will follow. I do use Linux some, but since
Windows is free to me with MSDN & Technet, I don’t have to worry about my
costs. In fact, I pay more for a retail Suse than I do for Windows.


Maxim_S_Shatskih · December 30, 2008, 4:03am

> experience with these OSes. I can tell you from my own experience is that the performance differences are significant even in human perception

Yes, for instance, the filesystem in FreeBSD is noticeably slower than in Windows XP Home on the same hardware, really so.

> amounts are involved. If the amounts of data moved are measured in terms of G, the difference will be measured in terms of minutes (and the number will be not-so-small either).

Did the same with FreeBSD, it loses big time. Don’t know on Linux, probably it is the same with Linux too.

FreeBSD is not a bad OS, but I would have doubts about it anywhere where performance is important.

–
Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

anton_bassov · December 30, 2008, 9:00am

>with both Linux and FreeBSD it is very hard to make good money

Well, it depends on how you want to make money. Under the (until recently, and, apparently, still) prevalent model when user pays for the application the above statement is 100% true. However, please note that the era of applications that you buy and run on your PC seems to be approaching its end - we seem to be moving to the age of the distributed apps and subscription services. This implies totally different business models will replace currently known ones.

Second, please don’t forget that the software development business is moving well beyond the PC world and onto mobile devices that are tightly coupled, i.e. all drivers are written by the device manufacturer and run under an OS that is embedded in flash. If we take into account that the computing power of these devices approaches that of the PC, and combine it with Gigabit Ethernet (which opens the possibility of having a remote hard disk that performs not much worse than an internal one, let alone an external one), as well as with 802.16 (which opens the possibility of accessing the network anywhere within a range of 10 km in urban areas and up to 100 km in rural ones)… at this point, it becomes pretty obvious which way the wind blows.

Apparently, the future lies with diskless mobile devices that have nothing, apart from the embedded OS, on them, and are able to access the network either directly via Ethernet cable or remotely via 802.16. All their apps and data will be stored on the virtual disk that is physically located on one of their service provider’s machines and is accessible via the network. Therefore, the user will pay for the device and for the service subscription, rather than for applications.

Linux is the OS that seems to be most suitable for the above model on both mobile device’s and server sides…

Anton Bassov

anton_bassov · December 30, 2008, 9:46am

> Yes, for instance, filesystem in FreeBSD is noticeably slower than in Windows XP Home on the same hardware, really so.
> …
> Did the same with FreeBSD, it loses big time.

Well, I don’t know that much about FreeBSD, but, judging from your statement that its “proper”, in your understanding, VFS does not support the concept of inode number, I am hardly surprised. This concept is extremely useful, and, I would say, is of vital importance to any high-performance OS.

Let’s say you open, read/write and close the same file all the time. If you have an inode number that uniquely identifies a given file, all you have to do is ask the FSD to translate a given logical file offset for a given inode to disk sectors, and do it just once per offset (on a per-page basis). From then on, whenever you want to access the file at a given offset, all you have to do is insert a request into the underlying block device’s queue, and do it right from VFS (on read()/write() calls - as long as we speak about a memory-mapped file, this part can be done directly by the memory manager without even going to VFS).
Once inodes and dentries are cached, you don’t have to go to the FSD more than once upon path resolution for a given inode, and you also don’t need the FSD more than once in order to understand that inode X is just a symbolic link to inode Y.
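The translate-once idea can be sketched as a cache keyed by (inode number, page index). This is a toy model only: `BlockMapCache`, the `fsd_lookup` callback and the 4096-byte page size are all assumptions made for the example, and it does not claim to reflect how Linux or any other kernel actually structures its page cache.

```python
class BlockMapCache:
    """Toy model: ask the file system driver for a mapping once per page."""

    def __init__(self, fsd_lookup, page_size: int = 4096):
        self._lookup = fsd_lookup    # hypothetical callback: (inode, page) -> sector
        self._page_size = page_size
        self._map = {}               # (inode, page index) -> first disk sector
        self.fsd_calls = 0           # how often the FSD had to be consulted

    def sector_for(self, inode: int, offset: int) -> int:
        """Return the disk sector backing the page that contains offset."""
        key = (inode, offset // self._page_size)
        if key not in self._map:
            self.fsd_calls += 1      # first touch of this page: go to the FSD
            self._map[key] = self._lookup(inode, key[1])
        return self._map[key]


# A made-up layout: each inode's pages sit 8 sectors apart.
cache = BlockMapCache(lambda inode, page: inode * 1000 + page * 8)
sectors = [cache.sector_for(7, off) for off in (0, 100, 4096, 4100)]
```

Four accesses touch two pages, so the driver is consulted twice; every later access to those pages is a dictionary hit.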

In fact, you can go further than that, and avoid going to the disk on a read operation until either the program that read the file actually accesses the memory the data was supposed to be read into, or someone writes to this file offset. In the latter case you will have to retrieve the data from the disk and back the given page by a swap file, rather than the target one, and you don’t have to do it right on the spot anyway - just pin the newly-written page so that it does not get flushed until the data gets retrieved, and that’s it. In other words, once you have asked the FSD to translate a file offset to disk sectors, at this point it turns solely into the memory manager’s business…

Are you still claiming that the inode number is such an awfully bad idea???

Anton Bassov

Jake_Oshins · December 30, 2008, 11:36am

Max, it’s inlined in 64-bit drivers. It’s just a CR8 update.

–
Jake Oshins
Hyper-V I/O Architect
Windows Kernel Team

This post implies no warranties and confers no rights.

“Maxim S. Shatskih” wrote in message
news:xxxxx@ntdev…
>>Hmmmm… I wonder if it would be as simple as hooking the KeRaiseIrql
>>vector… or is it actually an inline function?
>
> No, it is HAL-dependent and implemented in HAL, and really sets the value
> to TPR on an APIC HAL.

Maxim_S_Shatskih · December 30, 2008, 2:07pm

> Max, it’s inlined in 64-bit drivers. It’s just a CR8 update.

Is CR8 an alias to TPR?

–
Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

Maxim_S_Shatskih · December 30, 2008, 2:08pm

> Linux is the OS that seems to be most suitable for the above model on both mobile device’s and server sides…

Windows CE, Mac OS and Symbian beat Linux on mobile by far.

–
Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

Maxim_S_Shatskih · December 30, 2008, 2:09pm

> Well, I don’t know that much about FreeBSD, but, judging from your statement that its “proper”, in your understanding, VFS does not support the concept of inode number, I am hardly surprised. This concept is extremely useful, and, I would say, is of vital importance to any high-performance OS.

How is VFS related to performance? A good interface does not mean that things are fast.

–
Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

anton_bassov · December 30, 2008, 4:50pm

>How VFS is related to performance?

Think about it yourself…

> Good interface does not mean that things are fast.

Well, first of all, VFS is much more than simply the interface, don’t you think??? It is the whole layer with its own API and functionality. Second, very much depends on the architecture of its interactions with a particular FSD and other system components (i.e. memory manager and block devices)…

> Windows CE, Mac OS and Symbian beat Linux on mobile by far.

I really have no idea how WinCE performs on the mobile phones - AFAIK, its share is just tiny, compared even to Linux (which holds its modest 15%). Symbian is, indeed, the king in the world of mobile phones.
I believe this is directly related to the fact that “mainstream” Linux is not an RTOS (and, unless special build-time settings are used, does not even support kernel-level preemption by default). Therefore, whenever you see it on a mobile you can be pretty sure that this is a special-purpose OS that was just derived from Linux. However, this is still an OS in its own right, with its own scheduling logic that may defer execution to the default Linux scheduler from time to time.

This is why I say Linux is just ideal for this purpose - you can modify and optimize it in any way you wish and use your Linux-derived OS on your device without having to pay anything to anyone. However, Symbian is a proprietary OS, so that the only one who can use it these days is Nokia (IIRC, this year they declared their intention to purchase all shares in the consortium and become sole owners of Symbian OS)…

Anton Bassov

Jake_Oshins · December 30, 2008, 8:00pm

Yes.
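For readers following along, the relationship can be written down as a two-line conversion. This reflects the x86-64 architecture as documented in the processor manuals (CR8 bits 3:0 mirror bits 7:4 of the local APIC TPR); the function names are invented for the example, and treating the CR8 value as the IRQL is stated here as an assumption about x64 Windows rather than something shown in this thread.

```python
def cr8_to_tpr(cr8: int) -> int:
    """TPR value that corresponds to a CR8 priority (CR8[3:0] -> TPR[7:4])."""
    return (cr8 & 0xF) << 4


def tpr_to_cr8(tpr: int) -> int:
    """CR8 priority that corresponds to a TPR value (TPR[7:4] -> CR8[3:0])."""
    return (tpr >> 4) & 0xF


# Assumption: on x64 Windows the IRQL is the CR8 value, e.g. DISPATCH_LEVEL == 2.
dispatch_level_tpr = cr8_to_tpr(2)
```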

–
Jake Oshins
Hyper-V I/O Architect
Windows Kernel Team

This post implies no warranties and confers no rights.

“Maxim S. Shatskih” wrote in message
news:xxxxx@ntdev…
>> Max, it’s inlined in 64-bit drivers. It’s just a CR8 update.
>
> Is CR8 an alias to TPR?