marking a page as not containing memory

I have implemented ‘ballooning’ in my Xen drivers, where a Windows VM
can be asked to give up some memory for use by other VMs. My driver
allocates some memory pages and tells Xen that it is free to use them.
This works fine until a crash dump (and probably a hibernate) happens:
Windows tries to write those pages out to disk, Xen refuses because
there is no actual memory behind them anymore, and the write operation
fails.

I can work around it by detecting this and telling Windows that the
write succeeded even though it failed (only during a crash dump of
course). It leaves some logs in Xen though and is a workaround, not a
solution.
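
A minimal sketch of that kind of workaround, written as a plain C model rather than real driver code. It assumes the balloon driver keeps a bitmap of the page frame numbers it has handed to Xen; the names `balloon_mark`, `dump_write_page` and `backend_write` are hypothetical and do not come from the Xen or Windows APIs. It also differs slightly from the workaround described above: instead of letting the write fail and then reporting success, it skips the request up front, so the backend is never asked to map a page that isn't there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PFN 1024u  /* size of the modelled guest, in pages (arbitrary) */

/* One bit per page frame number (PFN): set while the page is ballooned out. */
static uint8_t ballooned[MAX_PFN / 8];

/* Counts requests that actually reached the (modelled) backend. */
static unsigned backend_requests;

static void balloon_mark(uint32_t pfn)
{
    assert(pfn < MAX_PFN);
    ballooned[pfn / 8] |= (uint8_t)(1u << (pfn % 8));
}

static void balloon_unmark(uint32_t pfn)
{
    assert(pfn < MAX_PFN);
    ballooned[pfn / 8] &= (uint8_t)~(1u << (pfn % 8));
}

static bool balloon_is_out(uint32_t pfn)
{
    return pfn < MAX_PFN && ((ballooned[pfn / 8] >> (pfn % 8)) & 1u) != 0;
}

/* Stand-in for the backend: mapping a ballooned page fails in this model. */
static bool backend_write(uint32_t pfn)
{
    backend_requests++;
    return !balloon_is_out(pfn);
}

typedef enum { WRITE_OK, WRITE_SKIPPED, WRITE_FAILED } write_result;

/* During a crash dump, acknowledge ballooned pages without touching the backend. */
static write_result dump_write_page(uint32_t pfn, bool in_crash_dump)
{
    if (in_crash_dump && balloon_is_out(pfn))
        return WRITE_SKIPPED;
    return backend_write(pfn) ? WRITE_OK : WRITE_FAILED;
}
```

The caller would report `WRITE_SKIPPED` to Windows as success, but only on the crash dump path; normal I/O still sees the real result.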

Is there a way to tell Windows that a page no longer contains memory and
should not be written out during a crash dump or hibernate? I’m guessing
that the answer is going to be no, as there is almost certainly no other
reason why you would want to, but I thought I’d ask…

It may be possible to ask Xen for the memory back when the crash dump
occurs, before writing out the dump file, since the operations are pure
hypercalls and don’t require interrupts or anything else. But it is
always possible that Xen is using the memory and cannot release it,
which would put me back at square one.
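
That reclaim-then-fall-back idea could look roughly like the sketch below. It is a self-contained C model, not driver code: `reclaim_fn` stands in for whatever hypercall would ask Xen to repopulate a page (in a real driver presumably something like `XENMEM_populate_physmap`, which is an assumption here), and `reclaim_with_budget` is a hypothetical stub that simulates Xen running out of spare frames. Pages that cannot be reclaimed stay in the list so the dump path can still treat them specially.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: returns true if the page at pfn was given back. */
typedef bool (*reclaim_fn)(uint32_t pfn, void *ctx);

/*
 * Try to reclaim every ballooned page before the dump is written.
 * Pages that could not be reclaimed are compacted to the front of pfns;
 * the return value is how many of them remain ballooned.
 */
static size_t reclaim_before_dump(uint32_t *pfns, size_t count,
                                  reclaim_fn reclaim, void *ctx)
{
    size_t remaining = 0;
    for (size_t i = 0; i < count; i++) {
        if (!reclaim(pfns[i], ctx))
            pfns[remaining++] = pfns[i];
    }
    return remaining;
}

/* Stub hypervisor: grants requests until its budget of free frames is used up. */
static bool reclaim_with_budget(uint32_t pfn, void *ctx)
{
    size_t *free_frames = ctx;
    (void)pfn;
    if (*free_frames == 0)
        return false;
    (*free_frames)--;
    return true;
}
```

If the function returns a non-zero count, the driver is back at square one for those pages and still needs some other way to handle them.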

Thanks

James

> I can work around it by detecting this and telling Windows that the
> write succeeded even though it failed (only during a crash dump of
> course). It leaves some logs in Xen though and is a workaround, not a
> solution.

Maybe implement a hypercall that maps a single zeroed page at many guest page addresses, so that all of the guest’s ballooned pages are remapped to this one zero page?


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

How is that less complicated than just acking the request to write
garbage? I think your workaround is complete as is.
The balloon artificially removes memory from the guest; by design,
there is nothing of interest in these pages.

Mark Roddy

On Sun, Jan 31, 2010 at 9:11 AM, Maxim S. Shatskih
wrote:
>>I can work around it by detecting this and telling Windows that the
>>write succeeded even though it failed (only during a crash dump of
>>course). It leaves some logs in Xen though and is a workaround, not a
>>solution.
>
> Maybe implement a hypercall which will allocate a single zeroed page as lots of pages, and all guest’s pages will be then remapped to this zero page?
>
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer
>

Presumably, the contents of those pages must not be exposed as they could come from another VM.

- S

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of Mark Roddy
Sent: Sunday, January 31, 2010 8:38 AM
To: Windows System Software Devs Interest List
Subject: Re: [ntdev] marking a page as not containing memory

How is that less complicated than just acking the request to write
garbage? I think your workaround is complete as is.
The balloon artificially removes memory from the guest, by design
there is nothing of interest in these pages.

Mark Roddy


The pages simply aren’t there anymore as far as the virtual machine is
concerned. Reading or writing would be like accessing memory space
with no underlying memory.

The crash dump just steps through each physical page of memory and
asks my driver to write it to disk. In turn my driver asks the Xen
backend driver to write the page to disk, but when the backend driver
tries to map the page into its own memory space, Xen says no because
the page doesn’t really exist anymore, and logs an error.

James

On 01/02/2010, at 5:32, “Skywing” wrote:

> Presumably, the contents of those pages must not be exposed as they
> could come from another VM.
>
> - S

Yes, I meant that is why the physical memory would need to be unmapped from the VM, instead of the physical ‘mapping’ remaining.

- S

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of James Harper
Sent: Sunday, January 31, 2010 12:54 PM
To: Windows System Software Devs Interest List
Subject: Re: [ntdev] marking a page as not containing memory

The pages simply aren’t there anymore as far as the virtual machine is
concerned. Reading or writing would be like accessing memory space
with no underlying memory.

The crash dump just steps through each physical page of memory and
asks my driver to write it to disk. In turn my driver asks the xen
backend driver to write the page to disk, but when the backend driver
tries to map the page into its own memory space xen says no because
the page doesn’t really exist anymore, and logs an error.

James


> Is there a way to tell Windows that a page no longer contains memory

I would say this is a pretty unwise approach, as is allocating pages and making them available to the guest OS in the expectation that it will not touch them…

Apparently, the reason you allocate the pages in a driver is to ensure that they don’t get accessed: as long as your driver “owns” them, the system will not use them for any other purpose, so you can give them to the hypervisor. This is how you expect it to work, and it all works fine and dandy… but only as long as everyone plays by the rules. Now consider what happens if some buggy third-party driver running within a guest VM actually writes to the memory your driver owns: with your approach it may well corrupt another guest VM’s (or even the host’s) memory…

Therefore, a solution based on the bold assumption that a guest VM will not access memory simply because the hypervisor told it not to is potentially unreliable and, hence, flawed in itself…

> I can work around it by detecting this and telling Windows that the write succeeded even though
> it failed (only during a crash dump of course). It leaves some logs in Xen though and is
> a workaround, not a solution.

I think this is the right way to go. IMHO, you should actually do everything in the hypervisor and keep the host blissfully ignorant of the whole thing. If the guest actually tries to access these pages (which it is normally not supposed to do anyway), the whole thing should act the way /dev/null or /dev/zero does on UNIX-like systems…

Anton Bassov

>> Is there a way to tell Windows that a page no longer contains memory
>
> I would say this is pretty unwise approach, as well as allocating
> pages and making them available to the guest OS in an expectation
> that it would not touch them…
>
> Apparently, the reason why you allocate pages in a driver is to
> ensure that they don’t get accessed - as long as your driver “owns”
> them the system would not use them for any other purpose anyway so
> that you give them to hypervisor. This is how you expect it to work,
> and it all works fine and dandy…but only as long as everyone plays by
> the rules. Now consider what happens if some buggy third-party driver
> that runs within in a guest VM actually writes to the memory your
> driver owns - with your approach it may well corrupt another guest
> VM’s ( or even host’s) memory…
>
> Therefore, a solution that is based upon a bold assumption that guest
> VM would not access memory simply because it was told not to do it by
> hypervisor is potentially unreliable, and, hence, is flawed in
> itself…

For all the reasons you specify, Anton, it’s a bit unreasonable to assume
that Xen behaves that way. Once you give the underlying physical page of
memory to Xen, you no longer have access to it. There is no contract
that says you mustn’t access that page: you can try to access it all
you like, but there is just no memory there anymore as far as you are
concerned, just as if you were to access a page beyond the end of
memory. I allocate the pages so that Windows doesn’t try to use them for
something and crash, but there is no way that the memory access could
cause another VM to crash (bugs aside of course :). I’ve never bothered
trying, but writes don’t do anything and reads probably just return
nulls or something.

>> I can work around it by detecting this and telling Windows that the
>> write succeeded even though it failed (only during a crash dump of
>> course). It leaves some logs in Xen though and is a workaround, not
>> a solution.
>
> I think this is the right way to go - IMHO, you should actually do
> everything in hypervisor, and keep the host blissfully ignorant of
> the whole thing. If guest actually tries to access these pages (which
> it is normally not supposed to do anyway), the whole thing should act
> the way as /dev/null or /dev/zero does on UNIX-like systems…

It mostly does. The problem is that Xen logs errors when I ask it to map
that page for access by the ‘backend’ driver to write it to disk. I
don’t want those errors: an error should indicate a problem or a bug
somewhere. If you pollute your error logs with noise, it becomes
hard to tell which entries are real errors and which aren’t.

Someone else suggested mapping a dummy page or adding an additional
mapping to an existing page into the ‘hole’ that is left behind… it’s
extra work but would clean up the holes and make both Windows and Xen
happy, so I’ll investigate that next…
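
As a rough illustration of the dummy-page idea, the C model below treats the guest’s physical-to-machine mapping as a plain array in which a ballooned page is a hole. Everything here is hypothetical (`INVALID_MFN`, `fill_holes`, the array itself); whether Xen will actually let a guest map one machine frame at many guest addresses is exactly what still needs investigating.

```c
#include <stddef.h>
#include <stdint.h>

/* Marker for a guest page with no machine frame behind it (a balloon hole). */
#define INVALID_MFN UINT64_MAX

/*
 * Point every hole in the mapping at the same dummy machine frame.
 * Returns the number of holes that were filled.
 */
static size_t fill_holes(uint64_t *p2m, size_t count, uint64_t dummy_mfn)
{
    size_t filled = 0;
    for (size_t i = 0; i < count; i++) {
        if (p2m[i] == INVALID_MFN) {
            p2m[i] = dummy_mfn;
            filled++;
        }
    }
    return filled;
}
```

With every hole pointing at the same frame, the crash dump would write that frame’s contents for each ballooned page instead of triggering a mapping error.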

James

There is no dom0-side allocation for the balloon memory in the guest;
that is the point of the balloon. The buggy driver will crash the
guest, just as it would on a physical machine.

Mark Roddy

On Sun, Jan 31, 2010 at 10:47 PM, wrote:
>
>> Is there a way to tell Windows that a page no longer contains memory
>
>
> I would say this is pretty unwise approach, as well as allocating pages and making them available to the guest OS in an expectation that it would not touch them…
>
> Apparently, the reason why you allocate pages in a driver is to ensure that they don’t get accessed - as long as your driver “owns” them the system would not use them for any other purpose anyway so that you give them to hypervisor. This is how you expect it to work, and it all works fine and dandy…but only as long as everyone plays by the rules. Now consider what happens if some buggy third-party driver that runs within in a guest VM actually writes to the memory your driver owns - with your approach it may well corrupt another guest VM’s ( or even host’s) memory…
>
> Therefore, a solution that is based upon a bold assumption that guest VM would not access memory simply because it was told not to do it by hypervisor is potentially unreliable, and, hence, is flawed in itself…
>
>> I can work around it by detecting this and telling Windows that the write succeeded even though
>> it failed (only during a crash dump of course). It leaves some logs in Xen though and is
>> a workaround, not a solution.
>
> I think this is the right way to go - IMHO, you should actually do everything in hypervisor, and keep the host
> blissfully ignorant of the whole thing. If guest actually tries to access these pages (which it is normally not supposed to do anyway), the whole thing should act the way as /dev/null or /dev/zero does on UNIX-like systems…
>
>
>
> Anton Bassov