NTDev posting problem?

msr · April 22, 2008, 6:03pm

Does anybody know how to fix this?

I can only start a new thread; I cannot reply to my own thread, because the site shows me as a guest! Right now I have to start a new thread every time just to reply to my own post

(as with the Cross Platform/OS INF issues thread. Haggen, Tim, Don, thanks for the input in that thread).

–thanks

OSR_Community_User · April 22, 2008, 6:24pm

It’s hard to say without knowing which interface you are using (web, news, e-mail), but in any case, I believe that questions/issues of this sort should be directed to one of:

xxxxx@lists.osr.com
xxxxx@osr.com

I don’t know which is preferred; the first is mentioned on the main list page, and the second is in the FAQ.

Speaking of the FAQ, you might want to check it out, although I have no idea whether it covers your specific problem:

http://www.osronline.com/page.cfm?name=ListServerFAQ

Good luck,

mm

anton_bassov · April 22, 2008, 6:27pm

I would say this is the browser’s fault. It has happened to me a few times as well (it happens only with the very first thread you post to; you can post to all other threads without a problem). Are you using Firefox? With Firefox under Fedora, it normally happens when you shut down the machine while the browser is still active and then choose “Restore saved session” rather than “Start new session” when you next start the browser. In my experience, the problem can be solved simply by closing all instances of the browser and starting a new session. Please note that it happens only once in a while; normally you are able to restore the session without any conflict with the OSR site…

Anton Bassov

msr · April 23, 2008, 12:33am

I am using the web interface in IE.
I had to restart the machine (I had closed all IE windows first; not sure why that didn’t work).
Anyway, it works now…

msr · April 23, 2008, 12:38am

(sorry for the spam…)
Damn, even the restart did not work…!!
I looked at the FAQ page… will send the issue to them.

–thanks

msr · April 23, 2008, 1:22am

OK, it works after I also cleared cookies and temporary files in IE.

Peter_Viscarola_OSR · April 23, 2008, 12:56pm

All,

Sorry for the annoyance this problem is causing.

We’re aware of the problem; we’ve seen it ourselves. We’re working (in the limited time we have available) on hunting down the cause.

It SEEMS to be a client-side problem, related to the client caching the page in question. You’ll note that if you switch browsers the problem doesn’t repeat (if you see this problem with IE and you switch to Firefox, the problem goes away, or vice versa).

We THINK the fix is to properly pre-expire the forum pages as they’re served. We’re not sure why this is not presently happening (all of OSR Online is a dynamic site, so the pages are ALWAYS served pre-expired, and what caching is done of unchanged pages – or to reduce repetitive database queries – is done on the server side). The problem is we haven’t had a lot of time to chase this down and get it to work.
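For readers unfamiliar with the term, “pre-expiring” a page usually means sending response headers that tell the browser (and any proxy) not to reuse a stored copy. The sketch below is only an illustration of that idea: the function name `no_cache_headers` is hypothetical, and nothing here is taken from the OSR Online code, whose actual implementation is not shown in this thread.

```python
from email.utils import formatdate


def no_cache_headers(now=None):
    """Build HTTP response headers that mark a page as already expired.

    Hypothetical helper for illustration; not OSR's actual code.
    `now` is a POSIX timestamp (defaults to the current time).
    """
    return {
        # HTTP/1.1 caches: do not store the response, and revalidate
        # with the server before reusing anything already stored.
        "Cache-Control": "no-store, no-cache, must-revalidate, max-age=0",
        # HTTP/1.0 caches and older browsers.
        "Pragma": "no-cache",
        # An invalid date such as "0" is treated as "already expired".
        "Expires": "0",
        # Timestamp of the response in the standard HTTP date format.
        "Date": formatdate(timeval=now, usegmt=True),
    }


if __name__ == "__main__":
    for name, value in no_cache_headers().items():
        print(f"{name}: {value}")
```

Sending both the HTTP/1.1 and HTTP/1.0 variants is a common belt-and-braces choice when different browsers behave differently, as described above.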

I apologize again for the problem. We ARE working on it, and hope to have a solution in place soon.

Peter
OSR

anton_bassov · April 23, 2008, 5:57pm

Peter,

Do you remember a long thread a few days ago where Martin complained about improper rendering after around 110 posts? The funniest thing here is that after around the 150th post the thread started getting rendered properly again. The same story here: in my experience, the problem occurs only upon the very first post, but all subsequent ones are successful.

In other words, when it comes to self-healing and recovery, your server is pretty much like the Linux kernel (if it were like an NT-based one, apparently, one would have to set up a new user account after being unrecognized once)…

Anton Bassov

Don_Burn_1 · April 23, 2008, 6:23pm

Anton,

What Linux kernel heals itself? Every one I’ve seen has had stupid
things like allocating memory then dereferencing the pointer without a check,
so it crashes when low on memory. This was explained to me by some
“experts” as making it easier to know where the problem was, versus trying
to exit gracefully. As one client who is primarily Linux (but had worked on
a number of other OSes) said, “Linux isn’t bad for a sophomore project, but I
would flunk a grad student.”

–
Don Burn (MVP, Windows DDK)
Windows 2k/XP/2k3 Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr
Remove StopSpam to reply

anton_bassov · April 23, 2008, 7:14pm

Don,

> What Linux kernel heals itself?

I tried to crash it on purpose, but somehow failed. No matter what I did, it seems to recover from everything, including even invalid writes to control registers. Certainly, it is arguable whether the very idea of recovering from kernel-level faults is good in itself, but the Linux guys are in a very different position compared to MSFT. They can review all the source that comes with the plain vanilla kernel, and if a vendor decides to modify it and introduces some buggy driver… well, then this is just the vendor’s fault. No wonder the very idea of binary-only kernel modules is so unpopular in the Linux community; otherwise, they may get blamed for a buggy third-party driver that crashed their plain vanilla kernel, although it was not written by them. MSFT chose a different approach to this issue: they just introduced driver signing…

> As one client who is primarily Linux (but had worked on a number of other OSes) said, “Linux isn’t bad for a sophomore project, but I would flunk a grad student”

Yes, but somehow this client still chose to run Linux…

In terms of architecture… well, although I do partly agree with you here, please don’t forget that Linux is constantly evolving: no more “bottom halves”, but tasklets and kernel threads that are equivalent to DPCs and work items; kernel-level pre-emption got introduced; the device model is getting more centralized; etc.

BTW, according to Mr. Ballmer, it would cost tens of thousands of dollars per installation if it were a proprietary OS. In other words, it does not seem to be as awfully bad as your client claims (although it still has to be improved)…

Anton Bassov

Tim_Roberts · April 23, 2008, 7:28pm

xxxxx@hotmail.com wrote:

> Don,
>
> > What Linux kernel heals itself?
>
> I tried to crash it on purpose, but somehow failed. No matter what I did, it seems to recover from everything, including even invalid writes to control registers. …

Guys, I would very much like to stop this thread right now and avoid
getting into another Linux discussion of any kind. The evidence
suggests that the members of this mailing list have infinitely more wild
opinions on Linux than real facts, and all we are going to do is spread
libelous hearsay and misinformation.

–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

anton_bassov · April 23, 2008, 7:33pm

Tim,

> Guys, I would very much like to stop this thread right now

Actually, this is the very first thing that came into my head right after I had made my previous post. After all, Peter has already threatened to ban me from the list, and on this particular occasion it was, indeed, me who started the discussion. Therefore, I think it is better for me not to push my luck…

Anton Bassov

Peter_Viscarola_OSR · April 24, 2008, 10:51am

I know there are “self diagnosing” operating systems, that detect system problems and restart so quickly the user (in most cases) is not even aware there was a total failure (apparently, commonly used in things like telephone handsets). But I’ve never heard of a “self healing” operating system.

I’ve heard of “self healing rubber” (google for more, if interested)… And once or twice I’ve had driver problems apparently spontaneously fix themselves (“it USED to blue screen every time! Now, it’s working fine… so I guess we get to ship it, huh?? NO???”).

Never mind,

Peter
OSR

Peter_Viscarola_OSR · April 24, 2008, 2:22pm

Update regarding the NTDEV posting problem via our web interface:

Just a few minutes ago, we added additional cache control headers to the list and thread pages to attempt to circumvent (er, FIX) the problem that some folks are having with the web interface to NTDEV.

If you get the “you are not logged in” messages after properly logging in, please refresh your browser window. If you STILL experience the problem (one page tells you that you ARE logged in, but another says you’re not) please reply/follow-up to this message.

We’re TRYING to get this issue fixed. It’s annoying lots of people, including me.

Peter
List Slave and (now) Browser Boy

anton_bassov · April 24, 2008, 3:07pm

Peter,

> But I’ve never heard of a “self healing” operating system.

By a “self-healing” system I meant one that catches the exception, aborts the call, fails all subsequent calls to the failing module, logs the error, and goes about its business as if nothing had happened. This is the kind of behavior I had a chance to observe under Linux. Whenever I tried to crash it, it just dumped error messages all over the terminal, and that was it: the system kept on running as if nothing had happened.

Certainly, I am unable to unload the failing module, because the process that caused the error is put into an infinite non-interruptible sleep rather than being terminated (it is understandable that I cannot kill it; you cannot terminate a process in such a state, can you?). Therefore, from rmmod’s perspective, the module is still in use, so it stays loaded, and hence I cannot load a new instance of it, because it is already loaded. In fact, it looks like a pretty intelligent way to ensure that the buggy module is not going to cause any more trouble, don’t you think?

Please note that GPF-style exceptions are raised as faults, i.e. the exception gets raised BEFORE the culprit instruction actually does anything. Therefore, as long as you block all calls to the failing module and ensure that it is not going to cause any more trouble, it is arguable whether you should kill the entire OS.

> …that detect system problems and restart

It is even more arguable whether you should restart the module that is already known to be problematic…

Anton Bassov

Peter_Viscarola_OSR · April 24, 2008, 3:22pm

In fact, there’s quite a bit of discussion about this in the industry.

Apparently (this is one of my favorite stories, but it may be apocryphal) there was a specific model of cell-phone handset that was so good at restarting after an error without the user knowing it that, in spite of the fact that most units produced were so buggy that they couldn’t run for more than a few minutes at a time… NOBODY noticed. It actually took the manufacturer several months to become aware of the problem in the field.

Peter
OSR

anton_bassov · April 24, 2008, 3:31pm

Peter,

What kind of OS was it using? If it was using some microkernel-based OS that keeps most of its drivers in non-privileged mode, then, indeed, it does make perfect sense to restart the failing driver (after all, this is what the very idea of a microkernel is about). If you are able to restart the entire OS so quickly that no one even notices it, that is perfectly fine as well. However, restarting a kernel-level module that you already know is problematic, without restarting the whole OS, is probably not so wise…

Anton Bassov

Michal_Vodicka-2 · April 24, 2008, 4:15pm

> -----Original Message-----
> From: xxxxx@lists.osr.com
> [mailto:xxxxx@lists.osr.com] On Behalf Of
> xxxxx@osr.com
> Sent: Thursday, April 24, 2008 9:21 PM
> To: Windows System Software Devs Interest List
> Subject: RE:[ntdev] NTDev Posting problem ?
>
> Apparently (this is one of my favorite stories, but it may be
> apocryphal) there was a specific model of cell-phone handset
> that was so good at restarting after an error without the
> user knowing it that, in spite of the fact that most units
> produced were so buggy that they couldn’t run for more than a
> few minutes at a time… NOBODY noticed. It actually took
> the manufacturer several months to become aware of the
> problem in the field.

It doesn’t look sooo bad to me. Maybe bad for developers who aren’t
informed about problems immediately, but for users it is better if they
don’t notice. This is exactly what I was complaining about some time
ago. In NT I have the impression that bugchecks are sometimes used as a
debugging tool. Relatively unimportant problems related to relatively
unimportant devices cause a BSOD, which gives developers a good chance to
debug the problem but destroys all the unsaved work, which can be much
more important for the user at that moment.

Sorry for the OT, I couldn’t resist.

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]

Don_Burn_1 · April 24, 2008, 4:28pm

wrote in message news:xxxxx@ntdev…
> Peter,
>
> What kind of OS was it using? If it was using some microkernel-based OS
> that keeps most of its drivers in non-privileged mode, then, indeed, does
> make a perfect sense to restart the failing driver(after all, this is
> what the very idea of microkernel is about). If you are able to restart
> the entire OS so quickly that no one even notices it, it is perfectly fine
> as well. However, restarting the kernel-level module that you already know
> is problematic without restarting the whole OS is, probably, already not
> so wise…
>
Knowing that a module is problematic is actually the hardest part. While
trying to sell fault-tolerant technology over the last 15 years, I’ve had
every firm I’ve talked to ask for the trick to make things detectable.
The problem is they do not want to hear that the “trick” is a lot of work:
verifying all data structures, keeping duplicates of important ones, and in
general rigorously reviewing and checking the code in the system.
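
As a rough illustration of “verification of data structures, and duplicates of important ones”, here is a minimal user-mode sketch, not taken from any product mentioned in this thread. The class name `GuardedRecord` and the choice of CRC-32 are assumptions made for the example; a real fault-tolerant system would apply the same idea to its own in-memory structures.

```python
import zlib


class GuardedRecord:
    """Hypothetical record kept with a checksum and a duplicate copy."""

    def __init__(self, payload: bytes):
        self.primary = bytearray(payload)    # working copy
        self.shadow = bytes(payload)         # duplicate of the important data
        self.checksum = zlib.crc32(payload)  # recorded at creation time

    def verify(self) -> bool:
        """Return True if the working copy still matches its checksum."""
        return zlib.crc32(bytes(self.primary)) == self.checksum

    def read(self) -> bytes:
        """Return the payload, repairing it from the duplicate if needed."""
        if self.verify():
            return bytes(self.primary)
        if zlib.crc32(self.shadow) == self.checksum:
            # Working copy is corrupt but the duplicate is intact: repair.
            self.primary = bytearray(self.shadow)
            return self.shadow
        raise ValueError("both copies failed verification")


if __name__ == "__main__":
    record = GuardedRecord(b"important state")
    record.primary[0] ^= 0xFF   # simulate a stray write
    print(record.verify())      # corruption is detected
    print(record.read())        # data is recovered from the duplicate
```

The point of the sketch is the cost: every important structure needs a check on each access and a second copy to fall back on, which is the “lot of work” rather than a trick.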

–
Don Burn (MVP, Windows DDK)
Windows 2k/XP/2k3 Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr
Remove StopSpam to reply

anton_bassov · April 24, 2008, 6:34pm

> Knowing that a module is problematic is actually the hardest part.

Well, a problematic module is not yet problematic before it starts causing trouble. Therefore, I am afraid you can realize it only after the fact, i.e. AFTER it has already caused the exception (or at least passed you a wrong parameter, in case you managed to catch it). At this point it becomes questionable whether it should get restarted (which means that it keeps on causing trouble). Taking it out of play seems to be a better idea.

Let’s look at things the following way. The OS was running somehow before the target module and all of its clients got loaded, right? Therefore, the OS itself will be able to run if this module is gone. As long as there is at least some system wrapper between drivers so that they don’t call one another directly, you can always block all subsequent calls to it, report all outstanding ones as failures to their callers, and see what happens next. Certainly, you may be barking up the wrong tree, because, in actuality, the culprit may be a different module that just corrupted the “good” module’s data and/or executable code, but, in any case, it seems to be worth trying…
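
The “wrapper between drivers” idea can be sketched in a few lines of user-mode Python. This is purely illustrative: `Dispatcher` and `ModuleQuarantined` are hypothetical names, a Python exception stands in for a kernel-level fault, and the sketch does nothing about the caveat above that a different module may have corrupted the one that failed.

```python
class ModuleQuarantined(Exception):
    """Raised when a call targets a module that has been taken out of play."""


class Dispatcher:
    """Hypothetical wrapper that sits between callers and modules."""

    def __init__(self):
        self._modules = {}         # module name -> entry point
        self._quarantined = set()  # modules that have already faulted
        self.log = []              # (module name, error) records

    def register(self, name, entry_point):
        self._modules[name] = entry_point

    def call(self, name, *args):
        # Fail every call to a module that has already caused a fault.
        if name in self._quarantined:
            raise ModuleQuarantined(name)
        try:
            return self._modules[name](*args)
        except Exception as exc:
            # First fault: log it, quarantine the module, and report
            # the failure to the caller instead of bringing everything down.
            self._quarantined.add(name)
            self.log.append((name, repr(exc)))
            raise ModuleQuarantined(name) from exc


if __name__ == "__main__":
    dispatcher = Dispatcher()
    dispatcher.register("good", lambda x: x + 1)
    dispatcher.register("buggy", lambda x: x // 0)
    for target in ("buggy", "buggy", "good"):
        try:
            print(target, dispatcher.call(target, 1))
        except ModuleQuarantined as exc:
            print("call failed, module quarantined:", exc)
```

Note that the faulting module is never restarted: after the first failure it is simply never entered again, while calls to other modules continue to work.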

Anton Bassov