Windows System Software -- Consulting, Training, Development -- Unique Expertise, Guaranteed Results

Before Posting...
Please check out the Community Guidelines in the Announcements and Administration Category.

High IRQLs for filter developers

0xrepnz0xrepnz Member Posts: 16

Hey,
I'm not a device driver developer - Mostly doing filter development - Is there any situation you think I will encounter:

  • running code in > DISPATCH_LEVEL?
  • Using KeInsertQueueDpc?
  • Using IPI?

As far as I know:

1) Only ISRs run in > DISPATCH_LEVEL so it's not relevant for filter developers unless some callback is called in an ISR - I don't think any callback is called at this IRQL because it's a high IRQL...
2) Completion routines can only be called in <= DISPATCH_LEVEL (Because you cannot complete an IRP in > DISPATCH_LEVEL..)
3) The only place I think a DPC can be useful for me is for timers... (KeSetTimer)

Also, I wonder:

  • Why can't I queue a work item in > DISPATCH_LEVEL? Sounds very weird that to perform PASSIVE_LEVEL operations I have to go through a DPC.. (Ignoring ThreadedDpc because it can be disabled globally and the DPC will be called in DISPATCH_LEVEL)
  • Why can't I call KeInsertQueue in > DISPATCH_LEVEL? Is there any synchronization object that is allowed in DISPATCH_LEVEL and not allowed in DIRQL?

Most of these questions are probably not that relevant for normal filtering but I'm curious anyway...

Thanks!:)

Comments

  • Peter_Viscarola_(OSR)Peter_Viscarola_(OSR) Administrator Posts: 7,583

    Your analysis is correct. In general, IRQLs > DISPATCH_LEVEL are restricted to device drivers. I can’t think of any case where a filter driver developer would encounter IRQLs > DISPATCH_LEVEL.

    You can’t queue work items or DPCs at IRQLs > DISPATCH_LEVEL because the operations and/or queues that Windows uses for these features are protected by DISPATCH_LEVEL locks (or the equivalent).

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • anton_bassovanton_bassov Member Posts: 5,092

    I'm not a device driver developer - Mostly doing filter development - Is there any situation you think I will encounter:

    running code in > DISPATCH_LEVEL?
    Using KeInsertQueueDpc?
    Using IPI?

    Out of all the above, only KeInsertQueueDpc() may THEORETICALLY apply to filter drivers. In practical terms, I hardly imagine why
    someone would want to use it in any "proper" driver that does not actually handle interrupts. However, in case if you are just desperate to use it in order to implement some"hack", you can do it in any context.

    For example, IIRC, one of the infamous DRM schemes that was used around 15 years ago (I think it was called StarForce or something similar to that) was filtering a CD drive. What it was actually doing was re-queuing a DPC to the head of the queue over and over again
    ( in order to work around 100us limitation required for driver certification), effectively freezing the system ( UP systems were prevalent at the time) for indefinite periods of time so that the game could not get copied.

    Running at DIRQL applies only to ISRs and to the code that synchronises with them, so that this part is reserved only for certain types of hardware devices (normally only PCI and PCIe ones). Concerning IPIs, Windows drivers are not supposed to send the ones.
    This is done (or at lest was done at the time when I was writing Windows drivers 12 years ago) by, IIRC, HalRequestSoftwareInterrupt(), but this function was undocumented and was not supposed to be called by anyone, apart from the kernel. AFAIK, nothing had changed in this respect since.

    Is there any synchronization object that is allowed in DISPATCH_LEVEL and not allowed in DIRQL?

    Of course...

    For example, you cannot use "regular" DISPATCH_LEVEL spinlocks at DIRQL. In order to understand why, consider what happens if a CPU acquires a spinlock at DISPATCH_LEVEL, then gets interrupted while still holding it, and ISR tries to acquire the same spinlock again. Once you cannot acquire spinlocks recursively, a deadlock is guaranteed in this situation.

    This is why you cannot use "regular" DISPATCH_LEVEL spinlocks at DIRQL. In order to synchronise with ISR you need a special interrupt spinlock that elevates IRQL to DIRL, rather than to DISPATCH_LEVEL.

    Why can't I queue a work item in > DISPATCH_LEVEL?

    Apparently, queuing a workitem involves a "regular" spinlock acquisition - after all, workitem queue must be a global one, and,hence,
    must be accessible to multiple CPUs at once. Therefore, you need to get a lock before you can add new items to it . You already know why you cannot use "regular" spinlocks at IRQL > DISPATCH_LEVEL, right. This is the reason why you cannot queue a workitem
    at IRQL > DISPATCH_LEVEL

    Why can't I call KeInsertQueue in > DISPATCH_LEVEL?

    Well, I don't really know what you mean by "KeInsertQueue", but assuming that you mean KeInsertQueueDpc(), I can assure you that this
    function may be used at IRQL > DISPATCH_LEVEL without a slightest problem. Otherwise, how would you queue a DPC, in the first place???

    https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-keinsertqueuedpc

    This function inserts a DPC into the queue that is local to the caller CPU, and, hence,does not require locking - simply disabling interrupts on the CPU may suffice here. Therefore, this function does not have IRQL-related requirements.....

    Anton Bassov

  • 0xrepnz0xrepnz Member Posts: 16
    edited December 2019

    Thank you both for the comprehensive answer!
    Anton - KeInsertQueue: https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-keinsertqueue - this function probably uses the "regular" spinlocks you spoke about to synchronize and that's why it cannot be used in DIRQL.

    about the work items - theoretically, they could implement lock-free queues with Interlocked functions but I guess it's harder and the OS was not designed this way...

  • Peter_Viscarola_(OSR)Peter_Viscarola_(OSR) Administrator Posts: 7,583

    Queuing work items from the ISR: Yup... you always need to step-down from ISR to DPC, and then from DPC to a scheduled work item. In fact, if you look at WdfInterruptQueueWorkItemForIsr (which is designed to be a handy way to allow WDF devs to queue work items directly from the ISR, just as you describe) you'll see that it queues a DPC and the DPC then queues the work item.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • anton_bassovanton_bassov Member Posts: 5,092

    about the work items - theoretically, they could implement lock-free queues with Interlocked functions

    Please note that a lock-free queue implemented with InterlockedCompareExchange() and friends is, in actuality, not necessarily
    as "wonderful" solution as it may seem to be at the first glance. For example, consider a busy queue with insertions and removal getting performed by multiple CPUs all the time. By the virtue of bus locking that interlocked functions imply, only one of them may be successful
    at any particular moment, right. Therefore, all other contenders will have to repeat their attempts, and do it in a loop until eventually succeeding.

    In terms of bus pressure, this is, apparently, much worse than a "classical" test-and-set-based spinlock with its "thundering herd"
    scenario, which may occur when a current owner releases a lock. At this point all the contenders break out of their inner loops,
    and attempt an interlocked test-and-set all at once, although only one of them may succeed at the time. This is why it is called
    a "thundering herd" scenario, which puts an unnecessary pressure on the system bus.

    In case of test-and-set-based implementation of a spinlock, the above mentioned "thundering herd" scenario arises only in spikes,
    i.e. may occur only when a current owner releases a lock. The system bus may "take a breath and relax" while the spinlock is being held.
    However, in case of a lock-free queue there is no chance of a respite for a bus - what you are getting here is a tight polling loop
    of interlocked operations, with all the pressure on the system bus that it implies.

    In fact, if you look at the whole thing from the system bus's perspective, what you are going to see in the place of this fancy " lock-free queue" is just the most naive implementation of spinlock acquisition ever known,

    [enter ironical mode]

    although Mr.Kyler believes this is the only proper spinlock implementation in existence.

    https://community.osr.com/discussion/comment/122444#Comment_122444

                        [begin quote]
    
                         Everything on this thread has been worthless. Everyone knows the correct implementation of a spinlock is:
    
                         SPIN: BBSSI #0,LOCK,SPIN
    
                         - Dan.
    
                          [end quote]
    

    [leave ironical mode]

    Certainly, I am not saying that a lock-free queue is necessarily worse than a spinlock - surely it has its own advantages.
    Unlike a spinlock, it does not have an owner, and, hence, does not need to get released. Therefore, even if you get preempted after a successful operation on a queue you are not going to block anything. Furthermore, once an operation on it is atomic, it has no chance of getting into an inconsistent state due to interrupt or preemption. As a result, there is no need to subject these operations to any IRQL-related constraints.

    The only thing that I am saying is that these algorithms should be used judiciously, because they may do more harm than good
    under some workloads, at least if implemented naively. Therefore, everything should be judged on case-by-case basis. One has to bear in mind that any particular interlocked operation per se is not THAT expensive., at least as long as it gets performed occasionally. The real overhead comes in only when you start performing them in a loop. Therefore, the main issue here is with a loop. As long as you can find a way to avoid it, everything is going to be just fine. For example, when you are in a position to do something useful in between your attempted interlocked ops, a lock-free queue seems to be, indeed, quite an attractive option

    However, taking into consideration the specifics of a workitem queue, I would say that it is, apparently, better off with being protected by a spinlock.

    Anton Bassov

  • Peter_Viscarola_(OSR)Peter_Viscarola_(OSR) Administrator Posts: 7,583

    better off with being protected by a spinlock.

    Which is, incidentally, almost certainly an in-stack queued spinlock. But I really don’t recall, and absent a specific request or reason, haven’t looked.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • anton_bassovanton_bassov Member Posts: 5,092

    Which is, incidentally, almost certainly an in-stack queued spinlock.

    Most likely yes,especially if we recall that in-stack queued spinlock (i.e. XCHG-based one) is an optimisation that eliminates a "thundering herd" issue that a "classical" test-and-set-based spinlock is vulnerable to.

    Again, it does not necessarily imply that a in-stack queued spinlock is always better than a "classical" one. Again, it depends on whether you have something useful to do at the moment. If you do, you may KeTryToAcquireSpinLock() . This approach may be possible only with a "classical" spinlock that is built around test-and-set. Both XCHG-based in-stack queued spinlock under Windows and XADD-based ticket lock under Linux disallow this possibility - once you have taken your place in a queue, you already cannot, for understandable reasons, leave it until you have actually acquired a spinlock. Otherwise, you are going to screw up the things for the other contenders, as well as for a lock owner.

    Strange enough, the participants of the thread below mostly did not find any use to it - as it usually happened, Alberto was the only one who had the potential to see the various possibilities and to think "out of the box"

    https://community.osr.com/discussion/72984.

    To make it even more interesting, this function seems to have been removed from the newer OS releases, and is not mentioned anywhere in the documentation any longer. ......

    In any case, taking into consideration the specifics of a workitem queue, an in-stack queued spinlock seems to be more suitable for the purpose, compared to a "classical" one.....

    Anton Bassov

  • 0xrepnz0xrepnz Member Posts: 16
    edited December 2019

    Certainly, I am not saying that a lock-free queue is necessarily worse than a spinlock - surely it has its own advantages.

    Thank you for your answer, I'll take it into consideration if I'll do such a thing;

    Moreover, I looked at KeInsertQueueDpc and I saw that it actually raises the IRQL to HIGH_LEVEL - Is it because DPCs can be queued at any IRQL?

    So I just want to verify I got it correctly: you have to be at least the same IRQL as the code you want to synchronize with - for example if you have an interrupt handler that runs at IRQL 6 and you want to synchronize with it you have to raise the current IRQL to 6 to do so and interrupt spin locks in windows allows the caller to automatically increment to IRQL to the IRQL of the ISR
    Also sounds like spin locks (or other locking mechanisms) are not handy when the the code that synchronizes always runs on the same CPU because when you raise the IRQL no other code on the same CPU can preempt your code.

    If a device sends an interrupt while the IRQL on the CPU is high, does the interrupt gets into a "pending" state until the IRQL lowers?
    I saw KeInsertQueueDpc calls HalSendSoftwareInterrupt(DISPATCH_LEVEL) - as far as I know this is used to run all the pending DPCs in the queue of the CPU - if the IRQL was high when KeInsertQueueDpc called HalSendSoftwareInterrupt(DISPATCH_LEVEL), the interrupt is pending until the IRQL lowers below DISPATCH_LEVEL then the DPC queue is flushed, right?

  • Peter_Viscarola_(OSR)Peter_Viscarola_(OSR) Administrator Posts: 7,583

    you have to be at least the same IRQL as the code you want to synchronize with

    Hmmmm... yes. Spinlocks on Windows have an IRQL that is implicitly associated with them. This must be at least IRQL DISPATCH_LEVEL and is the highest IRQL from which they may ever be acquired. "Normal" Executive Spinlocks use IRQL DISPATCH_LEVEL. The Interrupt Spin Lock (one per KINTERRUPT) uses the Device IRQL (the IRQL at which the device interrupts and at which the ISR runs).

    So, acquiring a Spin Lock in WIndows has two steps:

    1) The IRQL of the processor is raised to the Spin Lock's IRQL. This, as I said before, is at least IRQL DISPATCH_LEVEL. This (as you noted) disables dispatching, causes the current thread to run until it lowers its IRQL back below DISPATCH_LEVEL, and is sufficient to lock THE CURRENT PROCESSOR.

    2) The thread attempts to acquire the Spin Lock (using an Interlocked Bit Test and Set operation... by implementation if not by architecture). The thread does not return from the attempt to acquire the Spin Lock until the lock is held.

    Also sounds like spin locks (or other locking mechanisms) are not handy when the the code that synchronizes always runs on the same CPU

    Hmmmm... Yes? I guess? But, that's a pretty odd circumstance, isn't it? I mean, I've got a system with N processors... why would my code be restricted to running on only one specific processor. Please don't say "locality of reference" or "caching"... because (remember) Windows treats multiple threaded processors ("hyperthreading") as discrete logical processors... but also understands the physical topology.

    If a device sends an interrupt while the IRQL on the CPU is high, does the interrupt gets into a "pending" state until the IRQL lowers?

    Yes.

    as far as I know this is used to run all the pending DPCs in the queue of the CPU

    Well, not necessarily. It can also be used to call the Dispatcher to perform a scheduling operation.

    the interrupt is pending until the IRQL lowers below DISPATCH_LEVEL then the DPC queue is flushed, right?

    Yes.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • anton_bassovanton_bassov Member Posts: 5,092

    Also sounds like spin locks (or other locking mechanisms) are not handy when the the code that synchronizes
    always runs on the same CPU because when you raise the IRQL no other code
    on the same CPU can preempt your code.

    Basically, raising IRQL is just a way of synchronising with the code that is guaranteed to run on the same logical CPU. As long as all your contenders are guaranteed to run on the same logical CPU, there is, indeed, no need for spinlocks - simply raising IRQL may suffice here.
    This is the reason why UP HAL's implementations of KeAcquireSpinlock() and KeAcquireSpinlockAtDpcLevel() boil down to respectively KeRaiseIrql (DISPATHCH_LEVEL) and NOP. However, this approach is very obviously insufficient if your contenders may be
    running on the other logical CPUs, so that you need a "full-fledged" spinlock in this case.

    This is why KeRaiseIrql() and friends are of very little use to general-purpose driver writers these days, and may only apply to some
    "special-needs" drivers that limit the affinities of their threads and DPCs to specific CPUs, cores and threads
    (like, for example, the ones that try to (mis)use Windows NT as a soft RTOS)

    Moreover, I looked at KeInsertQueueDpc and I saw that it actually raises the IRQL to HIGH_LEVEL - Is it
    because DPCs can be queued at any IRQL?

    So I just want to verify I got it correctly: you have to be at least the same IRQL as the code you want to
    synchronize with - for example if you have an interrupt handler that runs at IRQL 6 and you want to synchronize
    with it you have to raise the current IRQL to 6 to do so and interrupt spin locks in windows allows the caller to
    automatically increment to IRQL to the IRQL of the ISR

    Basically, there are 2 reasons why you may want to raise IRQL

    1. You want to ensure the atomicity of the operation, in a sense of disallowing any external updates to
      the variables that it may access, until the target operation completes. Your question concerning KeInsertQueueDpc()
      is very obviously about this scenario. Consider what happens if an update to a DPC queue gets interrupted, and an interrupt handler
      decides to update it as well. It is obvious that a DPC queue may be left in inconsistent state in this case. Therefore, in order to avoid this "unfortunate scenario", KeInsertQueueDpc() raises IRQL to HIGH_LEVEL, effectively disabling interrupts on a given CPU,
      until update is complete

    2. You are holding a synch construct that cannot get acquired recursively. This is why the system has to raise IRQL
      to the highest one of all the possible contenders, before attempting an interlocked op that a spinlock is built on top of. Otherwise,
      a recursive acquisition attempt may potentially be made if your code gets interrupted and interrupt handler tries to acquire the same spinlock, effectively deadlocking the CPU until the reboot.

    If a device sends an interrupt while the IRQL on the CPU is high, does the interrupt gets into a "pending" state until the IRQL lowers?

    Yes - the OS may rely upon the hardware-provided features in order to ensure this part...

    I saw KeInsertQueueDpc calls HalSendSoftwareInterrupt(DISPATCH_LEVEL) - as far as I know this is used to run
    all the pending DPCs in the queue of the CPU

    Well, this is just a request for a scheduling interrupt on the target CPU.OTOH, in practical terms, a scheduling interrupt handler, indeed,
    flushes a DPC queue in case if it not empty

    • if the IRQL was high when KeInsertQueueDpc called HalSendSoftwareInterrupt(DISPATCH_LEVEL), the interrupt is pending
      until the IRQL lowers below DISPATCH_LEVEL then the DPC queue is flushed, right?

    Well, probably, it is better to say " is about to go below DISPATCH_LEVEL", at least as long as our target OS version provides a purely software-based implementation of the IRQL concept. OTOH, it really depends on the OS and HAL versions. For example, if our target OS/HAL version relies upon TPR for the purpose, a pending DISPATCH_LEVEL interrupt will, indeed, fire immediately after the OS lowers IRQL below DISPATCH_LEVEL by means of writing to TPR.

    In any case, this is just a minor and insignificant detail - I am just trying to be pedantic
    (and, apparently, to show off as well :) )

    Anton Bassov

  • MBond2MBond2 Member Posts: 19

    anton's comments are esentially obsolete as there have been no UP systems in a long time. he is not wronf, but answers about how a PDP8 works might be as useful

  • anton_bassovanton_bassov Member Posts: 5,092

    anton's comments are esentially obsolete as there have been no UP systems in a long time.

    Actually, I was just answering the OP's question concerning synchronisation on the same CPU (namely, whether
    KeAcquireSpinlock() and KeRaiseIrql (DISPATHCH_LEVEL) are equivalent in this case), so that the example of the now obsolete
    UP HAL seemed (at least to me) right for the purpose. After all, this was a purely conceptual question and not the one about
    the details of any particular implementation. Never mind...

    PS. Once we are at it, are you 100% sure there are no more single-core CPUs produced? If my memory serves me well, I think I saw
    a low-end Windows laptop with a single-core Celeron in a shop as recently as around a year ago

    answers about how a PDP8 works might be as useful

    You just made my day, Marion....

    The funniest thing here is that, as long as we are speaking about the concept of IRQL, the examples of the ancient hardware like PDP11 may, indeed, be more than useful. The problem is that the very concept of the OS prioritising hardware interrupts to one another does not really make sense under any architecture, other than PDP11. AFAIK, one of the peculiar features specific to PDP11
    was a non-uniform cost of accessing hardware devices, which depended on the particular device's location on the bus.

    Therefore, the very concepts of IRQL (under VMS) and spl() (under UNIX) were developed for the sole purpose of addressing
    this peculiarity of PDP11. Windows NT just borrowed the concept of IRQL from VMS. In actuality, IRQL and spl() don't really make sense anywhere, apart from PDP11. Although you may still see spl() under the most recent versions of NetBSD and OpenBSD occasionally, the only reason for this is that no one was really bothered to clean it up. FreeBSD had replaced it with the spin mutexes ages ago, and Solaris/Illumos does not seem to be using it for anything, apart from disabling and enabling interrupts on the CPU.

    In other words, the very concept of IRQL was completely obsolete at the time Windows NT was designed
    (OK, I shut up now - otherwise, "The Hanging Judge" may get upset, and I had used up my "lucky coin" more than a year ago)

    Anton Bassov

  • Peter_Viscarola_(OSR)Peter_Viscarola_(OSR) Administrator Posts: 7,583

    one of the peculiar features specific to PDP11 was a non-uniform cost of accessing hardware devices

    I am not aware of this “feature” and it doesn’t make sense relative to what I know about the hardware architecture and OS architectures... as someone who worked at Digital teaching people how to write device drivers for multiple PDP-11 operating systems, and who wrote a lot of PDP-11 drivers myself, I think I would have heard of this.

    Therefore, the very concepts of IRQL (under VMS) and spl() (under UNIX) were developed for the sole purpose of addressing this peculiarity of PDP11. Windows NT just borrowed the concept of IRQL from VMS. In actuality, IRQL and spl() don't really make sense anywhere, apart from PDP11

    Yeah... no.

    The idea of relative interrupt prioritization was present (In the PDP-11 and the IBM PC) to allow the interrupts from devices that require lower latencies to take priority over devices that can tolerate longer latencies. It’s as simple as that. Every bit cost, back in the day, and a common 8 line terminal interface (the DZ11) only had a 16 character FIFO that was shared by all 8 lines. Even at 9600 baud (the max speed supported) it was pretty easy to overrun the FIFO.

    IRQL is a larger concept that builds on this, and (amusingly enough) very closely mirrors the PDP-11 Interrupt scheme, which includes a dedicated set of IRQs that are used only for software prioritization (IIRC, IRQs 1, 2, and 3 were not usable for devices on the PDP-11).

    You are correct that, by the time of Windows NT, hardware IRQs were effectively useless... and (again IIRC) NT always randomly assigned IRQLs to hardware devices and thus ISRs... a fact that usually blows the mind of embedded and real-time device people.

    IRQLs are primarily a mechanism for the OS to organize it’s work, prior to returning to its default state of “running the highest priority user thread.” They provide us a mechanism to do work that’s “more important than user stuff” but “less important than device interrupt service” that is designed to keep interrupt latency low. RSX111M did this using something Cutler called the Fork List, VMS likewise, Windows with the DPC list, Linux bottom half, etc. A side effect of this, of course, is that you get the capability for multiple levels of spin locking (which is sometimes cited as the reason for IRQLs)... but this is a happy bonus and not the primary architectural motivation for IRQLs.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • Don_BurnDon_Burn Member - All Emails Posts: 1,673

    As another person who worked on PDP-11 device drivers, in my case writing them as freelance consultant since 1972 through 1978, I like Peter never heard the claim. Most of my drivers were for DEC OSes but I did write some Unix drivers in that time, and helped port UNIX to a Honeywell architecture, I had contact with the Bell Lab's team and never heard these claims. So Anton, where did you get this dreck?

  • anton_bassovanton_bassov Member Posts: 5,092

    So Anton, where did you get this dreck?

    Actually, IIRC, somewhere in this NG. I've got to double-check it in order to provide some more info - otherwise, in addition to putting a foot in my mouth yet another time, I am going to make some libelous claim as well.

    In fact, this claim in itself is not as nonsensical as it may seem to be at the first glance. For example,consider a NUMA system where every CPU core has its own RAM bank. Depending on the bank's location on the bus relative to a given CPU core (i.e whether it happens to be a "local" bank or a "remote" one), the cost of memory access by a given CPU may be different for different banks. Therefore, I don't see any reason why the same logic may not apply to memory-mapped devices on some " XYZ architecture"

    Anton Bassov

  • Peter_Viscarola_(OSR)Peter_Viscarola_(OSR) Administrator Posts: 7,583

    this claim in itself is not as nonsensical as it may seem to be at the first glance

    There's nothing inherently nonsensical about the possibility that such an issue existed. It is merely untrue that it applies to the PDP-11. And to extend such an idea to the entire concept of why IRQs exist (when the simplest explanation -- that some devices need "more urgent" service than others -- is more likely to be the correct one)... you know, when you hear hoofbeats, think horses.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • Don_BurnDon_Burn Member - All Emails Posts: 1,673

    If you want to go with NUMA architectures then we are talking a different beast. PDP-11's were not NUMA, but DEC or one of its spin-offs did have NUMA based technology that could potentially apply in the Alpha or later timeframe. Of course by the time that came out, Windows NT had shipped so I don't think you can make the claim of that for either Unix or current Windows.

    I should mention in helping port Unix to a Honeywell machine, we talked a lot about the machine architectural design decisions of Unix with the Bell Lab's folks. This was because the machine in question had two types of addressing, one for words and one for bytes, the early Unix code did not handle this well.

  • Peter_Viscarola_(OSR)Peter_Viscarola_(OSR) Administrator Posts: 7,583

    Imagine the amount of Christmas spirit and good cheer that I must possess... I haven't locked either this thread or the one (that eventually, after several posts the size of War and Peace, turned out to be) about I/O Completion Ports and NtLockFile. I'm clearly ending the year with some major thread drift. Perhaps even facilitating such drift. I shall strive and endeavor to be better in the New Year.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • Peter_Viscarola_(OSR)Peter_Viscarola_(OSR) Administrator Posts: 7,583

    If my memory serves me well, I think I saw a low-end Windows laptop with a single-core Celeron

    You know, I hate to be the one to defend Anton. But I recently (like, in the past 2 years) worked on an embedded systems project that used exactly such a processor to handle all the C3 between a UAV and the ground: A single-core Celeron (that, as far as I could tell, had been designed primarily for a laptop). In response to my (repeated) scoffing at the single-core CPU I was informed that the primary driver for using this processor was the heat budget for the aircraft.

    A new design for the processor system was eventually created, along with new heat sinks and a new multi-processor capable driver... whether this ever actual got BUILT or deployed I have no idea. It was a super interesting/fun project and one I was proud to work on. My only regret, and it's a biggie, is that I wasn't allowed to take any pictures. For the avoidance of moral ambiguity, the UAV was designed for surveillance only.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • anton_bassovanton_bassov Member Posts: 5,092

    ... you know, when you hear hoofbeats, think horses.

    Well, there are some situations when the "Occam's razor" approach does not really seem to be the optimal one. This particular case seems to fall into this class. More on it below

    And to extend such an idea to the entire concept of why IRQs exist (when the simplest explanation -- that some devices
    need "more urgent" service than others -- is more likely to be the correct one).

    The above suggestions seems to be, indeed, perfectly reasonable at the first glance. However, if you scratch it a bit, you will see that
    things as not necessarily as simple as they first seem to be.

    To begin with, if the concept of "certain device interrupts being more equal than the other ones" was, indeed, defined by the needs of the system software, I would expect it to be platform-agnostic. In case if your target architecture does not provide a hardware-level support for this concept, you can implement it in a software really easily. Therefore, if the above suggestion was correct, I would not
    expect spl's (which happens to be sort of IRQL's twin) lifespan to correlate with the one of PDP11 THAT closely.

    However, this is not how the things had actually worked. The very concept of spl() appeared only when interrupt prioritisation feature got introduced in some newer version of PDP11(AFAIK, the earliest PDP11 versions did not have the one - it had been introduced a bit later), and lost any practical meaning when they started porting UNIX to other hardware architectures, subsequently having had disappeared from practically all major UNICes.

    In other words, I would rather suspect PDP11's specifics and not the OS's needs to be "the key driving force" here....

    Anton Bassov

  • Peter_Viscarola_(OSR)Peter_Viscarola_(OSR) Administrator Posts: 7,583

    Sigh. Your command of history seems to be a bit off.

    The very first PDP-11 was the PDP-11/20. It was the first PDP-11 to run Unix. It had the same multi-level interrupt prioritization as “later” PDP-11s, such as the (much later) PDP-11/70.

    The PDP-11/15 was a cost-reduced OEM-only follow-on to the 11/20... one way to reduce cost was to make multi-level interrupts an option. I don’t know if anybody ever bought a system configured this way, and I don’t know which OSes supported it. There were some funky-ass operating systems back at the Dawn of PDP-11 Time that didn’t last long or morphed into something else (RSX-11A, B and C for instance, and RSTS-11).

    “OEM” in this context and era meant a system that was not sold by Digital to end users, and was thus bundled by a system integrator into a larger purpose-built hardware and software solution.

    Having been around when interrupt prioritization was an “important thing”, I can assure you that your notions are... misguided. In an era where facts count for little, I’m sure you will wish to continue to champion your incorrect assumptions and assert them as fundamental historic truths.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • anton_bassovanton_bassov Member Posts: 5,092

    Having been around when interrupt prioritization was an “important thing”, I can assure you that your notions are... misguided.
    In an era where facts count for little, I’m sure you will wish to continue to champion your incorrect assumptions
    and assert them as fundamental historic truths.

    Oh, come on...

    Taking into consideration the fact that,unlike me, you had a first-hand experience with PDP11, arguing with you in this respect would
    be a bit to the extreme even by my standards :)

    In any case, it looks like we are agree at least on the following points:

    1.IRQL and SPL are sort of "twins"
    2. SPL lost any practical meaning when UNIX got ported to other machines, and subsequently want away from all major UNIX flavours

    Imagine the amount of Christmas spirit and good cheer that I must possess... I haven't locked either this thread

    The most interesting thing here is that, despite some superficial and insignificant "deviations", this thread does not really seem to go
    anywhere off-topic. Instead, it remains clearly focused on the original topic. I know it may sound bizarre, but that's the way it is...

    Anton Bassov

  • Peter_Viscarola_(OSR)Peter_Viscarola_(OSR) Administrator Posts: 7,583

    unlike me, you had a first-hand experience with PDP11

    One of the very few advantages to having been programming long enough to know stuff that is in no longer even remotely relevant to computing and operating systems in the 21st Century (if the discussion of multi-level interrupts isn't a good enough example of this... how about "overlays"?) is that you can (sometimes, still, even if through a glass darkly) remember how certain things go to be the way they are today.

    Consider it a party trick that can be used to amuse (or, you know, bore and annoy) younger colleagues. The primary risk of this trick (aside from rolled-eyes and being told "OK boomer") is that the younger colleagues often decide that they know the facts as well as you do, and repeat some nonsense or build some intensely incorrect theory based on invalid assumptions. Witness the current discussion.

    [As an important aside... I hasten to add that there is an entire generation of OS devs who are (MUCH MUCH) older than me who are still writing code, for better or for worse. Cutler, for example, is only a couple of years shy of 80. I also recently met one of the original RSTS devs (speaking of PDP-11s), who is still writing kernel-mode code and is working as a key guy in a pretty interesting multi-national startup.]

    arguing with you in this respect would be a bit to the extreme

    Gee, and here I thought that what you were doing.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • anton_bassovanton_bassov Member Posts: 5,092

    Gee, and here I thought that what you were doing.

    Well, I have discovered the following algorithm for myself quite a while ago

    [begin code snippet]

    int HaveSomethingToSay=0,TheHangingJudge=0,Reps=0;

    CheckThread(&HaveSomethingToSay);

    while (HaveSomethingToSay) {

        CannotBeWrong (&TheHangingJudge);
    
        if(IsWrong (TheHangingJudge)) {
    
                if(Reps==MAX_REPS_ALLOWED)break;
                RecallTheListOwner(Ntdev);
                RecallTheScaryCapabilities(NewHostingPlatform);
                Reps++;        
                continue;
    
                }
    
        HaveSomethingToSay=0; Reps=0;
        MakePosts();
        if(STATUS_TIMEOUT==WaitForReplies(&HaveSomethingToSay)) break;
    

    }

    [end code snippet]

    I guess the above code is so self-explanatory that no comments are needed..

    Anton Bassov

  • MBond2MBond2 Member Posts: 19

    As an aside, learning about the decisions and their consequences of the past can be an invaluable source of knowledge. Generally judgement comes from experience, and generally speaking reading competent history is a much faster way of gaining experience than actually living all that time and doing all of those things

Sign In or Register to comment.

Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

Upcoming OSR Seminars
Writing WDF Drivers 21 Oct 2019 OSR Seminar Space & ONLINE
Internals & Software Drivers 18 Nov 2019 Dulles, VA
Kernel Debugging 30 Mar 2020 OSR Seminar Space
Developing Minifilters 27 Apr 2020 OSR Seminar Space & ONLINE