A question about Interrupt Service Routine

Hi all,
The DDK says that an ISR should return FALSE when it detects that its device was not the source of an interrupt, and return TRUE if it is the source. My question is: since the driver does this detection by reading its status register, what happens if the INTR bit in the register becomes set right between the ISR being called and the status register being read? The ISR will then return TRUE, a value the OS does not expect. What happens then?

gmm

> The DDK says that an ISR should return FALSE when it detects that its
> device was not the source of an interrupt, and return TRUE if it is the
> source. My question is: since the driver does this detection by reading
> its status register, what happens if the INTR bit in the register becomes
> set right between the ISR being called and the status register being
> read? The ISR will then return TRUE, a value the OS does not expect.
> What happens then?

The purpose of the TRUE/FALSE return value is to tell the OS whether it should continue to poll ISRs in a shared-interrupt environment (like PCI bus devices). If the ISR returns FALSE, the next shared ISR gets polled, or a spurious interrupt is noted if there are no more ISRs.

The case you describe MUST involve shared interrupts. Assume devices A and B share level-triggered PCI interrupts. Device B interrupts, but while the processor is getting to the device B ISR, device A also raises its interrupt. The device A ISR is polled first, notices its status is requesting service, and returns TRUE, unwinding the interrupt. As it’s a level-triggered interrupt, the moment the level is enabled again it fires again, causing a poll of the device A ISR, which now returns FALSE, and then a poll of the device B ISR, which returns TRUE. On unwinding the interrupt again, both devices have been serviced, although NOT in the order they raised their interrupt signals. Note this may be an oversimplification of what REALLY happens, as interrupt priorities may get rotated and multiple processors may get involved.

I believe the guarantee is only that a single processor will be in a specific ISR, but I believe you should be prepared for two different ISRs, on a shared interrupt level, to run concurrently on two processors. As ISRs have processor affinity, it seems possible you could have two devices, on a shared interrupt, with non-overlapping processor affinity, and the OS would need to cope.

  • Jan

> I believe the guarantee is only a single processor will be in a specific
> ISR, but believe you should be prepared to have two different ISR’s, on a
> shared interrupt level, to run concurrently on two processors. As ISR have
> processor affinity, it seems possible you could have two devices, on a
> shared interrupt, with non-overlapping processor affinity, and the OS
> would need to cope.

Well, not really. Each platform interrupt vector is represented by one interrupt object, and each interrupt object contains an interrupt spinlock that is acquired BEFORE ANY of the driver interrupt service routines attached to the interrupt object are called and released AFTER THE LAST ISR has been called. (On a PCI bus this would be after one of the ISRs returns TRUE.) This guarantees that for any interrupt object the concurrency of all the associated ISRs is at most one.

Note that this does not affect the concurrency of the associated DPC
routines, which is at most N where N is the number of processors on the
system.

Mark Roddy
xxxxx@hollistech.com
www.hollistech.com
WindowsNT Windows 2000 Consulting Services

At 09:15 AM 10/31/2000 -0500, Roddy, Mark wrote:

> Each platform interrupt vector is represented by one interrupt object and
> each interrupt object contains an interrupt spinlock that is acquired
> BEFORE ANY of the driver interrupt service routines attached to the
> interrupt object are called and released AFTER THE LAST isr has been
> called. (On a pci bus this would be after one of the isr returns TRUE.)
> This guarantees that for any interrupt object the concurrency of all the
> associated isrs is at most one.

Thanks for correcting me. Do you know if this is just how it happens to be implemented, or is it defined this way by the spec, so it should never change in the future (until Microsoft decides otherwise)?

Offhand, not spreading ISR processing on a shared interrupt across multiple processors seems like a bug. It seems like the lock should be on each interrupt object, and the lock on the level should just be held for a moment while fooling with level-specific stuff. If I have 4 cards generating large numbers of interrupts, all on a shared level, I would fully expect a 4-processor machine to spread the load. It seems like the OS should round-robin which ISR gets polled first, so a lock in one ISR doesn’t prevent another processor from polling a different ISR. If it were smart, it would ensure any locked ISRs didn’t get polled first.

  • Jan

> register, what will happen if the INTR bit in the register is set right
> between the ISR getting called and the status register being read? Thus
> the ISR will

The correct driver design is to loop in either the ISR or the DPC while the status register bits still show that there are events to handle, and quit the ISR or DPC only when there are no events.
The device must not use a “one event - one interrupt” approach. It must use interrupts only with “something is pending” semantics, and the status register must be used to determine what is pending.
In such a design, the second ISR call will do nothing bad - it will just be a second, redundant attempt to attract the host’s attention to the status registers.

Max

> moment while fooling with level specific stuff. If I have 4 cards
> generating a large numbers of interrupts, all on a shared level, I fully
> would expect a 4 processor machine to spread the load. Seems like the

Move the majority of your logic from ISRs to DPCs - DPCs can run
concurrently on different processors.

Max

You have to keep in mind that the architectural concept of interrupt-side load balancing in NT is implemented through the relationship between the interrupt service routine and the “deferred procedure call for interrupt service routine”. The idea is that an ISR is minimal, deferring essentially all activity to the DPC. Load balancing is achieved by having full multiprocessor concurrency for DPC routines.

As to your other question: the concurrency of ISRs is ‘cast in concrete’ until it isn’t :-) I think it is rather unlikely that this fundamental piece of the OS architecture is going to get refactored, but then again I don’t have even the slightest say in what goes on at Redmond.
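To make the minimal-ISR pattern concrete, here is a WDM-style sketch (treat it as pseudocode, not a buildable driver: KeInsertQueueDpc and READ_REGISTER_ULONG are real DDK routines, but MY_DEVICE_EXT, MY_INTR_PENDING, and MyAckAndDisableDeviceInterrupt are hypothetical device-specific names):

```
/* Minimal ISR: claim the interrupt, quiet the hardware, defer the work. */
BOOLEAN MyIsr(PKINTERRUPT Interrupt, PVOID Context)
{
    PMY_DEVICE_EXT ext = Context;
    ULONG status = READ_REGISTER_ULONG(ext->StatusReg);

    if (!(status & MY_INTR_PENDING))
        return FALSE;                         /* not our interrupt */

    MyAckAndDisableDeviceInterrupt(ext);      /* stop the device asserting */
    KeInsertQueueDpc(&ext->Dpc, NULL, NULL);  /* defer the real work */
    return TRUE;
}

VOID MyDpc(PKDPC Dpc, PVOID Context, PVOID Arg1, PVOID Arg2)
{
    /* Heavy lifting happens here at DISPATCH_LEVEL; DPCs for different
       devices can run concurrently on different processors. */
}
```

The ISR holds the interrupt spinlock for only a few register accesses, while the deferred work is free to spread across processors.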

Mark Roddy
Windows 2000/NT Consultant
Hollis Technology Solutions
www.hollistech.com

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com]On Behalf Of Jan Bottorff
Sent: Tuesday, October 31, 2000 4:32 PM
To: NT Developers Interest List
Subject: [ntdev] Re: A question about Interrupt Service Routine


At 02:22 AM 11/1/2000 +0300, Maxim S. Shatskih wrote:

> > moment while fooling with level specific stuff. If I have 4 cards
> > generating a large numbers of interrupts, all on a shared level, I fully
> > would expect a 4 processor machine to spread the load. Seems like the
>
> Move the majority of your logic from ISRs to DPCs - DPCs can run
> concurrently on different processors.

I’m thinking in terms of how drivers coexist with each other. Some drivers do major work in the ISR (sometimes because the hardware is stupid and critical timing loops have to happen in the ISR), and as a result can seriously degrade the interrupt response latency for ISRs farther down the shared-level chain. It seems kind of bogus that on some systems there is an idle processor, and my driver might have degraded performance because some ISR up the chain takes its time.

I just happen to have built a new Win2000 machine, and noticed no less than NINE devices are all assigned to interrupt level 9, not to mention there are at least 4 free interrupt levels. Offhand, it looks like MANY devices on one interrupt level is common. This is just a bland workstation machine with a very modern chipset (Intel D815EEA motherboard), a video card, and two PCI cards (a dual serial port and a PCCard controller). If we believe the execution time for a typical ISR is about 15 uSec (PCI uncached device accesses can take a significant fraction of a microsecond each), the last device on this chain is looking at a latency of 120 uSec. If it happens to be the serial port, running at 900k baud, the interrupt rate will be about 6250 interrupts/sec. For 9 devices servicing 15 uSec ISRs 6250 times/sec, that suggests the system will spend 84% of its time servicing ISRs. If we think the ISR can run in 5 uSec, that’s still over 28% of the CPU consumed if the last device in the chain generates 6250 interrupts/sec. My rule of thumb is that 6250 interrupts/sec is not excessive, although in this example it looks pretty unpleasant. Is my calculator broken, or does this really suggest my system may experience very serious performance degradation if I run a serial port at high speed? Network cards OFTEN generate 6250 interrupts/sec, so it doesn’t seem like an issue specific to serial ports.

The issue seems to be long chains of shared-interrupt ISRs, with devices down the chain generating more than trivial activity. Perhaps the ordering of calling shared ISRs should be adjusted dynamically to put the most recent ISR to process an interrupt at the head of the poll chain (or maybe this happens already?).

  • Jan

> I’m thinking in terms of how drivers co-exist with each other. Some
> drivers do major work in the ISR (sometimes because the hardware is stupid
> and critical timing loops have to happen in the ISR), and as a result, can

Such stupid hardware is usually ISA - so, no shared interrupts.

Max

You are seeing the cluster-f*** on IRQ 9 because your machine has installed the ACPI HAL, and this is what that HAL does. The ACPI HAL can cause a lot of problems on some installations. You can replace this HAL by using device mangler, selecting the computer, and choosing update driver. However, this is not recommended by MS – the only recommended way is to reinstall the OS. There is a hotkey that you can press during NT setup to force setup not to check for ACPI; I can’t recall it right now (IIRC it’s F7). It would be interesting to see the effect of replacing the HAL on interrupt latency…

Regards,

Paul Bunn, UltraBac.com, 425-644-6000
Microsoft MVP - WindowsNT/2000
http://www.ultrabac.com


> The ACPI HAL can cause a lot of problems on some installations. You can
> replace this hal by using device mangler and selecting computer, and
> update driver.

Gee, I went through quite a bit of trouble to be sure the ACPI HAL was installed and functioning, as I really would like my machine to correctly manage power. Putting 9 devices on one interrupt level seems really stupid of some component. Hopefully it’s not some wires that are forever stupid. Still, it seems like the lesson is: we driver developers had better try to keep our ISR execution time in the couple-of-microseconds range. I used to think in terms of less than 10K interrupts/sec, or 10-20 microseconds of execution, being OK. Unless W2K dynamically orders the interrupt polling sequence based on activity (not a guaranteed cure), these old rules of thumb will make some machines have horrible performance.

  • Jan

> manage power. Putting 9 devices on one interrupt level seems really stupid
> of some component. Hopefully it’s not some wires that are forever stupid.

Win98’s PnP allowed you to override the automatic interrupt allocation by explicitly specifying your own in the UI. This specified configuration was saved in the device registry key as “ForcedConfig”.
Is there such a feature on ACPI-enabled W2K systems? Maybe it is worth distributing IRQs manually using the PnP UI?

Max