RE: ACPI Machines and IRQ 9 [was: Communicating with the NT developers]

This is a stupid story. And I’m embarrassed to tell you the truth. But
here it is.

The early ACPI machines were mostly laptops. And the laptops of that
generation had most of their devices either embedded in the chipset or
on the ISA bus. The PCI or AGP buses were used only for video, and to
connect the north bridge with the south bridge. (In Intel’s chipset
terms, the north bridge has all the fast gates of the chipset, including
the memory controller, AGP and, in that generation, the PCI bus
generation logic. The south bridge contains all the slow gates,
including the IDE controller, the ISA bridge, all the PC legacy stuff
and probably a USB controller. Today, the south bridge probably also
has audio and a few other random odds and ends.) Because the laptops of
that era had all of their devices on the ISA bus, interrupt sharing
worked poorly. If you bought a mid-'90s laptop from IBM or Toshiba, the
serial port and possibly IR would be disabled. There would be a utility
packaged with the machine that allowed you to turn on your serial or IR,
but at the cost of the bi-directional parallel port, or one of the
PCMCIA slots, since there just weren’t enough IRQs in the machine to
guarantee that all of the peripherals worked, especially if you filled
both PCMCIA slots with combo cards.

I once debugged a Toshiba 750CDT in a docking station that had two
PCMCIA cards plugged into the machine, two PCMCIA cards plugged into the
slots in the dock, two ISA cards in the dock and an extra IDE device in the
dock, too. This meant that the total demand on the machine was 20 IRQs,
when only 16 were actually available.

(As an aside, I’ve been trying to convince Intel to put APIC interrupt
controllers, which would allow many more IRQs, in their laptop chipsets
since 1997. My predecessor had been trying since '94. They may
actually manage it soon.)

Along comes ACPI. When you turn on ACPI in a machine, it suddenly
switches all the power management logic in the machine from delivering
its interrupts as BIOS-visible, non-vectored System Management
Interrupts over to OS-visible, vectored interrupts. And that interrupt
is delivered level-triggered, active-low, which means that it can be
shared with a PCI interrupt.

Now consider that these early ACPI machines were already over-committed
in terms of interrupts. There was no way to make them work with PCI
devices spread out on lots of IRQs. So I just made the code collapse
all the sharable devices onto the ACPI interrupt, which was fixed in the
chipset by Intel at IRQ 9. By doing it this way, I could hide the fact
that ACPI had just created a demand for one more IRQ. (If you use a
non-Intel chipset that has ACPI coming in on some other IRQ, you’ll see
all the PCI devices in Win2K go to that IRQ, not 9.)
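
As an illustration only, the stacking idea described above can be
sketched in a few lines of Python. This is not the Windows 2000 code;
the Device class, the assign_stacked function and the device names are
all hypothetical, and the only fact taken from the text is that every
sharable device lands on the ACPI interrupt (IRQ 9 on Intel chipsets).

```python
# Illustrative sketch only; not the real Windows 2000 arbiter.
# Device, assign_stacked and the device names are hypothetical.
from dataclasses import dataclass
from typing import Dict, List, Optional

ACPI_IRQ = 9  # fixed by the Intel chipset, per the text above


@dataclass
class Device:
    name: str
    shareable: bool                  # True for level-triggered, PCI-style interrupts
    fixed_irq: Optional[int] = None  # ISA-style devices keep their own IRQ


def assign_stacked(devices: List[Device],
                   acpi_irq: int = ACPI_IRQ) -> Dict[str, Optional[int]]:
    """Put every shareable device on the ACPI IRQ; leave the rest alone."""
    return {d.name: (acpi_irq if d.shareable else d.fixed_irq) for d in devices}


devices = [
    Device("acpi-sci", shareable=True),
    Device("pci-video", shareable=True),
    Device("cardbus", shareable=True),
    Device("serial", shareable=False, fixed_irq=4),
]
assignment = assign_stacked(devices)
```

Passing a different acpi_irq models the non-Intel case, where all the
sharable devices follow the ACPI interrupt to whatever IRQ it uses.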

Further complicating this story was that I was trying to get ACPI
machines to work back in 1997, when the people working on Plug and Play
in Win2K hadn’t gotten their stuff going yet. At that time, it wasn’t
possible to move a device from one set of resources to another after it
had been started. This meant that any IRQ solution that I came up with
had to work from the first try, so it had to be conservative.

The everything-on-IRQ-9 solution worked. It got the machines to run, as
long as none of the device drivers mis-handled their ISRs. (Later, this
turned out to be a huge debugging problem, since when you chain eight or
nine devices, you’ll get somebody who fails.) The solution wasn’t
optimal, but it did work. I meant to go back and change it later,
before we shipped Windows 2000.

A couple of years passed. I had been working on multi-processor
problems and on other aspects of ACPI. It got close to the time to ship
Windows 2000 and somebody brought up the old question of IRQ stacking.
I worked up a more-elegant solution, one that spread out interrupts on
most machines. By that time, Plug and Play had been mostly completed,
and that wasn’t a bottleneck any more. But the test team told me that
they wouldn’t let me put it into the product, since they didn’t have
time to re-test the thousands of machines that had already been tested
with the old algorithm.

At the time, I thought that this was somewhat ridiculous. I thought
that my code would work just fine. I thought that their fears were
un-justified. But I was overruled, and I just put the code into what
became Windows XP, letting Windows 2000 ship with the simple, safe, yet
frustrating stacking.

This is a good point in the story to explain that, in ACPI machines, the
IRQ steering is accomplished by interpreting BIOS-supplied P-code called
AML, which is compiled from ASL source. The IRQ routers are completely
abstracted by the BIOS. The OS
doesn’t need to know about the actual hardware. The old IRQ steering
code in Win9x, which was dropped into the non-ACPI HAL in Win2K, had to
have code specific to each chipset, which meant that it didn’t work when
new chipsets were shipped. It was also written in a way that assumed
that there were exactly four IRQs coming from PCI. ACPI machines
sometimes have many more. (This is the reason that you don’t see the
IRQ steering tab in ACPI machines. It just wasn’t flexible enough and
we didn’t have time to re-do it.)

What we discovered with Windows XP was that all of those ACPI machines
that had been tested with their IRQs stacked on IRQ 9 tended to fail
when you spread the IRQs out. A typical example of a failure would work
like this: WinXP doesn’t need the IRQ for the parallel port unless
you’re using one of the extended modes. So the parallel driver releases
its IRQ until it’s needed. The IRQ choosing logic (called an IRQ
“arbiter”) would move a PCI device onto the parallel IRQ. This action
depends on re-programming the chipset so that the parallel port isn’t
actually triggering the IRQ. This is supposed to happen by interpreting
even more BIOS P-code that manipulates the chipset, since there is no
standard for parallel port configuration.

If your chipset comes from Intel, this probably works, since the mere
act of setting a PCI device to an IRQ also disconnects that IRQ from the
ISA bus. But if your chipset comes from VIA or ALi, there is another
step involved. The problem is that nearly all of the BIOS P-code out
there is copied from old Intel example code. So they are almost all
missing the extra step necessary in VIA and ALi machines.

If the BIOS fails to stop the IRQ coming from the parallel port, the
machine hangs, since the parallel port, which sends its IRQs
active-high, edge-triggered, will ground the interrupt signal in the
passive state. And grounding an interrupt which is enabled active-low,
level-triggered will cause an endless stream of interrupts.
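
The hang described above can be modelled with a toy simulation. This is
a sketch under simplifying assumptions, not chipset-accurate behavior:
the class names, the run_isr_chain function and the max_passes cap are
hypothetical, and the interrupt line is reduced to "is anybody holding
it low?".

```python
# Toy model of a shared, level-triggered, active-low interrupt line.
# All names here are hypothetical and exist only for illustration.

class PciDevice:
    """Level-triggered, active-low: pulls the line low until serviced."""

    def __init__(self):
        self.pending = False

    def drives_low(self):
        return self.pending

    def service(self):
        self.pending = False  # acknowledging the device releases the line


class IdleParallelPort:
    """Edge-triggered, active-high: holds the line at ground while idle."""

    def drives_low(self):
        return True

    def service(self):
        pass  # nothing to acknowledge, so the line stays grounded


def run_isr_chain(devices, max_passes=1000):
    """Count interrupts taken until the line goes high (capped at max_passes)."""
    passes = 0
    while any(d.drives_low() for d in devices) and passes < max_passes:
        for d in devices:
            d.service()
        passes += 1
    return passes


nic = PciDevice()
nic.pending = True
healthy = run_isr_chain([nic])                     # one interrupt, then quiet

nic.pending = True
stuck = run_isr_chain([nic, IdleParallelPort()])   # never goes quiet; hits the cap
```

In the model, the PCI device alone produces a single interrupt, while
the idle parallel port left connected to the same line keeps it asserted
until the artificial cap is reached. On real hardware there is no cap.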

The parallel port is just an example. Pick any device that is in the
legacy SuperIO chip and the story repeats itself.

In Windows XP, I made a bunch of changes. In machines without CardBus
controllers (which don’t have the IRQ problems created by PCMCIA), it
will try to keep the PCI devices on the IRQs that the BIOS used during
boot. If the BIOS didn’t set the device up, then any IRQ may be chosen.
But if your machine has a VIA chipset, or if it has a BIOS that we know
to be broken, then we fall back to the Win2K-style stacking behavior.
The unfortunate truth is that you guys on this list mostly build your
own machines, rather than buying them from reputable manufacturers,
which means that you guys own the machines with broken BIOSes and VIA
chipsets. So even with Windows XP, you’ll see the same old stacking
behavior.
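
Read as a decision procedure, the Windows XP rules above might look
roughly like the following. This is a hedged paraphrase of the
paragraph, not the shipping code: pick_irq and its parameters are
hypothetical names, taking the first free IRQ stands in for "any IRQ may
be chosen", and the CardBus case is left unimplemented because the text
does not describe it.

```python
# Sketch of the policy described above; names and structure are assumptions.
from typing import List, Optional


def pick_irq(bios_irq: Optional[int],
             free_irqs: List[int],
             *,
             has_cardbus: bool,
             via_chipset: bool,
             bios_known_broken: bool,
             acpi_irq: int = 9) -> int:
    """Choose an IRQ for one PCI device under the XP-era rules in the text."""
    # VIA chipsets and known-broken BIOSes fall back to Win2K-style stacking.
    if via_chipset or bios_known_broken:
        return acpi_irq
    # The text only covers machines without CardBus controllers.
    if has_cardbus:
        raise NotImplementedError("the CardBus case is not described in the text")
    # Prefer the IRQ the BIOS used during boot.
    if bios_irq is not None:
        return bios_irq
    # The BIOS didn't set the device up: any IRQ may be chosen.
    # Taking the first free one is an arbitrary choice for this sketch.
    return free_irqs[0]


kept = pick_irq(11, [5, 10], has_cardbus=False,
                via_chipset=False, bios_known_broken=False)
chosen = pick_irq(None, [5, 10], has_cardbus=False,
                  via_chipset=False, bios_known_broken=False)
stacked = pick_irq(11, [5, 10], has_cardbus=False,
                   via_chipset=True, bios_known_broken=False)
```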

One notable addendum is that any machine with an APIC interrupt
controller, and thus more than 16 IRQs, will spread interrupts out, even
in Win2K. In the past, this was mostly limited to SMP machines. But
any desktop machine shipping today that gets the Windows logo has to
have an APIC. (This was another reason that I hadn’t gone back to
re-write this code earlier. Intel had promised that all machines would
have APICs by 1998. If this had materialized, then none of you would
have had any complaints by now.) I’m actually currently working on
software for some future NT that will let an administrator configure the
machine in any way he or she desires.

- Jake

-----Original Message-----

Subject: Re: Communicating with the NT developers
From: “Maxim S. Shatskih”
Date: Thu, 27 Dec 2001 18:22:20 +0300
X-Message-Number: 19

Hi Jake,

>expertise. (For those who are curious, I have spent the last five years
>working on the NT HAL, with lots of forays into power management and
>ACPI.

Then maybe you’re exactly the person who will be able to answer one
of the popular questions here:

- Why does the ACPI HAL put all PCI cards on IRQ 9? What was the
technical background of such a decision? Why is PCI IRQ steering allowed
for the non-ACPI HAL (where it even has a UI property sheet) but disabled
on ACPI?

thanks,
Max

