Geoff Chappell - Software Analyst
Processors that implement the x86 or x64 instruction sets are identified by a combination of vendor-specific family, model and stepping numbers (in order of decreasing significance). This classification has been firmly established ever since the cpuid instruction was added for Intel’s Pentium processor in 1993 and for some models of the 80486. For the 32-bit Windows kernel to execute at all, the processor evidently has 32-bit x86 instructions. If a standard test shows that this instruction set is not advanced enough to include cpuid or if a less standard test suggests the instruction is not reliably usable, then the kernel falls back to identifying the processor as an 80386 or 80486.
Of course, when Windows was new, the 80386 and 80486 were not so old. Both were realistic possibilities for what Windows might find itself running on. That said, the 80386 was fast being supplanted for the high-end computers that the new Windows NT aimed for (and, some would say, needed for acceptable performance). Support for the 80386 soon started being closed down. Early steppings are not acceptable to version 3.10. Version 3.50 rejects any 80386 in a multi-processor system. Version 4.0 declines to start even on a lone 80386. Though the 80486 is not formally rejected by any version, it has been unable to run new Windows versions since Windows XP made the cmpxchg8b instruction essential. Yet not until version 6.3 does the 32-bit kernel just assume that it’s running on a processor that has the cpuid instruction. What does it do for processor identification when it can’t simply ask cpuid?
Up to and including version 6.2, the 32-bit kernel regards the cpuid instruction as unimplemented if either:
A processor that has no cpuid instruction by this test is inferred to be either an 80386 or 80486. So too can be a processor that has the instruction but only with too little functionality. The main measure of functionality for the cpuid instruction is which function or leaf numbers the instruction accepts as input in the eax register. The maximum supported leaf is easily learnt as the output in eax from cpuid leaf 0. The family, model and stepping are produced as bit fields in eax from cpuid leaf 1. If the instruction does not have leaf 1, then starting with version 3.50 the kernel dismisses the instruction as unusable, such that again the processor must be an 80386 or 80486.
Though the family, model and stepping can’t be read from these early processors by executing the cpuid instruction, something very much like them had been introduced with the 80386 as a processor identification signature that is loaded into the edx register as its initial value immediately after the processor is reset. Many, if not all, computers with these processors have BIOS support through which this value can be retrieved on a running machine. For some, the BIOS explicitly saves the processor signature and makes it available through some API. For many more, something that may look a bit like magic is inherited from the 80286, for which the processor’s inability to return to real mode from protected mode is overcome with BIOS support. The processor is reset without losing memory, having configured the BIOS not to reinitialise as if from a reboot but instead to resume execution at an address that was saved for it at a known location before the reset. If the BIOS gets this far without changing edx, then the processor signature is retrievable from the reset. That’s all a bit much for the kernel, if not for anyone. When faced with a processor that does not have a usable cpuid instruction from which to learn the processor signature, the kernel doesn’t try to retrieve it but instead invents family, model and stepping numbers from the results of various tests.
For later processors which do have the cpuid instruction, Intel is clear that the processor signature returned in eax from cpuid leaf 1 and the processor signature in edx at reset are one and the same. For the 80386 and early 80486 that do not have the instruction, simulating the processor signature from edx at reset plausibly wasn’t what Microsoft aimed for. What does seem intended is to fit the processor into Intel’s descriptions of steppings as A0, B0, B1, etc. Even in later processors, this notation for steppings does not correlate directly with the model and stepping numbers in the processor signature. This seems to have been so for the 80386 too. See, for instance, that a datasheet for the Intel386™ DX (order number 231630-011, dated December 1995) has it that the B0 and B1 steppings both have 0x0303 for the processor signature and the D0 and D1 steppings have 0x0305 and 0x0308.
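As a minimal sketch of how a processor signature breaks into its fields, the following C assumes the classic layout that Intel documents for eax from cpuid leaf 1: stepping in bits 0 to 3, model in bits 4 to 7 and family in bits 8 to 11. The type and function names are hypothetical, and the extended family and model fields of much later processors are ignored.

```c
#include <stdint.h>

/* Hypothetical holder for the three classic fields of a processor signature. */
typedef struct {
    unsigned family;    /* bits 8..11 */
    unsigned model;     /* bits 4..7  */
    unsigned stepping;  /* bits 0..3  */
} cpu_signature;

/* Split a signature, whether from edx at reset or eax from cpuid leaf 1. */
static cpu_signature decode_signature(uint32_t value)
{
    cpu_signature sig;
    sig.stepping = value & 0xF;
    sig.model    = (value >> 4) & 0xF;
    sig.family   = (value >> 8) & 0xF;
    return sig;
}
```

Decoding the datasheet's 0x0303 gives family 3, model 0, stepping 3, and 0x0308 gives family 3, model 0, stepping 8. Neither stepping number lines up with the letter-and-number names B0, B1 or D1.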
That Microsoft’s inferred family, model and stepping aim for the letter-and-number stepping, not the processor signature, is supported by their use for descriptive text in the registry:
For an 80386 or 80486, this registry value’s string data has the form 80x86-yz in which x is the family as a decimal digit, i.e., 3 or 4, y is the model but as a letter from the scheme A for 0, B for 1, etc., and z is the stepping, again as a decimal digit.
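The form of this string is simple enough to sketch in C. The helper below is hypothetical, not the kernel's code: it only shows how a family, model and stepping map to the 80x86-yz text as just described.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: builds the "80x86-yz" text for an 80386 or 80486.
   Returns the number of characters written, or -1 if the inputs do not fit
   the scheme or the buffer is too small. */
static int format_identifier(char *buf, size_t size,
                             unsigned family, unsigned model, unsigned stepping)
{
    if (family > 9 || model > 25 || stepping > 9)
        return -1;
    /* The model becomes a letter: A for 0, B for 1, and so on. */
    int n = snprintf(buf, size, "80%u86-%c%u",
                     family, (char)('A' + model), stepping);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

With these inputs, family 4, model 0, stepping 0 produces 80486-A0, and family 3, model 1, stepping 1 produces 80386-B1.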
Of course, with the Pentium being effectively a minimum requirement in Windows XP and higher, what the Windows kernel identifies about a processor that does not support cpuid is now of interest only to historians (and perhaps to hobbyists who have enough time on their hands to try running a modern Windows on an 80486 for the dubious fun of seeing what happens). Yet it’s no small curiosity that the code for testing that the CPU is an 80386 or 80486, and then for identifying which type of 80386 or 80486, wasn’t discarded until version 6.3, having stayed unchanged, byte for byte, from version 3.10. There cannot in the whole history of computing be much other binary code that has been retained longer, through more revisions, with wider distribution.
When run on a processor without a cpuid instruction that implements at least leaf 1, the kernel looks to the AC bit (18) in the eflags register. If this can be changed, then the processor is deemed to be an 80486 (family 4). To identify the model and stepping, the kernel tests successively for what seem mostly to be defects. Any 80486 that has none of the defects is said to be model 3.
|4||0||0||80486-A0||ET bit (0x10) of cr0 can be cleared|
|4||1||0||80486-B0||reading dr4 causes Invalid Opcode exception|
|4||2||0||80486-C0||numeric coprocessor not present; or pseudo-denormal not normalised for fractional fscale|
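The successive testing tabulated above can be sketched as a small decision function. This is only a model: the structure and its field names are hypothetical stand-ins for the outcomes of the privileged tests, and it assumes the tests are applied in the order tabulated with the first match deciding the model.

```c
#include <stdbool.h>

/* Hypothetical record of what the tests found on an 80486. */
typedef struct {
    bool et_bit_clearable;        /* ET (0x10) in cr0 could be cleared */
    bool dr4_read_faults;         /* reading dr4 raised Invalid Opcode */
    bool fpu_present;             /* numeric coprocessor was detected */
    bool fscale_leaves_denormal;  /* pseudo-denormal not normalised by fscale */
} i486_test_results;

/* Assumed order: the tests are tried successively and the first hit wins. */
static unsigned i486_model(const i486_test_results *r)
{
    if (r->et_bit_clearable)
        return 0;                 /* 80486-A0 */
    if (r->dr4_read_faults)
        return 1;                 /* 80486-B0 */
    if (!r->fpu_present || r->fscale_leaves_denormal)
        return 2;                 /* 80486-C0 */
    return 3;                     /* none of the defects */
}
```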
According to Intel (see, for instance, the chapter on Architecture Compatibility in the Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 3A: System Programming Guide, Part 1), the ET bit of cr0 is hard-wired to 1 for Intel486 processors. Presumably then, it can be changed on some early 80486 processors only as a defect, which distinguishes what Microsoft regards as model 0.
Intel has long documented the dr4 register as reserved, but also as being aliased to dr6. Intel gives no indication of when this aliasing started, but documents that it does not apply to processors from families 5 and 6 when the DE bit is set in cr4. It also apparently does not apply to the 80486 in what Microsoft calls model 1.
Model 2 is distinguished by having no sufficiently advanced floating-point unit.
Detection of a numeric coprocessor is a standard test. The kernel clears the MP, EM, TS and ET bits of the cr0 register, initialises the floating-point unit (FPU) and reads the floating-point status word. An FPU is present if all flags in the low byte are clear. With the test done, the kernel sets the EM, TS and NE bits in cr0, and also the ET bit if the coprocessor was detected.
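The bit manipulation in this test can be sketched in C using Intel's documented positions for the cr0 flags (MP, EM, TS, ET and NE as bits 1 to 5). The function names are hypothetical. Reading and writing cr0 and the floating-point status word need privileged or floating-point instructions, so this sketch works only on the values.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_MP 0x02u  /* Monitor Coprocessor */
#define CR0_EM 0x04u  /* Emulation */
#define CR0_TS 0x08u  /* Task Switched */
#define CR0_ET 0x10u  /* Extension Type */
#define CR0_NE 0x20u  /* Numeric Error */

/* cr0 as prepared for the test: MP, EM, TS and ET all clear. */
static uint32_t cr0_for_fpu_test(uint32_t cr0)
{
    return cr0 & ~(CR0_MP | CR0_EM | CR0_TS | CR0_ET);
}

/* An FPU is present if all flags in the low byte of the status word,
   read after initialising the FPU, are clear. */
static bool fpu_present(uint16_t status_word)
{
    return (status_word & 0x00FFu) == 0;
}

/* cr0 once the test is done: EM, TS and NE set, ET only if an FPU was found. */
static uint32_t cr0_after_fpu_test(uint32_t cr0, bool present)
{
    cr0 |= CR0_EM | CR0_TS | CR0_NE;
    if (present)
        cr0 |= CR0_ET;
    return cr0;
}
```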
The specific defect that is tested for model 2 is in the fscale instruction’s handling of pseudo-denormals. These are 80-bit floating-point numbers that have zero as the biased exponent and 1 as the integer part. They ought never to be given as operands, but are tolerated for compatibility. They supposedly cannot be generated as the result of any floating-point operation. They, along with actual denormals, are meant to be normalised automatically if the Denormal Operand exception is masked. Scaling by a fraction leaves a normalised operand unchanged. Model 2 is apparently defective in that fractional scaling leaves a pseudo-denormal operand un-normalised. For testing the fscale instruction, the kernel clears the MP, EM, TS and ET bits of the cr0 register and masks all floating-point exceptions (by setting the low 8 bits of the floating-point control word). The 80-bit pseudo-denormal used for the test is zero except for having 1 as its integer part. If scaling this pseudo-denormal by 0.5 leaves the exponent as zero, then the processor is model 2.
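To make the encoding concrete, here is a sketch in C of the 80-bit format as two integer parts, with the test operand and the criterion just described. The names are hypothetical. The `normalised` function assumes the usual x87 interpretation in which a pseudo-denormal has the same value as the number with biased exponent 1 and the same significand. That is an assumption about what a sound FPU returns, not something read from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* 80-bit extended real, split into its two architectural parts. */
typedef struct {
    uint64_t significand;    /* bit 63 is the explicit integer part */
    uint16_t sign_exponent;  /* bit 15 is the sign, bits 0..14 the biased exponent */
} ext80;

#define INTEGER_BIT (UINT64_C(1) << 63)

static unsigned biased_exponent(ext80 x)
{
    return x.sign_exponent & 0x7FFFu;
}

/* Pseudo-denormal: zero as the biased exponent, 1 as the integer part. */
static bool is_pseudo_denormal(ext80 x)
{
    return biased_exponent(x) == 0 && (x.significand & INTEGER_BIT) != 0;
}

/* The operand for the test: zero except for having 1 as its integer part. */
static ext80 test_operand(void)
{
    ext80 x = { INTEGER_BIT, 0 };
    return x;
}

/* Assumed behaviour of a sound FPU: same value, but with biased exponent 1. */
static ext80 normalised(ext80 x)
{
    if (is_pseudo_denormal(x))
        x.sign_exponent = (uint16_t)((x.sign_exponent & 0x8000u) | 1u);
    return x;
}

/* The criterion as described: an exponent that is still zero means model 2. */
static bool fscale_defect(ext80 result)
{
    return biased_exponent(result) == 0;
}
```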
Finer identification of 80386 processors has long been academic. Whatever the model or stepping, the 80386 processor has been unsupported since version 4.0, and soon causes the bug check UNSUPPORTED_PROCESSOR (0x5D), though not without the kernel having worked its way through more tests for defects to identify models and steppings. For any 80386 processor that passes all tests, the model and stepping leap ahead to 3 and 1.
|3||0||0||80386-A0||32-bit mul not reliably correct|
|3||1||0||80386-B0||supports xbts instruction|
|3||1||1||80386-B1||set TF bit (8) in eflags causes Debug exception only at completion of two-cycle rep movsb|
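As with the 80486, the progression through these tests can be sketched as a decision function. The types and field names are hypothetical, and the sketch assumes the tests are tried in the order tabulated with the first failure deciding the result.

```c
#include <stdbool.h>

/* Hypothetical record of what the three tests found on an 80386. */
typedef struct {
    bool mul_unreliable;    /* 32-bit mul gave a wrong product */
    bool xbts_supported;    /* xbts executed instead of faulting */
    bool rep_trace_defect;  /* a Debug exception was missed in rep movsb */
} i386_test_results;

typedef struct {
    unsigned model;
    unsigned stepping;
} i386_identity;

/* Assumed order: the tests are tried successively and the first hit wins. */
static i386_identity i386_identify(const i386_test_results *r)
{
    if (r->mul_unreliable)
        return (i386_identity){ 0, 0 };  /* 80386-A0 */
    if (r->xbts_supported)
        return (i386_identity){ 1, 0 };  /* 80386-B0 */
    if (r->rep_trace_defect)
        return (i386_identity){ 1, 1 };  /* 80386-B1 */
    return (i386_identity){ 3, 1 };      /* passed all tests */
}
```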
The few versions that accept the 80386 at all reject any that doesn’t pass all three tests. For who knows what reason, the bug check in versions 3.50 and 3.51 is not specifically about the processor but is instead HAL_INITIALIZATION_FAILED (0x5C). Version 3.10 doesn’t have a bug check for this but instead displays the following message in text mode and hangs:
Your system may be using an early version of the Intel 386(tm) DX CPU which is not supported in this beta version of Windows NT. Please contact Intel at 1-800-228-4561, in Europe at 44-793-431144, or 1-503-629-7354 to determine if you need to acquire an Intel 386 CPU upgrade.
What resulted in practice from calling these numbers is not known.
The particular multiplication that distinguishes model 0 is of 0x81 by 0x0417A000. It is tried as many as 65,536 times to see if it ever produces an incorrect result. This same test (but with interrupts disabled) was used by Microsoft as long, long, long ago as September 1987 for Windows/386 version 2.01, to advise
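The shape of this test can be sketched in portable C. The real test executes the processor's 32-bit mul instruction. Here the multiplier is passed as a function pointer (a hypothetical arrangement) so that a defective processor can be simulated. The expected product, 0x81 times 0x0417A000, is 0x20FE7A000, which would be 2 in edx and 0x0FE7A000 in eax.

```c
#include <stdbool.h>
#include <stdint.h>

#define MUL_A        UINT32_C(0x81)
#define MUL_B        UINT32_C(0x0417A000)
#define MUL_EXPECTED UINT64_C(0x20FE7A000)
#define MUL_ATTEMPTS 65536u

/* Hypothetical stand-in for executing mul: returns the 64-bit product. */
typedef uint64_t (*mul32_fn)(uint32_t a, uint32_t b);

static uint64_t sound_mul32(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}

/* Simulated defect: wrong in the lowest bit on one call in every 1000. */
static uint64_t flaky_mul32(uint32_t a, uint32_t b)
{
    static unsigned calls;
    uint64_t product = (uint64_t)a * b;
    return (++calls % 1000 == 0) ? (product ^ 1) : product;
}

/* Repeat the multiplication, stopping at the first incorrect result. */
static bool mul_is_reliable(mul32_fn mul)
{
    for (uint32_t i = 0; i < MUL_ATTEMPTS; i++) {
        if (mul(MUL_A, MUL_B) != MUL_EXPECTED)
            return false;
    }
    return true;
}
```

The repetition matters because the defect is intermittent: a single correct product proves nothing, which the simulated `flaky_mul32` illustrates.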
WARNING: The Intel 80386 CPU in this computer does not reliably execute 32-bit multiply operations. Windows will USUALLY work correctly on computers with this problem but may occasionally fail. Contacting your hardware service representative and replacing your 80386 chip is strongly recommended. Press any key to continue...
Two of the several other phrasings of this warning from the Windows that ran on DOS are presented by Microsoft for the Knowledge Base article Windows and Early Intel 80386 CPU 32-Bit Operations (Q38029, apparently long gone from Microsoft’s website). By the time of Windows 95, the warning was a little reduced and softened (and was no longer particular to Intel):
WARNING: The 80386 processor in this computer may not reliably execute 32-bit multiplication. Windows may occasionally fail on this computer. You may want to replace your 80386 processor. Press any key to continue...
Some sources on the Internet associate this multiplication defect with the B1 stepping. This is perhaps indicative of an overall uncertainty in the historical record, which would in turn be an unsurprising side-effect of tightly restricted circulation of the original processor errata from Intel. This confusion seems to have affected Microsoft too. Although Microsoft places this defect firmly with the A0 stepping when developing Windows NT 3.1, a later Knowledge Base article Windows 95 Fails to Install on an 80386 Computer (Q119118) describes it just as definitely as affecting the B1 stepping:
Intel 386 microprocessors dated before April 1987 are known as B1 stepping chips. These chips are known to introduce random math errors when performing 32-bit operations, thus making them incompatible with Windows 95.
See that for both its operating systems of the time, Microsoft is explicit that its reason for warning about this stepping is exactly what it has tested. For the others, it’s not at all clear how what’s tested could matter enough to Windows (or even to any program or driver that’s ever written to run on Windows) to make the processor unsafe to use. More credible is that these tests are safe ways to identify processors that are separately known to have more serious faults.
The instruction whose support is tested for model 1 stepping 0 has the two-byte opcode 0x0F 0xA6 followed by a Mod R/M byte and by whatever more this byte indicates is needed. Intel’s Introduction to the 80386 Including the 80386 Data Sheet (order number 231746-001, dated April 1986) has this opcode as xbts in its table of instructions—and gives not just its encoding but its clock counts.
The instruction presumably does not survive even to the B1 stepping. Presumably also the B0 stepping is not rejected just for having this instruction that Windows is not known ever to have used except for identifying the B0 stepping. Yet for something so short-lived in real-world implementation, it has left surprisingly much history. First, it survived outside the implementation. The opcode is disassembled as xbts by Microsoft’s C/C++ linker, typically through the DUMPBIN tool, even as recently as Visual Studio 2019 and has been since at least the mid-90s. (See Strange Things LINK Knows About 80x86 Processors.) Second, its two-byte opcode got a second life. The Opcode Table (Appendix A) in Intel’s i486™ Microprocessor Programmer’s Reference Manual (order number 240486-001, dated 1990) clearly shows 0x0F 0xA6 as assigned to what was then the new cmpxchg instruction. The next edition (order number 240486-002, dated 1992) fills the same space unusually specifically:
Confusion certainly followed. See, for instance, the following line from a file named DISASM.H in the Dr. Watson programming sample in Microsoft’s Win32 SDK, later named Platform SDK, at least until 1997:
dszCMPXCHG,O_bModrm_Reg, /* A6 XBTS */
How widespread this confusion was, or what trouble it caused, is not known, but it seems to have left a lasting mark: Intel’s opcode charts leave 0x0F 0xA6 unassigned even now.
The specific test performed by the Windows kernel is to execute xbts ecx,edx having loaded eax and edx with zero and ecx with 0xFF00. If this does not cause an Invalid Opcode exception and clears ecx to zero, then xbts is deemed to be supported and the processor is model 1 stepping 0. This test dates at least to 1987. Windows/386 version 2.01 is very terse when rejecting an 80386 that has the xbts instruction:
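The logic of this test, as distinct from the mechanics of executing the opcode and catching the exception, can be sketched in C. Everything named here is hypothetical: the attempt is modelled as a callback that reports whether Invalid Opcode was raised and what ecx held afterwards, with two stand-ins for the two outcomes.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     invalid_opcode;  /* the attempt raised Invalid Opcode */
    uint32_t ecx;             /* ecx after the attempt */
} xbts_outcome;

/* Hypothetical hook that attempts xbts ecx,edx with the given registers. */
typedef xbts_outcome (*xbts_attempt_fn)(uint32_t eax, uint32_t ecx, uint32_t edx);

/* Load eax and edx with zero and ecx with 0xFF00, then judge the outcome. */
static bool supports_xbts(xbts_attempt_fn attempt)
{
    xbts_outcome o = attempt(0, 0xFF00, 0);
    return !o.invalid_opcode && o.ecx == 0;
}

/* Stand-in for a processor that has xbts: no exception, ecx cleared. */
static xbts_outcome attempt_with_xbts(uint32_t eax, uint32_t ecx, uint32_t edx)
{
    (void)eax; (void)ecx; (void)edx;
    return (xbts_outcome){ false, 0 };
}

/* Stand-in for a processor without xbts: exception, ecx untouched. */
static xbts_outcome attempt_without_xbts(uint32_t eax, uint32_t ecx, uint32_t edx)
{
    (void)eax; (void)edx;
    return (xbts_outcome){ true, ecx };
}
```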
Error: Unsupported Intel 80386 CPU version.
By the time of Windows 3.10 Enhanced Mode, the rejection reads with less certainty:
Windows may not run correctly with the 80386 processor in this computer. Upgrade your 80386 processor or start Windows in standard mode by typing WIN /s at the MS-DOS prompt.
It changes again for Windows 95, which has no Standard Mode to offer as a fallback. Although the text does not name a stepping, the routine that does the test in these later versions of the Windows that ran on DOS returns 0xB0 if the processor supports the xbts instruction, else 0xB1.
When string instructions such as movsb are repeated because of a rep prefix, each iteration is ordinarily interruptible. As Intel says (for rep in the Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 2B: Instruction Set Reference M-U), this “allows long string operations to proceed without affecting the interrupt response time of the system.” That repeated instructions are interruptible applies also to the Debug exception, such as raised by the processor at the end of executing an instruction for which the TF bit is set in the eflags when the instruction started. Programmers may have noticed this in assembly-language debugging: rep movsb may take many keystrokes to trace through.
The appearance of tracing through rep movsb without interruption might be welcome in practice when debugging, and Microsoft’s WDEB386 for the Windows that ran on DOS did give this effect by setting an int 3 breakpoint where the instruction is calculated to end. Even so, missing the Debug exception on even one iteration when actually tracing through rep movsb certainly is a defect. The kernel tests with 2 as the counter in ecx. The movsb should execute twice and ecx should count down to zero, having produced two Debug exceptions. The kernel has the first Debug exception escape from the rep. If ecx reaches zero, then the first of the expected Debug exceptions was missed and the kernel figures it is running on model 1 stepping 1.
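A toy model in C may help to show what the test observes. It does not execute rep movsb or set the TF bit. It only simulates the counter under the two behaviours, with hypothetical names, to show why a zero in ecx at the first Debug exception gives the defect away.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulates tracing rep movsb and returns ecx as seen at the first Debug
   exception, whose handler escapes from the rep. If the processor traps
   after every iteration, that is after one movsb. If it traps only at
   completion, the counter has already run down to zero. */
static uint32_t ecx_at_first_debug_exception(uint32_t count,
                                             bool traps_every_iteration)
{
    uint32_t ecx = count;
    while (ecx != 0) {
        ecx--;                      /* one iteration of movsb */
        if (traps_every_iteration)
            break;                  /* Debug exception: escape from the rep */
    }
    return ecx;
}

/* The test as described: start with 2 in ecx and look for zero. */
static bool misses_debug_exception(bool traps_every_iteration)
{
    return ecx_at_first_debug_exception(2, traps_every_iteration) == 0;
}
```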
The coding of all six of the preceding tests for early steppings of the 80386 and 80486 was settled for Windows NT by mid-1992, if not before. The oldest pre-release version that has yet been obtained for inspection is 3.10.297.1, built on 28th June 1992. Its kernel does none of its own processor identification but instead learns from NTLDR. This loader’s processor identification is done before it is yet known that the 32-bit instruction set is available. After eliminating the 8086 and 80286, there is just the AC bit to distinguish the 80386 and 80486. This loader knows nothing of the ID bit or the cpuid instruction. It does, however, know the same six tests for steppings. Except that the code is 16-bit and executes in real mode, the main difference is just that it doesn’t try to form model and stepping numbers as if for a processor signature. Instead, the routines for each test return hexadecimal representations of the letter-and-stepping notation, i.e., 0xA0 through to 0xD1. This loader accepts the defective 32-bit multiplication without warning, but it rejects the other two early 80386 steppings and is very precise for its explanation:
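The two notations are easy to relate. The sketch below is an illustration of the correspondence, with hypothetical function names: it is not code from either the loader or the kernel. It packs the kernel's style of model and stepping into the loader's style of one-byte code, whose hexadecimal digits read as the stepping's name.

```c
#include <assert.h>
#include <stdint.h>

/* Model 0 is letter A, model 1 is B, and so on, so that model 1
   stepping 1 becomes 0xB1. Only the letters A to F fit in a hex digit. */
static uint8_t stepping_code(unsigned model, unsigned stepping)
{
    assert(model <= 5 && stepping <= 9);
    return (uint8_t)(((0xAu + model) << 4) | stepping);
}

/* The reverse: recover the model number from a code such as 0xD1. */
static unsigned code_model(uint8_t code)
{
    assert(code >= 0xA0);
    return (code >> 4) - 0xAu;
}

/* The reverse: recover the stepping number from the low hex digit. */
static unsigned code_stepping(uint8_t code)
{
    return code & 0xFu;
}
```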
Windows NT has detected that your i386 CPU version is B0 or B1. Windows NT will not run on this CPU. Newer versions are available. Please contact your computer manufacturer for an upgrade.
By version 3.10.328.1, built on 12th October 1992, processor identification had moved to the kernel, which now includes the A0 stepping among the rejects. The code is all 32-bit, of course, and by then knew of a rudimentary cpuid instruction. For processors that don’t have this instruction, the only change in the identification algorithms on the way to the formal release (version 3.10.5098.1, built on 24th July 1993) was to reverse the order of distinguishing the families. In the pre-release code, inability to change the AC bit in the eflags implies an 80386 and then inability to change the ID bit implies an 80486. The pre-release code thus checks for the old while progressing to the new, but the released code starts by hoping for the new and falling back.
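The two orders can be compared in a small C model. A processor is represented here, hypothetically, by which eflags bits software is able to change: the AC bit is 18 and the ID bit is 21. The real code changes eflags through the stack and, in the released version, goes on to check that cpuid has leaf 1, which this sketch leaves out.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFLAGS_AC (UINT32_C(1) << 18)
#define EFLAGS_ID (UINT32_C(1) << 21)

/* Hypothetical model of a processor: current eflags and the changeable bits. */
typedef struct {
    uint32_t eflags;
    uint32_t writable;
} cpu_model;

static void write_eflags(cpu_model *cpu, uint32_t value)
{
    cpu->eflags = (cpu->eflags & ~cpu->writable) | (value & cpu->writable);
}

/* Try to flip one bit, see whether the change sticks, then restore. */
static bool can_change(cpu_model *cpu, uint32_t bit)
{
    uint32_t before = cpu->eflags;
    write_eflags(cpu, before ^ bit);
    bool changed = ((cpu->eflags ^ before) & bit) != 0;
    write_eflags(cpu, before);
    return changed;
}

enum { FAMILY_FROM_CPUID = 0, FAMILY_80386 = 3, FAMILY_80486 = 4 };

/* Pre-release order: check for the old while progressing to the new. */
static int family_prerelease(cpu_model *cpu)
{
    if (!can_change(cpu, EFLAGS_AC)) return FAMILY_80386;
    if (!can_change(cpu, EFLAGS_ID)) return FAMILY_80486;
    return FAMILY_FROM_CPUID;
}

/* Released order: hope for the new, then fall back. */
static int family_released(cpu_model *cpu)
{
    if (can_change(cpu, EFLAGS_ID)) return FAMILY_FROM_CPUID;
    if (can_change(cpu, EFLAGS_AC)) return FAMILY_80486;
    return FAMILY_80386;
}
```

For the three kinds of processor modelled (neither bit changeable, AC only, both), the two orders reach the same answer. Only the sequence of tests differs.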