Geoff Chappell, Software Analyst
If the HAL’s initialisation of Hardware Performance Counters establishes that the boot processor is from Intel and has at least some support for Performance Monitoring Counters, and that this support is not masked from Windows by a Microsoft-compatible hypervisor, the HAL chooses the Emon profile interface. The name Emon appears to come from Intel’s manuals, apparently standing for Event Monitoring.
The cpuid leaf for learning about profiling is 0x0A. The low byte returned in eax is the “version ID of architectural performance monitoring” and is known already to be at least 1. The HAL saves this but is not known to make any use of it (nor any, yet, of the fixed-function performance counters that Intel documents as being supported if the version is greater than 1). The middle bytes in eax tell how many general-purpose performance monitoring counters are supported by each logical processor and how wide, in bits, are those counters. The high byte in eax tells how many bits are meaningful in ebx. Each bit that is both meaningful and clear confirms that a corresponding performance event is available. It is already known that at least the first bit is meaningful and clear. Depending on those bits in ebx, some performance events that the Emon profile interface might support instead become unsupported.
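The layout of eax just described can be sketched in C. This is only an illustration of the bit fields as Intel documents them, not a reconstruction of the HAL’s code: the structure and function names are invented for the example, and the sample value used in the note that follows is made up rather than read from any particular processor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of eax from cpuid leaf 0x0A. All names here are illustrative. */
typedef struct {
    uint8_t version;        /* bits 0-7: version ID of architectural performance monitoring */
    uint8_t counter_count;  /* bits 8-15: general-purpose counters per logical processor */
    uint8_t counter_width;  /* bits 16-23: width of each counter, in bits */
    uint8_t ebx_bit_count;  /* bits 24-31: how many bits of ebx are meaningful */
} pmc_leaf_info;

/* Split a raw eax value into its four byte-wide fields. */
static void decode_pmc_leaf_eax(uint32_t eax, pmc_leaf_info *out)
{
    assert(out != NULL);
    out->version       = (uint8_t)(eax & 0xFF);
    out->counter_count = (uint8_t)((eax >> 8) & 0xFF);
    out->counter_width = (uint8_t)((eax >> 16) & 0xFF);
    out->ebx_bit_count = (uint8_t)((eax >> 24) & 0xFF);
}
```

Given a hypothetical eax of 0x07300404, this yields version 4, four counters that are each 48 bits wide, and seven meaningful bits in ebx. With GCC or Clang, a real value could be obtained in user mode through `__get_cpuid_count` from `<cpuid.h>`.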
For the purpose of interaction with the kernel, and indeed all the way to user mode through such functions as NtCreateProfile and NtCreateProfileEx, these performance events are abstracted as profile sources, represented numerically by a KPROFILE_SOURCE enumeration. The Emon profile interface in the HAL from the original release of Windows 10 supports the following:
Microsoft’s names for values below 0x19 are known from the enumeration’s C-language definition in WDM.H from the Windows Driver Kit (WDK). Presumably, the values from 0x19 and higher are omitted from that definition because they are processor-specific and the definition is meant to be general. Names for the Emon-specific profile sources are inferred from descriptive strings in the HAL, which can be obtained even from user mode through ZwQuerySystemInformation when given the information class SystemPerformanceTraceInformation (0x1F) and the secondary information class EventTraceProfileSourceListInformation (0x0D) as the first dword in the information buffer. For the values that are named in KPROFILE_SOURCE, the name is this descriptive string but with Profile as a prefix. Extrapolation of this relationship to the extra values seems at least a reasonable guess.
For each profile source other than ProfileTime, which is handled by a separate mechanism, if the corresponding bit shown in the column headed EBX Bit is either not meaningful according to the high byte that cpuid leaf 0x0A returned in eax or is set in what the same cpuid leaf returned in ebx, then the profile source becomes regarded as unsupported.
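As a minimal sketch of this test, assuming only the rule as stated, the following C function takes the raw eax and ebx from cpuid leaf 0x0A and the bit number for a profile source. The function name is hypothetical and the HAL’s own code is surely organised differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the performance event tracked by the given ebx bit is
 * available: the bit must be within the count of meaningful bits (high byte
 * of eax) and must be clear in ebx. The name is illustrative only. */
static bool emon_event_supported(uint32_t eax, uint32_t ebx, unsigned ebx_bit)
{
    unsigned meaningful = (eax >> 24) & 0xFF;

    assert(ebx_bit < 32);
    if (ebx_bit >= meaningful) {
        return false;               /* bit is not meaningful: unsupported */
    }
    return ((ebx >> ebx_bit) & 1u) == 0;   /* a set bit means unavailable */
}
```

Note the inverted sense: a clear bit confirms availability, so an ebx of zero with seven meaningful bits reports all seven events as supported.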
There also corresponds to each profile source a value that must be loaded into a Performance Event Select Register to, well, select the corresponding performance event. The Performance Event Select Registers are consecutive model-specific registers beginning at 0x0186, one for each counter that cpuid leaf 0x0A declared. The counters themselves are the consecutive model-specific registers beginning at 0xC1. Initially, the Emon profile interface loads zero into each of the declared Performance Event Select Registers.
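The register numbering can be illustrated in C. Writing a model-specific register needs kernel-mode privilege, so this sketch does not touch hardware: it only computes which registers the initial zeroing would address and records the intended writes in an array. The type and function names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Base register numbers as given in the text. */
enum { MSR_EVENT_SELECT_BASE = 0x186, MSR_COUNTER_BASE = 0xC1 };

/* One intended register write (illustrative type). */
typedef struct {
    uint32_t msr;
    uint64_t value;
} msr_write;

/* Register number of the Performance Event Select Register for a counter. */
static uint32_t event_select_msr(unsigned counter)
{
    return MSR_EVENT_SELECT_BASE + counter;
}

/* Register number of the counter itself. */
static uint32_t counter_msr(unsigned counter)
{
    return MSR_COUNTER_BASE + counter;
}

/* Record a zero write for each declared event select register.
 * Returns how many writes were recorded (at most capacity). */
static size_t plan_initial_writes(unsigned counter_count,
                                  msr_write *out, size_t capacity)
{
    size_t n = 0;

    assert(out != NULL);
    for (unsigned i = 0; i < counter_count && n < capacity; i++) {
        out[n].msr = event_select_msr(i);
        out[n].value = 0;
        n++;
    }
    return n;
}
```

For four declared counters, the plan covers registers 0x186 through 0x189, with the matching counters at 0xC1 through 0xC4.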
Note that the Emon-specific profile sources are the complete set. The generally defined profile sources, numbered below 0x19, that the Emon profile interface can support are just those that map to the Emon-specific profile sources. The mapping is not one-to-one: whatever use numbers 0x1B and 0x1C may be, they are available only to those in the know. LLC, by the way, stands for Last Level Cache.
For the sake of completeness, note that the Emon profile interface requires 8 bytes of memory per counter per processor. The number of counters per processor is known from cpuid, as explained above. The number of processors is not known at the time and anyway can change. The HAL allows for the maximum possible number of registered processors, the meaning of which is a small topic in itself. Failure to get the memory, which is almost unthinkable, causes all profile sources to be treated as unsupported.
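The arithmetic is simple enough, but a short C sketch makes the inputs explicit. The function name is hypothetical, and the processor count in the note that follows is chosen only for illustration, not taken from the HAL.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes the interface would need: 8 per counter per processor.
 * counters_per_processor comes from a byte-wide cpuid field, hence the bound. */
static size_t emon_state_bytes(unsigned counters_per_processor,
                               unsigned max_processors)
{
    assert(counters_per_processor <= 0xFF);
    return (size_t)8 * counters_per_processor * max_processors;
}
```

With four counters per processor and an assumed maximum of 64 registered processors, the requirement would be 2,048 bytes.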