Geoff Chappell, Software Analyst
This function creates an object for profiling a process’s execution within a specified range of addresses.
NTSTATUS
NtCreateProfileEx (
    HANDLE *ProfileHandle,
    HANDLE Process,
    PVOID ProfileBase,
    SIZE_T ProfileSize,
    ULONG BucketSize,
    ULONG *Buffer,
    ULONG BufferSize,
    KPROFILE_SOURCE ProfileSource,
    USHORT GroupCount,
    GROUP_AFFINITY *AffinityArray);
The ProfileHandle argument is the address of a variable that is to receive a handle to the created profile object. This handle can then be given to the NtStartProfile and NtStopProfile functions to start and stop the profiling that this function sets up.
The Process argument limits the profiling to a specified process. This argument can be NULL to profile globally.
The ProfileBase and ProfileSize arguments are respectively the address and size, in bytes, of a region of address space to profile. The 32-bit builds allow a special case in which the ProfileBase is instead a segment address: this applies if the BucketSize argument is zero.
The BucketSize argument selects a granularity for the profiling. Think of the profiled region as an array of buckets. Profiling produces a count of executions that are discovered within each bucket. The function supports buckets whose size in bytes is a power of two. As an argument, the BucketSize is not in bytes but is instead the logarithm base 2 of the size in bytes.
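To make the logarithmic encoding concrete, the following is a minimal sketch of how a BucketSize translates to bytes and how an address within the profiled region maps to a counter. The helper names bucket_bytes and bucket_index are hypothetical, invented for illustration only. They are not taken from the kernel, and the kernel's own arithmetic may differ in detail.

```c
#include <stdint.h>

/* Size in bytes of one bucket, given BucketSize as a base-2 logarithm.
   Hypothetical helper, for illustration only. */
static uint64_t bucket_bytes(unsigned bucket_size_log2)
{
    return (uint64_t)1 << bucket_size_log2;
}

/* Index of the bucket (and so of the ULONG counter) that an address falls
   into, or -1 if the address lies outside the profiled region. */
static int64_t bucket_index(uint64_t profile_base, uint64_t profile_size,
                            unsigned bucket_size_log2, uint64_t address)
{
    if (address < profile_base || address - profile_base >= profile_size)
        return -1;
    return (int64_t)((address - profile_base) >> bucket_size_log2);
}
```

With a BucketSize of 4, for instance, each bucket covers 16 bytes, so that an execution at offset 0x23 into the region would be counted in the bucket whose index is 2.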
The Buffer and BufferSize arguments are respectively the address and size, in bytes, of a buffer that is to receive the ULONG execution counts for successive buckets while profiling is started but not stopped.
The ProfileSource argument limits the profiling to the specified source.
The GroupCount is the number of elements in the array whose start address is given by AffinityArray. Together they limit the profiling to the specified processors, which must be active. The GroupCount can be 0 to stand for all active processors. In this case, the AffinityArray is ignored.
The function returns STATUS_SUCCESS if successful, else a negative error code.
The NtCreateProfileEx function and its alias ZwCreateProfileEx are exported by name from NTDLL in version 6.1 and higher. In kernel mode, where ZwCreateProfileEx is a stub and NtCreateProfileEx is the implementation, neither is exported.
Neither NtCreateProfileEx nor its alias is documented. As ZwCreateProfileEx, it is declared in the ZWAPI.H file from an Enterprise edition of the Windows Driver Kit (WDK) for Windows 10.
Unusually for native API functions, no repackaging of NtCreateProfileEx, documented or not, is known in any higher-level user-mode module that is distributed as standard with Windows.
The following implementation notes are from inspection of the kernel from the original release of Windows 10 only. They may some day get revised to account for earlier versions. Meanwhile, where anything is added about earlier versions, take it not as an attempt at comprehensiveness but as a bonus from my being unable to resist a trip down memory lane or at least a quick look into the history.
The function has no purpose except to create a profile object that can be made to store execution counts in the given buffer. If BufferSize is zero, the function returns STATUS_INVALID_PARAMETER_7.
The 32-bit implementation allows that BucketSize can be zero, apparently to indicate that ProfileBase is not an address but a segment selector. Provided that the ProfileBase has its high word clear and that BufferSize is at least 4, the function computes a BucketSize to adopt by default. It perhaps suffices just to say that the computation aims for the smallest bucket that can span the given ProfileSize without needing more counters than are allowed for in the given BufferSize.
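The stated aim of the default can be sketched as a simple search. This is emphatically not the kernel's code, whose computation is not reproduced here. It is only one way to find the smallest bucket that satisfies the description, assuming 4-byte counters and the range of 2 to 0x1F that the function allows for BucketSize. The name default_bucket_log2 is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: the smallest BucketSize (as a base-2 logarithm,
   from 2 to 31) for which one 4-byte counter per bucket fits within
   buffer_size bytes. Returns 0 if no bucket in that range is big enough. */
static unsigned default_bucket_log2(uint64_t profile_size, uint64_t buffer_size)
{
    uint64_t max_counters = buffer_size / 4;
    for (unsigned shift = 2; shift <= 31; shift++) {
        uint64_t bucket = (uint64_t)1 << shift;
        /* Number of buckets needed to span profile_size, rounding up. */
        uint64_t needed = profile_size / bucket + (profile_size % bucket != 0);
        if (needed <= max_counters)
            return shift;
    }
    return 0;
}
```

For example, 0x10000 bytes to profile and a 0x1000-byte buffer give room for 1024 counters, which requires buckets of at least 64 bytes, so this sketch would settle on 6.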
The smallest allowed bucket is 4 bytes, represented by 2 for the BucketSize. The largest is 2GB, represented by 0x1F for the BucketSize. Given a BucketSize outside this range, the function returns STATUS_INVALID_PARAMETER.
The buckets span the ProfileSize bytes that are to be profiled. If the BufferSize is too small to allow one ULONG counter for each such bucket, the function returns STATUS_BUFFER_TOO_SMALL. Beware, however, that the coding of this defence is defective in all known versions from 3.10 until Microsoft corrected it for the 1703 release of Windows 10 (or perhaps some slightly earlier update), having been notified during this article’s development in December 2016 and January 2017. Among the implications is that unprivileged user-mode software can reliably induce the kernel to bug-check while handling some subsequent profile interrupt. Details are presented separately, with source code to demonstrate how this coding error can be abused to cause a Bug Check From User Mode By Profiling.
If the ProfileBase is too high for ProfileSize bytes to follow, the function returns STATUS_BUFFER_OVERFLOW.
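The three size checks just described can be put together as a sketch of what the validation is evidently intended to achieve. It shows the checks as described, not the defective coding of the kernel before the 1703 release. The enumeration merely stands for the named status codes, the function name check_profile_region is hypothetical, the order of the checks is an assumption, and addresses are modelled as 64-bit integers whatever the build.

```c
#include <stdint.h>

typedef enum {
    CHECK_OK,
    CHECK_INVALID_PARAMETER,   /* stands for STATUS_INVALID_PARAMETER */
    CHECK_BUFFER_TOO_SMALL,    /* stands for STATUS_BUFFER_TOO_SMALL */
    CHECK_BUFFER_OVERFLOW      /* stands for STATUS_BUFFER_OVERFLOW */
} check_result;

/* Hypothetical model of the intended checks; the order is an assumption. */
static check_result check_profile_region(uint64_t profile_base,
                                         uint64_t profile_size,
                                         unsigned bucket_log2,
                                         uint64_t buffer_size)
{
    /* Buckets may be from 4 bytes (2) to 2GB (0x1F). */
    if (bucket_log2 < 2 || bucket_log2 > 31)
        return CHECK_INVALID_PARAMETER;

    /* The buffer needs one 4-byte counter for each bucket, rounding up. */
    uint64_t bucket = (uint64_t)1 << bucket_log2;
    uint64_t buckets = profile_size / bucket + (profile_size % bucket != 0);
    if (buckets > buffer_size / 4)
        return CHECK_BUFFER_TOO_SMALL;

    /* The region must not wrap past the top of the address space. */
    if (profile_size > UINT64_MAX - profile_base)
        return CHECK_BUFFER_OVERFLOW;

    return CHECK_OK;
}
```

Profiling 0x100 bytes with 16-byte buckets, for instance, needs 16 counters and therefore a buffer of at least 0x40 bytes.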
Profiling is coordinated with the Hardware Abstraction Layer (HAL), which has the job of arranging for an interrupt at each occurrence of some event that acts as the profile source. Starting with version 6.2, the function checks that the given ProfileSource is one that the HAL supports. This is done via the HalQuerySystemInformation pointer in the kernel’s HAL_DISPATCH, specifically for the information class HalProfileSourceInformation (0x01). If the HAL does not support the given ProfileSource, the function returns STATUS_NOT_SUPPORTED.
If executing for a user-mode request, which looks to be necessary given that the function is not exported in kernel mode and is not called internally, the function has some general defensiveness about addresses that it is given for input or output. Failure at any of these defences, whose descriptions follow in the next paragraph, is failure for the function, typically showing as a return of STATUS_DATATYPE_MISALIGNMENT or STATUS_ACCESS_VIOLATION.
The variable at ProfileHandle must start in user-mode address space and be writable. The Buffer must be entirely in user-mode address space, have 4-byte alignment at its start and be writable both at the start and at every page boundary within. Unless GroupCount is zero, the AffinityArray must be entirely in user-mode address space and have 4-byte alignment at its start.
The GroupCount and AffinityArray must correctly describe only active processors, else the function returns STATUS_INVALID_PARAMETER. Specifically, the processor specification is rejected if any GROUP_AFFINITY in the array has an invalid Group, an empty Mask, a Mask that has a bit set for a processor that is not active in the group, or has any non-zero Reserved member.
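As a sketch of these four grounds for rejection, the following models the check with a stand-in structure. The group_affinity type imitates the documented layout of GROUP_AFFINITY on 64-bit Windows but is declared here so that the example is self-contained. The active_masks table, which gives the active processors for each group, is a hypothetical substitute for whatever the kernel consults, as is the function name.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for GROUP_AFFINITY, assuming a 64-bit KAFFINITY. */
typedef struct {
    uint64_t mask;          /* Mask */
    uint16_t group;         /* Group */
    uint16_t reserved[3];   /* Reserved */
} group_affinity;

/* Returns 1 if every element names a valid group, has a non-empty mask
   of active processors only, and has all Reserved members zero; else 0.
   active_masks[g] is the assumed mask of active processors in group g. */
static int affinity_array_is_valid(const group_affinity *array, size_t count,
                                   const uint64_t *active_masks,
                                   size_t group_total)
{
    for (size_t i = 0; i < count; i++) {
        const group_affinity *ga = &array[i];
        if (ga->group >= group_total)
            return 0;                           /* invalid Group */
        if (ga->mask == 0)
            return 0;                           /* empty Mask */
        if (ga->mask & ~active_masks[ga->group])
            return 0;                           /* inactive processor */
        if (ga->reserved[0] | ga->reserved[1] | ga->reserved[2])
            return 0;                           /* non-zero Reserved */
    }
    return 1;
}
```

A failure in this model corresponds to the function returning STATUS_INVALID_PARAMETER. A GroupCount of zero never reaches such a check, since the AffinityArray is then ignored.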
If a handle is given as the Process argument, the function fails unless the handle references a process object (not some other type of object) and has the PROCESS_QUERY_INFORMATION permission.
The Process argument can be NULL to profile execution in the given range no matter by which process, but with two constraints.
The first applies only to the 32-bit implementation. If BucketSize was defaulted, then NULL for the Process implies that ProfileBase must be NULL too, else the function returns STATUS_INVALID_PARAMETER. The reasoning for this is not understood, though the effect is plainly that profiling execution in a segment is supported only if profiling a specific process.
Second, SeSystemProfilePrivilege is required of any user-mode request that would profile all processes but specifies a ProfileBase in user-mode address space. Without this privilege, the function returns STATUS_PRIVILEGE_NOT_HELD. The intention is presumably that an unprivileged caller should not be able to specify NULL for the Process as a way to learn even indirectly about the user-mode execution of processes that it would not be permitted to profile explicitly.
See that a user-mode caller does not need SeSystemProfilePrivilege to profile kernel-mode execution. This is a potentially huge leak of information about the distribution of kernel-mode software. There’s not much point to defences such as Address Space Layout Randomisation (ASLR) if an attacker who wants a kernel-mode location at which to exploit a vulnerability can get a good guess just by starting some well-chosen activity and observing where kernel-mode execution changes in response. Starting in version 6.3, this is closed off to restricted callers (meaning, essentially, those that have low integrity). Whatever the Process argument, if a user-mode request from a restricted caller asks to profile a region that reaches into kernel-mode address space, the function returns STATUS_ACCESS_DENIED.
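The two access rules for user-mode requests can be summarised as a small decision function. This is a sketch of the behaviour as described for version 6.3 and higher, not a transcription of the kernel's code. The profile_request structure, the user_limit parameter (standing for the highest user-mode address) and the order in which the two rules are tested are all assumptions made for illustration.

```c
#include <stdint.h>

typedef enum {
    ACCESS_OK,
    ACCESS_PRIVILEGE_NOT_HELD,  /* stands for STATUS_PRIVILEGE_NOT_HELD */
    ACCESS_DENIED               /* stands for STATUS_ACCESS_DENIED */
} access_result;

/* Hypothetical summary of a user-mode request. */
typedef struct {
    int process_is_null;        /* Process argument is NULL */
    int has_profile_privilege;  /* caller has SeSystemProfilePrivilege */
    int is_restricted;          /* caller is restricted (low integrity) */
    uint64_t profile_base;
    uint64_t profile_size;      /* assumed already checked not to wrap */
} profile_request;

static access_result check_access(const profile_request *r, uint64_t user_limit)
{
    int base_in_user = r->profile_base <= user_limit;
    int reaches_kernel = r->profile_size > 0 &&
        r->profile_base + (r->profile_size - 1) > user_limit;

    /* Version 6.3 and higher: restricted callers may not profile kernel mode. */
    if (r->is_restricted && reaches_kernel)
        return ACCESS_DENIED;

    /* Profiling user-mode execution of all processes needs the privilege. */
    if (r->process_is_null && base_in_user && !r->has_profile_privilege)
        return ACCESS_PRIVILEGE_NOT_HELD;

    return ACCESS_OK;
}
```

In this model, an unrestricted caller without the privilege is still allowed to profile a region that lies wholly in kernel-mode address space, which is the leak described above.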
Given that the parameters are not rejected, the function transfers them to an executive profile object. This is a formal object in the kernel’s namespace of objects, though instances are not named. Microsoft’s name for the object as a C-language structure is not known, though there would be no surprise if it turned out to be EPROFILE. If the object cannot be created, the function fails. If a handle cannot be created for the object, granting whatever permission is represented by 0x00000001, the function fails.
Ordinarily, however, the function returns the handle via the address given as ProfileHandle and returns STATUS_SUCCESS. The handle can then be used in pairs of calls to the NtStartProfile and NtStopProfile functions to start and stop the profiling as configured from this function’s parameters, any number of times before being closed, e.g., through CloseHandle.
Special mention must be made of an indirect effect. Though the NtCreateProfileEx function’s validated parameters are not acted on until some later call to NtStartProfile, callers of the former are exposed in theory to defects in the latter and even to defects that are merely activated by the latter. In versions 6.2 and higher, this exposure is not just hypothetical. Starting the profile sets up a recurring interrupt, which the kernel learns about as calls from the HAL to the kernel’s KeProfileInterruptWithSource function. Because of a coding error in this KeProfileInterruptWithSource function’s interpretation of the profile’s parameters, even a correctly validated combination of ProfileSize, BucketSize and BufferSize can cause the same bug check that versions before the 1703 release of Windows 10 make possible by incorrectly validating the parameters during NtCreateProfileEx.