Geoff Chappell - Software Analyst
While attempting to document the native API functions that query and set so-called system information, I keep letting myself get diverted (happily) by other research and (unhappily) by the business processes of potential clients. Both the research and the business processes will come good in the end, I guess, and it surely is better to have potential clients than not. But there’s definitely a negative-feedback cycle when their overheads eat into research time, the funding of which is the reason for wanting clients.
That I’m trying to document these system-information functions is relevant to this month’s new research and writing because two cases of system information got me looking at what I eventually realised must be named the Kernel Shim Engine. At first, though, I knew only the names of the functions that are necessary for using the feature. To test whether it would be worth my while looking further, I searched Google for what has yet been written about the KseRegisterShim function. Imagine my surprise when the very first words Google showed me in the results were the name of my friend Travis Goodspeed. The tweet he had contributed to was from two years before and was only a request for ideas. It was someone else who suggested KseRegisterShim, for its potential to hook the Import Address Table of someone else’s software in kernel mode without tripping PatchGuard. Still, the experience was spooky enough that I couldn’t resist. So, here is my preliminary write-up for three of the functions and the data they work with.
All pages listed below are new. Many are still in progress to some extent, and will still be in September, if not beyond. Please remember, as you read any of these pages, that the analytical engine that produces them is available for consultation and job offers: you can have this experience, knowledge and skill working for you.
Had I searched instead for Kernel Shim Engine, I might have left the field to Alex Ionescu: Google’s results for that search tell prominently enough of his recent conference presentation Abusing the NT Kernel Shim Engine and I’d have surmised just from the link and his reputation that he very plausibly tells everyone who’ll ever be interested everything they might ever need to know. But surmise is all I could do, from Google, which apparently knows of no searchable text for these conference proceedings. Google’s link is just to a title and a precis. If you look around, there’s an MP4 of the talk, but no text or even slides (yet). Of course, books, letters, articles, proceedings, etc, that are published the old way, on paper, aren’t searchable (immediately) through Google either. We are not yet, and perhaps never should be, in a world where material is considered published only if Google finds it and indexes it. Still, it fascinates me that I would not know of Alex’s work on the subject except by word of mouth and from Alex himself. From Google nobody would know anything of it except by chancing on terms that other people might think to use not just somewhere in any useful text but specifically in titles.
Of course, Kernel Shim Engine is an obvious title once you get into the subject. But it’s far less obvious when you have just started and are thinking whether to go further. Even once you do get into the subject, the code itself yields very little evidence for any title. You can guess that Kernel Shim Engine is what Kse stands for in the prefixes to the names of functions. Certainly, shim and engine are suggested by many names from symbol files and from text that the kernel can write to log files, but only rarely together, as “kshim engine” for log-file text and, more persuasively, as KernelShimEngineProvider for the symbolic name of the GUID for the event provider that’s associated with the functionality. By contrast, you need barely start looking at the code to realise that the Kernel Shim Engine, or whatever you think it will turn out to be named, simply can’t be used without calling at least some of its exported functions. Their names are obvious things to search for in text if you want to find what is yet published on the subject, but they’re evidently not the terms to search for in titles. Hmm.
I’m not complaining, though, just musing. I’m even glad that I didn’t find substantial analyses already done by other hands, for I have come to realise that the topic is important for kernel-mode programming and is not without reverse-engineering interest. The integrity of any kernel-mode driver that you or I write now depends on what anyone else can work into the driver database file, DRVMAIN.SDB, in the AppPatch subdirectory of the Windows installation. Yet the world’s understanding of this file is primitive, to be kind.
Thankfully, though the kernel lets one driver register a shim to apply to another, it doesn’t apply the shim unless the shim is suitably configured in the driver database file DRVMAIN.SDB. This file was introduced in Windows XP to identify drivers that are blocked from loading. Now it seems rather more important. The integrity of your driver depends on how easily others may abuse the functionality and get Windows to apply their shim to your driver. Though the kernel, and much other software that works with SDB files, gets the necessary code from statically linked libraries, the code is also built into a DLL that exports functions for working with SDB files. Here’s the usual survey, as groundwork for actual research.
The first steps of such research—nothing that yet counts as serious work—produce what appears to be the Internet’s first list of all the tags that have been defined for the SDB file format since Microsoft’s somewhat grudging documentation of them for Windows Vista. Or so I thought at first. Then I came across names.py, which accurately both lists and names the tags as defined at the time of its writing, i.e., for Windows 8.1 With Update. I want to give the page full marks, for Google really does find no other instance at all of TAG_KSHIM_REF (for example) and no others of KSHIM_REF, i.e., without the TAG prefix, except for automated, unannotated lists of strings from executables. But the page doesn’t explain where the names come from. This brings to mind an unsettling observation about the state of reverse engineering if it’s ever to be formalised as a field of academic study. When it comes to explaining its work, it still tends to be at one extreme or the other. Too much of it goes into every little detail, almost as if hoping to explain mathematics to readers who are treated as not knowing how to count. Yet as much, if not more, presents reverse-engineered information as if obtained by magic (or, as surely does happen, by the happy discovery of leaked source code). The field, if it’s to be worth the word, needs to do better than this.
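For readers who have not yet met the SDB tags mentioned above, here is a minimal sketch of how a 16-bit tag splits into a type and an index. It assumes the type codes from Microsoft's published documentation of the tag types, in which the high four bits of a tag give its type. The function names tag_type and describe are hypothetical, made up for this illustration, and the two sample values are the documented TAG_DATABASE (0x7001) and TAG_NAME (0x6001). No value is given for TAG_KSHIM_REF, since nothing above states it.

```python
# Sketch only: type codes as given in Microsoft's documentation of SDB tag types.
# The helper names below are hypothetical, chosen for this illustration.
TAG_TYPE_MASK = 0xF000

TAG_TYPES = {
    0x1000: "NULL",
    0x2000: "BYTE",
    0x3000: "WORD",
    0x4000: "DWORD",
    0x5000: "QWORD",
    0x6000: "STRINGREF",
    0x7000: "LIST",
    0x8000: "STRING",
    0x9000: "BINARY",
}


def tag_type(tag: int) -> str:
    """Return the name of the type encoded in the high four bits of a tag."""
    if not 0 <= tag <= 0xFFFF:
        raise ValueError("an SDB tag is a 16-bit value")
    return TAG_TYPES.get(tag & TAG_TYPE_MASK, "UNKNOWN")


def describe(tag: int) -> str:
    """Format a tag as its value, its type name and its index within that type."""
    return f"0x{tag:04X}: type {tag_type(tag)}, index 0x{tag & 0x0FFF:03X}"


if __name__ == "__main__":
    for sample in (0x7001, 0x6001):  # TAG_DATABASE, TAG_NAME
        print(describe(sample))
```

A list of tag names, however accurate, says nothing about where the names come from. All that the numerical value tells by itself is the type of data to expect for the tag.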