mr4th-dynamic-linking/win32_before_main/DETAILS.txt

Being able to run code 'before main' isn't just a magic trick. In a situation
where there may be more than one layer with dynamic linking and more than one
.dll (a plugin system, for instance), setting up each layer by hand in a
central DllMain or main becomes a real maintenance burden.

It is possible to get this effect on Windows through the CL compiler, but it
would be a stretch to say that it is "supported". The way I show here relies
on a special section of function pointers that the C runtime walks, calling
each one, before 'main' or 'DllMain' runs. We can use CL's compiler extensions
to add a function pointer to that section just by declaring it as a global
variable and marking it up with:
__declspec(allocate(".CRT$XCU"))
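A minimal sketch of such a declaration, assuming the default calling
convention. The names my_before_main, my_before_main_ptr, g_before_main_ran
and check_before_main_ran are made up for illustration. CL wants the section
declared with #pragma section before allocate() can name it. The #else branch
is not the .CRT$XCU mechanism; it is a GCC/clang stand-in so the sketch also
builds off Windows.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical flag, only here so we can observe that the hook ran. */
static int g_before_main_ran = 0;

/* Hypothetical hook; whatever setup the layer needs would go here. */
static void my_before_main(void) {
    g_before_main_ran = 1;
}

#if defined(_MSC_VER)
/* Declare the section first; allocate() refuses unknown section names. */
#pragma section(".CRT$XCU", read)
/* A global function pointer placed in the CRT's initializer section. */
__declspec(allocate(".CRT$XCU"))
void (*my_before_main_ptr)(void) = my_before_main;
#else
/* Stand-in for GCC/clang: a different mechanism with the same effect. */
static void my_before_main_fallback(void) __attribute__((constructor));
static void my_before_main_fallback(void) { my_before_main(); }
#endif

/* Meant to be called first thing in main. */
void check_before_main_ran(void) {
    assert(g_before_main_ran == 1);
    puts("hook ran before main");
}
```

Calling check_before_main_ran() at the top of main should pass the assert,
since the hook has already been called by then.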

If you look up this method on the internet, you will find claims that under
certain types of whole-program optimization, this won't work. In particular,
this happens if you use CL's /GL option.

This happens because the global variable appears to be unused from the
perspective of the compiler & linker. Since it is never directly referenced,
there is no C-level semantic reason to think this global variable is doing
anything.

However, in this example I show how we can still make it work. We have to
make sure the linker won't eliminate the global variable that we are trying to
place into the ".CRT$XCU" section. I achieve this by marking it as an export
symbol. The linker has to keep export symbols even if nothing in the binary
references them. From what I've seen in testing, this works as desired, even
with the /GL option.

IMPORTANT RESTRICTION: Because this creates an export symbol, each use of this
within a binary must have a unique name. Generally I would recommend naming
each before-main symbol by scoping it to the layer where it exists.
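One way the export trick and the naming rule could be wrapped up is a small
macro. This is a sketch, not necessarily the code used in this example: it
assumes the export is done with __declspec(dllexport), and BEFORE_MAIN, the
two layer names, g_layers_ready and check_layers_ready are all hypothetical.
The #else branch is a GCC/clang stand-in (a different mechanism) so the sketch
also builds off Windows.

```c
#include <assert.h>

#if defined(_MSC_VER)
#pragma section(".CRT$XCU", read)
/* The pointer is exported so the linker keeps it. The exported symbol is
   <name>_ptr, which is why <name> must be unique within the binary. */
#define BEFORE_MAIN(name)                                   \
    static void name(void);                                 \
    __declspec(dllexport) __declspec(allocate(".CRT$XCU"))  \
    void (*name##_ptr)(void) = name;                        \
    static void name(void)
#else
/* Stand-in for GCC/clang; no export is needed with this mechanism. */
#define BEFORE_MAIN(name)                                   \
    static void name(void) __attribute__((constructor));    \
    static void name(void)
#endif

/* Hypothetical counter, only here so we can observe the hooks. */
static int g_layers_ready = 0;

/* Each hook is named after the layer it belongs to. */
BEFORE_MAIN(base_layer_before_main)   { g_layers_ready += 1; }
BEFORE_MAIN(plugin_layer_before_main) { g_layers_ready += 1; }

/* Meant to be called from main: both hooks should have run by then. */
void check_layers_ready(void) {
    assert(g_layers_ready == 2);
}
```

Reusing a name would declare the same exported pointer twice, so prefixing
with the layer name keeps the symbols apart.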

CLANG NOTE: Interestingly, clang can build this, but it can also use the
__attribute__((constructor)) extension on Windows, which is a lot closer to
"supporting" this feature. I suspect that when I am building with clang I will
prefer to go with this option most of the time.
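For comparison, a minimal sketch of the attribute form. It needs clang or GCC,
since plain CL does not accept __attribute__. The names clang_before_main,
g_ctor_ran and check_ctor_ran are made up for illustration.

```c
#include <assert.h>

/* Hypothetical flag, only here so we can observe that the hook ran. */
static int g_ctor_ran = 0;

/* Functions marked as constructors are called before main. No section
   placement and no export symbol are involved. */
static void clang_before_main(void) __attribute__((constructor));
static void clang_before_main(void) {
    g_ctor_ran = 1;
}

/* Meant to be called from main to confirm the hook already ran. */
void check_ctor_ran(void) {
    assert(g_ctor_ran == 1);
}
```

Since the function can stay static, the unique export name restriction does
not come up in this form.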