
Loader race issue


birt


So, I have this packed service that I need to patch at runtime without actually running a loader. I took the easy way: I wrote a DLL and added it to AppInit_DLLs, returning 0 (FALSE) from its DllMain if it's being attached to a different process. So far so good.
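For readers following along, the process filter described above could look roughly like the sketch below. It is not the poster's actual code: the helper `is_target_process` and the executable name `service.exe` are hypothetical, and the Windows-only part is guarded so the helper can be compiled and tried anywhere.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: true when the file name part of `path` equals
// `target`, ignoring case (Windows file names are case-insensitive).
bool is_target_process(const std::string& path, const std::string& target) {
    assert(!target.empty());
    const std::size_t slash = path.find_last_of("\\/");
    const std::string name =
        (slash == std::string::npos) ? path : path.substr(slash + 1);
    if (name.size() != target.size()) return false;
    return std::equal(name.begin(), name.end(), target.begin(),
                      [](unsigned char a, unsigned char b) {
                          return std::tolower(a) == std::tolower(b);
                      });
}

#ifdef _WIN32
BOOL WINAPI DllMain(HINSTANCE instance, DWORD reason, LPVOID) {
    if (reason == DLL_PROCESS_ATTACH) {
        char path[MAX_PATH] = {};
        const DWORD len = GetModuleFileNameA(nullptr, path, MAX_PATH);
        // "service.exe" is a placeholder for the real target name.
        if (len == 0 || !is_target_process(path, "service.exe"))
            return FALSE;  // attach fails, so the DLL is unloaded again
        DisableThreadLibraryCalls(instance);
    }
    return TRUE;
}
#endif
```

Returning FALSE on `DLL_PROCESS_ATTACH` makes the load fail, so every process other than the target drops the DLL immediately.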

Next step was to patch the process. Since at the time the DLL is loaded, the exe image isn't even loaded yet, let alone unpacked, my solution was to start a thread as soon as the DLL is attached to the process and enter an infinite loop. Inside the loop I monitor a memory location (one of the patch locations actually) directly (not via ReadProcessMemory() since it's in the same address space) and when it has a certain value (the unpacked bytes), I apply the patches via WriteProcessMemory(). The infinite loop breaks and the thread exits after the patches are applied.
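A minimal sketch of the polling approach just described, to make the race easier to see. The `Patch` struct, its fields, and `patch_thread` are assumptions for illustration, not the poster's code, and the sketch assumes the watched address is already readable when the thread starts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#ifdef _WIN32
#include <windows.h>
#endif

// One patch site; every field here is a hypothetical placeholder.
struct Patch {
    unsigned char* site;               // address being watched and patched
    const unsigned char* expected;     // bytes present once unpacking is done
    const unsigned char* replacement;  // bytes to write over them
    std::size_t size;                  // length of both byte sequences
};

// True once the unpacker has written the original code to the site.
bool is_unpacked(const Patch& p) {
    assert(p.size > 0);
    return std::memcmp(p.site, p.expected, p.size) == 0;
}

#ifdef _WIN32
// Thread entry point, started from DllMain via CreateThread.
DWORD WINAPI patch_thread(LPVOID param) {
    const Patch* p = static_cast<const Patch*>(param);
    while (!is_unpacked(*p))
        Sleep(10);  // the service's own thread keeps running meanwhile

    // Race window: between the successful check above and the write below,
    // the main thread may already have executed the unpatched code.
    SIZE_T written = 0;
    WriteProcessMemory(GetCurrentProcess(), p->site, p->replacement,
                       p->size, &written);
    FlushInstructionCache(GetCurrentProcess(), p->site, p->size);
    return 0;
}
#endif
```

Nothing in this loop synchronizes with the thread that runs the unpacked code, so whichever thread gets there first wins.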

The problem: this only works about half the time, even though the patch always gets applied correctly.

My best guess is that due to the fact that I'm patching some code that gets executed early on when the service is starting (although it's pretty far from the OEP in terms of executed code), sometimes that particular code gets executed before it gets patched.

I've tried setting the thread priority to THREAD_PRIORITY_ABOVE_NORMAL. I've tried replacing Sleep(10) with Sleep(1) and even completely removing it, but I get the same result: sometimes it works, sometimes it doesn't, even when using the high thread priority and no Sleep() call (it just loads slower in this particular case).

Interesting stuff:

- when running it with a debugger that breaks at least once, it always works properly.

- since there's a CreateFileA() right before the code I'm patching, I also tried adding a CreateFileA() call of my own after patching to attempt some kind of monitoring by finding out which one gets called first. To this end, I ran Sysinternals Process Monitor and, just like with the debugger, it always works while Process Monitor is watching, whether I'm creating a file or not.

Has anyone encountered a similar problem? Any recommendations? At this point I've totally run out of ideas, so any suggestions that I could try would be welcome.

I'm also open to different patching solutions, but keep in mind that this is a service. Of course, I could dump & rebuild it, but 1. the patch I'm applying actually adds extra functionality that's inside the DLL and 2. I am eventually aiming for a generic solution that would work on future versions as well.

Link to comment

- since there's a CreateFileA() right before the code I'm patching, I also tried adding a CreateFileA() call of my own after patching to attempt some kind of monitoring by finding out which one gets called first.

simple solution then... hook CreateFileA

Link to comment

Since you are injected into the process, why are you using WriteProcessMemory to write your patches? You have direct access to the memory, so you can just write the changes directly.

Overall it just sounds like the timing is off and the thread is missing the window for patching. Since the patcher and the target code run in two different threads, you will always end up with some sort of race condition that can affect whether the patch is applied in time.

If you can guarantee where the return from CreateFileA will be, hooking it and checking the return address will probably give you a better outcome. If the return address is inside the address range of what needs to be patched, apply the patch, since you can now assume it's allocated and loaded, then just let CreateFileA return normally and unhook it.
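The return-address check suggested here could be sketched as follows. The names `in_range` and `caller_is_main_image` are hypothetical, the sketch assumes the PE headers of the main image are still intact in memory (a packer may alter them), and `_ReturnAddress()` is an MSVC intrinsic, so the Windows part is guarded accordingly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when `addr` lies inside [base, base + size).
bool in_range(const void* addr, const void* base, std::size_t size) {
    assert(base != nullptr);
    const auto a = reinterpret_cast<std::uintptr_t>(addr);
    const auto b = reinterpret_cast<std::uintptr_t>(base);
    return a >= b && a - b < size;
}

#if defined(_WIN32) && defined(_MSC_VER)
#include <windows.h>
#include <intrin.h>

// Hypothetical helper: does this return address belong to the EXE image?
// Reads SizeOfImage from the in-memory PE headers of the main module.
bool caller_is_main_image(const void* return_address) {
    const auto* base =
        reinterpret_cast<const unsigned char*>(GetModuleHandleA(nullptr));
    const auto* dos = reinterpret_cast<const IMAGE_DOS_HEADER*>(base);
    const auto* nt =
        reinterpret_cast<const IMAGE_NT_HEADERS*>(base + dos->e_lfanew);
    return in_range(return_address, base, nt->OptionalHeader.SizeOfImage);
}

// Inside the hooked CreateFileA one would then test:
//     if (caller_is_main_image(_ReturnAddress())) { /* apply patches */ }
#endif
```

Because the check runs on the same thread that is about to execute the patched code, the patch is in place before CreateFileA returns to it.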

Link to comment

Well, yes, I was thinking along the same lines: hook CreateFileA, patch when it's trying to open that particular file and my problem is solved. However, I planned to walk the IAT, replace GetProcAddress() with a proxy function of my own and then wait for it to get called with the kernel32.dll handle and the "CreateFileA" string and return my own CreateFileA function. The problem would have been that at the time the DLL is loaded, the EXE image is not loaded yet, so I have no IAT to do this on. The solution would have been to start a thread and wait for the image to get loaded, but I would have just created another race condition if I did that. Luckily I had a sudden detours flashback, implemented it and got everything working nicely. Thanks a bunch for spelling it out for me nonetheless :)
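The poster did not share the final code, so the following is only a sketch of what a Detours-based CreateFileA hook along these lines might look like. The trigger file name `trigger.dat`, the helpers `ends_with` and `claim_patch_once`, and the empty `apply_patches` stub are all hypothetical; the Detours calls themselves (`DetourTransactionBegin`, `DetourUpdateThread`, `DetourAttach`, `DetourTransactionCommit`) are the library's standard attach sequence.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>
#ifdef _WIN32
#include <windows.h>
#include <detours.h>
#endif

// True when `path` ends with `suffix` (case-sensitive, for brevity).
bool ends_with(const char* path, const char* suffix) {
    assert(suffix != nullptr);
    if (path == nullptr) return false;
    const std::size_t n = std::strlen(path), m = std::strlen(suffix);
    return m <= n && std::strcmp(path + (n - m), suffix) == 0;
}

// Returns true exactly once, so the patch is applied a single time even
// if several threads open the trigger file.
std::atomic<bool> g_patched{false};
bool claim_patch_once() { return !g_patched.exchange(true); }

#ifdef _WIN32
void apply_patches() { /* hypothetical: write the patch bytes here */ }

static HANDLE(WINAPI* real_CreateFileA)(LPCSTR, DWORD, DWORD,
                                        LPSECURITY_ATTRIBUTES, DWORD, DWORD,
                                        HANDLE) = CreateFileA;

HANDLE WINAPI hooked_CreateFileA(LPCSTR name, DWORD access, DWORD share,
                                 LPSECURITY_ATTRIBUTES sa, DWORD disposition,
                                 DWORD flags, HANDLE templ) {
    // Runs on the caller's thread, before the code that follows the
    // original CreateFileA call, so there is no second thread to race.
    if (ends_with(name, "trigger.dat") && claim_patch_once())
        apply_patches();
    return real_CreateFileA(name, access, share, sa, disposition, flags, templ);
}

// Call once, e.g. from DllMain on DLL_PROCESS_ATTACH.
bool install_hook() {
    DetourTransactionBegin();
    DetourUpdateThread(GetCurrentThread());
    DetourAttach(&reinterpret_cast<PVOID&>(real_CreateFileA),
                 reinterpret_cast<PVOID>(hooked_CreateFileA));
    return DetourTransactionCommit() == NO_ERROR;
}
#endif
```

Since Detours rewrites the start of CreateFileA inside kernel32 itself, no IAT is needed. The sketch leaves the hook installed and uses a one-shot flag rather than unhooking from inside the hook, which keeps it simpler.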

As for WriteProcessMemory(), I'm using that instead of patching directly because if I wrote a packer/protector, I'd VirtualProtect() the memory to PAGE_EXECUTE_READ after decompressing and I don't want to have to handle that in my code. I have no idea if this particular packer does it, but I don't even care as long as WriteProcessMemory() works fine. On top of that, to tell the truth, I copy/pasted some code from an older loader of mine and I didn't bother changing the writes, only the reads.

Link to comment
Luckily I had a sudden detours flashback, implemented it and got everything working nicely.

You can hook the API directly without touching the IAT...

MS Detours is a great free hook library: http://research.microsoft.com/en-us/projects/detours/

Thanks a lot, that's what I ended up using, I even posted that :)

I haven't had to hook an API since the win9x days so I've been out of touch with the tools. I ran into detours some years ago while helping a friend and now when I did a forum search it popped up and jogged my memory really good.

Link to comment
