Directly calling ZwReadVirtualMemory for stealth and performance


I am writing one last user-mode handle-hijacking bypass, and I am going way over the top to avoid detection and gain performance with all sorts of tricks (e.g. using spinlocks for synchronisation instead of semaphores or other synchronisation objects, to get a stealthy, OS-independent inter-process synchronisation system that leaves no tracks and no handle of any sort).
I will share that soon.
I know that since anti-cheats run in the kernel they have the advantage over me in user-mode, but I want to push user-mode stealth to its limits before going kernel myself, simply because I enjoy coding/hacking.
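
To illustrate the spinlock idea mentioned above, here is a minimal sketch and not the implementation I will share later. It assumes the lock word is a lock-free 32-bit atomic; for the inter-process case it would have to live in memory both processes can see, which this sketch does not set up. The names SpinLock, try_lock, lock and unlock are made up for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical lock word: 0 = free, 1 = held.
// No kernel object is created, so no handle exists for the lock itself.
struct SpinLock {
    std::atomic<std::uint32_t> state{0};

    // Single attempt: succeeds only if the word flips from 0 to 1.
    bool try_lock() {
        std::uint32_t expected = 0;
        return state.compare_exchange_strong(expected, 1,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed);
    }

    // Busy-wait until the attempt succeeds.
    void lock() {
        while (!try_lock()) {
            // spin
        }
    }

    // Release the lock; must only be called by the current holder.
    void unlock() {
        assert(state.load(std::memory_order_relaxed) == 1);
        state.store(0, std::memory_order_release);
    }
};
```

The trade-off is the usual one for spinlocks: a waiter burns CPU instead of sleeping, so this only makes sense when the critical section is very short.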

One thing bugged me:
I was calling NtReadVirtualMemory and NtWriteVirtualMemory (NtRVM and NtWVM), or the classic RPM/WPM, from the process whose handle I abuse.
Small background in case you don’t know:
When you call RPM/WPM, some wrapping is done (returning a BOOL instead of an NTSTATUS, among other things), then RPM/WPM calls NtRVM/NtWVM in ntdll, which is only a small stub that issues the syscall into the kernel routine. I will call the kernel side ZwReadVirtualMemory or ZwWriteVirtualMemory (ZwRVM and ZwWVM) here; strictly speaking, ntdll exports the Nt and Zw names for the same stub.
I ran a small experiment and hooked these two functions to detect myself: I could see when they were called and retrieve their parameters, which an anti-cheat could do just as well.
I do not know whether anti-cheats use this user-mode hook-based detection technique and I do not care; as I said, I just enjoy coding/hacking for its own sake.
However, certain system processes that hold interesting process handles have Control Flow Guard (CF Guard) enabled, which apparently prevents placing simple hooks (it did stop me from setting mine, but there are certainly ways around that, there always are).

So how can I read another process’ memory without calling these two functions?
My solution: issue the syscall for ZwRVM or ZwWVM directly.
To do this I have to replicate whatever NtRVM does before the syscall instruction that takes execution from user-land to kernel-land.
I expected many operations before the syscall, checking things and preparing the trip to kernel-land, but then I opened ntdll in IDA and saw this:

[Image: NtReadVirtualMemory disassembled in IDA]

Well, that’s pretty simple …
The only difference in NtWVM is the number moved into eax.

So I fired up Visual Studio, created a new project, added a new source file “ZwRWVM.asm”, and wrote this code:

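    .code

    ; Note: syscall numbers are specific to the Windows build.
    ; 3Fh and 3Ah are the values from the ntdll I disassembled.

    ZwRVM proc
        mov r10, rcx    ; syscall overwrites rcx, so the first argument goes in r10
        mov eax, 3Fh    ; syscall number seen in NtReadVirtualMemory
        syscall
        ret
    ZwRVM endp

    ZwWVM proc
        mov r10, rcx    ; same stub, only the number differs
        mov eax, 3Ah    ; syscall number seen in NtWriteVirtualMemory
        syscall
        ret
    ZwWVM endp

    end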

Then I wrote a quick program to test it:
(Click here for the GitHub repo)

    #include <Windows.h>
    #include <iostream>

    using namespace std;

    // Implemented in ZwRWVM.asm; same parameter order as NtReadVirtualMemory / NtWriteVirtualMemory.
    extern "C" NTSTATUS ZwRVM(HANDLE hProcess, void* lpBaseAddress, void* lpBuffer, SIZE_T nSize, SIZE_T* lpNumberOfBytesRead = NULL);
    extern "C" NTSTATUS ZwWVM(HANDLE hProcess, void* lpBaseAddress, void* lpBuffer, SIZE_T nSize, SIZE_T* lpNumberOfBytesWritten = NULL);

    int main() {
        DWORD pid;
        cout << "PID: ";
        cin >> dec >> pid;
        HANDLE hProcess = OpenProcess(PROCESS_ALL_ACCESS, FALSE, pid);
        if (!hProcess)
            return EXIT_FAILURE;

        int intBuffer = 555555; // sentinel, overwritten if the read succeeds
        void* lpBaseAddress = nullptr;
        SIZE_T lpNumberOfBytesRead = 0;
        cout << "Address to read: 0x";
        cin >> hex >> lpBaseAddress;

        NTSTATUS status = ZwRVM(hProcess, lpBaseAddress, &intBuffer, sizeof(int), &lpNumberOfBytesRead);

        // NTSTATUS values are conventionally read in hex; 0 is STATUS_SUCCESS.
        cout << "ZwRVM returned 0x" << hex << status << endl;
        cout << "intBuffer = " << dec << intBuffer << endl;
        cout << "lpNumberOfBytesRead = " << lpNumberOfBytesRead << endl;

        CloseHandle(hProcess);
        system("pause");
        return EXIT_SUCCESS;
    }

(When using a .asm assembly file, don’t forget to right-click your project, go to Build Dependencies / Build Customizations, then tick “masm(.targets, .props)”.)
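
Since the direct call hands back a raw NTSTATUS rather than the BOOL that RPM/WPM give you, a small helper for reading it can be handy. The sketch below is plain portable C++ and uses a stand-in type so it compiles anywhere; the names Status, status_ok, status_severity and status_hex are made up for the example. The success test mirrors the NT_SUCCESS macro (non-negative means success or informational).

```cpp
#include <cassert>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

using Status = std::int32_t;  // stand-in for NTSTATUS (a 32-bit signed value)

// Same check as NT_SUCCESS: non-negative values are success or informational.
inline bool status_ok(Status s) { return s >= 0; }

// Top two bits encode severity: 0 success, 1 informational, 2 warning, 3 error.
inline unsigned status_severity(Status s) {
    return static_cast<std::uint32_t>(s) >> 30;
}

// Format as 0xXXXXXXXX, the form in which NTSTATUS codes are usually documented.
inline std::string status_hex(Status s) {
    std::ostringstream out;
    out << "0x" << std::uppercase << std::hex << std::setw(8)
        << std::setfill('0') << static_cast<std::uint32_t>(s);
    return out.str();
}
```

For example, a read from an invalid address would typically come back as STATUS_ACCESS_VIOLATION, which status_hex renders as 0xC0000005 and status_ok reports as a failure.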

I fired up my favourite guinea-pig program:

And successfully read its first integer:

Well, that was much simpler than I expected lol

With this solution I could obviously no longer detect myself by hooking RPM, WPM, NtRVM, and NtWVM in the process whose handle I abuse, because none of those functions is ever executed.
In addition, this gives us the small speed boost that calling Nt functions directly has over using RPM/WPM, which I measured in a previous experiment (about 5%, I will post this experiment on the blog soon).
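
For anyone who wants to check a figure like that themselves, here is a rough sketch of a timing harness. It is not the experiment I ran, just one simple way to compare two call paths; average_ns and percent_faster are names made up for the example, and the result will depend heavily on the machine and on how many iterations you run.

```cpp
#include <cassert>
#include <chrono>

// Runs fn() 'iterations' times and returns the average cost per call in nanoseconds.
template <typename Fn>
double average_ns(Fn&& fn, int iterations) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / iterations;
}

// Relative saving of the fast path over the slow path, in percent.
inline double percent_faster(double slow_ns, double fast_ns) {
    return (slow_ns - fast_ns) / slow_ns * 100.0;
}
```

You would pass one lambda wrapping the RPM call and one wrapping the direct call, with the same handle, address and buffer, and feed both averages to percent_faster.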

Feel free to use this for extra protection.

Bonus meme!

User-mode API functions - Aint nobody got time for that
