Tuesday, October 2, 2007

Low-overhead access to the memory space of a traced process, Part I

The contents of a traced process's memory can be read using ptrace, but ptrace reads only one word (a long int, which is 4 bytes on a 32-bit system) at a time. So, if you want to read a big chunk of memory, you have to call ptrace many times. However, ptrace is a system call that executes in kernel space, which means that every call switches into the kernel and back. These switches are expensive operations that can take thousands of CPU cycles to complete. Therefore, in order to lower the overhead of our technique, it is necessary to find a better way of reading from the memory of a traced process.
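To make the cost concrete, here is a minimal sketch of the word-at-a-time approach. The names peek_bytes and demo_peek are hypothetical helpers made up for this illustration, and the demo assumes a Linux system where a parent is allowed to trace its own forked child.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: copy len bytes from address addr in the stopped
 * tracee pid into buf, one word per ptrace call.
 * The final call may fetch a few bytes past addr + len from the tracee.
 * Returns 0 on success, -1 on error. */
int peek_bytes(pid_t pid, unsigned long addr, void *buf, size_t len)
{
    unsigned char *out = buf;
    size_t done = 0;

    while (done < len) {
        /* -1 is also a valid word, so errno is the only error signal. */
        errno = 0;
        long word = ptrace(PTRACE_PEEKDATA, pid, (void *)(addr + done), NULL);
        if (word == -1 && errno != 0)
            return -1;

        size_t n = len - done;
        if (n > sizeof word)
            n = sizeof word;
        memcpy(out + done, &word, n);
        done += n;
    }
    return 0;
}

/* Hypothetical demo: trace a forked child and read a string out of it.
 * Returns 0 if the bytes read match, -1 otherwise. */
int demo_peek(void)
{
    static const char secret[] = "traced-process-memory";
    char got[sizeof secret];
    int status;
    int rc = -1;

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        raise(SIGSTOP);           /* stop so the parent can inspect us */
        _exit(0);
    }

    /* The child is a copy of this process, so secret has the same address. */
    if (waitpid(child, &status, WUNTRACED) == child && WIFSTOPPED(status) &&
        peek_bytes(child, (unsigned long)secret, got, sizeof got) == 0 &&
        memcmp(got, secret, sizeof secret) == 0)
        rc = 0;

    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return rc;
}
```

Reading N bytes this way costs roughly N / sizeof(long) system calls, which is exactly the overhead I want to avoid.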

Recent Linux kernels allow reading a traced process's memory through the /proc/PID/mem file, where PID is the pid of the traced process. This is only allowed while the traced process is suspended. Also, you should open this pseudo-file with the open syscall instead of fopen or other library functions.
A problem that I found is that the read fails if the address falls into the memory range of the stack. I don't know whether it is intentional or just a bug, but it doesn't work anyway.
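A rough sketch of this approach follows. The function name read_proc_mem is hypothetical, and using pread with a 64-bit file offset is my own choice for the example rather than something the /proc interface requires. The 64-bit offset matters on 32-bit systems, where high addresses do not fit in a signed 32-bit off_t.

```c
#define _FILE_OFFSET_BITS 64   /* ask for a 64-bit off_t on 32-bit systems */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read len bytes at address addr of process pid
 * through /proc/PID/mem with a single read.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_proc_mem(pid_t pid, uintptr_t addr, void *buf, size_t len)
{
    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/mem", (long)pid);

    /* Plain open(2), not fopen: no stdio buffering in between. */
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* The file offset is the virtual address in the target process. */
    ssize_t n = pread(fd, buf, len, (off_t)addr);
    close(fd);
    return n;
}
```

For a traced process, pid would be the stopped tracee. A single call such as read_proc_mem(pid, addr, buf, 4096) then replaces about a thousand PTRACE_PEEKDATA calls on a 32-bit machine.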

Now, I am trying to find another low-overhead method to read from the memory of a traced process.

3 comments:

Anonymous said...

Hi, I need to access the memory space of a traced process, but I get the same error you reported: the read fails if the address is located in the memory range of the stack. Were you able to solve this problem? Thanks in advance.

Babak said...

If performance does not matter, you can use ptrace with PTRACE_PEEKDATA and pass the PID of the traced process and the desired memory address.

If performance is important, you should read my other posts about low-overhead access to the memory space of a traced process.
Take a look at these posts:

http://blog.babaks.com/2008/05/low-overhead-access-to-memory-space-of.html

http://blog.babaks.com/2009/02/low-overhead-access-to-memory-space-of.html

Unknown said...

Solved! You need to use lseek64() instead of lseek() because lseek() takes an off_t, which by default is a *signed* 32-bit integer on 32-bit systems.