Babak Salamat's Research Blog: October 2007

One of the difficulties with multi-variant execution is handling file operations.
When the variants try to open a file, we can open the file in the platform and send the results to the variants. This solution fails if the variants try to mmap the file: they use a file descriptor that was sent to them by the platform and is not really open in their own contexts, so mmap returns an error. This situation happens frequently when variants map shared libraries.
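The failure is easy to reproduce inside a single process. The following is only an illustrative sketch, not the platform's code: the helper name mmap_errno_for_closed_fd is made up, and /dev/zero merely stands in for a file whose descriptor number is known to the caller but is not open in its descriptor table.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: obtain a descriptor number, close it, and then try to
   mmap through it. This mimics a variant that was handed a descriptor number
   by the platform although the file is not open in the variant itself.
   Returns the errno set by mmap, 0 if mmap unexpectedly succeeded, or -1 if
   the setup failed. */
int mmap_errno_for_closed_fd(void)
{
    int fd = open("/dev/zero", O_RDONLY);
    if (fd < 0)
        return -1;
    close(fd);

    /* Sanity check: the number in fd no longer names an open file. */
    assert(fcntl(fd, F_GETFD) == -1);

    void *p = mmap(NULL, 4096, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED)
        return errno;            /* EBADF: not a valid file descriptor */

    munmap(p, 4096);
    return 0;
}
```

The kernel rejects the mapping with EBADF because the descriptor is looked up in the descriptor table of the calling process, which is exactly what happens to a variant that only received the number from the platform.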
Another solution is to open all files in all variants but prevent the variants from writing to them, allowing only one of the variants (or the platform) to write to a file. This solves the problem of the previous method and keeps the file descriptors valid in the variants, but the locking mechanisms of the kernel don't allow us to open one file for writing in several processes simultaneously.
An alternative solution is to open files that are requested as read-only in all variants and to open all other files only in the platform. Obviously, this solution still has the problem of the first one: files that are not opened in all variants cannot be mmapped. It is not as severe as before, though, because in most cases the files that are mmapped are opened as read-only and are therefore opened by all variants.
The only complication is that files opened by the platform and files opened by the variants can end up with the same file descriptor. For example, suppose the platform receives a request to open "a.txt" for reading and writing. The platform doesn't allow the variants to open the file and opens it itself. Let's assume it is assigned file descriptor 5. This file descriptor is sent to the variants, and they use it when they request further operations on the file.
Now the variants want to open "b.txt" as read-only. The platform lets them run the syscall and open the file. This time they get the file descriptor directly from the kernel, and it can be 5 again! If they now invoke "read" on file descriptor 5, it is not clear whether they want to read "a.txt" or "b.txt". To solve this problem, the platform adds a big number to every file descriptor that it opens itself before sending it to the variants. Suppose that, under Linux 2.6, the maximum number of files a process can have open is 256 (the actual value is the RLIMIT_NOFILE resource limit, shown by "ulimit -n"); then the largest file descriptor is 255. If we add 256 to all file descriptors opened by the platform before sending them to the variants, we can distinguish between the files opened by the variants and those opened by the platform. Any file operation requested by the variants with a file descriptor of 256 or larger is performed on a file opened by the platform, and any request with a smaller file descriptor is performed on a file opened directly by the variants.
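The routing rule and the descriptor translation can be sketched in a few lines of C. This is only a sketch under the assumptions stated above: the offset of 256 comes from the text, while the names FD_OFFSET, opened_by_variants, to_variant_fd, is_platform_fd and to_platform_fd are made up for illustration and are not the platform's real interface.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* Assumed per-process limit on open files, taken from the text above. */
#define FD_OFFSET 256

/* Read-only opens are executed by every variant; all other opens are
   performed once, by the platform. */
bool opened_by_variants(int open_flags)
{
    return (open_flags & O_ACCMODE) == O_RDONLY;
}

/* Descriptor value reported to the variants for a platform-opened file. */
int to_variant_fd(int platform_fd)
{
    assert(platform_fd >= 0 && platform_fd < FD_OFFSET);
    return platform_fd + FD_OFFSET;
}

/* True if a descriptor used by a variant names a platform-opened file. */
bool is_platform_fd(int variant_fd)
{
    return variant_fd >= FD_OFFSET;
}

/* Recover the platform's real descriptor from the value the variants use. */
int to_platform_fd(int variant_fd)
{
    assert(is_platform_fd(variant_fd));
    return variant_fd - FD_OFFSET;
}
```

With this mapping, "a.txt" opened by the platform as descriptor 5 is reported to the variants as 261, while "b.txt" opened by the variants themselves keeps descriptor 5, so a later "read" is unambiguous.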

The contents of a traced process's memory can be read using ptrace, but ptrace reads only 4 bytes (more precisely, one long int) at a time. So if you want to read a big chunk of memory, you have to call ptrace many times. However, ptrace is a system call that executes in kernel space, which means that every call incurs a switch into the kernel and back. Such switches are expensive operations that can take thousands of CPU cycles to complete. Therefore, in order to lower the overhead of our technique, it is necessary to find a better way of reading from the memory of a traced process.
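To make the cost concrete, here is a minimal sketch of the word-by-word approach. The function names peek_bytes and demo_peek_child are made up for illustration; the demo assumes the process is allowed to fork a child and trace it.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Copy len bytes starting at addr in the stopped tracee pid into buf,
   one word per ptrace call. Returns 0 on success, -1 on error.
   Note: the final call always fetches a whole word, so it may touch up to
   sizeof(long) - 1 bytes past the requested range. */
int peek_bytes(pid_t pid, unsigned long addr, void *buf, size_t len)
{
    assert(buf != NULL);
    unsigned char *out = buf;
    size_t done = 0;

    while (done < len) {
        /* PEEKDATA returns the word itself, so -1 is a legal value and
           errno is the only reliable error indicator. */
        errno = 0;
        long word = ptrace(PTRACE_PEEKDATA, pid, (void *)(addr + done), NULL);
        if (word == -1 && errno != 0)
            return -1;

        size_t left = len - done;
        size_t n = left < sizeof word ? left : sizeof word;
        memcpy(out + done, &word, n);
        done += n;
    }
    return 0;
}

/* Demo: fork a child that asks to be traced and stops itself, then read a
   string out of its memory. Returns 0 if the copied bytes match. */
int demo_peek_child(void)
{
    static const char msg[] = "multi-variant execution";
    char copy[sizeof msg];
    int status;

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        raise(SIGSTOP);          /* stay stopped while the parent reads */
        _exit(0);
    }

    int rc = -1;
    if (waitpid(child, &status, 0) == child && WIFSTOPPED(status))
        rc = peek_bytes(child, (unsigned long)msg, copy, sizeof copy);

    kill(child, SIGKILL);        /* the demo child is no longer needed */
    waitpid(child, &status, 0);
    return (rc == 0 && memcmp(copy, msg, sizeof msg) == 0) ? 0 : -1;
}
```

Reading len bytes this way takes roughly len / sizeof(long) system calls (rounded up), which is where the overhead comes from.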

Recent Linux kernels allow reading a traced process's memory by opening the /proc/PID/mem file, where PID is the pid of the traced process. This is only allowed if the traced process is suspended. Besides, you should use the open syscall to open this pseudo-file instead of fopen or other library functions.
One problem I found is that the read fails if the address falls within the memory range of the stack. I don't know whether this is intentional or just a bug, but either way it doesn't work.
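A minimal sketch of this approach follows. The function name read_proc_mem is made up for illustration, and the file offset is assumed to be the virtual address in the target process. Error handling is reduced to returning -1.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Read len bytes at address addr of process pid through /proc/PID/mem.
   Returns the number of bytes read, or -1 on error. The caller is
   responsible for making sure it is allowed to read the target (for the
   kernels discussed here: the target is traced and stopped). */
ssize_t read_proc_mem(pid_t pid, unsigned long addr, void *buf, size_t len)
{
    assert(buf != NULL);

    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/mem", (long)pid);

    int fd = open(path, O_RDONLY);   /* open(2), not fopen(3) */
    if (fd < 0)
        return -1;

    /* Seek to the target address and read the whole range in one call. */
    ssize_t n = pread(fd, buf, len, (off_t)addr);
    close(fd);
    return n;
}
```

Compared with the ptrace loop, a large range costs one open, one pread and one close instead of one system call per word. If many reads are made from the same tracee, the descriptor could be kept open between reads.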

Now, I am trying to find another low-overhead method to read from the memory of a traced process.

Friday, October 19, 2007

Files

Tuesday, October 2, 2007

Low-overhead access to the memory space of a traced process, Part I