Well, I had hoped to learn more about Address Space Layout Randomization (ASLR) and blog about it after reading “On the Effectiveness of Address-Space Randomization.” I got about a quarter of the way through the paper and realized that I need to better understand address space layout in general. Where is libc in memory, and how do programs get access to it?
I’ve been Googling and reading for the last 3.5 hours and am somewhat more informed, but I’m still trying to figure it all out.
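Every dynamically linked ELF executable on Linux names a dynamic linker (ld-linux.so) in its PT_INTERP program header. The kernel loads that linker along with the executable, and the linker is the bootstrap for finding and loading all the other shared libraries (files with .so extensions). Statically linked executables skip this step. I think that the linker uses mmap() to memory map the shared libraries from disk into the process’s virtual address space, but I’m still somewhat confused on this part.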
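To see which dynamic linker a given executable asks for, here is a minimal sketch that reads the PT_INTERP entry straight out of an ELF file. The function name elf_interpreter is my own invention, and the parsing only covers the header fields needed for this one lookup, so treat it as an illustration rather than a full ELF reader.

```python
import struct
import sys

PT_INTERP = 3  # program header type whose contents name the dynamic linker


def elf_interpreter(path):
    """Return the PT_INTERP string of an ELF file, or None if there is none."""
    try:
        with open(path, "rb") as f:
            ident = f.read(16)
            if len(ident) < 16 or ident[:4] != b"\x7fELF":
                return None  # not an ELF file
            is_64 = ident[4] == 2  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
            endian = "<" if ident[5] == 1 else ">"  # EI_DATA: 1 = little-endian
            # The rest of the ELF header, after the 16-byte e_ident.
            fmt = endian + ("HHIQQQIHHHHHH" if is_64 else "HHIIIIIHHHHHH")
            fields = struct.unpack(fmt, f.read(struct.calcsize(fmt)))
            e_phoff, e_phentsize, e_phnum = fields[4], fields[8], fields[9]
            for i in range(e_phnum):
                f.seek(e_phoff + i * e_phentsize)
                if is_64:
                    # p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz
                    p_type, _, p_offset, _, _, p_filesz = struct.unpack(
                        endian + "IIQQQQ", f.read(40))
                else:
                    # p_type, p_offset, p_vaddr, p_paddr, p_filesz
                    p_type, p_offset, _, _, p_filesz = struct.unpack(
                        endian + "IIIII", f.read(20))
                if p_type == PT_INTERP:
                    f.seek(p_offset)
                    return f.read(p_filesz).rstrip(b"\0").decode()
    except struct.error:
        return None  # truncated or malformed file
    return None  # no PT_INTERP entry, e.g. a statically linked executable


if __name__ == "__main__":
    # On Linux this inspects the Python interpreter binary itself.
    print(elf_interpreter(sys.executable))
```

On a typical dynamically linked Linux binary I would expect this to print a path to ld-linux, and None for a statically linked one, but the exact path depends on the distribution and architecture.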
mmap() was another thing I wasn’t familiar with. A memory-mapped file essentially lets a program refer to a file as if it were in memory; the kernel pages it in, one page-sized chunk at a time, as the program touches it. For writing files, it has the advantage that many small writes go to memory, and the virtual memory manager later writes each modified page back to disk all at once.
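To convince myself of the "file as memory" idea, here is a small sketch using Python's standard mmap module. The helper names patch_file_through_mmap and demo are made up for this post. The point is that the write is an ordinary slice assignment on the mapping, and the change still shows up when the file is read back normally.

```python
import mmap
import os
import tempfile


def patch_file_through_mmap(path, offset, data):
    """Overwrite bytes in a file by writing into a memory mapping of it."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the default mapping is shared and writable.
        with mmap.mmap(f.fileno(), 0) as m:
            m[offset:offset + len(data)] = data  # a plain memory write
            m.flush()  # ask the OS to write the modified pages back to the file


def demo():
    """Create a scratch file, patch it through a mapping, and read it back."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"hello world\n")
        patch_file_through_mmap(path, 0, b"HELLO")
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Note that a slice assignment on a mapping cannot change the file's length, so the replacement bytes must fit inside the existing file.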
I’m still trying to figure out what the advantage of mmap’ing shared libraries is. For shared libraries, mmap is used to map the library into the process’s address space as if the process contained the library in its own memory. According to what I read, mmap can be used to map “shared memory objects” as well as files, etc. Basically, most things that have a file descriptor can be memory mapped, as long as the underlying object supports it (pipes and sockets, for example, do not).
I’m wondering whether the operating system loads very common and essential libraries like glibc into physical memory and then mmaps them into each process’s virtual address space, or whether it memory maps the actual file on disk like I mentioned above (e.g. /usr/lib/libc.so).
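One way to poke at this question on Linux is to look at /proc/self/maps, which lists every mapping in the current process. Mappings that come from a file carry a device, an inode and a path, which suggests the library is mapped from the file itself. The parser below is only a sketch: the function name file_backed_mappings and the dictionary keys are my own choices, and it assumes the usual "address perms offset dev inode pathname" line layout.

```python
import os


def file_backed_mappings(maps_text):
    """Parse /proc/<pid>/maps text and return the mappings backed by a file."""
    result = []
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        parts = line.split(None, 5)
        if len(parts) < 6 or not parts[5].startswith("/"):
            continue  # anonymous memory, or pseudo-entries like [stack] and [heap]
        start, end = (int(x, 16) for x in parts[0].split("-"))
        result.append({
            "start": start,
            "end": end,
            "perms": parts[1],  # e.g. "r-xp"; a trailing "p" means a private mapping
            "offset": int(parts[2], 16),
            "inode": int(parts[4]),
            "path": parts[5],
        })
    return result


if __name__ == "__main__":
    # /proc/self/maps only exists on Linux, so skip quietly elsewhere.
    if os.path.exists("/proc/self/maps"):
        with open("/proc/self/maps") as f:
            for m in file_backed_mappings(f.read()):
                if ".so" in m["path"]:
                    print(m["perms"], hex(m["start"]), m["path"])
```

As far as I can tell, the read-only pages of a library mapped this way come from the kernel's page cache, so several processes mapping the same file can end up sharing the same physical pages.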
So far the best material I’ve found related to these subjects was
and one that looks good but I haven’t had a chance to read yet. This one is especially good: