Taking a Dive into Linux’s /proc/kcore

Feb 3, 2025    #linux   #programming   #c  

The Proc filesystem (procfs) on Linux is quite an interesting thing to take a deep dive into. This virtual filesystem is generated on-the-fly by the Linux Kernel and contains all kinds of juicy information about the system. Two very interesting files that I recently found myself exploring are /proc/kcore and /proc/iomem.

Whilst poking around on a Linux system recently, I came across the kcore file for the first time. Up until this point, I remained blissfully unaware of what it was, but on this day I had reason to dig into it. Through my research I discovered that it’s possible to access the entire memory of a Linux system via this file. Since this is an interesting proposition, I went ahead and developed a proof-of-concept tool to dump the memory to a file on disk.

But first, let’s take a little closer look at these files.

A look at /proc/kcore

As I’ve already established, kcore is one of the files in the virtual /proc filesystem. It is created on-the-fly by fs/proc/kcore.c in the Kernel source code and it allows us read-only access (with root privileges) to the kernel’s memory space (it does not allow writing of any kind).

Taking an initial look at /proc/kcore you will notice that it reports a ridiculously large size, often more than all of the memory devices installed in the system combined. For example, on my laptop, it reports being 128TB:

adam@Archie:/proc$ ls -lah | grep kcore
-r--------.   1 root            root            128T Feb  3 19:12 kcore

Of course, being a file in proc, it isn’t actually taking up any disk space at all.

Format of /proc/kcore

Before we can even consider trying to develop a tool to dump the kcore file, we need to understand the internal format of the file. Well, internally, /proc/kcore uses the same format as a coredump from a crashed process. That is to say that it uses the ELF coredump format.

The nitty-gritty of the ELF format is a deep rabbit hole that one could go down but, luckily, for our purposes there are only a couple of aspects of it that we care about:

  1. The ELF header (Elf64_Ehdr). This header is present at the start of every ELF file. We care about two pieces of information from this header: the location of the program header table (e_phoff) and the number of entries in that table (e_phnum).
  2. The program segment headers (Elf64_Phdr). ELF files contain an array of these program header structures of various subtypes. For our purposes of decoding kcore, we only care about the PT_LOAD headers. In a normal ELF, these headers describe segments that can be loaded into memory. In the case of /proc/kcore, however, they describe where in the kcore file each portion of the system memory is located.
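
To make those two structures a bit more concrete, here is a minimal sketch of walking the program header table using the definitions from <elf.h>. It is an illustration rather than the code from the tools linked at the end of this post: the function names read_load_segments and print_load_segments are made up for the example, and it assumes a 64-bit ELF whose program header count fits in e_phnum.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read the ELF header from fd and copy up to max PT_LOAD program headers
 * into out. Returns the number of PT_LOAD headers stored, or -1 on error.
 */
static int read_load_segments(int fd, Elf64_Phdr *out, size_t max)
{
    Elf64_Ehdr ehdr;
    size_t found = 0;

    if (pread(fd, &ehdr, sizeof(ehdr), 0) != (ssize_t)sizeof(ehdr))
        return -1;

    /* Sanity checks: ELF magic, 64-bit class, expected entry size. */
    if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr.e_ident[EI_CLASS] != ELFCLASS64 ||
        ehdr.e_phentsize != sizeof(Elf64_Phdr))
        return -1;

    /* e_phoff locates the table, e_phnum says how many entries it has. */
    for (size_t i = 0; i < ehdr.e_phnum && found < max; i++) {
        Elf64_Phdr phdr;
        off_t pos = (off_t)(ehdr.e_phoff + i * sizeof(phdr));

        if (pread(fd, &phdr, sizeof(phdr), pos) != (ssize_t)sizeof(phdr))
            return -1;
        if (phdr.p_type == PT_LOAD)
            out[found++] = phdr;
    }
    return (int)found;
}

/* Print one line per segment: physical address, file offset, size. */
static void print_load_segments(const Elf64_Phdr *seg, int count)
{
    for (int i = 0; i < count; i++)
        printf("paddr=0x%llx offset=0x%llx size=0x%llx\n",
               (unsigned long long)seg[i].p_paddr,
               (unsigned long long)seg[i].p_offset,
               (unsigned long long)seg[i].p_filesz);
}
```

On a real system you would pass a descriptor obtained from open("/proc/kcore", O_RDONLY) while running as root.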

A look at /proc/iomem

While the main point of this post is /proc/kcore, remember what our goal is: we want to write a tool that can dump the contents of RAM to a file on disk. This goal is all well and good, but we have a problem.

You see, the physical RAM of any given system is not necessarily going to be located at the start of the physical address space. To make matters worse, it may very well not be in a contiguous block. This means that we need some way to differentiate between the portions of the address space that are backed by physical memory and those that are something else (e.g. memory-mapped I/O).

This is where /proc/iomem comes into play.

As you could probably guess by the fact that it’s located in /proc, iomem is a virtual file that is generated on-the-fly. In particular, this file is generated by kernel/resource.c in the Linux Kernel. It lists out all of the I/O memory regions that are mapped into the address space, including the system’s RAM. For those interested, there’s an LWN article about it.

As an example, here’s a portion of /proc/iomem on my laptop:

00000000-00000fff : Reserved
00001000-0009ffff : System RAM
000a0000-000fffff : Reserved
  000a0000-000fffff : PCI Bus 0000:00
    000c0000-000dffff : 0000:00:02.0
    000f0000-000fffff : System ROM
00100000-659b1fff : System RAM
659b2000-659c4fff : ACPI Tables
659c5000-659c6fff : Reserved
659c7000-659c7fff : System RAM
659c8000-659cafff : Reserved
659cb000-659ccfff : System RAM
659cd000-659cdfff : Reserved
659ce000-659cefff : System RAM
659cf000-7fffffff : Reserved
  659fb000-65a02fff : BOOT0000:00
80000000-dfffffff : PCI Bus 0000:00
<snip>

Identifying the regions that are associated with the system RAM is as simple as finding the sections labelled as System RAM. Note: I’ve found that this doesn’t necessarily hold true on some embedded devices, but that’s another story.
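
As a rough sketch of that filtering step, the function below parses a single line of /proc/iomem and reports whether it is a System RAM entry. The name parse_system_ram is hypothetical, and the sketch assumes the "start-end : name" layout shown in the listing above, with both addresses in hexadecimal and the end address inclusive.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one line of /proc/iomem. Returns 1 and fills start/end when the
 * line is labelled exactly "System RAM", 0 for any other line.
 */
static int parse_system_ram(const char *line, uint64_t *start, uint64_t *end)
{
    unsigned long long lo, hi;
    char name[64];

    /* Leading indentation is skipped by %llx; the name runs to end of line. */
    if (sscanf(line, "%llx-%llx : %63[^\n]", &lo, &hi, name) != 3)
        return 0;
    if (strcmp(name, "System RAM") != 0)
        return 0;

    *start = lo;
    *end = hi; /* inclusive, so the region is end - start + 1 bytes long */
    return 1;
}
```

Feeding it each line returned by fgets() on /proc/iomem yields the list of RAM ranges to look up in kcore.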

Correlating the Segments from /proc/iomem to /proc/kcore

At this point, we’ve determined the basics of how the /proc/kcore file is formatted and how we can determine the memory address ranges for the system’s physical RAM by parsing /proc/iomem. This is a major step in figuring out how we can dump the physical RAM to disk, but we still have a hurdle to get around: we need a way to correlate the address ranges from /proc/iomem to the sections of /proc/kcore.

Prior to Kernel 4.8, this was extremely simple, but things get a bit more complicated post 4.8 due to KASLR.

Pre-4.8 Kernels

For our purposes, we don’t really care too much about how things worked on the older kernel versions, but I figured I’d be remiss if I didn’t touch on how it used to work.

You see, prior to 4.8, the kernel’s virtual mapping of the physical address space would always start at the constant address 0xffff880000000000 on x86_64. This meant that we could translate a physical address into a virtual kernel address with a simple addition.

For example, suppose you wanted to translate the physical address 0x100000 into its corresponding virtual address. You would simply add 0xffff880000000000 + 0x100000, giving you a resulting address of 0xffff880000100000.
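
Expressed as code, the old translation is a one-liner. This sketch assumes the x86_64 direct-mapping base of 0xffff880000000000 that the kernel’s memory map documentation listed for those kernels; the macro and function names are made up for illustration.

```c
#include <stdint.h>

/* Assumed fixed base of the kernel's direct mapping on pre-4.8 x86_64. */
#define DIRECT_MAP_BASE 0xffff880000000000ULL

/* Translate a physical address to its direct-mapped virtual address. */
static uint64_t phys_to_virt_fixed(uint64_t phys)
{
    return DIRECT_MAP_BASE + phys;
}
```

With that base, phys_to_virt_fixed(0x100000) evaluates to 0xffff880000100000.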

Somewhat unfortunately for us, things aren’t quite this simple on 4.8 and newer kernels (though it’s really not that much harder for our purposes).

4.8 and Newer Kernels

In kernel version 4.8, there were some changes made to the way KASLR functions. Namely, the base of the kernel’s mapping of physical memory is no longer a fixed constant. Instead, it gets randomized. This means that we can’t simply add the physical RAM addresses from iomem to a constant offset to get our virtual address.

One could be forgiven for thinking that this sounds like a massive problem but, luckily for us, the format of /proc/kcore gives us a very easy way to work around this limitation.

Remember earlier when we were talking about the ELF program header? Well, it turns out that this header contains a field called p_paddr. If we look at the man page for ELF, it will tell us the following about this field: “On systems for which physical addressing is relevant, this member is reserved for the segment’s physical address”.

Most of the time on x86 and x86_64 systems, this would not apply. In /proc/kcore, however, it’s used exactly as described, meaning it contains the physical address for this segment!

In other words, all we have to do is fetch the physical address ranges from /proc/iomem and then scan the segments in /proc/kcore looking for that physical address. Simple!
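
Here is a minimal sketch of that lookup, under the assumption that the PT_LOAD headers have already been read into an array. The function name phys_to_file_offset is hypothetical. It uses p_paddr and p_filesz to find the segment containing a physical address, then p_offset to work out where that address lives inside the kcore file.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Find the PT_LOAD segment whose physical range contains phys and compute
 * the matching offset inside the kcore file. Returns 1 on success, else 0.
 */
static int phys_to_file_offset(const Elf64_Phdr *seg, size_t count,
                               uint64_t phys, uint64_t *file_off)
{
    for (size_t i = 0; i < count; i++) {
        uint64_t base = seg[i].p_paddr;

        if (seg[i].p_type != PT_LOAD)
            continue;
        /* phys lies in [base, base + p_filesz) for a matching segment. */
        if (phys >= base && phys - base < seg[i].p_filesz) {
            *file_off = seg[i].p_offset + (phys - base);
            return 1;
        }
    }
    return 0;
}
```

Because only the physical address is compared, the randomized virtual base never enters the calculation.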

My Proof of Concept Tool

Using all of this background information, I put together a simple proof-of-concept tool. Well, actually, I put together two tools but, regardless, you can find them on my GitHub.

The first tool is called dumpmemory and it does exactly what I described in this article: dumps the system’s memory to disk.
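
To give a feel for the core of such a dump, here is an illustrative sketch of copying one byte range from an input descriptor to an output descriptor. It is not the actual dumpmemory source; the function name copy_range and the 64 KiB buffer size are arbitrary choices made for the example.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy len bytes starting at offset off of in_fd to out_fd.
 * Returns 0 on success, -1 on a read/write error or unexpected EOF.
 */
static int copy_range(int in_fd, int out_fd, uint64_t off, uint64_t len)
{
    unsigned char buf[65536];

    while (len > 0) {
        size_t want = len < sizeof(buf) ? (size_t)len : sizeof(buf);
        ssize_t got = pread(in_fd, buf, want, (off_t)off);

        if (got < 0) {
            perror("pread");
            return -1;
        }
        if (got == 0)
            return -1; /* hit EOF before the range was complete */

        /* write() may be partial, so loop until the chunk is flushed. */
        for (ssize_t done = 0; done < got; ) {
            ssize_t put = write(out_fd, buf + done, (size_t)(got - done));
            if (put <= 0) {
                perror("write");
                return -1;
            }
            done += put;
        }
        off += (uint64_t)got;
        len -= (uint64_t)got;
    }
    return 0;
}
```

For each System RAM range, the offset would be the file offset derived from the matching PT_LOAD segment and the length would be end - start + 1.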

The second tool is called scanmemory. This tool utilizes the same concepts, but instead of dumping the memory to disk it scans it for a string pattern. My idea behind this was that it could allow you to scan for a string in memory on a device where you don’t have room to dump the entire memory to a file.
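
One detail worth illustrating when scanning in chunks is that a match can straddle two reads. The sketch below is not the scanmemory source, just one way to handle that under stated assumptions: it carries the last pat_len - 1 bytes of each chunk over to the next read. The function name scan_range and the 4 KiB chunk size are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SCAN_CHUNK 4096 /* arbitrary read size for this sketch */

/*
 * Search bytes [off, off + len) of fd for pat. Returns the file offset of
 * the first match, or -1 if there is no match or a read error occurs.
 */
static int64_t scan_range(int fd, uint64_t off, uint64_t len,
                          const unsigned char *pat, size_t pat_len)
{
    unsigned char buf[2 * SCAN_CHUNK];
    size_t keep = 0;   /* bytes carried over from the previous chunk */
    uint64_t done = 0; /* bytes of the range read so far */

    if (pat_len == 0 || pat_len > SCAN_CHUNK)
        return -1;

    while (done < len) {
        size_t want = len - done < SCAN_CHUNK ? (size_t)(len - done) : SCAN_CHUNK;
        ssize_t got = pread(fd, buf + keep, want, (off_t)(off + done));
        size_t total;

        if (got < 0) {
            perror("pread");
            return -1;
        }
        if (got == 0)
            break;

        /* buf[0] corresponds to file offset off + done - keep. */
        total = keep + (size_t)got;
        for (size_t i = 0; i + pat_len <= total; i++)
            if (memcmp(buf + i, pat, pat_len) == 0)
                return (int64_t)(off + done - keep + i);

        /* Keep the tail so a match split across chunks is still found. */
        done += (uint64_t)got;
        keep = pat_len - 1 < total ? pat_len - 1 : total;
        memmove(buf, buf + total - keep, keep);
    }
    return -1;
}
```

Only two chunks’ worth of memory is needed regardless of how large the scanned range is, which fits the constrained-device use case.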

Disclaimer: I made both of these tools as a simple proof-of-concept and mostly to satisfy my own curiosity one evening. They have not been thoroughly tested. As such, I make no guarantees about their usefulness or accuracy.

With that disclaimer out of the way, I hope you enjoyed joining me on a dive into the world of /proc/kcore and /proc/iomem!