Kernel threads and process virtual addresses


If the kernel is spawned as threads and resides in memory, how can the ps command identify them if they are not normal processes? Take a closer look here:

root         2     0  0 févr.04 ?     00:00:00 [kthreadd]
root         3     2  0 févr.04 ?     00:00:01 [ksoftirqd/0]
root         5     2  0 févr.04 ?     00:00:00 [kworker/0:0H]

As we can see, those kernel threads have the same information as normal Linux process children: a process ID, a parent ID (0 for kthreadd, 2 for the rest), and an owning user (root).

Please explain this.

So if those threads are executed in a different manner, how can the CPU tell the difference between a kernel thread and a Linux process (an ELF executable or library) in memory? I need to know this, please.

Another question: when the compiler builds an executable, it emits virtual memory addresses, which the CPU then uses when accessing memory; how can the compiler generate those addresses?

Thank you guys.

Best Answer

I can't definitively answer the "kernel threads" question for Linux. For Windows, I can tell you that the "kernel threads" are simply threads created from some other kernel mode routine, running procedures that never enter user mode. When the scheduler picks a thread for execution it resumes its previous state (user or kernel, whatever that was); the CPU doesn't need to "tell the difference". The thread executes in kernel mode because that's what it was doing the last time it was executing.

In Windows these typically are created with the so-called "System" process as their parent, but they can actually be created in any process. So, in Unix they can have a parent ID of zero? i.e. belonging to no process? This actually doesn't matter unless the thread tries to use process-level resources.
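On Linux there is a user-visible distinction worth noting: kernel threads have no user address space, so their /proc/&lt;pid&gt;/cmdline is empty, which is also why ps prints their names in square brackets. A minimal sketch (the helper name is mine, assuming a Linux /proc filesystem):

```python
import os

def is_kernel_thread(pid: int) -> bool:
    """Kernel threads have no user-space command line in /proc."""
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        return f.read() == b""

# A normal user-space process always has a command line:
print(is_kernel_thread(os.getpid()))  # False for the current process
```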

As for the addresses assigned by the compiler... There are a couple of possible ways to think about this. One part of it is that the compiler really doesn't pick addresses for much of anything; almost everything a compiler produces (in a modern environment) is in terms of offsets. A given local variable is at some offset from wherever the stack pointer will be when the routine is instantiated. (Note that stacks themselves are at dynamically assigned addresses, just like heap allocations are.) A routine entry point is at some offset from the start of the code section it's in. Etc.
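This offset-based view shows up across compilers; as a loose analogy (not the answerer's own example), CPython's bytecode compiler refers to local variables by slot index rather than by memory address, and the actual addresses only exist once a frame is created at run time:

```python
import dis

def f(x):
    y = x + 1
    return y

# The *_FAST opcodes name locals by slot index, not by address;
# no memory location is fixed until the function is actually called.
ops = [ins.opname for ins in dis.get_instructions(f)]
print(ops)
```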

The second part of the answer is that addresses, such as they are, are assigned by the linker, not the compiler. Which really just defers the question - how can it do this? By which I guess you mean, how does it know what addresses will be available at runtime? The answer is "practically all of them."

Remember that every process starts out as an almost completely blank slate, with a new instantiation of user mode address space. For example, every process has its own instance of address 0x10000. So aside from having to avoid a few things that are at well-known (to the linker, anyway) locations within each process on the platform, the linker is free to put things where it wants them within the process address space. It doesn't have to know or care where anything else already is.

The third part is that nearly everything (except those OS-defined things that are at well-known addresses) can be moved to different addresses at run time, due to Address Space Layout Randomization, which exists on both Windows and Linux (Linux released it first, in fact). So it doesn't actually matter where the linker put things.
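That randomization is easy to observe. This sketch (assuming Linux, Python's ctypes, and ASLR left at its default setting) prints the address of libc's printf as seen by two separate processes; the two values will normally differ from run to run:

```python
import subprocess, sys

# Each child process gets its own randomized layout, so the address of
# the same libc function usually differs between the two runs.
snippet = (
    "import ctypes;"
    "print(ctypes.cast(ctypes.CDLL(None).printf, ctypes.c_void_p).value)"
)
a = subprocess.run([sys.executable, "-c", snippet],
                   capture_output=True, text=True).stdout.strip()
b = subprocess.run([sys.executable, "-c", snippet],
                   capture_output=True, text=True).stdout.strip()
print(a, b)  # typically two different addresses when ASLR is enabled
```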
