Under Linux you can. Boot the kernel with the isolcpus argument listing the cores you want reserved (maxcpus is related, but it takes the extra cores offline entirely), and on a quad-core machine you can have three cores that the scheduler never touches and that Linux is set up to never handle interrupts on. Then you just start your application and set its affinity to one of those reserved cores.
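The affinity part can be sketched with sched_setaffinity(2). The core number here is an assumption; this sketch pins to core 0 so it runs on any machine, but in the setup described above you'd pass one of the reserved cores instead:

```c
#define _GNU_SOURCE
#include <sched.h>

/* Pin the calling process to a single core. Returns 0 on success.
   Core 0 is used in the test just so this runs anywhere; on the
   setup described above you'd pass a core reserved away from the
   kernel via the boot arguments. */
static int pin_to_cpu(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* pid 0 means "the calling process" */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

After this call the scheduler will only ever run the process on the given core, which is the whole trick: everything else on the system stays on the kernel's core.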
You can extend this further by doing things like mmap'ing in 4 GB of memory through the kernel's hugepage support and locking the virtual-to-physical mapping so the kernel can't touch the block of physical RAM you just allocated. Then you can do things like talk directly to a PCI device such as a network card and set up DMA straight from the NIC into a buffer in your application's memory.
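A minimal sketch of the hugepage-plus-locking idea (assuming some huge pages have been reserved via /proc/sys/vm/nr_hugepages; this sketch falls back to normal pages when none are, and mlock may need a raised RLIMIT_MEMLOCK or CAP_IPC_LOCK to actually succeed):

```c
#define _GNU_SOURCE
#include <sys/mman.h>
#include <stddef.h>

/* Map a buffer backed by huge pages and lock it into physical RAM.
   Falls back to normal pages if no huge pages are reserved, so the
   sketch still runs; a real DMA setup would treat that as an error. */
static void *map_locked_buffer(size_t len) {
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED)  /* no huge pages reserved: plain pages */
        p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    /* Pin the pages so the kernel won't swap them out from under a
       DMA transfer; may fail without CAP_IPC_LOCK or a raised
       RLIMIT_MEMLOCK, which this sketch tolerates. */
    mlock(p, len);
    return p;
}
```

The locking is what makes handing the buffer's physical address to a NIC safe: the kernel can no longer page it out while the device is writing into it.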
All of this is done completely in userspace, but you get all the performance benefits of implementing everything as if it were running in Ring 0, and the kernel isn't involved in anything apart from the initial setup and teardown. You can build an extremely high-performance application running essentially on bare metal, with the Linux kernel still running on a different core to handle anything that doesn't directly involve your application, and no syscalls needed between the two to service requests.
What I'm confused about (I'll likely just have to play around with this feature at some point; I knew about the APIC, but not about completely sandboxing cores) is: if you're doing bare-metal operations, do you still have access to kernel functionality à la stdio and the libc libraries? Normally when you hit bare metal you're on your own. I'm just wondering, because the idea of writing my own threading and memory management libraries excites me to no end </sarcasm>.
Also, if you can call these functions like you're in userland, do they block until execution has completed on the other 'kernel' cores? And if you're creating pthreads elsewhere but not managing their execution on the 'non-kernel' core, what happens?
>userspace but you get all the performance benefits of implementing everything like it was running in Ring 0
Can you give any literature on this? These terms seem contradictory.
Sorry for the delay. I'm probably not the best person to answer this, and I know just enough on the subject to be dangerous, so with that in mind I'll give it a shot.
>If you're doing bare-metal operations, do you still have access to kernel functionality à la stdio and the libc libraries? Normally when you hit bare metal you're on your own.
That's just it: your process is just another Linux process. The difference is that the scheduler will put it on, let's say, core 2, while everything else has an affinity for core 1 and interrupts are also handled by core 1, meaning your application is never interrupted on core 2. You still get every feature you normally get in Linux.
>Also, if you can call these functions like you're in userland, do they block until execution has completed on the other 'kernel' cores?
You are in userland, normal userland. Implementation details of syscalls are black magic as far as I'm concerned, so take this with a grain of salt, but apart from kthreads the kernel isn't running in some other thread waiting for a syscall to service. A syscall is just your program trapping into the kernel (via int 0x80, or the faster syscall/sysenter instructions on modern systems), which jumps into the handler in kernel mode on the same core that was just running your code, does its work figuring out which syscall you're making, and finishes handling the trap. So basically yes, your thread "blocks" while the syscall is in progress, but on your special isolated core, not on core 1 like everything else running on the system.
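A tiny way to convince yourself of this (a sketch, using getpid as a throwaway syscall): pin to one core, enter the kernel, and check that you come back on the very same core.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>
#include <sys/syscall.h>

/* A syscall is handled synchronously on the calling core: pin to
   one CPU, enter the kernel, and we return on the same CPU.
   Returns 1 on success, -1 if pinning failed. */
static int syscall_stays_on_core(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        return -1;
    syscall(SYS_getpid);           /* enter and leave kernel mode */
    return sched_getcpu() == cpu;  /* 1 if still on the pinned core */
}
```

Nothing ran on any other core; the kernel code executed right there, in the middle of your thread's timeline.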
>And if you're creating pthreads elsewhere but not managing their execution on the 'non-kernel' core, what happens?
I'm not entirely sure what I you mean by this, specifically "managing their execution on the 'non-kernel' core". It's just a thread like a normal Linux thread, but a new thread initially inherits its creator's affinity, which for everything else on the system is core 1 only; you can change it to core 2.
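Setting a new thread's affinity can be sketched with the GNU extension pthread_attr_setaffinity_np (core 0 here is just so the example runs anywhere; in the setup above you'd pass the isolated core):

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Each thread reports which core it actually ran on. */
static void *worker(void *arg) {
    (void)arg;
    return (void *)(long)sched_getcpu();
}

/* Create a thread pinned to the given core at creation time and
   return the core it ran on, or -1 on failure. */
static int run_pinned_thread(int cpu) {
    pthread_t t;
    pthread_attr_t attr;
    cpu_set_t set;
    void *ret;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    pthread_attr_init(&attr);
    pthread_attr_setaffinity_np(&attr, sizeof(set), &set);
    if (pthread_create(&t, &attr, worker, NULL) != 0)
        return -1;
    pthread_join(t, &ret);
    pthread_attr_destroy(&attr);
    return (int)(long)ret;
}
```

You can also retarget an already-running thread with pthread_setaffinity_np; the attr variant just avoids the thread ever being scheduled on the wrong core first.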
>Can you give any literature on this? These terms seem contradictory.
What I meant was that generally, if you want to do certain low-level things like talk directly to hardware, you need to be running in kernel mode. But you don't really need to be in kernel mode all the time, just initially, to set things up so that normal user-mode code can talk to the hardware instead of having to use the kernel like one big expensive proxy. As for why a user-mode driver for a network card would be such a huge performance gain, there are a number of reasons: every syscall is a context switch; whatever data you're sending to or receiving from the network card gets needlessly copied to/from a kernel buffer instead of being read and written directly; you have to go through the entire Linux TCP/IP stack, which has tons of functionality you might not need but have to pay for in wasted cycles; and the list goes on.
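The per-syscall cost is easy to get a rough feel for. This sketch times getpid, going through syscall(2) to bypass any caching in libc; the absolute number will vary wildly by machine, kernel version, and mitigations, so treat it as an illustration, not a benchmark:

```c
#define _GNU_SOURCE
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Rough nanoseconds per syscall, measured over `iters` calls to a
   trivial syscall. Every iteration enters and leaves the kernel,
   which is the fixed cost a user-mode driver avoids per packet. */
static double ns_per_syscall(int iters) {
    struct timespec a, b;
    clock_gettime(CLOCK_MONOTONIC, &a);
    for (int i = 0; i < iters; i++)
        syscall(SYS_getpid);  /* trap into the kernel every time */
    clock_gettime(CLOCK_MONOTONIC, &b);
    return ((b.tv_sec - a.tv_sec) * 1e9
            + (b.tv_nsec - a.tv_nsec)) / iters;
}
```

Even a "cheap" syscall costs on the order of hundreds of nanoseconds on many systems; multiply that by millions of packets per second and the motivation for keeping the kernel out of the data path is obvious.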
I did manage to find an old Hacker News comment on the subject for further reading from someone much more well versed on the topic than I am. https://news.ycombinator.com/item?id=5703632
Also of interest might be Intel's DPDK, which is basically what we're talking about: moving the data plane out of the kernel completely for extreme scalability.
Even that wouldn't work. There are a lot of parts of the CPU, and all of them work together. RAM access, cache access, interrupts, and memory management are handled globally, not on a per-core basis. You'd run into the problem of needing multiple north bridges (do we even have those anymore, or are they on-chip now?), which you couldn't have.
You have to build an entire OS with real-time I/O at its heart. They do exist, and some are secure (BlackBerry's platform), but they aren't deployed in the test industry. The most likely reason is licensing fees: nobody writes data-acquisition software for them.