HOWTO: Build an RT-application

From RTwiki
Revision as of 19:28, 22 May 2007 by Remy (Talk | contribs)

Remy Bohmer


Revision History
Revision 2 2007-05-22
Draft / Work in progress


This document describes the steps for writing hard realtime Linux programs using the Realtime Preemption Patch. It also describes the pitfalls that destroy realtime responsiveness. It focuses on x86, as this is currently the most mature architecture.


Hardware causes of ISR latency

Good realtime behaviour of a system depends heavily on low-latency interrupt handling. The x86 platform was not really designed for RT usage: several mechanisms cause ISR latencies that can run into the tens of microseconds. Knowing them enables you to make design choices that work around their negative impact.

  • DMA bus mastering: Bus mastering events can cause long-latency CPU stalls of many microseconds. They can be generated by every device that uses DMA, such as SATA/PATA/SCSI devices and even network adapters. Video cards that insert wait cycles on the bus in response to a CPU access can also cause this kind of latency. Sometimes the behavior of such peripherals can be controlled from the driver, trading off throughput for lower latency. The negative impact of bus mastering is independent of the chosen OS, so this problem is not unique to Linux-RT; other RTOSes experience this type of latency as well.
  • On-demand CPU scaling: Long-latency events occur when the CPU is put in a low-power state after a period of inactivity. Such problems are usually quite easy to detect. (For example, on Fedora the 'cpuspeed' tool should be disabled, as this tool loads the ondemand scaling_governor driver.)
  • VGA console: While the system is fulfilling its RT requirements, the VGA text console must be left untouched. Nothing may be written to that console, and that includes printk output. The VGA text console causes very large latencies, up to hundreds of microseconds or more. It is better to use a serial console and have no login shell on the VGA text console. SSH or Telnet sessions can also be used. The 'quiet' option on the kernel command line could also work, provided no application outputs anything on the VGA console. Notice that using a graphical UI under X has no RT impact; only the VGA text console causes these latencies.

Latencies caused by Page-faults

Whenever the RT process runs into a pagefault, the kernel freezes the entire program (with all its threads) until it has handled the pagefault. There are two types of pagefaults: major and minor. Minor pagefaults are handled without I/O accesses; major pagefaults require I/O activity. Pagefaults are therefore dangerous for RT applications and need to be prevented.

If no swap space is used and no other applications stress the memory boundaries, then there is enough free RAM ready for the RT application. In this case the RT application will only run into minor pagefaults, which cause relatively small latencies. But if the RT application is just one of many applications on the system and swap space is used, then special actions have to be taken to protect the memory of the RT application. If memory has to be retrieved from disk or pushed to disk to handle a pagefault, the RT application will experience very large latencies, sometimes more than a second! Notice that pagefaults of one application cannot interfere with the RT behaviour of another application.

During startup an RT-application will always experience a lot of pagefaults. These cannot be prevented. In fact, this startup period must be used to claim enough memory for the RT-process and lock it in RAM. This must be done in such a way that pagefaults no longer occur once the application has to deliver its RT behaviour.

This can be done by taking care of the following during the initial startup phase:

  • Call mlockall() directly at the start of main().
  • Create all threads at startup time of the application, and touch each page of the entire stack of each thread. Never start threads dynamically during RT show time; this will ruin RT behavior.
  • Never use calls that are known to generate pagefaults, such as fopen(). (Opening a file results in an mmap() system call, which generates a pagefault.)
  • Do not use 'compile time static arrays' without initializing them directly after startup, before RT show time.

Simple memory locking example

   #include <stdio.h>
   #include <stdlib.h>       // needed for malloc()
   #include <sys/mman.h>     // needed for mlockall()
   #include <unistd.h>       // needed for sysconf()
   #include <sys/time.h>     // needed for getrusage()
   #include <sys/resource.h> // needed for getrusage()

   #define SOMESIZE (100*1024) // 100kB

   int main(int argc, char* argv[])
   {
      int i, page_size;
      char* buffer;
      struct rusage usage;

      // Lock all current and future pages so that they cannot be paged out
      if (mlockall(MCL_CURRENT | MCL_FUTURE)) {
         perror("mlockall failed");
      }

      // Allocate some memory
      page_size = sysconf(_SC_PAGESIZE);
      buffer = malloc(SOMESIZE);
      if (buffer == NULL) {
         perror("malloc failed");
         return 1;
      }

      // Touch each page in this piece of memory to get it mapped into RAM
      for (i = 0; i < SOMESIZE; i += page_size) {
         // Each write to this buffer will generate a pagefault.
         // Once the pagefault is handled a page will be locked in memory and never
         // given back to the system.
         buffer[i] = 0;
         // Print the number of major and minor pagefaults this application has triggered.
         // The counters in struct rusage are of type long, hence %ld.
         getrusage(RUSAGE_SELF, &usage);
         printf("Major-pagefaults:%ld, Minor Pagefaults:%ld\n", usage.ru_majflt, usage.ru_minflt);
      }
      // buffer is never released or swapped, so using it from now on will never lead to any pagefault

      //<do your RT-thing>

      return 0;
   }

Notice that this application must be run as 'root' to function properly. Strictly speaking, it only needs the capability called 'CAP_IPC_LOCK'. Notice also the difference between running this program with and without the mlockall() call. Tip: also run this application when there is no free RAM in the system, and see that the number of initial major pagefaults increases.

During runtime, getrusage() can be used to detect whether the running RT application has been hit by any new pagefaults.
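
A minimal sketch of such a runtime check, assuming the counters are sampled once before RT show time and compared again later. The type fault_count and both helper names are hypothetical and only serve as illustration:

```c
#include <sys/time.h>       /* needed for getrusage() */
#include <sys/resource.h>   /* getrusage(), struct rusage */

/* Hypothetical helper type: a snapshot of the process-wide fault counters. */
struct fault_count {
    long major;
    long minor;
};

/* Read the current fault counters; returns 0 on success, -1 on error. */
int read_fault_count(struct fault_count *fc)
{
    struct rusage usage;

    if (getrusage(RUSAGE_SELF, &usage) != 0)
        return -1;
    fc->major = usage.ru_majflt;
    fc->minor = usage.ru_minflt;
    return 0;
}

/* Returns the number of pagefaults (major + minor) that occurred since
 * the snapshot 'before' was taken, or -1 on error. */
long new_faults_since(const struct fault_count *before)
{
    struct fault_count now;

    if (read_fault_count(&now) != 0)
        return -1;
    return (now.major - before->major) + (now.minor - before->minor);
}
```

Take the snapshot with read_fault_count() at the end of the startup phase and call new_faults_since() from a non-critical point later on. Any result above zero means the RT part has run into pagefaults that the startup phase did not cover.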

How to deal with threads

When a new thread is created, the kernel allocates memory for a new stack and for the thread administration. These allocations result in new pagefaults. Therefore all threads need to be created at startup time.

After a thread is created, all stack pages of that thread need to be touched to prevent pagefaults when the stack is used for the first time. By default, threads are created with a stack size of 8MB. Touching all 8MB is overkill for most applications, and if every thread keeps the 8MB default, the system will probably run out of memory in no time. So we need to figure out the maximum amount of stack space a given thread uses, and then create that thread with the stack space it requires. You may add a little bit more, but certainly nothing less.

Touching the stack can be achieved by calling a function which has a sufficiently large automatic variable and which writes to the memory occupied by this large array in order to touch these stack pages. This way, enough pages will be mapped for the stack and can be locked into RAM. The dummy writes ensure that not even copy-on-write page faults can occur in the critical section.

TODO: insert example application.
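
A minimal sketch of the approach described above, assuming mlockall(MCL_CURRENT | MCL_FUTURE) has already been called in main(). The stack size, the margin, and all function names are assumptions chosen for illustration; the real stack requirement has to be measured per thread:

```c
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>     /* sysconf() */

/* Assumed worst-case stack usage of this thread; measure it for real code. */
#define RT_STACK_SIZE   (512 * 1024)
/* Assumed margin for what already lives on the stack (TLS, guard, frames). */
#define RT_STACK_MARGIN (64 * 1024)

/* Write to a large automatic array so the stack pages it covers get mapped. */
static void prefault_stack(void)
{
    volatile unsigned char dummy[RT_STACK_SIZE - RT_STACK_MARGIN];
    long page_size = sysconf(_SC_PAGESIZE);
    size_t i;

    if (page_size <= 0)
        page_size = 4096;   /* assumed fallback */
    /* volatile keeps the compiler from removing these dummy writes */
    for (i = 0; i < sizeof(dummy); i += (size_t)page_size)
        dummy[i] = 0;
    dummy[sizeof(dummy) - 1] = 0;
}

/* Thread body: touch the stack first, then do the actual work. */
static void *rt_thread(void *arg)
{
    int *done = arg;

    prefault_stack();
    /* <wait for RT show time, then do your RT-thing> */
    *done = 1;
    return NULL;
}

/* Create the thread with an explicit stack size instead of the default.
 * Returns 0 on success, otherwise a pthread error number. */
int start_rt_thread(pthread_t *tid, int *done)
{
    pthread_attr_t attr;
    int err;

    err = pthread_attr_init(&attr);
    if (err != 0)
        return err;
    err = pthread_attr_setstacksize(&attr, RT_STACK_SIZE);
    if (err == 0)
        err = pthread_create(tid, &attr, rt_thread, done);
    pthread_attr_destroy(&attr);
    return err;
}
```

Call start_rt_thread() once per thread during the startup phase. The margin exists because part of the requested stack is already occupied when the thread function starts, so the dummy array must be somewhat smaller than the stack itself.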

File handling

File handling is known to generate disastrous pagefaults. So, if the RT-application needs file access, it is best to split the application into an RT part and a file-handling part, and to let both parts communicate through sockets. I have never seen a pagefault caused by socket traffic. Note: when a file is opened, fopen() does an mmap() to allocate new memory for the process, resulting in a new pagefault.
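
A minimal sketch of such a split, assuming the file-handling part runs as a separate process that is forked during startup and connected through a Unix-domain socketpair. All function names are hypothetical, and dropping a message when the socket buffer is full is a design choice of this sketch, not something prescribed above:

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* File-handling part: copy everything received on 'fd' into 'path'. */
static void file_writer_loop(int fd, const char *path)
{
    char buf[512];
    ssize_t n;
    FILE *out = fopen(path, "a");

    if (out == NULL)
        _exit(1);
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        fwrite(buf, 1, (size_t)n, out);
    fclose(out);
    _exit(n == 0 ? 0 : 1);  /* n == 0: the RT part closed its socket */
}

/* Start the file-handling process. Call this during startup, before RT
 * show time. Returns the socket for the RT part, or -1 on error. */
int start_file_writer(const char *path, pid_t *writer_pid)
{
    int sv[2];
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {             /* child: file-handling part */
        close(sv[0]);
        file_writer_loop(sv[1], path);
    }
    close(sv[1]);               /* parent: RT part */
    *writer_pid = pid;
    return sv[0];
}

/* RT part: hand a message to the file-handling part without blocking.
 * Returns 0 if the whole message was queued, -1 otherwise. */
int rt_log(int fd, const char *msg)
{
    size_t len = strlen(msg);
    ssize_t sent = send(fd, msg, len, MSG_DONTWAIT | MSG_NOSIGNAL);

    return (sent == (ssize_t)len) ? 0 : -1;
}

/* Shutdown: close the socket so the writer sees end-of-file, then reap it. */
int stop_file_writer(int fd, pid_t writer_pid)
{
    int status;

    close(fd);
    if (waitpid(writer_pid, &status, 0) != writer_pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The RT part only ever calls rt_log(), which never touches a file and never blocks. All fopen() and fwrite() activity, and the pagefaults that come with it, stays in the other process.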

How to use dynamic memory allocation

The previous sections explained that all memory must be allocated and claimed at startup time, for the entire lifetime of the RT-application, before the RT-application starts to fulfill its RT requirements. If memory is allocated later on, this normally results in pagefaults and thus ruins the RT behavior of the application.

Q: So, we cannot run C++ applications with dynamic memory allocation?
A: Wrong! Dynamic memory allocation is possible, if:

  • allocated memory is never given back to the kernel.

How can this be achieved? All memory allocation routines are implemented inside Glibc. Glibc translates each memory allocation request to a call to:

  • mmap(): mmap maps a certain amount of memory into the virtual memory space of the process. Glibc uses mmap() for large allocations (see M_MMAP_THRESHOLD below).
  • sbrk(): sbrk increases (or decreases) the memory block assigned to the process by a given size.

Glibc offers interfaces that can be used to configure its behavior related to these calls.

Glibc can be configured with how much memory must be free before it calls sbrk() to give memory back to the kernel. It can also be configured when Glibc uses mmap() instead of sbrk(). What we need to do is get rid of the mmap() calls and configure Glibc to never give memory back to the kernel until the process terminates.

We use this call for it: int mallopt(int param, int value), which is declared in malloc.h. When calling mallopt(), the param argument specifies the parameter to be set, and value is the new value for it. Possible choices for param, as defined in malloc.h, are:

  • M_TRIM_THRESHOLD: This is the minimum size (in bytes) of the top-most, releasable chunk that will cause sbrk to be called with a negative argument in order to return memory to the system.
  • M_TOP_PAD: This parameter determines the amount of extra memory to obtain from the system when a call to sbrk is required. It also specifies the number of bytes to retain when shrinking the heap by calling sbrk with a negative argument. This provides the necessary hysteresis in heap size such that excessive amounts of system calls can be avoided.
  • M_MMAP_THRESHOLD: All chunks larger than this value are allocated outside the normal heap, using the mmap system call. This way it is guaranteed that the memory for these chunks can be returned to the system on free.
  • M_MMAP_MAX: The maximum number of chunks to allocate with mmap. Setting this to zero disables all use of mmap.

More background information can be found at this paper:

The next example shows how we can create a pool of memory during startup and lock it into memory. At startup a block of memory is allocated through the malloc() call. Before that, Glibc is configured such that it uses the sbrk() call to fulfill this allocation. After locking it, we can free this block of memory, knowing that it is not released to the kernel and is still assigned to our RT-process. We have now created a pool of memory that will be used by Glibc for dynamic memory allocation. We can use new and delete as much as we want without causing any pagefault! Even if the system is fully stressed and swapping is continuously active, the RT-application will never run into any pagefault...
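
A minimal sketch of such a startup sequence, assuming glibc. The function names are hypothetical and the pool size is up to the caller; this is an illustration of the steps described above, not the original example:

```c
#include <malloc.h>     /* mallopt(), M_TRIM_THRESHOLD, M_MMAP_MAX (glibc) */
#include <stdlib.h>     /* malloc(), free() */
#include <sys/mman.h>   /* mlockall() */
#include <unistd.h>     /* sysconf() */

/* Tell glibc never to trim the heap and never to use mmap() for malloc().
 * mallopt() returns 1 on success. */
int configure_malloc_for_rt(void)
{
    if (mallopt(M_TRIM_THRESHOLD, -1) != 1)
        return -1;
    if (mallopt(M_MMAP_MAX, 0) != 1)
        return -1;
    return 0;
}

/* Grow the heap by 'size' bytes, touch every page, and free the block.
 * Because trimming is disabled, the pages stay assigned to the process. */
int prefault_heap(size_t size)
{
    long page_size = sysconf(_SC_PAGESIZE);
    volatile char *pool;
    size_t i;

    if (page_size <= 0)
        page_size = 4096;   /* assumed fallback */
    pool = malloc(size);
    if (pool == NULL)
        return -1;
    /* volatile keeps the compiler from optimizing the dummy writes away */
    for (i = 0; i < size; i += (size_t)page_size)
        pool[i] = 0;
    free((void *)pool);
    return 0;
}

/* Full startup sequence: lock memory, configure glibc, build the pool.
 * Needs root or CAP_IPC_LOCK, like the memory locking example. */
int rt_memory_init(size_t pool_size)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -1;
    if (configure_malloc_for_rt() != 0)
        return -1;
    return prefault_heap(pool_size);
}
```

Call rt_memory_init() at the start of main() with a pool size that covers the peak dynamic memory use of the application. Allocations that fit in the pool are then served from pages that are already mapped and locked; an allocation that exceeds the pool makes the heap grow again, outside the startup phase.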
