Description

overview

Realtime Preemption is (as of this writing, 12/21/2004) a patch that aims to improve the realtime performance of the Linux kernel.

Recent patches from Ingo Molnar include a large number of technologies for improving preemption, and for debugging preemption issues, in the Linux kernel.

An overview of the technologies is as follows:

voluntary preempt

Overview:

conversion of spinlocks to mutexes

According to Ingo Molnar, its primary author, "the big change in this release is the addition of PREEMPT_REALTIME, which is a new implementation of a fully preemptible kernel model".

For a brief description of the overall technology, see: http://kerneltrap.org/node/3995?PHPSESSID=4bc02ae16e5a27308031f3cd664fd574

Briefly, the technology makes spinlocks and rwlocks preemptible by default.

Ingo mentioned at one time that roughly 20% of the locking in his kernel configuration still used raw (non-preemptible) spinlocks. With the roughly 90 raw locks he cited, this implies about 450 spinlocks in total in his configuration (90 / 0.20 = 450).

Ingo said the following about how well this works on Uni-processor (UP) systems versus SMP systems:

{{{...and no matter how well UP works, to fix SMP one has to 'cover' all the necessary locks first before fixing it, which (drastic) increase in raw locks invalidates most of the UP efforts of getting rid of raw locks. That's why i decided to go for SMP primarily - didnt see much point in going for UP. }}}

Normally, on UP the spinlocks are compiled away. When PREEMPT is turned on (without the new patch), these spinlocks are turned into markers for non-preemptible regions. When RT-PREEMPT is used, non-raw spinlocks are converted into rw-semaphores.
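
The three cases above can be summarized as a small decision function. This is only an illustrative sketch in Python, not kernel code: the function name and its parameters are hypothetical, the SMP branch (locks really spin) is general background rather than something stated on this page, and the treatment of raw locks on a UP kernel with RT-PREEMPT is an assumption.

```python
def spinlock_behaviour(smp, preempt, preempt_rt=False, raw=False):
    """Describe what a spinlock turns into for a given kernel configuration.

    Hypothetical helper for illustration only; the arguments loosely mirror
    CONFIG_SMP, CONFIG_PREEMPT and CONFIG_PREEMPT_RT.
    """
    if preempt_rt and not raw:
        # RT-PREEMPT: non-raw spinlocks become sleeping locks
        # (rw-semaphores, according to the text above).
        return "sleeping lock (preemptible)"
    if smp:
        # Background knowledge, not from this page: on SMP the lock spins.
        return "spinning lock"
    if preempt or preempt_rt:
        # UP with preemption: only a marker for a non-preemptible region.
        # Assumption: raw locks under RT-PREEMPT on UP behave the same way.
        return "non-preemptible region marker"
    # UP without preemption: nothing is left of the lock.
    return "compiled away"


if __name__ == "__main__":
    print(spinlock_behaviour(smp=False, preempt=False))
    print(spinlock_behaviour(smp=False, preempt=True))
    print(spinlock_behaviour(smp=True, preempt=True, preempt_rt=True))
```

The ordering of the checks reflects the idea that the RT conversion applies first, and that only locks explicitly marked raw keep the older behaviour.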

people working on/interested in this stuff

people working on related stuff

miscellaneous comments

Comments regarding the scheduling of RT tasks

Ingo said (in this message):


note that my -RT patchset includes scheduler changes that implement "global RT scheduling" on SMP systems. Give it a go, it's at:

you have to enable CONFIG_PREEMPT_RT to active this feature. I've designed this code to not hurt non-RT scheduling, and i've optimized performance for the 'lightly loaded case' (which is the most common to occur on mainline-using systems).

A very short description of the design: there's a global 'RT overload counter' - which is zero and causes no overhead if there is at most 1 RT task in every runqueue. (i.e. at most 2 RT tasks on a 2-way system, at most 4 RT tasks on a 4-way system, etc.) If the system gets into 'RT overload' mode (e.g. the third RT task gets activated on a 2-way box), then the scheduler starts to balance the RT tasks agressively. Also, whenever an RT task is preempted on a CPU, or is woken up but cannot preempt a higher-prio RT task on a given CPU, then it's 'pushed' to other CPUs if possible. This design avoids global locking (it avoids a global runqueue), which simplifies things immensely. (I first tried a global runqueue for RT tasks but the complexity impact was much bigger.)

(note that these scheduler changes are resonably self-contained and do not depend on other parts of PREEMPT_RT, so in theory they could be added to mainline too, after some time - given lots of testing and broad agreement.)
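
The design Ingo describes can be illustrated with a toy user-space model. The Python sketch below is not the scheduler code from the -RT patch: the class and method names are hypothetical, the counter is modelled as "number of runqueues holding more than one RT task" (an assumption consistent with the description), priorities are plain non-negative integers where a larger number means higher priority, and a task displaced by a push is not pushed again.

```python
class RTBalancer:
    """Toy model: one list of RT task priorities per CPU runqueue."""

    def __init__(self, nr_cpus):
        self.rt_tasks = [[] for _ in range(nr_cpus)]
        # Global "RT overload counter": stays 0 while every runqueue
        # holds at most one RT task.
        self.rt_overload = 0

    def running_prio(self, cpu):
        """Priority of the RT task running on cpu, or -1 if it has none."""
        queue = self.rt_tasks[cpu]
        return max(queue) if queue else -1

    def _add(self, cpu, prio):
        queue = self.rt_tasks[cpu]
        queue.append(prio)
        if len(queue) == 2:  # runqueue just became overloaded
            self.rt_overload += 1

    def dequeue(self, cpu, prio):
        queue = self.rt_tasks[cpu]
        queue.remove(prio)
        if len(queue) == 1:  # runqueue is no longer overloaded
            self.rt_overload -= 1

    def enqueue(self, cpu, prio):
        """Wake an RT task on cpu; balance only when overloaded."""
        self._add(cpu, prio)
        if self.rt_overload == 0:
            return  # lightly loaded fast path: no balancing work at all
        self._push(cpu)

    def _push(self, cpu):
        """Push waiting RT tasks from cpu to CPUs where they would run."""
        queue = self.rt_tasks[cpu]
        others = [c for c in range(len(self.rt_tasks)) if c != cpu]
        while len(queue) > 1 and others:
            waiting = sorted(queue)[-2]  # best task that is not running
            target = min(others, key=self.running_prio)
            if self.running_prio(target) >= waiting:
                break  # no CPU where the waiting task could preempt
            self.dequeue(cpu, waiting)
            self._add(target, waiting)


if __name__ == "__main__":
    box = RTBalancer(4)
    box.enqueue(0, 50)
    box.enqueue(0, 70)  # overloads CPU 0, so prio 50 is pushed away
    print(box.rt_tasks, box.rt_overload)
```

Note that the model keeps per-CPU lists and a single shared counter rather than one global runqueue, mirroring the trade-off mentioned in the quote.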


comments regarding the hard parts of this work

Ingo says (at: http://groups-beta.google.com/group/linux.kernel/msg/cf036477d30ab736) {{{some of the harder stuff:

- the handling of per-CPU data structures (get_cpu_var())

- RCU and softirq data structures

- the handling of the IRQ flag }}}

comments about the number of raw spinlocks needed

Ingo says (at: http://groups-beta.google.com/group/linux.kernel/msg/e63b2860d2e993dd)

{{{Sven Dietrich <sdietr...@mvista.com> wrote:

> IMO the number of raw_spinlocks should be lower, I said teens before.

> Theoretically, it should only need to be around hardware registers and some memory maps and cache code, plus interrupt controller and other SMP-contended hardware.

yeah, fully agreed. Right now the 90 locks i have means roughly 20% of all locking still happens as raw spinlocks.

But, there is a 'correctness' _minimum_ set of spinlocks that _must_ be raw spinlocks - this i tried to map in the -T4 patch. The patch does run on SMP systems for example. (it was developed as an SMP kernel - in fact i never compiled it as UP :-|.) If code has per-CPU or preemption assumptions then there is no choice but to make it a raw spinlock, until those assumptions are fixed. }}}

Rationale

This feature is intended to provide much better realtime scheduling response for a Linux system.

Resources

Projects

Various parties are working on ports: TimeSys and MontaVista, in particular, seem to have made ports to PPC and ARM platforms.

Specifications

None that I'm aware of.

Online resources

The original announcement for voluntary-preemption:

Here is some material by Jonathan Corbet:

There's a page of links about RT for audio at:

A brief introduction to the RT patch (sorry, in Japanese only):

Downloads

Patch

Utility programs

[other programs, user-space, test, etc. related to this technology]

How To Use

Configuration variables

The patch introduces (or modifies) the following configuration variables:

Variable

Purpose

ASM_SEMAPHORES

BLOCKER

CRITICAL_IRQSOFF_TIMING

CRITICAL_PREEMPT_TIMING

CRITICAL_TIMING

FRAME_POINTER

LATENCY_TIMING

LATENCY_TRACE

MCOUNT

PREEMPT

PREEMPT_BKL

PREEMPT_DESKTOP

PREEMPT_HARDIRQS

PREEMPT_NONE

PREEMPT_RT

PREEMPT_SOFTIRQS

PREEMPT_TRACE

PREEMPT_VOLUNTARY

RTC_HISTOGRAM

RT_DEADLOCK_DETECT

RWSEM_GENERIC_SPINLOCK

RWSEM_XCHGADD_ALGORITHM

SPINLOCK_BKL

USE_FRAME_POINTER

WAKEUP_TIMING

* retrieved from patch with command:

grep "[+-]config " realtime-preempt-2.6.10-mm1-V0.7.34-01 | sed "s/[+-]config //" | sort | uniq
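
For readers who prefer a script, the pipeline above can be approximated in Python. This is a sketch, not part of the original page: the function name `config_names` is hypothetical, and the result is sorted by code point, which may differ from the locale-dependent order produced by `sort`.

```python
import re

# Same pattern as the grep/sed command: a '+' or '-' diff marker
# immediately followed by "config ".
CONFIG_MARKER = re.compile(r"[+-]config ")


def config_names(patch_lines):
    """Return the sorted, de-duplicated config names touched by a patch."""
    names = set()
    for line in patch_lines:
        if CONFIG_MARKER.search(line):  # grep "[+-]config "
            # sed "s/[+-]config //" removes only the first match per line
            names.add(CONFIG_MARKER.sub("", line.rstrip("\n"), count=1))
    return sorted(names)  # sort | uniq


if __name__ == "__main__":
    sample = [
        "+config PREEMPT_RT\n",
        "-config PREEMPT\n",
        "+config PREEMPT_RT\n",
        " config UNCHANGED\n",
    ]
    print(config_names(sample))
```

To run it against the real patch, one could pass an open file object, for example `config_names(open("realtime-preempt-2.6.10-mm1-V0.7.34-01", errors="replace"))`, assuming the patch file is in the current directory.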

How to validate

[put references to test plans, scripts, methods, etc. here]

Related projects

MontaVista released a similar technology, which had the following features:

See http://groups-beta.google.com/group/linux.kernel/msg/7eeef031d9ec1446 {{{These RT enhancements are an integration of features developed by others and some new MontaVista components:

}}}

Sample Results

[Examples of use with measurement of the effects.]

Case Study 1

Case Study 2

Trevor Woerner published results in November 2005 from latency measurements he had been recording on the 2.6.14 kernel with Ingo's patches.

See http://geek.vtnet.ca/embedded/LatencyTests/html/index.html

Case Study 3

Status

Future Work/Action Items

Here is a list of things that could be worked on for this feature:

people who expressed interest

Manas Saksena, Jon Masters, Takeharu Kato, Ralph Siemsen, Jyunji Kondo

RealtimePreemption (last edited 2008-05-07 18:22:04 by localhost)