Applying POSIX to real-time systems
In today's computing systems, it is becoming increasingly important to design software with an open system architecture utilizing industry-adopted standards. The need to develop open systems is driven by three major factors. First, gone are the days when a single developer could implement the entire system from scratch. Software development programs are growing in scale, requiring teams of increasing size. Second, software does not operate in isolation; it must co-exist with the vast amount of commercially available software. Last, the lifecycle of a software application is typically long, requiring numerous modifications and updates as new features are added.
An open software architecture addresses the challenges of today's software development process by defining standard software interfaces, which promote interoperability and portability. Openly published standard interfaces also reduce the cost of adding functionality in the future.
Standards are pervasive in today's computer systems. New standards are constantly being defined to address the ever-changing state of software technology. A standard will not be effective if it is not used, or if it is gone tomorrow. To be effective, a standard must be based on well-established technology and accepted by a wide portion of the industry.
The original Portable Operating System Interface for Computing Environments (POSIX) standard was first published in 1990. POSIX is based on UNIX, a well-established technology dating back to the early 1970s. POSIX defines a standard way for an application to interface to the OS. The original POSIX standard defines interfaces to core functions such as file operations, process management, signals, and devices. Subsequent releases of POSIX have also been defined to cover real-time extensions and multithreading.
In a perfect world, because of the previously cited advantages, one would always choose a standard. However, in the real world, a number of questions must be answered before deciding to use a standard. These include:
• Does the standard provide the functionality that my application needs?
• Is the performance of the standard, or implementation of the standard, suitable for my application?
• Do commercially available implementations of the standard exist?
In this article, I will discuss the usefulness of POSIX in real-time systems by looking at three factors: functionality, performance, and availability. Because real-time systems typically have stringent performance constraints, emphasis is placed on the performance of POSIX implementations.
POSIX real-time OSs
The POSIX family of standards includes over 30 individual standards, ranging from specifications for basic OS services to specifications for testing the conformance of an OS to the standard. This article focuses on those standards important to the development of real-time embedded systems. In this section I discuss real-time systems as well as give a brief review of the relevant POSIX standards.
Real-time systems: A real-time
system is one where the timeliness
of the result of a calculation
is important. Examples include
military weapons systems, fac-
tory control systems, and video
and audio streaming. Real-time
systems are typically catego-
rized into two classes: hard and
soft. In a hard real-time system
the time deadlines must be met
or the result of a calculation is
invalid. For example, in a mis-
sile tracking system, if the mis-
sile is delayed, it may miss its
intended target. The timing
constraints in a soft real-time
system are not as stringent. The
result of a calculation can still be
useful if it does not meet its timing
deadline. Audio streaming is
Figure 1: Solaris processor binding and control
[Diagram: CPU1 to CPU4, divided between real-time threads and time-shared threads with unbound interrupts.]
TABLE 1 POSIX standards
Standard | Name | Description
1003.1a | OS Definition | Basic OS interfaces; includes support for: single process, multi process, job control, signals, user groups, file system, file attributes, file device management, file locking, device I/O, device-specific control, system database, pipes, FIFO and C language
1003.1b | Real-time extensions | Functions needed for real-time systems; includes support for: real-time signals, priority scheduling, timers, asynchronous I/O, prioritized I/O, synchronized I/O, file sync, mapped files, memory locking, memory protection, message passing, semaphores and shared memory
1003.1c | Threads | Functions to support multiple threads within a process; includes support for: thread control, thread attributes, priority scheduling, mutexes, mutex priority inheritance, mutex priority ceiling and condition variables
1003.1d | Additional real-time extensions | Additional interfaces; includes support for: new process create semantics (spawn), sporadic server scheduling, execution time monitoring of processes and threads, I/O advisory information, timeouts on blocking functions, device control and interrupt control
1003.1j | Advanced real-time extensions | More real-time functions including support for: typed memory, nanosleep improvements, barrier synchronization, reader/writer locks, spin locks and persistent notification for message queues
1003.21 | Distributed real-time | Functions to support real-time distributed communication; includes support for: buffer management, send control blocks, asynchronous and synchronous operations, bounded blocking, message priorities, message labels and implementation protocols
1003.1h | High availability | Services for Reliable, Available, and Serviceable Systems (SRASS); includes support for: logging, core dump control, shutdown/reboot and reconfiguration
an example of a soft real-time
system. If a packet of data is late
or lost, the quality of the audio
is degraded, but the stream may
still be audible.
To guarantee that the timing
requirements of a real-time sys-
tem are met, the behavior and
timing of the underlying com-
puting system must be predict-
able. The time required by all
operations must be bounded for
the timing of the system to be
called predictable. This implies
that the worst case timing of all
operations is known.
Sometimes though, a system is called
predictable only if its worst case
timing is also very close to its
average case timing.
POSIX real-time related standards: Of the more than 30 POSIX standards, the seven standards listed in Table 1 are especially relevant to the development of real-time and embedded systems. The first three standards (1003.1a, 1003.1b, and 1003.1c) are the most widely supported. POSIX 1003.1a defines the interface to basic OS functions, and was the first to be adopted, in 1990. Real-time extensions are defined in the standards 1003.1b, 1003.1d, 1003.1j, and 1003.21. However, the original real-time extensions, defined by 1003.1b, are the only ones commonly implemented. Support for multiple threads in a process is provided in a separate standard, POSIX 1003.1c. POSIX also includes support for high availability in the 1003.1h standard.
Commercial support for
POSIX varies widely. Because
POSIX 1003.1a is based on
UNIX, any UNIX-based OS will
naturally be very close to the
standard. To be conformant to
the POSIX standard, the OS and
hardware platform have to be
certified using a suite of tests.
Currently, test suites exist only
for POSIX 1003.1a. Because
POSIX is structured as a set of
optional features, OS vendors
can choose to implement only
portions of POSIX and still be
POSIX compliant. Compliance
only requires the vendor to
state which features of POSIX
are and are not implemented.
This is a source of confusion
because, for marketing reasons,
almost all vendors report that
they are POSIX compliant.
POSIX profiles. Embedded
systems typically have space
and resource limitations, and
an OS that includes all the fea-
tures of POSIX may not be ap-
propriate. The POSIX 1003.13
profile standard was defined to
address these types of systems.
POSIX 1003.13 does not con-
tain any additional features; in-
stead it groups the functions
from existing POSIX standards
into units of functionality. The
profiles are based on whether or
not an OS supports more than
one process and a file system.
The four current profiles are
summarized in Table 2.
POSIX real-time extensions.
POSIX 1003.1b, as well as
1003.1d and 1003.1j, define extensions
useful for development
of real-time systems. Functions
defined in the original real-time
extension standard 1003.1b are
supported across a wider num-
ber of OSs than the other two
specifications. For this reason
this article focuses on POSIX
LISTING 1 Creating and using a POSIX timer
#include <signal.h>
#include <stdio.h>
#include <time.h>

/* routine that is called when timer expires (defined below) */
void timer_intr(int sig, siginfo_t *extra, void *cruft);

/* named start_timer so that it does not clash with the POSIX timer_create() call */
void start_timer(int num_secs, int num_nsecs)
{
    struct sigaction sa;
    struct sigevent sig_spec;
    sigset_t allsigs;
    struct itimerspec tmr_setting;
    timer_t timer_h;

    /* setup signal to respond to timer */
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = timer_intr;
    if (sigaction(SIGRTMIN, &sa, NULL) < 0)
        perror("sigaction");
    sig_spec.sigev_notify = SIGEV_SIGNAL;
    sig_spec.sigev_signo = SIGRTMIN;

    /* create timer, which uses the REALTIME clock */
    if (timer_create(CLOCK_REALTIME, &sig_spec, &timer_h) < 0)
        perror("timer create");

    /* set the initial expiration (1 second) and the period of the timer */
    tmr_setting.it_value.tv_sec = 1;
    tmr_setting.it_value.tv_nsec = 0;
    tmr_setting.it_interval.tv_sec = num_secs;
    tmr_setting.it_interval.tv_nsec = num_nsecs;
    if (timer_settime(timer_h, 0, &tmr_setting, NULL) < 0)
        perror("settimer");

    /* wait for signals */
    sigemptyset(&allsigs);
    while (1) {
        sigsuspend(&allsigs);
    }
}

/* routine that is called when timer expires */
void timer_intr(int sig, siginfo_t *extra, void *cruft)
{
    /* perform periodic processing and then return */
}
1003.1b. The following items constitute the bulk of the features defined in POSIX 1003.1b:
• Timers: periodic timers, delivery is accomplished using POSIX signals
• Priority scheduling: fixed priority preemptive scheduling with a minimum of 32 priority levels
• Real-time signals: additional signals with multiple levels of priority
• Semaphores: named and memory-based counting semaphores
• Message queues: message passing using named queues
• Shared memory: named memory regions shared between multiple processes
• Memory locking: functions to prevent virtual memory swapping of physical memory pages
Listing 1 shows C code for creating and using a POSIX timer. Creating a timer consists of two steps: specifying a signal that is to be delivered at timer expiration, and creating/setting the timer itself. In this example we use the highest priority real-time signal (SIGRTMIN) to asynchronously call the timer handler routine. Two values must be specified for the timer: the initial expiration time (it_value) and the period between subsequent expirations (it_interval). The structure (itimerspec) allows nanosecond time specification; however, actual resolution is dependent on the system. The POSIX call clock_getres() can be used to determine the actual resolution, typically 10ms or 1ms.
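As a minimal sketch of the resolution query mentioned above, the following helper wraps clock_getres(). The function name clock_resolution_ns is hypothetical and chosen only for illustration; the value it reports depends entirely on the OS and hardware.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: return the resolution of the given clock in
 * nanoseconds, or -1 on error. */
long long clock_resolution_ns(clockid_t clock_id)
{
    struct timespec res;

    if (clock_getres(clock_id, &res) < 0) {
        perror("clock_getres");
        return -1;
    }
    return (long long)res.tv_sec * 1000000000LL + res.tv_nsec;
}
```

Calling clock_resolution_ns(CLOCK_REALTIME) before choosing a timer period shows whether the requested period can actually be honored on a given platform.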
POSIX 1003.1b provides support for fixed priority preemptive scheduling. To be compliant with POSIX, an OS must implement at least 32 priorities. POSIX defines three scheduling policies to handle processes running at the same priority. For SCHED_FIFO, processes are scheduled first in first out, and each runs until it completes, blocks, or is preempted by a higher priority process. For SCHED_RR, the scheduler uses a time quantum to schedule processes in a round robin fashion. The SCHED_OTHER policy is also included to handle an implementation-defined scheduling policy. Because SCHED_OTHER is implementation dependent, it is not portable across different platforms, and its use should be limited.
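The scheduling interface described above can be sketched as follows. The helper names priority_levels and become_fifo_max are hypothetical; the POSIX calls they wrap are sched_get_priority_min(), sched_get_priority_max(), and sched_setscheduler(). Switching to SCHED_FIFO usually requires elevated privileges, so the second helper may well fail on an ordinary account.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <stdio.h>

/* Hypothetical helper: number of distinct priority levels the OS offers
 * for a policy, or -1 on error. */
int priority_levels(int policy)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);

    if (lo < 0 || hi < 0)
        return -1;
    return hi - lo + 1;
}

/* Hypothetical helper: run the calling process under SCHED_FIFO at the
 * highest available priority. Returns 0 on success, -1 on error. */
int become_fifo_max(void)
{
    struct sched_param param;
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (hi < 0)
        return -1;
    param.sched_priority = hi;
    if (sched_setscheduler(0, SCHED_FIFO, &param) < 0) {
        perror("sched_setscheduler");
        return -1;
    }
    return 0;
}
```

Querying the range at run time, rather than hard-coding priority numbers, keeps the code portable across OSs that number their priorities differently.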
POSIX uses named objects for several different mechanisms including semaphores, shared memory, and message queues. These names are analogous to, but independent of, names in the file system. For semaphores, one process creates the semaphore and other processes can attach to the semaphore using its name. Both processes can perform signal (sem_post) or wait (sem_wait) operations.
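A minimal sketch of the named-semaphore calls follows, assuming a single process that creates the semaphore, posts it, and waits on it once. The function name and any semaphore name passed to it are hypothetical; a second process would attach by calling sem_open() with the same name.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <semaphore.h>
#include <stdio.h>

/* Hypothetical helper: create (or attach to) a named semaphore with an
 * initial count of 0, signal it once, wait on it once, then remove it.
 * Returns 0 on success, -1 on error. */
int named_semaphore_roundtrip(const char *name)
{
    sem_t *sem = sem_open(name, O_CREAT, 0600, 0);
    int rc = 0;

    if (sem == SEM_FAILED) {
        perror("sem_open");
        return -1;
    }
    if (sem_post(sem) < 0 || sem_wait(sem) < 0) {
        perror("sem_post/sem_wait");
        rc = -1;
    }
    sem_close(sem);     /* detach this process from the semaphore */
    sem_unlink(name);   /* remove the name from the system */
    return rc;
}
```

On some systems the program must be linked with the threads or real-time library for these calls to resolve.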
POSIX threads. In POSIX,
threads are implemented in an
independent specification,
which means that their specifi-
cation is independent of the
other real-time features. Be-
cause of this, a number of fea-
tures from the real-time specifi-
cation are carried over to the
thread specification. For ex-
ample, priority scheduling is
done on a per-thread basis, but
is handled in a manner similar to
scheduling in POSIX 1003.1b. A
thread's priority and scheduling
policy is typically specified when
it is created.
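Specifying the policy and priority at creation time is done through a thread attributes object. The sketch below assumes SCHED_FIFO and uses a hypothetical helper name, init_fifo_attr; the pthread_attr_* calls are the standard POSIX ones.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: prepare attr so that a thread created with it runs
 * under SCHED_FIFO at the given priority instead of inheriting the
 * creator's settings. Returns 0 on success or an error number. */
int init_fifo_attr(pthread_attr_t *attr, int priority)
{
    struct sched_param param;
    int rc;

    param.sched_priority = priority;
    if ((rc = pthread_attr_init(attr)) != 0)
        return rc;
    if ((rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED)) != 0 ||
        (rc = pthread_attr_setschedpolicy(attr, SCHED_FIFO)) != 0 ||
        (rc = pthread_attr_setschedparam(attr, &param)) != 0) {
        pthread_attr_destroy(attr);
        return rc;
    }
    return 0;
}
```

The prepared attr is then passed as the second argument to pthread_create(). On many systems that call fails for unprivileged users when a real-time policy is requested.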
The POSIX thread specification defines functionality and/or makes modifications to POSIX in the following areas:
• Thread control: creation, deletion, and management of individual threads
• Priority scheduling: POSIX real-time scheduling extended to include scheduling on a per thread basis; the scheduling scope is either done globally across all threads in all processes, or performed locally within each process
• Mutexes: used to guard critical sections of code; mutexes also include support for priority inheritance and priority ceiling protocols to help prevent priority inversions
• Condition variables: used in conjunction with mutexes, condition variables can be used to create a monitor synchronization structure
• Signals: ability to deliver signals to individual threads
POSIX coverage in OS imple-
mentations: Table 3 shows the
Figure 2: Lynx priority tracking
[Diagram: threads A, B, and C ordered from lowest to highest priority; the 256 user priorities are mapped to 512 user/interrupt priorities, with each interrupt thread placed just above its corresponding application thread.]
TABLE 2 POSIX 1003.13 profiles
Profile | Number of processes | Threads | File system
54 | Multiple | Yes | Yes
53 | Multiple | Yes | No
52 | Single | Yes | Yes
51 | Single | Yes | No
Figure 3: Timer jitter benchmark.
[Timeline in ms (0 to 400) marking the desired tick time, actual tick time, timer period, timer resolution, and timer jitter.]
level of compliance to POSIX
1003.1a and the 3.1 release is
compliant with all three standards. VxWorks only supports a subset of the POSIX standards because in releases prior to and including v. 5.4, VxWorks was based on a single process model that does not include task memory protection. The current release, VxWorks AE, does support memory protection; however, the protection scheme is implemented differently than in the traditional POSIX process model. Linux provides good support for the base POSIX APIs and threads, but is missing features such as timers and message queues.
OS design
The design of an OS can have a
significant impact on its ability
to be used in a real-time system.
This includes the internal de-
sign of the OS as well as the fea-
tures it provides to the applica-
tion programmer. This section
focuses on the design of two
OSs (Solaris and LynxOS), and
their suitability for use in a
real-time system.
Desired features of a real-time
OS: Real-time systems are typi-
cally implemented with mul-
tiple asynchronous threads of
execution. This is dictated by
the need to react to external
events, and control asynchro-
nous devices. Because of this
characteristic, an RTOS must
support multithreading. Also,
because the criticality and rates
of events are different, the
RTOS must support a notion of
priority so that a time-critical
task is not delayed because of a
non-critical task. Furthermore,
tasks need to communicate.
Therefore, the OS must provide
synchronization and communi-
cation facilities.
An RTOS also needs to sup-
port timing features like high-
resolution timers and clocks.
Timers are used to support pe-
riodic processing and to detect
system timeout errors. Clocks
are needed to keep track of
time. Typical real-time applica-
Figure 4: Synchronization tests
[Diagram. Test 1: latency of a system call; Task 1 signals (S) and then waits (W). Test 2: round-trip signaling latency between Task 1 and Task 2. Test 3: low, medium, and high priority tasks, with acquire (A) and release (R) of a resource; the low priority task is delayed by the medium priority task, and the high priority task is delayed until the low priority task releases the resource.]
TABLE 3 POSIX in commercial operating systems
OS | POSIX 1003.1a (Base POSIX) | POSIX 1003.1b (Real-time extensions) | POSIX 1003.1c (Threads)
Solaris | Full support | Full support | Full support
LynxOS | Conformant | Full support | 3.0.1 based on draft and missing thread attributes; 3.1 based on final standard
VxWorks | Partial support; support for functions that do not require a process model | Partial support; support for functions that do not require a process model | Support through a third party product
IRIX | Conformant | Full support | Full support
Linux | Full support | Partial support; no support for timers or message queues | Full support
QNX Neutrino | Full support | Close to full support; no support for memory locking | Full support
Figure 5: Timer jitter results
[Plots. Panel A: no load; panel B: heavy load (logarithmic jitter axis). x-axis: timer period (ms), 1 to 100; y-axis: jitter (µs). Series: lynx, s8 (2 proc), s8 (1 rt), s8 (1 proc).]
tions may need to be aware of
time at a granularity of micro-
or milliseconds.
With respect to performance, the OS must be predictable and add minimal overhead. As discussed previously, a real-time system must behave deterministically. This implies that the time required by all operations, including OS functions, must be deterministic. To be deterministic an OS must be preemptable, which means that if the OS is processing a request
on behalf of a low priority task,
it must be able to stop what it is
doing and turn its attention to
a higher priority task. This pre-
vents a situation where a high
priority task is forever delayed
by the OS.
Solaris: Solaris is a general pur-
pose UNIX OS developed to
run on SPARC and Pentium-
class CPUs. Solaris has many of
the features required for a real-
time system. These features
are:
• A multithreaded preemptable kernel
• Global priority model: threads are mapped to lightweight processes, which are allocated to priority classes and then scheduled globally.
• Configurable clock tick: the frequency of the clock tick can be changed, thereby increasing or decreasing the frequency with which the scheduler runs.
• High resolution POSIX timers: Solaris defines an additional POSIX timer (CLOCK_HIGHRES) that, based on the capability of the hardware, can provide timers with nanosecond and millisecond resolution.
• Priority I/O streams
• Additional support for POSIX real-time APIs: Solaris 8 now supports all of POSIX 1003.1b.
• Symmetrical multiprocessing support: Solaris supports multiprocessing that is transparent to the user. This also allows processors to be reserved for real-time processing, increasing the determinism.
Solaris thread implementa-
tion. Solaris implements both
user-level and kernel-level
threads. User-level threads are
implemented as a library at the
user application level, whereas
kernel-level threads are the
unit of execution seen by the
kernel. Solaris uses the Light-
weight Processes (LWP)
mechanism to run kernel-level
threads on processors. The
mapping of user-level threads
to LWPs can be done in a number
of different ways. If multiple
user-level threads are mapped
to a single kernel-level thread,
at most one of them can be active
at a time. To take advantage
of multiple processors, user-
level threads can be mapped
one-to-one to LWPs.
Figure 1 illustrates how
Solaris processor sets and pro-
cessor binding can be used to
dedicate processors for real-time
tasks. The psrset command
is first used to create a pool of
one or more processors. Note
that all but one processor are eligible
for inclusion in the processor
set; one processor is needed
TABLE 4 Solaris priority classes
Class | Priority range | Description
ISRs | N/A | Asynchronous interrupt service routines; not scheduled
Interrupt threads | 160-169 | Interrupt processing not done in the ISR; scheduled based on priority of ISR
Real-time | 100-159 | Time-critical tasks; fixed priority preemptive scheduling
System/kernel | 60-99 | System level functions
Time sharing/Interactive | 0-59 | General-purpose applications; OS may dynamically adjust priorities to achieve fairness
Figure 6: Bintime results
[Bar chart, logarithmic axis from 1 to 1,000,000: maximum interval (µs) under no load and heavy load for lynx, s8 (2 proc), s8 (1 rt), and s8 (1 proc).]
TABLE 5 Real-time benchmarks
Benchmark | Description | Aspect tested | Parameters
Timer jitter | Create a periodic thread and measure the deviation between desired and actual expiration | Measures the response time of the OS | Timer period: (1, 10, 100ms)
Response | Execute a fixed processing load and measure its execution time over a number of runs | Determine if a thread can respond in a deterministic fashion | Type of processing: (add, copy, whetstone)
Bintime | Call a time of day clock and measure interval between calls | Measures the maximum kernel blocking time | None
Sync | Measure the latency of thread to thread or process to process synchronization | Measures the context switching time between threads and processes | Type of semaphore: (POSIX named/unnamed semaphore, pthread mutex, lynx semaphore); process to process or thread to thread
Message passing | Measure the latency of sending data from thread to thread or process to process | Measures the possible throughput of data between processes and threads | Data buffer size; process to process or thread to thread
RT signals | Measure the latency of real-time signals between two processes | Measures the latency of POSIX real-time signals | None
to process lightweight processes outside the set. The psradm command can then be used to disable unbound interrupts on the processors in the processor set. The psrset command is then used to run real-time processes on the processors in the bound processor set. All other non-real-time processes and interrupts run on processors outside the real-time processor set. As will be addressed later, this mechanism has a dramatic effect on the timeliness of real-time processing.
The Solaris scheduler. To
support different types of
scheduling policies, Solaris
runs each lightweight process
in one of four priority classes.
These classes are shown in
Table 4. Interrupt service rou-
tines are not part of the sched-
uling process, but they are in-
cluded in Table 4 because they
run at a higher priority than all
tasks, and thus can interfere
with normal LWP processing.
Application LWPs run in one of
three classes: real-time, system,
or timesharing. Interrupt
threads are reserved for inter-
rupt processing not done in the
interrupt service routine.
Scheduling consists of two
processes: deciding which LWP
to run and performing tick processing.
When the scheduler is
invoked it dispatches the LWP
with the highest global priority.
If the machine has multiple
CPUs, the scheduler can dis-
patch multiple LWPs.
The second aspect of sched-
uling is tick processing, the
processing that takes place at
every clock tick. The scheduler
will scan all the active LWPs
and update their state. For
timesharing threads, the scheduler
may increase the priority of
a LWP if it determines that
thread is not receiving a fair
share of the CPU. Solaris may
also promote a LWP to the sys-
tem class if the LWP is holding
a system resource. Because
real-time threads run with a
fixed priority scheduling
policy, very little tick process-
ing is done for them.
LynxOS: LynxOS is a UNIX-style OS developed for real-time embedded systems. The Lynx kernel is preemptable, reentrant, and can be scaled down to a footprint as low as 97KB.
Lynx scheduling. LynxOS
supports a single scheduling
policy, fixed priority preemp-
tive with 256 priority levels.
The clock tick frequency is
fixed at 100Hz, which limits
the resolution of timers to 10
milliseconds. The scheduler is
also invoked in response to
asynchronous events and
change in the system state.
Lynx priority tracking.
LynxOS uses a mechanism
called priority tracking to
handle interrupt processing
not done in the interrupt ser-
vice routine. This is in contrast
to the interrupt thread class
used by Solaris. The problem
with using an interrupt thread
class is that interrupt process-
ing on behalf of low priority
tasks will run at higher priority
than application processing of
a high priority task. This cre-
ates a priority inversion. The
way LynxOS solves this prob-
lem is to tie the priority of the
interrupt processing to the pri-
ority of the application thread.
The 256 task priorities are sub-
divided into 512 priorities and
application threads use the 256
even priorities and interrupt
threads use the 256 odd priori-
ties. This idea is illustrated in
Figure 2, where interrupt
threads run a half-step above
their corresponding applica-
tion thread.
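The even/odd subdivision can be illustrated with a small sketch. The numbering below is an assumption made for illustration (user priority p maps to level 2p for the application thread and 2p + 1 for its interrupt thread); the article does not give the kernel's internal representation, and the function names are hypothetical.

```c
/* Illustrative only: map a user priority (0..255) onto a 512-level scale. */
enum { USER_PRIORITY_LEVELS = 256 };

/* Application threads take the even slots. Returns -1 if out of range. */
int app_thread_level(int user_priority)
{
    if (user_priority < 0 || user_priority >= USER_PRIORITY_LEVELS)
        return -1;
    return 2 * user_priority;
}

/* The matching interrupt thread takes the odd slot directly above,
 * a half-step over its application thread. */
int interrupt_thread_level(int user_priority)
{
    int level = app_thread_level(user_priority);

    return level < 0 ? -1 : level + 1;
}
```

With this mapping, interrupt processing for a thread at user priority 10 (level 21) preempts that thread (level 20) but never a thread at user priority 11 (level 22), which is the property that avoids the inversion described above.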
Interrupt threads are writ-
ten as part of the device driver
for a particular device, and
therefore are not associated
with a particular application
Figure 7: Sync Test 1: Lynx and Solaris (1 rt)
[Bar charts. Panel A: worst case; panel B: average. y-axis: latency (µs). Groups: lynx and s8, each under no load and heavy load. Series: memory sem, named sem, pmutexes, lynx sem.]
TABLE 6 Experimental platforms
Platform | Hardware | CPU (speed) | Operating system | CPU config.
Lynx | Dell | Pentium 2 (266MHz) | LynxOS 3.0.1 | 1 CPU
Solaris (2 proc) | Sun Ultra 60 | SPARC (360MHz) | Solaris 8 | 2 CPUs
Solaris (1 proc) | Sun Ultra 60 | SPARC (360MHz) | Solaris 8 | 1 CPU
Solaris (1 rt) | Sun Ultra 60 | SPARC (360MHz) | Solaris 8 | 2 CPUs, 1 CPU reserved to run RT benchmarks
TABLE 7 Non real-time (heavy) load
Name | Description | Load degree
CPU | Processing load generated with the Whetstone synthetic benchmark | 10ms every 100ms
Disk | File write operations | 10ms every 100ms
Interrupt | External serial interrupt | 1,000 interrupts/sec
Network | TCP/IP socket transfers | 4,000 packets/sec
System call | Sequence of utility system calls | 10ms every 100ms
Memory | Dynamic memory allocation | 10ms every 100ms
File search | Search files in a directory and all sub-directories | Continuous
thread. Because of this, LynxOS
provides a mechanism by which
the device driver can determine
the priority of the thread on
behalf of which it is currently
running. Using this feature, the
interrupt thread can adjust its
priority to the appropriate
level. If in the future a different
application thread needs the
same device, the interrupt
thread is notified and can
change its priority.
Testing the real-time
performance of OSs
The benchmarks used in this
study are divided into two cat-
egories: those that measure the
determinism of the OS and
those that measure the latency
of particular important opera-
tions. These benchmarks are
motivated by the real-time per-
formance requirements dis-
cussed previously. The bench-
marks test core OS capabilities
and are independent of any ac-
tual application. Also because
we are interested in determin-
ing the best possible real-time
performance, all real-time
threads are run at the maximum
possible real-time priority, and
the virtual memory used by the
benchmarks is locked into
physical memory. Table 5 sum-
marizes the six benchmarks
used in this study.
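Locking a benchmark's virtual memory into physical memory is done in POSIX 1003.1b with mlockall(). The sketch below is an assumption about how such a setup step might look, not the code used in the study; the helper name is hypothetical, and the call typically needs elevated privileges or a sufficient locked-memory limit.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/mman.h>

/* Hypothetical helper: lock all current and future pages of the calling
 * process into physical memory so that paging cannot disturb timing.
 * Returns 0 on success, -1 on error. */
int lock_all_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) < 0) {
        perror("mlockall");
        return -1;
    }
    return 0;
}
```

A benchmark would call this once at startup, before raising its priority and entering the measurement loop.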
Deterministic benchmarks: The
first three benchmarks shown
in Table 5, (timer jitter, re-
sponse, and bintime) are de-
signed to measure the deter-
minism of an OS. Because de-
terminism implies that the
time it takes to perform an op-
eration is known under all cir-
cumstances, we typically re-
port the worst case time for
these benchmarks.
The structure of the timer
jitter test is shown in Figure 3.
The test creates a timer, sets it
to expire at a given period, then
determines the actual expira-
tion time. The jitter is then de-
fined as the deviation between
the actual and desired expira-
tion times. Most current CPUs
include a time stamp counter that is
updated on every CPU cycle.
The POSIX clock_gettime function
in most OSs uses this
stamp counter, giving a high-precision
time of day clock.
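The jitter measurement can be sketched as follows. This is not the benchmark used in the study: for brevity it swaps the signal-driven POSIX timer for clock_nanosleep() with absolute deadlines, and it reads CLOCK_MONOTONIC rather than the time of day clock. The function name max_jitter_ns is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

static long long to_ns(const struct timespec *t)
{
    return (long long)t->tv_sec * NSEC_PER_SEC + t->tv_nsec;
}

/* Hypothetical helper: wake up `count` times at a fixed period and return
 * the largest deviation (actual minus desired expiration time) in
 * nanoseconds, or -1 on error. */
long long max_jitter_ns(long period_ns, int count)
{
    struct timespec next, now;
    long long desired, late, worst = 0;
    int rc;

    if (period_ns <= 0 || clock_gettime(CLOCK_MONOTONIC, &now) < 0)
        return -1;
    desired = to_ns(&now);
    for (int i = 0; i < count; i++) {
        /* the desired expiration advances by exactly one period */
        desired += period_ns;
        next.tv_sec = desired / NSEC_PER_SEC;
        next.tv_nsec = desired % NSEC_PER_SEC;
        /* clock_nanosleep returns an error number; retry if interrupted */
        while ((rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                                     &next, NULL)) == EINTR)
            ;
        if (rc != 0 || clock_gettime(CLOCK_MONOTONIC, &now) < 0)
            return -1;
        late = to_ns(&now) - desired;
        if (late > worst)
            worst = late;
    }
    return worst;
}
```

Using absolute deadlines keeps one late wake-up from shifting every later deadline, so each deviation is measured against the intended schedule.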
The second deterministic
benchmark (response) mea-
sures the actual execution time
of a 10-millisecond fixed block
of processing. The actual ex-
ecution time over a number of
separate runs is calculated to
determine whether or not ap-
plication response time is de-
terministic. The fixed process-
ing is generated with a loop
consisting of one of three different
types of operations: additions
(add), memory copies
(copy), or the synthetic Whet-
stone benchmark (whet).
The last deterministic
benchmark (bintime) deter-
mines the maximum kernel
blocking time. The benchmark
uses a high priority real-time
thread to repeatedly call a time
of day clock and calculate the
time required by each call. The
time required by each call con-
sists of the time to perform the
system call and any time spent
blocked in the kernel. Since the
time to perform the system call
should be constant, the devia-
tion between the maximum
time reported by the bench-
mark and the average time
gives a good indication of the
maximum time spent blocked
in the kernel.
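A minimal sketch of the bintime idea follows, assuming CLOCK_MONOTONIC stands in for the time of day clock; the struct and function names are hypothetical. On systems where clock_gettime() is served without entering the kernel, the result reflects preemption of the thread rather than kernel blocking alone.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

struct bintime_result {
    long long max_ns;   /* longest interval between consecutive clock reads */
    long long avg_ns;   /* mean interval over all reads */
};

/* Hypothetical helper: read the clock `calls` times back to back
 * (calls >= 1) and record the maximum and average interval.
 * Returns 0 on success, -1 on error. */
int bintime(int calls, struct bintime_result *out)
{
    struct timespec prev, cur;
    long long total = 0, max = 0;

    if (calls < 1 || clock_gettime(CLOCK_MONOTONIC, &prev) < 0)
        return -1;
    for (int i = 0; i < calls; i++) {
        if (clock_gettime(CLOCK_MONOTONIC, &cur) < 0)
            return -1;
        long long delta = (long long)(cur.tv_sec - prev.tv_sec) * 1000000000LL
                        + (cur.tv_nsec - prev.tv_nsec);
        total += delta;
        if (delta > max)
            max = delta;
        prev = cur;
    }
    out->max_ns = max;
    out->avg_ns = total / calls;
    return 0;
}
```

The difference max_ns - avg_ns is the quantity the benchmark reports as an estimate of the maximum blocking time.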
Latency benchmarks: The final
three benchmarks test the syn-
chronization, message passing,
and RT signaling capabilities of
an OS. For a real-time system it
is important to minimize syn-
chronization and communica-
tion latency. So the average la-
tency of operations should be
small to minimize the total overhead.
Bounding the maximum
Figure 8: Sync Test 2: Lynx and Solaris (1 rt)
[Bar charts. Panel A: worst case; panel B: average. y-axis: latency (µs). Groups: lynx and s8, each under no load and heavy load. Series: memory sem, named sem, lynx sem.]
TABLE 8 Worst case response results (in ms)
Configuration | add, no load | add, heavy load | copy, no load | copy, heavy load | whet, no load | whet, heavy load
Lynx | 9.9 | 9.9 | 10.0 | 10.1 | 10.1 | 10.2
Solaris (2 procs) | 10.1 | 11236.5 | 10.7 | 12061.7 | 10.6 | 12162.8
Solaris (1 proc) | 10.2 | 7310.7 | 10.2 | 4599.3 | 10.7 | 6328.2
Solaris (1 rt) | 10.0 | 10.0 | 10.0 | 10.0 | 10.5 | 10.5
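TABLE 9 Context switching times
Configuration | No load: process max | No load: process avg | No load: thread max | No load: thread avg | Heavy load: process max | Heavy load: process avg | Heavy load: thread max | Heavy load: thread avg
Lynx | 42.2 | 20.1 | 47.2 | 24.2 | 40.5 | 20.1 | 53.2 | 24.0
Solaris (1 rt) | 65.4 | 52.9 | 446.8 | 49.9 | 67.2 | 51.8 | 461.0 | 50.6
Solaris (1 proc) | 198.9 | 53.0 | 459.1 | 50.3 | 160.8 | 53.2 | 23240 | 51.4
Solaris (2 proc) | 247.5 | 48.1 | 119.6 | 41.4 | 7149.0 | 68.7 | 639191 | 82.2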
latency is important as well, to
achieve determinism.
Four different synchroniza-
tion tests are shown in Figure 4.
In the first test, a single thread
signals (S) and then waits (W) on
a semaphore. This test measures
the latency of semaphore system
calls. The second test uses semaphores
to signal between two
threads. The threads are either
in a single process or two differ-
ent processes. Measurements
from the first two tests can be
used to determine the context
switching time by subtracting
the system call overhead, ob-
tained in test one, from half of
the roundtrip signaling time,
obtained in test two.
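The subtraction just described amounts to a one-line calculation. The sketch below uses a hypothetical function name; the numbers in the usage note are made up for illustration and are not results from this study.

```c
/* Estimate one context switch from the two semaphore measurements:
 * half of the round-trip signaling time minus the system call overhead.
 * Both arguments must use the same time unit (for example, microseconds). */
double context_switch_time(double roundtrip_time, double syscall_time)
{
    return roundtrip_time / 2.0 - syscall_time;
}
```

For example, a round trip of 60 units with a system call overhead of 10 units gives an estimated context switch of 20 units.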
The last test assesses an OS's ability to deal with priority inversion. The test sets up a classic priority inversion using semaphores. (Note: for clarity the semaphores are not shown in the picture.) The priority inversion occurs when a low priority task acquires (A) a resource needed later by a high priority task. The high priority task blocks waiting on the resource and is delayed indefinitely because an independent medium priority task is monopolizing the CPU. This is a priority inversion because the medium priority task is now favored over the high priority task. A typical way of solving this problem is to allow the low priority task to inherit the priority of the high priority task so that it can run and release the resource (R). In the test, a fixed-duration processing loop is used for the medium priority task. If a priority inversion occurs, the time between when the low priority task acquires the resource and when the high priority task receives it will be at least the duration of this processing loop. If the OS synchronization mechanism prevents a priority inversion, this time will be negligible.
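For pthread mutexes, POSIX exposes priority inheritance through the mutex protocol attribute, on implementations that support that option. The following is a minimal sketch of requesting it; the helper name `init_prio_inherit_mutex` is hypothetical, and the code is not taken from the benchmark described here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Initialise *mutex with the priority inheritance protocol, so that a
 * low priority owner runs at the priority of the highest priority
 * thread blocked on the mutex. Returns 0 on success, or an error
 * number (for example ENOTSUP where the protocol is unavailable).
 * (Hypothetical helper, for illustration only.) */
int init_prio_inherit_mutex(pthread_mutex_t *mutex)
{
    assert(mutex != NULL);

    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(mutex, &attr);

    /* The attribute object is no longer needed once the mutex exists. */
    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

Checking the return value matters for portability: an application that depends on inheritance should treat a failure here as a configuration error rather than continue with a plain mutex.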
The message passing benchmark uses POSIX message queues to measure the latency and throughput of data transfers between two threads in the same process or in different processes. The last benchmark measures the latency of POSIX real-time signals.
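As a rough illustration of what a real-time signal latency measurement involves, the sketch below queues one signal with a payload and times how long it takes to be received. It is a simplified, single-threaded stand-in rather than the benchmark used for this article: the sender and receiver are the same process, and the names `rt_signal_latency_ns` and `clock_ns` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

/* Current time on the monotonic clock, in nanoseconds. */
static int64_t clock_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Queue one real-time signal carrying `payload` to this process and
 * measure the time until sigwaitinfo() returns it. Stores the payload
 * that arrived in *received. Returns the latency in nanoseconds, or -1
 * on error. Assumes a single-threaded caller; a threaded program would
 * use pthread_sigmask() instead of sigprocmask(). */
int64_t rt_signal_latency_ns(int payload, int *received)
{
    assert(received != NULL);

    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);
    /* Block the signal so it stays queued until sigwaitinfo() takes it. */
    if (sigprocmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;

    union sigval value;
    value.sival_int = payload;

    int64_t start = clock_ns();
    if (sigqueue(getpid(), SIGRTMIN, value) != 0)
        return -1;

    siginfo_t info;
    if (sigwaitinfo(&set, &info) != SIGRTMIN)
        return -1;
    int64_t latency = clock_ns() - start;

    *received = info.si_value.sival_int;
    return latency;
}
```

A fuller measurement would send the signal from a second thread or process and repeat the exchange many times to capture the worst case as well as the average.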
Benchmark results
The benchmarks defined in the previous section were run on two different OSs: LynxOS 3.0.1 and Solaris 8. The details of the two systems are shown in Table 6. Note that the CPU, among other hardware characteristics, differs between the two platforms. Because our benchmarks were written to test the determinism of the OSs, and we observe the worst case time, this difference has little impact on the results. However, the speed difference should be considered when comparing the results of average timings.
Table 6 identifies three different Solaris configurations. These configurations allow us to investigate the impact of using multiple CPUs. The first configuration uses the two-processor Ultra 60 as is. For the second configuration, one of the CPUs is disabled. In the last configuration, one of the CPUs is reserved and the real-time benchmarks are run on it. Also, for this configuration the reserved processor is sheltered from all unbound interrupts.
Non-real-time external load: The benchmarks were run stand-alone, that is, without any other user processes running, then in combination with a non-real-time load. Typically a real-time system will run a mixture of applications, some with real-time requirements and some without. A graphical user interface is an example of a non-real-time application. Table 7 shows the types of processing used to generate the non-real-time load. The load contains CPU-intensive applications as well as applications that use interrupting I/O devices such as the file and network subsystems.
Timer jitter: Figure 5 shows the results of the timer jitter tests for all four platforms. Without a load, shown in Figure 5a, all platforms have acceptable jitter under 200 ms. The Solaris (1 rt) configuration has the least amount of jitter. The jitter for the Lynx configuration is also quite low. Under a heavy load, shown in Figure 5b, the jitter for the Solaris configurations that do not reserve a real-time processor is out of bounds. The worst case jitter for these configurations is as great as 10 seconds.
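Jitter of this kind can be observed with a periodic loop that sleeps until absolute deadlines and records how late each wake-up is. The sketch below is one way to do that with `clock_nanosleep`; it is an assumption about how such a test could be written, not the benchmark behind Figure 5, and the name `worst_timer_jitter_ns` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

#define NS_PER_SEC 1000000000LL

static int64_t timespec_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * NS_PER_SEC + ts->tv_nsec;
}

/* Sleep until `periods` successive absolute deadlines spaced
 * `period_ns` apart and store the largest wake-up lateness, in
 * nanoseconds, in *worst_ns. Returns 0 on success, -1 on error.
 * (Hypothetical helper, for illustration only.) */
int worst_timer_jitter_ns(int periods, int64_t period_ns, int64_t *worst_ns)
{
    assert(periods > 0 && period_ns > 0 && worst_ns != NULL);

    struct timespec now;
    if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
        return -1;

    int64_t deadline = timespec_to_ns(&now);
    int64_t worst = 0;

    for (int i = 0; i < periods; i++) {
        deadline += period_ns;
        struct timespec wake = {
            .tv_sec = (time_t)(deadline / NS_PER_SEC),
            .tv_nsec = (long)(deadline % NS_PER_SEC),
        };

        /* Absolute deadlines keep one late wake-up from shifting the
         * following periods. Retry if a signal interrupts the sleep. */
        int rc;
        do {
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &wake, NULL);
        } while (rc == EINTR);
        if (rc != 0)
            return -1;

        if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
            return -1;
        int64_t late = timespec_to_ns(&now) - deadline;
        if (late > worst)
            worst = late;
    }

    *worst_ns = worst;
    return 0;
}
```

Running the loop once on an idle system and once alongside a CPU and I/O load gives the kind of no-load versus heavy-load comparison discussed here.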
Application response: Table 8 shows the worst case response results for all configurations. Without a load, all configurations have a response result very close to the calibrated value of 10 milliseconds. With a load, only the Lynx and Solaris (1 rt) configurations come close to the 10-millisecond value. The worst case result for the standard Solaris platform (Solaris 2 proc) is three orders of magnitude worse than the calibrated value.
Bintime: Figure 6 shows the results for the deterministic bintime benchmark for all configurations. Without a load the kernel imposes very little delay. For the Solaris (1 rt) configuration, the delay is below 10 milliseconds, and for all other configurations the delay is at or below 100 milliseconds. Under a heavy load, the Solaris configurations without a reserved real-time processor again are very non-deterministic. The maximum delay for the single CPU Solaris configuration is close to one second.
TABLE 10 POSIX message queues (no load)
                  Latency (μs)                        Throughput (MBps)
                  Process           Thread            Process           Thread
Configuration     Worst    Avg      Worst    Avg      Worst    Avg      Worst    Avg
Lynx              50.1     30.5     57.7     35.9     46.2     51.6     45.9     50.0
Solaris (1 rt)    98.7     90.5     118.9    102.7    62.4     77.8     61.5     76.5
Solaris (1 proc)  152.8    89.6     159.0    102.4    77.7     77.3     72.9     76.3
Solaris (2 proc)  148.7    82.8     146.8    77.5     41.3     66.6     58.2     65.5
Figure 9: Sync Test 3: Lynx and Solaris (1 rt). [Chart: latency (μs) on a logarithmic scale from 1 to 10,000,000, under no load and heavy load, for Lynx, Lynx (lsem), s8 (1 rt), s8 (1 proc), and s8 (2 proc).]
Synchronization: In this section we present the results of the synchronization tests described previously.
Test 1 (Signaling within a thread). Figure 7 shows the results of the simple synchronization test for the Lynx and Solaris (1 rt) configurations. Four different types of synchronization mechanisms were tested for Lynx, and three for Solaris. As Figure 7a shows, the worst case latency for the Solaris platform is much better than the latency for the Lynx platform. For both platforms the addition of a load has little effect on the worst case timings.
Figure 7b shows the average latencies for the same synchronization mechanisms. For Lynx, the Lynx semaphores exhibit the highest latency, most likely because priority inheritance is implemented for this semaphore. For Solaris the latency of the POSIX named semaphore is much higher than the latency of the other mechanisms. An explanation for this is that the semaphore name is kept in the file system.
Test 2 (Inter-thread signaling). Figure 8 shows the results of the inter-thread signaling test for the Lynx and the Solaris (1 rt) configurations. In all cases the average and worst case round-trip times are better for Lynx than for Solaris. This result is especially significant because the Solaris test was run on a faster processor than the Lynx test. Figure 8 also shows that the latency of all types of synchronization mechanisms is roughly equal.
Test 3 (Priority inversion). The results for the priority inversion test are shown in Figure 9, for all configurations. For all cases, except the Lynx (lsem) case, a pthread mutex is used to guard the resource shared by the low and high priority tasks. Without a load, the first Lynx configuration exhibits a latency corresponding to the 10-millisecond delay time of the medium priority task. This is because priority inheritance is not implemented for pthread mutexes in LynxOS 3.0.1. This problem is not seen with Lynx semaphores. Priority inheritance is implemented in Solaris, and the latency for all Solaris configurations, without a load, is low. Under a heavy load, only the Lynx (lsem) and Solaris (1 rt) configurations exhibit an acceptable latency. The Solaris (1 proc) and Solaris (2 proc) configurations are affected by the heavy load, and the Lynx configuration still has a high latency because of the lack of a priority inheritance protocol.
Context switching time. Table 9 shows context switch time for all platforms computed from the results for memory semaphores in the first two synchronization tests. The context switch time for Lynx is less than half the value of the best Solaris configuration. Also for Lynx, the process-to-process context switching time is only slightly worse than the thread-to-thread context switching time.
The context switching time for Solaris threads is more deterministic than the context switching time for processes. For the Solaris (1 rt) configuration, the maximum thread-to-thread context switching time is close to the average. However, for the same configuration, the maximum process-to-process context switching time is an order of magnitude worse than the average value. Another interesting observation is that for Solaris the average context switching time between processes is slightly better than between threads. In both cases there is a context switch between LWPs, which seems to imply that the bulk of the overhead is in the scheduler.
Communication: Real-time signals. Figure 10 shows the results of the real-time signal benchmark for all configurations. The Lynx configuration has a lower signal latency than any of the Solaris configurations. Also, the Solaris (1 proc) and Solaris (2 proc) configurations are severely affected by the addition of a non-real-time load.
Figure 10: Real-time signal latency. [Chart: maximum and average latency (μs) on a logarithmic scale from 1 to 100,000, under no load and heavy load, for Lynx, s8 (1 rt), s8 (1 proc), and s8 (2 proc).]
Message queues. The latency and throughput of POSIX message queues for all configurations are shown in Table 10. The latency for the Lynx platform is better than for the Solaris platform, but the Solaris platform has better throughput. This better throughput is most likely due to faster hardware on the Solaris platform.
Suitability
In this article we have assessed the use of POSIX in the development of software for real-time and embedded systems. We discussed the features of POSIX and how well these features match those required for real-time software development. We also empirically evaluated the real-time performance characteristics of two implementations of POSIX: LynxOS 3.0.1 and Solaris 8.
The empirical evaluation showed that both LynxOS and Solaris are suitable for use in real-time systems. LynxOS exhibited a low overhead for all operations and was deterministic even under heavy loading conditions.
Solaris 8 contains a number of features that are important in real-time development, including high-resolution timers, processor partitioning, and SMP support. These last two features are key in Solaris's use as a real-time OS. A dramatic difference is apparent between the determinism of the standard Solaris configuration and one in which all real-time tasks are run on a dedicated processor. The standard configuration is unsuitable for real-time use, whereas the dedicated-processor configuration is very deterministic.
Although this study did not perform an exhaustive comparison of the POSIX APIs between Solaris and LynxOS, our conclusion is that the two implementations of POSIX have a great deal in common. The biggest differences are in the areas of clock resolution and number of real-time priorities. Clock resolution could pose a portability problem if a resolution finer than 10 milliseconds is needed. Other differences that we encountered, like discrepancies in the LynxOS threads implementation, have been rectified in version 3.1 of the OS.
[Embedded Systems Programming]
Kevin Obenland
Faculty Member
George Mason University
E-mail: kevin.m.obenland@saic.com