dd81eca83c
Update RCU documentation based on discussions and review of RCU-based tree patches. Add an introductory whatisRCU.txt file. Signed-off-by: <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
446 lines
15 KiB
Text
446 lines
15 KiB
Text
Read the F-ing Papers!
|
|
|
|
|
|
This document describes RCU-related publications, and is followed by
|
|
the corresponding bibtex entries. A number of the publications may
|
|
be found at http://www.rdrop.com/users/paulmck/RCU/.
|
|
|
|
The first thing resembling RCU was published in 1980, when Kung and Lehman
|
|
[Kung80] recommended use of a garbage collector to defer destruction
|
|
of nodes in a parallel binary search tree in order to simplify its
|
|
implementation. This works well in environments that have garbage
|
|
collectors, but current production garbage collectors incur significant
|
|
read-side overhead.
|
|
|
|
In 1982, Manber and Ladner [Manber82,Manber84] recommended deferring
|
|
destruction until all threads running at that time have terminated, again
|
|
for a parallel binary search tree. This approach works well in systems
|
|
with short-lived threads, such as the K42 research operating system.
|
|
However, Linux has long-lived tasks, so more is needed.
|
|
|
|
In 1986, Hennessy, Osisek, and Seigh [Hennessy89] introduced passive
|
|
serialization, which is an RCU-like mechanism that relies on the presence
|
|
of "quiescent states" in the VM/XA hypervisor that are guaranteed not
|
|
to be referencing the data structure. However, this mechanism was not
|
|
optimized for modern computer systems, which is not surprising given
|
|
that these overheads were not so expensive in the mid-80s. Nonetheless,
|
|
passive serialization appears to be the first deferred-destruction
|
|
mechanism to be used in production. Furthermore, the relevant patent has
|
|
lapsed, so this approach may be used in non-GPL software, if desired.
|
|
(In contrast, use of RCU is permitted only in software licensed under
|
|
GPL. Sorry!!!)
|
|
|
|
In 1990, Pugh [Pugh90] noted that explicitly tracking which threads
|
|
were reading a given data structure permitted deferred free to operate
|
|
in the presence of non-terminating threads. However, this explicit
|
|
tracking imposes significant read-side overhead, which is undesirable
|
|
in read-mostly situations. This algorithm does take pains to avoid
|
|
write-side contention and parallelize the other write-side overheads by
|
|
providing a fine-grained locking design, however, it would be interesting
|
|
to see how much of the performance advantage reported in 1990 remains
|
|
in 2004.
|
|
|
|
At about this same time, Adams [Adams91] described ``chaotic relaxation'',
|
|
where the normal barriers between successive iterations of convergent
|
|
numerical algorithms are relaxed, so that iteration $n$ might use
|
|
data from iteration $n-1$ or even $n-2$. This introduces error,
|
|
which typically slows convergence and thus increases the number of
|
|
iterations required. However, this increase is sometimes more than made
|
|
up for by a reduction in the number of expensive barrier operations,
|
|
which are otherwise required to synchronize the threads at the end
|
|
of each iteration. Unfortunately, chaotic relaxation requires highly
|
|
structured data, such as the matrices used in scientific programs, and
|
|
is thus inapplicable to most data structures in operating-system kernels.
|
|
|
|
In 1993, Jacobson [Jacobson93] verbally described what is perhaps the
|
|
simplest deferred-free technique: simply waiting a fixed amount of time
|
|
before freeing blocks awaiting deferred free. Jacobson did not describe
|
|
any write-side changes he might have made in this work using SGI's Irix
|
|
kernel. Aju John published a similar technique in 1995 [AjuJohn95].
|
|
This works well if there is a well-defined upper bound on the length of
|
|
time that reading threads can hold references, as there might well be in
|
|
hard real-time systems. However, if this time is exceeded, perhaps due
|
|
to preemption, excessive interrupts, or larger-than-anticipated load,
|
|
memory corruption can ensue, with no reasonable means of diagnosis.
|
|
Jacobson's technique is therefore inappropriate for use in production
|
|
operating-system kernels, except when such kernels can provide hard
|
|
real-time response guarantees for all operations.
|
|
|
|
Also in 1995, Pu et al. [Pu95a] applied a technique similar to that of Pugh's
|
|
read-side-tracking to permit replugging of algorithms within a commercial
|
|
Unix operating system. However, this replugging permitted only a single
|
|
reader at a time. The following year, this same group of researchers
|
|
extended their technique to allow for multiple readers [Cowan96a].
|
|
Their approach requires memory barriers (and thus pipeline stalls),
|
|
but reduces memory latency, contention, and locking overheads.
|
|
|
|
1995 also saw the first publication of DYNIX/ptx's RCU mechanism
|
|
[Slingwine95], which was optimized for modern CPU architectures,
|
|
and was successfully applied to a number of situations within the
|
|
DYNIX/ptx kernel. The corresponding conference paper appeared in 1998
|
|
[McKenney98].
|
|
|
|
In 1999, the Tornado and K42 groups described their "generations"
|
|
mechanism, which quite similar to RCU [Gamsa99]. These operating systems
|
|
made pervasive use of RCU in place of "existence locks", which greatly
|
|
simplifies locking hierarchies.
|
|
|
|
2001 saw the first RCU presentation involving Linux [McKenney01a]
|
|
at OLS. The resulting abundance of RCU patches was presented the
|
|
following year [McKenney02a], and use of RCU in dcache was first
|
|
described that same year [Linder02a].
|
|
|
|
Also in 2002, Michael [Michael02b,Michael02a] presented techniques
|
|
that defer the destruction of data structures to simplify non-blocking
|
|
synchronization (wait-free synchronization, lock-free synchronization,
|
|
and obstruction-free synchronization are all examples of non-blocking
|
|
synchronization). In particular, this technique eliminates locking,
|
|
reduces contention, reduces memory latency for readers, and parallelizes
|
|
pipeline stalls and memory latency for writers. However, these
|
|
techniques still impose significant read-side overhead in the form of
|
|
memory barriers. Researchers at Sun worked along similar lines in the
|
|
same timeframe [HerlihyLM02,HerlihyLMS03].
|
|
|
|
In 2003, the K42 group described how RCU could be used to create
|
|
hot-pluggable implementations of operating-system functions. Later that
|
|
year saw a paper describing an RCU implementation of System V IPC
|
|
[Arcangeli03], and an introduction to RCU in Linux Journal [McKenney03a].
|
|
|
|
2004 has seen a Linux-Journal article on use of RCU in dcache
|
|
[McKenney04a], a performance comparison of locking to RCU on several
|
|
different CPUs [McKenney04b], a dissertation describing use of RCU in a
|
|
number of operating-system kernels [PaulEdwardMcKenneyPhD], a paper
|
|
describing how to make RCU safe for soft-realtime applications [Sarma04c],
|
|
and a paper describing SELinux performance with RCU [JamesMorris04b].
|
|
|
|
|
|
2005 has seen further adaptation of RCU to realtime use, permitting
|
|
preemption of RCU realtime critical sections [PaulMcKenney05a,
|
|
PaulMcKenney05b].
|
|
|
|
Bibtex Entries
|
|
|
|
@article{Kung80
|
|
,author="H. T. Kung and Q. Lehman"
|
|
,title="Concurrent Maintenance of Binary Search Trees"
|
|
,Year="1980"
|
|
,Month="September"
|
|
,journal="ACM Transactions on Database Systems"
|
|
,volume="5"
|
|
,number="3"
|
|
,pages="354-382"
|
|
}
|
|
|
|
@techreport{Manber82
|
|
,author="Udi Manber and Richard E. Ladner"
|
|
,title="Concurrency Control in a Dynamic Search Structure"
|
|
,institution="Department of Computer Science, University of Washington"
|
|
,address="Seattle, Washington"
|
|
,year="1982"
|
|
,number="82-01-01"
|
|
,month="January"
|
|
,pages="28"
|
|
}
|
|
|
|
@article{Manber84
|
|
,author="Udi Manber and Richard E. Ladner"
|
|
,title="Concurrency Control in a Dynamic Search Structure"
|
|
,Year="1984"
|
|
,Month="September"
|
|
,journal="ACM Transactions on Database Systems"
|
|
,volume="9"
|
|
,number="3"
|
|
,pages="439-455"
|
|
}
|
|
|
|
@techreport{Hennessy89
|
|
,author="James P. Hennessy and Damian L. Osisek and Joseph W. {Seigh II}"
|
|
,title="Passive Serialization in a Multitasking Environment"
|
|
,institution="US Patent and Trademark Office"
|
|
,address="Washington, DC"
|
|
,year="1989"
|
|
,number="US Patent 4,809,168 (lapsed)"
|
|
,month="February"
|
|
,pages="11"
|
|
}
|
|
|
|
@techreport{Pugh90
|
|
,author="William Pugh"
|
|
,title="Concurrent Maintenance of Skip Lists"
|
|
,institution="Institute of Advanced Computer Science Studies, Department of Computer Science, University of Maryland"
|
|
,address="College Park, Maryland"
|
|
,year="1990"
|
|
,number="CS-TR-2222.1"
|
|
,month="June"
|
|
}
|
|
|
|
@Book{Adams91
|
|
,Author="Gregory R. Adams"
|
|
,title="Concurrent Programming, Principles, and Practices"
|
|
,Publisher="Benjamin Cummins"
|
|
,Year="1991"
|
|
}
|
|
|
|
@unpublished{Jacobson93
|
|
,author="Van Jacobson"
|
|
,title="Avoid Read-Side Locking Via Delayed Free"
|
|
,year="1993"
|
|
,month="September"
|
|
,note="Verbal discussion"
|
|
}
|
|
|
|
@Conference{AjuJohn95
|
|
,Author="Aju John"
|
|
,Title="Dynamic vnodes -- Design and Implementation"
|
|
,Booktitle="{USENIX Winter 1995}"
|
|
,Publisher="USENIX Association"
|
|
,Month="January"
|
|
,Year="1995"
|
|
,pages="11-23"
|
|
,Address="New Orleans, LA"
|
|
}
|
|
|
|
@techreport{Slingwine95
|
|
,author="John D. Slingwine and Paul E. McKenney"
|
|
,title="Apparatus and Method for Achieving Reduced Overhead Mutual
|
|
Exclusion and Maintaining Coherency in a Multiprocessor System
|
|
Utilizing Execution History and Thread Monitoring"
|
|
,institution="US Patent and Trademark Office"
|
|
,address="Washington, DC"
|
|
,year="1995"
|
|
,number="US Patent 5,442,758 (contributed under GPL)"
|
|
,month="August"
|
|
}
|
|
|
|
@techreport{Slingwine97
|
|
,author="John D. Slingwine and Paul E. McKenney"
|
|
,title="Method for maintaining data coherency using thread
|
|
activity summaries in a multicomputer system"
|
|
,institution="US Patent and Trademark Office"
|
|
,address="Washington, DC"
|
|
,year="1997"
|
|
,number="US Patent 5,608,893 (contributed under GPL)"
|
|
,month="March"
|
|
}
|
|
|
|
@techreport{Slingwine98
|
|
,author="John D. Slingwine and Paul E. McKenney"
|
|
,title="Apparatus and method for achieving reduced overhead
|
|
mutual exclusion and maintaining coherency in a multiprocessor
|
|
system utilizing execution history and thread monitoring"
|
|
,institution="US Patent and Trademark Office"
|
|
,address="Washington, DC"
|
|
,year="1998"
|
|
,number="US Patent 5,727,209 (contributed under GPL)"
|
|
,month="March"
|
|
}
|
|
|
|
@Conference{McKenney98
|
|
,Author="Paul E. McKenney and John D. Slingwine"
|
|
,Title="Read-Copy Update: Using Execution History to Solve Concurrency
|
|
Problems"
|
|
,Booktitle="{Parallel and Distributed Computing and Systems}"
|
|
,Month="October"
|
|
,Year="1998"
|
|
,pages="509-518"
|
|
,Address="Las Vegas, NV"
|
|
}
|
|
|
|
@Conference{Gamsa99
|
|
,Author="Ben Gamsa and Orran Krieger and Jonathan Appavoo and Michael Stumm"
|
|
,Title="Tornado: Maximizing Locality and Concurrency in a Shared Memory
|
|
Multiprocessor Operating System"
|
|
,Booktitle="{Proceedings of the 3\textsuperscript{rd} Symposium on
|
|
Operating System Design and Implementation}"
|
|
,Month="February"
|
|
,Year="1999"
|
|
,pages="87-100"
|
|
,Address="New Orleans, LA"
|
|
}
|
|
|
|
@techreport{Slingwine01
|
|
,author="John D. Slingwine and Paul E. McKenney"
|
|
,title="Apparatus and method for achieving reduced overhead
|
|
mutual exclusion and maintaining coherency in a multiprocessor
|
|
system utilizing execution history and thread monitoring"
|
|
,institution="US Patent and Trademark Office"
|
|
,address="Washington, DC"
|
|
,year="2001"
|
|
,number="US Patent 5,219,690 (contributed under GPL)"
|
|
,month="April"
|
|
}
|
|
|
|
@Conference{McKenney01a
|
|
,Author="Paul E. McKenney and Jonathan Appavoo and Andi Kleen and
|
|
Orran Krieger and Rusty Russell and Dipankar Sarma and Maneesh Soni"
|
|
,Title="Read-Copy Update"
|
|
,Booktitle="{Ottawa Linux Symposium}"
|
|
,Month="July"
|
|
,Year="2001"
|
|
,note="Available:
|
|
\url{http://www.linuxsymposium.org/2001/abstracts/readcopy.php}
|
|
\url{http://www.rdrop.com/users/paulmck/rclock/rclock_OLS.2001.05.01c.pdf}
|
|
[Viewed June 23, 2004]"
|
|
annotation="
|
|
Described RCU, and presented some patches implementing and using it in
|
|
the Linux kernel.
|
|
"
|
|
}
|
|
|
|
@Conference{Linder02a
|
|
,Author="Hanna Linder and Dipankar Sarma and Maneesh Soni"
|
|
,Title="Scalability of the Directory Entry Cache"
|
|
,Booktitle="{Ottawa Linux Symposium}"
|
|
,Month="June"
|
|
,Year="2002"
|
|
,pages="289-300"
|
|
}
|
|
|
|
@Conference{McKenney02a
|
|
,Author="Paul E. McKenney and Dipankar Sarma and
|
|
Andrea Arcangeli and Andi Kleen and Orran Krieger and Rusty Russell"
|
|
,Title="Read-Copy Update"
|
|
,Booktitle="{Ottawa Linux Symposium}"
|
|
,Month="June"
|
|
,Year="2002"
|
|
,pages="338-367"
|
|
,note="Available:
|
|
\url{http://www.linux.org.uk/~ajh/ols2002_proceedings.pdf.gz}
|
|
[Viewed June 23, 2004]"
|
|
}
|
|
|
|
@article{Appavoo03a
|
|
,author="J. Appavoo and K. Hui and C. A. N. Soules and R. W. Wisniewski and
|
|
D. M. {Da Silva} and O. Krieger and M. A. Auslander and D. J. Edelsohn and
|
|
B. Gamsa and G. R. Ganger and P. McKenney and M. Ostrowski and
|
|
B. Rosenburg and M. Stumm and J. Xenidis"
|
|
,title="Enabling Autonomic Behavior in Systems Software With Hot Swapping"
|
|
,Year="2003"
|
|
,Month="January"
|
|
,journal="IBM Systems Journal"
|
|
,volume="42"
|
|
,number="1"
|
|
,pages="60-76"
|
|
}
|
|
|
|
@Conference{Arcangeli03
|
|
,Author="Andrea Arcangeli and Mingming Cao and Paul E. McKenney and
|
|
Dipankar Sarma"
|
|
,Title="Using Read-Copy Update Techniques for {System V IPC} in the
|
|
{Linux} 2.5 Kernel"
|
|
,Booktitle="Proceedings of the 2003 USENIX Annual Technical Conference
|
|
(FREENIX Track)"
|
|
,Publisher="USENIX Association"
|
|
,year="2003"
|
|
,month="June"
|
|
,pages="297-310"
|
|
}
|
|
|
|
@article{McKenney03a
|
|
,author="Paul E. McKenney"
|
|
,title="Using {RCU} in the {Linux} 2.5 Kernel"
|
|
,Year="2003"
|
|
,Month="October"
|
|
,journal="Linux Journal"
|
|
,volume="1"
|
|
,number="114"
|
|
,pages="18-26"
|
|
}
|
|
|
|
@techreport{Friedberg03a
|
|
,author="Stuart A. Friedberg"
|
|
,title="Lock-Free Wild Card Search Data Structure and Method"
|
|
,institution="US Patent and Trademark Office"
|
|
,address="Washington, DC"
|
|
,year="2003"
|
|
,number="US Patent 6,662,184 (contributed under GPL)"
|
|
,month="December"
|
|
,pages="112"
|
|
}
|
|
|
|
@article{McKenney04a
|
|
,author="Paul E. McKenney and Dipankar Sarma and Maneesh Soni"
|
|
,title="Scaling dcache with {RCU}"
|
|
,Year="2004"
|
|
,Month="January"
|
|
,journal="Linux Journal"
|
|
,volume="1"
|
|
,number="118"
|
|
,pages="38-46"
|
|
}
|
|
|
|
@Conference{McKenney04b
|
|
,Author="Paul E. McKenney"
|
|
,Title="{RCU} vs. Locking Performance on Different {CPUs}"
|
|
,Booktitle="{linux.conf.au}"
|
|
,Month="January"
|
|
,Year="2004"
|
|
,Address="Adelaide, Australia"
|
|
,note="Available:
|
|
\url{http://www.linux.org.au/conf/2004/abstracts.html#90}
|
|
\url{http://www.rdrop.com/users/paulmck/rclock/lockperf.2004.01.17a.pdf}
|
|
[Viewed June 23, 2004]"
|
|
}
|
|
|
|
@phdthesis{PaulEdwardMcKenneyPhD
|
|
,author="Paul E. McKenney"
|
|
,title="Exploiting Deferred Destruction:
|
|
An Analysis of Read-Copy-Update Techniques
|
|
in Operating System Kernels"
|
|
,school="OGI School of Science and Engineering at
|
|
Oregon Health and Sciences University"
|
|
,year="2004"
|
|
,note="Available:
|
|
\url{http://www.rdrop.com/users/paulmck/RCU/RCUdissertation.2004.07.14e1.pdf}
|
|
[Viewed October 15, 2004]"
|
|
}
|
|
|
|
@Conference{Sarma04c
|
|
,Author="Dipankar Sarma and Paul E. McKenney"
|
|
,Title="Making RCU Safe for Deep Sub-Millisecond Response Realtime Applications"
|
|
,Booktitle="Proceedings of the 2004 USENIX Annual Technical Conference
|
|
(FREENIX Track)"
|
|
,Publisher="USENIX Association"
|
|
,year="2004"
|
|
,month="June"
|
|
,pages="182-191"
|
|
}
|
|
|
|
@unpublished{JamesMorris04b
|
|
,Author="James Morris"
|
|
,Title="Recent Developments in {SELinux} Kernel Performance"
|
|
,month="December"
|
|
,year="2004"
|
|
,note="Available:
|
|
\url{http://www.livejournal.com/users/james_morris/2153.html}
|
|
[Viewed December 10, 2004]"
|
|
}
|
|
|
|
@unpublished{PaulMcKenney05a
|
|
,Author="Paul E. McKenney"
|
|
,Title="{[RFC]} {RCU} and {CONFIG\_PREEMPT\_RT} progress"
|
|
,month="May"
|
|
,year="2005"
|
|
,note="Available:
|
|
\url{http://lkml.org/lkml/2005/5/9/185}
|
|
[Viewed May 13, 2005]"
|
|
,annotation="
|
|
First publication of working lock-based deferred free patches
|
|
for the CONFIG_PREEMPT_RT environment.
|
|
"
|
|
}
|
|
|
|
@conference{PaulMcKenney05b
|
|
,Author="Paul E. McKenney and Dipankar Sarma"
|
|
,Title="Towards Hard Realtime Response from the Linux Kernel on SMP Hardware"
|
|
,Booktitle="linux.conf.au 2005"
|
|
,month="April"
|
|
,year="2005"
|
|
,address="Canberra, Australia"
|
|
,note="Available:
|
|
\url{http://www.rdrop.com/users/paulmck/RCU/realtimeRCU.2005.04.23a.pdf}
|
|
[Viewed May 13, 2005]"
|
|
,annotation="
|
|
Realtime turns into making RCU yet more realtime friendly.
|
|
"
|
|
}
|