The attached patch documents the Linux kernel's memory barriers.
I've updated it from the comments I've been given.
The per-arch notes sections are gone because it's clear that there are so many
exceptions, that it's not worth having them.
I've added a list of references to other documents.
I've tried to get rid of the concept of memory accesses appearing on the bus;
what matters is apparent behaviour with respect to other observers in the
system.
Interrupts barrier effects are now considered to be non-existent. They may be
there, but you may not rely on them.
I've added a couple of definition sections at the top of the document: one to
specify the minimum execution model that may be assumed, the other to specify
what this document refers to by the term "memory".
I've made greater mention of the use of mmiowb().
I've adjusted the way in which caches are described, and described the fun
that can be had with cache coherence maintenance being unordered and data
dependency not being necessarily implicit.
I've described (smp_)read_barrier_depends().
I've rearranged the order of the sections, so that memory barriers are
discussed in abstract first, and then described the memory barrier facilities
available on Linux, before going on to more real-world discussions and examples.
I've added information about the lack of memory barriering effects with atomic
ops and bitops.
I've added information about control dependencies.
I've added more diagrams to illustrate caching interactions between CPUs.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As announced, lookup_hash() can now become static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The LED class/subsystem takes John Lenz's work and extends and alters it to
give what I think should be a fairly universal LED implementation.
The series consists of several logical units:
* LED Core + Class implementation
* LED Trigger Core implementation
* LED timer trigger (example of a complex trigger)
* LED device drivers for corgi, spitz and tosa Zaurus models
* LED device driver for locomo LEDs
* LED device driver for ARM ixp4xx LEDs
* Zaurus charging LED trigger
* IDE disk activity LED trigger
* NAND MTD activity LED trigger
Why?
====
LEDs are really simple devices usually amounting to a GPIO that can be turned
on and off so why do we need all this code? On handheld or embedded devices
they're an important part of an often limited user interface. Both users and
developers want to be able to control and configure what the LED does and the
number of different things they'd potentially want the LED to show is large.
A subsystem is needed to try and provide all this different functionality in
an architecture independent, simple but complete, generic and scalable manner.
The alternative is for everyone to implement just what they need hidden away
in different corners of the kernel source tree and to provide an inconsistent
interface to userspace.
Other Implementations
=====================
I'm aware of the existing arm led implementation. Currently the new subsystem
and the arm code can coexist quite happily. Its up to the arm community to
decide whether this new interface is acceptable to them. As far as I can see,
the new interface can do everything the existing arm implementation can with
the advantage that the new code is architecture independent and much more
generic, configurable and scalable.
I'm prepared to make the conversion to the LED subsystem (or assist with it)
if appropriate.
Implementation Details
======================
I've stripped a lot of code out of John's original LED class. Colours were
removed as LED colour is now part of the device name. Multiple colours are to
be handled as multiple led devices. This means you get full control over each
colour. I also removed the LED hardware timer code as the generic timer isn't
going to add much overhead and is just as useful. I also decided to have the
LED core track the current LED status (to ease suspend/resume handling)
removing the need for brightness_get implementations in the LED drivers.
An underlying design philosophy is simplicity. The aim is to keep a small
amount of code giving as much functionality as possible.
The major new idea is the led "trigger". A trigger is a source of led events.
Triggers can either be simple or complex. A simple trigger isn't
configurable and is designed to slot into existing subsystems with minimal
additional code. Examples are the ide-disk, nand-disk and zaurus-charging
triggers. With leds disabled, the code optimises away. Examples are
nand-disk and ide-disk.
Complex triggers whilst available to all LEDs have LED specific parameters and
work on a per LED basis. The timer trigger is an example.
You can change triggers in a similar manner to the way an IO scheduler is
chosen (via /sys/class/leds/somedevice/trigger).
So far there are only a handful of examples but it should easy to add further
LED triggers without too much interference into other subsystems.
Known Issues
============
The LED Trigger core cannot be a module as the simple trigger functions would
cause nightmare dependency issues. I see this as a minor issue compared to
the benefits the simple trigger functionality brings. The rest of the LED
subsystem can be modular.
Some leds can be programmed to flash in hardware. As this isn't a generic LED
device property, I think this should be exported as a device specific sysfs
attribute rather than part of the class if this functionality is required (eg.
to keep the led flashing whilst the device is suspended).
Future Development
==================
At the moment, a trigger can't be created specifically for a single LED.
There are a number of cases where a trigger might only be mappable to a
particular LED. The addition of triggers provided by the LED driver should
cover this option and be possible to add without breaking the current
interface.
A CPU activity trigger similar to that found in the arm led implementation
should be trivial to add.
This patch:
Add some brief documentation of the design decisions behind the LED class and
how it appears to users.
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
find_trylock_page() is an odd interface in that it doesn't take a reference
like the others. Now that XFS no longer uses it, and its last remaining
caller actually wants an elevated refcount, opencode that callsite and
schedule find_trylock_page() for removal.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[PATCH] sata_mv: three bug fixes
[PATCH] libata: ata_dev_init_params() fixes
[PATCH] libata: Fix interesting use of "extern" and also some bracketing
[PATCH] libata: Simplex and other mode filtering logic
[PATCH] libata - ATA is both ATA and CFA
[PATCH] libata: Add ->set_mode hook for odd drivers
[PATCH] libata: BMDMA handling updates
[PATCH] libata: kill trailing whitespace
[PATCH] libata: add FIXME above ata_dev_xfermask()
[PATCH] libata: cosmetic changes in ata_bus_softreset()
[PATCH] libata: kill E.D.D.
Add a field to the host_set called 'flags' (was host_set_flags changed
to suit Jeff)
Add a simplex_claimed field so we can remember who owns the DMA channel
Add a ->mode_filter() hook to allow drivers to filter modes
Add docs for mode_filter and set_mode
Filter according to simplex state
Filter cable in core
This provides the needed framework to support all the mode rules found
in the PATA world. The simplex filter deals with 'to spec' simplex DMA
systems found in older chips. The cable filter avoids duplicating the
same rules in each chip driver with PATA. Finally the mode filter is
neccessary because drive/chip combinations have errata that forbid
certain modes with some drives or types of ATA object.
Drive speed setup remains per channel for now and the filters now use
the framework Tejun put into place which cleans them up a lot from the
older libata-pata patches.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Sorely out of date. Add the linux-net wiki web site to
the NETWORKING maintainers entry, on which we maintain
the current networking TODO list.
Noticed by Randy Dunlap.
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a lot of typos. Eyeballed by jmc@ in OpenBSD.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix a few trivial mistakes in Documentation/cputopology.txt
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Replace all occurences of 0xff.. in calls to function pci_set_dma_mask()
and pci_set_consistant_dma_mask() with the corresponding DMA_xBIT_MASK from
linux/dma-mapping.h.
Signed-off-by: Matthias Gehre <M.Gehre@gmx.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nowadays, even Debian stable ships a microcode_ctl utility recent enough to no
longer use this ioctl.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Tigran Aivazian <tigran_aivazian@symantec.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Replace for_each_cpu with for_each_possible_cpu.
Modifies occurences in documentaion.
for_each_cpu in whatisRCU.txt should be for_each_online_cpu ???
(I'm not sure..)
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This removes statically assigned platform numbers and reworks the
powerpc platform probe code to use a better mechanism. With this,
board support files can simply declare a new machine type with a
macro, and implement a probe() function that uses the flattened
device-tree to detect if they apply for a given machine.
We now have a machine_is() macro that replaces the comparisons of
_machine with the various PLATFORM_* constants. This commit also
changes various drivers to use the new macro instead of looking at
_machine.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
* 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block:
[PATCH] Don't make debugfs depend on DEBUG_KERNEL
[PATCH] Fix blktrace compile with sysfs not defined
[PATCH] unused label in drivers/block/cciss.
[BLOCK] increase size of disk stat counters
[PATCH] blk_execute_rq_nowait-speedup
[PATCH] ide-cd: quiet down GPCMD_READ_CDVD_CAPACITY failure
[BLOCK] ll_rw_blk: kmalloc -> kzalloc conversion
[PATCH] kzalloc() conversion in drivers/block
[PATCH] update max_sectors documentation
Remove the assumption that pnp_register_driver() returns the number of devices
claimed. Returning the count is unreliable because devices may be hot-plugged
in the future.
This changes the convention to "zero for success, or a negative error value,"
which matches pci_register_driver(), acpi_bus_register_driver(), and
platform_driver_register().
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Adam Belay <ambx1@neo.rr.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- fix: initialize the robust list(s) to NULL in copy_process.
- doc update
- cleanup: rename _inuser to _inatomic
- __user cleanups and other small cleanups
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Erik Mouw
The LART website moved to http://www.lartmaker.nl/. This patch
updates the URL in ARM specific files.
Signed-off-by: Erik Mouw <erik@bitwizard.nl>
Acked-by: Jan-Derk Bakker <jdb@lartmaker.nl>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The max_sectors has been split into max_hw_sectors and max_sectors for some
time. A patch to have blk_queue_max_sectors enforce this was sent by
me and it broke IDE. This patch updates the documentation.
Signed-off-by: Jens Axboe <axboe@suse.de>
Fix spelling errors in EDAC documentation.
Signed-off-by: David S. Peterson <dsp@llnl.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We have a problem in a lot of emulated storage in that it takes a page from
get_user_pages() and does something like
kmap_atomic(page)
modify page
kunmap_atomic(page)
However, nothing has flushed the kernel cache view of the page before the
kunmap. We need a lightweight API to do this, so this new API would
specifically be for flushing the kernel cache view of a user page which the
kernel has modified. The driver would need to add
flush_kernel_dcache_page(page) before the final kunmap.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently, get_user_pages() returns fully coherent pages to the kernel for
anything other than anonymous pages. This is a problem for things like
fuse and the SCSI generic ioctl SG_IO which can potentially wish to do DMA
to anonymous pages passed in by users.
The fix is to add a new memory management API: flush_anon_page() which
is used in get_user_pages() to make anonymous pages coherent.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/aoe-2.6:
[PATCH] aoe [3/3]: update version to 22
[PATCH] aoe [2/3]: don't request ATA device ID on ATA error
[PATCH] aoe [1/3]: support multiple AoE listeners
[PATCH] aoe: do not stop retransmit timer when device goes down
[PATCH] aoe [8/8]: update driver version number
[PATCH] aoe [7/8]: update driver compatibility string
[PATCH] aoe [6/8]: update device information on last close
[PATCH] aoe [5/8]: allow network interface migration on packet retransmit
[PATCH] aoe [4/8]: use less confusing driver name
[PATCH] aoe [3/8]: increase allowed outstanding packets
[PATCH] aoe [2/8]: support dynamic resizing of AoE devices
[PATCH] aoe [1/8]: zero packet data after skb allocation
* master.kernel.org:/pub/scm/linux/kernel/git/sam/kbuild: (46 commits)
kbuild: remove obsoleted scripts/reference_* files
kbuild: fix make help & make *pkg
kconfig: fix time ordering of writes to .kconfig.d and include/linux/autoconf.h
Kconfig: remove the CONFIG_CC_ALIGN_* options
kbuild: add -fverbose-asm to i386 Makefile
kbuild: clean-up genksyms
kbuild: Lindent genksyms.c
kbuild: fix genksyms build error
kbuild: in makefile.txt note that Makefile is preferred name for kbuild files
kbuild: replace PHONY with FORCE
kbuild: Fix bug in crc symbol generating of kernel and modules
kbuild: change kbuild to not rely on incorrect GNU make behavior
kbuild: when warning symbols exported twice now tell user this is the problem
kbuild: fix make dir/file.xx when asm symlink is missing
kbuild: in the section mismatch check try harder to find symbols
kbuild: fix section mismatch check for unwind on IA64
kbuild: kill false positives from section mismatch warnings for powerpc
kbuild: kill trailing whitespace in modpost & friends
kbuild: small update of allnoconfig description
kbuild: make namespace.pl CROSS_COMPILE happy
...
Trivial conflict in arch/ppc/boot/Makefile manually fixed up
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (21 commits)
BUG_ON() Conversion in drivers/video/
BUG_ON() Conversion in drivers/parisc/
BUG_ON() Conversion in drivers/block/
BUG_ON() Conversion in sound/sparc/cs4231.c
BUG_ON() Conversion in drivers/s390/block/dasd.c
BUG_ON() Conversion in lib/swiotlb.c
BUG_ON() Conversion in kernel/cpu.c
BUG_ON() Conversion in ipc/msg.c
BUG_ON() Conversion in block/elevator.c
BUG_ON() Conversion in fs/coda/
BUG_ON() Conversion in fs/binfmt_elf_fdpic.c
BUG_ON() Conversion in input/serio/hil_mlc.c
BUG_ON() Conversion in md/dm-hw-handler.c
BUG_ON() Conversion in md/bitmap.c
The comment describing how MS_ASYNC works in msync.c is confusing
rcu: undeclared variable used in documentation
fix typos "wich" -> "which"
typo patch for fs/ufs/super.c
Fix simple typos
tabify drivers/char/Makefile
...
Fix Documentation/firmware_class/ examples so that they will build.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add info on flow control for serial consoles. Refer to netconsole option
also.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As Pekka Enberg pointed out, with the if still following the else, you can
still get a null uid written to the disk if you specify a default uid= without
uid=forget. In other words, if the desktop user is uid=1000 and the mount
option uid=1000 is given ( which is done on ubuntu automatically and probably
other distributions that use hal ), then if any other user besides uid 1000
owns a file then a 0 will be written to the media as the owning uid instead.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Flesh out the description of the address_space operations.
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: Avishay Traeger <atraeger@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix documentation to match current implementation.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There were a number of conflicting naming schemes used in the v9fs project.
The directory was fs/9p, but MAINTAINERS and Documentation referred to
v9fs. The module name itself was 9p2000, and the file system type was 9P.
This patch attempts to clean that up, changing all references to 9p in
order to match the directory name. We'll also start using 9p instead of
v9fs as our patch prefix.
There is also a minor consistency cleanup in the options changing the name
option to uname in order to more closely match the Plan 9 options.
Signed-off-by: Eric Van Hensbergevan <ericvh@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
MODULE_PARM was actually breaking: recent gcc version optimize them out as
unused. It's time to replace the last users, which are generally in the
most unloved drivers anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
According to the specification the timevals must be validated and an
errorcode -EINVAL returned in case the timevals are not in canonical form.
This check was never done in Linux.
The pre 2.6.16 code converted invalid timevals silently. Negative timeouts
were converted by the timeval_to_jiffies conversion to the maximum timeout.
hrtimers and the ktime_t operations expect timevals in canonical form.
Otherwise random results might happen on 32 bits machines due to the
optimized ktime_add/sub operations. Negative timeouts are treated as
already expired. This might break applications which work on pre 2.6.16.
To prevent random behaviour and API breakage the timevals are checked and
invalid timevals sanitized in a simliar way as the pre 2.6.16 code did.
Invalid timevals are reported with a per boot limited number of kernel
messages so applications which use this misfeature can be corrected.
After a grace period of one year the sanitizing should be replaced by a
correct validation check. This is also documented in
Documentation/feature-removal-schedule.txt
The validation and sanitizing is done inside do_setitimer so all callers
(sys_setitimer, compat_sys_setitimer, osf_setitimer) are catched.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The RCU documentation uses an fp variable which is not declared in the code
snippets. Use the new_fp variable instead.
Signed-Off-By: Baruch Even <baruch@ev-en.org>
Acked-by: <paulmck@us.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Drivers have no business looking at the task list and thus using this lock.
The only possibly modular users left are:
arch/ia64/kernel/mca.c
drivers/edac/edac_mc.c
fs/binfmt_elf.c
which I'll send out fixes for soon.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>