I was trying to use HDIO_DRIVE_TASK for something today,
and discovered that the libata implementation does not copy
over the upper four LBA bits from args[6].
This is serious, as any tools using this ioctl would have their
commands applied to the wrong sectors on the drive, possibly resulting
in disk corruption.
Ideally, newer apps should use SG_IO/ATA_16 directly,
avoiding this bug. But with libata poised to displace drivers/ide,
better compatibility here is a must.
This patch fixes libata to use the upper four LBA bits passed
in from the ioctl.
The original drivers/ide implementation copies over all bits
except for the master/slave select bit. With this patch,
libata will copy only the four high-order LBA bits,
just in case there are assumptions elsewhere in libata (?).
Signed-Off-By: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The current EH speed down code is more of a proof that the EH
framework is capable of adjusting transfer speed in response to error.
This patch puts some intelligence into EH speed down sequence. The
rules are..
* If there have been more than three timeout, HSM violation or
unclassified DEV errors for known supported commands during last 10
mins, NCQ is turned off.
* If there have been more than three timeout or HSM violation for known
supported command, transfer mode is slowed down. If DMA is active,
it is first slowered by one grade (e.g. UDMA133->100). If that
doesn't help, it's slowered to 40c limit (UDMA33). If PIO is
active, it's slowered by one grade first. If that doesn't help,
PIO0 is forced. Note that this rule does not change transfer mode.
DMA is never degraded into PIO by this rule.
* If there have been more than ten ATA bus, timeout, HSM violation or
unclassified device errors for known supported commands && speeding
down DMA mode didn't help, the device is forced into PIO mode. Note
that this rule is considered only for PATA devices and is pretty
difficult to trigger.
One error can only trigger one rule at a time. After a rule is
triggered, error history is cleared such that the next speed down
happens only after some number of errors are accumulated. This makes
sense because now speed down is done in bigger stride.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* Move forcing device to PIO0 on device disable into
ata_dev_disable(). This makes both old and new EHs act the same
way.
* Speed down only PIO mode on probe failure. All commands used during
probing are PIO commands. There's no point in speeding down DMA.
* Retry at least once after -ENODEV. Some devices report garbled
IDENTIFY data after certain events. This shouldn't cause device
detach and re-attach.
* Rearrange EH failure path for simplicity.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Make ata_down_xfermask_limit() accept @sel instead of @force_pio0.
@sel selects how the xfermask limit will be adjusted. The following
selectors are defined.
* ATA_DNXFER_PIO : only speed down PIO
* ATA_DNXFER_DMA : only speed down DMA, don't cause transfer mode change
* ATA_DNXFER_40C : apply 40c cable limit
* ATA_DNXFER_FORCE_PIO : force PIO
* ATA_DNXFER_FORCE_PIO0 : force PIO0 (same as original with @force_pio0 == 1)
* ATA_DNXFER_ANY : same as original with @force_pio0 == 0
Currently, only ANY and FORCE_PIO0 are used to maintain the original
behavior. Other selectors will be used later to improve EH speed down
sequence.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This is the patch for PATA controller of Celleb.
This driver uses the managed iomap (devres).
Because this driver needs special taskfile accesses, there is
a copy of ata_std_softreset(). ata_dev_try_classify() is exported
so that it can be used in this function.
Signed-off-by: Kou Ishizaki <kou.ishizaki@toshiba.co.jp>
Signed-off-by: Akira Iguchi <akira2.iguchi@toshiba.co.jp>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
WARNING: drivers/video/i810/i810fb.o - Section mismatch: reference
to .init.data: from .text between 'i810_check_params' (at offset
0x1123) and 'encode_fix'
yres cannot be declared __devinitdata as it is used in
i810_check_params(), which isn't __devinit.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Driver for the Silicon Motion SM501 multifunction device framebuffer
subsystem.
This driver supports both the CRT and LCD panel heads, with some simple
acceleration for the cursor plotting and support for screen panning. There
is no current support for bitblt/drawing engines, which should be added at
a later date.
This has been tested on a number of configurations, including PCI and
generic-bus, on PPC, ARM and SH4
[akpm@linux-foundation.org: fix warnings]
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Vincent Sanders <vince@arm.linux.org.u.>
Acked-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Based on the discussion last december (http://lkml.org/lkml/2006/12/20/241),
this patch
- adds gpio_direction_input/output functions to
generic.c instead of making them inline,
- fixes comment and includes and uses inline functions
instead of macros in gpio.h
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
this one adds an #include <asm/arch/regs-gpio.h>.
Tested by Roman Moravcik on s3c2440.
Based on the discussion last december
(http://lkml.org/lkml/2006/12/20/243), this patch
- fixes comment and includes in gpio.h
- adds the gpio_to_irq definition for S3C2400
- includes asm/arch/regs-gpio.h for pin direction
definitions
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add transfer modes 2 and 3 to the S3C24XX gpio SPI driver
Signed-off-by: Harald Welte <laforge@openmoko.org>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The signature of the per-device cleanup() routine changed to remove its
const-ness. Three new SPI controller drivers now need that change, to
eliminate build warnings.
This also fixes a build bug with atmel_spi on AT91 systems.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
WARNING: drivers/parport/parport_pc.o - Section mismatch: reference
to .init.text: from .text between 'parport_pc_probe_port' (at offset
0x14f7) and 'parport_pc_unregister_port'
parport_dma_probe() cannot be declared __devinit as it is called
from parport_pc_probe_port() which isn't.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
LD drivers/isdn/gigaset/built-in.o
drivers/isdn/gigaset/ser_gigaset.o: In function `gigaset_m10x_send_skb':
(.text+0xe50): multiple definition of `gigaset_m10x_send_skb'
drivers/isdn/gigaset/usb_gigaset.o:(.text+0x0): first defined here
drivers/isdn/gigaset/ser_gigaset.o: In function `gigaset_m10x_input':
(.text+0x1121): multiple definition of `gigaset_m10x_input'
drivers/isdn/gigaset/usb_gigaset.o:(.text+0x2d1): first defined here
make[4]: *** [drivers/isdn/gigaset/built-in.o] Error 1
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch stops "modpost" from issuing erroneous modpost warnings on ARM
builds, which it's been doing since since maybe last summer. A canonical
example would be driver method table entries:
WARNING: <path> - Section mismatch: reference to .exit.text:<name>_remove
from .data after '$d' (at offset 0x4)
That "$d" symbol is generated by tools conformant with ARM ABI specs; in
this case it's a symbol **in the middle of** a "<name>_driver" struct.
The erroneous warnings appear to be issued because "modpost" whitelists
references from "<name>_driver" data into init and exit sections ... but
doesn't know should also include those "$d" mapping symbols, which are not
otherwise associated with "<name>_driver" symbols.
This patch prevents the modpost symbol lookup code from ever returning
those mapping symbols, so it will return a whitelisted symbol instead.
Then things work as expected.
Now to revert various code-bloating "fixes" that got merged because of this
modpost bug....
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Roman Zippel <zippel@linux-m68k.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Based on the discussion last december (http://lkml.org/lkml/2006/12/20/242),
this patch:
- moves the PXA_LAST_GPIO check into pxa_gpio_mode
- fixes comment and includes in gpio.h
- replaces the gpio_set/get_value macros with inline
functions and adds a non-inline version to avoid
code explosion when gpio is not a constant.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Nicolas Pitre <nico@cam.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Various bug fixes to the at91rm9200 RTC:
- alarm: setalarm() should pay attention to the "enabled" flag
- init: cleaner handling of the wakeup flags, which cpu init should
really have set up. Doing it here is just a workaround.
- linkage: since the at91_rtc driver probe() routine is in the init
section, it should use platform_driver_probe() instead of leaving
that pointer around in the driver struct after init section removal.
- linkage: likewise, remove() belongs in the exit section.
Among other things, the init and alarm changes ensure that this driver
handles the new sysfs "wakealarm" attribute properly.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some rtc-sa1100 bugfixes:
- The read_alarm() method reports the rtc_wkalrm.enabled field properly.
This patch is already in the handhelds.org tree.
- And the set_alarm() method now handles that flag correctly, rather than
making mismatched {en,dis}able_irq_wake() calls, which trigger runtime
warning messages. (Those calls are best made in suspend/resume methods.)
Note that while this SA1100/PXA RTC is fully capable of waking those ARM
processors from sleep states, that mechanism isn't properly supported on
either processor family, or in this driver. Some boards have board-specific
PM glue providing partial workarounds for the weak generic PM support.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pointers to user data should be marked with a __user hint. This one is
missing.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
lib/genalloc.c: In function 'gen_pool_alloc':
lib/genalloc.c:151: warning: passing argument 2 of '__set_bit' from incompatible pointer type
lib/genalloc.c: In function 'gen_pool_free':
lib/genalloc.c:190: warning: passing argument 2 of '__clear_bit' from incompatible pointer type
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
affs wants to truncate the inode when the last user goes away, currently it
does that through a potentially racy i_count check in ->put_inode. But we
already have a method that's called just after the we dropped the last
reference, ->drop_inode. This patch implements affs_drop_inode to take
advantage of this.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This problem was identified and fixed some time ago by Jeff Moyer but it fell
through the cracks somehow.
It is possible that a user space application could remove and re-create a
directory during a request. To avoid returning a failure from lookup
incorrectly when our current dentry is unhashed we need to check if another
positive, hashed dentry matching this one exists and if so return it instead
of a fail.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff Moyer has identified a race between mount and expire.
What happens is that during an expire the situation can arise that a directory
is removed and another lookup is done before the expire issues a completion
status to the kernel module. In this case, since the the lookup gets a new
dentry, it doesn't know that there is an expire in progress and when it posts
its mount request, matches the existing expire request and waits for its
completion. ENOENT is then returned to user space from lookup (as the dentry
passed in is now unhashed) without having performed the mount request.
The solution used here is to keep track of dentrys in this unhashed state and
reuse them, if possible, in order to preserve the flags. Additionally, this
infrastructure will provide the framework for the reintroduction of caching of
mount fails removed earlier in development.
Signed-off-by: Ian Kent <raven@themaw.net>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The current header file definitions for autofs version 5 have caused a couple
of problems for application builds downstream.
This fixes the problem by separating the definitions.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
nobh_prepare_write leaks data similarly to how simple_prepare_write did. Fix
by not marking the page uptodate until nobh_commit_write time. Again, this
could break weird use-cases, but none appear to exist in the tree.
We can safely remove the set_page_dirty, because as the comment says,
nobh_commit_write does set_page_dirty. If a filesystem wants to allocate
backing store for a page dirtied via mmap, page_mkwrite is the suggested
approach.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
simple_prepare_write leaks uninitialised kernel data. This happens because
the it leaves an uninitialised "hole" over the part of the page that the
write is expected to go to. This is fine, but it then marks the page
uptodate, which means a concurrent read can come in and copy the
uninitialised memory into userspace before it written to.
Fix it by simply marking it uptodate in simple_commit_write instead, after
the hole has been filled in. This could theoretically break an fs that
uses simple_prepare_write and not simple_commit_write, and that relies on
the incorrect simple_prepare_write behaviour. Luckily, none of those
exists in the tree.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This option is useful for all of the X86 subarchs afaik (and especially
X86_GENERICARCH).
Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch from Mohan Kumar M to add the ppc64 portions of the kdump
documentation.
http://thread.gmane.org/gmane.linux.kernel/481689/focus=3375
Cc: Mohan Kumar M <mohan@in.ibm.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The patch below updates MAINTAIER address
Individuals (Only Andrew :): osdl.org -> linux-foundation.org
Lists: osdl.org -> lists.osdl.org
I assume the latter will change at some stage, but at least
with this change the osdl/linux-foundation lists are consistent.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix 23 of these sparse warnings on x86_64 allmodconfig:
include/linux/cdrom.h:942:19: error: dubious bitfield without explicit
`signed' or `unsigned'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix sparse warning in tty_io:
drivers/char/tty_io.c:1536:34: warning: Using plain integer as NULL pointer
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Output of a function or struct in html mode needs to include the short
description from the function/struct name line in the output title line.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This driver provides the core functionality of the SM501, which is a
multi-function chip including two framebuffers, video acceleration, USB,
and many other peripheral blocks.
The driver exports a number of entries for the peripheral drivers to use.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Vincent Sanders <vince@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In __lock_acquire check_chain_key can turn off debug_locks, so check is
needed to assure proper return code.
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The problem comes when ks0108/cfag12864b are built-in and no parallel port is
present. ks0108_init() is called first, as it should be, but fails to load
(as there is no parallel port to use).
After that, cfag12864b_init() gets called, without knowing anything about
ks0108 failed, and calls ks0108_writecontrol(), which dereferences an
uninitialized pointer.
Init order is OK, I think. The problem is how to stop cfag12864b_init() being
called if ks0108 failed to load. modprobe does it for us, but, how when
built-in?
Signed-off-by: Miguel Ojeda Sandonis <maxextreme@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Flags from spin_lock_irqsave() are saved into global variable and restored
from it. My gut feeling this is very racy.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes http://bugzilla.kernel.org/show_bug.cgi?id=7810 - a silly
copy-paste bug introduced by the latest change.
Signed-off-by: Gerhard Dirschl <gd@spherenet.de>
Cc: Peter Osterlund <petero2@telia.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is no prompt for CONFIG_STACKTRACE, so FAULT_INJECTION cannot be
selected without LOCKDEP enabled. (found by Paolo 'Blaisorblade'
Giarrusso)
In order to fix such broken Kconfig dependency, this patch splits up the
stacktrace filter support for fault injection by new Kconfig option, which
enables to use fault injection on the architecture which doesn't have
general stacktrace support.
Cc: "Paolo 'Blaisorblade' Giarrusso" <blaisorblade@yahoo.it>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the DIO write on FAT is expanding the size, it will be fail by -EINVAL,
because FAT can't handle it now.
This patch fallback it to the normal buffered-write and would return
success.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch lists all active probes in the system by scanning through
kprobe_table[]. It takes care of aggregate handlers and prints the type of
the probe. Letter "k" for kprobes, "j" for jprobes, "r" for kretprobes.
It also lists address of the instruction,its symbolic name(function name +
offset) and the module name. One can access this file through
/sys/kernel/debug/kprobes/list.
Output looks like this
=====================
llm40:~/a # cat /sys/kernel/debug/kprobes/list
c0169ae3 r sys_read+0x0
c0169ae3 k sys_read+0x0
c01694c8 k vfs_write+0x0
c0167d20 r sys_open+0x0
f8e658a6 k reiserfs_delete_inode+0x0 reiserfs
c0120f4a k do_fork+0x0
c0120f4a j do_fork+0x0
c0169b4a r sys_write+0x0
c0169b4a k sys_write+0x0
c0169622 r vfs_read+0x0
=================================
[akpm@linux-foundation.org: cleanup]
[ananth@in.ibm.com: sparc build fix]
Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew noticed that unlocking the page before submitting all buffers for
writeout could cause problems if the IO completes before we've finished
messing around with the page buffers, and they subsequently get freed.
Even if there were no bug, it is a good idea to bring the error case
into line with the common case here.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The current ipc shared memory code runs into several problems because it
does not quite use files like the rest of the kernel. With the option of
backing ipc shared memory with either hugetlbfs or ordinary shared memory
the problems got worse. With the added support for ipc namespaces things
behaved so unexpected that we now have several bad namespace reference
counting bugs when using what appears at first glance to be a reasonable
idiom.
So to attack these problems and hopefully make the code more maintainable
this patch simply uses the files provided by other parts of the kernel and
builds it's own files out of them. The shm files are allocated in do_shmat
and freed when their reference count drops to zero with their last unmap.
The file and vm operations that we don't want to implement or we don't
implement completely we just delegate to the operations of our backing
file.
This means that we now get an accurate shm_nattch count for we have a
hugetlbfs inode for backing store, and the shm accounting of last attach
and last detach time work as well.
This means that getting a reference to the ipc namespace when we create the
file and dropping the referenece in the release method is now safe and
correct.
This means we no longer need a special case for clearing VM_MAYWRITE
as our file descriptor now only has write permissions when we have
requested write access when calling shmat. Although VM_SHARED is now
cleared as well which I believe is harmless and is mostly likely a
minor bug fix.
By using the same set of operations for both the hugetlb case and regular
shared memory case shmdt is not simplified and made slightly more correct
as now the test "vma->vm_ops == &shm_vm_ops" is 100% accurate in spotting
all shared memory regions generated from sysvipc shared memory.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The alien cache is a per cpu per node array allocated for every slab on the
system. Currently we size this array for all nodes that the kernel does
support. For IA64 this is 1024 nodes. So we allocate an array with 1024
objects even if we only boot a system with 4 nodes.
This patch uses "nr_node_ids" to determine the number of possible nodes
supported by a hardware configuration and only allocates an alien cache
sized for possible nodes.
The initialization of nr_node_ids occurred too late relative to the bootstrap
of the slab allocator and so I moved the setup_nr_node_ids() into
free_area_init_nodes().
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We frequently need the maximum number of possible processors in order to
allocate arrays for all processors. So far this was done using
highest_possible_processor_id(). However, we do need the number of
processors not the highest id. Moreover the number was so far dynamically
calculated on each invokation. The number of possible processors does not
change when the system is running. We can therefore calculate that number
once.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
highest_possible_node_id() is currently used to calculate the last possible
node idso that the network subsystem can figure out how to size per node
arrays.
I think having the ability to determine the maximum amount of nodes in a
system at runtime is useful but then we should name this entry
correspondingly, it should return the number of node_ids, and the the value
needs to be setup only once on bootup. The node_possible_map does not
change after bootup.
This patch introduces nr_node_ids and replaces the use of
highest_possible_node_id(). nr_node_ids is calculated on bootup when the
page allocators pagesets are initialized.
[deweerdt@free.fr: fix oops]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>