After this patch none of the netlink callbacks support anything
except the initial network namespace, but the rtnetlink infrastructure
now handles multiple network namespaces.
Changes from v2:
- IPv6 addrlabel processing
Changes from v1:
- no need for special rtnl_unlock handling
- fixed IPv6 ndisc
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds IEEE80211_MAX_FRAME_LEN which is useful for drivers trying
to determine how much to allocate for their RX buffers.
It also updates the comment on IEEE80211_MAX_DATA_LEN based on revisions
in 802.11e.
IEEE80211_MAX_FRAG_THRESHOLD and IEEE80211_MAX_RTS_THRESHOLD are also
revised due to the new maximum frame size.
Signed-off-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The rx_flags variable is redundant. Turning rx on/off is done
via setting the rx_np pointer.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The local_mac is managed by the network device, no need to keep a
spare copy and all the management problems that could cause.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the missing Kbuild entries and the missing Kbuild file
in include/linux/can for the CAN subsystem.
Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the use of plain integers instead of __u32 in a struct
that is visible from kernel space and user space.
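A small userspace sketch of why the fixed-width type matters: the size of a plain int is not pinned down by the C standard, while a 32-bit type gives the same layout on both sides of the interface. The struct and its field names are made up for illustration and are not the CAN struct touched by this patch; sketch_u32 stands in for the kernel's __u32.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's __u32 (assumption for this sketch). */
typedef uint32_t sketch_u32;

/* Hypothetical message header shared between kernel and user space.
 * Every field has an explicit width, so both sides agree on the layout. */
struct sketch_shared_msg {
	sketch_u32 opcode;
	sketch_u32 flags;
	sketch_u32 count;
};
```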
Thanks to Sam Ravnborg for pointing out the wrong plain int usage.
Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the CAN broadcast manager (bcm) protocol.
Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the CAN raw protocol.
Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the CAN core functionality but no protocols or drivers.
No protocol implementations are included here. They come as separate
patches. Protocol numbers are already in include/linux/can.h.
Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a protocol/address family number, ARP hardware type,
ethernet packet type, and a line discipline number for the SocketCAN
implementation.
Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Key points of this patch are:
- If the new SACK information only advances past the previously
discovered highest point, no skb processing below that point is done
- Optimize cases below the highest point too, since there's no need
to always go up to the highest point (which is very likely still
present in that SACK). This is not entirely true though,
because I'm dropping the fastpath_skb_hint which could
previously optimize those cases even better. Whether that's
significant, I'm not too sure.
Currently it will provide skipping by walking. Combined with
RB-tree, all skipping would become fast too regardless of window
size (can be done incrementally later).
Previously a number of cases in TCP SACK processing failed to
take advantage of costly stored information in recv_sack_cache,
most importantly, expected events such as cumulative ACK and new
hole ACKs. Processing such ACKs results in rather long walks
building up latencies (which easily gets nasty when the window is
huge). Those latencies are often completely unnecessary
compared with the amount of _new_ information received. Usually
for a cumulative ACK there's no new information at all, yet TCP
walks the whole queue unnecessarily, potentially taking a number of
costly cache misses on the way.
Since the inclusion of highest_sack, there's a lot of information
that is very likely redundant (SACK fastpath hint stuff,
fackets_out, highest_sack), though there's no ultimate guarantee
that they'll remain the same the whole time (in all unearthly
scenarios). Take advantage of this knowledge here and drop
fastpath hint and use direct access to highest SACKed skb as
a replacement.
Effectively the "special cased" fastpath is dropped. This change
adds some complexity to introduce a better covered "fastpath",
though the added complexity should make TCP behave in a more cache
friendly way.
The current ACK's SACK blocks are compared against each cached
block individually and only ranges that are new are then scanned
by the high constant walk. For other parts of the write queue, even
when in a previously known part of the SACK blocks, a faster skip
function is used (if necessary at all). In addition, whenever
possible, TCP fast-forwards to the highest_sack skb that was made
available by an earlier patch. In the typical case, nothing
but this fast-forward and the mandatory markings after it occur,
making the access pattern quite similar to the former fastpath
"special case".
DSACKs are a special case that must always be walked.
The local to recv_sack_cache copying could be more intelligent
w.r.t DSACKs which are likely to be there only once but that
is left to a separate patch.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is going to replace the sack fastpath hint quite soon... :-)
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
The policy table is implemented as an RCU linear list since we do not expect
a large list or frequent updates.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IPv4 and IPv6 hook values are identical, yet some code tries to figure
out the "correct" value by looking at the address family. Introduce NF_INET_*
values for both IPv4 and IPv6. The old values are kept in a #ifndef __KERNEL__
section for userspace compatibility.
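A standalone sketch of the simplification, not the kernel code: only the NF_INET_* names come from this patch, while the OLD_* constants, the sketch_family enum and both helper functions are made up for illustration.

```c
#include <assert.h>

/* Address families, named locally so the sketch stands alone. */
enum sketch_family { SKETCH_AF_INET, SKETCH_AF_INET6 };

/* Hypothetical stand-ins for the per-family constants: same value twice. */
enum { OLD_IP_PRE_ROUTING = 0, OLD_IP6_PRE_ROUTING = 0 };

/* Family-independent hook numbers, in the spirit of NF_INET_*. */
enum sketch_inet_hooks {
	NF_INET_PRE_ROUTING,
	NF_INET_LOCAL_IN,
	NF_INET_FORWARD,
	NF_INET_LOCAL_OUT,
	NF_INET_POST_ROUTING,
	NF_INET_NUMHOOKS
};

/* Before: pick the "correct" constant by family although both are equal. */
int old_prerouting_hook(enum sketch_family family)
{
	return family == SKETCH_AF_INET ? OLD_IP_PRE_ROUTING
					: OLD_IP6_PRE_ROUTING;
}

/* After: one constant serves both families, no lookup needed. */
int new_prerouting_hook(void)
{
	return NF_INET_PRE_ROUTING;
}
```

Both helpers return the same number for either family, which is the redundancy the shared values remove.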
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow the caller to pass in a release function, since there might be
other resources that need releasing as well. Needed for
network receive.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Today we have the following annotations for functions/data
referencing __init/__exit functions / data:
__init_refok => for init functions
__initdata_refok => for init data
__exit_refok => for exit functions
There is really no difference between the __init and __exit
versions, so to simplify this and to introduce a shorter annotation,
the following new annotations are introduced:
__ref => for functions (code) that
references __*init / __*exit
__refdata => for variables
__refconst => for const variables
With these annotations it is more obvious what the annotation
is for, and there is no longer the arbitrary division
between __init and __exit code.
The mechanism is the same as before - a special section
is created which is made part of the usual sections
in the linker script.
We will start to see annotations like this:
-static struct pci_serial_quirk pci_serial_quirks[] = {
+static const struct pci_serial_quirk pci_serial_quirks[] __refconst = {
-----------------
-static struct notifier_block __cpuinitdata cpuid_class_cpu_notifier =
+static struct notifier_block cpuid_class_cpu_notifier __refdata =
----------------
-static int threshold_cpu_callback(struct notifier_block *nfb,
+static int __ref threshold_cpu_callback(struct notifier_block *nfb,
[The above is just random samples].
Note: No modifications were needed in modpost
to support the new sections due to the newly introduced
blacklisting.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Remove the deprecated __attribute_used__.
[Introduce __section in a few places to silence checkpatch /sam]
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Introducing separate sections for __dev* (HOTPLUG),
__cpu* (HOTPLUG_CPU) and __mem* (MEMORY_HOTPLUG)
allows us to do a much more reliable Section mismatch
check in modpost. We are no longer dependent on the actual
configuration of for example HOTPLUG.
This has the effect that all users see many more
Section mismatch warnings than before because they
were almost all hidden when HOTPLUG was enabled.
The advantage of this is that when building a piece
of code it is much more likely that the Section
mismatch errors are spotted, and the warnings will
feel less random in nature.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Greg KH <greg@kroah.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Adrian Bunk <bunk@kernel.org>
auxvec.h, i2c-dev.h and vt.h *should* be unifdef'ed; i2o-dev.h does not need
unifdef'ing.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (24 commits)
HID: ADS/Tech Radio si470x needs blacklist entry
HID: Logitech Extreme 3D needs NOGET quirk
HID: Refactor MS Presenter 8K key mapping
HID: MS Presenter mapping for PID 0x0701
HID: Support Samsung IR remote
HID: fix compilation of hidbp drivers without usbhid
HID: Blacklist the Gretag-Macbeth Huey display colorimeter
HID: the `bit' in hidinput_mapping_quirks() is an out parameter
HID: remove redundant WARN_ON()s in order not to scare users
HID: force hiddev creation for SONY PS3 controller
HID: Use hid blacklist in usbmouse/usbkbd
HID: proper handling of MS 4k and 6k devices
HID: remove unused variable in quirk event handler
HID: hid-input quirk for BTC 8193
HID: separate hid-input event quirks from generic code
HID: refactor mapping to input subsystem for quirky devices
HID: Microsoft Wireless Optical Desktop 3.0 quirk
HID: Add support for Logitech Elite keyboards
HID: add full support for Genius KB-29E
HID: fix a potential bug in pointer casting
...
* 'sg' of git://git.kernel.dk/linux-2.6-block:
SG: work with the SCSI fixed maximum allocations.
SG: Convert SCSI to use scatterlist helpers for sg chaining
SG: Move functions to lib/scatterlist.c and add sg chaining allocator helpers
Samsung USB remotes (0419:0001) are rejected by kernel 2.6.23, because the
report descriptor from the remote contains a 48 bit HID report field. HID 1.11
states: Fields may span at most 4 bytes.
This patch, based on 2.6.23, fixes this by modifying the internal report
descriptor in hid-quirks.c. Additional user space support (e.g. LIRC) is
required to fetch the information from the hiddev interface.
The burden to reconstruct the data is moved into userspace (lirc through hiddev).
There is no need to set HID_QUIRK_HIDDEV quirk, as the device has also output
applications, which trigger the creation of hiddev device automatically.
Signed-off-by: Robert Schedel <r.schedel@yahoo.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Fix a panic, by changing
hidinput_mapping_quirks(,, unsigned long *bit,)
to
hidinput_mapping_quirks(,, unsigned long **bit,)
The `bit' in this function is an out parameter.
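A minimal standalone sketch of the bug class, with made-up names (keybit, relbit and both functions are hypothetical, not the hid-input code): assigning to a pointer parameter only changes the callee's copy, so an out parameter needs one more level of indirection.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bitmaps the mapping code may choose between. */
unsigned long keybit[4];
unsigned long relbit[4];

/* Broken: 'bit' is a copy of the caller's pointer, so the choice made
 * here never reaches the caller. */
void map_quirk_by_value(int is_relative, unsigned long *bit)
{
	bit = is_relative ? relbit : keybit;
	(void)bit;
}

/* Fixed: the caller passes the address of its pointer and the choice
 * is stored through it. */
void map_quirk_by_pointer(int is_relative, unsigned long **bit)
{
	*bit = is_relative ? relbit : keybit;
}
```

With the first form the caller keeps using whatever its pointer held before the call, which in the worst case is an invalid address.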
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This removes the ugly IS_* macros used to distinguish devices that
need special handling in hid-input, and establishes proper
quirks for them.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The BTC 8193 keyboard handles its scrollwheel in a very non-standard way.
It produces two non-standard usages for scrolling up and down, in
both cases with a positive value equal to 1. We handle this by a temporary
mapping, which we then catch in the quirk event handler and remap to a
negative HWHEEL event in order to introduce correct behavior.
Also the button requires special mapping, as it triggers a standard-violating
usage code.
Reported in kernel.org bugzilla #9385
Reported-by: Kir Kolyshkin <kir@sacred.ru>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This patch also separates the hid-input quirks that have to be
applied at the time the event occurs, so that the generic code
handling HUT-compliant devices is not messed up by them too much.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Currently, the handling of mapping between hid and input for devices
that don't conform to HUT 1.12 specification is very messy -- no per-device
handling, no blacklists, conditions on idVendor and idProduct placed
all over the code.
This patch moves all the device-specific input mapping to a separate
file, and introduces a blacklist-style handling for non-standard
device-specific mappings.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Genius KB-29E has a broken report descriptor, which causes some of the
Consumer usages to appear incorrectly as Button usages. We fix it by
fixing the report descriptor before it is parsed.
Also a few of the keys violate the HUT standard, so they need special
handling. They currently fall into the "Reserved" range as per HUT 1.12.
Reported-by: Szekeres Istvan <szekeres@iii.hu>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This mouse distinguishes horizontal wheel from vertical by a special "pseudo
event" GenericDesktop.00b8, with values of 0 for vertical and 8 for horizontal
wheel. Because this event is supplied by the parser too late, we need to delay
a wheel event, wait for this one and send either REL_WHEEL or REL_HWHEEL to
input depending on the event value.
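The delay-and-decide logic described above can be sketched as a tiny state machine. This is a standalone illustration, not the driver code: struct wheel_state and both functions are made-up names, and treating every direction value other than 8 as vertical is an assumption of the sketch. The REL_* values match <linux/input.h>.

```c
#include <assert.h>

#define REL_HWHEEL 0x06	/* values as in <linux/input.h> */
#define REL_WHEEL  0x08

/* Hypothetical per-device state for the held-back wheel event. */
struct wheel_state {
	int pending;	/* a wheel event is being held back */
	int value;	/* its delta */
};

/* Step 1: the wheel usage arrives first; hold it instead of reporting. */
void wheel_event(struct wheel_state *s, int value)
{
	s->pending = 1;
	s->value = value;
}

/* Step 2: the pseudo event arrives (0 = vertical, 8 = horizontal).
 * Returns 1 and fills code/value if a held event should be reported now. */
int direction_event(struct wheel_state *s, int dir, int *code, int *value)
{
	if (!s->pending)
		return 0;
	*code = (dir == 8) ? REL_HWHEEL : REL_WHEEL;
	*value = s->value;
	s->pending = 0;
	return 1;
}
```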
Signed-off-by: Pavel Troller <patrol@sinus.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Preserve identifiers exposed in build and run time configuration though in
order not to break existing configurations.
This is in preparation for adding support for Apple aluminum USB keyboards.
Signed-off-by: Michel Daenzer <michel@tungstengraphics.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* orion: (26 commits)
[ARM] Orion: implement power-off method for QNAP TS-109/209
[ARM] Orion: add support for QNAP TS-109/TS-209
[ARM] Orion: I2C support
[I2C] i2c-mv64xxx: Don't set i2c_adapter.retries
[I2C] Split mv643xx I2C platform support
[ARM] Orion: enable CONFIG_RTC_DRV_M41T80 for D-Link DNS-323
[ARM] Orion defconfig
[ARM] Orion: add support for Orion/MV88F5181 based D-Link DNS-323
[ARM] Orion: MV88F5181 support bits
[ARM] Orion: Buffalo/Revogear Kurobox Pro support
[ARM] OrionNAS RD board support
[ARM] Orion: support for Marvell Orion-2 (88F5281) Development Board
[ARM] Orion: common platform setup for Gigabit Ethernet port
[ARM] Orion: platform device registration for UART, USB and NAND
[ARM] Orion: system timer support
[ARM] Orion edge GPIO IRQ support
[ARM] Orion: IRQ support
[ARM] Orion: provide GPIO method for enabling hardware assisted blinking
[ARM] Orion: GPIO support
[ARM] Orion: programable address map support
...
Conflicts:
arch/arm/Kconfig
arch/arm/Makefile
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
SCSI sg table allocation has a maximum size (of SCSI_MAX_SG_SEGMENTS,
currently 128) and this will cause a BUG_ON() in SCSI if something
tries an allocation over it. This patch adds a size limit to the
chaining allocator to allow the specification of the maximum
allocation size for chaining, so we always chain in units of the
maximum SCSI allocation size.
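A rough sketch of what chaining in units of a maximum allocation size means for the number of allocations. The function name is made up, and the rule that every chunk except the last gives up one slot to the chain link reflects how sg chaining works in general rather than anything stated in this message.

```c
#include <assert.h>

/* Number of separately allocated chunks needed to hold 'nents' entries
 * when no single chunk may exceed 'max_ents' slots. Every chunk except
 * the last spends one slot on the link to the next chunk.
 * Assumes max_ents >= 2. */
unsigned int sg_chunks_needed(unsigned int nents, unsigned int max_ents)
{
	unsigned int chunks = 0;

	while (nents) {
		chunks++;
		if (nents <= max_ents)
			break;			/* remainder fits in a final chunk */
		nents -= max_ents - 1;		/* full chunk: max_ents - 1 data slots */
	}
	return chunks;
}
```

With a limit of 128, a 128-entry table still fits in one allocation, while 129 entries already need two because the first chunk loses a slot to the link.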
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
These DMA drain buffer implementations in drivers are pretty horrible
to do in terms of manipulating the scatterlist. Plus they're being
done at least in drivers/ide and drivers/ata, so we now have code
duplication.
The one use case for this, as I understand it, is AHCI controllers doing
PIO mode to mmc devices but translating this to DMA at the controller
level.
So, what about adding a callback to the block layer that permits the
adding of the drain buffer for the problem devices? The idea is that
you'd do this in slave_configure after you find one of these devices.
The beauty of doing it in the block layer is that it quietly adds the
drain buffer to the end of the sg list, so it automatically gets mapped
(and unmapped) without anything unusual having to be done to the
scatterlist in drivers/scsi or drivers/ata and without any alteration to
the transfer length.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
syslets (or other threads/processes that want io context sharing) can
set this to enforce sharing of io context.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The io context sharing introduced a per-ioc spinlock, that would protect
the cfq io context lookup. That is a regression from the original, since
we never needed any locking there because the ioc/cic were process private.
The cic lookup is changed from an rbtree construct to a radix tree, which
we can then use RCU to make the reader side lockless. That is the performance
critical path; modifying the radix tree is only done on process creation
(when that process first does IO, actually) and on process exit (if that
process has done IO).
As it so happens, radix trees are also much faster for this type of
lookup where the key is a pointer. It's a very sparse tree.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This patch converts 'uptodate' arguments of no longer exported
interfaces, end_that_request_first/last, to 'error', and removes
internal conversions for it in blk_end_request interfaces.
Also, this patch removes no longer needed end_io_error().
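A sketch of the kind of conversion that becomes unnecessary once callers pass 'error' directly. The function name is made up, and the old 'uptodate' convention used here (positive means success, 0 means a generic I/O error, negative values already carry an errno) is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>

/* Map an old-style 'uptodate' value to a new-style 'error' value:
 * 0 on success, a negative errno otherwise. */
int uptodate_to_error(int uptodate)
{
	if (uptodate > 0)
		return 0;			/* success */
	return uptodate ? uptodate : -EIO;	/* keep errno, else generic -EIO */
}
```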
Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This patch removes the following functions:
o end_that_request_first()
o end_that_request_chunk()
and stops exporting the functions below:
o end_that_request_last()
Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This patch adds a variant of the interface, blk_end_bidi_request(),
which completes a bidi request.
A bidi request must be completed as a whole, both rq and rq->next_rq
at once. So the interface has 2 arguments for completion size.
As for ->end_io, only rq->end_io is called (rq->next_rq->end_io is not
called). So if special completion handling is needed, the handler
must be set to rq->end_io.
And the handler must take care of freeing next_rq too, since
the interface doesn't take care of it if rq->end_io is not NULL.
Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This patch adds a variant of the interface, blk_end_request_callback(),
which has a driver callback feature.
Drivers may need to do special work between end_that_request_first()
and end_that_request_last().
For such drivers, blk_end_request_callback() allows them to pass
a callback function which is called between end_that_request_first()
and end_that_request_last().
This interface is only for fallback of other blk_end_request interfaces.
Drivers should avoid their tricky behaviors and use other interfaces
as much as possible.
Currently, only one driver, ide-cd, needs this interface.
So this interface should/will be removed, after the driver removes
such tricky behaviors.
o ide-cd (cdrom_newpc_intr())
In PIO mode, cdrom_newpc_intr() needs to defer end_that_request_last()
until the device clears DRQ_STAT and raises an interrupt after
end_that_request_first().
So end_that_request_first() and end_that_request_last() are called
separately in cdrom_newpc_intr().
This means blk_end_request_callback() has to return without
completing the request even if there is no leftover in the request.
To satisfy the requirement, the callback function has a return value
so that drivers can tell blk_end_request_callback() to return
without completing the request.
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This patch adds/exports functions to get the size of request in bytes.
They are useful because blk_end_request interfaces take bytes
as a completed I/O size instead of sectors.
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This patch adds 2 new interfaces for request completion:
o blk_end_request() : called without queue lock
o __blk_end_request() : called with queue lock held
blk_end_request takes 'error' as an argument instead of 'uptodate',
which current end_that_request_* take.
The meanings of values are below and the value is used when bio is
completed.
0 : success
< 0 : error
Some device drivers call some generic functions below between
end_that_request_{first/chunk} and end_that_request_last().
o add_disk_randomness()
o blk_queue_end_tag()
o blkdev_dequeue_request()
These are called in the blk_end_request interfaces as a part of
generic request completion.
So all device drivers end up calling the above functions.
To decide whether to call blkdev_dequeue_request(), blk_end_request
uses list_empty(&rq->queuelist) (blk_queued_rq() macro is added for it).
So drivers must re-initialize it using INIT_LIST_HEAD() or so before calling
blk_end_request if they use it for their own purposes.
(Currently, there is no driver which completes a request without
re-initializing the queuelist after using it. So rq->queuelist
can be used for the purpose above.)
"Normal" drivers can be converted to use blk_end_request()
in a standard way shown below.
a) end_that_request_{chunk/first}
spin_lock_irqsave()
(add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request())
end_that_request_last()
spin_unlock_irqrestore()
=> blk_end_request()
b) spin_lock_irqsave()
end_that_request_{chunk/first}
(add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request())
end_that_request_last()
spin_unlock_irqrestore()
=> spin_lock_irqsave()
__blk_end_request()
spin_unlock_irqrestore()
c) spin_lock_irqsave()
(add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request())
end_that_request_last()
spin_unlock_irqrestore()
=> blk_end_request() or spin_lock_irqsave()
__blk_end_request()
spin_unlock_irqrestore()
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Manually doing chained sg lists is not trivial, so add some helpers
to make sure that drivers get it right.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Let queue_dma_alignment return 0 if it was specifically set to 0.
This permits devices with no particular alignment restrictions to
use arbitrary user space buffers without copying.
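A standalone sketch of the behavioural change. struct sketch_queue and both function names are made up, and the fallback mask of 511 is an assumption of the sketch rather than something stated in this message.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_DEFAULT_ALIGNMENT 511	/* assumed fallback mask */

/* Hypothetical stand-in for the request queue. */
struct sketch_queue {
	int dma_alignment;
};

/* Before: an explicit 0 was indistinguishable from "not set". */
int dma_alignment_old(const struct sketch_queue *q)
{
	int retval = SKETCH_DEFAULT_ALIGNMENT;

	if (q && q->dma_alignment)
		retval = q->dma_alignment;
	return retval;
}

/* After: the fallback applies only when there is no queue at all. */
int dma_alignment_new(const struct sketch_queue *q)
{
	return q ? q->dma_alignment : SKETCH_DEFAULT_ALIGNMENT;
}
```

For this to work the queue has to start out with the default value, so that 0 is only ever seen when a driver set it on purpose.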
Signed-off-by: Pete Wyckoff <pw@osc.edu>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Since the SCSI layer uses the request queues from the block layer, blktrace can
also be used to trace the requests to all SCSI devices (like SCSI tape drives),
not only disks. The only missing part is the ioctl interface to start and stop
tracing.
This patch adds the SETUP, START, STOP and TEARDOWN ioctls from blktrace to the
sg device files. With this change, blktrace can be used for SCSI devices like
for disks, e.g.: blktrace -d /dev/sg1 -o - | blkparse -i -
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This adds a i2c_new_dummy() primitive to help work with devices
that consume multiple addresses, which include many I2C eeproms
and at least one RTC.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
The i2c_adapter.clients list of i2c_client nodes duplicates driver
model state. This patch starts removing that list, letting us remove
most existing users of those i2c-core lists.
* The core I2C code now iterates over the driver model's list instead
of the i2c-internal one in some places where it's safe:
- Passing a command/ioctl to each client, a mechanism
used almost exclusively by DVB adapters;
- Device address checking, in both i2c-core and i2c-dev.
* Provide i2c_verify_client() to use with driver model iterators.
* Flag the relevant i2c_adapter and i2c_client fields as deprecated,
to help prevent new users from appearing.
For the moment the list needs to stick around, since some issues show
up when deleting devices created by legacy I2C drivers. (They don't
follow standard driver model rules. Removing those devices can cause
self-deadlocks.)
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Discard all I2C driver IDs that aren't used anywhere. That's not just a
couple of them, but more like 49 or one quarter of all defined IDs! And
this is just a first pass, next will come all IDs that are set but
never used, or used but never set.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Move the tps65010 header file from the OMAP arch directory to the
more generic <linux/i2c/...> directory, and remove the spurious
dependency of this driver on OMAP.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
i2c_driver.list is superfluous, this list duplicates the one
maintained by the driver core. Drop it.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
i2c_adapter.list is superfluous, this list duplicates the one
maintained by the driver core. Drop it.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Use more standard prototypes for i2c_use_client() and
i2c_release_client(). The former now returns a pointer to the client,
and the latter no longer returns anything. This matches what all other
subsystems do.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: David Brownell <david-b@pacbell.net>
Don't implement our own reference counting mechanism for i2c clients
when the driver model already has one.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: David Brownell <david-b@pacbell.net>
This patch allows much of the I2C client address data to move from initdata
into text.
Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
This patch contains the overdue removal of three I2C drivers.
[JD: In fact only i2c-ixp4xx can be removed at the moment, the other two
platforms don't implement the generic GPIO layer yet.]
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
This patch contains the scheduled removal of legacy I2C RTC drivers with
replacement drivers.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Based on the earlier work by Tejun Heo.
All users are gone so we can finally remove it.
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Based on the earlier work by Tejun Heo.
Switch set_xfer_rate() to use REQ_TYPE_ATA_TASKFILE requests
and make ide_wait_cmd() static.
There should be no functionality changes caused by this patch.
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Use wait_drive_not_busy() in drive_cmd_intr().
v2:
* Fix wait_drive_not_busy() comment (noticed by Sergei).
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
task_end_request() modified to always call ide_end_drive_cmd()
for taskfile requests. Previously, ide_end_drive_cmd() was
called only when IDE_TFLAG_FLAGGED was set. Also,
ide_dma_intr() is modified to use task_end_request().
Enables TASKFILE ioctls to get valid register outputs on
successful completion.
Bart:
- ported it over recent IDE changes
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add IDE_TFLAG_{HOB,TF,DEVICE} defines.
* Set IDE_TFLAG_IN_* flags in {do_rw,ide_no_data,ide_raw}_taskfile() users.
* Remove no longer needed ->tf_flags setup from ide_end_drive_cmd().
There should be no functionality changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
In ide_taskfile_ioctl(), there was a race condition involving
drive->io_32bit. It was cleared and restored during ioctl
requests but there was no synchronization with other requests.
So, other requests could execute with the altered ->io_32bit
setting or updated drive->io_32bit could be overwritten by
ide_taskfile_ioctl().
This patch adds IDE_TFLAG_IO_16BIT flag to indicate to
ide_pio_datablock() that 16-bit I/O is needed regardless of
drive->io_32bit setting.
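The idea can be sketched as follows: the width is decided from a flag carried by the request itself instead of by temporarily rewriting state that other requests also read. The struct names, the function name and the bit value of the flag are all made up for illustration; only the flag name and io_32bit come from this patch.

```c
#include <assert.h>

#define IDE_TFLAG_IO_16BIT (1u << 0)	/* bit value chosen for the sketch only */

struct sketch_drive { int io_32bit; };		/* shared by all requests */
struct sketch_task  { unsigned int tf_flags; };	/* private to one request */

/* Returns the PIO transfer width in bits for this request. The shared
 * drive setting is only read, never modified, so concurrent requests
 * cannot observe a temporarily altered value. */
int pio_width(const struct sketch_drive *drive, const struct sketch_task *task)
{
	if (task->tf_flags & IDE_TFLAG_IO_16BIT)
		return 16;
	return drive->io_32bit ? 32 : 16;
}
```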
Bart:
- ported it over recent IDE changes
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Remove broken disk byte-swapping support:
- it can cause a data corruption on SMP (or if using PREEMPT on UP)
- all data coming from disk are byte-swapped by taskfile_*_data() which
results in incorrect identify data being reported by /proc/ide/ and IOCTLs
- "hdx=bswap/byteswap" kernel parameter has been broken on m68k host drivers
(including Atari/Q40 ones) since 2.5.x days (because of 'hwif' zero-ing)
- byte-swapping is limited to PIO transfers (for working with TiVo disks on
x86 machines, using user-space solutions or dm-byteswap should result in
much better performance because DMA can be used)
For previous discussions please see:
http://www.ussg.iu.edu/hypermail/linux/kernel/0201.0/0768.html
http://lkml.org/lkml/2004/2/28/111
[ I have dm-byteswap device mapper target if somebody is interested
(patch is for 2.6.4 though but I'll dust it off if needed). ]
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Make remaining built-in only IDE host drivers modular, add ide-scan-pci.c
file for probing PCI host drivers registered with IDE core (special case
for built-in IDE and CONFIG_IDEPCI_PCIBUS_ORDER=y) and then take care of
the ordering in which all IDE host drivers are probed when IDE is built-in
during link time.
* Move probing of gayle, falconide, macide, q40ide and buddha (m68k arch
specific) host drivers, before PCI ones (no PCI on m68k), ide-cris (cris
arch specific), cmd640 (x86 arch specific) and pmac (ppc arch specific).
* Move probing of ide-cris (cris arch specific) host driver before cmd640
(x86 arch specific).
* Move probing of mpc8xx (ppc specific) host driver before ide-pnp (depends
on ISA and none of the ppc platforms that use mpc8xx support ISA) and ide-h8300
(h8300 arch specific).
* Add "probe_vlb" kernel parameter to cmd640 host driver and update
Documentation/ide.txt accordingly.
* Make IDE_ARM config option visible so it can also be disabled if needed.
* Remove bogus comment from ide.c while at it.
v2:
* Fix two issues spotted by Sergei:
- replace ENOMEM error value by ENOENT in ide-h8300 host driver
- fix MODULE_PARM_DESC() in cmd640 host driver
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Rename init_hwif_data() to ide_init_port_data() and export it.
* For all users of ide_register_hw() with 'initializing' argument set
hwif->present and hwif->hold are always zero so convert these host
drivers to use ide_find_port()+ide_init_port_data()+ide_init_port_hw()
instead (also no need for init_hwif_default() call since the setup
done by it gets over-ridden by ide_init_port_hw() call).
* Drop 'initializing' argument from ide_register_hw().
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add ide_init_port_hw() helper.
* rapide.c: convert rapide_locate_hwif() to rapide_setup_ports()
and use ide_init_port_hw().
* ide_platform.c: convert plat_ide_locate_hwif() to plat_ide_setup_ports()
and use ide_init_port_hw().
* sgiioc4.c: use ide_init_port_hw().
* pmac.c: add 'hw_regs_t *hw' argument to pmac_ide_setup_device(),
setup 'hw' in pmac_ide_{macio,pci}_attach() and use ide_init_port_hw()
in pmac_ide_setup_device().
This patch is a preparation for the future changes in the IDE probing code.
There should be no functionality changes caused by this patch.
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Anton Vorontsov <avorontsov@ru.mvista.com>
Cc: Jeremy Higdon <jeremy@sgi.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Fix build break of powerpc holly_defconfig:
In file included from arch/powerpc/platforms/embedded6xx/holly.c:24:
include/linux/ide.h:1206: error: 'CONFIG_IDE_MAX_HWIFS' undeclared here (not in a function)
There's no need to have a sized array in the prototype, might as well
turn it into a pointer.
It could probably be argued that large parts of the include file can be
covered under #ifdef CONFIG_IDE, but that's a larger undertaking.
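The reason the pointer form is equivalent: in C, an array parameter is
adjusted to a pointer, so the size in the prototype carries no meaning for the
compiler. A small sketch (DEMO_MAX_HWIFS and sum_idx() are stand-ins invented
for this example):

```c
#include <assert.h>

#define DEMO_MAX_HWIFS 4	/* stand-in for a config-dependent constant */

/* Both declarations name the same function type; the array size is ignored. */
int sum_idx(unsigned char idx[DEMO_MAX_HWIFS], int n);
int sum_idx(unsigned char *idx, int n);

int sum_idx(unsigned char *idx, int n)
{
	int i, sum = 0;

	for (i = 0; i < n; i++)
		sum += idx[i];
	return sum;
}
```

With the pointer form the header no longer needs the constant to be defined.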
Signed-off-by: Olof Johansson <olof@lixom.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Rename ide_device_add() to ide_device_add_all() and make it accept
'u8 idx[MAX_HWIFS]' instead of 'u8 idx[4]' as an argument.
* Add ide_device_add() wrapper for ide_device_add_all().
* Convert ide_generic_init() to use ide_device_add_all().
* Remove no longer needed ideprobe_init().
There should be no functionality changes caused by this patch.
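The wrapper pattern described above could look roughly like the following
sketch. The demo_ names, the value of DEMO_MAX_HWIFS and the use of 0xff as
the "no port" marker are assumptions made for illustration:

```c
#include <assert.h>

#define DEMO_MAX_HWIFS 10	/* assumed value for this sketch */

typedef unsigned char u8;

static u8 last_idx[DEMO_MAX_HWIFS];	/* what the full-size variant received */

/* Stand-in for the full-size variant: just records its argument. */
static void demo_device_add_all(u8 idx[DEMO_MAX_HWIFS])
{
	int i;

	for (i = 0; i < DEMO_MAX_HWIFS; i++)
		last_idx[i] = idx[i];
}

/* Wrapper keeping the four-entry interface; unused slots become 0xff. */
static void demo_device_add(u8 idx[4])
{
	u8 idx_all[DEMO_MAX_HWIFS];
	int i;

	for (i = 0; i < DEMO_MAX_HWIFS; i++)
		idx_all[i] = (i < 4) ? idx[i] : 0xff;

	demo_device_add_all(idx_all);
}
```

Callers that only ever deal with up to four ports keep their short arrays.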
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Assign drive->quirk_list in ->quirkproc implementations:
- hpt366.c::hpt3xx_quirkproc()
- pdc202xx_new.c::pdcnew_quirkproc()
- pdc202xx_old.c::pdc202xx_quirkproc()
* Make ->quirkproc void.
* Move calling ->quirkproc from do_identify() to probe_hwif().
* Convert it821x_fixups() to it821x_quirkproc() in it821x.c.
* Convert siimage_fixup() to sil_quirkproc() in siimage.c, also remove
no longer needed drive->present check from is_dev_seagate_sata().
* Convert ide_undecoded_slave() to accept 'drive' instead of 'hwif'
as an argument. Then convert ide_register_hw() to accept 'quirkproc'
argument instead of 'fixup' one.
* Remove no longer needed ->fixup method.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Merge ->dma_host_{on,off} methods into ->dma_host_set method
which takes 'int on' argument.
There should be no functionality changes caused by this patch.
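A minimal sketch of the shape of such a merge, using invented demo_ types
rather than the real ide_hwif_t and ide_drive_t:

```c
#include <assert.h>

struct demo_drive {
	int dma_enabled;
};

struct demo_hwif {
	/* One method with an 'on' flag replaces separate on/off hooks. */
	void (*dma_host_set)(struct demo_drive *drive, int on);
};

static void demo_dma_host_set(struct demo_drive *drive, int on)
{
	drive->dma_enabled = on ? 1 : 0;
}

/* The former "on" and "off" call sites reduce to one call each. */
static void demo_dma_on(struct demo_hwif *hwif, struct demo_drive *drive)
{
	hwif->dma_host_set(drive, 1);
}

static void demo_dma_off(struct demo_hwif *hwif, struct demo_drive *drive)
{
	hwif->dma_host_set(drive, 0);
}
```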
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Make ide_dma_off_quietly() and __ide_dma_on() always available.
* Drop "__" prefix from __ide_dma_on().
* Check for presence of ->dma_host_on instead of ->ide_dma_on.
* Convert all users of ->ide_dma_on and ->dma_off_quietly methods
to use ide_dma_on() and ide_dma_off_quietly() instead.
* Remove no longer needed ->ide_dma_on and ->dma_off_quietly methods
from ide_hwif_t.
* Make ide_dma_on() void.
There should be no functionality changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Fix SWDMA/MWDMA masks in cy82c693_chipset.
* Add IDE_HFLAG_CY82C693 host flag and use it in ide_tune_dma() to
check whether the DMA should be enabled even if ide_max_dma_mode()
fails.
* Convert cy82c693_dma_enable() to become cy82c693_set_dma_mode()
and remove no longer needed cy82c693_ide_dma_on(). Then set
IDE_HFLAG_CY82C693 instead of IDE_HFLAG_TRUST_BIOS_FOR_DMA in
cy82c693_chipset.
* Bump driver version.
As a result of this patch cy82c693 driver will configure and use DMA on
all SWDMA0-2 and MWDMA0-2 capable ATA devices instead of relying on BIOS.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
I2C adapter drivers that retry on NACK are supposed to handle the retries
by themselves, so there's no point in setting .retries in drivers that don't.
As this retry mechanism is going away (at least in its current form),
clean this up now so that we don't get build failures later.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Mark A. Greer <mgreer@mvista.com>
The motivation for this change is to allow other chips, like the
Marvell Orion ARM SoC family, to use the existing i2c-mv64xxx driver.
Signed-off-by: Tzachi Perelstein <tzachi@marvell.com>
Acked-by: Nicolas Pitre <nico@marvell.com>
Acked-by: Dale Farnsworth <dale@farnsworth.org>
Acked-by: Mark A. Greer <mgreer@mvista.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (200 commits)
[SCSI] usbstorage: use last_sector_bug flag universally
[SCSI] libsas: abstract STP task status into a function
[SCSI] ultrastor: clean up inline asm warnings
[SCSI] aic7xxx: fix firmware build
[SCSI] aacraid: fib context lock for management ioctls
[SCSI] ch: remove forward declarations
[SCSI] ch: fix device minor number management bug
[SCSI] ch: handle class_device_create failure properly
[SCSI] NCR5380: fix section mismatch
[SCSI] sg: fix /proc/scsi/sg/devices when no SCSI devices
[SCSI] IB/iSER: add logical unit reset support
[SCSI] don't use __GFP_DMA for sense buffers if not required
[SCSI] use dynamically allocated sense buffer
[SCSI] scsi.h: add macro for enclosure bit of inquiry data
[SCSI] sd: add fix for devices with last sector access problems
[SCSI] fix pcmcia compile problem
[SCSI] aacraid: add Voodoo Lite class of cards.
[SCSI] aacraid: add new driver features flags
[SCSI] qla2xxx: Update version number to 8.02.00-k7.
[SCSI] qla2xxx: Issue correct MBC_INITIALIZE_FIRMWARE command.
...
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (67 commits)
fix drivers/ata/sata_fsl.c double-decl
[libata] Prefer SCSI_SENSE_BUFFERSIZE to sizeof()
pata_legacy: Merge winbond support
ata_generic: Cenatek support
pata_winbond: error return
pata_serverworks: Fix cable types and cosmetics
pata_mpc52xx: remove un-needed assignment
libata: fix off-by-one in error categorization
ahci: factor out AHCI enabling and enable AHCI before reading CAP
ata_piix: implement SIDPR SCR access
ata_piix: convert to prepare - activate initialization
libata: factor out ata_pci_activate_sff_host() from ata_pci_one()
[libata] Prefer SCSI_SENSE_BUFFERSIZE to sizeof()
pata_legacy: resychronize with upstream changes and resubmit
[libata] pata_legacy: typo fix
[libata] pata_winbond: update for new ->data_xfer hook
pata_pcmcia: convert to new data_xfer prototype
libata annotations and fixes
libata: use dev_driver_string() instead of "libata" in libata-sff.c
ata_piix: kill unused constants and flags
...
This allows others to use the DLM constants without being tied to the
function API of fs/dlm.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (509 commits)
V4L/DVB (7078): radio: fix sf16fmi section mismatch
V4L/DVB (7077): bt878: remove handcrafted PCI subsystem ID check
V4L/DVB (7075): Make a local function static
V4L/DVB (7074): DiB7000P: correct tuning problem for 7MHz channel
V4L/DVB (7073): DiB7070: Reception quality improved
V4L/DVB (7072): sets the MT2060 IF1 frequency according to EEPROM
V4L/DVB (7071): DiB0700: Start streaming the right way
V4L/DVB (7070): Fix some tuning problems
V4L/DVB (7069): Support for myTV.t
V4L/DVB (7068): Add support for WinTV Nova-T-CE driver
V4L/DVB (7067): fix autoserach in the Hauppauge NOVA-T 500
V4L/DVB (7066): ASUS My Cinema U3000 Mini DVBT Tuner
V4L/DVB (7065): Artec T14BR patches
V4L/DVB (7063): xc5000: Fix OOPS caused by missing firmware
V4L/DVB (7062): radio-si570x: Some fixes and new USB ID addition
V4L/DVB (7061): radio-si470x: Some cleanups
V4L/DVB (7060): em28xx: remove has_tuner
V4L/DVB (7059): cx88: Ensure the tuner is reset correctly
V4L/DVB (7058): IR corrections for the Pinnacle 800i
V4L/DVB (7056): tuner: suppress obsolete tuner i2c address warning for XC5000 tuners
...
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (67 commits)
ide: remove redundant DMA blacklist check from __ide_dma_on()
ide: cleanup ide_set_dma()
ide: remove redundant ->ide_dma_on call from set_using_dma()
sc1200: move DMA timings to timing tables
ide: add IDE_HFLAG_ABUSE_SET_DMA_MODE host flag
sis5513: factor out UDMA programming code
pdc202xx_new: move PIO programming code to pdcnew_set_pio_mode()
ide: make 'extra' field in struct ide_port_info u8
ide: kill duplicate code in ide_dump_{ata,atapi}_status()
ide-disk: use ide_get_lba_addr()
ide: printk fix
ide: add ide_tf_read() helper
ide: fix registers loading order in ide_dump_ata_status()
ide-disk: use do_rw_taskfile() (take 2)
ide-disk: add ide_tf_set_cmd() helper
ide-disk: extend timeout for PIO-in commands
ide: remove 'handler' field from ide_task_t (take 2)
ide: use ->data_phase to set ->handler in do_rw_taskfile()
ide: convert do_rw_taskfile() to use ->data_phase
ide: merge flagged_taskfile() into do_rw_taskfile()
...
* Add IDE_HFLAG_ABUSE_SET_DMA_MODE host flag and use it to decide
what to do with transfer modes < XFER_PIO_0 in ide_set_xfer_rate().
* Set IDE_HFLAG_ABUSE_SET_DMA_MODE in host drivers that need it
(aec62xx, amd74xx, cs5520, cs5535, hpt34x, hpt366, pdc202xx_old,
serverworks, tc86c001 and via82cxxx) and cleanup ->set_dma_mode
methods in host drivers that don't (IDE core code guarantees that
->set_dma_mode will be called only for modes which are present
in SWDMA/MWDMA/UDMA masks).
While at it:
* Add IDE_HFLAGS_HPT34X/HPT3XX/PDC202XX/SVWKS define in
hpt34x/hpt366/pdc202xx_old/serverworks host driver.
There should be no functionality changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
The maximum value used currently for 'extra' field in struct ide_port_info
is 240.
Make 'extra' u8 so it packs nicely together with enablebits[] and 'chipset'
fields (ide_pci_enablebit_t is 3 bytes and hwif_chipset_t is 1 byte).
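The packing effect can be checked with a small sketch. The struct layouts
below are simplified stand-ins (two enablebits entries are assumed), not the
real struct ide_port_info:

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char u8;

struct demo_enablebit {
	u8 reg, mask, val;	/* 3 bytes, as stated in the text */
};

struct demo_info_wide {
	struct demo_enablebit enablebits[2];
	u8 chipset;
	unsigned int extra;	/* needs its own aligned slot */
};

struct demo_info_narrow {
	struct demo_enablebit enablebits[2];
	u8 chipset;
	u8 extra;		/* sits in the byte right after 'chipset' */
};

/* Bytes saved per entry by narrowing 'extra'. */
static size_t demo_saved_bytes(void)
{
	return sizeof(struct demo_info_wide) - sizeof(struct demo_info_narrow);
}
```

The exact saving depends on the ABI; the narrow layout is never larger.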
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Export ide_get_lba_addr().
* Convert idedisk_{read_native,set}_max_address() to use ide_get_lba_addr().
* Remove incorrect comment from idedisk_read_native_max_address()
(noticed by Sergei).
There should be no functionality changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Factor out code reading taskfile registers from ide_end_drive_cmd()
to the new ide_tf_read() helper.
* Add IDE_TFLAG_IN_* taskfile flags to indicate the need to load
particular IDE taskfile register in ide_tf_read().
* Update ide_end_drive_cmd() to set respective IDE_TFLAG_IN_* taskfile flags.
* Add ide_get_lba_addr() for getting LBA sector address from taskfile struct.
* Factor out code getting sector address from ide_dump_ata_status()
to the new ide_dump_sector() function.
* Convert ide_dump_sector() to use ide_tf_read() and ide_get_lba_addr().
* Remove no longer needed ide_read_24().
The only change in functionality caused by this patch is that
ide_dump_ata_status() no longer prints "high"/"low" parts of LBA48
sector address (of course LBA48 sector address is still printed).
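For readers unfamiliar with how a sector address is assembled from taskfile
registers, here is a simplified sketch. The struct and function names are
invented for this example; only the register layout (LBA28 keeps its top four
bits in the Device register, LBA48 uses the "high order byte" registers) is
standard ATA:

```c
#include <assert.h>
#include <stdint.h>

/* Simplified taskfile; field names are assumptions for this sketch. */
struct demo_taskfile {
	uint8_t lbal, lbam, lbah, device;
	uint8_t hob_lbal, hob_lbam, hob_lbah;
};

static uint64_t demo_get_lba_addr(const struct demo_taskfile *tf, int lba48)
{
	uint32_t low = ((uint32_t)tf->lbah << 16) |
		       ((uint32_t)tf->lbam << 8) | tf->lbal;

	if (lba48) {
		uint32_t high = ((uint32_t)tf->hob_lbah << 16) |
				((uint32_t)tf->hob_lbam << 8) | tf->hob_lbal;

		return ((uint64_t)high << 24) | low;
	}

	/* LBA28: the top four address bits live in the Device register. */
	return ((uint64_t)(tf->device & 0x0f) << 24) | low;
}
```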
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add IDE_TFLAG_DMA_PIO_FALLBACK taskfile flag to indicate the need
to skip loading taskfile registers in do_rw_taskfile().
* Export do_rw_taskfile().
* Convert __ide_do_rw_disk() to use do_rw_taskfile().
* Unexport ide_tf_load().
* Unexport {pre_task_out,task_in}_intr() and make it static.
* Remove incorrect comment about do_rw_taskfile() from <linux/ide.h>.
There should be no functionality changes caused by this patch.
v2:
* Add missing blk_fs_request() check to task_dma_ok() (for VDMA).
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add IDE_TFLAG_CUSTOM_HANDLER taskfile flag and use it for internal requests
which require custom handlers. Check the flag in do_rw_taskfile() and set
handler accordingly.
* Cleanup ide_init_{specify,restore,setmult}_cmd() and rename it to
ide_tf_set_{specify,restore,setmult}_cmd().
* Make {set_geometry,recal,set_multmode}_intr() static.
* Remove no longer needed 'handler' field from ide_task_t.
v2:
* 'handler' in do_rw_taskfile() must be set to NULL initially.
There should be no functionality changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Use ->data_phase to set ->handler in do_rw_taskfile() instead of
setting ->handler in callers of ide_raw_taskfile()/do_rw_taskfile().
* Unexport task_no_data_intr() and make it static.
There should be no functionality changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Use task->data_phase in do_rw_taskfile() to decide what to do.
* task->prehandler is only used by TASKFILE[_MULTI]_OUT so just
use pre_task_out_intr() directly and remove no longer needed
'prehandler' field from ide_task_t.
* Remove no longer needed ide_pre_handler_t type.
There should be no functionality changes caused by this patch.
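The idea of deriving the handler from the data phase, instead of having each
caller pass one in, can be sketched as follows. The enum values and handler
names are hypothetical stand-ins, not the kernel's definitions:

```c
#include <assert.h>
#include <stddef.h>

enum demo_phase { DEMO_NO_DATA, DEMO_PIO_IN, DEMO_PIO_OUT, DEMO_DMA };

typedef int (*demo_handler_t)(void);

static int demo_no_data_intr(void)      { return 0; }
static int demo_task_in_intr(void)      { return 1; }
static int demo_pre_task_out_intr(void) { return 2; }

/* Pick the handler from the data phase alone; DMA needs none here. */
static demo_handler_t demo_pick_handler(enum demo_phase phase)
{
	switch (phase) {
	case DEMO_NO_DATA:
		return demo_no_data_intr;
	case DEMO_PIO_IN:
		return demo_task_in_intr;
	case DEMO_PIO_OUT:
		return demo_pre_task_out_intr;
	default:
		return NULL;
	}
}
```

Once the selection lives in one place, the per-task handler fields become
redundant and the handlers themselves can be made static.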
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Based on the earlier work by Tejun Heo.
task->data_phase == TASKFILE_MULTI_{IN,OUT} vs drive->mult_count == 0
check is needed also for ide_taskfile_ioctl() requests that don't have
IDE_TFLAG_FLAGGED taskfile flag set.
Cc: Tejun Heo <htejun@gmail.com>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add IDE_TFLAG_IN_DATA taskfile flag to indicate the need of reading
IDE_DATA_REG in ide_end_drive_cmd().
Set the new flag in ide_taskfile_ioctl() if ->in_flags.b.data is set.
* Add IDE_TFLAG_FLAGGED_SET_IN_FLAGS taskfile flag to indicate the
need of modifying ->in_flags in ide_taskfile_ioctl().
Set the new flag in flagged_taskfile() and move the code modifying
->tf_in_flags to ide_taskfile_ioctl().
While at it remove the bogus comment: ->tf_in_flags (except .b.data)
have no effect on selection of registers to read.
* Remove no longer needed 'tf_in_flags' field from ide_task_t.
As the result we finally have the internals of HDIO_DRIVE_TASKFILE ioctl
separated from the core IDE code.
There should be no functionality changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add 'data_buf' and 'nsect' variables in ide_taskfile_ioctl()
to cache data buffer pointer and number of sectors to transfer
(this allows us to have only one ide_diag_taskfile() call).
* Add IDE_TFLAG_WRITE taskfile flag and use it to check whether
the REQ_RW request flag should be set.
* Move ->command_type handling from ide_diag_taskfile() to
ide_taskfile_ioctl() and use ->req_cmd instead of ->command_type.
* Add 'nsect' parameter to ide_raw_taskfile().
* Merge ide_diag_taskfile() into ide_raw_taskfile().
* Initialize ->data_phase explicitly in idedisk_prepare_flush(),
ide_start_power_step() and ide_disk_special().
* Remove no longer needed 'command_type' field from ide_task_t.
* Add #ifndef/#endif __KERNEL__ to <linux/hdreg.h> around the
IDE_DRIVE_TASK_* and TASKFILE_* defines no longer used by the kernel.
There should be no functionality changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Given that:
* hpt366.c::hpt3xx_intrproc() is the only user of hwif->intrproc
* hpt366.c::hpt3xx_quirkproc() sets drive->quirk_list to 1 for quirky drives
which is a value unique to hpt366 host driver
we can remove hwif->intrproc and just check for drive->quirk_list == 1
in ide_do_request().
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add ide_pktcmd_tf_load() helper and convert ATAPI device drivers to use it.
There should be no functionality changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Remove atapi_ireason_t.
While at it:
* replace 'HWIF(drive)' by 'drive->hwif' (or just 'hwif' where possible)
v2:
* v1 had CD and IO bits reversed in many places.
* Use CD and IO defines from <linux/hdreg.h>.
v3:
* Fix incorrect "(ireason & IO) == test_bit()". (Noticed by Sergei)
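The class of bug fixed in v3 is comparing a masked bit against a 0/1 value.
A standalone sketch follows; the DEMO_ constants are assumed to follow the
ATAPI interrupt reason layout and the function names are invented:

```c
#include <assert.h>

#define DEMO_CD 0x01	/* command/data bit of the interrupt reason */
#define DEMO_IO 0x02	/* I/O direction bit of the interrupt reason */

/* Buggy: a masked value (0 or 2) is compared against a boolean (0 or 1). */
static int demo_dir_matches_buggy(unsigned char ireason, int writing)
{
	return (ireason & DEMO_IO) == !writing;
}

/* Fixed: normalise the masked bit to 0/1 before comparing. */
static int demo_dir_matches(unsigned char ireason, int writing)
{
	return !!(ireason & DEMO_IO) == !writing;
}
```

With the I/O bit set and a read in progress, the buggy form compares 2 with 1
and reports a mismatch even though the direction is right.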
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Remove ata_nsector_t, ata_data_t (unused) and atapi_bcount_t.
While at it:
* replace 'HWIF(drive)' by 'hwif'
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Remove atapi_feature_t.
While at it:
* replace 'HWIF(drive)' by 'hwif'
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Remove atapi_error_t.
While at it:
* replace 'HWIF(drive)' by 'drive->hwif'
v2:
* Add {ILI,EOM,LFS}_ERR defines to <linux/hdreg.h>.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Remove ata_status_t (unused) and atapi_status_t.
While at it:
* replace 'HWIF(drive)' by 'drive->hwif' (or just 'hwif' where possible)
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
special_t is used only internally by the IDE subsystem (it isn't
related to hardware registers and isn't exported to the user-space).
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Based on the earlier work by Tejun Heo.
All users are gone so we can finally remove it.
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add IDE_TFLAG_OUT_DEVICE taskfile flag to indicate the need of writing
the Device register and handle it in ide_tf_load().
Update ide_tf_load() and {do_rw,flagged}_taskfile() users accordingly.
* Use struct ide_taskfile and ide_tf_load() in execute_drive_cmd().
* Make the debugging code dump all taskfile registers for both
REQ_ATA_TYPE_{CMD,TASK} requests and move it to ide_tf_load()
so it also covers REQ_ATA_TYPE_TASKFILE requests.
There should be no functionality changes caused by this patch
(unless DEBUG is defined).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Remove stale ide.h "configuration options":
* INITIAL_MULT_COUNT - always defined to 0
* SUPPORT_SLOW_DATA_PORTS - unused
* OK_TO_RESET_CONTROLLER - always defined to 1
* DISABLE_IRQ_NOSYNC - always defined to 0
Leave SUPPORT_VLB_SYNC (defined to 0 for CRIS and FRV, otherwise to 1)
for now but disallow overriding it by <asm/ide.h>.
There should be no functionality changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Based on the earlier work by Tejun Heo.
* Move setting IDE_TFLAG_LBA48 taskfile flag from do_rw_taskfile()
function to the callers.
* Add IDE_TFLAG_FLAGGED taskfile flag for flagged taskfiles coming
from ide_taskfile_ioctl(). Check it instead of ->tf_out_flags.all.
* Add IDE_TFLAG_OUT_DATA taskfile flag to indicate the need to load
IDE data register in ide_tf_load().
* Add IDE_TFLAG_OUT_* taskfile flags to indicate the need to load
particular IDE taskfile registers in ide_tf_load().
* Update do_rw_taskfile() and ide_tf_load() users to set respective
IDE_TFLAG_OUT_* taskfile flags.
* Add task_dma_ok() helper.
* Use IDE_TFLAG_FLAGGED taskfile flag to select HIHI mask in ide_tf_load().
* Use do_rw_taskfile() in flagged_taskfile().
* Remove no longer needed 'tf_out_flags' field from ide_task_t.
There should be no functionality changes caused by this patch.
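A sketch of per-register "out" flags driving a load helper. The flag names,
the reduced register set and the use of a plain struct as the "port" are all
assumptions made for illustration:

```c
#include <assert.h>

typedef unsigned char u8;

/* Hypothetical flag names; one bit per register to load. */
#define DEMO_TFLAG_OUT_NSECT	(1 << 0)
#define DEMO_TFLAG_OUT_LBAL	(1 << 1)
#define DEMO_TFLAG_OUT_LBAM	(1 << 2)
#define DEMO_TFLAG_OUT_LBAH	(1 << 3)

struct demo_tf {
	u8 nsect, lbal, lbam, lbah;
};

/* 'port' stands in for the device registers; only flagged ones are written. */
static void demo_tf_load(struct demo_tf *port, const struct demo_tf *tf,
			 unsigned int tf_flags)
{
	if (tf_flags & DEMO_TFLAG_OUT_NSECT)
		port->nsect = tf->nsect;
	if (tf_flags & DEMO_TFLAG_OUT_LBAL)
		port->lbal = tf->lbal;
	if (tf_flags & DEMO_TFLAG_OUT_LBAM)
		port->lbam = tf->lbam;
	if (tf_flags & DEMO_TFLAG_OUT_LBAH)
		port->lbah = tf->lbah;
}
```

Keeping the selection in flags means the loader needs no knowledge of where
the taskfile came from.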
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add ide_no_data_taskfile() helper and convert ide_raw_taskfile() w/ NO DATA
protocol users to use it instead.
* Set ->data_phase explicitly in ide_no_data_taskfile()
(TASKFILE_NO_DATA is defined as 0x0000).
* Unexport task_no_data_intr().
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Based on the earlier work by Tejun Heo.
* Add 'tf_flags' field (for taskfile flags) to ide_task_t.
* Add IDE_TFLAG_LBA48 taskfile flag for LBA48 taskfiles.
* Add IDE_TFLAG_NO_SELECT_MASK taskfile flag for __ide_do_rw_disk()
which doesn't use SELECT_MASK() (looks like a bug but it requires
some more investigation).
* Split off ide_tf_load() helper from do_rw_taskfile().
* Convert __ide_do_rw_disk() to use ide_tf_load().
There should be no functionality changes caused by this patch.
Cc: Tejun Heo <htejun@gmail.com>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Don't set write-only ide_task_t.hobRegister[6] and ide_task_t.hobRegister[7]
in idedisk_set_max_address_ext().
* Add struct ide_taskfile and use it in ide_task_t instead of tfRegister[]
and hobRegister[].
* Remove no longer needed IDE_CONTROL_OFFSET_HOB define.
* Add #ifndef/#endif __KERNEL__ around definitions of {task,hob}_struct_t.
While at it:
* Use ATA_LBA define for LBA bit (0x40) as suggested by Tejun Heo.
v2:
* Add missing newlines. (Noticed by Sergei)
* Use ~ATA_LBA instead of 0xBF. (Noticed by Sergei)
* Use unnamed unions for error/feature and status/command.
(Suggested by Sergei).
There should be no functionality changes caused by this patch.
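A sketch of the two ideas from v2, unnamed unions for registers that share an
address and ~ATA_LBA instead of a magic 0xBF. The struct below is a reduced
stand-in whose field set and order are assumptions:

```c
#include <assert.h>

#define ATA_LBA 0x40	/* LBA bit of the Device register, per the text */

typedef unsigned char u8;

/* Sketch only: not the real struct ide_taskfile. */
struct demo_taskfile {
	union { u8 error; u8 feature; };	/* read vs. write meaning */
	u8 nsect, lbal, lbam, lbah, device;
	union { u8 status; u8 command; };	/* read vs. write meaning */
};

static u8 demo_clear_lba(u8 device)
{
	return device & (u8)~ATA_LBA;	/* same mask as 0xBF, self-describing */
}
```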
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Remove task_ioreg_t typedef from the kernel code (but leave it
in <linux/hdreg.h> for #ifndef/#endif __KERNEL__ case).
While at it also move sata_ioreg_t typedef under #ifndef/#endif __KERNEL__.
v2:
Remove name of the second parameter from ide_execute_command() declaration.
(Noticed by Sergei).
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Convert cmd64x, hpt366 and pdc202xx_old host drivers to use
pci_resource_start(hwif->pci_dev, 4) instead of hwif->dma_master.
* Remove no longer needed ->dma_master field from ide_hwif_t.
v2:
* Use the more readable 'hwif->dma_base - (hwif->channel * 8)' instead of
pci_resource_start(hwif->pci_dev, 4).
v3:
* Use hwif->extra_base in hpt366/pdc202xx_old + some cosmetic fixups over v2
(suggested by Sergei).
v4:
* Correct offsets in hpt3xxn_set_clock().
v5:
* Use hwif->extra_base in hpt366 for _real_ this time. (Noticed by Sergei)
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Right now, the linux kernel (with scheduler statistics enabled) keeps track
of the maximum time a process is waiting to be scheduled. While the maximum
is a very useful metric, tracking average and total is equally useful
(at least for latencytop) to figure out the accumulated effect of scheduler
delays. The accumulated effect is important to judge the performance impact
of scheduler tuning/behavior.
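Tracking a total and a count next to the maximum is cheap. A minimal sketch,
with invented field and function names rather than the scheduler's own:

```c
#include <assert.h>
#include <stdint.h>

struct demo_wait_stats {
	uint64_t wait_max;	/* longest single wait */
	uint64_t wait_sum;	/* total time spent waiting */
	uint64_t wait_count;	/* number of waits recorded */
};

static void demo_account_wait(struct demo_wait_stats *s, uint64_t delta)
{
	if (delta > s->wait_max)
		s->wait_max = delta;
	s->wait_sum += delta;
	s->wait_count++;
}

/* The average is derived on demand from sum and count. */
static uint64_t demo_wait_avg(const struct demo_wait_stats *s)
{
	return s->wait_count ? s->wait_sum / s->wait_count : 0;
}
```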
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
LatencyTOP kernel infrastructure; it measures latencies in the
scheduler and tracks them system wide and per process.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
For some crazy reason (trying to work around hw problem in i810) I wanted
to use HZ around 4000.
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently all highres=off timers are run from softirq context, but
HRTIMER_CB_IRQSAFE_NO_SOFTIRQ timers expect to run from irq context.
Fix this up by splitting it similar to the highres=on case.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We need to teach no_hz about the rt throttling because it's tick driven.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Extend group scheduling to also cover the realtime classes. It uses the time
limiting introduced by the previous patch to allow multiple realtime groups.
The hard time limit is required to keep behaviour deterministic.
The algorithms used make the realtime scheduler O(tg), i.e. it scales linearly
with the number of task groups. This is the worst case behaviour, which I can't
seem to get out of; the avg. case of the algorithms can be improved, but I
focused on correctness and the worst case.
[ akpm@linux-foundation.org: move side-effects out of BUG_ON(). ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Very simple time limit on the realtime scheduling classes.
Allow the rq's realtime class to consume sched_rt_ratio of every
sched_rt_period slice. If the class exceeds this quota the fair class
will preempt the realtime class.
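The quota check can be sketched as below. How the ratio is actually encoded is
not stated here, so this example assumes a plain percentage
(DEMO_RATIO_SCALE), and all names are invented:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_RATIO_SCALE 100	/* assumption: ratio given in percent */

struct demo_rt_rq {
	uint64_t period_ns;	/* length of one accounting period */
	uint64_t rt_time_ns;	/* realtime runtime used in this period */
};

/* Returns 1 once the realtime class has used up its share of the period. */
static int demo_rt_exceeded(const struct demo_rt_rq *rq, unsigned int ratio)
{
	uint64_t quota = rq->period_ns * ratio / DEMO_RATIO_SCALE;

	return rq->rt_time_ns > quota;
}
```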
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use HR-timers (when available) to deliver an accurate preemption tick.
The regular scheduler tick that runs at 1/HZ can be too coarse when nice
levels are used. The fairness system will still keep the cpu utilisation 'fair'
by then delaying the task that got an excessive amount of CPU time, but we try
to minimize this by delivering preemption points spot-on.
The average frequency of this extra interrupt is sched_latency / nr_latency,
which need not be higher than 1/HZ; it's just that the distribution within the
sched_latency period is important.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Why do we even have cond_resched when real preemption
is on? It seems to be a waste of space and time.
Remove cond_resched with CONFIG_PREEMPT on.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Introduce a new rlimit that allows the user to set a runtime timeout on
the slice of real-time tasks. Once this limit is exceeded the task will
receive SIGXCPU.
So it measures runtime since the last sleep.
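From user space such a limit would be set through setrlimit(). The sketch
below assumes the limit is the one exposed on Linux as RLIMIT_RTTIME, with
values in microseconds; neither the name nor the unit is stated in this
changelog, and demo_set_rttime() is an invented helper:

```c
#include <assert.h>
#include <sys/resource.h>

/* Set the soft realtime runtime limit to 'usecs'; returns 0 on success. */
static int demo_set_rttime(rlim_t usecs)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_RTTIME, &rl) != 0)
		return -1;

	rl.rlim_cur = usecs;	/* must not exceed rl.rlim_max */
	return setrlimit(RLIMIT_RTTIME, &rl);
}
```

A realtime task that wants to survive the limit would install a SIGXCPU
handler and block (sleep) regularly so that its runtime counter restarts.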
Input and ideas by Thomas Gleixner and Lennart Poettering.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Lennart Poettering <mzxreary@0pointer.de>
CC: Michael Kerrisk <mtk.manpages@googlemail.com>
CC: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Move the task_struct members specific to rt scheduling together.
A future optimization could be to put sched_entity and sched_rt_entity
into a union.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch implements a new version of RCU which allows its read-side
critical sections to be preempted. It uses a set of counter pairs
to keep track of the read-side critical sections and flips them
when all tasks exit read-side critical section. The details
of this implementation can be found in this paper -
http://www.rdrop.com/users/paulmck/RCU/OLSrtRCU.2006.08.11a.pdf
and the article-
http://lwn.net/Articles/253651/
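The counter-pair idea can be shown with a single-threaded toy model. This is
only a sketch of the bookkeeping: it ignores the per-CPU counters, memory
barriers and state machine that a real implementation needs, and every name
in it is invented:

```c
#include <assert.h>

struct demo_rcu {
	int idx;	/* which counter new readers use */
	int count[2];	/* readers currently inside, per counter */
};

/* Enter a read-side section; the reader remembers which counter it used. */
static int demo_read_lock(struct demo_rcu *r)
{
	int idx = r->idx;

	r->count[idx]++;
	return idx;
}

static void demo_read_unlock(struct demo_rcu *r, int idx)
{
	r->count[idx]--;
}

/* Flip so new readers use the other counter; returns the old index. */
static int demo_flip(struct demo_rcu *r)
{
	int old = r->idx;

	r->idx = !old;
	return old;
}

/* The grace period for 'old' ends once all its readers have left. */
static int demo_grace_done(const struct demo_rcu *r, int old)
{
	return r->count[old] == 0;
}
```

A preempted reader keeps its counter non-zero, so the grace period simply
waits for it; readers that start after the flip do not delay it.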
This patch was developed as a part of the -rt kernel development and is
meant to provide better latencies when read-side critical sections of
RCU don't disable preemption. As a consequence of keeping track of RCU
readers, the readers have a slight overhead (optimizations in the paper).
This implementation co-exists with the "classic" RCU implementations
and can be switched to at compile time.
Also includes RCU tracing summarized in debugfs.
[ akpm@linux-foundation.org: build fixes on non-preempt architectures ]
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Reviewed-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch re-organizes the RCU code to enable multiple implementations
of RCU. Users of RCU continue to include rcupdate.h and the
RCU interfaces remain the same. This is in preparation for
subsequently merging the preemptible RCU implementation.
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch makes RCU use softirq instead of tasklets.
It also adds a memory barrier after raising the softirq
in order to ensure that the cpu sees the most recently updated
value of rcu->cur while processing callbacks.
The discussion of the related theoretical race pointed out
by James Huang can be found here --> http://lkml.org/lkml/2007/11/20/603
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Reviewed-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Dmitry Adamushko found that the current implementation of the RT
balancing code left out changes to the sched_setscheduler and
rt_mutex_setprio.
This patch addresses this issue by adding methods to the schedule classes
to handle being switched out of (switched_from) and being switched into
(switched_to) a sched_class. A method for handling priority changes
(prio_changed) is also added.
This patch also removes some duplicate logic between rt_mutex_setprio and
sched_setscheduler.
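How the three methods fit together can be sketched as follows. The demo_
types, the example classes and the event bitmask are invented for this
illustration and are not the scheduler's real structures:

```c
#include <assert.h>

struct demo_task;

struct demo_sched_class {
	void (*switched_from)(struct demo_task *p);
	void (*switched_to)(struct demo_task *p);
	void (*prio_changed)(struct demo_task *p, int oldprio);
};

struct demo_task {
	const struct demo_sched_class *sched_class;
	int prio;
	unsigned int events;	/* bitmask recording which hooks ran */
};

static void demo_rt_switched_from(struct demo_task *p) { p->events |= 1; }
static void demo_fair_switched_to(struct demo_task *p) { p->events |= 2; }
static void demo_fair_prio_changed(struct demo_task *p, int oldprio)
{
	if (p->prio != oldprio)
		p->events |= 4;
}

static const struct demo_sched_class demo_rt_class = {
	.switched_from = demo_rt_switched_from,
};
static const struct demo_sched_class demo_fair_class = {
	.switched_to = demo_fair_switched_to,
	.prio_changed = demo_fair_prio_changed,
};

/* Shared tail for both callers: notify the old and new class as needed. */
static void demo_class_changed(struct demo_task *p,
			       const struct demo_sched_class *prev, int oldprio)
{
	if (prev != p->sched_class) {
		if (prev->switched_from)
			prev->switched_from(p);
		if (p->sched_class->switched_to)
			p->sched_class->switched_to(p);
	} else if (p->sched_class->prio_changed) {
		p->sched_class->prio_changed(p, oldprio);
	}
}
```

Having one helper that both the setscheduler path and the priority
inheritance path call is one way to avoid the duplicated logic.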
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make the main sched.c code more agnostic to the schedule classes.
Instead of having specific hooks in the schedule code for the RT class
balancing, replace them with pre_schedule, post_schedule and task_wake_up
methods. These methods may be used by any of the classes but currently,
only the sched_rt class implements them.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We add the notion of a root-domain which will be used later to rescope
global variables to per-domain variables. Each exclusive cpuset
essentially defines an island domain by fully partitioning the member cpus
from any other cpuset. However, we currently still maintain some
policy/state as global variables which transcend all cpusets. Consider,
for instance, rt-overload state.
Whenever a new exclusive cpuset is created, we also create a new
root-domain object and move each cpu member to the root-domain's span.
By default the system creates a single root-domain with all cpus as
members (mimicking the global state we have today).
We add some plumbing for storing class specific data in our root-domain.
Whenever a RQ is switching root-domains (because of repartitioning) we
give each sched_class the opportunity to remove any state from its old
domain and add state to the new one. This logic doesn't have any clients
yet but it will later in the series.
Signed-off-by: Gregory Haskins <ghaskins@novell.com>
CC: Christoph Lameter <clameter@sgi.com>
CC: Paul Jackson <pj@sgi.com>
CC: Simon Derr <simon.derr@bull.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The current wake-up code path tries to determine if it can optimize the
wake-up to "this_cpu" by computing load calculations. The problem is that
these calculations are only relevant to SCHED_OTHER tasks where load is king.
For RT tasks, priority is king. So the load calculation is completely wasted
bandwidth.
Therefore, we create a new sched_class interface to help with
pre-wakeup routing decisions and move the load calculation into a function
of the CFS task's class.
Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Some RT tasks (particularly kthreads) are bound to one specific CPU.
It is fairly common for two or more bound tasks to get queued up at the
same time. Consider, for instance, softirq_timer and softirq_sched. A
timer goes off in an ISR which schedules softirq_thread to run at RT50.
Then the timer handler determines that it's time to smp-rebalance the
system so it schedules softirq_sched to run. So we are in a situation
where we have two RT50 tasks queued, and the system will go into
rt-overload condition to request other CPUs for help.
This causes two problems in the current code:
1) If a high-priority bound task and a low-priority unbound task queue
up behind the running task, we will fail to ever relocate the unbound
task because we terminate the search on the first unmovable task.
2) We spend precious futile cycles in the fast-path trying to pull
overloaded tasks over. It is therefore optimal to strive to avoid the
overhead altogether if we can cheaply detect the condition before
overload even occurs.
This patch tries to achieve this optimization by utilizing the hamming
weight of the task->cpus_allowed mask. A weight of 1 indicates that
the task cannot be migrated. We will then utilize this information to
skip non-migratable tasks and to eliminate unnecessary rebalance attempts.
We introduce a per-rq variable to count the number of migratable tasks
that are currently running. We only go into overload if we have more
than one rt task, AND at least one of them is migratable.
In addition, we introduce a per-task variable to cache the cpus_allowed
weight, since the hamming calculation is probably relatively expensive.
We only update the cached value when the mask is updated which should be
relatively infrequent, especially compared to scheduling frequency
in the fast path.
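The overload rule described above can be modelled in a few lines of
userspace Python. This is only a sketch of the idea, not the kernel code:
the class names and the fields nr_cpus_allowed, rt_nr_running and
rt_nr_migratory are illustrative names assumed here.

```python
from dataclasses import dataclass


def hamming_weight(mask: int) -> int:
    """Count the set bits in a CPU bitmask."""
    return bin(mask).count("1")


@dataclass
class Task:
    name: str
    cpus_allowed: int          # bit i set => task may run on CPU i
    nr_cpus_allowed: int = 0   # cached weight (illustrative field name)

    def __post_init__(self):
        self.set_cpus_allowed(self.cpus_allowed)

    def set_cpus_allowed(self, mask: int) -> None:
        # The weight is recomputed only when the mask changes, which is
        # rare compared to how often the scheduler consults it.
        self.cpus_allowed = mask
        self.nr_cpus_allowed = hamming_weight(mask)

    @property
    def migratable(self) -> bool:
        # A weight of 1 means the task is bound to a single CPU.
        return self.nr_cpus_allowed > 1


@dataclass
class RtRunqueue:
    rt_nr_running: int = 0     # queued RT tasks (illustrative name)
    rt_nr_migratory: int = 0   # how many of them can move (illustrative name)

    def enqueue(self, task: Task) -> None:
        self.rt_nr_running += 1
        if task.migratable:
            self.rt_nr_migratory += 1

    def dequeue(self, task: Task) -> None:
        self.rt_nr_running -= 1
        if task.migratable:
            self.rt_nr_migratory -= 1

    @property
    def overloaded(self) -> bool:
        # Overload only if there is more than one RT task AND at least
        # one of them could actually be pulled by another CPU.
        return self.rt_nr_running > 1 and self.rt_nr_migratory >= 1
```

With two tasks bound to CPU0 queued, overloaded stays false, so no other
CPU is asked for help; it only becomes true once a task with a wider mask
is queued as well.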
Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
this patch extends the soft-lockup detector to automatically
detect hung TASK_UNINTERRUPTIBLE tasks. Such hung tasks are
printed the following way:
------------------>
INFO: task prctl:3042 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
prctl D fd5e3793 0 3042 2997
f6050f38 00000046 00000001 fd5e3793 00000009 c06d8264 c06dae80 00000286
f6050f40 f6050f00 f7d34d90 f7d34fc8 c1e1be80 00000001 f6050000 00000000
f7e92d00 00000286 f6050f18 c0489d1a f6050f40 00006605 00000000 c0133a5b
Call Trace:
[<c04883a5>] schedule_timeout+0x6d/0x8b
[<c04883d8>] schedule_timeout_uninterruptible+0x15/0x17
[<c0133a76>] msleep+0x10/0x16
[<c0138974>] sys_prctl+0x30/0x1e2
[<c0104c52>] sysenter_past_esp+0x5f/0xa5
=======================
2 locks held by prctl/3042:
#0: (&sb->s_type->i_mutex_key#5){--..}, at: [<c0197d11>] do_fsync+0x38/0x7a
#1: (jbd_handle){--..}, at: [<c01ca3d2>] journal_start+0xc7/0xe9
<------------------
the current default timeout is 120 seconds. Such messages are printed
up to 10 times per bootup. If the system has crashed already then the
messages are not printed.
if lockdep is enabled then all held locks are printed as well.
this feature is a natural extension to the softlockup-detector (kernel
locked up without scheduling) and to the NMI watchdog (kernel locked up
with IRQs disabled).
[ Gautham R Shenoy <ego@in.ibm.com>: CPU hotplug fixes. ]
[ Andrew Morton <akpm@linux-foundation.org>: build warning fix. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
This patch converts the known per-subsystem mutexes to get_online_cpus
and put_online_cpus. It also eliminates the CPU_LOCK_ACQUIRE and
CPU_LOCK_RELEASE hotplug notification events.
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Replace all lock_cpu_hotplug/unlock_cpu_hotplug from the kernel and use
get_online_cpus and put_online_cpus instead as it highlights the
refcount semantics in these operations.
The new API guarantees protection against the cpu-hotplug operation, but
it doesn't guarantee serialized access to any of the local data
structures. Hence the changes need to be reviewed.
In case of pseries_add_processor/pseries_remove_processor, use
cpu_maps_update_begin()/cpu_maps_update_done() as we're modifying the
cpu_present_map there.
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch implements a Refcount + Waitqueue based model for
cpu-hotplug.
Now, a thread which wants to prevent cpu-hotplug, will bump up a global
refcount and the thread which wants to perform a cpu-hotplug operation
will block till the global refcount goes to zero.
The readers, if any, during an ongoing cpu-hotplug operation are blocked
until the cpu-hotplug operation is over.
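The refcount + waitqueue model can be illustrated with a small userspace
sketch. It uses a Python condition variable in place of the kernel's
waitqueue; the class and the hotplug_begin/hotplug_done method names are
assumptions made for this example and are not the kernel implementation.

```python
import threading


class CpuHotplugLock:
    """Userspace model: readers bump a refcount, the writer waits for zero."""

    def __init__(self):
        self._cond = threading.Condition()
        self.refcount = 0
        self.hotplug_active = False

    def get_online_cpus(self):
        # Reader side: block while a hotplug operation is in progress,
        # then take a reference that keeps new operations out.
        with self._cond:
            while self.hotplug_active:
                self._cond.wait()
            self.refcount += 1

    def put_online_cpus(self):
        with self._cond:
            self.refcount -= 1
            if self.refcount == 0:
                # Wake a writer that may be waiting for the count to drain.
                self._cond.notify_all()

    def hotplug_begin(self):
        # Writer side: wait until no reader holds a reference and no
        # other hotplug operation is running.
        with self._cond:
            while self.hotplug_active or self.refcount > 0:
                self._cond.wait()
            self.hotplug_active = True

    def hotplug_done(self):
        with self._cond:
            self.hotplug_active = False
            # Release readers that were blocked during the operation.
            self._cond.notify_all()
```

A reader brackets its critical section with get_online_cpus() and
put_online_cpus(). As the later conversion notes, this only keeps hotplug
out; it does not serialize readers against each other.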
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Paul Jackson <pj@sgi.com> [For !CONFIG_HOTPLUG_CPU ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The current load balancing scheme isn't good enough for precise
group fairness.
For example: on a 8-cpu system, I created 3 groups as under:
a = 8 tasks (cpu.shares = 1024)
b = 4 tasks (cpu.shares = 1024)
c = 3 tasks (cpu.shares = 1024)
a, b and c are task groups that have equal weight. We would expect each
of the groups to receive 33.33% of cpu bandwidth under a fair scheduler.
This is what I get with the latest scheduler git tree:
--------------------------------------------------------------------------------
Col1 | Col2 | Col3 | Col4
------|---------|-------|-------------------------------------------------------
a | 277.676 | 57.8% | 54.1% 54.1% 54.1% 54.2% 56.7% 62.2% 62.8% 64.5%
b | 116.108 | 24.2% | 47.4% 48.1% 48.7% 49.3%
c | 86.326 | 18.0% | 47.5% 47.9% 48.5%
--------------------------------------------------------------------------------
Explanation of o/p:
Col1 -> Group name
Col2 -> Cumulative execution time (in seconds) received by all tasks of that
group in a 60sec window across 8 cpus
Col3 -> CPU bandwidth received by the group in the 60sec window, expressed in
percentage. Col3 data is derived as:
Col3 = 100 * Col2 / (NR_CPUS * 60)
Col4 -> CPU bandwidth received by each individual task of the group.
Col4 = 100 * cpu_time_recd_by_task / 60
[I can share the test case that produces a similar o/p if reqd]
The deviation from desired group fairness is as below:
a = +24.47%
b = -9.13%
c = -15.33%
which is quite high.
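The Col3 and deviation figures can be re-derived from Col2 with a short
script. This is a sketch for checking the arithmetic only; it assumes, as
the numbers above suggest, that the deviation is taken from Col3 rounded
to one decimal place against an ideal share of 33.33%.

```python
NR_CPUS = 8        # cpus in the test system
WINDOW_SECS = 60   # length of the measurement window
IDEAL_PCT = 33.33  # three equal-weight groups


def group_bandwidth_pct(exec_secs, nr_cpus=NR_CPUS, window=WINDOW_SECS):
    """Col3 = 100 * Col2 / (NR_CPUS * 60), rounded as in the table."""
    return round(100.0 * exec_secs / (nr_cpus * window), 1)


def deviation(exec_secs):
    """Signed distance of the group's bandwidth from the ideal share."""
    return round(group_bandwidth_pct(exec_secs) - IDEAL_PCT, 2)


# Col2 values from the table above (before the patch).
col2 = {"a": 277.676, "b": 116.108, "c": 86.326}

for group, secs in col2.items():
    print(f"{group}: {group_bandwidth_pct(secs):.1f}% "
          f"(deviation {deviation(secs):+.2f}%)")
```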
After the patch below is applied, here are the results:
--------------------------------------------------------------------------------
Col1 | Col2 | Col3 | Col4
------|---------|-------|-------------------------------------------------------
a | 163.112 | 34.0% | 33.2% 33.4% 33.5% 33.5% 33.7% 34.4% 34.8% 35.3%
b | 156.220 | 32.5% | 63.3% 64.5% 66.1% 66.5%
c | 160.653 | 33.5% | 85.8% 90.6% 91.4%
--------------------------------------------------------------------------------
Deviation from desired group fairness is as below:
a = +0.67%
b = -0.83%
c = +0.17%
which is far better IMO. Most of other runs have yielded a deviation within
+-2% at the most, which is good.
Why do we see bad (group) fairness with current scheduler?
==========================================================
Currently cpu's weight is just the summation of individual task weights.
This can yield incorrect results. For ex: consider three groups as below
on a 2-cpu system:
CPU0            CPU1
---------------------------
A (10)          B (5)
                C (5)
---------------------------
Group A has 10 tasks, all on CPU0, Group B and C have 5 tasks each all
of which are on CPU1. Each task has the same weight (NICE_0_LOAD =
1024).
The current scheme would yield a cpu weight of 10240 (10*1024) for each cpu and
the load balancer will think both CPUs are perfectly balanced and won't
move around any tasks. This, however, would yield this bandwidth:
A = 50%
B = 25%
C = 25%
which is not the desired result.
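The 2-cpu example can be checked with a small script. It is a simplified
model, not scheduler code: it assumes every group has equal shares, that
each cpu divides its time equally among the groups present on it, and that
no task is migrated because the task-based cpu weights look balanced.

```python
NICE_0_LOAD = 1024

# cpu -> {group: number of tasks on that cpu}
placement = {
    0: {"A": 10},
    1: {"B": 5, "C": 5},
}


def cpu_weight_by_tasks(groups):
    """Current scheme: cpu weight is the sum of individual task weights."""
    return sum(nr_tasks * NICE_0_LOAD for nr_tasks in groups.values())


def group_bandwidth(placement):
    """Percent of total cpu bandwidth each group ends up with.

    Assumes equal shares, so a cpu's time is split evenly among the
    groups that have tasks on it.
    """
    nr_cpus = len(placement)
    bandwidth = {}
    for groups in placement.values():
        per_group = 100.0 / len(groups) / nr_cpus
        for group in groups:
            bandwidth[group] = bandwidth.get(group, 0.0) + per_group
    return bandwidth


weights = {cpu: cpu_weight_by_tasks(g) for cpu, g in placement.items()}
print(weights)                     # both cpus carry the same weight
print(group_bandwidth(placement))  # yet the groups are treated unequally
```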
What's changing in the patch?
=============================
- How cpu weights are calculated when CONFIG_FAIR_GROUP_SCHED is
defined (see below)
- API Change
- Two tunables introduced in sysfs (under SCHED_DEBUG) to
control the frequency at which the load balance monitor
thread runs.
The basic change made in this patch is how cpu weight (rq->load.weight) is
calculated. It's now calculated as the summation of group weights on a cpu,
rather than summation of task weights. Weight exerted by a group on a
cpu is dependent on the shares allocated to it and also the number of
tasks the group has on that cpu compared to the total number of
(runnable) tasks the group has in the system.
Let,
W(K,i) = Weight of group K on cpu i
T(K,i) = Task load present in group K's cfs_rq on cpu i
T(K) = Total task load of group K across various cpus
S(K) = Shares allocated to group K
NRCPUS = Number of online cpus in the scheduler domain to
which group K is assigned.
Then,
W(K,i) = S(K) * NRCPUS * T(K,i) / T(K)
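Applying the formula to the earlier 2-cpu example shows why the new cpu
weights expose the imbalance. The sketch below is illustrative only: the
function name is made up for this example, and it assumes each group has
the default 1024 shares and spans both cpus.

```python
NICE_0_LOAD = 1024
NRCPUS = 2
SHARES = 1024  # S(K), assumed identical for every group

# T(K,i): task load of each group on each cpu, from the 2-cpu example
# (A has 10 tasks on CPU0; B and C have 5 tasks each on CPU1).
task_load = {
    "A": {0: 10 * NICE_0_LOAD, 1: 0},
    "B": {0: 0, 1: 5 * NICE_0_LOAD},
    "C": {0: 0, 1: 5 * NICE_0_LOAD},
}


def group_weight_on_cpu(shares, nr_cpus, load_on_cpu, total_load):
    """W(K,i) = S(K) * NRCPUS * T(K,i) / T(K)."""
    if total_load == 0:
        return 0.0  # a group with no runnable tasks exerts no weight
    return shares * nr_cpus * load_on_cpu / total_load


def cpu_weight(cpu):
    """New scheme: cpu weight is the sum of group weights on that cpu."""
    return sum(
        group_weight_on_cpu(SHARES, NRCPUS, loads[cpu], sum(loads.values()))
        for loads in task_load.values()
    )


for cpu in range(NRCPUS):
    print(f"CPU{cpu} weight = {cpu_weight(cpu):.0f}")
```

In this model CPU0 carries one group's full weight while CPU1 carries two,
so the cpus no longer look balanced even though their raw task counts are
equal.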
A load balance monitor thread is created at bootup, which periodically
runs and adjusts group's weight on each cpu. To avoid its overhead, two
min/max tunables are introduced (under SCHED_DEBUG) to control the rate
at which it runs.
Fixes from: Peter Zijlstra <a.p.zijlstra@chello.nl>
- don't start the load_balance_monitor when there is only a single cpu.
- rename the kthread because it's currently longer than TASK_COMM_LEN
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
selinux: make mls_compute_sid always polyinstantiate
security/selinux: constify function pointer tables and fields
security: add a secctx_to_secid() hook
security: call security_file_permission from rw_verify_area
security: remove security_sb_post_mountroot hook
Security: remove security.h include from mm.h
Security: remove security_file_mmap hook sparse-warnings (NULL as 0).
Security: add get, set, and cloning of superblock security information
security/selinux: Add missing "space"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (125 commits)
[CRYPTO] twofish: Merge common glue code
[CRYPTO] hifn_795x: Fixup container_of() usage
[CRYPTO] cast6: inline bloat--
[CRYPTO] api: Set default CRYPTO_MINALIGN to unsigned long long
[CRYPTO] tcrypt: Make xcbc available as a standalone test
[CRYPTO] xcbc: Remove bogus hash/cipher test
[CRYPTO] xcbc: Fix algorithm leak when block size check fails
[CRYPTO] tcrypt: Zero axbuf in the right function
[CRYPTO] padlock: Only reset the key once for each CBC and ECB operation
[CRYPTO] api: Include sched.h for cond_resched in scatterwalk.h
[CRYPTO] salsa20-asm: Remove unnecessary dependency on CRYPTO_SALSA20
[CRYPTO] tcrypt: Add select of AEAD
[CRYPTO] salsa20: Add x86-64 assembly version
[CRYPTO] salsa20_i586: Salsa20 stream cipher algorithm (i586 version)
[CRYPTO] gcm: Introduce rfc4106
[CRYPTO] api: Show async type
[CRYPTO] chainiv: Avoid lock spinning where possible
[CRYPTO] seqiv: Add select AEAD in Kconfig
[CRYPTO] scatterwalk: Handle zero nbytes in scatterwalk_map_and_copy
[CRYPTO] null: Allow setkey on digest_null
...
Add the following class iteration functions for driver use:
class_for_each_device
class_find_device
class_for_each_child
class_find_child
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This name is just passed to platform_device_alloc which has its parameter
declared const.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
All kobjects require a dynamically allocated name now. We no longer
need to keep track if the name is statically assigned, we can just
unconditionally free() all kobject names on cleanup.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There are no in-kernel users of kobject_unregister() so it should be
removed.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We save the current state in the object itself, so we can do proper
cleanup when the last reference is dropped.
If the initial reference is dropped, the object will be removed from
sysfs if needed, if an "add" event was sent, "remove" will be sent, and
the allocated resources are released.
This allows us to clean up some driver core usage as well as allowing us
to do other such changes to the rest of the kernel.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
No one is calling this anymore, so just remove it and hard-code the one
internal-use of it.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The function is no longer used by anyone in the kernel, and it prevents
the proper sending of the kobject uevent after the needed files are set
up by the caller. kobject_init_and_add() can be used in its place.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Now that the old kobject_init() function is gone, rename
kobject_init_ng() to kobject_init() to clean up the namespace.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The old kobject_init() function is no longer in use, so let us remove it
from the public scope (kset mess in the kobject.c file still uses it,
but that can be cleaned up later very simply.)
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Now that the old kobject_add() function is gone, rename kobject_add_ng()
to kobject_add() to clean up the namespace.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The old kobject_add() function is no longer in use, so let us remove it
from the public scope (kset mess in the kobject.c file still uses it,
but that can be cleaned up later very simply.)
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This moves the block devices to /sys/class/block. It will create a
flat list of all block devices, with the disks and partitions in one
directory. For compatibility /sys/block is created and contains symlinks
to the disks.
/sys/class/block
|-- sda -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda
|-- sda1 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda1
|-- sda10 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda10
|-- sda5 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda5
|-- sda6 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda6
|-- sda7 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda7
|-- sda8 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda8
|-- sda9 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda9
`-- sr0 -> ../../devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0
/sys/block/
|-- sda -> ../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda
`-- sr0 -> ../devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch removes the kobject, and a few other driver-core-only fields
out of struct driver and into the driver core only. Now drivers can be
safely created on the stack or statically (like they currently are.)
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The module driver specific code should belong in the driver core, not in
the kernel/ directory. So move this code. This is done in preparation
for some struct device_driver rework that should be confined to the
driver core code only.
This also lets us keep from exporting these functions, as no external
code should ever be calling them.
Thanks to Andrew Morton for the !CONFIG_MODULES fix.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The iseries driver wants to hang kobjects off of its driver, so, to
preserve backwards compatibility, we need to add a call to the driver
core to allow future changes to work properly.
Hopefully no one uses this function in the future and the iseries_veth
driver authors come to their senses so I can remove this hack...
Cc: Dave Larson <larson1@us.ibm.com>
Cc: Santiago Leon <santil@us.ibm.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is a lot like default attributes for devices (and indeed,
a lot of the code is lifted from there).
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
struct bus_type is static everywhere in the kernel. This moves the
kobject in the structure out of it, and a bunch of other fields that are
private to the driver core are now moved to a private structure. This lets
us dynamically create the backing kobject properly and gives us the
chance to be able to document to users exactly how to use the struct
bus_type as there are no fields they can improperly access.
Thanks to Kay for the build fixes on this patch.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This allows an easier way to get to the device klist associated with a
struct bus_type (you have three to choose from...) This will make it
easier to move these fields to be dynamic in a future patch.
The only user of this is the PCI core which horribly abuses this
interface to rearrange the order of the pci devices. This should be
done using the existing bus device walking functions, but that's left
for future patches.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This allows an easier way to get to the kset associated with a struct
bus_type (you have three to choose from...) This will make it easier to
move these fields to be dynamic in a future patch.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This isn't used by anything in the driver core, and by no one in the 204
different usages of it in the kernel tree. Remove this field so no one
gets the idea that it needs to be used.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The uio kobject code is "weird". This patch should hopefully fix it up
to be sane and not leak memory anymore.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benedikt Spranger <b.spranger@linutronix.de>
Signed-off-by: Hans J. Koch <hjk@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
/sys/power should not be a kset, that's overkill. This patch renames it
to power_kobj and fixes up all usages of it in the tree.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
These functions are no longer used and are the last remnants of the old
subsystem crap. So delete them for good.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
kernel_kset does not need to be a kset, but a much simpler kobject now
that we have kobj_attributes.
We also rename kernel_kset to kernel_kobj to catch all users of this
symbol with a build error instead of an easy-to-ignore build warning.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This macro is no longer used. ksets should be created dynamically with
a call to kset_create_and_add() not declared statically.
Yes, there are 5 remaining static struct kset usages in the kernel tree,
but they will be fixed up soon.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There is no firmware "subsystem" it's just a directory in /sys that
other portions of the kernel want to hook into. So make it a kobject
not a kset to help deter anyone who tries to do some odd kset-like
things with this.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
These functions are no longer called or needed, so we can remove them.
As I rewrote the whole firmware.c file, add my copyright.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Remove the no longer needed subsys_attributes, they are all converted to
the more sensical kobj_attributes.
There is no longer a magic fallback in sysfs attribute operations, all
kobjects which create simple attributes explicitly need a ktype
assigned, which tells the core what was intended here.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Needed for future firmware subsystem cleanups.
In the end, the firmware_register/unregister functions will be deleted
entirely, but we need this symbol so that subsystems can migrate over.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Clean up the use of ksets and kobjects. Kobjects are instances of
objects (like struct user_info), ksets are collections of objects of a
similar type (like the uids directory containing the user_info directories).
So, use kobjects for the user_info directories, and a kset for the "uids"
directory.
On object cleanup, the final kobject_put() was missing.
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add kobj_sysfs_ops to replace subsys_sysfs_ops. There is no
need for special kset operations, we want to be able to use
simple attribute operations at any kobject, not only ksets.
The whole concept of any default sysfs attribute operations
will go away with the upcoming removal of subsys_sysfs_ops.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dynamically create the kset instead of declaring it statically.
Having 3 static kobjects in one structure is not only foolish, but ripe
for nasty race conditions if handled improperly. We also rename the
field to catch any potential users of it (not that there should be
outside of the driver core...)
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dynamically create the kset instead of declaring it statically.
Having 3 static kobjects in one structure is not only foolish, but ripe
for nasty race conditions if handled improperly. We also rename the
field to catch any potential users of it (not that there should be
outside of the driver core...)
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dynamically create the kset instead of declaring it statically. We also
rename power_subsys to power_kset to catch all users of the variable and
we properly export it so that people don't have to guess that it really
is present in the system.
The pseries code is weird, why is it creating /sys/power if CONFIG_PM
is disabled? Oh well, stupid big boxes ignoring config options...
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dynamically create the kset instead of declaring it statically. We also
rename module_subsys to module_kset to catch all users of the variable.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We don't need a kset here, a simple kobject will do just fine, so
dynamically create the kobject and use it.
We also rename hypervisor_subsys to hypervisor_kobj to catch all users
of the variable.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dynamically create the kset instead of declaring it statically. We also
rename kernel_subsys to kernel_kset to catch all users of this symbol
with a build error instead of an easy-to-ignore build warning.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The last user of this macro (pci hotplug core) is now switched over to
using a dynamic kset, so this macro is no longer needed at all.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This also renames pci_hotplug_slots_subsys to pci_hotplug_slots_kset to
catch all current users with a build error instead of a build warning
which can easily be missed.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This also renames fs_subsys to fs_kobj to catch all current users with a
build error instead of a build warning which can easily be missed.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
kobject_kset_add_dir is only called in one place so remove it and use
kobject_create() instead.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
kobject_create_and_add is the same as kobject_add_dir, so drop
kobject_add_dir.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Now ksets can be dynamically created on the fly, no static definitions
are required. Thanks to Miklos for hints on how to make this work
better for the callers.
And thanks to Kay for finding some stupid bugs in my original version
and pointing out that we need to handle the fact that kobjects can have
a kset as a parent and to handle that properly in kobject_add().
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We don't need a "default" ktype for a kset. We should set this
explicitly every time for each kset. This change is needed so that we
can make ksets dynamic, and cleans up one of the odd, undocumented
assumptions that the kset/kobject/ktype model has.
This patch is based on a lot of help from Kay Sievers.
Nasty bug in the block code was found by Dave Young
<hidave.darkstar@gmail.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Also add a kobject_init_and_add function which bundles up what a lot of
the current callers want to do all at once, and it properly handles the
memory usages, unlike kobject_register().
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is what the kobject_add function is going to become.
Add this to the kernel and then we can convert the tree over to use it.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is what the kobject_init function is going to become.
Add this to the kernel and then we can convert the tree over to use it.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Tony Jones <tonyj@suse.de>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
struct class_device is going away, this converts the code to use struct
device instead.
Signed-off-by: Tony Jones <tonyj@suse.de>
Cc: Peter Osterlund <petero2@telia.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Tony Jones <tonyj@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Shannon Nelson <shannon.nelson@intel.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This adds kref_set() to the kref api for future use by people who really
know what they are doing with krefs...
From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch reorganizes the way suspend and resume notifications are
sent to drivers. The major changes are that now the PM core acquires
every device semaphore before calling the methods, and calls to
device_add() during suspends will fail, while calls to device_del()
during suspends will block.
It also provides a way to safely remove a suspended device with the
help of the PM core, by using the device_pm_schedule_removal() callback
introduced specifically for this purpose, and updates two drivers (msr
and cpuid) that need to use it.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add a secctx_to_secid() LSM hook to go along with the existing
secid_to_secctx() LSM hook. This patch also includes the SELinux
implementation for this hook.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
The security_sb_post_mountroot() hook is long-since obsolete, and is
fundamentally broken: it is never invoked if someone uses initramfs.
This is particularly damaging, because the existence of this hook has
been used as motivation for not using initramfs.
Stephen Smalley confirmed on 2007-07-19 that this hook was originally
used by SELinux but can now be safely removed:
http://marc.info/?l=linux-kernel&m=118485683612916&w=2
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: James Morris <jmorris@namei.org>
Remove security.h include from mm.h, as it is only needed for a single
extern declaration, and pulls in all kinds of crud.
Fine-by-me: David Chinner <dgc@sgi.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Adds security_get_sb_mnt_opts, security_set_sb_mnt_opts, and
security_clone_sb_mnt_opts to the LSM and to SELinux. This will allow
filesystems to directly own and control all of their mount options if they
so choose. This interface deals only with option identifiers and strings so
it should be generic enough for any LSM which may come in the future.
Filesystems which pass text mount data around in the kernel (almost all of
them) need not currently make use of this interface when dealing with
SELinux since it will still parse those strings as it always has. I assume
future LSMs would do the same. NFS is the primary FS which does not use
text mount data and thus must make use of this interface.
An LSM would need to implement these functions only if they had mount time
options, such as selinux has context= or fscontext=. If the LSM has no
mount time options they could simply not implement and let the dummy ops
take care of things.
An LSM other than SELinux would need to define new option numbers in
security.h and any FS which decides to own there own security options would
need to be patched to use this new interface for every possible LSM. This
is because it was stated to me very clearly that LSM's should not attempt to
understand FS mount data and the burdon to understand security should be in
the FS which owns the options.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
If BIOS invokes _OSI(Linux), the kernel response
depends on what the ACPI DMI list knows about the system,
and that is reflected in dmesg:
1) System unknown to DMI:
ACPI: BIOS _OSI(Linux) query ignored
ACPI: DMI System Vendor: LENOVO
ACPI: DMI Product Name: 7661W1P
ACPI: DMI Product Version: ThinkPad T61
ACPI: DMI Board Name: 7661W1P
ACPI: DMI BIOS Vendor: LENOVO
ACPI: DMI BIOS Date: 10/18/2007
ACPI: Please send DMI info above to linux-acpi@vger.kernel.org
ACPI: If "acpi_osi=Linux" works better, please notify linux-acpi@vger.kernel.org
2) System known to DMI, but effect of OSI(Linux) unknown:
ACPI: DMI detected: Lenovo ThinkPad T61
...
ACPI: BIOS _OSI(Linux) query ignored via DMI
ACPI: If "acpi_osi=Linux" works better, please notify linux-acpi@vger.kernel.org
3) System known to DMI, which disables _OSI(Linux):
ACPI: DMI detected: Lenovo ThinkPad T61
...
ACPI: BIOS _OSI(Linux) query ignored via DMI
4) System known to DMI, which enables _OSI(Linux):
ACPI: DMI detected: Lenovo ThinkPad T61
ACPI: Added _OSI(Linux)
...
ACPI: BIOS _OSI(Linux) query honored via DMI
cmdline overrides take precedence over the built-in
default and the DMI prescribed default.
cmdline "acpi_osi=Linux" results in:
ACPI: BIOS _OSI(Linux) query honored via cmdline
Signed-off-by: Len Brown <len.brown@intel.com>
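The precedence described in the _OSI(Linux) message above (cmdline over
the DMI prescribed default over the built-in default) can be sketched as
follows. This is only an illustration: the enum, struct and function names
are hypothetical and are not the kernel's identifiers, and the built-in
default is assumed to be "ignore", matching case 1 of the dmesg output.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; not the kernel's identifiers. */
enum osi_setting { OSI_UNSET, OSI_DISABLE, OSI_ENABLE };

struct osi_decision {
	bool honored;        /* true if the _OSI(Linux) query is honored */
	const char *source;  /* "cmdline", "DMI", or NULL for the built-in default */
};

/*
 * Decide how to answer a BIOS _OSI(Linux) query.
 * cmdline: what acpi_osi= asked for, or OSI_UNSET if not given.
 * dmi:     what the DMI list prescribes, or OSI_UNSET if the system
 *          is unknown to DMI.
 */
static struct osi_decision osi_linux_decide(enum osi_setting cmdline,
					    enum osi_setting dmi)
{
	/* Assumed built-in default: ignore the query. */
	struct osi_decision d = { false, NULL };

	if (cmdline != OSI_UNSET) {
		/* The command line wins over everything else. */
		d.honored = (cmdline == OSI_ENABLE);
		d.source = "cmdline";
	} else if (dmi != OSI_UNSET) {
		/* Otherwise the DMI prescribed default applies. */
		d.honored = (dmi == OSI_ENABLE);
		d.source = "DMI";
	}
	return d;
}
```

With this sketch, osi_linux_decide(OSI_ENABLE, OSI_DISABLE) honors the
query because the command line overrides the DMI entry.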
This simply allows other sub-systems (such as ACPI)
to access and print out slots in static dmi_ident[].
Signed-off-by: Len Brown <len.brown@intel.com>
This patch allows the various users of attribute_groups to selectively
allow the appearance of group attributes. The primary consumer of
this will be the transport classes in which we currently have
elaborate attribute selection algorithms to do this same thing.
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch is the beginning of moving the attribute_containers to use
attribute groups exclusively. The attr element is now deprecated and
will eventually be removed (along with all the hand rolled code for
doing exactly what attribute groups do) when all the consumers are
converted to attribute groups.
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Factor out ata_pci_activate_sff_host() from ata_pci_one(). This does
about the same thing as ata_host_activate() but needs to be separate
because SFF controllers use different and multiple IRQs in legacy
mode.
This will be used to make SFF LLD initialization more flexible.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ata_port_queue_task() served a single user: ata_pio_task()
Rename to ata_pio_queue_task() and un-export it, as nobody outside of
libata-core.c uses it.
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
qc->nbytes didn't use to include extra buffers set up by the libata core
layer and may be odd. This patch makes qc->nbytes include any extra
buffers set up by the libata core layer and guarantees that it is aligned
on a 4 byte boundary.
This value is to be used to program the host controller. As this
represents the actual length of buffer available to the controller and
the controller must be able to deal with short transfers for ATAPI
commands which can transfer variable length, this shouldn't break any
controllers while making problems like rounding-down and controllers
choking up on odd transfer bytes much less likely.
The unmodified value is stored in new field qc->raw_nbytes.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
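A minimal sketch of the relationship between the two length fields in the
qc->nbytes message above, assuming the padding simply rounds the requested
length up to the next multiple of 4. The struct and function names are
hypothetical; only the field names raw_nbytes and nbytes come from the
message.

```c
/* Hypothetical container for the two lengths; not the kernel's struct. */
struct xfer_len {
	unsigned int raw_nbytes; /* unmodified length requested by the caller */
	unsigned int nbytes;     /* padded length used to program the controller */
};

/* Round the requested length up to a 4 byte boundary (assumed rule). */
static struct xfer_len pad_to_dword(unsigned int requested)
{
	struct xfer_len len;

	len.raw_nbytes = requested;
	len.nbytes = (requested + 3u) & ~3u;
	return len;
}
```

For an odd request such as 13 bytes this gives raw_nbytes = 13 and
nbytes = 16, while an already aligned request is left unchanged.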
libata used private sg iterator to handle padding sg. Now that sg can
be chained, padding can be handled using standard sg ops. Convert to
chained sg.
* s/qc->__sg/qc->sg/
* s/qc->pad_sgent/qc->extra_sg[]/. Because chaining consumes one sg
entry, there need to be two extra sg entries. The renaming is also
for future addition of other extra sg entries.
* Padding setup is moved into ata_sg_setup_extra() which is organized
in a way that future addition of other extra sg entries is easy.
* qc->orig_n_elem is unused and removed.
* qc->n_elem now contains the number of sg entries that LLDs should
map. qc->mapped_n_elem is added to carry the original number of
mapped sgs for unmapping.
* The last sg of the original sg list is used to chain to extra sg
list. The original last sg is pointed to by qc->last_sg and the
content is stored in qc->saved_last_sg. It's restored during
ata_sg_clean().
* All sg walking code has been updated. Unnecessary assertions and
checks for conditions the core layer already guarantees are removed.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
With atapi_request_sense() converted to use sg, there's no user of
the non-sg interface left. Kill the non-sg interface.
* ATA_QCFLAG_SINGLE and ATA_QCFLAG_SG are removed. ATA_QCFLAG_DMAMAP
is used instead. (this way no LLD change is necessary)
* qc->buf_virt is removed.
* ata_sg_init_one() and ata_sg_setup_one() are removed.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Depending on how many bytes are transferred as a unit, PIO data
transfer may consume more bytes than requested. Knowing how much
data is consumed is necessary to determine how much is left for
draining. This patch updates ->data_xfer such that it returns the
number of consumed bytes.
While at it, it also makes the following changes.
* s/adev/dev/
* use READ/WRITE constants for rw indication
* misc clean ups
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
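To illustrate why a PIO transfer may consume more bytes than requested, as
the ->data_xfer message above notes, here is a small sketch. It assumes
data moves in fixed-size units (for example 2 byte words), so a request is
rounded up to a whole number of units. The function name is hypothetical
and this is not the ->data_xfer implementation itself.

```c
/*
 * Number of bytes actually consumed when buflen bytes are requested
 * and the transfer unit is unit bytes wide. unit must be non-zero.
 */
static unsigned int pio_bytes_consumed(unsigned int buflen, unsigned int unit)
{
	/* Round up to a whole number of transfer units. */
	return (buflen + unit - 1) / unit * unit;
}
```

A 5 byte request with 2 byte units consumes 6 bytes, and the caller can
subtract that from the total to find how much is left for draining.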
Add ATAPI command types - ATAPI_READ, WRITE, RW_BUF, READ_CD and MISC,
and implement atapi_cmd_type() which takes SCSI opcode and returns to
which class the opcode belongs. This will be used later to improve
ATAPI handling.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
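A sketch of the opcode classification described in the ATAPI command types
message above. The class names follow the message; the function name
carries a _sketch suffix because the exact grouping of opcodes is an
assumption here, not a copy of the patch. The opcode values are the
standard SCSI/MMC ones, spelled out so the example is self-contained.

```c
#include <stdint.h>

/* Command classes named in the commit message. */
enum atapi_cmd_class {
	ATAPI_READ,
	ATAPI_WRITE,
	ATAPI_RW_BUF,
	ATAPI_READ_CD,
	ATAPI_MISC,
};

/* Standard SCSI/MMC opcodes (local names, for illustration). */
#define OP_READ_10		0x28
#define OP_WRITE_10		0x2a
#define OP_WRITE_AND_VERIFY_10	0x2e
#define OP_WRITE_BUFFER		0x3b
#define OP_READ_BUFFER		0x3c
#define OP_READ_12		0xa8
#define OP_WRITE_12		0xaa
#define OP_READ_CD_MSF		0xb9
#define OP_READ_CD		0xbe

/* Map a SCSI opcode to its class; the grouping is an assumption. */
static enum atapi_cmd_class atapi_cmd_type_sketch(uint8_t opcode)
{
	switch (opcode) {
	case OP_READ_10:
	case OP_READ_12:
		return ATAPI_READ;
	case OP_WRITE_10:
	case OP_WRITE_12:
	case OP_WRITE_AND_VERIFY_10:
		return ATAPI_WRITE;
	case OP_READ_BUFFER:
	case OP_WRITE_BUFFER:
		return ATAPI_RW_BUF;
	case OP_READ_CD:
	case OP_READ_CD_MSF:
		return ATAPI_READ_CD;
	default:
		/* Everything else, e.g. INQUIRY or MODE SENSE. */
		return ATAPI_MISC;
	}
}
```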
ATA_PROT_ATAPI_* are ugly and naming schemes between ATA_PROT_* and
ATA_PROT_ATAPI_* are inconsistent causing confusion. Rename them to
ATAPI_PROT_* and make them consistent with ATA counterpart.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Add GPCMD_* constants for READ_BUFFER, WRITE_12 and WRITE_BUFFER for
completeness. These will be used by libata.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Reimplement ata_acpi_cbl_80wire() using ata_acpi_gtm_xfermask() and
while at it relocate the function below ata_acpi_gtm_xfermask().
The new ata_acpi_cbl_80wire() implementation takes @gtm. In both pata_via
and pata_amd, use the initial GTM value. Both are trying to peek at the
initial BIOS configuration, so using the initially cached value makes
sense. This fixes the ACPI part of cable detection in pata_amd which
previously always returned 0 because configuring PIO0 during reset
clears DMA configuration.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
libata-acpi is using separate timing tables for transfer modes
although libata-core has the complete ata_timing table. Implement
ata_timing_cycle2mode() to look for matching mode given transfer type
and cycle duration and use it in libata-acpi and pata_acpi to replace
private timing tables.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
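The idea behind ata_timing_cycle2mode() in the message above, looking up a
transfer mode from a cycle duration in one shared timing table, can be
sketched like this. The table below only covers PIO0 to PIO4 with their
nominal minimum cycle times, the function name and signature are
hypothetical, and the matching rule (pick the fastest mode whose cycle
time is not shorter than the given one) is an assumption.

```c
#include <stddef.h>

#define MODE_INVALID 0xff /* invalid transfer mode marker */

/* Minimum cycle time in ns for PIO0..PIO4, slowest mode first. */
static const unsigned int pio_cycle_ns[] = { 600, 383, 240, 180, 120 };

/*
 * Return the index of the fastest PIO mode whose cycle time is at
 * least cycle_ns, or MODE_INVALID if even PIO0 is too fast.
 */
static unsigned int pio_cycle2mode_sketch(unsigned int cycle_ns)
{
	unsigned int last_mode = MODE_INVALID;
	size_t i;

	for (i = 0; i < sizeof(pio_cycle_ns) / sizeof(pio_cycle_ns[0]); i++) {
		/* Stop once the table entry is faster than the given cycle. */
		if (cycle_ns > pio_cycle_ns[i])
			break;
		last_mode = (unsigned int)i;
	}
	return last_mode;
}
```

A 240 ns cycle maps to PIO2, and so does a 200 ns cycle, because PIO3
would require 180 ns.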
Finding out matching transfer mode from ACPI GTM values is useful for
other purposes too. Separate out the function and timing tables from
pata_acpi::pacpi_discover_modes().
Other than checking shared-configuration bit after doing
ata_acpi_gtm() in pacpi_discover_modes() which should be safe, this
patch doesn't introduce any behavior change.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ATA_CBL_PATA_UNK indicates that the cable type can't be determined
from the host side and might be either 80c or 40c. libata applies
drive or other generic limit in this case. However, there are
controllers where both host and drive side detections are
misimplemented and the driver has to rely solely on private method -
peeking BIOS or ACPI configuration or using some other private
mechanism.
This patch adds ATA_CBL_PATA_IGN which tells libata to ignore the
cable type completely and just let the LLD determine the transfer mode
via host transfer mode masks and ->mode_filter().
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jeff says xfer_mask is unsigned long not unsigned int. Convert all
xfermask fields and handling functions to deal with unsigned longs.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ata_id_to_dma_mode() isn't quite generic. The function is basically
privately implemented ata_id_xfermask() combined with hardcoded mode
printing and configuration which are specific to ata_generic.
Kill the function and open code it in generic_set_mode() using generic
xfermode handling functions.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* s/ATA_BITS_(PIO|MWDMA|UDMA)/ATA_NR_\1_MODES/g
* Consistently use 0xff to indicate invalid transfer mode (0x00 is
valid for PIO_SLOW).
* Make ata_xfer_mode2mask() return proper mode mask instead of just
the highest bit.
* Sort ata_timing table in increasing xfermode order and update
ata_timing_find_mode() accordingly.
This patch doesn't introduce any behavior change.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
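The difference between a "proper mode mask" and "just the highest bit" in
the xfermode cleanup above can be sketched as follows. The helper names
and the (shift, bits, idx) parameterisation are hypothetical; the real
ata_xfer_mode2mask() takes a transfer mode value instead.

```c
/*
 * shift: bit position where this transfer class starts in the mask
 * bits:  number of modes in the class
 * idx:   mode index within the class (0 is the slowest)
 */

/* Old behavior as described: only the bit of the given mode is set. */
static unsigned long highest_bit_only(unsigned int shift, unsigned int idx)
{
	return 1ul << (shift + idx);
}

/* New behavior as described: the given mode and all slower ones are set. */
static unsigned long mode_mask(unsigned int shift, unsigned int bits,
			       unsigned int idx)
{
	if (idx >= bits)
		return 0; /* not a valid mode in this class */
	return ((2ul << idx) - 1) << shift;
}
```

For a class starting at bit 0 with 7 modes, mode index 2 yields 0x4 with
the first helper and 0x7 with the second.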
Export the following xfermode related functions.
* ata_pack_xfermask()
* ata_unpack_xfermask()
* ata_xfer_mask2mode()
* ata_xfer_mode2mask()
* ata_xfer_mode2shift()
* ata_mode_string()
* ata_id_xfermask()
* ata_timing_find_mode()
These functions will be used later by LLD updates. While at it,
change unsigned short @speed to u8 @xfer_mode in
ata_timing_find_mode() for consistency.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ATA_DFLAG_DUBIOUS_XFER is set whenever data transfer speed or method
changes and gets cleared when data transfer command succeeds in the
newly configured transfer mode.
This will be used to improve speed down logic.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Clean up EH speed down implementation.
* is_io boolean variable is replaced with eflags. is_io is now ATA_EFLAG_IS_IO.
* Error categories now have names.
* Better comments.
* Reorder 5min and 10min rules in ata_eh_speed_down_verdict()
* Use local variable @link to cache @dev->link in ata_eh_speed_down()
These changes are to improve readability and ease further changes.
This patch doesn't introduce any behavior change.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Implement protocol tests - ata_is_atapi(), ata_is_nodata(),
ata_is_pio(), ata_is_dma(), ata_is_ncq() and ata_is_data() and use
them to replace is_atapi_taskfile() and hard coded protocol tests.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
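One way to build the protocol tests from the message above is a per
protocol flag table with small predicate helpers on top. The sketch below
assumes that design. The protocol list, flag bits and function names are
all hypothetical and only mirror the ata_is_*() helpers named in the
message.

```c
#include <stdbool.h>

/* Hypothetical protocol list, for illustration only. */
enum proto {
	PROTO_NODATA, PROTO_PIO, PROTO_DMA, PROTO_NCQ,
	PROTO_ATAPI_NODATA, PROTO_ATAPI_PIO, PROTO_ATAPI_DMA,
	PROTO_COUNT
};

/* Hypothetical flag bits describing each protocol. */
enum {
	F_PIO   = 1 << 0,
	F_DMA   = 1 << 1,
	F_NCQ   = 1 << 2,
	F_ATAPI = 1 << 3,
	F_DATA  = F_PIO | F_DMA, /* any protocol that moves data */
};

static unsigned int proto_flags(enum proto p)
{
	static const unsigned int flags[PROTO_COUNT] = {
		[PROTO_NODATA]       = 0,
		[PROTO_PIO]          = F_PIO,
		[PROTO_DMA]          = F_DMA,
		[PROTO_NCQ]          = F_DMA | F_NCQ,
		[PROTO_ATAPI_NODATA] = F_ATAPI,
		[PROTO_ATAPI_PIO]    = F_ATAPI | F_PIO,
		[PROTO_ATAPI_DMA]    = F_ATAPI | F_DMA,
	};
	return flags[p];
}

/* Predicates in the spirit of ata_is_atapi(), ata_is_pio(), and so on. */
static bool is_atapi(enum proto p)  { return proto_flags(p) & F_ATAPI; }
static bool is_pio(enum proto p)    { return proto_flags(p) & F_PIO; }
static bool is_dma(enum proto p)    { return proto_flags(p) & F_DMA; }
static bool is_ncq(enum proto p)    { return proto_flags(p) & F_NCQ; }
static bool is_data(enum proto p)   { return proto_flags(p) & F_DATA; }
static bool is_nodata(enum proto p) { return !is_data(p); }
```

Callers can then write is_atapi(p) && is_dma(p) instead of comparing
against each protocol constant by hand.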
Area for DFLAGs which are cleared on INIT is full. Extend it by 8
bits.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Historically word 48 in the identify data was used to mean 32bit I/O
was supported for VLB IDE etc. ATA8 reassigns this word to the Trusted
Computing Group, where it is used for TCG features. This means that
an ATA8 TCG drive is going to trigger 32bit I/O on some systems which
will be funny.
Anyway we need to sort this out ready for ATA8 so:
- Reorder the ata.h header a bit so the ata_version function occurs early
in it
- Make dword_io check the ATA version
- Add an ATA8 version checking TCG presence test
While we are at it the current drafts have a flaw where it may not be
possible to disable TCG features at boot (and opt out of the trusted
model) as TCG intends because it relies on presence of a different
optional feature (DCS). Handle this in software by refusing the TCG
commands if libata.allow_tpm is not set. (We must make it possible
as some environments such as proprietary VDR devices will doubtless
want to use it to lock up content)
Finally as with CPRM print a warning so that the user knows they may
not be able to fully access and use the device.
Signed-off-by: Alan Cox <alan@redhat.com>
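The version-dependent meaning of identify word 48 described above can be
sketched as two predicates. Assumptions: the major ATA version is passed
in directly rather than decoded from the identify data, the relevant flag
is bit 0 of word 48, and the function names are hypothetical. The actual
patch may test additional bits.

```c
#include <stdbool.h>
#include <stdint.h>

#define ID_WORD_48_FLAG 0x0001 /* assumed: bit 0 of identify word 48 */

/* Before ATA8, word 48 advertises 32bit I/O support. */
static bool id_has_dword_io(const uint16_t *id, unsigned int major_version)
{
	if (major_version > 7)
		return false; /* ATA8 reuses the word for TCG features */
	return id[48] & ID_WORD_48_FLAG;
}

/* From ATA8 on, the same word signals TCG feature presence. */
static bool id_has_tcg(const uint16_t *id, unsigned int major_version)
{
	if (major_version < 8)
		return false; /* only meaningful on ATA8 and later */
	return id[48] & ID_WORD_48_FLAG;
}
```

With the same identify data, an ATA7 device reports 32bit I/O and no TCG
features, while an ATA8 device reports the opposite.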
Dave Young reported warnings from lockdep that the workqueue API
can sometimes try to register lockdep classes with the same key
but different names. This is not permitted in lockdep.
Unfortunately, I was unaware of that restriction when I wrote
the code to debug workqueue problems with lockdep and used the
workqueue name as the lockdep class name. This can obviously
lead to the problem if the workqueue name is dynamic.
This patch solves the problem by always using a constant name
for the workqueue's lockdep class, namely either the constant
name that was passed in or a string consisting of the variable
name.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Devices that misreport the validity bit for word 93 look like SATA. If
they are on the blacklist then we must not test for SATA but assume 40 wire
in the 40 wire case (The TSSCorp reports 80 wire on SATA it seems!)
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This reverts commit 2e6883bdf4, as
requested by Fengguang Wu. It's not quite fully baked yet, and while
there are patches around to fix the problems it caused, they should get
more testing. Says Fengguang: "I'll resend them both for -mm later on,
in a more complete patchset".
See
http://bugzilla.kernel.org/show_bug.cgi?id=9738
for some of this discussion.
Requested-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
pnpacpi: print resource shortage message only once
PM: ACPI and APM must not be enabled at the same time
ACPI: apply quirk_ich6_lpc_acpi to more ICH8 and ICH9
ACPICA: fix acpi_serialize hang regression
ACPI : Not register gsi for PCI IDE controller in legacy mode
ACPI: Reintroduce run time configurable max_cstate for !CPU_IDLE case
ACPI: Make sysfs interface in ACPI power optional.
ACPI: EC: Enable boot EC before bus_scan
increase PNP_MAX_PORT to 40 from 24