asgardius/android_kernel_motorola_sm6225

Author	SHA1	Message	Date
Al Viro	a5cb013da7	[PATCH] auditing ptrace Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2007-05-11 05:38:25 -04:00
Patrick McHardy	3c2ad469c3	[NETFILTER]: Clean up table initialization - move arp_tables initial table structure definitions to arp_tables.h similar to ip_tables and ip6_tables - use C99 initializers - use initializer macros where possible Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-10 23:47:43 -07:00
Herbert Xu	572a103ded	[NET] link_watch: Move link watch list into net_device These days the link watch mechanism is an integral part of the network subsystem as it manages the carrier status. So it now makes sense to allocate some memory for it in net_device rather than allocating it on demand. In fact, this is necessary because we can't tolerate a memory allocation failure since that means we'd have to potentially throw a link up event away. It also simplifies the code greatly. In doing so I discovered a subtle race condition in the use of singleevent. This race condition still exists (and is somewhat magnified) without singleevent but it's now plugged thanks to an smp_mb__before_clear_bit. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-10 23:45:07 -07:00
Linus Torvalds	62933d36ac	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (24 commits) [POWERPC] Fix compile error with kexec and CONFIG_SMP=n [POWERPC] Split initrd logic out of early_init_dt_scan_chosen() to fix warning [POWERPC] Fix warning in hpte_decode(), and generalize it [POWERPC] Minor pSeries IOMMU debug cleanup [POWERPC] PS3: Fix sys manager build error [POWERPC] Assorted janitorial EEH cleanups [POWERPC] We don't define CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID [POWERPC] pmu_sys_suspended is only defined for PPC32 [POWERPC] Fix incorrect calculation of I/O window addresses [POWERPC] celleb: Update celleb_defconfig [POWERPC] celleb: Fix parsing of machine type hack command line option [POWERPC] celleb: Fix PCI config space accesses to subordinate buses [POWERPC] celleb: Fix support for multiple PCI domains [POWERPC] Wire up sys_utimensat [POWERPC] CPM_UART: Removed __init from cpm_uart_init_portdesc to fix warning [POWERPC] User rheap from arch/powerpc/lib [POWERPC] 83xx: Fix the PCI ranges in the MPC834x_MDS device tree. [POWERPC] 83xx: Fix the PCI ranges in the MPC832x_MDS device tree. [POWERPC] CPM_UART: cpm_uart_set_termios should take ktermios, not termios [POWERPC] Change rheap functions to use ulongs instead of pointers ...	2007-05-10 13:32:24 -07:00
Linus Torvalds	b526ca438b	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: acpi,msi-laptop: Fall back to EC polling mode for MSI laptop specific EC commands sony-laptop: rename SONY_LAPTOP_OLD to a more meaningful SONYPI_COMPAT asus-laptop: version bump and lindent asus-laptop: fix light sens init asus-laptop: add GPS support asus-laptop: notify ALL events ACPICA: Lindent ACPI: created a dedicated workqueue for notify() execution Revert "ACPICA: fix AML mutex re-entrancy" Revert "Execute AML Notify() requests on stack." Revert "ACPICA: revert "acpi_serialize" changes" ACPI: delete un-reliable concept of cooling mode ACPI: thermal trip points are read-only	2007-05-10 13:30:34 -07:00
Linus Torvalds	9b6a51746f	Merge branch 'juju' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 * 'juju' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: (138 commits) firewire: Convert OHCI driver to use standard goto unwinding for error handling. firewire: Always use parens with sizeof. firewire: Drop single buffer request support. firewire: Add a comment to describe why we split the sg list. firewire: Return SCSI_MLQUEUE_HOST_BUSY for out of memory cases in queuecommand. firewire: Handle the last few DMA mapping error cases. firewire: Allocate scsi_host up front and allocate the sbp2_device as hostdata. firewire: Provide module aliase for backwards compatibility. firewire: Add to fw-core-y instead of assigning fw-core-objs in Makefile. firewire: Break out shared IEEE1394 constant to separate header file. firewire: Use linux/.h instead of asm/.h header files. firewire: Uppercase most macro names. firewire: Coding style cleanup: no spaces after function names. firewire: Convert card_rwsem to a regular mutex. firewire: Clean up comment style. firewire: Use lib/ implementation of CRC ITU-T. CRC ITU-T V.41 firewire: Rename fw-device-cdev.c to fw-cdev.c and move header to include/linux. firewire: Future proof the iso ioctls by adding a handle for the iso context. firewire: Add read/write and size annotations to IOC numbers. ... Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-10 13:30:08 -07:00
Andrew Morton	218e180e7e	add upper-32-bits macro We keep on getting "right shift count >= width of type" warnings when doing things like sector_t s; x = s >> 56; because with CONFIG_LBD=n, s is only 32-bit. Similar problems can occur with dma_addr_t's. So add a simple wrapper function which code can use to avoid this warning. The above example would become x = upper_32_bits(s) >> 24; The first user is in fact AFS. Cc: James Bottomley <James.Bottomley@SteelEye.com> Cc: "Cameron, Steve" <Steve.Cameron@hp.com> Cc: "Miller, Mike (OS Dev)" <Mike.Miller@hp.com> Cc: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-10 09:26:52 -07:00
Christoph Lameter	894b8788d7	slub: support concurrent local and remote frees and allocs on a slab Avoid atomic overhead in slab_alloc and slab_free SLUB needs to use the slab_lock for the per cpu slabs to synchronize with potential kfree operations. This patch avoids that need by moving all free objects onto a lockless_freelist. The regular freelist continues to exist and will be used to free objects. So while we consume the lockless_freelist the regular freelist may build up objects. If we are out of objects on the lockless_freelist then we may check the regular freelist. If it has objects then we move those over to the lockless_freelist and do this again. There is a significant savings in terms of atomic operations that have to be performed. We can even free directly to the lockless_freelist if we know that we are running on the same processor. So this speeds up short lived objects. They may be allocated and freed without taking the slab_lock. This is particular good for netperf. In order to maximize the effect of the new faster hotpath we extract the hottest performance pieces into inlined functions. These are then inlined into kmem_cache_alloc and kmem_cache_free. So hotpath allocation and freeing no longer requires a subroutine call within SLUB. [I am not sure that it is worth doing this because it changes the easy to read structure of slub just to reduce atomic ops. However, there is someone out there with a benchmark on 4 way and 8 way processor systems that seems to show a 5% regression vs. Slab. Seems that the regression is due to increased atomic operations use vs. SLAB in SLUB). I wonder if this is applicable or discernable at all in a real workload?] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-10 09:26:52 -07:00
Kristian Høgsberg	4c5a443e80	firewire: Break out shared IEEE1394 constant to separate header file. Signed-off-by: Kristian Hoegsberg <krh@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2007-05-10 18:24:13 +02:00
Kristian Høgsberg	04dfb8dbd2	firewire: Use linux/.h instead of asm/.h header files. Signed-off-by: Kristian Hoegsberg <krh@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2007-05-10 18:24:13 +02:00
Ivo van Doorn	3e7cbae7c6	CRC ITU-T V.41 This will add the CRC calculation according to the CRC ITU-T V.41 to the kernel lib/ folder. This code has been derived from the rt2x00 driver, currently found only in the wireless-dev tree, but this library is generic and could be used by more drivers who currently use their own implementation. Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Also useful for the new firewire stack. Signed-off-by: Kristian Hoegsberg <krh@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2007-05-10 18:24:13 +02:00
Stephen Rothwell	49d687b636	[POWERPC] pmu_sys_suspended is only defined for PPC32 thus we get a link error on ppc64 with CONFIG_PM=y. This fixes it. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-10 21:28:13 +10:00
Lennart Poettering	00eb43a189	acpi,msi-laptop: Fall back to EC polling mode for MSI laptop specific EC commands The ACPI EC that is used in MSI laptops knows some non-standard commands for changing the screen brighntess and a few other things, which are used by the msi-laptop.c driver. Unfortunately for these commands no GPE events for IBF and OBF are triggered. Since nowadays the EC code uses the ec_intr=1 mode by default, this causes these operations to timeout, although they don't fail. In result, all operations that you can do with the msi-laptop.c driver take more or less 1s to complete, which is awfully slow. In one of the more recent kernels (2.6.20?) the EC subsystem has been revamped. With that change the EC timeout has been increased. before that increase the MSI EC accesses were slow -- but not that slow, hence I took notice of this limitation of the MSI EC hardware only very recently. The standard EC operations on the MSI EC as defined in the ACPI spec support GPE events properly. The following patch adds a new argument "force_poll" to the ec_transaction() function (and friends). If set to 1, the function will poll for IBF/OBF even if ec_intr=1 is enabled. If set to 0 the current behaviour is used. The msi-laptop driver is modified to make use of this new flag, so that OBF/IBF is polled for the special MSI EC transactions -- but only for them. Signed-off-by: Lennart Poettering <mzxreary@0pointer.de> Acked-by: Alexey Starikovskiy <aystarik@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>	2007-05-10 03:52:22 -04:00
Linus Torvalds	de5603748a	Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters IB: Put rlimit accounting struct in struct ib_umem IB/uverbs: Export ib_umem_get()/ib_umem_release() to modules	2007-05-09 19:40:09 -07:00
Linus Torvalds	44ce6294d0	Revert "md: improve partition detection in md array" This reverts commit `5b479c91da`. Quoth Neil Brown: "It causes an oops when auto-detecting raid arrays, and it doesn't seem easy to fix. The array may not be 'open' when do_md_run is called, so bdev->bd_disk might be NULL, so bd_set_size can oops. This whole approach of opening an md device before it has been assembled just seems to get more and more painful. I think I'm going to have to come up with something clever to provide both backward comparability with usage expectation, and sane integration into the rest of the kernel." Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 18:51:36 -07:00
Linus Torvalds	3cb7396b7b	Merge master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6: ide: fix PIO setup on resume for ATAPI devices ide: legacy PCI bus order probing fixes ide: add ide_proc_register_port() ide: add "initializing" argument to ide_register_hw() ide: cable detection fixes (take 2) ide: move IDE settings handling to ide-proc.c ide: split off ioctl handling from IDE settings (v2) ide: make /proc/ide/ optional ide: add ide_tune_dma() helper ide: rework the code for selecting the best DMA transfer mode (v3) ide: fix UDMA/MWDMA/SWDMA masks (v3)	2007-05-09 15:41:31 -07:00
Bartlomiej Zolnierkiewicz	6d208b39c4	ide: legacy PCI bus order probing fixes IDE PCI host drivers should register themselves with IDE core only when IDE driver is built-in, otherwise (IDE driver is modular and thus IDE PCI host drivers are also modular) the code has no effect and just complicates the probing. Fix it by adding new config option CONFIG_IDEPCI_PCIBUS (defined only when needed and invisible to the user) and covering by #ifdef/#endif the code in question. It turned out that "ide=reverse" was silently accepted but did nothing in case when IDE driver was modular, this is fixed now. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-05-10 00:01:11 +02:00
Bartlomiej Zolnierkiewicz	5cbf79cdb3	ide: add ide_proc_register_port() * create_proc_ide_interfaces() tries to add /proc entries for every probed and initialized IDE port, replace it by ide_proc_register_port() which does it only for the given port (also rename destroy_proc_ide_interface() to ide_proc_unregister_port() for consistency) * convert {create,destroy}_proc_ide_interface[s]() users to use new functions * pmac driver depended on proc_ide_create() to add /proc port entries, fix it * au1xxx-ide, swarm and cs5520 drivers depended indirectly on ide-generic driver (CONFIG_IDE_GENERIC=y) to add port /proc entries, fix them * there is now no need to add /proc entries for IDE ports in proc_ide_create() so don't do it * proc_ide_create() needs now to be called before drivers are probed - fix it, while at it make proc_ide_create() create /proc "ide" directory Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-05-10 00:01:11 +02:00
Bartlomiej Zolnierkiewicz	869c56ee9d	ide: add "initializing" argument to ide_register_hw() Add "initializing" argument to ide_register_hw() and use it instead of ide.c wide variable of the same name. Update all users of ide_register_hw() accordingly. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-05-10 00:01:10 +02:00
Bartlomiej Zolnierkiewicz	7f8f48af08	ide: cable detection fixes (take 2) Tejun's recent eighty_ninty_three() fix has inspired me to do more thorough review of the cable detection code... * print user-friendly warning about limiting the maximum transfer speed to UDMA33 (and the reason behind it) when 80-wire cable is not detected, also while at it cleanup eighty_ninty_three() a bit * use eighty_ninty_three() in ide_ata66_check(), this actually fixes 3 bugs: - bit 14 (word 93 validity check) == 1 && bit 13 (80-wire cable test) == 1 were used as 80-wire cable present test for CONFIG_IDEDMA_IVB=n case (please see FIXME comment in eighty_ninty_three() for more details) - CONFIG_IDEDMA_IVB=y/n cases were interchanged - check for SATA devices was missing * remove private cable warnings from pdc_202xx{old,new} drivers now that core code provides this functionality (plus, in pdc202xx_new case the test could give false warnings for ATAPI devices because pdc202xx_new driver doesn't even support ATAPI DMA) Cc: Tejun Heo <htejun@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-05-10 00:01:10 +02:00
Bartlomiej Zolnierkiewicz	7662d046df	ide: move IDE settings handling to ide-proc.c * move __ide_add_setting() ide_add_setting() __ide_remove_setting() auto_remove_settings() ide_find_setting_by_name() ide_read_setting() ide_write_setting() set_xfer_rate() ide_add_generic_settings() ide_register_subdriver() ide_unregister_subdriver() from ide.c to ide-proc.c * set_{io_32bit,pio_mode,using_dma}() cannot be marked static now, fix it * rename ide_[un]register_subdriver() to ide_proc_[un]register_driver(), update device drivers to use new names * add CONFIG_IDE_PROC_FS=n versions of ide_proc_[un]register_driver() and ide_add_generic_settings() * make ide_find_setting_by_name(), ide_{read,write}_setting() and ide_{add,remove}_proc_entries() static * cover IDE settings code in device drivers with CONFIG_IDE_PROC_FS #ifdef, also while at it cover with CONFIG_IDE_PROC_FS #ifdef ide_driver_t.proc * remove bogus comment from ide.h * cover with CONFIG_IDE_PROC_FS #ifdef .proc and .settings in ide_drive_t Besides saner code this patch results in the IDE core smaller by ~2 kB (on x86-32) and IDE disk driver by ~1 kB (ditto) when CONFIG_IDE_PROC_FS=n. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-05-10 00:01:10 +02:00
Bartlomiej Zolnierkiewicz	1497943ee6	ide: split off ioctl handling from IDE settings (v2) * do write permission and min/max checks in ide_procset_t functions * ide-disk.c: drive->id is always available so cleanup "multcount" setting accordingly * ide-disk.c: "address" setting was incorrectly defined as type TYPE_INTA, fix it by using type TYPE_BYTE and updating ide_drive_t->adressing field, the bug didn't trigger because this IDE setting uses custom ->set function * ide.c: add set_ksettings() for handling HDIO_SET_KEEPSETTINGS ioctl * ide.c: add set_unmaskirq() for handling HDIO_SET_UNMASKINTR ioctl * handle ioctls directly in generic_ide_ioclt() and idedisk_ioctl() instead of using IDE settings to deal with them * remove no longer needed ide_find_setting_by_ioctl() and {read,write}_ioctl fields from ide_settings_t, also remove now unused TYPE_INTA handling v2: * add missing EXPORT_SYMBOL_GPL(ide_setting_sem) needed now for ide-disk Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-05-10 00:01:10 +02:00
Bartlomiej Zolnierkiewicz	ecfd80e4a5	ide: make /proc/ide/ optional All important information/features should be already available through sysfs and ioctl interfaces. Add CONFIG_IDE_PROC_FS (CONFIG_SCSI_PROC_FS rip-off) config option, disabling it makes IDE driver ~5 kB smaller (on x86-32). While at it add CONFIG_PROC_FS=n versions of proc_ide_{create,destroy}() and remove no longer needed #ifdefs. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-05-10 00:01:09 +02:00
Bartlomiej Zolnierkiewicz	29e744d088	ide: add ide_tune_dma() helper After reworking the code responsible for selecting the best DMA transfer mode it is now possible to add generic ide_tune_dma() helper. Convert some IDE PCI host drivers to use it (the ones left need more work). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-05-10 00:01:09 +02:00
Bartlomiej Zolnierkiewicz	2d5eaa6dd7	ide: rework the code for selecting the best DMA transfer mode (v3) Depends on the "ide: fix UDMA/MWDMA/SWDMA masks" patch. * add ide_hwif_t.udma_filter hook for filtering UDMA mask (use it in alim15x3, hpt366, siimage and serverworks drivers) * add ide_max_dma_mode() for finding best DMA mode for the device (loosely based on some older libata-core.c code) * convert ide_dma_speed() users to use ide_max_dma_mode() * make ide_rate_filter() take "ide_drive_t drive" as an argument instead of "u8 mode" and teach it to how to use UDMA mask to do filtering use ide_rate_filter() in hpt366 driver * remove no longer needed ide_dma_speed() and _ratemask() unexport eighty_ninty_three() v2: * rename ->filter_udma_mask to ->udma_filter [ Suggested by Sergei Shtylyov <sshtylyov@ru.mvista.com>. ] v3: * updated for scc_pata driver (fixes XFER_UDMA_6 filtering for user-space originated transfer mode change requests when 100MHz clock is used) Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-05-10 00:01:08 +02:00
Bartlomiej Zolnierkiewicz	1813720723	ide: fix UDMA/MWDMA/SWDMA masks (v3) * use 0x00 instead of 0x80 to disable ->{ultra,mwdma,swdma}_mask * add udma_mask field to ide_pci_device_t and use it to initialize ->ultra_mask in aec62xx, cmd64x, pdc202xx_{new,old} and piix drivers * fix UDMA masks to match with chipset specific _ratemask() (alim15x3, hpt366, serverworks and siimage drivers need UDMA mask filtering method - done in the next patch) v2: piix: fix cable detection for 82801AA_1 and 82372FB_1 [ Noticed by Sergei Shtylyov <sshtylyov@ru.mvista.com>. ] * cmd64x: use hwif->cds->udma_mask [ Suggested by Sergei Shtylyov <sshtylyov@ru.mvista.com>. ] * aec62xx: fix newly introduced bug - check DMA status not command register [ Noticed by Sergei Shtylyov <sshtylyov@ru.mvista.com>. ] v3: * piix: use hwif->cds->udma_mask [ Suggested by Sergei Shtylyov <sshtylyov@ru.mvista.com>. ] Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-05-10 00:01:07 +02:00
Linus Torvalds	ba7cc09c9c	Merge git://git.infradead.org/mtd-2.6 * git://git.infradead.org/mtd-2.6: (21 commits) [MTD] [CHIPS] Remove MTD_OBSOLETE_CHIPS (jedec, amd_flash, sharp) [MTD] Delete allegedly obsolete "bank_size" field of mtd_info. [MTD] Remove unnecessary user space check from mtd.h. [MTD] [MAPS] Remove flash maps for no longer supported 405LP boards [MTD] [MAPS] Fix missing printk() parameter in physmap_of.c MTD driver [MTD] [NAND] platform NAND driver: add driver [MTD] [NAND] platform NAND driver: update header [JFFS2] Simplify and clean up jffs2_add_tn_to_tree() some more. [JFFS2] Remove another bogus optimisation in jffs2_add_tn_to_tree() [JFFS2] Remove broken insert_point optimisation in jffs2_add_tn_to_tree() [JFFS2] Remember to calculate overlap on nodes which replace older nodes [JFFS2] Don't advance c->wbuf_ofs to next eraseblock after wbuf flush [MTD] [NAND] at91_nand.c: CMDLINE_PARTS support [MTD] [NAND] Tidy up handling of page number in nand_block_bad() [MTD] block2mtd_paramline[] mustn't be __initdata [MTD] [NAND] Support multiple chips in CAFÉ driver [MTD] [NAND] Rename cafe.c to cafe_nand.c and remove the multi-obj magic [MTD] [NAND] Use rslib for CAFÉ ECC [RSLIB] Support non-canonical GF representations [JFFS2] Remove dead file histo_mips.h ...	2007-05-09 13:10:11 -07:00
Linus Torvalds	aabded9c3a	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: [POWERPC] Further fixes for the removal of 4level-fixup hack from ppc32 [POWERPC] EEH: log all PCI-X and PCI-E AER registers [POWERPC] EEH: capture and log pci state on error [POWERPC] EEH: Split up long error msg [POWERPC] EEH: log error only after driver notification. [POWERPC] fsl_soc: Make mac_addr const in fs_enet_of_init(). [POWERPC] Don't use SLAB/SLUB for PTE pages [POWERPC] Spufs support for 64K LS mappings on 4K kernels [POWERPC] Add ability to 4K kernel to hash in 64K pages [POWERPC] Introduce address space "slices" [POWERPC] Small fixes & cleanups in segment page size demotion [POWERPC] iSeries: Make HVC_ISERIES the default [POWERPC] iSeries: suppress build warning in lparmap.c [POWERPC] Mark pages that don't exist as nosave [POWERPC] swsusp: Introduce register_nosave_region_late	2007-05-09 12:56:01 -07:00
Linus Torvalds	9a9136e270	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial * git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (25 commits) sound: convert "sound" subdirectory to UTF-8 MAINTAINERS: Add cxacru website/mailing list include files: convert "include" subdirectory to UTF-8 general: convert "kernel" subdirectory to UTF-8 documentation: convert the Documentation directory to UTF-8 Convert the toplevel files CREDITS and MAINTAINERS to UTF-8. remove broken URLs from net drivers' output Magic number prefix consistency change to Documentation/magic-number.txt trivial: s/i_sem /i_mutex/ fix file specification in comments drivers/base/platform.c: fix small typo in doc misc doc and kconfig typos Remove obsolete fat_cvf help text Fix occurrences of "the the " Fix minor typoes in kernel/module.c Kconfig: Remove reference to external mqueue library Kconfig: A couple of grammatical fixes in arch/i386/Kconfig Correct comments in genrtc.c to refer to correct /proc file. Fix more "deprecated" spellos. Fix "deprecated" typoes. ... Fix trivial comment conflict in kernel/relay.c.	2007-05-09 12:54:17 -07:00
NeilBrown	5b479c91da	md: improve partition detection in md array md currently uses ->media_changed to make sure rescan_partitions is call on md array after they are assembled. However that doesn't happen until the array is opened, which is later than some people would like. So use blkdev_ioctl to do the rescan immediately that the array has been assembled. This means we can remove all the ->change infrastructure as it was only used to trigger a partition rescan. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:57 -07:00
Haavard Skinnemoen	880169dd2e	fbdev: add support for AVR32 Provide framebuffer page protection flags and definitions of fb_readl/fb_writel for AVR32. Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:57 -07:00
Antonino A. Daplas	5a87ede945	svgalib: move fb_get_caps to svgalib Move fb_get_caps() method to svgalib.c as svga_get_caps() so it can be used by s3fb, arkfb and vt8623fb. Signed-off-by: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:57 -07:00
David Rientjes	0d7ebbbc6e	compiler: introduce __used and __maybe_unused __used is defined to be __attribute__((unused)) for all pre-3.3 gcc compilers to suppress warnings for unused functions because perhaps they are referenced only in inline assembly. It is defined to be __attribute__((used)) for gcc 3.3 and later so that the code is still emitted for such functions. __maybe_unused is defined to be __attribute__((unused)) for both function and variable use if it could possibly be unreferenced due to the evaluation of preprocessor macros. Function prototypes shall be marked with __maybe_unused if the actual definition of the function is dependant on preprocessor macros. No update to compiler-intel.h is necessary because ICC supports both __attribute__((used)) and __attribute__((unused)) as specified by the gcc manual. __attribute_used__ is deprecated and will be removed once all current code is converted to using __used. Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Adrian Bunk <bunk@stusta.de> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:56 -07:00
Roman Zippel	f7e4217b00	rename thread_info to stack This finally renames the thread_info field in task structure to stack, so that the assumptions about this field are gone and archs have more freedom about placing the thread_info structure. Nonbroken archs which have a proper thread pointer can do the access to both current thread and task structure via a single pointer. It'll allow for a few more cleanups of the fork code, from which e.g. ia64 could benefit. Signed-off-by: Roman Zippel <zippel@linux-m68k.org> [akpm@linux-foundation.org: build fix] Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Andi Kleen <ak@muc.de> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:56 -07:00
Roman Zippel	e61a1c1c4f	Allow arch to initialize arch field of the module structure This will later allow an arch to add module specific information via linker generated tables instead of poking directly in the module object structure. Signed-off-by: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:56 -07:00
Thomas Gleixner	b52f52a093	clocksource: fix resume logic We need to make sure that the clocksources are resumed, when timekeeping is resumed. The current resume logic does not guarantee this. Add a resume function pointer to the clocksource struct, so clocksource drivers which need to reinitialize the clocksource can provide a resume function. Add a resume function, which calls the maybe available clocksource resume functions and resets the watchdog function, so a stable TSC can be used accross suspend/resume. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Cc: Andi Kleen <ak@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:56 -07:00
Christoph Lameter	4037d45220	Move remote node draining out of slab allocators Currently the slab allocators contain callbacks into the page allocator to perform the draining of pagesets on remote nodes. This requires SLUB to have a whole subsystem in order to be compatible with SLAB. Moving node draining out of the slab allocators avoids a section of code in SLUB. Move the node draining so that is is done when the vm statistics are updated. At that point we are already touching all the cachelines with the pagesets of a processor. Add a expire counter there. If we have to update per zone or global vm statistics then assume that the pageset will require subsequent draining. The expire counter will be decremented on each vm stats update pass until it reaches zero. Then we will drain one batch from the pageset. The draining will cause vm counter updates which will then cause another expiration until the pcp is empty. So we will drain a batch every 3 seconds. Note that remote node draining is a somewhat esoteric feature that is required on large NUMA systems because otherwise significant portions of system memory can become trapped in pcp queues. The number of pcp is determined by the number of processors and nodes in a system. A system with 4 processors and 2 nodes has 8 pcps which is okay. But a system with 1024 processors and 512 nodes has 512k pcps with a high potential for large amount of memory being caught in them. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:56 -07:00
Christoph Lameter	d1187ed210	vmstat: use our own timer events vmstat is currently using the cache reaper to periodically bring the statistics up to date. The cache reaper does only exists in SLUB as a way to provide compatibility with SLAB. This patch removes the vmstat calls from the slab allocators and provides its own handling. The advantage is also that we can use a different frequency for the updates. Refreshing vm stats is a pretty fast job so we can run this every second and stagger this by only one tick. This will lead to some overlap in large systems. F.e a system running at 250 HZ with 1024 processors will have 4 vm updates occurring at once. However, the vm stats update only accesses per node information. It is only necessary to stagger the vm statistics updates per processor in each node. Vm counter updates occurring on distant nodes will not cause cacheline contention. We could implement an alternate approach that runs the first processor on each node at the second and then each of the other processor on a node on a subsequent tick. That may be useful to keep a large amount of the second free of timer activity. Maybe the timer folks will have some feedback on this one? [jirislaby@gmail.com: add missing break] Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:56 -07:00
Rafael J. Wysocki	8bb7844286	Add suspend-related notifications for CPU hotplug Since nonboot CPUs are now disabled after tasks and devices have been frozen and the CPU hotplug infrastructure is used for this purpose, we need special CPU hotplug notifications that will help the CPU-hotplug-aware subsystems distinguish normal CPU hotplug events from CPU hotplug events related to a system-wide suspend or resume operation in progress. This patch introduces such notifications and causes them to be used during suspend and resume transitions. It also changes all of the CPU-hotplug-aware subsystems to take these notifications into consideration (for now they are handled in the same way as the corresponding "normal" ones). [oleg@tv-sign.ru: cleanups] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:56 -07:00
Nate Diller	f37bc2712b	fs: deprecate memclear_highpage_flush Now that all the in-tree users are converted over to zero_user_page(), deprecate the old memclear_highpage_flush() call. Signed-off-by: Nate Diller <nate.diller@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:56 -07:00
Nate Diller	01f2705daf	fs: convert core functions to zero_user_page It's very common for file systems to need to zero part or all of a page, the simplist way is just to use kmap_atomic() and memset(). There's actually a library function in include/linux/highmem.h that does exactly that, but it's confusingly named memclear_highpage_flush(), which is descriptive of how it does the work rather than what the purpose is. So this patchset renames the function to zero_user_page(), and calls it from the various places that currently open code it. This first patch introduces the new function call, and converts all the core kernel callsites, both the open-coded ones and the old memclear_highpage_flush() ones. Following this patch is a series of conversions for each file system individually, per AKPM, and finally a patch deprecating the old call. The diffstat below shows the entire patchset. [akpm@linux-foundation.org: fix a few things] Signed-off-by: Nate Diller <nate.diller@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:55 -07:00
Eric Dumazet	34f01cc1f5	FUTEX: new PRIVATE futexes Analysis of current linux futex code : -------------------------------------- A central hash table futex_queues[] holds all contexts (futex_q) of waiting threads. Each futex_wait()/futex_wait() has to obtain a spinlock on a hash slot to perform lookups or insert/deletion of a futex_q. When a futex_wait() is done, calling thread has to : 1) - Obtain a read lock on mmap_sem to be able to validate the user pointer (calling find_vma()). This validation tells us if the futex uses an inode based store (mapped file), or mm based store (anonymous mem) 2) - compute a hash key 3) - Atomic increment of reference counter on an inode or a mm_struct 4) - lock part of futex_queues[] hash table 5) - perform the test on value of futex. (rollback is value != expected_value, returns EWOULDBLOCK) (various loops if test triggers mm faults) 6) queue the context into hash table, release the lock got in 4) 7) - release the read_lock on mmap_sem <block> 8) Eventually unqueue the context (but rarely, as this part may be done by the futex_wake()) Futexes were designed to improve scalability but current implementation has various problems : - Central hashtable : This means scalability problems if many processes/threads want to use futexes at the same time. This means NUMA unbalance because this hashtable is located on one node. - Using mmap_sem on every futex() syscall : Even if mmap_sem is a rw_semaphore, up_read()/down_read() are doing atomic ops on mmap_sem, dirtying cache line : - lot of cache line ping pongs on SMP configurations. mmap_sem is also extensively used by mm code (page faults, mmap()/munmap()) Highly threaded processes might suffer from mmap_sem contention. mmap_sem is also used by oprofile code. Enabling oprofile hurts threaded programs because of contention on the mmap_sem cache line. - Using an atomic_inc()/atomic_dec() on inode ref counter or mm ref counter: It's also a cache line ping pong on SMP. It also increases mmap_sem hold time because of cache misses. Most of these scalability problems come from the fact that futexes are in one global namespace. As we use a central hash table, we must make sure they are all using the same reference (given by the mm subsystem). We chose to force all futexes be 'shared'. This has a cost. But fact is POSIX defined PRIVATE and SHARED, allowing clear separation, and optimal performance if carefuly implemented. Time has come for linux to have better threading performance. The goal is to permit new futex commands to avoid : - Taking the mmap_sem semaphore, conflicting with other subsystems. - Modifying a ref_count on mm or an inode, still conflicting with mm or fs. This is possible because, for one process using PTHREAD_PROCESS_PRIVATE futexes, we only need to distinguish futexes by their virtual address, no matter the underlying mm storage is. If glibc wants to exploit this new infrastructure, it should use new _PRIVATE futex subcommands for PTHREAD_PROCESS_PRIVATE futexes. And be prepared to fallback on old subcommands for old kernels. Using one global variable with the FUTEX_PRIVATE_FLAG or 0 value should be OK. PTHREAD_PROCESS_SHARED futexes should still use the old subcommands. Compatibility with old applications is preserved, they still hit the scalability problems, but new applications can fly :) Note : the same SHARED futex (mapped on a file) can be used by old binaries and new binaries, because both binaries will use the old subcommands. Note : Vast majority of futexes should be using PROCESS_PRIVATE semantic, as this is the default semantic. Almost all applications should benefit of this changes (new kernel and updated libc) Some bench results on a Pentium M 1.6 GHz (SMP kernel on a UP machine) /* calling futex_wait(addr, value) with value != addr / 433 cycles per futex(FUTEX_WAIT) call (mixing 2 futexes) 424 cycles per futex(FUTEX_WAIT) call (using one futex) 334 cycles per futex(FUTEX_WAIT_PRIVATE) call (mixing 2 futexes) 334 cycles per futex(FUTEX_WAIT_PRIVATE) call (using one futex) For reference : 187 cycles per getppid() call 188 cycles per umask() call 181 cycles per ni_syscall() call Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Pierre Peiffer <pierre.peiffer@bull.net> Cc: "Ulrich Drepper" <drepper@gmail.com> Cc: "Nick Piggin" <nickpiggin@yahoo.com.au> Cc: "Ingo Molnar" <mingo@elte.hu> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:55 -07:00
Pierre Peiffer	d0aa7a70bf	futex_requeue_pi optimization This patch provides the futex_requeue_pi functionality, which allows some threads waiting on a normal futex to be requeued on the wait-queue of a PI-futex. This provides an optimization, already used for (normal) futexes, to be used with the PI-futexes. This optimization is currently used by the glibc in pthread_broadcast, when using "normal" mutexes. With futex_requeue_pi, it can be used with PRIO_INHERIT mutexes too. Signed-off-by: Pierre Peiffer <pierre.peiffer@bull.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:55 -07:00
Pierre Peiffer	c19384b5b2	Make futex_wait() use an hrtimer for timeout This patch modifies futex_wait() to use an hrtimer + schedule() in place of schedule_timeout(). schedule_timeout() is tick based, therefore the timeout granularity is the tick (1 ms, 4 ms or 10 ms depending on HZ). By using a high resolution timer for timeout wakeup, we can attain a much finer timeout granularity (in the microsecond range). This parallels what is already done for futex_lock_pi(). The timeout passed to the syscall is no longer converted to jiffies and is therefore passed to do_futex() and futex_wait() as an absolute ktime_t therefore keeping nanosecond resolution. Also this removes the need to pass the nanoseconds timeout part to futex_lock_pi() in val2. In futex_wait(), if there is no timeout then a regular schedule() is performed. Otherwise, an hrtimer is fired before schedule() is called. [akpm@linux-foundation.org: fix `make headers_check'] Signed-off-by: Sebastien Dugue <sebastien.dugue@bull.net> Signed-off-by: Pierre Peiffer <pierre.peiffer@bull.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:55 -07:00
Andrew Morton	f34c506b03	declare struct ktime Some smarty went and inflicted ktime_t as a typedef upon us, so we cannot forward declare it. Create a new `union ktime', map ktime_t onto that. Now we need to kill off this ktime_t thing. Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:54 -07:00
Andrew Morton	b8522ead35	aio is unlikely Stick an unlikely() around is_aio(): I assert that most IO is synchronous. Cc: Suparna Bhattacharya <suparna@in.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Zach Brown <zach.brown@oracle.com> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:54 -07:00
Jeff Layton	cd123012d9	RPC: add wrapper for svc_reserve to account for checksum When the kernel calls svc_reserve to downsize the expected size of an RPC reply, it fails to account for the possibility of a checksum at the end of the packet. If a client mounts a NFSv2/3 with sec=krb5i/p, and does I/O then you'll generally see messages similar to this in the server's ring buffer: RPC request reserved 164 but used 208 While I was never able to verify it, I suspect that this problem is also the root cause of some oopses I've seen under these conditions: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=227726 This is probably also a problem for other sec= types and for NFSv4. The large reserved size for NFSv4 compound packets seems to generally paper over the problem, however. This patch adds a wrapper for svc_reserve that accounts for the possibility of a checksum. It also fixes up the appropriate callers of svc_reserve to call the wrapper. For now, it just uses a hardcoded value that I determined via testing. That value may need to be revised upward as things change, or we may want to eventually add a new auth_op that attempts to calculate this somehow. Unfortunately, there doesn't seem to be a good way to reliably determine the expected checksum length prior to actually calculating it, particularly with schemes like spkm3. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Neil Brown <neilb@suse.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:54 -07:00
NeilBrown	7ac1bea550	knfsd: rename sk_defer_lock to sk_lock Now that sk_defer_lock protects two different things, make the name more generic. Also don't bother with disabling _bh as the lock is only ever taken from process context. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:54 -07:00
Adrian Bunk	8842c9655b	remove nfs4_acl_add_ace() nfs4_acl_add_ace() can now be removed. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Neil Brown <neilb@cse.unsw.edu.au> Acked-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:54 -07:00
Oleg Nesterov	10ab825bde	change kernel threads to ignore signals instead of blocking them Currently kernel threads use sigprocmask(SIG_BLOCK) to protect against signals. This doesn't prevent the signal delivery, this only blocks signal_wake_up(). Every "killall -33 kthreadd" means a "struct siginfo" leak. Change kthreadd_setup() to set all handlers to SIG_IGN instead of blocking them (make a new helper ignore_signals() for that). If the kernel thread needs some signal, it should use allow_signal() anyway, and in that case it should not use CLONE_SIGHAND. Note that we can't change daemonize() (should die!) in the same way, because it can be used along with CLONE_SIGHAND. This means that allow_signal() still should unblock the signal to work correctly with daemonize()ed threads. However, disallow_signal() doesn't block the signal any longer but ignores it. NOTE: with or without this patch the kernel threads are not protected from handle_stop_signal(), this seems harmless, but not good. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:53 -07:00
Eric W. Biederman	73c279927f	kthread: don't depend on work queues Currently there is a circular reference between work queue initialization and kthread initialization. This prevents the kthread infrastructure from initializing until after work queues have been initialized. We want the properties of tasks created with kthread_create to be as close as possible to the init_task and to not be contaminated by user processes. The later we start our kthreadd that creates these tasks the harder it is to avoid contamination from user processes and the more of a mess we have to clean up because the defaults have changed on us. So this patch modifies the kthread support to not use work queues but to instead use a simple list of structures, and to have kthreadd start from init_task immediately after our kernel thread that execs /sbin/init. By being a true child of init_task we only have to change those process settings that we want to have different from init_task, such as our process name, the cpus that are allowed, blocking all signals and setting SIGCHLD to SIG_IGN so that all of our children are reaped automatically. By being a true child of init_task we also naturally get our ppid set to 0 and do not wind up as a child of PID == 1. Ensuring that tasks generated by kthread_create will not slow down the functioning of the wait family of functions. [akpm@linux-foundation.org: use interruptible sleeps] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:53 -07:00
Oleg Nesterov	28e53bddf8	unify flush_work/flush_work_keventd and rename it to cancel_work_sync flush_work(wq, work) doesn't need the first parameter, we can use cwq->wq (this was possible from the very beginnig, I missed this). So we can unify flush_work_keventd and flush_work. Also, rename flush_work() to cancel_work_sync() and fix all callers. Perhaps this is not the best name, but "flush_work" is really bad. (akpm: this is why the earlier patches bypassed maintainers) Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Jeff Garzik <jeff@garzik.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Tejun Heo <htejun@gmail.com> Cc: Auke Kok <auke-jan.h.kok@intel.com>, Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:53 -07:00
Oleg Nesterov	23b2e5991a	workqueue: kill NOAUTOREL works We don't have any users, and it is not so trivial to use NOAUTOREL works correctly. It is better to simplify API. Delete NOAUTOREL support and rename work_release to work_clear_pending to avoid a confusion. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:52 -07:00
Oleg Nesterov	1634c48f8b	make cancel_rearming_delayed_work() work on any workqueue, not just keventd_wq cancel_rearming_delayed_workqueue(wq, dwork) doesn't need the first parameter. We don't hang on un-queued dwork any longer, and work->data doesn't change its type. This means we can always figure out "wq" from dwork when it is needed. Remove this parameter, and rename the function to cancel_rearming_delayed_work(). Re-create an inline "obsolete" cancel_rearming_delayed_workqueue(wq) which just calls cancel_rearming_delayed_work(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:52 -07:00
Oleg Nesterov	7097a87afe	workqueue: kill run_scheduled_work() Because it has no callers. Actually, I think the whole idea of run_scheduled_work() was not right, not good to mix "unqueue this work and execute its ->func()" in one function. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:52 -07:00
Gautham R Shenoy	baaca49f41	Define and use new events,CPU_LOCK_ACQUIRE and CPU_LOCK_RELEASE This is an attempt to provide an alternate mechanism for postponing a hotplug event instead of using a global mechanism like lock_cpu_hotplug. The proposal is to add two new events namely CPU_LOCK_ACQUIRE and CPU_LOCK_RELEASE. The notification for these two events would be sent out before and after a cpu_hotplug event respectively. During the CPU_LOCK_ACQUIRE event, a cpu-hotplug-aware subsystem is supposed to acquire any per-subsystem hotcpu mutex ( Eg. workqueue_mutex in kernel/workqueue.c ). During the CPU_LOCK_RELEASE release event the cpu-hotplug-aware subsystem is supposed to release the per-subsystem hotcpu mutex. The reasons for defining new events as opposed to reusing the existing events like CPU_UP_PREPARE/CPU_UP_FAILED/CPU_ONLINE for locking/unlocking of per-subsystem hotcpu mutexes are as follow: - CPU_LOCK_ACQUIRE: All hotcpu mutexes are taken before subsystems start handling pre-hotplug events like CPU_UP_PREPARE/CPU_DOWN_PREPARE etc, thus ensuring a clean handling of these events. - CPU_LOCK_RELEASE: The hotcpu mutexes will be released only after all subsystems have handled post-hotplug events like CPU_DOWN_FAILED, CPU_DEAD,CPU_ONLINE etc thereby ensuring that there are no subsequent clashes amongst the interdependent subsystems after a cpu hotplugs. This patch also uses __raw_notifier_call chain in _cpu_up to take care of the dependency between the two consequetive calls to raw_notifier_call_chain. [akpm@linux-foundation.org: fix a bug] Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:51 -07:00
Gautham R Shenoy	6f7cc11aa6	Extend notifier_call_chain to count nr_calls made Since 2.6.18-something, the community has been bugged by the problem to provide a clean and a stable mechanism to postpone a cpu-hotplug event as lock_cpu_hotplug was badly broken. This is another proposal towards solving that problem. This one is along the lines of the solution provided in kernel/workqueue.c Instead of having a global mechanism like lock_cpu_hotplug, we allow the subsytems to define their own per-subsystem hot cpu mutexes. These would be taken(released) where ever we are currently calling lock_cpu_hotplug(unlock_cpu_hotplug). Also, in the per-subsystem hotcpu callback function,we take this mutex before we handle any pre-cpu-hotplug events and release it once we finish handling the post-cpu-hotplug events. A standard means for doing this has been provided in [PATCH 2/4] and demonstrated in [PATCH 3/4]. The ordering of these per-subsystem mutexes might still prove to be a problem, but hopefully lockdep should help us get out of that muddle. The patch set to be applied against linux-2.6.19-rc5 is as follows: [PATCH 1/4] : Extend notifier_call_chain with an option to specify the number of notifications to be sent and also count the number of notifications actually sent. [PATCH 2/4] : Define events CPU_LOCK_ACQUIRE and CPU_LOCK_RELEASE and send out notifications for these in _cpu_up and _cpu_down. This would help us standardise the acquire and release of the subsystem locks in the hotcpu callback functions of these subsystems. [PATCH 3/4] : Eliminate lock_cpu_hotplug from kernel/sched.c. [PATCH 4/4] : In workqueue_cpu_callback function, acquire(release) the workqueue_mutex while handling CPU_LOCK_ACQUIRE(CPU_LOCK_RELEASE). If the per-subsystem-locking approach survives the test of time, we can expect a slow phasing out of lock_cpu_hotplug, which has not yet been eliminated in these patches :) This patch: Provide notifier_call_chain with an option to call only a specified number of notifiers and also record the number of call to notifiers made. The need for this enhancement was identified in the post entitled "Slab - Eliminate lock_cpu_hotplug from slab" (http://lkml.org/lkml/2006/10/28/92) by Ravikiran G Thirumalai and Andrew Morton. This patch adds two additional parameters to notifier_call_chain API namely - int nr_to_calls : Number of notifier_functions to be called. The don't care value is -1. - unsigned int *nr_calls : Records the total number of notifier_funtions called by notifier_call_chain. The don't care value is NULL. [michal.k.k.piotrowski@gmail.com: build fix] Credit: Andrew Morton <akpm@osdl.org> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:51 -07:00
Tom Zanussi	7c9cb38302	relay: use plain timer instead of delayed work relay doesn't need to use schedule_delayed_work() for waking readers when a simple timer will do. Signed-off-by: Tom Zanussi <zanussi@comcast.net> Cc: Satyam Sharma <satyam.sharma@gmail.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:51 -07:00
Andrew Morton	19a75d83ff	kblockd: use flush_work Switch the kblockd flushing from a global flush to a more specific flush_work(). (akpm: bypassed maintainers, sorry. There are other patches which depend on this) Cc: "Maciej W. Rozycki" <macro@linux-mips.org> Cc: David Howells <dhowells@redhat.com> Cc: Jens Axboe <axboe@suse.de> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:51 -07:00
Oleg Nesterov	b89deed32c	implement flush_work() A basic problem with flush_scheduled_work() is that it blocks behind _all_ presently-queued works, rather than just the work whcih the caller wants to flush. If the caller holds some lock, and if one of the queued work happens to want that lock as well then accidental deadlocks can occur. One example of this is the phy layer: it wants to flush work while holding rtnl_lock(). But if a linkwatch event happens to be queued, the phy code will deadlock because the linkwatch callback function takes rtnl_lock. So we implement a new function which will flush a single work - just the one which the caller wants to free up. Thus we avoid the accidental deadlocks which can arise from unrelated subsystems' callbacks taking shared locks. flush_work() non-blockingly dequeues the work_struct which we want to kill, then it waits for its handler to complete on all CPUs. Add ->current_work to the "struct cpu_workqueue_struct", it points to currently running "struct work_struct". When flush_work(work) detects ->current_work == work, it inserts a barrier at the _head_ of ->worklist (and thus right _after_ that work) and waits for completition. This means that the next work fired on that CPU will be this barrier, or another barrier queued by concurrent flush_work(), so the caller of flush_work() will be woken before any "regular" work has a chance to run. When wait_on_work() unlocks workqueue_mutex (or whatever we choose to protect against CPU hotplug), CPU may go away. But in that case take_over_work() will move a barrier we queued to another CPU, it will be fired sometime, and wait_on_work() will be woken. Actually, we are doing cleanup_workqueue_thread()->kthread_stop() before take_over_work(), so cwq->thread should complete its ->worklist (and thus the barrier), because currently we don't check kthread_should_stop() in run_workqueue(). But even if we did, everything should be ok. [akpm@osdl.org: cleanup] [akpm@osdl.org: add flush_work_keventd() wrapper] Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:50 -07:00
Andrew Morton	18d8362d51	mutex_lock_interruptible(): add __must_check It's not sane to use mutex_lock_interruptible() and to then ignore the result. Ditto down_interruptible(), but I'm lazy. Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:49 -07:00
Roland McGrath	55c0d1f83e	Move sig_kernel_* et al macros to linux/signal.h This patch moves the sig_kernel_* and related macros from kernel/signal.c to linux/signal.h, and cleans them up slightly. I need the sig_kernel_* macros for default signal behavior in the utrace code, and want to avoid duplication or overhead to share the knowledge. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:49 -07:00
James Bottomley	8813d1c00c	mca: add integrated device bus matching The MCA bus has a few "integrated" functions, which are effectively virtual slots on the bus. The problem is that these special functions don't have dedicated pos IDs, so we have to manufacture ids for them outside the pos space ... and these ids can't be matched by the standard matching function, so add a special registration that requests a list of pos ids or a particular integrated function. Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:49 -07:00
Fernando Luis Vazquez Cao	2f4dfe206a	Remove hardcoding of hard_smp_processor_id on UP systems With the advent of kdump, the assumption that the boot CPU when booting an UP kernel is always the CPU with a particular hardware ID (often 0) (usually referred to as BSP on some architectures) is not valid anymore. The reason being that the dump capture kernel boots on the crashed CPU (the CPU that invoked crash_kexec), which may be or may not be that particular CPU. Move definition of hard_smp_processor_id for the UP case to architecture-specific code ("asm/smp.h") where it belongs, so that each architecture can provide its own implementation. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Cc: "Luck, Tony" <tony.luck@intel.com> Acked-by: Andi Kleen <ak@suse.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:48 -07:00
Dave Gilbert	dd2a345f8f	Display all possible partitions when the root filesystem failed to mount Display all possible partitions when the root filesystem is not mounted. This helps to track spell'o's and missing drivers. Updated to work with newer kernels. Example output: VFS: Cannot open root device "foobar" or unknown-block(0,0) Please append a correct "root=" boot option; here are the available partitions: 0800 8388608 sda driver: sd 0801 192748 sda1 0802 8193150 sda2 0810 4194304 sdb driver: sd Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) [akpm@linux-foundation.org: cleanups, fix printk warnings] Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Cc: Dave Gilbert <linux@treblig.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:48 -07:00
Rafael J. Wysocki	a3d25c275d	PM: Separate hibernation code from suspend code [ With Johannes Berg <johannes@sipsolutions.net> ] Separate the hibernation (aka suspend to disk code) from the other suspend code. In particular: * Remove the definitions related to hibernation from include/linux/pm.h * Introduce struct hibernation_ops and a new hibernate() function to hibernate the system, defined in include/linux/suspend.h * Separate suspend code in kernel/power/main.c from hibernation-related code in kernel/power/disk.c and kernel/power/user.c (with the help of hibernation_ops) * Switch ACPI (the only user of pm_ops.pm_disk_mode) to hibernation_ops Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Greg KH <greg@kroah.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@nigel.suspend2.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:48 -07:00
Stephen Rothwell	97416ce82e	Declare {compat_}sys_utimensat This is needed before Powerpc can wire up the syscall. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:44 -07:00
Robert P. J. Day	42f209d3c9	[MTD] Delete allegedly obsolete "bank_size" field of mtd_info. Delete the allegedly obsolete "bank_size" member of struct mtd_info. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2007-05-09 13:26:52 +01:00
Robert P. J. Day	36200b7600	[MTD] Remove unnecessary user space check from mtd.h. Since the header file include/linux/mtd/mtd.h is not exported to user space, remove the user space check and error. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2007-05-09 13:24:37 +01:00
John Anthony Kazos Jr	121e70b69a	include files: convert "include" subdirectory to UTF-8 Convert the "include" subdirectory to UTF-8. Signed-off-by: John Anthony Kazos Jr. <jakj@j-a-k-j.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2007-05-09 08:58:21 +02:00
Uwe Kleine-König	5886269962	fix file specification in comments Many files include the filename at the beginning, serveral used a wrong one. Signed-off-by: Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2007-05-09 08:58:16 +02:00
Michael Opdenacker	59c51591a0	Fix occurrences of "the the " Signed-off-by: Michael Opdenacker <michael@free-electrons.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2007-05-09 08:57:56 +02:00
Johannes Berg	940d67f6b9	[POWERPC] swsusp: Introduce register_nosave_region_late This patch introduces a new register_nosave_region_late function that can be called from initcalls when register_nosave_region can no longer be used because it uses bootmem. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-09 16:34:56 +10:00
Robert P. J. Day	beb7dd86a1	Fix misspellings collected by members of KJ list. Fix the misspellings of "propogate", "writting" and (oh, the shame :-) "kenrel" in the source tree. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2007-05-09 07:14:03 +02:00
Roland Dreier	225c7b1fee	IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters Add an InfiniBand driver for Mellanox ConnectX adapters. Because these adapters can also be used as ethernet NICs and Fibre Channel HBAs, the driver is split into two modules: mlx4_core: Handles low-level things like device initialization and processing firmware commands. Also controls resource allocation so that the InfiniBand, ethernet and FC functions can share a device without stepping on each other. mlx4_ib: Handles InfiniBand-specific things; plugs into the InfiniBand midlayer. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-08 18:00:38 -07:00
Jiri Kosina	66da876962	USB HID: report descriptor of Cypress USB barcode readers needs fixup Certain versions of Cypress USB barcode readers (this problem is known to happen at least with PIDs 0xde61 and 0xde64) have report descriptor which has swapped usage min and usage max tag. This results in HID parser failing for report descriptor of these devices, as it (wrongly) requires allocating more usages than HID_MAX_USAGES. Solve this by walking through the report descriptor for such devices, and swap the usage min and usage max items (and their values) to be in proper order. Reported-by: Bret Towe <magnade@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2007-05-09 02:52:51 +02:00
Alex Dubov	055b822414	disable socket power in adapter driver instead of media one Socket power must be fully controlled by adapter driver. This also prevents unnecessary power-off of the socket when media driver is unloaded, yet media remains in the socket. Signed-off-by: Alex Dubov <oakad@yahoo.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2007-05-08 22:41:47 +02:00
Linus Torvalds	4750def52c	Merge branch 'reset-seq' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'reset-seq' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: [libata reset-seq] build and merge fixes libata: reimplement reset sequencing libata: improve ata_std_prereset() libata: improve 0xff status handling libata: add deadline support to prereset and reset methods	2007-05-08 11:58:20 -07:00
Linus Torvalds	393bfca19e	Merge master.kernel.org:/pub/scm/linux/kernel/git/dtor/input * master.kernel.org:/pub/scm/linux/kernel/git/dtor/input: Input: move USB miscellaneous devices under drivers/input/misc Input: move USB mice under drivers/input/mouse Input: move USB gamepads under drivers/input/joystick Input: move USB touchscreens under drivers/input/touchscreen Input: move USB tablets under drivers/input/tablet Input: i8042 - fix AUX port detection with some chips Input: aaed2000_kbd - convert to use polldev library Input: drivers/usb/input - usb_buffer_free() cleanup Input: synaptics - don't complain about failed resets Input: pull input.h into uinpit.h Input: drivers/usb/input - fix sparse warnings (signedness) Input: evdev - fix some sparse warnings (signedness, shadowing) Input: drivers/joystick - fix various sparse warnings Input: force feedback - make sure effect is present before playing	2007-05-08 11:51:43 -07:00
Linus Torvalds	df6d3916f3	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (77 commits) [POWERPC] Abolish powerpc_flash_init() [POWERPC] Early serial debug support for PPC44x [POWERPC] Support for the Ebony 440GP reference board in arch/powerpc [POWERPC] Add device tree for Ebony [POWERPC] Add powerpc/platforms/44x, disable platforms/4xx for now [POWERPC] MPIC U3/U4 MSI backend [POWERPC] MPIC MSI allocator [POWERPC] Enable MSI mappings for MPIC [POWERPC] Tell Phyp we support MSI [POWERPC] RTAS MSI implementation [POWERPC] PowerPC MSI infrastructure [POWERPC] Rip out the existing powerpc msi stubs [POWERPC] Remove use of 4level-fixup.h for ppc32 [POWERPC] Add powerpc PCI-E reset API implementation [POWERPC] Holly bootwrapper [POWERPC] Holly DTS [POWERPC] Holly defconfig [POWERPC] Add support for 750CL Holly board [POWERPC] Generalize tsi108 PCI setup [POWERPC] Generalize tsi108 PHY types ... Fixed conflict in include/asm-powerpc/kdebug.h manually Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:50:19 -07:00
Ondrej Zajicek	34ed25f50b	s3fb: updates Move s3fb_get_tilemax to svgalib.c as svga_get_tilemax, because it reports limitation of other code from svgalib (svga_settile, svga_tilecopy, ...) Limit font width to 8 pixels in 4 bpp mode. Signed-off-by: Ondrej Zajicek <santiago@crfreenet.org> Signed-off-by: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:33 -07:00
Matthias Kaehlcke	c831c338f0	use mutex instead of semaphore in virtual console driver The virtual console driver uses a semaphore as mutex. Use the mutex API instead of the (binary) semaphore. Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:33 -07:00
Antonino A. Daplas	38a3dc5185	fbdev: fbcon: check if mode can handle new screen Check if the mode can properly display the screen. This will be needed by drivers where the capability is not constant with each mode. The function fb_set_var() will query fbcon the requirement, then it will query the driver (via a new hook fb_get_caps()) its capability. If the driver's capability cannot handle fbcon's requirement, then fb_set_var() will fail. For example, if a particular driver supports 2 modes where: mode1 = can only display 8x16 bitmaps mode2 = can display any bitmap then if current mode = mode2 and current font = 12x22 fbset <mode1> /* mode1 cannot handle 12x22 */ fbset will fail Signed-off-by: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:32 -07:00
Antonino A. Daplas	11f11d522f	fbdev: add tile operation to get the maximum length of the map Add a tile method, fb_get_tilemax(), that returns the maximum length of the tile map (or font map). This is needed by s3fb which can only handle 256 characters. Signed-off-by: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:31 -07:00
Antonino A. Daplas	2d2699d984	fbcon: font setting should check limitation of driver fbcon_set_font() will now check if the new font dimensions can be drawn by the driver (by checking pixmap.blit_x and blit_y). Similarly, add 2 new parameters to get_default_font(), font_w and font_h, to further aid in the font selection process. Signed-off-by: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:31 -07:00
Antonino A. Daplas	bf26ad72a6	fbdev: advertise limitation of drawing engine A few drivers are not capable of blitting rectangles of any dimension. vga16fb can only blit 8-pixel wide rectangles, while s3fb (in tileblitting mode) can only blit 8x16 rectangles. For example, loading a 12x22 font in vga16fb will result in a corrupt display. Advertise this limitation/capability in info->pixmap.blit_x and blit_y. These fields are 32-bit arrays (font max is 32x32 only), ie, if bit 7 is set, then width/height of 7+1 is supported. Signed-off-by: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:30 -07:00
Antonino A. Daplas	09aaf268eb	fbdev: add fb_read/fb_write functions for framebuffers in system RAM The functions fb_read() and fb_write in fbmem.c assume that the framebuffer is in IO memory. However, we have 3 drivers (hecubafb, arcfb, and vfb) where the framebuffer is allocated from system RAM (via vmalloc). Using __raw_read/__raw_write (fb_readl/fb_writel) for these drivers is illegal, especially in other platforms. Create file read and write methods for these types of drivers. These are named fb_sys_read() and fb_sys_write(). Signed-off-by: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:30 -07:00
Antonino A. Daplas	3f9b0880e4	fbdev: pass struct fb_info to fb_read and fb_write It is unnecessary to pass struct file to fb_read() and fb_write() in struct fb_ops. For consistency with the other methods, pass struct fb_info instead. Signed-off-by: Antonino Daplas <adaplas@gmail.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:30 -07:00
Antonino A. Daplas	68648ed1f5	fbdev: add drawing functions for framebuffers in system RAM The generic drawing functions (cfbimgblt, cfbcopyarea, cfbfillrect) assume that the framebuffer is in IO memory. However, we have 3 drivers (hecubafb, arcfb, and vfb) where the framebuffer is allocated from system RAM (via vmalloc). Using _raw_read/write and family for these drivers (as used in the cfb* functions) is illegal, especially in other platforms. Create 3 new drawing functions, based almost entirely from the original except that the framebuffer memory is assumed to be in system RAM. These are named as sysimgblt, syscopyarea, and sysfillrect. Signed-off-by: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:30 -07:00
Jan Engelhardt	fa6ce9ab5f	vt: add color support to the "underline" and "italic" attributes Add color support to the "underline" and "italic" attributes as in OpenBSD/NetBSD-style (vt220) and xterm. Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Acked-by: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:27 -07:00
Paul Mundt	5e841b88d2	fb: fsync() method for deferred I/O flush. There are cases when we do not want to wait on the delay for automatically updating the "real" framebuffer, this implements a simple ->fsync() hook for explicitly flushing the deferred I/O work. The ->page_mkwrite() handler will rearm the work queue normally. (akpm: nuke unneeded ifdefs, forward-delcare struct dentry) Signed-off-by: Paul Mundt <lethal@linux-sh.org> Cc: Jaya Kumar <jayakumar.lkml@gmail.com> Acked-by: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:27 -07:00
Jaya Kumar	60b59beafb	fbdev: mm: Deferred IO support This implements deferred IO support in fbdev. Deferred IO is a way to delay and repurpose IO. This implementation is done using mm's page_mkwrite and page_mkclean hooks in order to detect, delay and then rewrite IO. This functionality is used by hecubafb. [adaplas] This is useful for graphics hardware with no directly addressable/mappable framebuffer. Implementing this will allow the "framebuffer" to be accesible from user space via mmap(). Signed-off-by: Jaya Kumar <jayakumar.lkml@gmail.com> Signed-off-by: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:26 -07:00
James Simmons	2ee121631b	fbdev: display class Add the new display class. This is meant to unite the various solutions to display units ie acpi output device, auxdisplay and the defunct lcd class in the backlight directory. Signed-off-by: James Simmons <jsimmons@infradead.org> Cc: "Antonino A. Daplas" <adaplas@pol.net> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:26 -07:00
Jiri Slaby	dd025c0c7a	Char: cyclades, dynamic ports and save thus approx. 160k of .bss Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:25 -07:00
Jiri Slaby	6a0aa67b17	Char: cyclades, remove unused timestamps Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:25 -07:00
Jiri Slaby	2c7fea9921	Char: cyclades, remove sleep_on convert to wait_* and completion Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:25 -07:00
Jiri Slaby	875b206b5f	Char: cyclades, make info->card a pointer Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:25 -07:00
Jiri Slaby	46039f8a64	Char: cyclades, remove useless fileds from cyclades_card pde, ctl_phys and base_phys are useless -- they are never used. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:24 -07:00
Jiri Slaby	b81cc310f1	Char: cyclades, unexport struct cyclades_card Do not export internal card data to userspace. cytune doesn't use this anyway. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:24 -07:00
Bjorn Helgaas	8f81dd1498	PNP: notice whether we have PNP devices (PNPBIOS or PNPACPI) This series converts i386 and x86_64 legacy serial ports to be platform devices and prevents probing for them if we have PNP. This prevents double discovery, where a device was found both by the legacy probe and by 8250_pnp. This also prevents the serial driver from claiming IRDA devices (unless they have a UART PNP ID). The serial legacy probe sometimes assumed the wrong IRQ, so the user had to use "setserial" to fix it. Removing the need for setserial to make IRDA devices work seems good, but it does break some things. In particular, you may need to keep setserial from poking legacy UART stuff back in by doing something like "dpkg-reconfigure setserial" with the "kernel" option. Otherwise, the setserial-discovered "UART" will claim resources and prevent the IRDA driver from loading. This patch: If we can discover devices using PNP, we can skip some legacy probes. This flag ("pnp_platform_devices") indicates that PNPBIOS or PNPACPI is enabled and should tell us about builtin devices. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Keith Owens <kaos@ocs.com.au> Cc: Len Brown <lenb@kernel.org> Cc: Adam Belay <ambx1@neo.rr.com> Cc: Matthieu CASTET <castet.matthieu@free.fr> Cc: Jean Tourrilhes <jt@hpl.hp.com> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Ville Syrjala <syrjala@sci.fi> Cc: Russell King <rmk+serial@arm.linux.org.uk> Cc: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:23 -07:00
Jiri Slaby	db05c3b1dd	Char: cyclades, cy_readX/writeX cleanup cyclades, cy_readX/writeX cleanup - cy_readX are placeholders for readX, remove it - move cy_writeX macros into do {} while(0) to be safe Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:22 -07:00
Bernhard Walle	d85a60d85e	Add IRQF_IRQPOLL flag (common code) irqpoll is broken on some architectures that don't use the IRQ 0 for the timer interrupt like IA64. This patch adds a IRQF_IRQPOLL flag. Each architecture is handled in a separate pach. As I left the irq == 0 as condition, this should not break existing architectures that use timer_irq == 0 and that I did't address with that patch (because I don't know). This patch: This patch adds a IRQF_IRQPOLL flag that the interrupt registration code could use for the interrupt it wants to use for IRQ polling. Because this must not be the timer interrupt, an additional flag was added instead of re-using the IRQF_TIMER constant. Until all architectures will have an IRQF_IRQPOLL interrupt, irq == 0 will stay as alternative as it should not break anything. Also, note_interrupt() is called on CPU-specific interrupts to be used as interrupt source for IRQ polling. Signed-off-by: Bernhard Walle <bwalle@suse.de> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Matthew Wilcox <willy@debian.org> Cc: Grant Grundler <grundler@google.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:22 -07:00
Peter Zijlstra	277866a0e3	nfs: fix congestion control: use atomic_longs Change the atomic_t in struct nfs_server to atomic_long_t in anticipation of machines that can handle 8+TB of (4K) pages under writeback. However I suspect other things in NFS will start going bang by then. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:21 -07:00
Ananth N Mavinakayanahalli	bf8f6e5b3e	Kprobes: The ON/OFF knob thru debugfs This patch provides a debugfs knob to turn kprobes on/off o A new file /debug/kprobes/enabled indicates if kprobes is enabled or not (default enabled) o Echoing 0 to this file will disarm all installed probes o Any new probe registration when disabled will register the probe but not arm it. A message will be printed out in such a case. o When a value 1 is echoed to the file, all probes (including ones registered in the intervening period) will be enabled o Unregistration will happen irrespective of whether probes are globally enabled or not. o Update Documentation/kprobes.txt to reflect these changes. While there also update the doc to make it current. We are also looking at providing sysrq key support to tie to the disabling feature provided by this patch. [akpm@linux-foundation.org: Use bool like a bool!] [akpm@linux-foundation.org: add printk facility levels] [cornelia.huck@de.ibm.com: Add the missing arch_trampoline_kprobe() for s390] Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:19 -07:00
Christoph Hellwig	4c4308cb93	kprobes: kretprobes simplifications - consolidate duplicate code in all arch_prepare_kretprobe instances into common code - replace various odd helpers that use hlist_for_each_entry to get the first elemenet of a list with either a hlist_for_each_entry_save or an opencoded access to the first element in the caller - inline add_rp_inst into it's only remaining caller - use kretprobe_inst_table_head instead of opencoding it Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:19 -07:00
Andrew Morton	416ce32e70	revert "rtc: Add rtc_merge_alarm()" David says "884b4aaaa242a2db8c8252796f0118164a680ab5 should be reverted. It added an rtc_merge_alarm() call to the 2.6.20 kernel, which hasn't yet been used by any in-tree driver; this patch obviates the need for that call, and uses a more robust approach." Cc: Scott Wood <scottwood@freescale.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:18 -07:00
David Brownell	87ac84f42a	rtc-cmos wakeup interface I finally got around to testing the updated wakeup event hooks for rtc-cmos, and they follow in two patches: - Interface update ... when a simple enable_irq_wake() doesn't suffice, the platform data can hold suspend/resume callback hooks. - ACPI implementation ... provides callback hooks to do ACPI magic, and eliminate the legacy /proc/acpi/alarm file. The interface update could go into 2.6.21, but that's not essential; they will be NOPs on most PCs, without the ACPI stuff. I suspect the ACPI folk may have opinions about how to merge that second patch, and how to obsolete that legacy procfs file. I'd like to see that merge into 2.6.22 if possible... As for how to kick it in ... two ways: - The appended "rtcwake" program; updated since the last time it was posted, it deals much better with timezones and DST. - Write the /sys/class/rtc/.../wakealarm file, then go to sleep. For some reason RTC wake from "swsusp" stopped working on a system where it previously worked; the alarm setting appears to get clobbered. But on the bright side, RTC wake from "standby" worked on a system that had never been able to resume from that state before ... IDEACPI is my guess as to why it finally started to work. It's the old "two steps forward, one step back" dance, I guess. - Dave /* gcc -Wall -Os -o rtcwake rtcwake.c / #include <stdio.h> #include <getopt.h> #include <fcntl.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <errno.h> #include <time.h> #include <sys/ioctl.h> #include <sys/time.h> #include <sys/types.h> #include <linux/rtc.h> / constants from legacy PC/AT hardware / #define RTC_PF 0x40 #define RTC_AF 0x20 #define RTC_UF 0x10 / * rtcwake -- enter a system sleep state until specified wakeup time. * * This uses cross-platform Linux interfaces to enter a system sleep state, * and leave it no later than a specified time. It uses any RTC framework * driver that supports standard driver model wakeup flags. * * This is normally used like the old "apmsleep" utility, to wake from a * suspend state like ACPI S1 (standby) or S3 (suspend-to-RAM). Most * platforms can implement those without analogues of BIOS, APM, or ACPI. * * On some systems, this can also be used like "nvram-wakeup", waking * from states like ACPI S4 (suspend to disk). Not all systems have * persistent media that are appropriate for such suspend modes. * * The best way to set the system's RTC is so that it holds the current * time in UTC. Use the "-l" flag to tell this program that the system * RTC uses a local timezone instead (maybe you dual-boot MS-Windows). / static char progname; #ifdef DEBUG #define VERSION "1.0 dev (" __DATE__ " " __TIME__ ")" #else #define VERSION "0.9" #endif static unsigned verbose; static int rtc_is_utc = -1; static int may_wakeup(const char devname) { char buf[128], s; FILE f; snprintf(buf, sizeof buf, "/sys/class/rtc/%s/device/power/wakeup", devname); f = fopen(buf, "r"); if (!f) { perror(buf); return 0; } fgets(buf, sizeof buf, f); fclose(f); s = strchr(buf, '\n'); if (!s) return 0; s = 0; /* wakeup events could be disabled or not supported / return strcmp(buf, "enabled") == 0; } / all times should be in UTC / static time_t sys_time; static time_t rtc_time; static int get_basetimes(int fd) { struct tm tm; struct rtc_time rtc; / this process works in RTC time, except when working * with the system clock (which always uses UTC). / if (rtc_is_utc) setenv("TZ", "UTC", 1); tzset(); / read rtc and system clocks "at the same time", or as * precisely (+/- a second) as we can read them. / if (ioctl(fd, RTC_RD_TIME, &rtc) < 0) { perror("read rtc time"); return 0; } sys_time = time(0); if (sys_time == (time_t)-1) { perror("read system time"); return 0; } / convert rtc_time to normal arithmetic-friendly form, * updating tm.tm_wday as used by asctime(). / memset(&tm, 0, sizeof tm); tm.tm_sec = rtc.tm_sec; tm.tm_min = rtc.tm_min; tm.tm_hour = rtc.tm_hour; tm.tm_mday = rtc.tm_mday; tm.tm_mon = rtc.tm_mon; tm.tm_year = rtc.tm_year; tm.tm_isdst = rtc.tm_isdst; / stays unspecified? / rtc_time = mktime(&tm); if (rtc_time == (time_t)-1) { perror("convert rtc time"); return 0; } if (verbose) { if (!rtc_is_utc) { printf("\ttzone = %ld\n", timezone); printf("\ttzname = %s\n", tzname[daylight]); gmtime_r(&rtc_time, &tm); } printf("\tsystime = %ld, (UTC) %s", (long) sys_time, asctime(gmtime(&sys_time))); printf("\trtctime = %ld, (UTC) %s", (long) rtc_time, asctime(&tm)); } return 1; } static int setup_alarm(int fd, time_t wakeup) { struct tm tm; struct rtc_wkalrm wake; tm = gmtime(wakeup); wake.time.tm_sec = tm->tm_sec; wake.time.tm_min = tm->tm_min; wake.time.tm_hour = tm->tm_hour; wake.time.tm_mday = tm->tm_mday; wake.time.tm_mon = tm->tm_mon; wake.time.tm_year = tm->tm_year; wake.time.tm_wday = tm->tm_wday; wake.time.tm_yday = tm->tm_yday; wake.time.tm_isdst = tm->tm_isdst; / many rtc alarms only support up to 24 hours from 'now' ... / if ((rtc_time + (24 60 * 60)) > wakeup) { if (ioctl(fd, RTC_ALM_SET, &wake.time) < 0) { perror("set rtc alarm"); return 0; } if (ioctl(fd, RTC_AIE_ON, 0) < 0) { perror("enable rtc alarm"); return 0; } / ... so use the "more than 24 hours" request only if we must / } else { / avoid an extra AIE_ON call / wake.enabled = 1; if (ioctl(fd, RTC_WKALM_SET, &wake) < 0) { perror("set rtc wake alarm"); return 0; } } return 1; } static void suspend_system(const char suspend) { FILE f = fopen("/sys/power/state", "w"); if (!f) { perror("/sys/power/state"); return; } fprintf(f, "%s\n", suspend); fflush(f); / this executes after wake from suspend / fclose(f); } int main(int argc, char argv) { static char devname = "rtc0"; static unsigned seconds = 0; static char suspend = "standby"; int t; int fd; time_t alarm = 0; progname = strrchr(argv[0], '/'); if (progname) progname++; else progname = argv[0]; if (chdir("/dev/") < 0) { perror("chdir /dev"); return 1; } while ((t = getopt(argc, argv, "d:lm:s:t:uVv")) != EOF) { switch (t) { case 'd': devname = optarg; break; case 'l': rtc_is_utc = 0; break; / what system power mode to use? for now handle only * standardized mode names; eventually when systems define * their own state names, parse /sys/power/state. * * "on" is used just to test the RTC alarm mechanism, * bypassing all the wakeup-from-sleep infrastructure. / case 'm': if (strcmp(optarg, "standby") == 0 \|\| strcmp(optarg, "mem") == 0 \|\| strcmp(optarg, "disk") == 0 \|\| strcmp(optarg, "on") == 0 ) { suspend = optarg; break; } printf("%s: unrecognized suspend state '%s'\n", progname, optarg); goto usage; / alarm time, seconds-to-sleep (relative) / case 's': t = atoi(optarg); if (t < 0) { printf("%s: illegal interval %s seconds\n", progname, optarg); goto usage; } seconds = t; break; / alarm time, time_t (absolute, seconds since 1/1 1970 UTC) / case 't': t = atoi(optarg); if (t < 0) { printf("%s: illegal time_t value %s\n", progname, optarg); goto usage; } alarm = t; break; case 'u': rtc_is_utc = 1; break; case 'v': verbose++; break; case 'V': printf("%s: version %s\n", progname, VERSION); break; default: usage: printf("usage: %s [options]" "\n\t" "-d rtc0\|rtc1\|...\t(select rtc)" "\n\t" "-l\t\t\t(RTC uses local timezone)" "\n\t" "-m standby\|mem\|...\t(sleep mode)" "\n\t" "-s seconds\t\t(seconds to sleep)" "\n\t" "-t time_t\t\t(time to wake)" "\n\t" "-u\t\t\t(RTC uses UTC)" "\n\t" "-v\t\t\t(verbose messages)" "\n\t" "-V\t\t\t(show version)" "\n", progname); return 1; } } if (!alarm && !seconds) { printf("%s: must provide wake time\n", progname); goto usage; } / REVISIT: if /etc/adjtime exists, read it to see what * the util-linux version of hwclock assumes. / if (rtc_is_utc == -1) { printf("%s: assuming RTC uses UTC ...\n", progname); rtc_is_utc = 1; } / this RTC must exist and (if we'll sleep) be wakeup-enabled / fd = open(devname, O_RDONLY); if (fd < 0) { perror(devname); return 1; } if (strcmp(suspend, "on") != 0 && !may_wakeup(devname)) { printf("%s: %s not enabled for wakeup events\n", progname, devname); return 1; } / relative or absolute alarm time, normalized to time_t / if (!get_basetimes(fd)) return 1; if (verbose) printf("alarm %ld, sys_time %ld, rtc_time %ld, seconds %u\n", alarm, sys_time, rtc_time, seconds); if (alarm) { if (alarm < sys_time) { printf("%s: time doesn't go backward to %s", progname, ctime(&alarm)); return 1; } alarm += sys_time - rtc_time; } else alarm = rtc_time + seconds + 1; if (setup_alarm(fd, &alarm) < 0) return 1; sync(); printf("%s: wakeup from \"%s\" using %s at %s", progname, suspend, devname, ctime(&alarm)); fflush(stdout); usleep(10 1000); if (strcmp(suspend, "on") != 0) suspend_system(suspend); else { unsigned long data; do { t = read(fd, &data, sizeof data); if (t < 0) { perror("rtc read"); break; } if (verbose) printf("... %s: %03lx\n", devname, data); } while (!(data & RTC_AF)); } if (ioctl(fd, RTC_AIE_OFF, 0) < 0) perror("disable rtc alarm interrupt"); close(fd); return 0; } This patch: Make rtc-cmos do the relevant magic so this RTC can wake the system from a sleep state. That magic comes in two basic flavors: - Straightforward: enable_irq_wake(), the way it'd work on most SOC chips; or generally with system sleep states which don't disable core IRQ logic. - Roundabout, using non-IRQ platform hooks. This is needed with ACPI and one almost-clone chip which uses a special wakeup-only alarm. (That's the RTC used on Footbridge boards, FWIW, which don't do PM in Linux.) A separate patch implements those hooks for ACPI platforms, so that rtc_cmos can issue system wakeup events (and its sysfs "wakealarm" attribute works on at least some systems). Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:18 -07:00
David Brownell	cd9662094e	rtc: remove rest of class_device Finish converting the RTC framework so it no longer uses class_device. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-By: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:18 -07:00
David Brownell	ab6a2d70d1	rtc: rtc interfaces don't use class_device This patch removes class_device from the programming interface that the RTC framework exposes to the rest of the kernel. Now an rtc_device is passed, which is more type-safe and streamlines all the relevant code. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-By: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:18 -07:00
David Brownell	5726fb2012	rtc: remove /sys/class/rtc-dev/* This simplifies the /dev support by removing a superfluous class_device (the /sys/class/rtc-dev stuff) and the class_interface that hooks it into the rtc core. Accordingly, if it's configured then /dev support is now part of the RTC core, and is never a separate module. It's another step towards being able to remove "struct class_device". [bunk@stusta.de: drivers/rtc/rtc-dev.c should #include "rtc-core.h"] Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-By: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:18 -07:00
Ulrich Drepper	1c710c896e	utimensat implementation Implement utimensat(2) which is an extension to futimesat(2) in that it a) supports nano-second resolution for the timestamps b) allows to selectively ignore the atime/mtime value c) allows to selectively use the current time for either atime or mtime d) supports changing the atime/mtime of a symlink itself along the lines of the BSD lutimes(3) functions For this change the internally used do_utimes() functions was changed to accept a timespec time value and an additional flags parameter. Additionally the sys_utime function was changed to match compat_sys_utime which already use do_utimes instead of duplicating the work. Also, the completely missing futimensat() functionality is added. We have such a function in glibc but we have to resort to using /proc/self/fd/* which not everybody likes (chroot etc). Test application (the syscall number will need per-arch editing): #include <errno.h> #include <fcntl.h> #include <time.h> #include <sys/time.h> #include <stddef.h> #include <syscall.h> #define __NR_utimensat 280 #define UTIME_NOW ((1l << 30) - 1l) #define UTIME_OMIT ((1l << 30) - 2l) int main(void) { int status = 0; int fd = open("ttt", O_RDWR\|O_CREAT\|O_EXCL, 0666); if (fd == -1) error (1, errno, "failed to create test file \"ttt\""); struct stat64 st1; if (fstat64 (fd, &st1) != 0) error (1, errno, "fstat failed"); struct timespec t[2]; t[0].tv_sec = 0; t[0].tv_nsec = 0; t[1].tv_sec = 0; t[1].tv_nsec = 0; if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0) error (1, errno, "utimensat failed"); struct stat64 st2; if (fstat64 (fd, &st2) != 0) error (1, errno, "fstat failed"); if (st2.st_atim.tv_sec != 0 \|\| st2.st_atim.tv_nsec != 0) { puts ("atim not reset to zero"); status = 1; } if (st2.st_mtim.tv_sec != 0 \|\| st2.st_mtim.tv_nsec != 0) { puts ("mtim not reset to zero"); status = 1; } if (status != 0) goto out; t[0] = st1.st_atim; t[1].tv_sec = 0; t[1].tv_nsec = UTIME_OMIT; if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0) error (1, errno, "utimensat failed"); if (fstat64 (fd, &st2) != 0) error (1, errno, "fstat failed"); if (st2.st_atim.tv_sec != st1.st_atim.tv_sec \|\| st2.st_atim.tv_nsec != st1.st_atim.tv_nsec) { puts ("atim not set"); status = 1; } if (st2.st_mtim.tv_sec != 0 \|\| st2.st_mtim.tv_nsec != 0) { puts ("mtim changed from zero"); status = 1; } if (status != 0) goto out; t[0].tv_sec = 0; t[0].tv_nsec = UTIME_OMIT; t[1] = st1.st_mtim; if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0) error (1, errno, "utimensat failed"); if (fstat64 (fd, &st2) != 0) error (1, errno, "fstat failed"); if (st2.st_atim.tv_sec != st1.st_atim.tv_sec \|\| st2.st_atim.tv_nsec != st1.st_atim.tv_nsec) { puts ("mtim changed from original time"); status = 1; } if (st2.st_mtim.tv_sec != st1.st_mtim.tv_sec \|\| st2.st_mtim.tv_nsec != st1.st_mtim.tv_nsec) { puts ("mtim not set"); status = 1; } if (status != 0) goto out; sleep (2); t[0].tv_sec = 0; t[0].tv_nsec = UTIME_NOW; t[1].tv_sec = 0; t[1].tv_nsec = UTIME_NOW; if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0) error (1, errno, "utimensat failed"); if (fstat64 (fd, &st2) != 0) error (1, errno, "fstat failed"); struct timeval tv; gettimeofday(&tv,NULL); if (st2.st_atim.tv_sec <= st1.st_atim.tv_sec \|\| st2.st_atim.tv_sec > tv.tv_sec) { puts ("atim not set to NOW"); status = 1; } if (st2.st_mtim.tv_sec <= st1.st_mtim.tv_sec \|\| st2.st_mtim.tv_sec > tv.tv_sec) { puts ("mtim not set to NOW"); status = 1; } if (symlink ("ttt", "tttsym") != 0) error (1, errno, "cannot create symlink"); t[0].tv_sec = 0; t[0].tv_nsec = 0; t[1].tv_sec = 0; t[1].tv_nsec = 0; if (syscall(__NR_utimensat, AT_FDCWD, "tttsym", t, AT_SYMLINK_NOFOLLOW) != 0) error (1, errno, "utimensat failed"); if (lstat64 ("tttsym", &st2) != 0) error (1, errno, "lstat failed"); if (st2.st_atim.tv_sec != 0 \|\| st2.st_atim.tv_nsec != 0) { puts ("symlink atim not reset to zero"); status = 1; } if (st2.st_mtim.tv_sec != 0 \|\| st2.st_mtim.tv_nsec != 0) { puts ("symlink mtim not reset to zero"); status = 1; } if (status != 0) goto out; t[0].tv_sec = 1; t[0].tv_nsec = 0; t[1].tv_sec = 1; t[1].tv_nsec = 0; if (syscall(__NR_utimensat, fd, NULL, t, 0) != 0) error (1, errno, "utimensat failed"); if (fstat64 (fd, &st2) != 0) error (1, errno, "fstat failed"); if (st2.st_atim.tv_sec != 1 \|\| st2.st_atim.tv_nsec != 0) { puts ("atim not reset to one"); status = 1; } if (st2.st_mtim.tv_sec != 1 \|\| st2.st_mtim.tv_nsec != 0) { puts ("mtim not reset to one"); status = 1; } if (status == 0) puts ("all OK"); out: close (fd); unlink ("ttt"); unlink ("tttsym"); return status; } [akpm@linux-foundation.org: add missing i386 syscall table entry] Signed-off-by: Ulrich Drepper <drepper@redhat.com> Cc: Alexey Dobriyan <adobriyan@openvz.org> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:18 -07:00
Eric Dumazet	5517d86bea	Speed up divides by cpu_power in scheduler I noticed expensive divides done in try_to_wakeup() and find_busiest_group() on a bi dual core Opteron machine (total of 4 cores), moderatly loaded (15.000 context switch per second) oprofile numbers : CPU: AMD64 processors, speed 2600.05 MHz (estimated) Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 50000 samples % symbol name ... 613914 1.0498 try_to_wake_up 834 0.0013 :ffffffff80227ae1: div %rcx 77513 0.1191 :ffffffff80227ae4: mov %rax,%r11 608893 1.0413 find_busiest_group 1841 0.0031 :ffffffff802260bf: div %rdi 140109 0.2394 :ffffffff802260c2: test %sil,%sil Some of these divides can use the reciprocal divides we introduced some time ago (currently used in slab AFAIK) We can assume a load will fit in a 32bits number, because with a SCHED_LOAD_SCALE=128 value, its still a theorical limit of 33554432 When/if we reach this limit one day, probably cpus will have a fast hardware divide and we can zap the reciprocal divide trick. Ingo suggested to rename cpu_power to __cpu_power to make clear it should not be modified without changing its reciprocal value too. I did not convert the divide in cpu_avg_load_per_task(), because tracking nr_running changes may be not worth it ? We could use a static table of 32 reciprocal values but it would add a conditional branch and table lookup. [akpm@linux-foundation.org: !SMP build fix] Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:17 -07:00
Siddha, Suresh B	46cb4b7c88	sched: dynticks idle load balancing Fix the process idle load balancing in the presence of dynticks. cpus for which ticks are stopped will sleep till the next event wakes it up. Potentially these sleeps can be for large durations and during which today, there is no periodic idle load balancing being done. This patch nominates an owner among the idle cpus, which does the idle load balancing on behalf of the other idle cpus. And once all the cpus are completely idle, then we can stop this idle load balancing too. Checks added in fast path are minimized. Whenever there are busy cpus in the system, there will be an owner(idle cpu) doing the system wide idle load balancing. Open items: 1. Intelligent owner selection (like an idle core in a busy package). 2. Merge with rcu's nohz_cpu_mask? Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:17 -07:00
Mike Frysinger	a7e27d5dd3	sanitize linux/isdn_divertif.h for userspace the isdn_divertif contains kernel-only references so I've wrapped them in __KERNEL__ and add proper #include statements. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Karsten Keil <kkeil@suse.de> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:16 -07:00
Adrian Bunk	3a3a51d1f2	make drivers/isdn/capi/capiutil.c:cdebbuf_alloc() static Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Karsten Keil <kkeil@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:16 -07:00
David Brownell	33e34dc6ee	SPI kerneldoc Various documentation updates for the SPI infrastructure, to clarify things that may not have been clear, to cope with lack of editing, and fix omissions. Also, plug SPI into the kernel-api DocBook template, and fix all the resulting glitches in document generation. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: "Randy.Dunlap" <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:16 -07:00
Andrea Paterniani	814a8d50eb	/dev/spidevB.C interface Add a filesystem API for <linux/spi/spi.h> stack. The initial version of this interface is purely synchronous. dbrownell@users.sourceforge.net: Cleaned up, bugfixed; much simplified; added preliminary documentation. Works with mdev given CONFIG_SYSFS_DEPRECATED; and presumably udev. Updated SPI_IOC_MESSAGE ioctl to full spi_message semantics, supporting groups of one or more transfers (each of which may be full duplex if desired). This is marked as EXPERIMENTAL with an explicit disclaimer that the API (notably the ioctls) is subject to change. Signed-off-by: Andrea Paterniani <a.paterniani@swapp-eng.it> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:15 -07:00
Sergei Shtylyov	ce0be1273d	clockchips.h: kernel-doc fix Fix misnamed fields of 'struct clock_event_device' in the kernel-doc comment. Convert the acronyms to uppercase, while at it... Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:15 -07:00
Mike Frysinger	acd64b7375	hide spinlock in linux/quota.h behind __KERNEL__ Signed-off-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Jan Kara <jack@ucw.cz> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:15 -07:00
David Woodhouse	6d4d8c0aa2	Add taskstats.h to kbuild Add taskstats.h to include/linux/Kbuild, make headers_install would then pickup taskstats.h. This needs to be done as taskstats.h is a user interface header. Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org> Cc: Don Zickus <dzickus@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:15 -07:00
Jiri Slaby	cef2cf0727	Misc: add sensable phantom driver Add sensable phantom driver Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:14 -07:00
akpm@linux-foundation.org	f19b121e21	Driver for the Maxim DS1WM, a 1-wire bus master ASIC core Cc: Matt Reimer <mreimer@vpop.net> [akpm@linux-foundation.org: kconfig update] Signed-off-by: Matt Reimer <mreimer@vpop.net> Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:14 -07:00
Randy Dunlap	6df95fd7ad	consolidate asm/const.h to linux/const.h Make a global linux/const.h header file instead of having multiple, per-arch files, and convert current users of asm/const.h to use linux/const.h. Built on x86_64 and sparc64. [akpm@linux-foundation.org: fix include/asm-x86_64/Kbuild] Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:13 -07:00
OGAWA Hirofumi	28ec039c21	fat: don't use free_clusters for fat32 It seems that the recent Windows changed specification, and it's undocumented. Windows doesn't update ->free_clusters correctly. This patch doesn't use ->free_clusters by default. (instead, add "usefree" for forcing to use it) Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Juergen Beisert <juergen127@kreuzholzen.de> Cc: Andreas Schwab <schwab@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:13 -07:00
Jan Kara	28be5abb40	ext3: copy i_flags to inode flags on write A patch that stores inode flags such as S_IMMUTABLE, S_APPEND, etc. from i_flags to EXT3_I(inode)->i_flags when inode is written to disk. The same thing is done on GETFLAGS ioctl. Quota code changes these flags on quota files (to make it harder for sysadmin to screw himself) and these changes were not correctly propagated into the filesystem (especially, lsattr did not show them and users were wondering...). Propagate flags such as S_APPEND, S_IMMUTABLE, etc. from i_flags into ext3-specific i_flags. Hence, when someone sets these flags via a different interface than ioctl, they are stored correctly. Signed-off-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:12 -07:00
Michael Ellerman	d1ab824be4	Document SPIN_LOCK_UNLOCKED/RW_LOCK_UNLOCKED deprecation Apparently it's not cool anymore to use SPIN/RW_LOCK_UNLOCKED. There's some mention of this in Documentation/spinlocks.txt, but that only talks about dynamic initialisation. A comment in the code mentioning the preferred usage would be good IMHO. [akpm@linux-foundation.org: add reason for deprecation] Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:11 -07:00
Pavel Emelianov	b5e618181a	Introduce a handy list_first_entry macro There are many places in the kernel where the construction like foo = list_entry(head->next, struct foo_struct, list); are used. The code might look more descriptive and neat if using the macro list_first_entry(head, type, member) \ list_entry((head)->next, type, member) Here is the macro itself and the examples of its usage in the generic code. If it will turn out to be useful, I can prepare the set of patches to inject in into arch-specific code, drivers, networking, etc. Signed-off-by: Pavel Emelianov <xemul@openvz.org> Signed-off-by: Kirill Korotaev <dev@openvz.org> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Zach Brown <zach.brown@oracle.com> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: John McCutchan <ttb@tentacle.dhs.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Cc: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:11 -07:00
Milind Arun Choudhary	b32e41bb97	SPIN_LOCK_UNLOCKED cleanup in init_task.h SPIN_LOCK_UNLOCKED cleanup,use __SPIN_LOCK_UNLOCKED instead Signed-off-by: Milind Arun Choudhary <milindchoudhary@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:10 -07:00
Bjorn Helgaas	873ec74615	EFI: warn only for pre-1.00 system tables We used to warn unless the EFI system table major revision was exactly 1. But EFI 2.00 firmware is starting to appear, and the 2.00 changes don't affect anything in Linux. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Andi Kleen <ak@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:10 -07:00
David Brownell	49a4ec188f	fix hotplug for legacy platform drivers We've had various reports of some legacy "probe the hardware" style platform drivers having nasty problems with hotplug support. The core issue is that those legacy drivers don't fully conform to the driver model. They assume a role that should be the responsibility of infrastructure code: creating device nodes. The "modprobe" step in hotplugging relies on drivers to have split those roles into different modules. The lack of this split causes the problems. When a driver creates nodes for devices that don't exist (sending a hotplug event), then exits (aborting one modprobe) before the "modprobe $MODALIAS" step completes (by failing, since it's in the middle of a modprobe), the result can be an endless loop of modprobe invocations ... badness. This fix uses the newish per-device flag controlling issuance of "add" events. (A previous version of this patch used a per-device "driver can hotplug" flag, which only scrubbed $MODALIAS from the environment rather than suppressing the entire hotplug event.) It also shrinks that flag to one bit, saving a word in "struct device". So the net of this patch is removing some nasty failures with legacy drivers, while retaining hotplug capability for the majority of platform drivers. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Greg KH <gregkh@suse.de> Cc: Andres Salomon <dilinger@debian.org> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:10 -07:00
Christoph Hellwig	6272e26679	cleanup compat ioctl handling Merge all compat ioctl handling into compat_ioctl.c instead of splitting it over compat.c and compat_ioctl.c. This also allows to get rid of ioctl32.h Signed-off-by: Christoph Hellwig <hch@lst.de> Looks-good-to: Andi Kleen <ak@suse.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:09 -07:00
Ravikiran G Thirumalai	e729aa16b1	Pad irq_desc to internode cacheline size We noticed a drop in n/w performance due to the irq_desc being cacheline aligned rather than internode aligned. We see 50% of expected performance when two e1000 nics local to two different nodes have consecutive irq descriptors allocated, due to false sharing. Note that this patch does away with cacheline padding for the UP case, as it does not seem useful for UP configurations. Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:09 -07:00
Christoph Hellwig	644fd4f5de	merge compat_ioctl.h into compat_ioctl.c Now that there is no arch-specific compat ioctl handling left there is not point in having a separate copat_ioctl.h, so merge it into compat_ioctl.c Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:09 -07:00
Thomas Gleixner	0e8638e2ac	Deprecate SA_xxx interrupt flags -V2 The deprecation of the SA_xxx interrupt flags did not emit deprecated warnings. Andrew said about the removal of the deprecated flag defines: > This is going to break a lot of external stuff. We should have found > a way to make usage of SA_* emit deprecated warnings (or _some_ > warning) to warn people of impending doom. But I can't immediately > find a way of doing that. if we _can_ find a way of doing this, I > suspect we'll need to do it, and give people another six months. It's > going to get ugly out there. We shall see... Define the deprecated flags as a call to a __deprecated inline function so a warning is emitted on compile time. Extend the reprieve of out of tree drivers to 9/2007. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:08 -07:00
Alexey Dobriyan	a5c43dae7a	Fix race between cat /proc/slab_allocators and rmmod Same story as with cat /proc/*/wchan race vs rmmod race, only /proc/slab_allocators want more info than just symbol name. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:08 -07:00
Alexey Dobriyan	9d65cb4a17	Fix race between cat /proc//wchan and rmmod et al kallsyms_lookup() can go iterating over modules list unprotected which is OK for emergency situations (oops), but not OK for regular stuff like /proc//wchan. Introduce lookup_symbol_name()/lookup_module_symbol_name() which copy symbol name into caller-supplied buffer or return -ERANGE. All copying is done with module_mutex held, so... Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:08 -07:00
Alexey Dobriyan	ea07890a68	Fix race between rmmod and cat /proc/kallsyms module_get_kallsym() leaks "struct module *" outside of module_mutex which is no-no, because module can dissapear right after mutex unlock. Copy all needed information from inside module_mutex into caller-supplied space. [bunk@stusta.de: is_exported() can now become static] Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:08 -07:00
Alexey Dobriyan	ae84e32470	Simplify module_get_kallsym() by dropping length arg module_get_kallsym() could in theory truncate module symbol name to fit in buffer, but nobody does this. Always use KSYM_NAME_LEN + 1 bytes for name. Suggested by lg^WRusty. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:08 -07:00
Ananth N Mavinakayanahalli	0f95b7fc83	Kprobes: print details of kretprobe on assertion failure In certain cases like when the real return address can't be found or when the number of tracked calls to a kretprobed function is less than the number of returns, we may not be able to find the correct return address after processing a kretprobe. Currently we just do a BUG_ON, but no information is provided about the actual failing kretprobe. Print out details of the kretprobe before calling BUG(). Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Maneesh Soni <maneesh@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:08 -07:00
Simon Horman	6672f76a5a	kdump/kexec: calculate note size at compile time Currently the size of the per-cpu region reserved to save crash notes is set by the per-architecture value MAX_NOTE_BYTES. Which in turn is currently set to 1024 on all supported architectures. While testing ia64 I recently discovered that this value is in fact too small. The particular setup I was using actually needs 1172 bytes. This lead to very tedious failure mode where the tail of one elf note would overwrite the head of another if they ended up being alocated sequentially by kmalloc, which was often the case. It seems to me that a far better approach is to caclculate the size that the area needs to be. This patch does just that. If a simpler stop-gap patch for ia64 to be squeezed into 2.6.21(.X) is needed then this should be as easy as making MAX_NOTE_BYTES larger in arch/asm-ia64/kexec.h. Perhaps 2048 would be a good choice. However, I think that the approach in this patch is a much more robust idea. Acked-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:07 -07:00
Ken Chen	7328508274	remove artificial software max_loop limit Remove artificial maximum 256 loop device that can be created due to a legacy device number limit. Searching through lkml archive, there are several instances where users complained about the artificial limit that the loop driver impose. There is no reason to have such limit. This patch rid the limit entirely and make loop device and associated block queue instantiation on demand. With on-demand instantiation, it also gives the benefit of not wasting memory if these devices are not in use (compare to current implementation that always create 8 loop devices), a net improvement in both areas. This version is both tested with creation of large number of loop devices and is compatible with existing losetup/mount user land tools. There are a number of people who worked on this and provided valuable suggestions, in no particular order, by: Jens Axboe Jan Engelhardt Christoph Hellwig Thomas M Signed-off-by: Ken Chen <kenchen@google.com> Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:07 -07:00
Jeremy Fitzhardinge	04c9167f91	add touch_all_softlockup_watchdogs() Add touch_all_softlockup_watchdogs() to allow the softlockup watchdog timers on all cpus to be updated. This is used to prevent sysrq-t from generating a spurious watchdog message when generating lots of output. Softlockup watchdogs use sched_clock() as its timebase, which is inherently per-cpu (at least, when it is measuring unstolen time). Because of this, it isn't possible for one CPU to directly update the other CPU's timers, but it is possible to tell the other CPUs to do update themselves appropriately. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Acked-by: Chris Lalancette <clalance@redhat.com> Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Rick Lindsley <ricklind@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:06 -07:00
john stultz	8524070b79	Move timekeeping code to timekeeping.c Move the timekeeping code out of kernel/timer.c and into kernel/time/timekeeping.c. I made no cleanups or other changes in transit. [akpm@linux-foundation.org: build fix] Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:06 -07:00
Eric Dumazet	329c8d84ca	time: SMP friendly alignment of struct clocksource struct clocksource is a critical data structure. Most of its fields are read only, some of them are heavily modified at each timer interrupt. It makes sense to separate those fields and make sure they all share one cache line, or at least the minimum for machines with small cache lines. [akpm@linux-foundation.org: build fix] Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:06 -07:00
Ralf Baechle	3367b994fe	<linux/sysdev.h> needs to include <linux/module.h> sysdev.h uses THIS_MODULE so should include <linux/module.h>. [akpm@linux-foundation.org: couple of fixes] Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: Andi Kleen <ak@suse.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:05 -07:00
Venki Pallipadi	28287033e1	Add a new deferrable delayed work init Add a new deferrable delayed work init. This can be used to schedule work that are 'unimportant' when CPU is idle and can be called later, when CPU eventually comes out of idle. Use this init in cpufreq ondemand governor. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:05 -07:00
Venki Pallipadi	6e453a6751	Add support for deferrable timers Introduce a new flag for timers - deferrable: Timers that work normally when system is busy. But, will not cause CPU to come out of idle (just to service this timer), when CPU is idle. Instead, this timer will be serviced when CPU eventually wakes up with a subsequent non-deferrable timer. The main advantage of this is to avoid unnecessary timer interrupts when CPU is idle. If the routine currently called by a timer can wait until next event without any issues, this new timer can be used to setup timer event for that routine. This, with dynticks, allows CPUs to be lazy, allowing them to stay in idle for extended period of time by reducing unnecesary wakeup and thereby reducing the power consumption. This patch: Builds this new timer on top of existing timer infrastructure. It uses last bit in 'base' pointer of timer_list structure to store this deferrable timer flag. __next_timer_interrupt() function skips over these deferrable timers when CPU looks for next timer event for which it has to wake up. This is exported by a new interface init_timer_deferrable() that can be called in place of regular init_timer(). [akpm@linux-foundation.org: Privatise a #define] Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:05 -07:00
David Brownell	c15a3837d2	parport->dev driver model support Currently a parport_driver can't get a handle on the device node for the underlying parport (PNPACPI, PCI, etc). That prevents correct placement of sysfs child nodes, which can affect things like power management. This patch adds a field to "struct parport" pointing to that device node, and updates non-legacy port drivers to initialize that device pointer. That field replaces the analagous PCI-only support in parport_pc. [akpm@linux-foundation.org: fix powerpc build] Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:05 -07:00
Robert P. J. Day	c467a388ae	Delete unused header file linux/awe_voice.h Delete the unused header file include/linux/awe_voice.h, as well as its corresponding Kbuild entry. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:05 -07:00
Adrian Bunk	e5f00f42f3	make remove_inode_dquot_ref() static remove_inode_dquot_ref() can now become static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:05 -07:00
Mark Fasheh	ef51c97623	Remove do_sync_file_range() Remove do_sync_file_range() and convert callers to just use do_sync_mapping_range(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:04 -07:00
Christoph Hellwig	1eeb66a1bb	move die notifier handling to common code This patch moves the die notifier handling to common code. Previous various architectures had exactly the same code for it. Note that the new code is compiled unconditionally, this should be understood as an appel to the other architecture maintainer to implement support for it aswell (aka sprinkling a notify_die or two in the proper place) arm had a notifiy_die that did something totally different, I renamed it to arm_notify_die as part of the patch and made it static to the file it's declared and used at. avr32 used to pass slightly less information through this interface and I brought it into line with the other architectures. [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix vmalloc_sync_all bustage] [bryan.wu@analog.com: fix vmalloc_sync_all in nommu] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: <linux-arch@vger.kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Bryan Wu <bryan.wu@analog.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:04 -07:00
Eric W. Biederman	98a27ba485	tty: introduce no_tty and use it in selinux While researching the tty layer pid leaks I found a weird case in selinux when we drop a controlling tty because of inadequate permissions we don't do the normal hangup processing. Which is a problem if it happens the session leader has exec'd something that can no longer access the tty. We already have code in the kernel to handle this case in the form of the TIOCNOTTY ioctl. So this patch factors out a helper function that is the essence of that ioctl and calls it from the selinux code. This removes the inconsistency in handling dropping of a controlling tty and who knows it might even make some part of user space happy because it received a SIGHUP it was expecting. In addition since this removes the last user of proc_set_tty outside of tty_io.c proc_set_tty is made static and removed from tty.h Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: James Morris <jmorris@namei.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:04 -07:00
Andrew Morton	6ae9200f2c	enlarge console.name console.name[] is eight chars, but so is "earlyvga". So when we try to print console->name when using earlyvga it runs off the end of the string. Make it bigger. Diagnosed-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:04 -07:00
Rusty Russell	9adef58b1d	futex: get_futex_key, get_key_refs and drop_key_refs lguest uses the convenient futex infrastructure for inter-domain I/O, so expose get_futex_key, get_key_refs (renamed get_futex_key_refs) and drop_key_refs (renamed drop_futex_key_refs). Also means we need to expose the union that these use. No code changes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:03 -07:00
Klaus Kudielka	1a86b5e34e	cyclades: remove custom types Switch from private uclong, etc over to standard types. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:03 -07:00
Klaus Kudielka	7c4e95bf48	fix cyclades.h for x86_64 (and probably others) At least on x86_64 the present cyclades.h is broken due to the wrong size of uclong. This affects, of course, both the kernel and the user-level utilities. The symptom is that cyzload refuses to load the firmware. I also managed to freeze the machine when unloading the module. The patch below fixes this in an architecture-independent way. I have tested it with 2.6.19 and the driver works fine again with a Cyclades-Z on an Athlon 64 X2. [akpm@linux-foundation.org: fix warnings] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:03 -07:00
Ananth N Mavinakayanahalli	9b3af29bf3	Kprobes: Make kprobe.symbol_name const Kprobes doesn't scribble the kprobe.symbol_name field. Its only set by the module when registering the probe. Modules that exercise good hygiene using the "const" qualifier will see warnings... warning: assignment discards qualifiers from pointer target type Make struct kprobe.symbol_name const char * Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Jim Keniston <jkenisto@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:03 -07:00
Eric Dumazet	c23fbb6bcb	VFS: delay the dentry name generation on sockets and pipes 1) Introduces a new method in 'struct dentry_operations'. This method called d_dname() might be called from d_path() to build a pathname for special filesystems. It is called without locks. Future patches (if we succeed in having one common dentry for all pipes/sockets) may need to change prototype of this method, but we now use : char d_dname(struct dentry dentry, char buffer, int buflen); 2) Adds a dynamic_dname() helper function that eases d_dname() implementations 3) Defines d_dname method for sockets : No more sprintf() at socket creation. This is delayed up to the moment someone does an access to /proc/pid/fd/... 4) Defines d_dname method for pipes : No more sprintf() at pipe creation. This is delayed up to the moment someone does an access to /proc/pid/fd/... A benchmark consisting of 1.000.000 calls to pipe()/close()/close() gives a nice* speedup on my Pentium(M) 1.6 Ghz : 3.090 s instead of 3.450 s Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Christoph Hellwig <hch@infradead.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:03 -07:00
Alexey Dobriyan	7695650a92	Fix race between proc_get_inode() and remove_proc_entry() proc_lookup remove_proc_entry =========== ================= lock_kernel(); spin_lock(&proc_subdir_lock); [find PDE with refcount 0] spin_unlock(&proc_subdir_lock); spin_lock(&proc_subdir_lock); [find PDE with refcount 0] [check refcount and free PDE] spin_unlock(&proc_subdir_lock); proc_get_inode: de_get(de); /* boom */ Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:01 -07:00
Miklos Szeredi	79c0b2df79	add filesystem subtype support There's a slight problem with filesystem type representation in fuse based filesystems. From the kernel's view, there are just two filesystem types: fuse and fuseblk. From the user's view there are lots of different filesystem types. The user is not even much concerned if the filesystem is fuse based or not. So there's a conflict of interest in how this should be represented in fstab, mtab and /proc/mounts. The current scheme is to encode the real filesystem type in the mount source. So an sshfs mount looks like this: sshfs#user@server:/ /mnt/server fuse rw,nosuid,nodev,... This url-ish syntax works OK for sshfs and similar filesystems. However for block device based filesystems (ntfs-3g, zfs) it doesn't work, since the kernel expects the mount source to be a real device name. A possibly better scheme would be to encode the real type in the type field as "type.subtype". So fuse mounts would look like this: /dev/hda1 /mnt/windows fuseblk.ntfs-3g rw,... user@server:/ /mnt/server fuse.sshfs rw,nosuid,nodev,... This patch adds the necessary code to the kernel so that this can be correctly displayed in /proc/mounts. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:01 -07:00
David Brownell	2e17c5508f	init dma masks in pnp_dev PNP now initializes device dma masks, which prevents oopses when generic dma calls are made using pnp device nodes. This assumes PNP only uses ISA DMA, with 24 bit addresses; and that it's safe to init those masks for all devices (rather than finding out which devices have been assigned DMA channels, and handling only those). Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Adam Belay <abelay@novell.com> Cc: Jaroslav Kysela <perex@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:00 -07:00
Badari Pulavarty	e3222c4ecc	Merge sys_clone()/sys_unshare() nsproxy and namespace handling sys_clone() and sys_unshare() both makes copies of nsproxy and its associated namespaces. But they have different code paths. This patch merges all the nsproxy and its associated namespace copy/clone handling (as much as possible). Posted on container list earlier for feedback. - Create a new nsproxy and its associated namespaces and pass it back to caller to attach it to right process. - Changed all copy__ns() routines to return a new copy of namespace instead of attaching it to task->nsproxy. - Moved the CAP_SYS_ADMIN checks out of copy__ns() routines. - Removed unnessary !ns checks from copy__ns() and added BUG_ON() just incase. - Get rid of all individual unshare__ns() routines and make use of copy_*_ns() instead. [akpm@osdl.org: cleanups, warning fix] [clg@fr.ibm.com: remove dup_namespaces() declaration] [serue@us.ibm.com: fix CONFIG_IPC_NS=n, clone(CLONE_NEWIPC) retval] [akpm@linux-foundation.org: fix build with CONFIG_SYSVIPC=n] Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Serge Hallyn <serue@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: <containers@lists.osdl.org> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:00 -07:00
Monakhov Dmitriy	616883df78	IRQ: add __must_check to request_irq This could help to find buggy drivers where request_irq return value wasn't checked. There's just no reason to ignore errors which can and do occur. Anyone who got warning during compilation have to realise what it is't realy safe code. Signed-off-by: Monakhov Dmitriy <dmonakhov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:00 -07:00
Alexey Dobriyan	fe08a9d498	reiserfs: shrink superblock if no xattrs This makes in-core superblock fit into one cacheline here. Before: struct dentry * xattr_root; /* 124 4 / / --- cacheline 1 boundary (128 bytes) --- / struct rw_semaphore xattr_dir_sem; / 128 12 / int j_errno; / 140 4 / }; / size: 144, cachelines: 2 / / sum members: 142, holes: 1, sum holes: 2 / / last cacheline: 16 bytes / After: int j_errno; / 124 4 / / --- cacheline 1 boundary (128 bytes) --- / }; / size: 128, cachelines: 1 / / sum members: 126, holes: 1, sum holes: 2 */ Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: <reiserfs-dev@namesys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:00 -07:00
Michal Schmidt	ee7b9e3706	Fix compilation of drivers with -O0 It is sometimes useful to compile individual drivers with optimization disabled for easier debugging. Currently drivers which use htonl() and similar functions don't compile with -O0. This patch fixes it. It also removes obsolete and misleading comments. This header is not for userspace, so we don't have to care about strange programs these comments mention. (akpm: -O0 probably isn't a good idea, but this code looks pretty crufty and unuseful) Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:00 -07:00
Adrian Bunk	46595390e9	init/do_mounts.c: proper prepare_namespace() prototype Add a proper protype for prepare_namespace() in include/linux/init.h. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:00 -07:00
Chris Snook	1ae7075bcd	use use SEEK_MAX to validate user lseek arguments Add SEEK_MAX and use it to validate lseek arguments from userspace. Signed-off-by: Chris Snook <csnook@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:14:59 -07:00
Trent Piepho	8e2c20023f	Fix constant folding and poor optimization in byte swapping code Constant folding does not work for the swabXX() byte swapping functions, and the C versions optimize poorly. Attempting to initialize a global variable to swab16(0x1234) or put something like "case swab32(42):" in a switch statement will not compile. It can work, swab.h just isn't doing it correctly. This patch fixes that. Contrary to the comment in asm-i386/byteorder.h, gcc does not recognize the "C" version of swab16 and turn it into efficient code. gcc can do this, just not with the current code. The simple function: u16 foo(u16 x) { return swab16(x); } Would compile to: movzwl %ax, %eax movl %eax, %edx shrl $8, %eax sall $8, %edx orl %eax, %edx With this patch, it will compile to: rolw $8, %ax I also attempted to document the maze different macros/inline functions that are used to create the final product. Signed-off-by: Trent Piepho <xyzzy@speakeasy.org> Cc: Francois-Rene Rideau <fare@tunes.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:14:59 -07:00
William Cohen	97dc32cdb1	reduce size of task_struct on 64-bit machines This past week I was playing around with that pahole tool (http://oops.ghostprotocols.net:81/acme/dwarves/) and looking at the size of various struct in the kernel. I was surprised by the size of the task_struct on x86_64, approaching 4K. I looked through the fields in task_struct and found that a number of them were declared as "unsigned long" rather than "unsigned int" despite them appearing okay as 32-bit sized fields. On x86_64 "unsigned long" ends up being 8 bytes in size and forces 8 byte alignment. Is there a reason there a reason they are "unsigned long"? The patch below drops the size of the struct from 3808 bytes (60 64-byte cachelines) to 3760 bytes (59 64-byte cachelines). A couple other fields in the task struct take a signficant amount of space: struct thread_struct thread; 688 struct held_lock held_locks[30]; 1680 CONFIG_LOCKDEP is turned on in the .config [akpm@linux-foundation.org: fix printk warnings] Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:14:58 -07:00
Christoph Hellwig	ab1b6f03a1	simplify the stacktrace code Simplify the stacktrace code: - remove the unused task argument to save_stack_trace, it's always current - remove the all_contexts flag, it's alwasy 0 Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Andi Kleen <ak@suse.de> Cc: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:14:58 -07:00
Guillaume Chazarain	3e9f45bd18	Factor outstanding I/O error handling Cleanup: setting an outstanding error on a mapping was open coded too many times. Factor it out in mapping_set_error(). Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:14:57 -07:00
Dmitriy Monakhov	0ceb331433	mm: move common segment checks to separate helper function [akpm@linux-foundation.org: cleanup] Signed-off-by: Monakhov Dmitriy <dmonakhov@openvz.org> Cc: Christoph Hellwig <hch@lst.de> Acked-by: Anton Altaparmakov <aia21@cam.ac.uk> Acked-by: David Chinner <dgc@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:14:57 -07:00
David Woodhouse	b46b8f19c9	Increase slab redzone to 64bits There are two problems with the existing redzone implementation. Firstly, it's causing misalignment of structures which contain a 64-bit integer, such as netfilter's 'struct ipt_entry' -- causing netfilter modules to fail to load because of the misalignment. (In particular, the first check in net/ipv4/netfilter/ip_tables.c::check_entry_size_and_hooks()) On ppc32 and sparc32, amongst others, __alignof__(uint64_t) == 8. With slab debugging, we use 32-bit redzones. And allocated slab objects aren't sufficiently aligned to hold a structure containing a uint64_t. By _just_ setting ARCH_KMALLOC_MINALIGN to __alignof__(u64) we'd disable redzone checks on those architectures. By using 64-bit redzones we avoid that loss of debugging, and also fix the other problem while we're at it. When investigating this, I noticed that on 64-bit platforms we're using a 32-bit value of RED_ACTIVE/RED_INACTIVE in the 64-bit memory location set aside for the redzone. Which means that the four bytes immediately before or after the allocated object at 0x00,0x00,0x00,0x00 for LE and BE machines, respectively. Which is probably not the most useful choice of poison value. One way to fix both of those at once is just to switch to 64-bit redzones in all cases. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Christoph Lameter <clameter@engr.sgi.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:14:57 -07:00
Dmitry Torokhov	334d0dd8b6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	2007-05-08 01:31:11 -04:00
Paul Mackerras	02bbc0f09c	Merge branch 'linux-2.6'	2007-05-08 13:37:51 +10:00
Johannes Berg	f596575e81	[POWERPC] via-pmu: remove LED sleep notifier The generic LED code now makes sure that suspended devices don't blink, so we no longer need to do it ourselves. For the suspend to disk case, however, we need to make sure that we don't blink if the PMU sysdev was suspended before the LED device. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-08 11:54:19 +10:00
Vitaly Wool	972edcb79e	[MTD] [NAND] platform NAND driver: update header This patch extends nand.h in order to enable platform NAND driver. Signed-off-by: Vitaly Wool <vitalywool@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2007-05-08 00:40:32 +01:00
Linus Torvalds	2d56d3c43c	Merge branch 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux * 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux: gfs2: nfs lock support for gfs2 lockd: add code to handle deferred lock requests lockd: always preallocate block in nlmsvc_lock() lockd: handle test_lock deferrals lockd: pass cookie in nlmsvc_testlock lockd: handle fl_grant callbacks lockd: save lock state on deferral locks: add fl_grant callback for asynchronous lock return nfsd4: Convert NFSv4 to new lock interface locks: add lock cancel command locks: allow {vfs,posix}_lock_file to return conflicting lock locks: factor out generic/filesystem switch from setlock code locks: factor out generic/filesystem switch from test_lock locks: give posix_test_lock same interface as ->lock locks: make ->lock release private data before returning in GETLK case locks: create posix-to-flock helper functions locks: trivial removal of unnecessary parentheses	2007-05-07 12:34:24 -07:00
Linus Torvalds	5cefcab3db	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (34 commits) [GFS2] Uncomment sprintf_symbol calling code [DLM] lowcomms style [GFS2] printk warning fixes [GFS2] Patch to fix mmap of stuffed files [GFS2] use lib/parser for parsing mount options [DLM] Lowcomms nodeid range & initialisation fixes [DLM] Fix dlm_lowcoms_stop hang [DLM] fix mode munging [GFS2] lockdump improvements [GFS2] Patch to detect corrupt number of dir entries in leaf and/or inode blocks [GFS2] bz 236008: Kernel gpf doing cat /debugfs/gfs2/xxx (lock dump) [DLM] fs/dlm/ast.c should #include "ast.h" [DLM] Consolidate transport protocols [DLM] Remove redundant assignment [GFS2] Fix bz 234168 (ignoring rgrp flags) [DLM] change lkid format [DLM] interface for purge (2/2) [DLM] add orphan purging code (1/2) [DLM] split create_message function [GFS2] Set drop_count to 0 (off) by default ...	2007-05-07 12:26:27 -07:00
Linus Torvalds	9fa0853a85	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [NET]: rfkill: add support for input key to control wireless radio [NET] net/core: Fix error handling [TG3]: Update version and reldate. [TG3]: Eliminate spurious interrupts. [TG3]: Add ASPM workaround. [Bluetooth] Correct SCO buffer for another Broadcom based dongle [Bluetooth] Add support for Targus ACB10US USB dongle [Bluetooth] Disconnect L2CAP connection after last RFCOMM DLC [Bluetooth] Check that device is in rfcomm_dev_list before deleting [Bluetooth] Use in-kernel sockets API [Bluetooth] Attach host adapters to the Bluetooth bus [Bluetooth] Fix L2CAP and HCI setsockopt() information leaks	2007-05-07 12:23:31 -07:00
Paolo 'Blaisorblade' Giarrusso	e024715f5f	uml: improve checking and diagnostics of ethernet MACs Improve checking and diagnostics for broadcast and multicast Ethernet MAC addresses, and distinguish between those cases in output; also make sure the device is assigned a MAC address valid only locally to avoid collisions. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:13:02 -07:00
Rusty Russell	c5e631cf65	ARRAY_SIZE: check for type We can use a gcc extension to ensure that ARRAY_SIZE() is handed an array, not a pointer. This is especially important when code is changed from a fixed array to a pointer. I assume the Intel compiler doesn't support __builtin_types_compatible_p. [jdike@addtoit.com: uml: update UML definition of ARRAY_SIZE] Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:13:00 -07:00
Rafael J. Wysocki	56f99bcb52	swsusp: free more memory Move the definition of PAGES_FOR_IO to kernel/power/power.h and introduce SPARE_PAGES representing the number of pages that should be freed by the swsusp's memory shrinker in addition to PAGES_FOR_IO so that device drivers can allocate some memory (up to 1 MB total) in their .suspend() routines without causing the suspend to fail. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@nigel.suspend2.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:59 -07:00
Johannes Berg	ab3bfca7ab	remove software_suspend() Remove software_suspend() and all its users since pm_suspend(PM_SUSPEND_DISK) should be equivalent and there's no point in having two interfaces for the same thing. The patch also changes the valid_state function to return 0 (false) for PM_SUSPEND_DISK when SOFTWARE_SUSPEND is not configured instead of accepting it and having the whole thing fail later. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:59 -07:00
Rafael J. Wysocki	04293355ac	mm: remove unused page flags Remove the two page flags that were previously used by swsusp and are no longer needed. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:59 -07:00
Rafael J. Wysocki	74dfd666de	swsusp: do not use page flags Make swsusp use memory bitmaps instead of page flags for marking 'nosave' and free pages. This allows us to 'recycle' two page flags that can be used for other purposes. Also, the memory needed to store the bitmaps is allocated when necessary (ie. before the suspend) and freed after the resume which is more reasonable. The patch is designed to minimize the amount of changes and there are some nice simplifications and optimizations possible on top of it. I am going to implement them separately in the future. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:59 -07:00
Rafael J. Wysocki	7be9823491	swsusp: use inline functions for changing page flags Replace direct invocations of SetPageNosave(), SetPageNosaveFree() etc. with calls to inline functions that can be changed in subsequent patches without modifying the code calling them. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:58 -07:00
Bryan Wu	194de56127	blackfin: serial driver This patch implements the driver necessary use the Analog Devices Blackfin processor's Serial Port. Signed-off-by: Bryan Wu <bryan.wu@analog.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Russell King <rmk+lkml@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:58 -07:00
Bryan Wu	1394f03221	blackfin architecture This adds support for the Analog Devices Blackfin processor architecture, and currently supports the BF533, BF532, BF531, BF537, BF536, BF534, and BF561 (Dual Core) devices, with a variety of development platforms including those avaliable from Analog Devices (BF533-EZKit, BF533-STAMP, BF537-STAMP, BF561-EZKIT), and Bluetechnix! Tinyboards. The Blackfin architecture was jointly developed by Intel and Analog Devices Inc. (ADI) as the Micro Signal Architecture (MSA) core and introduced it in December of 2000. Since then ADI has put this core into its Blackfin processor family of devices. The Blackfin core has the advantages of a clean, orthogonal,RISC-like microprocessor instruction set. It combines a dual-MAC (Multiply/Accumulate), state-of-the-art signal processing engine and single-instruction, multiple-data (SIMD) multimedia capabilities into a single instruction-set architecture. The Blackfin architecture, including the instruction set, is described by the ADSP-BF53x/BF56x Blackfin Processor Programming Reference http://blackfin.uclinux.org/gf/download/frsrelease/29/2549/Blackfin_PRM.pdf The Blackfin processor is already supported by major releases of gcc, and there are binary and source rpms/tarballs for many architectures at: http://blackfin.uclinux.org/gf/project/toolchain/frs There is complete documentation, including "getting started" guides available at: http://docs.blackfin.uclinux.org/ which provides links to the sources and patches you will need in order to set up a cross-compiling environment for bfin-linux-uclibc This patch, as well as the other patches (toolchain, distribution, uClibc) are actively supported by Analog Devices Inc, at: http://blackfin.uclinux.org/ We have tested this on LTP, and our test plan (including pass/fails) can be found at: http://docs.blackfin.uclinux.org/doku.php?id=testing_the_linux_kernel [m.kozlowski@tuxland.pl: balance parenthesis in blackfin header files] Signed-off-by: Bryan Wu <bryan.wu@analog.com> Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl> Signed-off-by: Aubrey Li <aubrey.li@analog.com> Signed-off-by: Jie Zhang <jie.zhang@analog.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:58 -07:00
Christoph Lameter	906e0be197	page migration: Only migrate pages if allocation in the highest zone is possible Address spaces contain an allocation flag that specifies restriction on the zone for pages placed in the mapping. I.e. some device may require pages to be allocated from a DMA zone. Block devices may not be able to use pages from HIGHMEM. Memory policies and the common use of page migration works only on the highest zone. If the address space does not allow allocation from the highest zone then the pages in the address space are not migratable simply because we can only allocate memory for a specified node if we allow allocation for the highest zone on each node. Acked-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:57 -07:00
Christoph Lameter	cfce66047f	Slab allocators: remove useless __GFP_NO_GROW flag There is no user remaining and I have never seen any use of that flag. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:57 -07:00
Christoph Lameter	4f10493459	slab allocators: Remove SLAB_CTOR_ATOMIC SLAB_CTOR atomic is never used which is no surprise since I cannot imagine that one would want to do something serious in a constructor or destructor. In particular given that the slab allocators run with interrupts disabled. Actions in constructors and destructors are by their nature very limited and usually do not go beyond initializing variables and list operations. (The i386 pgd ctor and dtors do take a spinlock in constructor and destructor..... I think that is the furthest we go at this point.) There is no flag passed to the destructor so removing SLAB_CTOR_ATOMIC also establishes a certain symmetry. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:57 -07:00
Christoph Lameter	50953fe9e0	slab allocators: Remove SLAB_DEBUG_INITIAL flag I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by SLAB. I think its purpose was to have a callback after an object has been freed to verify that the state is the constructor state again? The callback is performed before each freeing of an object. I would think that it is much easier to check the object state manually before the free. That also places the check near the code object manipulation of the object. Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was compiled with SLAB debugging on. If there would be code in a constructor handling SLAB_DEBUG_INITIAL then it would have to be conditional on SLAB_DEBUG otherwise it would just be dead code. But there is no such code in the kernel. I think SLUB_DEBUG_INITIAL is too problematic to make real use of, difficult to understand and there are easier ways to accomplish the same effect (i.e. add debug code before kfree). There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be clear in fs inode caches. Remove the pointless checks (they would even be pointless without removeal of SLAB_DEBUG_INITIAL) from the fs constructors. This is the last slab flag that SLUB did not support. Remove the check for unimplemented flags from SLUB. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:57 -07:00
Christoph Lameter	0a31bd5f2b	KMEM_CACHE(): simplify slab cache creation This patch provides a new macro KMEM_CACHE(<struct>, <flags>) to simplify slab creation. KMEM_CACHE creates a slab with the name of the struct, with the size of the struct and with the alignment of the struct. Additional slab flags may be specified if necessary. Example struct test_slab { int a,b,c; struct list_head; } __cacheline_aligned_in_smp; test_slab_cache = KMEM_CACHE(test_slab, SLAB_PANIC) will create a new slab named "test_slab" of the size sizeof(struct test_slab) and aligned to the alignment of test slab. If it fails then we panic. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:55 -07:00
Christoph Lameter	5af6083990	slab allocators: Remove obsolete SLAB_MUST_HWCACHE_ALIGN This patch was recently posted to lkml and acked by Pekka. The flag SLAB_MUST_HWCACHE_ALIGN is 1. Never checked by SLAB at all. 2. A duplicate of SLAB_HWCACHE_ALIGN for SLUB 3. Fulfills the role of SLAB_HWCACHE_ALIGN for SLOB. The only remaining use is in sparc64 and ppc64 and their use there reflects some earlier role that the slab flag once may have had. If its specified then SLAB_HWCACHE_ALIGN is also specified. The flag is confusing, inconsistent and has no purpose. Remove it. Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:55 -07:00
Peter Zijlstra	f9a14399ae	mm: optimize kill_bdev() Remove duplicate work in kill_bdev(). It currently invalidates and then truncates the bdev's mapping. invalidate_mapping_pages() will opportunistically remove pages from the mapping. And truncate_inode_pages() will forcefully remove all pages. The only thing truncate doesn't do is flush the bh lrus. So do that explicitly. This avoids (very unlikely) but possible invalid lookup results if the same bdev is quickly re-issued. It also will prevent extreme kernel latencies which are observed when blockdevs which have a large amount of pagecache are unmounted, by avoiding invalidate_mapping_pages() on that path. invalidate_mapping_pages() has no cond_resched (it can be called under spinlock), whereas truncate_inode_pages() has one. [akpm@linux-foundation.org: restore nrpages==0 optimisation] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:55 -07:00
Peter Zijlstra	f98393a64c	mm: remove destroy_dirty_buffers from invalidate_bdev() Remove the destroy_dirty_buffers argument from invalidate_bdev(), it hasn't been used in 6 years (so akpm says). find * -name \.[ch] \| xargs grep -l invalidate_bdev \| while read file; do quilt add $file; sed -ie 's/invalidate_bdev($[^,]$,[^)]*)/invalidate_bdev(\1)/g' $file; done Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:55 -07:00
Christoph Lameter	6225e93735	Quicklists for page table pages On x86_64 this cuts allocation overhead for page table pages down to a fraction (kernel compile / editing load. TSC based measurement of times spend in each function): no quicklist pte_alloc 1569048 4.3s(401ns/2.7us/179.7us) pmd_alloc 780988 2.1s(337ns/2.7us/86.1us) pud_alloc 780072 2.2s(424ns/2.8us/300.6us) pgd_alloc 260022 1s(920ns/4us/263.1us) quicklist: pte_alloc 452436 573.4ms(8ns/1.3us/121.1us) pmd_alloc 196204 174.5ms(7ns/889ns/46.1us) pud_alloc 195688 172.4ms(7ns/881ns/151.3us) pgd_alloc 65228 9.8ms(8ns/150ns/6.1us) pgd allocations are the most complex and there we see the most dramatic improvement (may be we can cut down the amount of pgds cached somewhat?). But even the pte allocations still see a doubling of performance. 1. Proven code from the IA64 arch. The method used here has been fine tuned for years and is NUMA aware. It is based on the knowledge that accesses to page table pages are sparse in nature. Taking a page off the freelists instead of allocating a zeroed pages allows a reduction of number of cachelines touched in addition to getting rid of the slab overhead. So performance improves. This is particularly useful if pgds contain standard mappings. We can save on the teardown and setup of such a page if we have some on the quicklists. This includes avoiding lists operations that are otherwise necessary on alloc and free to track pgds. 2. Light weight alternative to use slab to manage page size pages Slab overhead is significant and even page allocator use is pretty heavy weight. The use of a per cpu quicklist means that we touch only two cachelines for an allocation. There is no need to access the page_struct (unless arch code needs to fiddle around with it). So the fast past just means bringing in one cacheline at the beginning of the page. That same cacheline may then be used to store the page table entry. Or a second cacheline may be used if the page table entry is not in the first cacheline of the page. The current code will zero the page which means touching 32 cachelines (assuming 128 byte). We get down from 32 to 2 cachelines in the fast path. 3. x86_64 gets lightweight page table page management. This will allow x86_64 arch code to faster repopulate pgds and other page table entries. The list operations for pgds are reduced in the same way as for i386 to the point where a pgd is allocated from the page allocator and when it is freed back to the page allocator. A pgd can pass through the quicklists without having to be reinitialized. 64 Consolidation of code from multiple arches So far arches have their own implementation of quicklist management. This patch moves that feature into the core allowing an easier maintenance and consistent management of quicklists. Page table pages have the characteristics that they are typically zero or in a known state when they are freed. This is usually the exactly same state as needed after allocation. So it makes sense to build a list of freed page table pages and then consume the pages already in use first. Those pages have already been initialized correctly (thus no need to zero them) and are likely already cached in such a way that the MMU can use them most effectively. Page table pages are used in a sparse way so zeroing them on allocation is not too useful. Such an implementation already exits for ia64. Howver, that implementation did not support constructors and destructors as needed by i386 / x86_64. It also only supported a single quicklist. The implementation here has constructor and destructor support as well as the ability for an arch to specify how many quicklists are needed. Quicklists are defined by an arch defining CONFIG_QUICKLIST. If more than one quicklist is necessary then we can define NR_QUICK for additional lists. F.e. i386 needs two and thus has config NR_QUICK int default 2 If an arch has requested quicklist support then pages can be allocated from the quicklist (or from the page allocator if the quicklist is empty) via: quicklist_alloc(<quicklist-nr>, <gfpflags>, <constructor>) Page table pages can be freed using: quicklist_free(<quicklist-nr>, <destructor>, <page>) Pages must have a definite state after allocation and before they are freed. If no constructor is specified then pages will be zeroed on allocation and must be zeroed before they are freed. If a constructor is used then the constructor will establish a definite page state. F.e. the i386 and x86_64 pgd constructors establish certain mappings. Constructors and destructors can also be used to track the pages. i386 and x86_64 use a list of pgds in order to be able to dynamically update standard mappings. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Andi Kleen <ak@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:54 -07:00
Christoph Lameter	643b113849	slub: enable tracking of full slabs If slab tracking is on then build a list of full slabs so that we can verify the integrity of all slabs and are also able to built list of alloc/free callers. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:54 -07:00

... 2 3 4 5 6 ...

6597 commits