Remove symbol exports from ia64_ksyms.c that are already exported in
lib/string.c.
Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Have a facility to account for potentially hot-pluggable CPUs. ACPI doesnt
give a determinstic method to find hot-pluggable CPUs. Hence we use 2 methods
to assist.
- BIOS can mark potentially hot-pluggable CPUs as disabled in the MADT tables.
- User can specify the number of hot-pluggable CPUs via parameter
additional_cpus=X
The option is enabled only if ACPI_CONFIG_HOTPLUG_CPU=y which enables the
physical hotplug option. Without which user can still use logical onlining
and offlining of CPUs by enabling CONFIG_HOTPLUG_CPU=y
Adds more bits to cpu_possible_map for potentially hot-pluggable cpus.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Do not set cpu_possible_map for NR_CPUS when ACPI_CONFIG_HOTPLUG_CPU is set.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
MCA driver can cause panic if kernel gets a state info with no minstate.
This patch adds minstate validation before handling it.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Remove an erroneous kfree, and unlink the pcidev_info struct from the
pcidev_info list prior to free'ing the pcidev_info struct.
Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Registers system call for the ia64 architecture.
Reserves space for ppoll and pselect, and adds unshare at system
call number 1296.
Signed-off-by: Janak Desai <janak@us.ibm.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
No platform in the community tree uses PLATFORM_MCA_HANDLERS, remove
the references.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Update the comm field on the MCA handler for user tasks as well as for
verified kernel tasks. This helps to identify the task that was
running when the MCA occurred.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Print a message identifying the monarch MCA handler. Print a summary
of the status of the slave MCA cpus.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Prevent SN2 specific code to be executed on non SN2 platforms when
running a generic kernel.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
There were two problems with enabling the PRINTK_TIME config
option:
1) The first calls to printk() occur before per-cpu data virtual
address is pinned into the TLB, so sched_clock() can fault.
2) sched_clock() is based on ar.itc, which may not be synchronized
across cpus.
Ken Chen started this patch, Tony Luck tinkered with it, and Jes
Sorensen perfected it.
Signed-off-by: Tony Luck <tony.luck@intel.com>
The check of (end != cp) after memparse in efi.c looks wrong to me.
The result is that we can't use mem= and max_addr= kernel parameter at
the same time.
The following patch removed the check just like other arches do.
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Rewrite the SN pio_phys_xxx macros in assembly language. This
avoids issues with the Intel icc compiler. Function call
overhead is not an issue - the functions reference PIOs
and take 100's nsec to complete.
In addition, the functions should likely be in assembly
language anyway - they reference memory using physical
addressing mode. One function executes with psr.ic disabled.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
After converting the cpu physical address to shub2 physical
addressing, the address was run through TO_PHYS() which
clobbered a high node offset bit causing the BTE to fail
on shub2 nodes with large memory. This fix corrects
that problem.
Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
The patch implements cpu topology exportation by sysfs.
Items (attributes) are similar to /proc/cpuinfo.
1) /sys/devices/system/cpu/cpuX/topology/physical_package_id:
represent the physical package id of cpu X;
2) /sys/devices/system/cpu/cpuX/topology/core_id:
represent the cpu core id to cpu X;
3) /sys/devices/system/cpu/cpuX/topology/thread_siblings:
represent the thread siblings to cpu X in the same core;
4) /sys/devices/system/cpu/cpuX/topology/core_siblings:
represent the thread siblings to cpu X in the same physical package;
To implement it in an architecture-neutral way, a new source file,
driver/base/topology.c, is to export the 5 attributes.
If one architecture wants to support this feature, it just needs to
implement 4 defines, typically in file include/asm-XXX/topology.h.
The 4 defines are:
#define topology_physical_package_id(cpu)
#define topology_core_id(cpu)
#define topology_thread_siblings(cpu)
#define topology_core_siblings(cpu)
The type of **_id is int.
The type of siblings is cpumask_t.
To be consistent on all architectures, the 4 attributes should have
deafult values if their values are unavailable. Below is the rule.
1) physical_package_id: If cpu has no physical package id, -1 is the
default value.
2) core_id: If cpu doesn't support multi-core, its core id is 0.
3) thread_siblings: Just include itself, if the cpu doesn't support
HT/multi-thread.
4) core_siblings: Just include itself, if the cpu doesn't support
multi-core and HT/Multi-thread.
So be careful when declaring the 4 defines in include/asm-XXX/topology.h.
If an attribute isn't defined on an architecture, it won't be exported.
Thank Nathan, Greg, Andi, Paul and Venki.
The patch provides defines for i386/x86_64/ia64.
Signed-off-by: Zhang, Yanmin <yanmin.zhang@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
During some testing, we got a warning about trying to allocate
memory while holding a lock. This fixes that problem.
Signed-off-by: Robin Holt <holt@sgi.com>
Acked-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Maintenance patch:
- Add missing __init calls
- Do not zero initialize global variables
- No need to typecast function call returns to void
- Some formatting
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
If SAL_CACHE_FLUSH drops interrupts, complain about it and fall back to
using PAL_CACHE_FLUSH instead.
This is to work around a defect in HP rx5670 firmware: when an interrupt
occurs during SAL_CACHE_FLUSH, SAL drops the interrupt but leaves it marked
"in-service", which leaves the interrupt (and others of equal or lower
priority) masked.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Somehow I doubt this comment is meant to be here anymore... It's
been floating after the L1_CACHE_SHIFT entry since before Linux
moved to bitkeeper.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Temporary patch to make pci_enable_msi() fail gracefully on altix. Will be
removed after 2.6.16 releases and the msi abstraction patches start flowing.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Redirecting interrupts using smp_affinity on altix does not work on kernels
built with CONFIG_PCI_MSI. The problem is that move_irq() turns into a noop
if MSI is built in. This patch calls move_native_irq() instead of move_irq()
to get around that.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
On SN2, MMIO writes which are issued from separate processors are not
guaranteed to arrive in any particular order at the IO hardware. When
performing such writes from the kernel this is not a problem, as a
kernel thread will not migrate to another CPU during execution, and
mmiowb() calls can guarantee write ordering when control of the IO
resource is allowed to move between threads.
However, when MMIO writes can be performed from user space (e.g. DRM)
there are no such guarantees and mechanisms, as the process may
context-switch at any time, and may migrate to a different CPU as part
of the switch. For such programs/hardware to operate correctly, it is
required that the MMIO writes from the old CPU be accepted by the IO
hardware before subsequent writes from the new CPU can be issued.
The following patch implements this behavior on SN2 by waiting for a
Shub register to indicate that these writes have been accepted. This
is placed in the context switch-in path, and only performs the wait
when the newly scheduled task changes CPUs.
Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Brent Casavant <bcasavan@sgi.com>
This patch finishes support for SHUB2 (the new chipset). Most of the
changes are performance related. A few changes are workarounds for
"interesting" chipset features.
Some temporary debugging code has also been deleted.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Various bugfixes and hardware bug workarounds necessary for the rev 1.0 version
of the altix TIO CE asic.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The only user of the MCA/INIT sigdelayed code (SGI's I/O probing) has
moved from the kernel into SAL. Delete the MCA/INIT sigdelayed code.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Actually I think this is more appropriate so we don't end up with 17
cases that add drivers/sn to the build lib.
Include drivers/sn when CONFIG_IA64_SGI_SN2 or CONFIG_IA64_GENERIC
is enabled.
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
arch/ia64/sn/Makefile sets CPPFLAGS, expecting that setting to
propogate to all the subdirectories. For a normal build with its
recursive descent it does work, but doing a selective build like
'make arch/ia64/sn/kernel/io_init.i' does not do a recursive descent,
it goes directly to arch/ia64/sn/kernel/Makefile so the flags do not
get set.
To support selective builds, set the flags in all the subordinate Makefiles.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Remove the GFP_DMA flag from XPC kmalloc() calls.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Acked-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Eliminate a hot shared cacheline that occurs if multiple cpus are
taking unaligned exceptions.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Takashi helped us track down a bad page state bug we thought was coming
from alsa. It turns out we weren't paying attention to the gfp flags
that were passed in to sn_dma_alloc_coherent().
From: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Edwards <edwardsg@sgi.com>
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Jes Sorensen <jes@sgi.com>
sos->os_status is set to a default value of IA64_MCA_COLD_BOOT for an
MCA, but then is incorrectly overwritten with IA64_MCA_SAME_CONTEXT (0).
This makes SAL think that all MCAs have been recovered.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Fix an unnecessary softlockup watchdog warning in the ia64
uncached_build_memmap() that occurs occasionally at 256p and always at
512p. The problem occurs at boot time.
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Migrate perfmon from using an old semaphore to a completion handler.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Migrate arch/ia64/ia32/sys_ia32 to using a mutex for mmap protection.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Adds the ability to disability packet split at compile time and use the legacy receive path on PCI express hardware. Made this a CONFIG option and modified the Kconfig, to reflect the new option.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Migrate sn2 code to use mutex and completion events rather than
semaphores.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Acked-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Bugfix... the altix SN_SAL_IOIF_SLOT_ENABLE & SN_SAL_IOIF_SLOT_DISABLE
SAL calls need to pass the segment# down
Signed-off-by: Mike Habeck <habeck@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Work-around to temporarily support older PROMs with new flush device code.
This code allows systems running older PROMs to continue to run on the new
kernel base until a new official PROM is released.
Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Acked-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch fixes the bug that pci_claim_resource() is called multiple
times for the same P2P bridge's resource structures if P2P bridges
require their own PCI I/O resources.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
First step to memory hotplug for ia64 (add only,
all new memory is added to node 0, does not use
ZONE_EASY_RECLAIM yet).
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add driver support for a 2 port PCI IOC3-based serial card on Altix boxes:
This is a re-submission. On the original submission I was asked to
organize the code so that the MIPS ioc3 ethernet and serial parts could be
used with this driver. Stanislaw Skowronek was kind enough to provide the
shim layer for this - thanks Stanislaw. This patch includes the shim layer
and the Altix PCI ioc3 serial driver. The MIPS merged ioc3 ethernet and
serial support is forthcoming.
Signed-off-by: Patrick Gefre <pfg@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
tmp_buf_sem sems to be a common name for something completely unused...
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de> ("usb portion")
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
TTY layer buffering revamp broke ia64 in commit
33f0f88f1c
CC arch/ia64/hp/sim/simserial.o
arch/ia64/hp/sim/simserial.c: In function `receive_chars':
arch/ia64/hp/sim/simserial.c:170: error: structure has no member named `flip'
... and so on ...
make[1]: *** [arch/ia64/hp/sim/simserial.o] Error 1
Patch from Andreas Schwab.
Signed-off-by: Tony Luck <tony.luck@intel.com>
When jprobe is hit, the function parameters of the original function
should be saved before jprobe handler is executed, and restored it after
jprobe handler is executed, because jprobe handler might change the
register values due to tail call optimization by the gcc.
Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add hotplug cpu support to salinfo.c.
The cpu_event field is a cpumask so use the cpu_* macros consistently,
replacing the existing mixture of cpu_* and *_bit macros.
Instead of counting the number of outstanding events in a semaphore and
trying to track that count over user space context, interrupt context,
non-maskable interrupt context and cpu hotplug, replace the semaphore
with a test for "any bits set" combined with a mutex.
Modify the locking to make the test for "work to do" an atomic
operation.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
We need to handle debug traps in fsys mode non-fatally. They can
happen now that we have fsyscalls which contain probe instructions.
Signed-off-by: Jason Uhlenkott <jasonuhl@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch separates the sn_flush_device_list struct into kernel and
common (both kernel and PROM accessible) structures. As it was, if the
size of a spinlock_t changed (due to additional CONFIG options, etc.) the
sal call which populated the sn_flush_device_list structs would erroneously
write data (and cause memory corruption and/or a panic).
This patch does the following:
1. Removes sn_flush_device_list and adds sn_flush_device_common and
sn_flush_device_kernel.
2. Adds a new SAL call to populate a sn_flush_device_common struct per
device, not per widget as previously done.
3. Correctly initializes each device's sn_flush_device_kernel spinlock_t
struct (before it was only doing each widget's first device).
Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
I originally thought this was an bug only in the SN code, but I think I
also see a hole in the generic IA64 tlb code. (Separate patch was sent
for the SN problem).
It looks like there is a bug in the TLB flushing code. During context switch,
kernel threads (kswapd, for example) inherit the mm of the task that was
previously running on the cpu. Normally, this is ok because the previous context
is still loaded into the RR registers. However, if the owner of the mm
migrates to another cpu, changes it's context number, and references a
page before kswapd issues a tlb_purge for that same page, the purge will be
done with a stale context number (& RR registers).
Signed-off-by: Tony Luck <tony.luck@intel.com>
Altix (shub2) pushes the BTE clean-up into SAL.
This patch correctly interfaces with the now implemented SAL call.
It also fixes a bug when delaying clean-up to allow busy BTEs to
complete (or error out).
Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
On return from INIT handler we must convert the address of the
minstate area from a kernel virtual uncached address (0xC...)
to physical uncached (0x8...). A typo (or thinko?) in the code
converted to physical cached.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cleanup a few items after moving xpc.h from arch/ia64/sn/kernel to
include/asm-ia64/sn.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Move xpc.h from arch/ia64/sn/kernel to include/asm-ia64/sn without change.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Move xpc_system_reboot() to be closer to the file it calls for readability
reasons (which are indeed subjective).
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Allow for the loss of heartbeat while in kdebug to be ignored by remote
partitions.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cleanup the XPC disengage related messages that are printed to the log.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch fixes a problem in XPC disengage processing whereby it was not
seeing the request to disengage from a remote partition, so the disengage
wasn't happening. The disengagement is suppose to transpire during the time
a XPC channel is disconnecting, and should be completed before the channel
is declared to be disconnected.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
When this new syscall was added to ia64 in commit
39743889aa
fsys.S was forgotten. Add a ".data8 0" there to keep
it in step. [Reported by Stephane Eranian]
Signed-off-by: Tony Luck <tony.luck@intel.com>
on ia64 thread_info is at the constant offset from task_struct and stack
is embedded into the same beast. Set __HAVE_THREAD_FUNCTIONS, made
task_thread_info() just add a constant.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
)
From: Ingo Molnar <mingo@elte.hu>
This is the latest version of the scheduler cache-hot-auto-tune patch.
The first problem was that detection time scaled with O(N^2), which is
unacceptable on larger SMP and NUMA systems. To solve this:
- I've added a 'domain distance' function, which is used to cache
measurement results. Each distance is only measured once. This means
that e.g. on NUMA distances of 0, 1 and 2 might be measured, on HT
distances 0 and 1, and on SMP distance 0 is measured. The code walks
the domain tree to determine the distance, so it automatically follows
whatever hierarchy an architecture sets up. This cuts down on the boot
time significantly and removes the O(N^2) limit. The only assumption
is that migration costs can be expressed as a function of domain
distance - this covers the overwhelming majority of existing systems,
and is a good guess even for more assymetric systems.
[ People hacking systems that have assymetries that break this
assumption (e.g. different CPU speeds) should experiment a bit with
the cpu_distance() function. Adding a ->migration_distance factor to
the domain structure would be one possible solution - but lets first
see the problem systems, if they exist at all. Lets not overdesign. ]
Another problem was that only a single cache-size was used for measuring
the cost of migration, and most architectures didnt set that variable
up. Furthermore, a single cache-size does not fit NUMA hierarchies with
L3 caches and does not fit HT setups, where different CPUs will often
have different 'effective cache sizes'. To solve this problem:
- Instead of relying on a single cache-size provided by the platform and
sticking to it, the code now auto-detects the 'effective migration
cost' between two measured CPUs, via iterating through a wide range of
cachesizes. The code searches for the maximum migration cost, which
occurs when the working set of the test-workload falls just below the
'effective cache size'. I.e. real-life optimized search is done for
the maximum migration cost, between two real CPUs.
This, amongst other things, has the positive effect hat if e.g. two
CPUs share a L2/L3 cache, a different (and accurate) migration cost
will be found than between two CPUs on the same system that dont share
any caches.
(The reliable measurement of migration costs is tricky - see the source
for details.)
Furthermore i've added various boot-time options to override/tune
migration behavior.
Firstly, there's a blanket override for autodetection:
migration_cost=1000,2000,3000
will override the depth 0/1/2 values with 1msec/2msec/3msec values.
Secondly, there's a global factor that can be used to increase (or
decrease) the autodetected values:
migration_factor=120
will increase the autodetected values by 20%. This option is useful to
tune things in a workload-dependent way - e.g. if a workload is
cache-insensitive then CPU utilization can be maximized by specifying
migration_factor=0.
I've tested the autodetection code quite extensively on x86, on 3
P3/Xeon/2MB, and the autodetected values look pretty good:
Dual Celeron (128K L2 cache):
---------------------
migration cost matrix (max_cache_size: 131072, cpu: 467 MHz):
---------------------
[00] [01]
[00]: - 1.7(1)
[01]: 1.7(1) -
---------------------
cacheflush times [2]: 0.0 (0) 1.7 (1784008)
---------------------
Here the slow memory subsystem dominates system performance, and even
though caches are small, the migration cost is 1.7 msecs.
Dual HT P4 (512K L2 cache):
---------------------
migration cost matrix (max_cache_size: 524288, cpu: 2379 MHz):
---------------------
[00] [01] [02] [03]
[00]: - 0.4(1) 0.0(0) 0.4(1)
[01]: 0.4(1) - 0.4(1) 0.0(0)
[02]: 0.0(0) 0.4(1) - 0.4(1)
[03]: 0.4(1) 0.0(0) 0.4(1) -
---------------------
cacheflush times [2]: 0.0 (33900) 0.4 (448514)
---------------------
Here it can be seen that there is no migration cost between two HT
siblings (CPU#0/2 and CPU#1/3 are separate physical CPUs). A fast memory
system makes inter-physical-CPU migration pretty cheap: 0.4 msecs.
8-way P3/Xeon [2MB L2 cache]:
---------------------
migration cost matrix (max_cache_size: 2097152, cpu: 700 MHz):
---------------------
[00] [01] [02] [03] [04] [05] [06] [07]
[00]: - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
[01]: 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
[02]: 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
[03]: 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1)
[04]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1)
[05]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1)
[06]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1)
[07]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) -
---------------------
cacheflush times [2]: 0.0 (0) 19.2 (19281756)
---------------------
This one has huge caches and a relatively slow memory subsystem - so the
migration cost is 19 msecs.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Cc: <wilder@us.ibm.com>
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add per-arch sched_cacheflush() which is a write-back cacheflush used by
the migration-cost calibration code at bootup time.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
arch: Use <linux/capability.h> where capable() is used.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There is a window where a probe gets removed right after the probe is hit
on some different cpu. In this case probe handlers can't find a matching
probe instance related to break address. In this case we need to read the
original instruction at break address to see if that is not a break/int3
instruction and recover safely.
Previous code had a bug where we were not checking for the above race in
case of reentrant probes and the below patch fixes this race.
Tested on IA64, Powerpc, x86_64.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently arch_remove_kprobes() is only implemented/required for x86_64 and
powerpc. All other architecture like IA64, i386 and sparc64 implementes a
dummy function which is being called from arch independent kprobes.c file.
This patch removes the dummy functions and replaces it with
#define arch_remove_kprobe(p, s) do { } while(0)
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Now that all these entries in the arch ioctl32.c files are gone [1], we can
build fs/compat_ioctl.c as a normal object and kill tons of cruft. We need a
special do_ioctl32_pointer handler for s390 so the compat_ptr call is done.
This is not needed but harmless on all other architectures. Also remove some
superflous includes in fs/compat_ioctl.c
Tested on ppc64.
[1] parisc still had it's PPP handler left, which is not fully correct
for ppp and besides that ppp uses the generic SIOCPRIV ioctl so it'd
kick in for all netdevice users. We can introduce a proper handler
in one of the next patch series by adding a compat_ioctl method to
struct net_device but for now let's just kill it - parisc doesn't
compile in mainline anyway and I don't want this to block this
patchset.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@debian.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The comment in compat.c is wrong, every architecture provides a
get_compat_sigevent() for the IPC compat code already.
This basically moves the x86_64 version to common code and removes all the
others.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add a hook so architectures can validate /dev/mem mmap requests.
This is analogous to validation we already perform in the read/write
paths.
The identity mapping scheme used on ia64 requires that each 16MB or
64MB granule be accessed with exactly one attribute (write-back or
uncacheable). This avoids "attribute aliasing", which can cause a
machine check.
Sample problem scenario:
- Machine supports VGA, so it has uncacheable (UC) MMIO at 640K-768K
- efi_memmap_init() discards any write-back (WB) memory in the first granule
- Application (e.g., "hwinfo") mmaps /dev/mem, offset 0
- hwinfo receives UC mapping (the default, since memmap says "no WB here")
- Machine check abort (on chipsets that don't support UC access to WB
memory, e.g., sx1000)
In the scenario above, the only choices are
- Use WB for hwinfo mmap. Can't do this because it causes attribute
aliasing with the UC mapping for the VGA MMIO space.
- Use UC for hwinfo mmap. Can't do this because the chipset may not
support UC for that region.
- Disallow the hwinfo mmap with -EINVAL. That's what this patch does.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove various things which were checking for gcc-1.x and gcc-2.x compilers.
From: Adrian Bunk <bunk@stusta.de>
Some documentation updates and removes some code paths for gcc < 3.2.
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The ptrace_get_task_struct() helper that I added as part of the ptrace
consolidation is useful in variety of places that currently opencode it.
Switch them to the common helpers.
Add a ptrace_traceme() helper that needs to be explicitly called, and simplify
the ptrace_get_task_struct() interface. We don't need the request argument
now, and we return the task_struct directly, using ERR_PTR() for error
returns. It's a bit more code in the callers, but we have two sane routines
that do one thing well now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
sys_migrate_pages implementation using swap based page migration
This is the original API proposed by Ray Bryant in his posts during the first
half of 2005 on linux-mm@kvack.org and linux-kernel@vger.kernel.org.
The intent of sys_migrate is to migrate memory of a process. A process may
have migrated to another node. Memory was allocated optimally for the prior
context. sys_migrate_pages allows to shift the memory to the new node.
sys_migrate_pages is also useful if the processes available memory nodes have
changed through cpuset operations to manually move the processes memory. Paul
Jackson is working on an automated mechanism that will allow an automatic
migration if the cpuset of a process is changed. However, a user may decide
to manually control the migration.
This implementation is put into the policy layer since it uses concepts and
functions that are also needed for mbind and friends. The patch also provides
a do_migrate_pages function that may be useful for cpusets to automatically
move memory. sys_migrate_pages does not modify policies in contrast to Ray's
implementation.
The current code here is based on the swap based page migration capability and
thus is not able to preserve the physical layout relative to it containing
nodeset (which may be a cpuset). When direct page migration becomes available
then the implementation needs to be changed to do a isomorphic move of pages
between different nodesets. The current implementation simply evicts all
pages in source nodeset that are not in the target nodeset.
Patch supports ia64, i386 and x86_64.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This was causing some ordering problems. Remove the up-front evaluation
and just revaluate the compiler version each time we need it.
(The up-front evaluation was problematic because some architectures modify
the value of $(CC)).
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
arch/ia64/kernel/setup.c: In function `show_cpuinfo':
arch/ia64/kernel/setup.c:576: warning: long unsigned int format, different type arg (arg 12)
arch/ia64/kernel/setup.c:576: warning: long unsigned int format, different type arg (arg 13)
Introduced by 95235ca2c2
Signed-off-by: Tony Luck <tony.luck@intel.com>
here is the BSP removal support for IA64. Its pretty much the same thing that
was released a while back, but has your feedback incorporated.
- Removed CONFIG_BSP_REMOVE_WORKAROUND and associated cmdline param
- Fixed compile issue with sn2/zx1 due to a undefined fix_b0_for_bsp
- some formatting nits (whitespace etc)
This has been tested on tiger and long back by alex on hp systems as well.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>