Huge ptes have a special type on s390 and cannot be handled with the standard
pte functions in certain cases, e.g. because of a different location of the
invalid bit. This patch adds some new architecture- specific functions to
hugetlb common code, as a prerequisite for the s390 large page support.
This won't affect other architectures in functionality, but I need to add some
new dummy inline functions to the headers.
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A cow break on a hugetlbfs page with page_count > 1 will set a new pte with
set_huge_pte_at(), w/o any tlb flush operation. The old pte will remain in
the tlb and subsequent write access to the page will result in a page fault
loop, for as long as it may take until the tlb is flushed from somewhere else.
This patch introduces an architecture-specific huge_ptep_clear_flush()
function, which is called before the the set_huge_pte_at() in hugetlb_cow().
ATTENTION: This is just a nop on all architectures for now, the s390
implementation will come with our large page patch later. Other architectures
should define their own huge_ptep_clear_flush() if needed.
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch moves all architecture functions for hugetlb to architecture header
files (include/asm-foo/hugetlb.h) and converts all macros to inline functions.
It also removes (!) ARCH_HAS_HUGEPAGE_ONLY_RANGE,
ARCH_HAS_HUGETLB_FREE_PGD_RANGE, ARCH_HAS_PREPARE_HUGEPAGE_RANGE,
ARCH_HAS_SETCLEAR_HUGE_PTE and ARCH_HAS_HUGETLB_PREFAULT_HOOK.
Getting rid of the ARCH_HAS_xxx #ifdef and macro fugliness should increase
readability and maintainability, at the price of some code duplication. An
asm-generic common part would have reduced the loc, but we would end up with
new ARCH_HAS_xxx defines eventually.
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
s390 for one, cannot implement VM_MIXEDMAP with pfn_valid, due to their memory
model (which is more dynamic than most). Instead, they had proposed to
implement it with an additional path through vm_normal_page(), using a bit in
the pte to determine whether or not the page should be refcounted:
vm_normal_page()
{
...
if (unlikely(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
if (vma->vm_flags & VM_MIXEDMAP) {
#ifdef s390
if (!mixedmap_refcount_pte(pte))
return NULL;
#else
if (!pfn_valid(pfn))
return NULL;
#endif
goto out;
}
...
}
This is fine, however if we are allowed to use a bit in the pte to determine
refcountedness, we can use that to _completely_ replace all the vma based
schemes. So instead of adding more cases to the already complex vma-based
scheme, we can have a clearly seperate and simple pte-based scheme (and get
slightly better code generation in the process):
vm_normal_page()
{
#ifdef s390
if (!mixedmap_refcount_pte(pte))
return NULL;
return pte_page(pte);
#else
...
#endif
}
And finally, we may rather make this concept usable by any architecture rather
than making it s390 only, so implement a new type of pte state for this.
Unfortunately the old vma based code must stay, because some architectures may
not be able to spare pte bits. This makes vm_normal_page a little bit more
ugly than we would like, but the 2 cases are clearly seperate.
So introduce a pte_special pte state, and use it in mm/memory.c. It is
currently a noop for all architectures, so this doesn't actually result in any
compiled code changes to mm/memory.o.
BTW:
I haven't put vm_normal_page() into arch code as-per an earlier suggestion.
The reason is that, regardless of where vm_normal_page is actually
implemented, the *abstraction* is still exactly the same. Also, while it
depends on whether the architecture has pte_special or not, that is the
only two possible cases, and it really isn't an arch specific function --
the role of the arch code should be to provide primitive functions and
accessors with which to build the core code; pte_special does that. We do
not want architectures to know or care about vm_normal_page itself, and
we definitely don't want them being able to invent something new there
out of sight of mm/ code. If we made vm_normal_page an arch function, then
we have to make vm_insert_mixed (next patch) an arch function too. So I
don't think moving it to arch code fundamentally improves any abstractions,
while it does practically make the code more difficult to follow, for both
mm and arch developers, and easier to misuse.
[akpm@linux-foundation.org: build fix]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Cc: Jared Hulbert <jaredeh@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement __fls on all 64-bit archs:
alpha has an implementation of fls64.
Added __fls(x) = fls64(x) - 1.
ia64 has fls, but not __fls.
Added __fls based on code of fls.
mips and powerpc have __ilog2, which is the same as __fls.
Added __fls = __ilog2.
parisc, s390, sh and sparc64:
Include generic __fls.
x86_64 already has __fls.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Driver for I2C interfaces in master mode on SH7760.
Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel: (62 commits)
sched: build fix
sched: better rt-group documentation
sched: features fix
sched: /debug/sched_features
sched: add SCHED_FEAT_DEADLINE
sched: debug: show a weight tree
sched: fair: weight calculations
sched: fair-group: de-couple load-balancing from the rb-trees
sched: fair-group scheduling vs latency
sched: rt-group: optimize dequeue_rt_stack
sched: debug: add some debug code to handle the full hierarchy
sched: fair-group: SMP-nice for group scheduling
sched, cpuset: customize sched domains, core
sched, cpuset: customize sched domains, docs
sched: prepatory code movement
sched: rt: multi level group constraints
sched: task_group hierarchy
sched: fix the task_group hierarchy for UID grouping
sched: allow the group scheduler to have multiple levels
sched: mix tasks and groups
...
We currently keep 2 lists of PCI devices in the system, one in the
driver core, and one all on its own. This second list is sorted at boot
time, in "BIOS" order, to try to remain compatible with older kernels
(2.2 and earlier days). There was also a "nosort" option to turn this
sorting off, to remain compatible with even older kernel versions, but
that just ends up being what we have been doing from 2.5 days...
Unfortunately, the second list of devices is not really ever used to
determine the probing order of PCI devices or drivers[1]. That is done
using the driver core list instead. This change happened back in the
early 2.5 days.
Relying on BIOS ording for the binding of drivers to specific device
names is problematic for many reasons, and userspace tools like udev
exist to properly name devices in a persistant manner if that is needed,
no reliance on the BIOS is needed.
Matt Domsch and others at Dell noticed this back in 2006, and added a
boot option to sort the PCI device lists (both of them) in a
breadth-first manner to help remain compatible with the 2.4 order, if
needed for any reason. This option is not going away, as some systems
rely on them.
This patch removes the sorting of the internal PCI device list in "BIOS"
mode, as it's not needed at all anymore, and hasn't for many years.
I've also removed the PCI flags for this from some other arches that for
some reason defined them, but never used them.
This should not change the ordering of any drivers or device probing.
[1] The old-style pci_get_device and pci_find_device() still used this
sorting order, but there are very few drivers that use these functions,
as they are deprecated for use in this manner. If for some reason, a
driver rely on the order and uses these functions, the breadth-first
boot option will resolve any problem.
Cc: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[rebased for sched-devel/latest]
- Add a new cpuset file, having levels:
sched_relax_domain_level
- Modify partition_sched_domains() and build_sched_domains()
to take attributes parameter passed from cpuset.
- Fill newidle_idx for node domains which currently unused but
might be required if sched_relax_domain_level become higher.
- We can change the default level by boot option 'relax_domain_level='.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds a MigoR specific header file. We may want to use a cpu
specific header file instead, but this will do for now.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Presently this only checks to see if an address is an RAM, but this
doesn't work with XIP, so just always return 1. Follows m68knommu.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add support for Solution Engine SH7721 board(MS7721RP01).
Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add KEYSC platform data for the Solution Engine 7722 board.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add a platform driver for the SuperH KEYSC block. The driver expects to get
mode, timing information and keypad layout from the board code as platform
data. The board code is resonsible for pin configuration.
Both sh7343 and sh7722 should be supported, but only the sh7722 processor has
been tested so far. SH_KEYSC_MODE_3 is yet to be tested.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
It is always == '((base) + 0x206)' if CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS=y
and it is not needed otherwise (arm, blackfin, parisc, ppc64, sh, sparc[64]).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Semaphores are no longer performance-critical, so a generic C
implementation is better for maintainability, debuggability and
extensibility. Thanks to Peter Zijlstra for fixing the lockdep
warning. Thanks to Harvey Harrison for pointing out that the
unlikely() was unnecessary.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Currently include/linux/kvm.h is not considered by make headers_install,
because Kbuild cannot handle " unifdef-$(CONFIG_FOO) += foo.h. This problem
was introduced by
commit fb56dbb31c
Author: Avi Kivity <avi@qumranet.com>
Date: Sun Dec 2 10:50:06 2007 +0200
KVM: Export include/linux/kvm.h only if $ARCH actually supports KVM
Currently, make headers_check barfs due to <asm/kvm.h>, which <linux/kvm.h>
includes, not existing. Rather than add a zillion <asm/kvm.h>s, export kvm.
only if the arch actually supports it.
Signed-off-by: Avi Kivity <avi@qumranet.com>
which makes this an 2.6.25 regression.
One way of solving the issue is to enhance Kbuild, but Avi and David conviced
me, that changing headers_install is not the way to go. This patch changes
the definition for linux/kvm.h to unifdef-y.
If unifdef-y is used for linux/kvm.h "make headers_check" will fail on all
architectures without asm/kvm.h. Therefore, this patch also provides
asm/kvm.h on all architectures.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Avi Kivity <avi@qumranet.com>
Cc: Sam Ravnborg <sam@ravnborg.org
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch removes the unused include/asm-sh/floppy.h
(ARCH_MAY_HAVE_PC_FDC was not enabled).
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The unlazy_fpu() path calls in to save_fpu() if the task has
TIF_USEDFPU set. save_fpu() being the crap API that it is has the side
effect of clearing the flag itself, which presently doesn't happen
if we're using FPU emulation. Fix this up for now, pending an overhaul
in 2.6.26.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Presently with preempt enabled there's the possibility to be preempted
after the TIF_USEDFPU test and the register save, leading to bogus
state post-__switch_to(). Use an explicit preempt_disable()/enable()
pair around unlazy_fpu()/clear_fpu() to avoid this. Follows the x86
change.
Reported-by: Takuo Koguchi <takuo.koguchi.sw@hitachi.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
posix_types.h and byteorder.h were sticking purely with the Kconfig
symbols, which doesn't work when we scrub the headers for user use.
Fixes a very unhelpful build error in current klibc.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This header is needed on other architectures as well (namely h8300),
which currently fails to build without this in place. Rather than
duplicating the port definition completely there, just move this to a
common location instead.
This should get h8300 working again for 2.6.25, in addition to the
changes already pushed by Sato-san in -rc2.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
drivers/char/pcmcia/synclink_cs.c:284:1: warning: "CCR3" redefined
In file included from include/asm/cache.h:13,
from include/asm/processor_32.h:15,
from include/asm/processor.h:60,
from include/linux/prefetch.h:14,
from include/linux/list.h:8,
from include/linux/module.h:9,
from drivers/char/pcmcia/synclink_cs.c:38:
include/asm/cpu/cache.h:38:1: warning: this is the location of the previous definition
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch is a fix to make sure readsN/writesN are used over insN/outsN for
ioreadN_rep/iowriteN_rep.
The current state of the sh io code is that mmio operations like readN/writeN
and ioreadN/iowriteN are unaffected by the value of generic_io_base. This is
different fom port based io like inN/outN which gets adjusted using the value
in generic_io_base.
Without this patch ioreadN_rep/iowriteN_rep get their addresses adjusted.
The address for mmio access is adjusted using generic_io_base. This is wrong.
The ata core code currently crashes if generic_io_base is set.
This patch changes ioreadN_rep/iowriteN_rep to follow the same rules as the
rest of the mmio operations, ie don't adjust using generic_io_base.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Katsuya MATSUBARA <matsu@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The SH-5 build currently fails when trying to build the i8042 code due
to the missing IRQ definitions. These are provided in asm/cpu/irq.h, so
just include that there to get it building again.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
linux/swap.h really wants to include linux/pagemap.h in order to satisfy
the page_cache_release()/release_pages() definition requirements when
CONFIG_SWAP=n. Unfortunately the code in question contains:
/* only sparc can not include linux/pagemap.h in this file
* so leave page_cache_release and release_pages undeclared... */
#define free_page_and_swap_cache(page) \
page_cache_release(page)
#define free_pages_and_swap_cache(pages, nr) \
release_pages((pages), (nr), 0);
so it looks like we're stuck with doing it in asm/tlb.h instead, as
others already do (ARM, CRIS, etc.). Grumble.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch removes defunct. led support functions from hp6xx.h since they are now
added in a proper driver (see commit below). Also adds tabs instead of spaces before comments.
*commit d39a7a63eb
Signed-off-by: Kristoffer Ericson <kristoffer.ericson@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Previously this took an explicit range, update this to use the same
behaviour as the rest of the SH parts where we simply flush out a line
from the start address.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds sh7366 cpu supports. Just the most basic things like interrupt
controller, clocks and serial port are included at this point.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch converts the highlander CF device from good old machvec readb/writeb
to the new shiny trapped io.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch converts the CF device on r2d boards from machvec readb/writeb
to trapped io.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The idea is that we want to get rid of the in/out/readb/writeb callbacks from
the machvec and replace that with simple inline read and write operations to
memory. Fast and simple for most hardware devices (think pci).
Some devices require special treatment though - like 16-bit only CF devices -
so we need to have some method to hook in callbacks.
This patch makes it possible to add a per-device trap generating filter. This
way we can get maximum performance of sane hardware - which doesn't need this
filter - and crappy hardware works but gets punished by a performance hit.
V2 changes things around a bit and replaces io access callbacks with a
simple minimum_bus_width value. In the future we can add stride as well.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch changes copy_from_user() and copy_to_user() from macros
into static inline functions. This way we can use them as function
pointers. Also unify the 64 bit and 32 bit versions.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
These ports are holding up progress and now have been for months. Do the
job for them.
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Background: I've implemented 1K/2K page tables for s390. These sub-page
page tables are required to properly support the s390 virtualization
instruction with KVM. The SIE instruction requires that the page tables
have 256 page table entries (pte) followed by 256 page status table entries
(pgste). The pgstes are only required if the process is using the SIE
instruction. The pgstes are updated by the hardware and by the hypervisor
for a number of reasons, one of them is dirty and reference bit tracking.
To avoid wasting memory the standard pte table allocation should return
1K/2K (31/64 bit) and 2K/4K if the process is using SIE.
Problem: Page size on s390 is 4K, page table size is 1K or 2K. That means
the s390 version for pte_alloc_one cannot return a pointer to a struct
page. Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one
cannot return a pointer to a pte either, since that would require more than
32 bit for the return value of pte_alloc_one (and the pte * would not be
accessible since its not kmapped).
Solution: The only solution I found to this dilemma is a new typedef: a
pgtable_t. For s390 pgtable_t will be a (pte *) - to be introduced with a
later patch. For everybody else it will be a (struct page *). The
additional problem with the initialization of the ptl lock and the
NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and
a destructor pgtable_page_dtor. The page table allocation and free
functions need to call these two whenever a page table page is allocated or
freed. pmd_populate will get a pgtable_t instead of a struct page pointer.
To get the pgtable_t back from a pmd entry that has been installed with
pmd_populate a new function pmd_pgtable is added. It replaces the pmd_page
call in free_pte_range and apply_to_pte_range.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move STACK_TOP[_MAX] out of asm/a.out.h and into asm/processor.h as they're
required whether or not A.OUT format is available.
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>