This patch against tip/x86/iommu virtually reverts
2842e5bf31. But just reverting the
commit breaks AMD IOMMU so this patch also includes some fixes.
The above commit adds new two options to x86 IOMMU generic kernel boot
options, fullflush and nofullflush. But such change that affects all
the IOMMUs needs more discussion (all IOMMU parties need the chance to
discuss it):
http://lkml.org/lkml/2008/9/19/106
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86 has set_bit_string() that does the exact same thing that
set_bit_area() in lib/iommu-helper.c does.
This patch exports set_bit_area() in lib/iommu-helper.c as
iommu_area_reserve(), converts GART, Calgary, and AMD IOMMU to use it.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The alloc_coherent implementation for AMD IOMMU currently uses
*dev->dma_mask per default. This patch changes it to prefer
dev->coherent_dma_mask if it is set.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The command buffer release function uses the CMD_BUF_SIZE macro for
get_order. Replace this with iommu->cmd_buf_size which is more reliable
about the actual size of the buffer.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The current calculation of the IVHD entry size is hard to read. So move
this code to a seperate function to make it more clear what this
calculation does.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The ctrl variable is only u32 and readl also returns a 32 bit value. So
the cast to u64 is pointless. Remove it with this patch.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The amd_iommu_pd_alloc_bitmap is allocated with a calculated order and
freed with order 1. This is not a bug since the calculated order always
evaluates to 1, but its unclean code. So replace the 1 with the
calculation in the release path.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The current calculation is very complicated. This patch replaces it with
a much simpler version.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Remove the memset and use __GFP_ZERO at allocation time instead.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86's common alloc_coherent (dma_alloc_coherent in dma-mapping.h) sets
up the gfp flag according to the device dma_mask but AMD IOMMU doesn't
need it for devices that the IOMMU can do virtual mappings for. This
patch avoids unnecessary low zone allocation.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Remove some magic numbers and split the pte_root using standard
functions.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In isolation mode the protection domains for the devices are
preallocated and preassigned. This is bad if a device should be passed
to a virtualization guest because the IOMMU code does not know if it is
in use by a driver. This patch changes the code to assign the device to
the preallocated domain only if there are dma mapping requests for it.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This function determines if the AMD IOMMU implementation is responsible
for a given device. So the DMA layer can get this information from the
driver.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There is a bit in the device entry to suppress all IO page faults
generated by a device. This bit was set until now because there was no
event logging. Now that there is event logging this patch allows IO page
faults from devices to see them in the kernel log.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The code to log IOMMU events is in place now. So enable event logging
with this patch.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds code for polling and printing out events generated by
the AMD IOMMU.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The AMD IOMMU can generate interrupts for various reasons. This patch
adds the basic interrupt enabling infrastructure to the driver.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We need the pci_dev later anyways to enable MSI for the IOMMU hardware.
So remove the devid pointing to the BDF and replace it with the pci_dev
structure where the IOMMU is implemented.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds the pci_seg field to the amd_iommu structure and fills
it with the corresponding value from the ACPI table.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds the allocation of a event buffer for each AMD IOMMU in
the system. The hardware will log events like device page faults or
other errors to this buffer once this is enabled.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The API definition for dma_alloc_coherent states that the bus address
has to be aligned to the next power of 2 boundary greater than the
allocation size. This is violated by AMD IOMMU so far and this patch
fixes it.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds branch hints to the cecks if a completion_wait is
necessary. The completion_waits in the mapping paths are unlikly because
they will only happen on software implementations of AMD IOMMU which
don't exists today or with lazy IO/TLB flushing when the allocator wraps
around the address space. With lazy IO/TLB flushing the completion_wait
in the unmapping path is unlikely too.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The IO/TLB flushing on every unmaping operation is the most expensive
part in AMD IOMMU code and not strictly necessary. It is sufficient to
do the flush before any entries are reused. This is patch implements
lazy IO/TLB flushing which does exactly this.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The GART currently implements the iommu=[no]fullflush command line
parameters which influence its IO/TLB flushing strategy. This patch
makes these parameters generic so that they can be used by the AMD IOMMU
too.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch moves the invocation of the flushing functions to the
map/unmap helpers because its common code in all dma_ops relevant
mapping/unmapping code.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently AMD IOMMU code triggers a BUG_ON if NULL is passed as the
device. This is inconsistent with other IOMMU implementations.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
gart alloc_coherent need to do virtual mapppings only when an
allocated buffer is not DMA-capable for a device.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86's common alloc_coherent (dma_alloc_coherent in dma-mapping.h) sets
up the gfp flag according to the device dma_mask but Calgary doesn't
need it because of virtual mappings. This patch avoids unnecessary low
zone allocation.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently, GART IOMMU ingores device's dma_mask when it does virtual
mappings. So it could give a device a virtual address that the device
can't access to.
This patch fixes the above problem.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] Fix PCI_DMA_BUS_IS_PHYS for ARM
[ARM] 5247/1: tosa: SW_EAR_IN support
[ARM] 5246/1: tosa: add proper clock alias for tc6393xb clock
[ARM] 5245/1: Fix warning about unused return value in drivers/pcmcia
[ARM] OMAP: Fix MMC device data
imx serial: fix rts handling for non imx1 based hardware
imx serial: set RXD mux bit on i.MX27 and i.MX31
i.MX serial: fix init failure
pcm037: add rts/cts support for serial port
PCI_DMA_BUS_IS_PHYS was defined to be zero, which meant we ignored
the DMA mask for IDE and SCSI transfers. This is wrong - we have
no DMA translation hardware. We want to obey DMA masks so that the
block layer performs bouncing itself.
Reported-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Acked-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add clock alias for clock that is used by tc6393xb device on tosa.
As that chip plays pretty major part in tosa life and is currently
disabled, this is 2.4.27 material.
Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As noticed by Russell King, we were not setting this properly
to the number of entries, but rather the total size.
This results in the core dumping code allocating waayyyy too
much memory.
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to pass IRQF_SHARED, otherwise we get things like:
IRQ handler type mismatch for IRQ 33
current handler: PSYCHO_UE
Call Trace:
[000000000048394c] request_irq+0xac/0x120
[00000000007c5f6c] psycho_scan_bus+0x98/0x158
[00000000007c2bc0] pcibios_init+0xdc/0x12c
[0000000000426a5c] do_one_initcall+0x1c/0x160
[00000000007c0180] kernel_init+0x9c/0xfc
[0000000000427050] kernel_thread+0x30/0x60
[00000000006ae1d0] rest_init+0x10/0x60
on e3500 and similar systems.
On a single board, the UE interrupts of two Psycho nodes
are funneled through the same interrupt, from of_debug=3
dump:
/pci@b,4000: direct translate 2ee --> 21
...
/pci@b,2000: direct translate 2ee --> 21
Decimal "33" mentioned above is the hex "21" mentioned here.
Thanks to Meelis Roos for dumps and testing.
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the MN10300 fault handler to make it check in_atomic() rather than
in_interrupt() as commit 6edaf68a87 did for other
architectures:
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date: Wed Dec 6 20:32:18 2006 -0800
[PATCH] mm: arch do_page_fault() vs in_atomic()
In light of the recent pagefault and filemap_copy_from_user work I've
gone through all the arch pagefault handlers to make sure the
inc_preempt_count() 'feature' works as expected.
Several sections of code (including the new filemap_copy_from_user)
rely on the fact that faults do not take locks under increased preempt
count.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'kvm-updates/2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
KVM: VMX: Always return old for clear_flush_young() when using EPT
KVM: SVM: fix guest global tlb flushes with NPT
KVM: SVM: fix random segfaults with NPT enabled
OMAPs MMC device data was passing the wrong structure via the platform
device. Moreover, a missing function means that both sx1_defconfig
and omap_h2_1610_defconfig builds failed with
undefined reference to `omap_set_mmc_info'
errors. Fix this by updating the MMC support from the omapzoom tree.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As well as discard fake accessed bit and dirty bit of EPT.
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Accesses to CR4 are intercepted even with Nested Paging enabled. But the code
does not check if the guest wants to do a global TLB flush. So this flush gets
lost. This patch adds the check and the flush to svm_set_cr4.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This patch introduces a guest TLB flush on every NPF exit in KVM. This fixes
random segfaults and #UD exceptions in the guest seen under some workloads
(e.g. long running compile workloads or tbench). A kernbench run with and
without that fix showed that it has a slowdown lower than 0.5%
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Use the IMAP offset calculation for OBIO devices as documented in the
programmer's manual. Which is "0x10000 + ((ino & 0x1f) << 3)"
Signed-off-by: David S. Miller <davem@davemloft.net>
Make ia64 refrain from clearing a given to-be-offlined CPU's bit in the
cpu_online_mask until it has processed pending irqs. This change
prevents other CPUs from being blindsided by an apparently offline CPU
nevertheless changing globally visible state. Also remove the existing
redundant cpu_clear(cpu, cpu_online_map).
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Error handling code following a kmalloc should free the allocated data.
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Tony Luck <tony.luck@intel.com>