also print out irq no in /proc/interrups and /proc/stat in hex, so could
tell bus/dev/func.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
when CONFIG_HAVE_SPARSE_IRQ
preallocate some irq_2_iommu entries, and use get_one_free_irq_2_iomm to
get new one and link to irq_desc if needed.
else will use dyn_array or static array.
v2: <= nr_irqs fix
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch extends the VT-d driver to support KVM
[Ben: fixed memory pinning]
[avi: move dma_remapping.h as well]
Signed-off-by: Kay, Allen M <allen.m.kay@intel.com>
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com>
Signed-off-by: Amit Shah <amit.shah@qumranet.com>
Acked-by: Mark Gross <mgross@linux.intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This is loosely based on a patch by Jesse Barnes to check the user-space
PCI mappings though the sysfs interfaces. Quoting Jesse's original
explanation:
It's fairly common for applications to map PCI resources through sysfs.
However, with the current implementation, it's possible for an application
to map far more than the range corresponding to the resourceN file it
opened. This patch plugs that hole by checking the range at mmap time,
similar to what is done on platforms like sparc64 in their lower level
PCI remapping routines.
It was initially put together to help debug the e1000e NVRAM corruption
problem, since we initially thought an X driver might be walking past the
end of one of its mappings and clobbering the NVRAM. It now looks like
that's not the case, but doing the check is still important for obvious
reasons.
and this version of the patch differs in that it uses a helper function
to clarify the code, and does all the checks in pages (instead of bytes)
in order to avoid overflows when doing "<< PAGE_SHIFT" etc.
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
.... so that they can be used by MTD map drivers. Lets us close#9420
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
IO resource and ioremap debugging uncovered this ioremap() done
by drivers/pci/hotplug/ibmphp_ebda.c:
initcall pci_hotplug_init+0x0/0x41 returned 0 after 3 msecs
calling ibmphp_init+0x0/0x360 @ 1
ibmphpd: IBM Hot Plug PCI Controller Driver version: 0.6
resource map sanity check conflict: 0x9f800 0xaf5e7 0x9f800 0x9ffff reserved
------------[ cut here ]------------
WARNING: at arch/x86/mm/ioremap.c:175 __ioremap_caller+0x5c/0x226()
Pid: 1, comm: swapper Not tainted 2.6.27-rc7-tip-00914-g347b10f-dirty #36038
[<c013a72d>] warn_on_slowpath+0x41/0x68
[<c0156f00>] ? __lock_acquire+0x9ba/0xa7f
[<c012158c>] ? do_flush_tlb_all+0x0/0x59
[<c015ac31>] ? smp_call_function_mask+0x74/0x17d
[<c012158c>] ? do_flush_tlb_all+0x0/0x59
[<c013b228>] ? printk+0x1a/0x1c
[<c013f302>] ? iomem_map_sanity_check+0x82/0x8c
[<c0a773e8>] ? _read_unlock+0x22/0x25
[<c013f302>] ? iomem_map_sanity_check+0x82/0x8c
[<c0154e17>] ? trace_hardirqs_off+0xb/0xd
[<c0127731>] __ioremap_caller+0x5c/0x226
[<c0156158>] ? trace_hardirqs_on+0xb/0xd
[<c012767d>] ? iounmap+0x9d/0xa5
[<c01279dd>] ioremap_nocache+0x15/0x17
[<c0403c42>] ? ioremap+0xd/0xf
[<c0403c42>] ioremap+0xd/0xf
[<c0f1928f>] ibmphp_access_ebda+0x60/0xa0e
[<c0f17f64>] ibmphp_init+0xb5/0x360
[<c0101057>] do_one_initcall+0x57/0x138
[<c0f17eaf>] ? ibmphp_init+0x0/0x360
[<c0156158>] ? trace_hardirqs_on+0xb/0xd
[<c0148d75>] ? __queue_work+0x2b/0x30
[<c0f17eaf>] ? ibmphp_init+0x0/0x360
[<c0f015a0>] kernel_init+0x17b/0x1e2
[<c0f01425>] ? kernel_init+0x0/0x1e2
[<c01178b3>] kernel_thread_helper+0x7/0x10
=======================
---[ end trace a7919e7f17c0a725 ]---
initcall ibmphp_init+0x0/0x360 returned -19 after 144 msecs
calling zt5550_init+0x0/0x6a @ 1
the problem is this code:
io_mem = ioremap (ebda_seg<<4, 65000);
it assumes that the EBDA is 65000 bytes. But BIOS EBDA pointers are
at most 1K large.
_if_ the Rio code truly extends upon the customary EBDA size it needs
to iounmap() this memory and ioremap() it larger, once it knows it from
the generic descriptors that a Rio system is around.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
pci_get_subsys() changed in 2.6.26 so that the from pointer is modified
when the call is being invoked, so fix up the 'const' marking of it that
the compiler is complaining about.
Reported-by: Rufus & Azrael <rufus-azrael@numericable.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pcie_aspm=force did not work because aspm_force was being double negated
leading to the sanity check failing. Moving a bracket should fix this.
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
There's no good reason why a resource_size_t shouldn't just be a
physical address, so simply redefine it in terms of phys_addr_t.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Print out for device BAR values before the kernel tries to update them.
Also make related output use KERN_DEBUG.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch fixes an obvious bug (loop was never entered) caused by
commit 820943b6fc
(pciehp: cleanup pcie_poll_cmd).
Reported-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Commit fe99740cac (construct one
fakephp slot per PCI slot) introduced a regression, causing a
deadlock when removing a PCI device.
We also never actually removed the device from the PCI core.
So we:
- remove the device from the PCI core
- do not directly call remove_slot() to prevent deadlock
Yu Zhao reported and diagnosed this defect.
Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Yu Zhao <yu.zhao@intel.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Again, the cleaned up code introduced some resource warnings:
drivers/pci/setup-bus.c: In function 'pci_bus_dump_res':
drivers/pci/setup-bus.c:542: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'resource_size_t'
drivers/pci/setup-bus.c:542: warning: format '%llx' expects type 'long long unsigned int', but argument 6 has type 'resource_size_t'
Fix those up too.
Signed-off-by: Johann Felix Soden <johfel@users.sourceforge.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The cleaned up resource code in probe.c introduced some warnings:
drivers/pci/probe.c: In function 'pci_read_bridge_bases':
drivers/pci/probe.c:386: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'resource_size_t'
drivers/pci/probe.c:386: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'resource_size_t'
drivers/pci/probe.c:398: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'resource_size_t'
drivers/pci/probe.c:398: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'resource_size_t'
drivers/pci/probe.c:434: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'resource_size_t'
drivers/pci/probe.c:434: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'resource_size_t'
So fix them up.
Signed-off-by: Johann Felix Soden <johfel@users.sourceforge.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Some BIOSes (the Intel DG33BU, for example) wrongly claim to have DMAR
when they don't. Avoid the resulting crashes when it doesn't work as
expected.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Some BIOSes (the Intel DG33BU, for example) wrongly claim to have DMAR
when they don't. Avoid the resulting crashes when it doesn't work as
expected.
I'd still be grateful if someone could test it on a DG33BU with the old
BIOS though, since I've killed mine. I tested the DMI version, but not
this one.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Commit 884525655d ("PCI: clean up resource
alignment management") changed the resource handling to mark how a
resource was aligned on a per-resource basis.
Thus, instead of looking at the resource number to determine whether it
was a bridge resource or a regular resource (they have different
alignment rules), we should just ask the resource for its alignment
directly.
The reason this broke only cardbus resources was that for the other
types of resources, the old way of deciding alignment actually still
happened to work. But CardBus bridge resources had been changed by
commit 934b7024f0 ("Fix cardbus resource
allocation") to look more like regular resources than PCI bridge
resources from an alignment handling standpoint.
Reported-and-tested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alex Chiang and Matthew Wilcox pointed out that pci_get_dev_by_id() does
not properly decrement the reference on the from pointer if it is
present, like the documentation for the function states it will.
It fixes a pretty bad leak in the hotplug core (we were leaking an
entire struct pci_dev for each function of each offlined card, the first
time around; subsequent onlines/offlines were ok).
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: stable <stable@kernel.org>
Tested-by: Alex Chiang <achiang@hp.com>
Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Commit ef0ff95f13 (shpchp: fix slot name)
introduces the shpchp_slot_with_bus module parameter, which was intended
to help work around broken firmware that assigns the same name to multiple
slots.
Commit b3bd307c62 (shpchp: add message about
shpchp_slot_with_bus option) tells the user to use the above parameter
in the event of a name collision.
This approach is sub-optimal because it requires too much work from
the user.
Instead, let's rename the slot on behalf of the user. If firmware
assigns the name N to multiple slots, then:
The first registered slot is assigned N
The second registered slot is assigned N-1
The third registered slot is assigned N-2
The Mth registered slot becomes N-M
In the event we overflow the slot->name parameter, we report an
error to the user.
This is a temporary fix until the entire PCI core can be reworked
such that individual drivers no longer have to manage their own
slot names.
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Commit 3800345f72 (pciehp: fix slot name)
introduces the pciehp_slot_with_bus module parameter, which was intended
to help work around broken firmware that assigns the same name to multiple
slots.
Commit 9e4f2e8d4d (pciehp: add message about
pciehp_slot_with_bus option) tells the user to use the above parameter
in the event of a name collision.
This approach is sub-optimal because it requires too much work from
the user.
Instead, let's rename the slot on behalf of the user. If firmware
assigns the name N to multiple slots, then:
The first registered slot is assigned N
The second registered slot is assigned N-1
The third registered slot is assigned N-2
The Mth registered slot becomes N-M
In the event we overflow the slot->name parameter, we report an
error to the user.
This is a temporary fix until the entire PCI core can be reworked
such that individual drivers no longer have to manage their own
slot names.
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Commit f46753c5e3 ("PCI: introduce pci_slot") removed the need for this error path. Eliminate this warning:
drivers/pci/hotplug/rpaphp_slot.c: In function 'rpaphp_register_slot':
drivers/pci/hotplug/rpaphp_slot.c:151: warning: label 'sysfs_fail' defined but not used
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Consolidate finding of a root bridge and getting its handle to the one
inline function. It's cut & pasted on multiple places. Use this new
inline in those.
Cc: kristen.c.accardi@intel.com
Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
_OSC should be ran on a root bridge instead of the device itself. Do
this before touching OSHP since PCI fw specs states that _OSC should be
preferred over OSHP (however if the device has OSHP but not _OSC -- not
a root bridge -- it's not).
Cc: kristen.c.accardi@intel.com
Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Export pci_pme_active() to drivers, so that they can clear the
PME_status bit and disable PME# for their devices without involving
ACPI.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Check the return value of device_create_bin_file in pci_create_bus and
unwind if necessary. Don't propagate error to caller, as failure to create
these files shouldn't prevent PCI from being initialised. Instead, just
log a warning.
Cc: Sven Wegener <sven.wegener@stealer.net>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
With the recent change to avoid masking MSIs using the MSI enable bit, devices
without an MSI mask bit will have their MSI capability always enabled when MSI
is in use, so we need to restore it regardless of the mask bit state.
Fixes kernel bz 11178.
Acked-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Libata has some hacks to deal with certain controllers going silly in D3
state. The right way to handle this is to keep a PCI device flag for
such devices. That can then be generalised for no ATA devices with power
problems.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
A new option, pcie_aspm=force, will force ASPM to be enabled, even on system
with PCIe 1.0 devices.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Disable ASPM on pre-1.1 PCIe devices, as many of them don't implement it
correctly.
Tested-by: Jack Howarth <howarth@bromo.msbb.uc.edu>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The ACPI FADT table includes an ASPM control bit. If the bit is set, do
not enable ASPM since it may indicate that the platform doesn't actually
support the feature.
Tested-by: Jack Howarth <howarth@bromo.msbb.uc.edu>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
David Vrabel has a device which generates an interrupt storm on the INTx
pin if we disable MSI interrupts altogether. Masking interrupts is only
a performance optimisation, so we can ignore the request to mask the
interrupt.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
If the kernel is configured to support 64-bit resources on a 32-bit
machine, we can support 64-bit BARs properly. Just change the condition
to check sizeof(resource_size_t) instead of BITS_PER_LONG.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Factor out the code to read one BAR from the loop in pci_read_bases into
a new function, __pci_read_base. The new code is slightly more
readable, better commented and removes the ifdef.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: fixup sparse endianness warnings in proc.c
PCI PM: make more PCI PM core functionality available to drivers
PCI/DMAR: don't assume presence of RMRRs
PCI hotplug: fix error path in pci_slot's register_slot
Make more PCI PM core functionality available to drivers
* Export pci_pme_capable() so that it can be called directly by
drivers (for example, tg3 needs that).
* Move the state choosing part of pci_prepare_to_sleep() to a
separate function, pci_target_state(), that can be called directly
by drivers (for example, tg3 needs that).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
RMRRs do not necessarily have to be present on all VT-d capable platforms.
The printk is just informational and does not need to be followed by an error
return.
Signed-off-by: Yong Y Wang <yong.y.wang@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: mark gross <mgross@linux.intel.com>
Cc: Keshavamurthy, Anil S <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Kobjects do not have a limit in name size since a while, so stop
pretending that they do.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix kernel-doc comments so that they don't produce errors.
Also cut some extraneous copy-paste text.
Error(linhead//drivers/pci/pci.c:1133): duplicate section name 'Description'
Error(linhead//drivers/pci/pci.c:1189): duplicate section name 'Description'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (72 commits)
Revert "x86/PCI: ACPI based PCI gap calculation"
PCI: remove unnecessary volatile in PCIe hotplug struct controller
x86/PCI: ACPI based PCI gap calculation
PCI: include linux/pm_wakeup.h for device_set_wakeup_capable
PCI PM: Fix pci_prepare_to_sleep
x86/PCI: Fix PCI config space for domains > 0
Fix acpi_pm_device_sleep_wake() by providing a stub for CONFIG_PM_SLEEP=n
PCI: Simplify PCI device PM code
PCI PM: Introduce pci_prepare_to_sleep and pci_back_from_sleep
PCI ACPI: Rework PCI handling of wake-up
ACPI: Introduce new device wakeup flag 'prepared'
ACPI: Introduce acpi_device_sleep_wake function
PCI: rework pci_set_power_state function to call platform first
PCI: Introduce platform_pci_power_manageable function
ACPI: Introduce acpi_bus_power_manageable function
PCI: make pci_name use dev_name
PCI: handle pci_name() being const
PCI: add stub for pci_set_consistent_dma_mask()
PCI: remove unused arch pcibios_update_resource() functions
PCI: fix pci_setup_device()'s sprinting into a const buffer
...
Fixed up conflicts in various files (arch/x86/kernel/setup_64.c,
arch/x86/pci/irq.c, arch/x86/pci/pci.h, drivers/acpi/sleep/main.c,
drivers/pci/pci.c, drivers/pci/pci.h, include/acpi/acpi_bus.h) from x86
and ACPI updates manually.
Since the second argument of acpi_pci_choose_state() and
platform_pci_choose_state() is never used, remove it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Len Brown <len.brown@intel.com>
Get rid of a superfluous acpi_pm_device_sleep_state() parameter. The
only legitimate value of that parameter must be derived from the first
parameter, which is what all the callers already do. (However, this
does not address the fact that ACPI still doesn't set up those flags.)
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
Proper memory barriers have been added to order accesses
to ->cmd_busy, so volatile declaration for cmd_busy can
be removed.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
drivers/pci/pci.c needs pm_wakeup.h since it uses device_set_wakup_capable().
The latter also needs to be stubbed out for !CONFIG_PM.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The recently introduced pci_prepare_to_sleep() needs the following fix,
because there are systems which are not power manageable by ACPI (ie.
ACPI doesn't provide methods to put the device into low power states and
back), but require ACPI hooks to be executed for wake-up to work.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
MSI and MSI-X support for interrupt remapping infrastructure.
MSI address register will be programmed with interrupt-remapping table
entry(IRTE) index and the IRTE will contain information about the vector,
cpu destination, etc.
For MSI-X, all the IRTE's will be consecutively allocated in the table,
and the address registers will contain the starting index to the block
and the data register will contain the subindex with in that block.
This also introduces a new irq_chip for cleaner irq migration (in the process
context as opposed to the current irq migration in the context of an interrupt.
interrupt-remapping infrastructure will help us achieve this).
As MSI is edge triggered, irq migration is a simple atomic update(of vector
and cpu destination) of IRTE and flushing the hardware cache.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
IO-APIC support in the presence of interrupt-remapping infrastructure.
IO-APIC RTE will be programmed with interrupt-remapping table entry(IRTE)
index and the IRTE will contain information about the vector, cpu destination,
trigger mode etc, which traditionally was present in the IO-APIC RTE.
Introduce a new irq_chip for cleaner irq migration (in the process
context as opposed to the current irq migration in the context of an interrupt.
interrupt-remapping infrastructure will help us achieve this cleanly).
For edge triggered, irq migration is a simple atomic update(of vector
and cpu destination) of IRTE and flush the hardware cache.
For level triggered, we need to modify the io-apic RTE aswell with the update
vector information, along with modifying IRTE with vector and cpu destination.
So irq migration for level triggered is little bit more complex compared to
edge triggered migration. But the good news is, we use the same algorithm
for level triggered migration as we have today, only difference being,
we now initiate the irq migration from process context instead of the
interrupt context.
In future, when we do a directed EOI (combined with cpu EOI broadcast
suppression) to the IO-APIC, level triggered irq migration will also be
as simple as edge triggered migration and we can do the irq migration
with a simple atomic update to IO-APIC RTE.
TBD: some tests/changes needed in the presence of fixup_irqs() for
level triggered irq migration.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Interrupt-remapping enables queued invalidation. And once queued invalidation
is enabled, IOTLB invalidation also needs to use the queued invalidation
mechanism and the register based IOTLB invalidation doesn't work.
For now, Support for IOTLB invalidation using queued invalidation is
missing. Meanwhile, disable DMA-remapping, if Interrupt-remapping
support is detected.
For the meanwhile, if someone wants to really enable DMA-remapping, they
can use nox2apic, which will disable interrupt-remapping and as such
doesn't enable queued invalidation.
And given that none of the release platforms support intr-remapping yet,
we should be ok for this temporary hack.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Queued invalidation (part of Intel Virtualization Technology for
Directed I/O architecture) infrastructure.
This will be used for invalidating the interrupt entry cache in the
case of Interrupt-remapping and IOTLB invalidation in the case
of DMA-remapping.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Allocate the iommu during the parse of DMA remapping hardware
definition structures. And also, introduce routines for device
scope initialization which will be explicitly called during
dma-remapping initialization.
These will be used for enabling interrupt remapping separately from the
existing DMA-remapping enabling sequence.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Clean up the intel-iommu code related to deferred iommu flush logic. There is
no need to allocate all the iommu's as a sequential array.
This will be used later in the interrupt-remapping patch series to
allocate iommu much early and individually for each device remapping
hardware unit.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
code reorganization of the generic Intel vt-d parsing related routines and linux
iommu routines specific to Intel vt-d.
drivers/pci/dmar.c now contains the generic vt-d parsing related routines
drivers/pci/intel_iommu.c contains the iommu routines specific to vt-d
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
gart.h has only GART-specific stuff. Only GART code needs it. Other
IOMMU stuff should include iommu.h instead of gart.h.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
want to remove arch_get_ram_range, and use early_node_map instead.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If the offset of PCI device's PM capability in its configuration space,
the mask of states that the device supports PME# from and the D1 and D2
support bits are cached in the corresponding struct pci_dev, the PCI
device PM code can be simplified quite a bit.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Introduce functions pci_prepare_to_sleep() and pci_back_from_sleep(),
to be used by the PCI drivers that want to place their devices into
the lowest power state appropiate for them (PCI_D3hot, if the device
is not supposed to wake up the system, or the deepest state from
which the wake-up is possible, otherwise) while the system is being
prepared to go into a sleeping state and to put them back into D0
during the subsequent transition to the working state.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* Introduce function acpi_pm_device_sleep_wake() for enabling and
disabling the system wake-up capability of devices that are power
manageable by ACPI.
* Introduce function acpi_bus_can_wakeup() allowing other (dependent)
subsystems to check if ACPI is able to enable the system wake-up
capability of given device.
* Introduce callback .sleep_wake() in struct pci_platform_pm_ops and
for the ACPI PCI 'driver' make it use acpi_pm_device_sleep_wake().
* Introduce callback .can_wakeup() in struct pci_platform_pm_ops and
for the ACPI 'driver' make it use acpi_bus_can_wakeup().
* Move the PME# handlig code out of pci_enable_wake() and split it
into two functions, pci_pme_capable() and pci_pme_active(),
allowing the caller to check if given device is capable of
generating PME# from given power state and to enable/disable the
device's PME# functionality, respectively.
* Modify pci_enable_wake() to use the new ACPI callbacks and the new
PME#-related functions.
* Drop the generic .platform_enable_wakeup() callback that is not
used any more.
* Introduce device_set_wakeup_capable() that will set the
power.can_wakeup flag of given device.
* Rework PCI device PM initialization so that, if given device is
capable of generating wake-up events, either natively through the
PME# mechanism, or with the help of the platform, its
power.can_wakeup flag is set and its power.should_wakeup flag is
unset as appropriate.
* Make ACPI set the power.can_wakeup flag for devices found to be
wake-up capable by it.
* Make the ACPI wake-up code enable/disable GPEs for devices that
have the wakeup.flags.prepared flag set (which means that their
wake-up power has been enabled).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Rework pci_set_power_state() so that the platform callback is
invoked before the native mechanism, if necessary. Also, make
the function check if the device is power manageable by the
platform before invoking the platform callback.
This may matter if the device dependent on additional power
resources controlled by the platform is being put into D0, in which
case those power resources must be turned on before we attempt to
handle the device itself.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Introduce function pointer platform_pci_power_manageable to be used
by the platform-related code to point to a function allowing us to
check if given device is power manageable by the platform.
Introduce acpi_pci_power_manageable() playing that role for ACPI.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
It seems VT3336 can't do msi either as with its bro 3351. Disable it.
Reported in the following SUSE bug.
https://bugzilla.novell.com/show_bug.cgi?id=300001
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This changes pci_setup_device to handle pci_name() now returning a
constant string.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
During the development of the physical PCI slot patch series, Gary Hade
kept on reporting strange oopses due to interactions between pci_slot
and acpiphp.
http://lkml.org/lkml/2007/11/28/319
find_root_bridges() unconditionally installs
handle_hotplug_event_bridge() as an ACPI_SYSTEM_NOTIFY handler for all
root bridges.
However, during module cleanup, remove_bridge() will only remove the
notify handler iff the root bridge had a hot-pluggable slot directly
underneath. That is:
root bridge -> hotplug slot
But, if the topology looks like either of the following:
root bridge -> non-hotplug slot
root bridge -> p2p bridge -> hotplug slot
Then we currently do not remove the notify handler from that root
bridge.
This can cause a kernel oops if we modprobe acpiphp later and it gets
loaded somewhere else in memory. If the root bridge then receives a
hotplug event, it will then attempt to call a stale, non-existent notify
handler and we blow up.
Much thanks goes to Gary Hade for his persistent debugging efforts.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
For Broadcom 5706, 5708, 5709 rev. A nics, any read beyond the
VPD end tag will hang the device. This problem was initially
observed when a vpd entry was created in sysfs
('/sys/bus/pci/devices/<id>/vpd'). A read to this sysfs entry
will dump 32k of data. Reading a full 32k will cause an access
beyond the VPD end tag causing the device to hang. Once the device
is hung, the bnx2 driver will not be able to reset the device.
We believe that it is legal to read beyond the end tag and
therefore the solution is to limit the read/write length.
A majority of this patch is from Matthew Wilcox who gave code for
reworking the PCI vpd size information. A PCI quirk added for the
Broadcom NIC's to limit the read/write's.
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Some PCI devices will lock up if we attempt to read from VPD addresses
beyond some device-dependent limit. Until we can identify these
devices and adjust the file size accordingly, only let root read VPD
through sysfs to prevent a DoS by normal users.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Make pci_setup_device() write the bus ID directly into the allotted storage,
rather than using pci_name() as the address as that now returns a const
pointer.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Current pciehp driver saves its private data pointer into pci_dev
structure using pci_set_drvdata()/pci_get_drvdata(). But because
pciehp is not a pci device driver but a PCI Express service driver, it
should save its private data pointer into pcie_device structure using
set_service_data()/get_service_data().
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Currently, pciehp driver enables command completed interrupt as follows.
(1) Don't enable at initialization.
(2) Enable command completed interrupt whenever pciehp issues a
command, if the command doesn't attempt to disable the interrupt.
(3) Disable command completed interrupt at driver unloading.
Once we enable command completed interrupt, we don't need to re-enable
it for every command. So we can simplify above steps as follows:
(1) Enable command completed interrupt at initialization.
(2) No special sequence for command completed interrupt.
(3) Disable command completed interrupt at driver unloading.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Current pciehp driver's intialization sequence is as follows:
(1) initialize controller data structure
(2) install interrupt handler
(3) enable software notification
(4) initialize controller specific slot data structure
(5) initialize generic slot data structure and register it to pci hotplug core
The interrupt handler of pciehp assumes that controller specific slot
data structure is already initialized. However, it is installed at (2)
before initializing controller specific slot data structure at
(4). Because of this, pciehp driver cannot handle the following cases
properly.
- If devices that shares IRQ with pciehp raise interrupts between (2) and (4).
- If hotplug events (e.g. MRL open) happen between (3) and (4).
We already have a workaround for this problem ("pciehp: fix NULL
dereference in interrupt handler: dbd79aed1a).
But we still need fundamental fix.
This patch fix the problem by changing the initilization sequence as follows:
(1) initialize controller data structure
(2) initialize controller specific slot data structure
(3) install interrupt handler
(4) enable software notification
(5) initialize generic slot data structure and register it to pci hotplug core
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>