Eric Wollesen ported the Bluesmoke Memory Controller driver for the Intel
5000X/V/P (Blackford/Greencreek) chipset to the in-kernel EDAC model.
This patch incorporates the required changes to the edac_mc.c and edac_mc.h
core files by adding a new Fully Buffered DIMM interface to the EDAC Core module.
Signed-off-by: eric wollesen <ericw@xmtp.net>
Signed-off-by: doug thompson <norsk5@xmission.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is an attempt at providing an interface for memory scrubbing control in
EDAC.
This patch modifies the EDAC Core to provide the interface for memory
controller modules to implement.
The following things are still outstanding:
- K8 is the first implementation,
The patch provides a method of configuring the K8 hardware memory scrubber
via the 'mcX' sysfs directory. There should be some fallback to a generic
scrubber implemented in software if the hardware does not support
scrubbing.
Or .. the scrubbing sysfs entry should not be visible at all.
- Only works with SDRAM, not cache,
The K8 can scrub cache and l2cache also - but I think this is not so
useful as the cache is busy all the time (one hopes).
One would also expect that cache scrubbing requires hardware support.
- Error Handling,
I would like errors to be returned to the user in "terms of file
system".
- Presentation,
I chose Bandwidth in Bytes/Second as a representation of the scrubbing
rate for the following reasons:
I like that the sysfs entries are sort-of textual, related to something
that makes sense instead of magical values that must be looked up.
"My People" want "% main memory scrubbed per hour"; others prefer "%
memory bandwidth used" as the representation. "Bandwidth used" makes it easy to
calculate both versions in one-liner scripts.
If one later wants to scrub cache, the scaling becomes weird for K8,
changing from "blocks of 64 byte memory" to "blocks of 64 cache lines" to
"blocks of 64 bit". Using "bandwidth used" makes sense in all three cases,
(I.M.O. anyway ;-).
- Discovery,
There is no way to discover the possible settings and what they do
without reading the code and the documentation.
*I* do not know how to make that work in a practical way.
- Bugs(??),
Other tools can set invalid values in the memory scrub control register;
those will read back as '-1', requiring the user to reset the scrub rate.
This is how *I* think it should be.
- Afflicting other areas of code,
I made changes to edac_mc.c and edac_mc.h which will show up globally -
this is not nice; it would be better if the memory scrubbing functionality
and interface were entirely contained within the memory controller it
applies to.
Frithiof Jensen
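The "Presentation" argument above, that a bandwidth figure converts easily into either of the other two representations, can be illustrated with a minimal userspace sketch. The function names, and the idea of passing total memory size and peak bandwidth as parameters, are assumptions for illustration only; they are not part of the patch or of the EDAC sysfs interface.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: percentage of main memory scrubbed per hour,
 * given a scrub rate in bytes/second and the total memory size in bytes.
 */
double scrub_pct_per_hour(uint64_t bytes_per_sec, uint64_t mem_bytes)
{
	assert(mem_bytes > 0);
	return 100.0 * 3600.0 * (double)bytes_per_sec / (double)mem_bytes;
}

/*
 * Hypothetical helper: percentage of memory bandwidth used by scrubbing,
 * given the scrub rate and an assumed peak bandwidth, both in bytes/second.
 */
double scrub_pct_of_bandwidth(uint64_t bytes_per_sec, uint64_t peak_bytes_per_sec)
{
	assert(peak_bytes_per_sec > 0);
	return 100.0 * (double)bytes_per_sec / (double)peak_bytes_per_sec;
}
```

For example, a scrub rate of 1 MiB/s on a machine with 4 GiB of memory works out to a little under 88% of main memory scrubbed per hour.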
edac_mc.c and its .h file are a CORE helper module for EDAC
driver modules. This provides the abstraction for device-specific
drivers. It is fine to modify this CORE to provide help for
new features of the drivers.
doug thompson
Signed-off-by: Frithiof Jensen <frithiof.jensen@ericson.com>
Signed-off-by: doug thompson <norsk5@xmission.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With CONFIG_PCI=n:
CC drivers/edac/edac_mc.o
drivers/edac/edac_mc.c: In function ‘add_mc_to_global_list’:
drivers/edac/edac_mc.c:1362: error: implicit declaration of function ‘to_platform_device’
drivers/edac/edac_mc.c:1362: error: invalid type argument of ‘->’
drivers/edac/edac_mc.c: In function ‘edac_mc_add_mc’:
drivers/edac/edac_mc.c:1467: error: invalid type argument of ‘->’
drivers/edac/edac_mc.c: In function ‘edac_mc_del_mc’:
drivers/edac/edac_mc.c:1504: error: invalid type argument of ‘->’
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix the quoted module name in sysfs for EDAC modules, as reported by several
people.
Instead of ../_edac_e752x_/ the following is now presented, like other
modules: ../edac_e752x/
Signed-off-by: Doug Thompson <norsk5@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove add_mc_to_global_list(). In the next patch, this function will be
reimplemented with different semantics.
1 Reimplement add_mc_to_global_list() with semantics that allow the caller to
determine the ID number for a mem_ctl_info structure. Then modify
edac_mc_add_mc() so that the caller specifies the ID number for the new
mem_ctl_info structure. Platform-specific code should be able to assign the
ID numbers in a platform-specific manner. For instance, on Opteron it makes
sense to have the ID of the mem_ctl_info structure match the ID of the node
that the memory controller belongs to.
2 Modify callers of edac_mc_add_mc() so they use the new semantics.
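The caller-assigned ID semantics described above can be sketched in userspace as follows. This is only an illustration: struct mc_node and mc_list_add() are hypothetical stand-ins for mem_ctl_info and add_mc_to_global_list(), and both the ascending sort order and the rejection of duplicate IDs are assumptions rather than behavior stated in the patch description.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_ctl_info. */
struct mc_node {
	int idx;               /* ID chosen by the caller, e.g. a node number */
	struct mc_node *next;
};

/*
 * Insert mci into the list in ascending idx order.
 * Returns 0 on success, or -1 if the requested idx is already in use
 * (assumed policy for this sketch).
 */
int mc_list_add(struct mc_node **head, struct mc_node *mci)
{
	struct mc_node **pp = head;

	assert(head != NULL && mci != NULL);

	/* Walk to the first entry whose idx is not smaller than ours. */
	while (*pp != NULL && (*pp)->idx < mci->idx)
		pp = &(*pp)->next;

	if (*pp != NULL && (*pp)->idx == mci->idx)
		return -1;

	mci->next = *pp;
	*pp = mci;
	return 0;
}
```

With this shape, platform code decides the ID before registration, so an Opteron driver could pass the node number straight through as idx.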
Signed-off-by: Doug Thompson <norsk5@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Change MC drivers to stop using CVS revision strings for their version number.
Now each driver has its own local string.
Remove some PCI dependencies from the core EDAC module. Made the code 'struct
device' centric instead of 'struct pci_dev' centric. Most of the code changes here are
from a patch by Dave Jiang. It may be best to eventually move the
PCI-specific code into a separate source file.
Signed-off-by: Doug Thompson <norsk5@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Cosmetic indentation/formatting cleanup for EDAC code. Make sure we
are using tabs rather than spaces to indent, etc.
Signed-off-by: David S. Peterson <dsp@llnl.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- Fix code so we always hold mem_ctls_mutex while we are stepping
through the list of mem_ctl_info structures. Otherwise bad things
may happen if one task is stepping through the list while another
task is modifying it. We may eventually want to use reference
counting to manage the mem_ctl_info structures. In the meantime we
may as well fix this bug.
- Don't disable interrupts while we are walking the list of
mem_ctl_info structures in check_mc_devices(). This is unnecessary.
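The locking rule in the first item can be sketched in userspace with a pthread mutex standing in for mem_ctls_mutex. All names below (mc_node, mc_list_lock, mc_list_push, mc_list_for_each, mc_sum_idx) are hypothetical and chosen for illustration; the kernel code uses its own list and mutex primitives.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_ctl_info. */
struct mc_node {
	int idx;
	struct mc_node *next;
};

/* Stand-in for mem_ctls_mutex: protects every access to mc_list. */
static pthread_mutex_t mc_list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct mc_node *mc_list;

/* Writers take the lock while modifying the list. */
void mc_list_push(struct mc_node *n)
{
	pthread_mutex_lock(&mc_list_lock);
	n->next = mc_list;
	mc_list = n;
	pthread_mutex_unlock(&mc_list_lock);
}

/*
 * Readers hold the same lock for the whole walk, so no entry can be
 * added or removed underneath them. Returns the number of entries visited.
 */
int mc_list_for_each(void (*fn)(struct mc_node *, void *), void *arg)
{
	int visited = 0;

	assert(fn != NULL);
	pthread_mutex_lock(&mc_list_lock);
	for (struct mc_node *n = mc_list; n != NULL; n = n->next) {
		fn(n, arg);
		visited++;
	}
	pthread_mutex_unlock(&mc_list_lock);
	return visited;
}

/* Example callback: accumulate the idx values into an int. */
void mc_sum_idx(struct mc_node *n, void *arg)
{
	*(int *)arg += n->idx;
}
```

Because a mutex may sleep, the walk has to run in a context where sleeping is allowed, which fits with the second item's point that interrupts need not be disabled during it.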
Signed-off-by: David S. Peterson <dsp@llnl.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- After we unregister a kobject, wait for our kobject release method
to call complete(). This causes us to wait until the kobject
reference count reaches 0. Otherwise, a task accessing the EDAC
sysfs interface can hold the reference count above 0 until after the
EDAC module has been unloaded. When the reference count finally
drops to 0, this will result in an attempt to call our release
method inside the EDAC module after the module has already been
unloaded.
This isn't the best fix, since a process can get stuck sleeping forever
uninterruptibly if the user does the following:
rmmod my_module < /sys/my_sysfs/file
I'll go back and implement a better fix later. However this should
be ok for now.
- Call edac_remove_sysfs_mci_device() from edac_mc_del_mc() rather
than from edac_mc_free(). Since edac_mc_add_mc() calls
edac_create_sysfs_mci_device(), edac_mc_del_mc() should call
edac_remove_sysfs_mci_device().
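The "wait for the release method" pattern from the first item can be sketched in userspace, assuming a condition variable as a stand-in for a kernel completion and a plain atomic counter as a stand-in for the kobject reference count. Every name prefixed with sketch_ is hypothetical; none of this is the kernel API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for a kernel completion. */
struct sketch_completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

void sketch_complete(struct sketch_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

void sketch_wait_for_completion(struct sketch_completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Stand-in for a kobject: a reference count plus a "released" completion. */
struct sketch_obj {
	atomic_int refcount;
	struct sketch_completion released;
};

void sketch_obj_init(struct sketch_obj *o)
{
	atomic_init(&o->refcount, 1);   /* the owner's reference */
	pthread_mutex_init(&o->released.lock, NULL);
	pthread_cond_init(&o->released.cond, NULL);
	o->released.done = false;
}

void sketch_obj_get(struct sketch_obj *o)
{
	atomic_fetch_add(&o->refcount, 1);
}

/* Dropping the last reference plays the role of the release method. */
void sketch_obj_put(struct sketch_obj *o)
{
	int old = atomic_fetch_sub(&o->refcount, 1);

	assert(old > 0);
	if (old == 1)
		sketch_complete(&o->released);
}

/* Drop the owner's reference, then block until every other holder is gone. */
void sketch_obj_unregister(struct sketch_obj *o)
{
	sketch_obj_put(o);
	sketch_wait_for_completion(&o->released);
}
```

The sketch also shows the weakness noted above: if some holder never calls sketch_obj_put(), sketch_obj_unregister() blocks forever.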
Signed-off-by: David S. Peterson <dsp@llnl.gov>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Perform the following name substitutions on all source files:
sed 's/BS_MOD_STR/EDAC_MOD_STR/g'
sed 's/bs_thread_info/edac_thread_info/g'
sed 's/bs_thread/edac_thread/g'
sed 's/bs_xstr/edac_xstr/g'
sed 's/bs_str/edac_str/g'
The names that start with BS_ or bs_ are artifacts of when the code
was called "bluesmoke".
Signed-off-by: David S. Peterson <dsp@llnl.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This implements the following idea:
On Monday 30 January 2006 19:22, Eric W. Biederman wrote:
> One piece missing from this conversation is the issue that we need errors
> in a uniform format. That is why edac_mc has helper functions.
>
> However there will always be errors that don't fit any particular model.
> Could we add a edac_printk(dev, ); That is similar to dev_printk but
> prints out an EDAC header and the device on which the error was found?
> Letting the rest of the string be user specified.
>
> For actual control that interface may be to blunt, but at least for people
> looking in the logs it allows all of the errors to be detected and
> harvested.
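The idea quoted above, a printk-like helper that prepends an EDAC header and the device name and leaves the rest of the string to the caller, can be sketched in userspace as a formatting function. The name edac_msg_format() and the exact "EDAC <device>: " layout are assumptions for illustration and are not taken from the patch itself.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "EDAC <dev_name>: " followed by the
 * caller's formatted message into buf. Returns the length the full
 * message needs (as snprintf does), or a negative value on error.
 */
int edac_msg_format(char *buf, size_t len, const char *dev_name,
		    const char *fmt, ...)
{
	va_list ap;
	int n, m;

	assert(buf != NULL && len > 0);

	/* Uniform header so log scrapers can find every EDAC message. */
	n = snprintf(buf, len, "EDAC %s: ", dev_name);
	if (n < 0 || (size_t)n >= len)
		return n;

	/* The remainder of the line is free-form, supplied by the caller. */
	va_start(ap, fmt);
	m = vsnprintf(buf + n, len - (size_t)n, fmt, ap);
	va_end(ap);

	return m < 0 ? m : n + m;
}
```

A driver-side call might look like edac_msg_format(buf, sizeof(buf), "mc0", "%d CE", count), giving a line that a log parser can match on the fixed prefix alone.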
Signed-off-by: David S. Peterson <dsp@llnl.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is a subset of the bluesmoke project core code, stripped of the NMI work
which isn't ready to merge and some of the "interesting" proc functionality
that needs reworking or just has no place in the kernel. It requires no core
kernel changes except the added scrub functions already posted.
The goal is to merge further functionality only after the core code is
accepted and proven in the base kernel, and only at the point the upstream
extras are really ready to merge.
From: doug thompson <norsk5@xmission.com>
This converts EDAC to sysfs and is the final chunk necessary before EDAC
has a stable user space API and can be considered for submission into the
base kernel.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: doug thompson <norsk5@xmission.com>
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>