Started Nov 1999 by Kanoj Sarcar <kanoj@sgi.com>

The intent of this file is to have an up-to-date, running commentary
from different people about NUMA-specific code in the Linux vm.

What is NUMA? It is an architecture where the memory access times
for different regions of memory from a given processor vary
according to the "distance" of the memory region from the processor.
Each region of memory to which access times are the same from any
cpu is called a node. On such architectures, it is beneficial if
the kernel tries to minimize inter-node communication. Schemes
for this range from replicating kernel text and read-only data
across nodes to trying to house all the data structures that
key components of the kernel need in memory on the node that uses them.

Currently, all the NUMA support does is provide efficient handling
of widely discontiguous physical memory, so architectures which
are not NUMA but can have huge holes in the physical address space
can use the same code. All this code is bracketed by CONFIG_DISCONTIGMEM.

The initial port includes NUMAizing the bootmem allocator code by
encapsulating all the pieces of information into a bootmem_data_t
structure. Node-specific calls have been added to the allocator.
In theory, any platform which uses the bootmem allocator should
be able to put the bootmem and mem_map data structures anywhere
it deems best.
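
The idea of keeping all boot allocator state in one per-node structure
can be sketched in user space as follows. This is only an illustration:
struct sketch_bootmem and the sketch_* functions are hypothetical names
invented here, and the fields do not mirror the real bootmem_data_t.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for per-node boot allocator state; the field
 * names are illustrative and are not those of the kernel's bootmem_data_t. */
struct sketch_bootmem {
    unsigned char *base;   /* start of this node's boot memory region */
    size_t size;           /* bytes available in the region */
    size_t used;           /* bytes already handed out */
};

/* Initialise one node's allocator state over a caller-supplied region. */
static void sketch_bootmem_init(struct sketch_bootmem *b, void *region,
                                size_t size)
{
    b->base = region;
    b->size = size;
    b->used = 0;
}

/* Node-specific allocation: carve `bytes` out of the given node only.
 * The returned offset within the region is rounded up to `align`, which
 * must be a power of two. Returns NULL when the node cannot satisfy the
 * request. */
static void *sketch_bootmem_alloc_node(struct sketch_bootmem *b,
                                       size_t bytes, size_t align)
{
    size_t start;

    assert(align != 0 && (align & (align - 1)) == 0);
    start = (b->used + align - 1) & ~(align - 1);
    if (start > b->size || bytes > b->size - start)
        return NULL;
    b->used = start + bytes;
    return b->base + start;
}
```

Because every call names the node's own state, a platform is free to
place each node's early data structures in that node's memory simply by
choosing which structure it passes in.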

Each node's page allocation data structures have also been encapsulated
into a pg_data_t. The bootmem_data_t is just one part of this. To
make the code look uniform between NUMA and regular UMA platforms,
UMA platforms have a statically allocated pg_data_t too (contig_page_data).
For the sake of uniformity, the function num_online_nodes() is also defined
for all platforms. As we run benchmarks, we might decide to move
more variables like low_on_memory, nr_free_pages etc. into the pg_data_t.
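
A minimal user-space sketch of this uniformity is shown below, assuming
a small fixed table of node descriptors. All names (struct sketch_pgdat,
sketch_contig, sketch_num_online_nodes and so on) are hypothetical and
only loosely modelled on pg_data_t, contig_page_data and
num_online_nodes(); the real structures carry far more state.

```c
#include <stddef.h>

#define SKETCH_MAX_NODES 4   /* hypothetical limit for this sketch */

/* Hypothetical per-node descriptor, loosely modelled on pg_data_t. */
struct sketch_pgdat {
    int node_id;
    unsigned long start_pfn;      /* first page frame on this node */
    unsigned long nr_pages;       /* pages spanned by this node */
    unsigned long nr_free_pages;  /* per-node counter, not a global */
};

/* UMA case: one statically allocated descriptor covering all memory. */
static struct sketch_pgdat sketch_contig = { 0, 0, 1024, 1024 };

/* Table of online nodes; a UMA machine only ever has the first entry. */
static struct sketch_pgdat *sketch_nodes[SKETCH_MAX_NODES] = { &sketch_contig };
static int sketch_online = 1;

/* Defined on every platform; returns 1 on UMA. */
static int sketch_num_online_nodes(void)
{
    return sketch_online;
}

/* Bring another node online. Returns its id, or -1 if the table is full. */
static int sketch_node_online(struct sketch_pgdat *pgdat)
{
    if (sketch_online >= SKETCH_MAX_NODES)
        return -1;
    pgdat->node_id = sketch_online;
    sketch_nodes[sketch_online] = pgdat;
    return sketch_online++;
}

/* Generic code loops over nodes the same way on NUMA and UMA. */
static unsigned long sketch_total_free_pages(void)
{
    unsigned long total = 0;
    int nid;

    for (nid = 0; nid < sketch_num_online_nodes(); nid++)
        total += sketch_nodes[nid]->nr_free_pages;
    return total;
}
```

The loop in sketch_total_free_pages() needs no special case for UMA,
which is the point of giving UMA platforms a descriptor of their own.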

The NUMA-aware page allocation code currently tries to allocate pages
from different nodes in a round-robin manner. This will be changed to
do a concentric circle search, starting from the current node, once the
NUMA port achieves more maturity. The call alloc_pages_node has been
added, so that drivers can make the call and not worry about whether
they are running on a NUMA or UMA platform.
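
The two node-selection policies mentioned here can be contrasted with a
small user-space sketch. Everything in it is an assumption made for
illustration: the free-page counters, the distance table and the
sketch_* function names are hypothetical, and neither function is the
kernel's alloc_pages_node.

```c
#include <assert.h>

#define SKETCH_NODES 4   /* hypothetical node count */

/* Hypothetical per-node free page counters. */
static unsigned long sketch_free[SKETCH_NODES] = { 2, 0, 1, 3 };

/* Hypothetical symmetric distance table; smaller means closer. */
static const int sketch_distance[SKETCH_NODES][SKETCH_NODES] = {
    { 10, 20, 30, 40 },
    { 20, 10, 20, 30 },
    { 30, 20, 10, 20 },
    { 40, 30, 20, 10 },
};

static int sketch_next_node;   /* round-robin cursor */

/* Round robin: start at the cursor, take one page from the first node
 * that has any, and move the cursor past it. Returns the node id, or
 * -1 if every node is empty. */
static int sketch_alloc_round_robin(void)
{
    int i;

    for (i = 0; i < SKETCH_NODES; i++) {
        int nid = (sketch_next_node + i) % SKETCH_NODES;

        if (sketch_free[nid] > 0) {
            sketch_free[nid]--;
            sketch_next_node = (nid + 1) % SKETCH_NODES;
            return nid;
        }
    }
    return -1;
}

/* Nearest first: take one page from the node with free pages that is
 * closest to `from`, widening outwards only when nearer nodes are
 * empty. Returns the node id, or -1 if every node is empty. */
static int sketch_alloc_nearest(int from)
{
    int nid, best = -1;

    assert(from >= 0 && from < SKETCH_NODES);
    for (nid = 0; nid < SKETCH_NODES; nid++) {
        if (sketch_free[nid] == 0)
            continue;
        if (best < 0 || sketch_distance[from][nid] < sketch_distance[from][best])
            best = nid;
    }
    if (best >= 0)
        sketch_free[best]--;
    return best;
}
```

Round robin spreads allocations evenly but ignores locality, while the
nearest-first search prefers the caller's node and only falls back to
more distant nodes when closer ones have no free pages.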