
Memory

From The Cluster Agenda

Error Detection And Correction

A cluster of systems with large amounts of RAM gives system integrators and administrators ample opportunity to become familiar with soft errors: transient bit flips in memory that are not caused by a permanent hardware fault.

According to arch/*/kernel/mce.c and arch/*/kernel/traps.c, Linux kernels older than 2.6.16 handle an uncorrectable bit error in one of two ways. Normally the error is seen as a Machine Check Exception (MCE): the kernel prints a message identifying the DIMM bank and panics. If the MCE panic has been disabled with the mce=off kernel boot parameter, the error is instead seen as a Non-Maskable Interrupt (NMI), and the kernel prints a "Dazed" message and continues running.

There are new capabilities beginning with the 2.6.16 kernel. The code from the EDAC project was merged into the kernel as optional modules. The modules provide counters for correctable and uncorrectable errors, a counter of the seconds since the last reset, the ability to reset the counters through sysfs, and more.
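As a minimal sketch of how these counters might be read, the Python below walks the EDAC sysfs tree and collects the error counts for each memory controller. The directory /sys/devices/system/edac/mc and the file names ce_count, ue_count, and seconds_since_reset are assumptions taken from the kernel's EDAC documentation, not from this page; they may differ between kernel versions, and the function name read_edac_counters is purely illustrative.

```python
from pathlib import Path

# Assumed location of the EDAC memory-controller entries in sysfs.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")

# Assumed per-controller counter files (names from the kernel EDAC docs).
COUNTER_FILES = ("ce_count", "ue_count", "seconds_since_reset")


def read_edac_counters(root=EDAC_ROOT):
    """Return {controller: {counter: int}} for every mc* directory under root."""
    counters = {}
    root = Path(root)
    if not root.is_dir():
        # EDAC modules not loaded, or sysfs laid out differently.
        return counters
    for mc in sorted(root.glob("mc[0-9]*")):
        values = {}
        for name in COUNTER_FILES:
            try:
                values[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                # Skip counters that are missing or not plain integers.
                continue
        counters[mc.name] = values
    return counters


if __name__ == "__main__":
    for controller, values in read_edac_counters().items():
        print(controller, values)
```

On a system without the EDAC modules loaded the function simply returns an empty dictionary, so it can be run from a cluster-wide monitoring script without special-casing nodes that lack a supported memory controller.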

The Linux EDAC modules support the following memory controllers:


I/O on 64-bit Systems With Large Amounts of Memory

Part of the transition from 32-bit x86 to x86_64 with large amounts of memory involves handling I/O devices that support only 32-bit memory addresses. AMD products include a hardware IOMMU that, for the most part, makes everything work transparently. Intel EM64T and IA64 products do not include an IOMMU, so the Linux kernel implements a "software I/O translation buffer" (swiotlb) instead. The memory allocated to the swiotlb is unavailable to normal processes, and some device drivers (such as the proprietary NVIDIA graphics driver) may require more memory to be reserved in order to operate reliably. See the Linux kernel's documentation for information about the swiotlb and iommu boot parameters. Much of the information summarized in this paragraph comes from an LWN article on DMA by Jonathan Corbet.
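To check which of these boot parameters a node was actually started with, something like the following Python sketch could be used. It assumes the kernel command line is readable from /proc/cmdline; the helper name dma_boot_params is hypothetical, and the parameter values in the usage note are illustrative only, so consult the kernel documentation for the values your kernel accepts.

```python
def dma_boot_params(cmdline):
    """Extract the swiotlb and iommu settings from a kernel command line string."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key in ("swiotlb", "iommu"):
            # A bare parameter with no "=value" part is recorded as None.
            params[key] = value if sep else None
    return params


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            print(dma_boot_params(f.read()))
    except OSError:
        print("could not read /proc/cmdline")
```

For example, passing the string "ro root=/dev/sda1 swiotlb=65536 iommu=soft" returns a dictionary with both settings, while a command line that mentions neither parameter yields an empty dictionary, meaning the kernel defaults are in effect.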

Retrieved from "http://agenda.clustermonkey.net/index.php/Memory"

This page was last modified 06:48, 21 February 2006. Content is available under Attribution-NonCommercial-ShareAlike 2.5.

