64-bit DMA concepts
On new IBM® Power Systems servers running Linux®, a subset of the PCIe slots supports a feature called 64-bit direct memory access (DMA). 64-bit DMA improves I/O operations and, therefore, system performance.
Adapters and PCIe slots that are enabled for 64-bit DMA allow I/O traffic to take place with less operating system overhead, improving latency (time delay for data transfer) and throughput (average rate of successful data transfer).
Following are some key terms for 64-bit DMA.
- PCIe
- Peripheral Component Interconnect Express®, or PCI Express. PCIe is a high-speed, serial, computer expansion bus standard for connecting extension hardware devices to a system board. PCIe is one of the primary buses that are used to attach peripheral devices to an IBM Power Systems server.
- DMA
- Direct memory access. DMA allows an I/O adapter to access a limited amount of memory directly, without involving the CPU for memory transfers. Both the device driver for the adapter and the operating system must recognize and support DMA.
- RDMA
- Remote direct memory access. RDMA supports direct memory access from the memory of one system into the memory of another system, without adding operating system overhead. To accomplish this access, the network adapter transfers data directly to or from the application memory area, bypassing the network stack of the operating system. Eliminating the operating system involvement promotes high-throughput, low-latency communication. RDMA is often used in high-performance computing (HPC).
- IOMMU
- Input/output memory management unit. IOMMU enables the connection between DMA-capable I/O buses and the main memory, and manages the I/O memory addresses. On IBM Power Systems, a Translation Control Entry (TCE) translates addresses generated by I/O devices into physical addresses.
- DMA window
- Direct memory access window. A DMA window is a range of addresses that the adapter is allowed to access. A typical DMA window is relatively small, around 2 GB, but can be as large as 1 TB. The DMA window address is mapped to the physical memory by using a Translation Control Entry (TCE) table in the IOMMU.
In the normal mode of using DMA, device drivers must request mappings from the operating system for every I/O operation and remove those mappings after use. Some I/O operations allow mappings to be cached and reused by the driver. The performance advantage of using an IOMMU is that data is delivered directly to, or read directly from, memory that is a part of the application space. Typically, this approach eliminates extra memory copies for the I/O.
- 64-bit DMA
- 64-bit direct memory access. 64-bit DMA is a PCIe slot capability on IBM Power Systems servers that enables a wider DMA window, possibly allowing all the partition memory to be mapped for DMA. This feature avoids the overhead of the driver requesting DMA mappings for each operation, because all the system memory assigned to the partition is already mapped. Consequently, data transfer between the I/O card that is placed in this slot and the system memory is more efficient and has lower latency.
This capability is also known as Huge Dynamic DMA Window in some Linux kernel patches and discussions.
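The address translation described in the DMA window entry can be illustrated with a small model. The following Python sketch is not an IBM or Linux interface. The class name `DmaWindow`, its methods, and the 4 KB I/O page size are assumptions chosen for illustration. It shows only the idea that an adapter-generated address is accepted if it falls inside the window and is then translated through a TCE table to a physical address.

```python
PAGE_SIZE = 4096  # assumed I/O page size for this sketch; real systems may differ


class DmaWindow:
    """Toy model of a DMA window backed by a TCE table (hypothetical names)."""

    def __init__(self, base, size):
        if base % PAGE_SIZE or size % PAGE_SIZE:
            raise ValueError("base and size must be page aligned")
        self.base = base   # first I/O address inside the window
        self.size = size   # window length in bytes
        self.tce = {}      # TCE table: window page index -> physical page number

    def _offset(self, io_addr):
        # Reject any address that the adapter is not allowed to access.
        if not (self.base <= io_addr < self.base + self.size):
            raise ValueError("I/O address is outside the DMA window")
        return io_addr - self.base

    def map_page(self, io_addr, phys_addr):
        """Install one entry so the page at io_addr maps to the page at phys_addr."""
        index = self._offset(io_addr) // PAGE_SIZE
        self.tce[index] = phys_addr // PAGE_SIZE

    def translate(self, io_addr):
        """Translate an adapter-generated I/O address into a physical address."""
        offset = self._offset(io_addr)
        index = offset // PAGE_SIZE
        if index not in self.tce:
            raise LookupError("no TCE is installed for this I/O page")
        return self.tce[index] * PAGE_SIZE + offset % PAGE_SIZE
```

For example, after `w = DmaWindow(base=0, size=2 * 1024**3)` and `w.map_page(0x1000, 0x76543000)`, the call `w.translate(0x1234)` returns `0x76543234`, while an address at or beyond 2 GB raises `ValueError`.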
Not all PCIe slots or PCIe adapters support 64-bit DMA. If the card or the device driver does not support 64-bit DMA, the PCIe slot works as a standard slot, no different from the other slots.
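The difference between the normal mode of DMA and a 64-bit DMA window can be sketched by counting TCE table updates on the I/O path. The following Python model is an illustration under stated assumptions, not a description of the actual firmware or kernel implementation. The names `ToyIommu`, `dynamic_mode`, and `premapped_mode` are hypothetical, memory is counted in whole pages, and the pre-mapped case assumes a simple one-to-one mapping of partition pages.

```python
class ToyIommu:
    """Counts TCE table updates; not a model of any real firmware interface."""

    def __init__(self):
        self.tce = {}      # I/O page number -> physical page number
        self.updates = 0   # total map and unmap operations performed

    def map(self, io_page, phys_page):
        self.tce[io_page] = phys_page
        self.updates += 1

    def unmap(self, io_page):
        del self.tce[io_page]
        self.updates += 1


def dynamic_mode(iommu, buffer_pages):
    """Normal mode: map each buffer page before the I/O and unmap it afterward."""
    before = iommu.updates
    for phys_page in buffer_pages:
        io_page = 0                              # reuse one slot of a small window
        iommu.map(io_page, phys_page)
        assert iommu.tce[io_page] == phys_page   # the adapter would transfer here
        iommu.unmap(io_page)
    return iommu.updates - before


def premapped_mode(iommu, partition_pages, buffer_pages):
    """64-bit DMA: map every partition page once, then run I/O with no updates."""
    for page in range(partition_pages):
        iommu.map(page, page)                    # assumed one-to-one mapping
    before = iommu.updates                       # count only the I/O path
    for phys_page in buffer_pages:
        assert iommu.tce[phys_page] == phys_page  # mapping is already present
    return iommu.updates - before
```

With four I/O operations on pages `[3, 7, 3, 9]`, `dynamic_mode(ToyIommu(), [3, 7, 3, 9])` returns 8 (one map and one unmap per operation), while `premapped_mode(ToyIommu(), 16, [3, 7, 3, 9])` returns 0 because the one-time setup cost is paid before any I/O starts.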