What Is IOMMU Event Tracing?

The IOMMU event tracing feature reports IOMMU events in the Linux Kernel as they happen during boot-time and run-time. It provides insight into IOMMU device topology in the Linux Kernel. This information helps you understand which IOMMU group a device belongs to, as well as run-time device assignment changes as devices are moved from hosts to guests and back by the Kernel. The Linux Kernel moves a device from host to guest when a user requests such a change.

In addition, IOMMU event tracing helps debug BIOS and firmware problems related to IOMMU hardware and firmware implementation, IOMMU drivers, and device assignment. For example, trace events are generated when a device is detached from the host and assigned to a virtual machine, that is, when the device moves from the host domain to the VM domain, so each step of the process can be debugged. The primary purpose of IOMMU event tracing is to help detect and solve performance issues.

Enabling IOMMU event tracing provides useful information about devices that are using IOMMU, as well as changes that occur in device assignments. In this article, I’ll discuss the IOMMU event tracing feature and the various classes of IOMMU events. In part two of this series, I’ll discuss how to enable and use it to trace events during boot-time and run-time, and how to use the IOMMU tracing feature to get insight into what’s happening in virtualized environments as devices get assigned from hosts to virtual machines and vice versa. This feature helps debug IOMMU problems during development, maintenance, and support.

What is an IOMMU?

IOMMU is short for I/O Memory Management Unit. An IOMMU is hardware that translates device (I/O) addresses to the physical (machine) address space. It can be viewed as an MMU for devices: an MMU maps virtual addresses into physical addresses and, similarly, an IOMMU maps device addresses into physical addresses. The following picture shows a comparative depiction of IOMMU vs. MMU.

[Figure: A Comparison of IOMMU and MMU Address Mapping]

In addition to basic mapping, the IOMMU provides device isolation via access permissions. Mapping requests are allowed or disallowed based on whether or not the device has proper permissions to access a certain memory region. Another key feature the IOMMU brings to the table is I/O Virtualization, which provides DMA remapping hardware that adds support for the isolation of device accesses to memory, as well as translation functionality. In other words, devices present I/O addresses to the IOMMU, which translates them into machine addresses, thereby bridging the gap between device addressing capability and the system memory range.
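The translation and permission check described above can be pictured with a toy model. This is only an illustrative sketch, not how any real IOMMU or driver is implemented: the `ToyIommu` class, the 4 KB page size, and the device names are all assumptions made up for this example.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model


class ToyIommu:
    """Hypothetical model: one table per device, mapping I/O pages to machine pages."""

    def __init__(self):
        # {device: {io_page_number: (machine_page_number, writable)}}
        self.tables = {}

    def map_page(self, device, iova, paddr, writable=False):
        table = self.tables.setdefault(device, {})
        table[iova // PAGE_SIZE] = (paddr // PAGE_SIZE, writable)

    def translate(self, device, iova, write=False):
        entry = self.tables.get(device, {}).get(iova // PAGE_SIZE)
        if entry is None:
            # The device has no mapping for this region: access is disallowed.
            raise PermissionError(f"{device}: no mapping for iova {iova:#x}")
        machine_page, writable = entry
        if write and not writable:
            raise PermissionError(f"{device}: write to read-only iova {iova:#x}")
        return machine_page * PAGE_SIZE + iova % PAGE_SIZE


iommu = ToyIommu()
iommu.map_page("nic0", iova=0x1000, paddr=0x7F000, writable=False)
addr = iommu.translate("nic0", 0x1010)  # read access inside the mapped page
print(hex(addr))
```

A read by `nic0` inside its mapped page is translated, while a write to the same page, or any access by a device with no mapping, raises an error in this model.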

What Does an IOMMU Do?

IOMMU hardware provides several key features that enhance I/O performance on a system.

  • On systems that support IOMMU, one single contiguous virtual memory region can be mapped to multiple non-contiguous physical memory regions. IOMMU can make a non-contiguous memory region appear contiguous to a device (scatter/gather).
  • Scatter/gather optimizes streaming DMA performance for the I/O device.
  • Memory isolation and protection allow a device to access only the memory regions that are mapped for it. As a result, faulty and/or malicious devices can’t corrupt system memory.
  • Memory isolation allows safe device assignment to virtual machines without compromising host and other guest operating systems. Similar to the faulty and/or malicious device case, devices are given access to memory regions which are mapped specifically for them. As a result, devices assigned to virtual machines will not have access to the host or another virtual machine’s memory regions.
  • IOMMU helps address discrepancies between I/O device and system memory addressing capabilities. For example, IOMMU enables 32-bit DMA capable non-DAC devices to access memory regions above 4GB.
  • IOMMU supports hardware interrupt remapping. This feature expands limited hardware interrupts to extendable software interrupts, thereby increasing the number of interrupts that can be supported. Primary uses of interrupt remapping are interrupt isolation and the ability to translate between interrupt domains, e.g., ioapic vs. x2apic on x86.
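To make the scatter/gather and above-4GB points concrete, here is a small sketch. The helper `map_scatter_gather`, the page size, and all addresses are hypothetical values chosen for illustration; real mappings are set up by the kernel's DMA and IOMMU code.

```python
PAGE_SIZE = 4096  # assumed page size


def map_scatter_gather(iova_base, phys_pages):
    """Place scattered physical pages at consecutive I/O addresses.

    Returns a {iova: paddr} dictionary with one entry per page.
    """
    return {iova_base + i * PAGE_SIZE: paddr for i, paddr in enumerate(phys_pages)}


# Three non-contiguous physical pages; the second one lies above 4 GB.
phys_pages = [0x0009_3000, 0x1_4000_2000, 0x0004_1000]
mapping = map_scatter_gather(0x10000, phys_pages)

for iova, paddr in mapping.items():
    print(f"iova={iova:#x} -> paddr={paddr:#x}")
```

In this model the device only ever sees one contiguous I/O range that fits in 32 bits, even though one of the backing pages sits above the 4 GB boundary.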

As we all know, there is no free lunch! IOMMU introduces latency due to translation overhead in the dynamic DMA mapping path. However, most servers support I/O Translation Lookaside Buffer (IOTLB) hardware to reduce the translation overhead.
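The effect of an IOTLB can be sketched as a cache placed in front of the page table lookup. This is a deliberately simplified model: the `CachedTranslator` class is hypothetical, and real IOTLBs have limited capacity and must be invalidated when mappings change, which this sketch ignores.

```python
PAGE_SIZE = 4096  # assumed page size


class CachedTranslator:
    """Toy translator that counts how often the page table must be walked."""

    def __init__(self, page_table):
        self.page_table = page_table  # {io_page_number: machine_page_number}
        self.iotlb = {}               # cached translations
        self.walks = 0                # number of (slow) page table walks

    def translate(self, iova):
        page, offset = divmod(iova, PAGE_SIZE)
        if page not in self.iotlb:
            self.walks += 1
            self.iotlb[page] = self.page_table[page]
        return self.iotlb[page] * PAGE_SIZE + offset


translator = CachedTranslator({1: 0x7F})
results = [translator.translate(0x1000 + off) for off in (0, 8, 16)]
print([hex(r) for r in results], "walks:", translator.walks)
```

Three accesses to the same page cost a single table walk here; the other two are served from the cache.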

IOMMU Groups and Device Isolation

Devices are isolated in IOMMU groups. Each group contains a single device or a group of devices, but single device isolation is not always possible for a variety of reasons. Devices behind a bridge can communicate without reaching IOMMU via peer-to-peer communication channels. Unless the I/O hardware/firmware provides a way to disable peer-to-peer communication, IOMMU can’t ensure single device isolation and will be forced to place all the devices behind a bridge in a single IOMMU group for isolation.

Multi-function cards don’t always support the PCI access control services required to describe isolation between functions. In such cases, all functions on a multi-function card are placed in a single IOMMU group. Devices in a group can’t be separated for assignment, and all devices in that group must be assigned together, even when a virtual machine only needs one of them. For example, IOMMU might be forced to group all four ports on a multi-port card because device isolation at port granularity isn’t possible on all hardware.

[Figure: Network hardware with device isolation at port level is capable of separating ports for specific IOMMU groups.]
[Figure: Without port isolation, the network hardware must assign all ports to a single group.]
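The grouping rule can be sketched as follows. The function `assign_groups`, the `(name, bridge, isolated)` tuple layout, and the device names are assumptions made for this illustration only; the kernel derives groups from the actual bus topology and hardware capabilities.

```python
from collections import defaultdict


def assign_groups(devices):
    """Group devices by isolation capability (hypothetical model).

    devices: list of (name, bridge, isolated) tuples. isolated=True means the
    hardware can isolate the device from its peers; otherwise it must share a
    group with every other non-isolated device behind the same bridge.
    """
    groups = defaultdict(list)
    for name, bridge, isolated in devices:
        key = name if isolated else f"bridge:{bridge}"
        groups[key].append(name)
    return list(groups.values())


devices = [
    ("01:00.0", "br1", False),  # four ports of a card without port isolation
    ("01:00.1", "br1", False),
    ("01:00.2", "br1", False),
    ("01:00.3", "br1", False),
    ("02:00.0", "br2", True),   # a device that can be isolated on its own
]
groups = assign_groups(devices)
print(groups)
```

In this model the four ports end up in one group and would have to be assigned to a virtual machine together, while the isolated device gets a group of its own.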

IOMMU Domains and Protection

IOMMU domains are intended to provide protection against a virtual machine corrupting another virtual machine’s memory. Devices get moved from one domain to another as they get moved between VMs or from a host to a VM. Any device in a domain is given access to the memory regions mapped for the specific domain it belongs to. When a device is assigned to a VM, it is first detached from the host and removed from the host domain, then moved to the VM domain and attached to the VM, as shown below:

Step 1: A guest OS needs access to hardware that’s currently mapped to the host.
Step 2: The hardware must first be detached from the host and transferred from the host domain to the VM domain.
Step 3: Once the hardware has been moved to the VM domain, it can be attached to the guest OS.

A Brief Overview of IOMMU Boot and Run-Time Operations

The IOMMU driver creates IOMMU groups and domains during initialization. Devices are placed in IOMMU groups based on their device isolation capabilities. iommu_group_add_device() is called when a device is added to a group and iommu_group_remove_device() is called when a device is removed from a group.

Initially, all devices are attached to the host. When a user requests that a device be assigned to a VM, the device gets detached from the host and then attached to the VM. iommu_attach_device() is called to attach a device and iommu_detach_device() is called to detach it. The iommu_map() and iommu_unmap() interfaces create and delete mappings between the device address space and the system address space.

A series of device additions occurs during boot. During run-time, after a device is attached, a series of device maps and unmaps occurs until the device is detached.

[Figure: IOMMU event tracing provides insight into what is occurring during all of these processes.]
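The boot and run-time sequence above can be simulated to see the order in which events would be observed. The Python functions below are toy stand-ins loosely named after the kernel interfaces mentioned in the text; they do not reflect the real kernel signatures, and the device name, group number, and addresses are invented for the example.

```python
events = []  # recorded (event, details) tuples, standing in for a trace buffer


def group_add_device(group_id, device):
    events.append(("add_device", f"groupID={group_id} device={device}"))


def attach_device(domain, device):
    domain.add(device)
    events.append(("attach", f"device={device}"))


def detach_device(domain, device):
    domain.remove(device)
    events.append(("detach", f"device={device}"))


def map_range(iova, paddr, size):
    events.append(("map", f"iova=0x{iova:016x} paddr=0x{paddr:016x} size={size}"))


def unmap_range(iova, size):
    events.append(("unmap", f"iova=0x{iova:016x} size={size} unmapped_size={size}"))


host_domain, vm_domain = set(), set()
dev = "0000:01:00.0"  # hypothetical device name

group_add_device(7, dev)          # boot: device is placed in a group
attach_device(host_domain, dev)   # device starts out attached to the host
detach_device(host_domain, dev)   # user requests assignment to a VM
attach_device(vm_domain, dev)
map_range(0x1000, 0x7F000, 4096)  # run-time map/unmap activity
unmap_range(0x1000, 4096)
detach_device(vm_domain, dev)     # device is released again

order = [name for name, _ in events]
print(order)
```

The recorded order (one addition, then attach and detach around a run of maps and unmaps) mirrors the lifecycle that the trace events are meant to expose.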

The ability to have visibility into device additions, deletions, attaches, detaches, maps, and unmaps is valuable in debugging IOMMU problems. As you can see below, this is exactly what IOMMU events are designed to do. In fact, the idea for this tracing work was a result of debugging several IOMMU problems without having a good insight into what’s happening. Let’s take a look at the trace events.

IOMMU Trace Event Classes

IOMMU events are classified into group, device, map and unmap, and error classes to trace activity in each of these areas. Group class events are generated whenever a device gets added to or removed from a group. Device class events are intended for tracing device attach and detach activity. Map and unmap events trace map/unmap activity. Finally, in addition to these normal path events, error class events trace autonomous IOMMU faults that might occur during boot-time and/or run-time.

[Figure: IOMMU Trace Event Classes]

IOMMU Group Class Events

IOMMU group class events are triggered during boot. Traces are generated when devices get added to or removed from an IOMMU group. These traces provide insight into IOMMU device topology and how the devices are grouped for isolation.

  • Add device to a group – Triggered when IOMMU driver adds a device to a group. Format: IOMMU: groupID=%d device=%s
  • Remove device from a group – Triggered when IOMMU driver removes a device from a group. Format: IOMMU: groupID=%d device=%s
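As a rough illustration of how these traces reveal device topology, the sketch below parses lines in the group event format and builds a group-to-devices table. The sample lines are invented for this example, and the code assumes every input line is an "add" event, since the add and remove events share the same payload format and have to be told apart by other means.

```python
import re
from collections import defaultdict

# Matches the payload format given above: IOMMU: groupID=%d device=%s
GROUP_EVENT = re.compile(r"IOMMU: groupID=(\d+) device=(\S+)")


def build_topology(lines):
    """Return {group_id: [device, ...]} from add-device trace payloads."""
    topology = defaultdict(list)
    for line in lines:
        match = GROUP_EVENT.search(line)
        if match:
            topology[int(match.group(1))].append(match.group(2))
    return dict(topology)


# Hypothetical sample payloads, not captured from a real system.
sample = [
    "IOMMU: groupID=0 device=0000:00:00.0",
    "IOMMU: groupID=1 device=0000:00:02.0",
    "IOMMU: groupID=1 device=0000:00:02.1",
]
topology = build_topology(sample)
print(topology)
```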

IOMMU Device Class Events

Events in this group are triggered during run-time, whenever devices are attached to and detached from domains, for example, when a device is detached from the host and attached to a guest. This information provides insight into device assignment changes during run-time.

  • Attach (add) device to a domain – Triggered when a device gets attached (added) to a domain. Format: IOMMU: device=%s
  • Detach (remove) device from a domain – Triggered when a device gets detached (removed) from a domain. Format: IOMMU: device=%s

IOMMU Map and Unmap Events

Events in this group are triggered during run-time whenever device drivers make IOMMU map and unmap requests. This information provides insight into map and unmap requests and helps debug performance and other problems.

  • IOMMU map event – Triggered when IOMMU driver services a map request. Format: IOMMU: iova=0x%016llx paddr=0x%016llx size=%zu
  • IOMMU unmap event – Triggered when IOMMU driver services an unmap request. Format: IOMMU: iova=0x%016llx size=%zu unmapped_size=%zu
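One way such traces might be used is to track how many bytes are currently mapped. The sketch below parses the two payload formats listed above; the sample lines and the `outstanding_bytes` helper are hypothetical and only illustrate the idea.

```python
import re

# Payload formats from the list above.
MAP_EVENT = re.compile(r"IOMMU: iova=0x([0-9a-f]+) paddr=0x([0-9a-f]+) size=(\d+)")
UNMAP_EVENT = re.compile(r"IOMMU: iova=0x([0-9a-f]+) size=(\d+) unmapped_size=(\d+)")


def outstanding_bytes(lines):
    """Sum mapped sizes minus unmapped sizes over a list of trace payloads."""
    total = 0
    for line in lines:
        mapped = MAP_EVENT.search(line)
        if mapped:
            total += int(mapped.group(3))
            continue
        unmapped = UNMAP_EVENT.search(line)
        if unmapped:
            total -= int(unmapped.group(3))  # bytes actually unmapped
    return total


# Hypothetical sample payloads, not captured from a real system.
sample = [
    "IOMMU: iova=0x00000000fed00000 paddr=0x000000012f400000 size=4096",
    "IOMMU: iova=0x00000000fed01000 paddr=0x000000012f600000 size=8192",
    "IOMMU: iova=0x00000000fed00000 size=4096 unmapped_size=4096",
]
remaining = outstanding_bytes(sample)
print(remaining)
```

A total that keeps growing over a long trace could hint at mappings that are never released, which is the kind of problem these events help narrow down.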

IOMMU Error Class Events

Events in this group are triggered during run-time when an IOMMU fault occurs. This information provides insight into IOMMU faults and is useful for logging the fault and taking measures to restart the faulting device. The information in the flags field is especially useful in debugging BIOS and firmware problems related to IOMMU hardware and firmware implementation, as well as problems resulting from incompatibilities between the OS, BIOS, and firmware in spec compliance.

  • IO Page Fault (AMD-Vi): Triggered when an IO Page fault occurs on an AMD-Vi system. Format: IOMMU:%s %s iova=0x%016llx flags=0x%04x
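A fault payload in this format could be split into its fields as sketched below. The format string above does not say what the two %s fields hold; this example assumes they are a driver (or bus) name followed by a device name, and the sample line is invented.

```python
import re

# Payload format from above: IOMMU:%s %s iova=0x%016llx flags=0x%04x
FAULT_EVENT = re.compile(r"IOMMU:(\S+) (\S+) iova=0x([0-9a-f]+) flags=0x([0-9a-f]+)")


def parse_fault(line):
    """Return the fault fields as a dict, or None if the line does not match."""
    match = FAULT_EVENT.search(line)
    if match is None:
        return None
    first, second, iova, flags = match.groups()
    return {
        "driver": first,   # assumption: first %s is a driver or bus name
        "device": second,  # assumption: second %s is the device name
        "iova": int(iova, 16),
        "flags": int(flags, 16),
    }


# Hypothetical sample payload, not captured from a real system.
fault = parse_fault("IOMMU:pci 0000:00:02.0 iova=0x00000000cb800000 flags=0x0002")
print(fault)
```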

Error class events are implemented in common IOMMU driver code, Intel, and ARM.

How Can IOMMU Event Tracing Help You?

This article is part one of a two-part series on IOMMU event tracing. This introduction lays the foundation for the second article, which will cover how to use this feature to benefit you the most. Stay tuned to this blog to learn more about IOMMU event tracing!

Author: Shuah Khan

Shuah contributes to multiple aspects of the Linux Kernel, and she maintains the Kernel Selftest framework.
