Intel® Virtualization Technology for Directed I/O: Architecture Specification [Rev. 2.4 ed.]

This document describes the Intel® Virtualization Technology for Directed I/O. Retrieved from https://software.intel.co

English Pages 276 Year 2016

Table of contents:
1 Introduction
1.1 Audience
1.2 Glossary
1.3 References
2 Overview
2.1 Intel® Virtualization Technology Overview
2.2 VMM and Virtual Machines
2.3 Hardware Support for Processor Virtualization
2.4 I/O Virtualization
2.5 Intel® Virtualization Technology For Directed I/O Overview
2.5.1 Hardware Support for DMA Remapping
2.5.1.1 OS Usages of DMA Remapping
2.5.1.2 VMM Usages of DMA Remapping
2.5.1.3 DMA Remapping Usages by Guests
2.5.1.4 Interaction with Processor Virtualization
2.5.2 Hardware Support for Interrupt Remapping
2.5.2.1 Interrupt Isolation
2.5.2.2 Interrupt Migration
2.5.2.3 x2APIC Support
2.5.3 Hardware Support for Interrupt Posting
2.5.3.1 Interrupt Vector Scalability
2.5.3.2 Interrupt Virtualization Efficiency
2.5.3.3 Virtual Interrupt Migration
3 DMA Remapping
3.1 Types of DMA requests
3.2 Domains and Address Translation
3.3 Remapping Hardware - Software View
3.4 Mapping Devices to Domains
3.4.1 Source Identifier
3.4.2 Root-Entry & Extended-Root-Entry
3.4.3 Context-Entry
3.4.4 Extended-Context-Entry
3.5 Hierarchical Translation Structures
3.6 First-Level Translation
3.6.1 Translation Faults
3.6.2 Access Rights
3.6.3 Accessed, Extended Accessed, and Dirty Flags
3.6.4 Snoop Behavior
3.6.5 Memory Typing
3.6.5.1 Selecting Memory Type from Page Attribute Table
3.6.5.2 Selecting Memory Type from Memory Type Range Registers
3.6.5.3 Selecting Effective Memory Type
3.7 Second-Level Translation
3.7.1 Translation Faults
3.7.2 Access Rights
3.7.3 Snoop Behavior
3.7.4 Memory Typing
3.8 Nested Translation
3.8.1 Translation Faults
3.8.2 Access Rights
3.8.3 Snoop Behavior
3.8.4 Memory Typing
June 2016
Order Number: D51397-008
Intel® Virtualization Technology for Directed I/O
Architecture Specification, Rev. 2.4
3.9 Identifying Origination of DMA Requests
3.9.1 Devices Behind PCI-Express to PCI/PCI-X Bridges
3.9.2 Devices Behind Conventional PCI Bridges
3.9.3 Root-Complex Integrated Devices
3.9.4 PCI-Express Devices Using Phantom Functions
3.10 Handling Requests from Processor Graphics Device
3.11 Handling Requests Crossing Page Boundaries
3.12 Handling of Zero-Length Reads
3.13 Handling Requests to Interrupt Address Range
3.14 Handling Requests to Reserved System Memory
3.15 Root-Complex Peer to Peer Considerations
4 Support For Device-TLBs
4.1 Device-TLB Operation
4.1.1 Translation Request
4.1.2 Translation Completion
4.1.3 Translated Request
4.1.4 Invalidation Request & Completion
4.2 Remapping Hardware Handling of Device-TLBs
4.2.1 Handling of ATS Protocol Errors
4.2.2 Root-Port Control of ATS Address Types
4.2.3 Handling of Translation Requests
4.2.3.1 Accessed, Extended Accessed, and Dirty Flags
4.2.3.2 Translation Requests for Multiple Translations
4.2.4 Handling of Translated Requests
4.3 Handling of Device-TLB Invalidations
5 Interrupt Remapping and Interrupt Posting
5.1 Interrupt Remapping
5.1.1 Identifying Origination of Interrupt Requests
5.1.2 Interrupt Request Formats On Intel® 64 Platforms
5.1.2.1 Interrupt Requests in Compatibility Format
5.1.2.2 Interrupt Requests in Remappable Format
5.1.3 Interrupt Remapping Table
5.1.4 Interrupt-Remapping Hardware Operation
5.1.4.1 Interrupt Remapping Fault Conditions
5.1.5 Programming Interrupt Sources To Generate Remappable Interrupts
5.1.5.1 I/OxAPIC Programming
5.1.5.2 MSI and MSI-X Register Programming
5.1.6 Remapping Hardware - Interrupt Programming
5.1.7 Programming in Intel® 64 xAPIC Mode
5.1.8 Programming in Intel® 64 x2APIC Mode
5.1.9 Handling of Platform Events
5.2 Interrupt Posting
5.2.1 Interrupt Remapping Table Support for Interrupt Posting
5.2.2 Posted Interrupt Descriptor
5.2.3 Interrupt-Posting Hardware Operation
5.2.4 Ordering Requirements for Interrupt Posting
5.2.5 Using Interrupt Posting for Virtual Interrupt Delivery
5.2.6 Interrupt Posting for Level Triggered Interrupts
6 Caching Translation Information
6.1 Caching Mode
6.2 Address Translation Caches
6.2.1 Tagging of Cached Translations
6.2.2 Context-cache
6.2.2.1 Context-Entry Programming Considerations
6.2.3 PASID-cache
6.2.4 IOTLB
6.2.4.1 Details of IOTLB Use
6.2.4.2 Global Pages
6.2.5 Caches for Paging Structures
6.2.5.1 PML4-cache
6.2.5.2 PDPE-cache
6.2.5.3 PDE-cache
6.2.5.4 Details of Paging-Structure Cache Use
6.2.6 Using the Paging-Structure Caches to Translate Requests
6.2.7 Multiple Cached Entries for a Single Paging-Structure Entry
6.3 Translation Caching at Endpoint Device
6.4 Interrupt Entry Cache
6.5 Invalidation of Translation Caches
6.5.1 Register-based Invalidation Interface
6.5.1.1 Context Command Register
6.5.1.2 IOTLB Registers
6.5.2 Queued Invalidation Interface
6.5.2.1 Context-cache Invalidate Descriptor
6.5.2.2 PASID-cache Invalidate Descriptor
6.5.2.3 IOTLB Invalidate Descriptor
6.5.2.4 Extended IOTLB Invalidate Descriptor
6.5.2.5 Device-TLB Invalidate Descriptor
6.5.2.6 Extended Device-TLB Invalidate Descriptor
6.5.2.7 Interrupt Entry Cache Invalidate Descriptor
6.5.2.8 Invalidation Wait Descriptor
6.5.2.9 Hardware Generation of Invalidation Completion Events
6.5.2.10 Hardware Handling of Queued Invalidation Interface Errors
6.5.2.11 Queued Invalidation Ordering Considerations
6.5.3 IOTLB Invalidation Considerations
6.5.3.1 Implicit Invalidation on Page Requests
6.5.3.2 Caching Fractured Translations
6.5.3.3 Recommended Invalidation
6.5.3.4 Optional Invalidation
6.5.3.5 Delayed Invalidation
6.5.4 TLB Shootdown Optimization for Root-Complex Integrated Devices
6.5.4.1 Deferred Invalidation
6.5.4.2 PASID-State Table
6.5.4.3 Remapping Hardware Handling of PASID State-Update Requests
6.5.4.4 Root-Complex Integrated Device Handling of PASID State-Update Responses
6.5.4.5 Ordering of PASID State-Update Requests and Responses
6.5.4.6 Example TLB Shootdown using Deferred Invalidations
6.5.5 Draining of Requests to Memory
6.5.6 Interrupt Draining
6.6 Set Root Table Pointer Operation
6.7 Set Interrupt Remapping Table Pointer Operation
6.8 Write Buffer Flushing
6.9 Hardware Register Programming Considerations
6.10 Sharing Remapping Structures Across Hardware Units
7 Translation Faults
7.1 Interrupt Translation Faults
7.2 Address Translation Faults
7.2.1 Non-Recoverable Address Translation Faults
7.2.1.1 Non-Recoverable Faults for Untranslated Requests Without PASID
7.2.1.2 Non-Recoverable Faults for Untranslated Requests With PASID
7.2.1.3 Non-Recoverable Faults for Translation Requests Without PASID
7.2.1.4 Non-Recoverable Faults for Translation Requests With PASID
7.2.1.5 Non-Recoverable Faults for Translated Requests
7.2.2 Recoverable Address Translation Faults
7.3 Non-Recoverable Fault Reporting
7.3.1 Primary Fault Logging
7.3.2 Advanced Fault Logging
7.4 Non-Recoverable Fault Event
7.5 Recoverable Fault Reporting
7.5.1 Handling of Page Requests
7.5.1.1 Page Request Descriptor
7.6 Recoverable Fault Event
7.7 Servicing Recoverable Faults
7.7.1 Page Group Response Descriptor
7.7.2 Page Stream Response Descriptor
7.8 Page Request Ordering and Draining
7.9 Page Response Ordering and Draining
7.10 Pending Page Request Handling on Terminal Conditions
7.11 Software Steps to Drain Page Requests & Responses
7.12 Revoking PASIDs with Pending Page Faults
8 BIOS Considerations
8.1 DMA Remapping Reporting Structure
8.2 Remapping Structure Types
8.3 DMA Remapping Hardware Unit Definition Structure
8.3.1 Device Scope Structure
8.3.1.1 Reporting Scope for I/OxAPICs
8.3.1.2 Reporting Scope for MSI Capable HPET Timer Block
8.3.1.3 Reporting Scope for ACPI Name-space Devices
8.3.1.4 Device Scope Example
8.3.2 Implications for ARI
8.3.3 Implications for SR-IOV
8.3.4 Implications for PCI/PCI-Express Hot Plug
8.3.5 Implications with PCI Resource Rebalancing
8.3.6 Implications with Provisioning PCI BAR Resources
8.4 Reserved Memory Region Reporting Structure
8.5 Root Port ATS Capability Reporting Structure
8.6 Remapping Hardware Static Affinity Structure
8.7 ACPI Name-space Device Declaration Structure
8.8 Remapping Hardware Unit Hot Plug
8.8.1 ACPI Name Space Mapping
8.8.2 ACPI Sample Code
8.8.3 Example Remapping Hardware Reporting Sequence
9 Translation Structure Formats
9.1 Root Entry
9.2 Extended Root Entry
9.3 Context Entry
9.4 Extended-Context-Entry
9.5 PASID Entry
9.6 PASID-State Entry
9.7 First-Level Paging Entries
9.8 Second-Level Paging Entries
9.9 Fault Record
9.10 Interrupt Remapping Table Entry (IRTE) for Remapped Interrupts
9.11 Interrupt Remapping Table Entry (IRTE) for Posted Interrupts
9.12 Posted Interrupt Descriptor (PID)
10 Register Descriptions
10.1 Register Location
10.2 Software Access to Registers
10.3 Register Attributes
10.4 Register Descriptions
10.4.1 Version Register
10.4.2 Capability Register
10.4.3 Extended Capability Register
10.4.4 Global Command Register
10.4.5 Global Status Register
10.4.6 Root Table Address Register
10.4.7 Context Command Register
10.4.8 IOTLB Registers
10.4.8.1 IOTLB Invalidate Register
10.4.8.2 Invalidate Address Register
10.4.9 Fault Status Register
10.4.10 Fault Event Control Register
10.4.11 Fault Event Data Register
10.4.12 Fault Event Address Register
10.4.13 Fault Event Upper Address Register
10.4.14 Fault Recording Registers [n]
10.4.15 Advanced Fault Log Register
10.4.16 Protected Memory Enable Register
10.4.17 Protected Low-Memory Base Register
10.4.18 Protected Low-Memory Limit Register
10.4.19 Protected High-Memory Base Register
10.4.20 Protected High-Memory Limit Register
10.4.21 Invalidation Queue Head Register
10.4.22 Invalidation Queue Tail Register
10.4.23 Invalidation Queue Address Register
10.4.24 Invalidation Completion Status Register
10.4.25 Invalidation Event Control Register
10.4.26 Invalidation Event Data Register
10.4.27 Invalidation Event Address Register
10.4.28 Invalidation Event Upper Address Register
10.4.29 Interrupt Remapping Table Address Register
10.4.30 Page Request Queue Head Register
10.4.31 Page Request Queue Tail Register
10.4.32 Page Request Queue Address Register
10.4.33 Page Request Status Register
10.4.34 Page Request Event Control Register
10.4.35 Page Request Event Data Register
10.4.36 Page Request Event Address Register
10.4.37 Page Request Event Upper Address Register
10.4.38 MTRR Capability Register
10.4.39 MTRR Default Type Register
10.4.40 Fixed-Range MTRRs
10.4.41 Variable-Range MTRRs
A Non-Recoverable Fault Reason Encodings
Figures
Figure 1-1. General Platform Topology
Figure 2-2. Example OS Usage of DMA Remapping
Figure 2-3. Example Virtualization Usage of DMA Remapping
Figure 2-4. Interaction Between I/O and Processor Virtualization
Figure 3-5. DMA Address Translation
Figure 3-6. Requester Identifier Format
Figure 3-7. Device to Domain Mapping Structures using Root-Table
Figure 3-8. Device to Domain Mapping Structures using Extended-Root-Table
Figure 3-9. Address Translation to a 4-KByte Page
Figure 3-10. Address Translation to a 2-MByte Large Page
Figure 3-11. Address Translation to a 1-GByte Large Page
Figure 3-12. Nested Translation with 4-KByte pages
Figure 4-13. Device-TLB Operation
Figure 5-14. Compatibility Format Interrupt Request
Figure 5-15. Remappable Format Interrupt Request
Figure 5-16. I/OxAPIC RTE Programming
Figure 5-17. MSI-X Programming
Figure 5-18. Remapping Hardware Interrupt Programming in Intel® 64 xAPIC Mode
Figure 5-19. Remapping Hardware Interrupt Programming in Intel® 64 x2APIC Mode
Figure 6-20. Context-cache Invalidate Descriptor
Figure 6-21. PASID-cache Invalidate Descriptor
Figure 6-22. IOTLB Invalidate Descriptor
Figure 6-23. Extended IOTLB Invalidate Descriptor
Figure 6-24. Device-TLB Invalidate Descriptor
Figure 6-25. Extended Device-TLB Invalidate Descriptor
Figure 6-26. Interrupt Entry Cache Invalidate Descriptor
Figure 6-27. Invalidation Wait Descriptor
Figure 7-28. Page Request Descriptor
Figure 7-29. Page Group Response Descriptor
Figure 7-30. Page Stream Response Descriptor
Figure 8-31. Hypothetical Platform Configuration
Figure 9-32. Root-Entry Format
Figure 9-33. Extended-Root-Entry Format
Figure 9-34. Context-Entry Format
Figure 9-35. Extended-Context-Entry Format
Figure 9-36. PASID Entry Format
Figure 9-37. PASID-State Entry Format
Figure 9-38. Format for First-Level Paging Entries
Figure 9-39. Format for Second-Level Paging Entries
Figure 9-40. Fault-Record Format
Figure 9-41. Interrupt Remap Table Entry Format for Remapped Interrupts
Figure 9-42. Interrupt Remap Table Entry Format for Posted Interrupts
Figure 9-43. Posted Interrupt Descriptor Format
Figure 10-44. Version Register
Figure 10-45. Capability Register
Figure 10-46. Extended Capability Register
Figure 10-47. Global Command Register
Figure 10-48. Global Status Register
Figure 10-49. Root Table Address Register
Figure 10-50. Context Command Register
Figure 10-51. IOTLB Invalidate Register
Figure 10-52. Invalidate Address Register
Figure 10-53. Fault Status Register
Figure 10-54. Fault Event Control Register
Figure 10-55. Fault Event Data Register
Figure 10-56. Fault Event Address Register
Figure 10-57. Fault Event Upper Address Register
Figure 10-58. Fault Recording Register
Figure 10-59. Advanced Fault Log Register
Figure 10-60. Protected Memory Enable Register
Figure 10-61. Protected Low-Memory Base Register
Figure 10-62. Protected Low-Memory Limit Register
Figure 10-63. Protected High-Memory Base Register
Figure 10-64. Protected High-Memory Limit Register
Figure 10-65. Invalidation Queue Head Register
Figure 10-66. Invalidation Queue Tail Register
Figure 10-67. Invalidation Queue Address Register
Figure 10-68. Invalidation Completion Status Register
Figure 10-69. Invalidation Event Control Register
Figure 10-70. Invalidation Event Data Register
Figure 10-71. Invalidation Event Address Register
Figure 10-72. Invalidation Event Upper Address Register
Figure 10-73. Interrupt Remapping Table Address Register
Figure 10-74. Page Request Queue Head Register
Figure 10-75. Page Request Queue Tail Register
Figure 10-76. Page Request Queue Address Register
Figure 10-77. Page Request Status Register
Figure 10-78. Page Request Event Control Register
Figure 10-79. Page Request Event Data Register
Figure 10-80. Page Request Event Address Register
Figure 10-81. Page Request Event Upper Address Register
Figure 10-82. MTRR Capability Register
Figure 10-83. MTRR Default Type Register
Figure 10-84. Fixed-Range MTRR Format
Figure 10-85. Variable-Range MTRR Format
Tables
Table 1. Glossary
Table 2. References
Table 3. First-level Paging Structures
Table 4. Effective Memory Types
Table 5. Second-level Paging Structures
Table 6. Address Fields in Remappable Interrupt Request Format
Table 7. Data Fields in Remappable Interrupt Request Format
Table 8. Interrupt Remapping Fault Conditions
Table 9. Index Mask Programming
Table 10. Interrupt Remapping Fault Conditions
Table 11. Non-Recoverable Faults for Untranslated Requests Without PASID
Table 12. Non-Recoverable Faults for Untranslated Requests With PASID
Table 13. Non-Recoverable Faults For Translation Requests Without PASID
Table 14. Non-Recoverable Faults For Translation Requests With PASID
Table 15. Non-Recoverable Faults For Translated Requests
Table 16. Recoverable Fault Conditions For Translation Requests
Table 17. Response Codes
Table 18. Format of PML4E that references a Page-Directory-Pointer Table
Table 19. Format of PDPE that maps a 1-GByte Page
Table 20. Format of PDPE that references a Page-Directory Table
Table 21. Format of PDE that maps a 2-MByte Page
Table 22. Format of PDE that references a Page Table
Table 23. Format of PTE that maps a 4-KByte Page
Table 24. Format of SL-PML4E referencing a Second-Level-Page-Directory-Pointer Table
Table 25. Format of SL-PDPE that maps a 1-GByte Page
Table 26. Format of SL-PDPE that references a Second-Level-Page-Directory
Table 27. Format of SL-PDE that maps to a 2-MByte Page
Table 28. Format of SL-PDE that references a Second-Level-Page Table
Table 29. Format of SL-PTE that maps 4-KByte Page
Table 30. Address Mapping for Fixed-Range MTRRs
Intel® Virtualization Technology for Directed I/O Architecture Specification

June 2016

Order Number: D51397-008, Rev. 2.4

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS.

Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.

The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm

This document contains information on products in the design phase of development.

Intel® 64 architecture requires a system with a 64-bit enabled processor, chipset, BIOS and software. Performance will vary depending on the specific hardware and software you use. Consult your PC manufacturer for more information. For more information, visit http://www.intel.com/info/em64t

Intel® Virtualization Technology requires a computer system with an enabled Intel® processor, BIOS, and virtual machine monitor (VMM). Functionality, performance or other benefits will vary depending on hardware and software configurations. Software applications may not be compatible with all operating systems. Consult your PC manufacturer. For more information, visit http://www.intel.com/go/virtualization

Copyright © 2011-2016, Intel Corporation. All Rights Reserved.

Intel and Itanium are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.


Contents

1 Introduction
1.1 Audience
1.2 Glossary
1.3 References

2 Overview
2.1 Intel® Virtualization Technology Overview
2.2 VMM and Virtual Machines
2.3 Hardware Support for Processor Virtualization
2.4 I/O Virtualization
2.5 Intel® Virtualization Technology For Directed I/O Overview
2.5.1 Hardware Support for DMA Remapping
2.5.1.1 OS Usages of DMA Remapping
2.5.1.2 VMM Usages of DMA Remapping
2.5.1.3 DMA Remapping Usages by Guests
2.5.1.4 Interaction with Processor Virtualization
2.5.2 Hardware Support for Interrupt Remapping
2.5.2.1 Interrupt Isolation
2.5.2.2 Interrupt Migration
2.5.2.3 x2APIC Support
2.5.3 Hardware Support for Interrupt Posting
2.5.3.1 Interrupt Vector Scalability
2.5.3.2 Interrupt Virtualization Efficiency
2.5.3.3 Virtual Interrupt Migration

3 DMA Remapping
3.1 Types of DMA requests
3.2 Domains and Address Translation
3.3 Remapping Hardware - Software View
3.4 Mapping Devices to Domains
3.4.1 Source Identifier
3.4.2 Root-Entry & Extended-Root-Entry
3.4.3 Context-Entry
3.4.4 Extended-Context-Entry
3.5 Hierarchical Translation Structures
3.6 First-Level Translation
3.6.1 Translation Faults
3.6.2 Access Rights
3.6.3 Accessed, Extended Accessed, and Dirty Flags
3.6.4 Snoop Behavior
3.6.5 Memory Typing
3.6.5.1 Selecting Memory Type from Page Attribute Table
3.6.5.2 Selecting Memory Type from Memory Type Range Registers
3.6.5.3 Selecting Effective Memory Type
3.7 Second-Level Translation
3.7.1 Translation Faults
3.7.2 Access Rights
3.7.3 Snoop Behavior
3.7.4 Memory Typing
3.8 Nested Translation
3.8.1 Translation Faults
3.8.2 Access Rights
3.8.3 Snoop Behavior
3.8.4 Memory Typing
3.9 Identifying Origination of DMA Requests
3.9.1 Devices Behind PCI-Express to PCI/PCI-X Bridges
3.9.2 Devices Behind Conventional PCI Bridges
3.9.3 Root-Complex Integrated Devices
3.9.4 PCI-Express Devices Using Phantom Functions
3.10 Handling Requests from Processor Graphics Device
3.11 Handling Requests Crossing Page Boundaries
3.12 Handling of Zero-Length Reads
3.13 Handling Requests to Interrupt Address Range
3.14 Handling Requests to Reserved System Memory
3.15 Root-Complex Peer to Peer Considerations

4 Support For Device-TLBs
4.1 Device-TLB Operation
4.1.1 Translation Request
4.1.2 Translation Completion
4.1.3 Translated Request
4.1.4 Invalidation Request & Completion
4.2 Remapping Hardware Handling of Device-TLBs
4.2.1 Handling of ATS Protocol Errors
4.2.2 Root-Port Control of ATS Address Types
4.2.3 Handling of Translation Requests
4.2.3.1 Accessed, Extended Accessed, and Dirty Flags
4.2.3.2 Translation Requests for Multiple Translations
4.2.4 Handling of Translated Requests
4.3 Handling of Device-TLB Invalidations

5 Interrupt Remapping and Interrupt Posting
5.1 Interrupt Remapping
5.1.1 Identifying Origination of Interrupt Requests
5.1.2 Interrupt Request Formats On Intel® 64 Platforms
5.1.2.1 Interrupt Requests in Compatibility Format
5.1.2.2 Interrupt Requests in Remappable Format
5.1.3 Interrupt Remapping Table
5.1.4 Interrupt-Remapping Hardware Operation
5.1.4.1 Interrupt Remapping Fault Conditions
5.1.5 Programming Interrupt Sources To Generate Remappable Interrupts
5.1.5.1 I/OxAPIC Programming
5.1.5.2 MSI and MSI-X Register Programming
5.1.6 Remapping Hardware - Interrupt Programming
5.1.7 Programming in Intel® 64 xAPIC Mode
5.1.8 Programming in Intel® 64 x2APIC Mode
5.1.9 Handling of Platform Events
5.2 Interrupt Posting
5.2.1 Interrupt Remapping Table Support for Interrupt Posting
5.2.2 Posted Interrupt Descriptor
5.2.3 Interrupt-Posting Hardware Operation
5.2.4 Ordering Requirements for Interrupt Posting
5.2.5 Using Interrupt Posting for Virtual Interrupt Delivery
5.2.6 Interrupt Posting for Level Triggered Interrupts

6 Caching Translation Information
6.1 Caching Mode
6.2 Address Translation Caches
6.2.1 Tagging of Cached Translations
6.2.2 Context-cache
6.2.2.1 Context-Entry Programming Considerations
6.2.3 PASID-cache
6.2.4 IOTLB
6.2.4.1 Details of IOTLB Use
6.2.4.2 Global Pages
6.2.5 Caches for Paging Structures
6.2.5.1 PML4-cache
6.2.5.2 PDPE-cache
6.2.5.3 PDE-cache
6.2.5.4 Details of Paging-Structure Cache Use
6.2.6 Using the Paging-Structure Caches to Translate Requests
6.2.7 Multiple Cached Entries for a Single Paging-Structure Entry
6.3 Translation Caching at Endpoint Device
6.4 Interrupt Entry Cache
6.5 Invalidation of Translation Caches
6.5.1 Register-based Invalidation Interface
6.5.1.1 Context Command Register
6.5.1.2 IOTLB Registers
6.5.2 Queued Invalidation Interface
6.5.2.1 Context-cache Invalidate Descriptor
6.5.2.2 PASID-cache Invalidate Descriptor
6.5.2.3 IOTLB Invalidate Descriptor
6.5.2.4 Extended IOTLB Invalidate Descriptor
6.5.2.5 Device-TLB Invalidate Descriptor
6.5.2.6 Extended Device-TLB Invalidate Descriptor
6.5.2.7 Interrupt Entry Cache Invalidate Descriptor
6.5.2.8 Invalidation Wait Descriptor
6.5.2.9 Hardware Generation of Invalidation Completion Events
6.5.2.10 Hardware Handling of Queued Invalidation Interface Errors
6.5.2.11 Queued Invalidation Ordering Considerations
6.5.3 IOTLB Invalidation Considerations
6.5.3.1 Implicit Invalidation on Page Requests
6.5.3.2 Caching Fractured Translations
6.5.3.3 Recommended Invalidation
6.5.3.4 Optional Invalidation
6.5.3.5 Delayed Invalidation
6.5.4 TLB Shootdown Optimization for Root-Complex Integrated Devices
6.5.4.1 Deferred Invalidation
6.5.4.2 PASID-State Table
6.5.4.3 Remapping Hardware Handling of PASID State-Update Requests
6.5.4.4 Root-Complex Integrated Device Handling of PASID State-Update Responses
6.5.4.5 Ordering of PASID State-Update Requests and Responses
6.5.4.6 Example TLB Shootdown using Deferred Invalidations
6.5.5 Draining of Requests to Memory
6.5.6 Interrupt Draining
6.6 Set Root Table Pointer Operation
6.7 Set Interrupt Remapping Table Pointer Operation
6.8 Write Buffer Flushing
6.9 Hardware Register Programming Considerations
6.10 Sharing Remapping Structures Across Hardware Units

7 Translation Faults
7.1 Interrupt Translation Faults
7.2 Address Translation Faults
7.2.1 Non-Recoverable Address Translation Faults
7.2.1.1 Non-Recoverable Faults for Untranslated Requests Without PASID
7.2.1.2 Non-Recoverable Faults for Untranslated Requests With PASID
7.2.1.3 Non-Recoverable Faults for Translation Requests Without PASID
7.2.1.4 Non-Recoverable Faults for Translation Requests With PASID
7.2.1.5 Non-Recoverable Faults for Translated Requests
7.2.2 Recoverable Address Translation Faults
7.3 Non-Recoverable Fault Reporting
7.3.1 Primary Fault Logging
7.3.2 Advanced Fault Logging
7.4 Non-Recoverable Fault Event
7.5 Recoverable Fault Reporting
7.5.1 Handling of Page Requests
7.5.1.1 Page Request Descriptor
7.6 Recoverable Fault Event
7.7 Servicing Recoverable Faults
7.7.1 Page Group Response Descriptor
7.7.2 Page Stream Response Descriptor
7.8 Page Request Ordering and Draining
7.9 Page Response Ordering and Draining
7.10 Pending Page Request Handling on Terminal Conditions
7.11 Software Steps to Drain Page Requests & Responses
7.12 Revoking PASIDs with Pending Page Faults

8 BIOS Considerations
8.1 DMA Remapping Reporting Structure
8.2 Remapping Structure Types
8.3 DMA Remapping Hardware Unit Definition Structure
8.3.1 Device Scope Structure
8.3.1.1 Reporting Scope for I/OxAPICs
8.3.1.2 Reporting Scope for MSI Capable HPET Timer Block
8.3.1.3 Reporting Scope for ACPI Name-space Devices
8.3.1.4 Device Scope Example
8.3.2 Implications for ARI
8.3.3 Implications for SR-IOV
8.3.4 Implications for PCI/PCI-Express Hot Plug
8.3.5 Implications with PCI Resource Rebalancing
8.3.6 Implications with Provisioning PCI BAR Resources
8.4 Reserved Memory Region Reporting Structure
8.5 Root Port ATS Capability Reporting Structure
8.6 Remapping Hardware Static Affinity Structure
8.7 ACPI Name-space Device Declaration Structure
8.8 Remapping Hardware Unit Hot Plug
8.8.1 ACPI Name Space Mapping
8.8.2 ACPI Sample Code
8.8.3 Example Remapping Hardware Reporting Sequence

9

Translat ion St ruct ure Form at s 9.1 Root Ent ry ..................................................................................................... 9- 1 9.2 Ext ended Root Ent ry ....................................................................................... 9- 3 9.3 Cont ext Ent ry ................................................................................................. 9- 5 9.4 Ext ended- Cont ext - Ent ry .................................................................................. 9- 8 9.5 PASI D Ent ry .................................................................................................. 9- 15 9.6 PASI D- St at e Ent ry ......................................................................................... 9- 17 9.7 First - Level Paging Ent ries................................................................................ 9- 18 9.8 Second- Level Paging Ent ries............................................................................ 9- 25 9.9 Fault Record ................................................................................................. 9- 32 9.10 I nt errupt Rem apping Table Ent ry ( I RTE) for Rem apped I nt errupt s ........................ 9- 34

Intel® Virtualization Technology for Directed I/O Architecture Specification, Rev. 2.4

June 2016, Order Number: D51397-008


  9.11 Interrupt Remapping Table Entry (IRTE) for Posted Interrupts
  9.12 Posted Interrupt Descriptor (PID)

10 Register Descriptions
  10.1 Register Location
  10.2 Software Access to Registers
  10.3 Register Attributes
  10.4 Register Descriptions
    10.4.1 Version Register
    10.4.2 Capability Register
    10.4.3 Extended Capability Register
    10.4.4 Global Command Register
    10.4.5 Global Status Register
    10.4.6 Root Table Address Register
    10.4.7 Context Command Register
    10.4.8 IOTLB Registers
      10.4.8.1 IOTLB Invalidate Register
      10.4.8.2 Invalidate Address Register
    10.4.9 Fault Status Register
    10.4.10 Fault Event Control Register
    10.4.11 Fault Event Data Register
    10.4.12 Fault Event Address Register
    10.4.13 Fault Event Upper Address Register
    10.4.14 Fault Recording Registers [n]
    10.4.15 Advanced Fault Log Register
    10.4.16 Protected Memory Enable Register
    10.4.17 Protected Low-Memory Base Register
    10.4.18 Protected Low-Memory Limit Register
    10.4.19 Protected High-Memory Base Register
    10.4.20 Protected High-Memory Limit Register
    10.4.21 Invalidation Queue Head Register
    10.4.22 Invalidation Queue Tail Register
    10.4.23 Invalidation Queue Address Register
    10.4.24 Invalidation Completion Status Register
    10.4.25 Invalidation Event Control Register
    10.4.26 Invalidation Event Data Register
    10.4.27 Invalidation Event Address Register
    10.4.28 Invalidation Event Upper Address Register
    10.4.29 Interrupt Remapping Table Address Register
    10.4.30 Page Request Queue Head Register
    10.4.31 Page Request Queue Tail Register
    10.4.32 Page Request Queue Address Register
    10.4.33 Page Request Status Register
    10.4.34 Page Request Event Control Register
    10.4.35 Page Request Event Data Register
    10.4.36 Page Request Event Address Register
    10.4.37 Page Request Event Upper Address Register
    10.4.38 MTRR Capability Register
    10.4.39 MTRR Default Type Register
    10.4.40 Fixed-Range MTRRs
    10.4.41 Variable-Range MTRRs

A Non-Recoverable Fault Reason Encodings


Figures

Figure 1-1. General Platform Topology
Figure 2-2. Example OS Usage of DMA Remapping
Figure 2-3. Example Virtualization Usage of DMA Remapping
Figure 2-4. Interaction Between I/O and Processor Virtualization
Figure 3-5. DMA Address Translation
Figure 3-6. Requester Identifier Format
Figure 3-7. Device to Domain Mapping Structures using Root-Table
Figure 3-8. Device to Domain Mapping Structures using Extended-Root-Table
Figure 3-9. Address Translation to a 4-KByte Page
Figure 3-10. Address Translation to a 2-MByte Large Page
Figure 3-11. Address Translation to a 1-GByte Large Page
Figure 3-12. Nested Translation with 4-KByte pages
Figure 4-13. Device-TLB Operation
Figure 5-14. Compatibility Format Interrupt Request
Figure 5-15. Remappable Format Interrupt Request
Figure 5-16. I/OxAPIC RTE Programming
Figure 5-17. MSI-X Programming
Figure 5-18. Remapping Hardware Interrupt Programming in Intel® 64 xAPIC Mode
Figure 5-19. Remapping Hardware Interrupt Programming in Intel® 64 x2APIC Mode
Figure 6-20. Context-cache Invalidate Descriptor
Figure 6-21. PASID-cache Invalidate Descriptor
Figure 6-22. IOTLB Invalidate Descriptor
Figure 6-23. Extended IOTLB Invalidate Descriptor
Figure 6-24. Device-TLB Invalidate Descriptor
Figure 6-25. Extended Device-TLB Invalidate Descriptor
Figure 6-26. Interrupt Entry Cache Invalidate Descriptor
Figure 6-27. Invalidation Wait Descriptor
Figure 7-28. Page Request Descriptor
Figure 7-29. Page Group Response Descriptor
Figure 7-30. Page Stream Response Descriptor
Figure 8-31. Hypothetical Platform Configuration
Figure 9-32. Root-Entry Format
Figure 9-33. Extended-Root-Entry Format
Figure 9-34. Context-Entry Format
Figure 9-35. Extended-Context-Entry Format
Figure 9-36. PASID Entry Format
Figure 9-37. PASID-State Entry Format
Figure 9-38. Format for First-Level Paging Entries
Figure 9-39. Format for Second-Level Paging Entries
Figure 9-40. Fault-Record Format
Figure 9-41. Interrupt Remap Table Entry Format for Remapped Interrupts
Figure 9-42. Interrupt Remap Table Entry Format for Posted Interrupts
Figure 9-43. Posted Interrupt Descriptor Format
Figure 10-44. Version Register
Figure 10-45. Capability Register
Figure 10-46. Extended Capability Register
Figure 10-47. Global Command Register
Figure 10-48. Global Status Register
Figure 10-49. Root Table Address Register
Figure 10-50. Context Command Register
Figure 10-51. IOTLB Invalidate Register
Figure 10-52. Invalidate Address Register
Figure 10-53. Fault Status Register


Figure 10-54. Fault Event Control Register
Figure 10-55. Fault Event Data Register
Figure 10-56. Fault Event Address Register
Figure 10-57. Fault Event Upper Address Register
Figure 10-58. Fault Recording Register
Figure 10-59. Advanced Fault Log Register
Figure 10-60. Protected Memory Enable Register
Figure 10-61. Protected Low-Memory Base Register
Figure 10-62. Protected Low-Memory Limit Register
Figure 10-63. Protected High-Memory Base Register
Figure 10-64. Protected High-Memory Limit Register
Figure 10-65. Invalidation Queue Head Register
Figure 10-66. Invalidation Queue Tail Register
Figure 10-67. Invalidation Queue Address Register
Figure 10-68. Invalidation Completion Status Register
Figure 10-69. Invalidation Event Control Register
Figure 10-70. Invalidation Event Data Register
Figure 10-71. Invalidation Event Address Register
Figure 10-72. Invalidation Event Upper Address Register
Figure 10-73. Interrupt Remapping Table Address Register
Figure 10-74. Page Request Queue Head Register
Figure 10-75. Page Request Queue Tail Register
Figure 10-76. Page Request Queue Address Register
Figure 10-77. Page Request Status Register
Figure 10-78. Page Request Event Control Register
Figure 10-79. Page Request Event Data Register
Figure 10-80. Page Request Event Address Register
Figure 10-81. Page Request Event Upper Address Register
Figure 10-82. MTRR Capability Register
Figure 10-83. MTRR Default Type Register
Figure 10-84. Fixed-Range MTRR Format
Figure 10-85. Variable-Range MTRR Format


Tables

Table 1. Glossary
Table 2. References
Table 3. First-level Paging Structures
Table 4. Effective Memory Types
Table 5. Second-level Paging Structures
Table 6. Address Fields in Remappable Interrupt Request Format
Table 7. Data Fields in Remappable Interrupt Request Format
Table 8. Interrupt Remapping Fault Conditions
Table 9. Index Mask Programming
Table 10. Interrupt Remapping Fault Conditions
Table 11. Non-Recoverable Faults for Untranslated Requests Without PASID
Table 12. Non-Recoverable Faults for Untranslated Requests With PASID
Table 13. Non-Recoverable Faults For Translation Requests Without PASID
Table 14. Non-Recoverable Faults For Translation Requests With PASID
Table 15. Non-Recoverable Faults For Translated Requests
Table 16. Recoverable Fault Conditions For Translation Requests
Table 17. Response Codes
Table 18. Format of PML4E that references a Page-Directory-Pointer Table
Table 19. Format of PDPE that maps a 1-GByte Page
Table 20. Format of PDPE that references a Page-Directory Table
Table 21. Format of PDE that maps a 2-MByte Page
Table 22. Format of PDE that references a Page Table
Table 23. Format of PTE that maps a 4-KByte Page
Table 24. Format of SL-PML4E referencing a Second-Level-Page-Directory-Pointer Table
Table 25. Format of SL-PDPE that maps a 1-GByte Page
Table 26. Format of SL-PDPE that references a Second-Level-Page-Directory
Table 27. Format of SL-PDE that maps to a 2-MByte Page
Table 28. Format of SL-PDE that references a Second-Level-Page Table
Table 29. Format of SL-PTE that maps 4-KByte Page
Table 30. Address Mapping for Fixed-Range MTRRs


Revision History

Date | Revision | Description
March 2006 | Draft | Preliminary Draft Specification
May 2007 | 1.0 | 1.0 Specification
September 2007 | 1.1 | Specification update for x2APIC support
September 2008 | 1.2 |
February 2011 | 1.3 |
January 2012 | 2.0 |
August 2013 | 2.1 |
September 2013 | 2.2 |
October 2014 | 2.3 |
June 2016 | 2.4 |

Description entries for revisions 1.2 through 2.4:

• Updated chapter 8 (BIOS requirements) to comprehend platforms with ACPI devices capable of generating DMA requests (such as Low Power Subsystem (LPSS) on client platforms).
• Extended page group request with a stream response requested flag to request stream responses for page requests except the last request in group.
• Added a Blocked-On-Fault field to page requests requesting stream response as a hint to indicate the respective fault caused a blocking condition on the endpoint device.
• Clarified hardware behavior on page requests received when page request queue is full.
• Fixed documentation errors; added BIOS support to report X2APIC_OPT_OUT.
• Miscellaneous documentation fixes/clarifications, including BIOS support for NUMA, hot plug.
• Added support for Shared Virtual Memory (SVM) capability.
• Fixed ANDD structure definition in DMAR ACPI table to support 2-byte length field.
• Fixed invalidation granularity encoding for extended IOTLB invalidation descriptor.
• Updated bit positions of fields in PASID-State table entry.
• Added support for Interrupt Posting capability.
• Clarified specific registers whose read completions are required to drain various types of interrupt requests generated by the remapping hardware.
• Fixed typo in effective memory-type computation for first-level paging entry accesses when nested translations are enabled with Extended Memory Type disabled in second-level translation tables.
• Fixed Page Request Status Register and Page Request Event Control Register descriptions to clarify that queueing of any page_req_desc in the page request queue results in hardware setting the Pending Page Request (PPR) field.
• Fixed Supervisor Request Enable (SRE) field location from Extended-context-entry to PASID-entry, to distinguish privileged versus non-privileged PASIDs of a device.
• Fixed Extended Access Flag Enable (EAFE) field location from PASID-entry to Extended-Context-entry.
• Relaxed context-entry programming considerations to clarify software requirement to ensure self-consistency when modifying present root, extended-root, context or extended-context entries.
• Reserved Translation Type (TT) field encoding of 110b and 111b in extended-context entries (previously documented incorrectly as PASID-only translation types).
• Fixed location of PASID Support enumeration field in ECAP_REG from bit 28 to bit 40.
• Fixed typo in Section 4.2.3 to clarify that for translation-requests-with-PASID with PR=1, remapping hardware supporting supervisor-requests (SRS=1) return PRIV bit as always 1. Previous versions of the spec. incorrectly specified hardware returning PRIV bit as 1 only if the U/S field is 0 in at least one of the first-level paging-structure entries controlling the translation.
• Clarified the ordering requirement to be followed by remapping hardware on page request descriptor writes and recoverable fault reporting event interrupt.
• Updated Chapter 6 to include Device-TLB invalidation throttling support for SR-IOV devices. New Device-TLB Invalidation Throttling (DIT) capability field added to ECAP_REG.
• Updated Chapter 6 to include a new Page-request Drain (PD) flag in inv_wait_dsc for page request draining.
• Updated Chapter 7 to include details on page request and page response ordering and draining, including handling of terminal conditions on device with pending page faults.
• Added ECAP_REG capability fields to report support for Device-TLB invalidation throttling and page-request draining.
• Clarified Caching Mode (CM=1) behavior to indicate that the reserved Domain-ID of 0 is used only for context-cache and rest of the caching structures follow same tagging for cached entries for CM=0 and CM=1 (including for cached faulting entries when CM=1).


1 Introduction

This document describes the Intel® Virtualization Technology for Directed I/O ("Intel® VT for Directed I/O"); specifically, it describes the components supporting I/O virtualization as it applies to platforms that use Intel® processors and core logic chipsets complying with Intel® platform specifications. Figure 1-1 illustrates the general platform topology.

Figure 1-1. General Platform Topology (diagram labels: Processor, Processor, System Bus, North Bridge, DRAM, DMA & Interrupt Remapping, Integrated Devices, PCI Express Devices, South Bridge, PCI, LPC, Legacy devices)

The document includes the following topics:

• An overview of I/O subsystem hardware functions for virtualization support
• A brief overview of expected usages of the generalized hardware functions
• The theory of operation of hardware, including the programming interface

The following topics are not covered (or are covered in a limited context):

• Intel® Virtualization Technology for Intel® 64 Architecture. For more information, refer to the "Intel® 64 Architecture Software Developer's Manual, Volume 3B: System Programming Guide".

1.1 Audience

This document is aimed at hardware designers developing Intel platforms or core-logic providing hardware support for virtualization. The document is also expected to be used by Operating System (OS) and Virtual Machine Monitor (VMM) developers utilizing the I/O virtualization hardware functions.


1.2 Glossary

The document uses the terms listed in the following table.

Table 1. Glossary

Term | Definition
Context | A hardware representation of state that identifies a device and the domain to which the device is assigned.
Context cache | Remapping hardware cache that stores device to domain mappings.
Device-TLB | A translation cache at the endpoint device (as opposed to in the platform).
DMA | Direct Memory Access: Address routed in-bound requests from I/O devices.
DMA Remapping | The act of translating the address in a DMA request to a host physical address (HPA).
Domain | A collection of physical, logical, or virtual resources that are allocated to work together. Used as a generic term for virtual machines, partitions, etc.
DMA Address | Address in a DMA request: Depending on the software usage and hardware capabilities, DMA address can be Guest Physical Address (GPA), Guest Virtual Address (GVA), Virtual Address (VA), or I/O Virtual Address (IOVA).
First-Level Paging | Paging structures used for address translation of DMA requests with Process Address Space ID (PASID).
First-Level Caches | Translation caches used by remapping hardware units to cache intermediate (non-leaf) entries of the first-level paging structures. These include PML4 cache, PDP cache, and PDE cache.
GAW | Guest Address Width: Physical addressability limit within a partition (virtual machine).
GPA | Guest Physical Address: the view of physical memory from software running in a partition (virtual machine).
Guest | Software running within a virtual machine environment (partition).
GVA | Guest Virtual Address: Processor virtual address used by software running in a partition (virtual machine).
HAW | Host Address Width: the DMA physical addressability limit for a platform.
HPA | Host Physical Address: Physical address used by hardware to access memory and memory-mapped resources.
IEC | Interrupt Entry Cache: A translation cache in remapping hardware unit that caches frequently used interrupt-remapping table entries.
IOTLB | I/O Translation Lookaside Buffer: an address translation cache in remapping hardware unit that caches effective translations from DVA (GPA) to HPA.
I/OxAPIC | I/O Advanced Programmable Interrupt Controller
IOVA | I/O Virtual Address: Virtual address created by software for use in I/O requests.
Interrupt Remapping | The act of translating an interrupt request before it is delivered to the CPU complex.
MGAW | Maximum Guest Address Width: the maximum DMA virtual addressability supported by a remapping hardware implementation.
MSI | Message Signalled Interrupts.

I nt el ® Virt ualizat ion Technology for Direct ed I / O Archit ect ure Specificat ion, Rev. 2.4 1- 2

June 2016 Order Num ber: D51397- 008

I n t r odu ct ion —I n t e l ® V ir t u a liz a t ion Te ch nology for D ir e ct e d I / O

Ta ble 1 .

Glossa r y

Te r m

1 .3

D e fin it ion

SecondLevel Caches

Translat ion caches used by rem apping hardware unit s t o cache int erm ediat e ( non- leaf ) ent ries of t he second- level ( SL) paging st ruct ures. For hardware support ing 48- bit Guest Address Widt h, t hese include SL- PML4 cache, SL- PDP cache, and SL- PDE cache.

PASI D

Process Address Space I dent ifier: DMA request s w it h virt ual address ( or guest virt ual address) are t agged wit h a PASI D value t hat ident ifies t he t arget ed virt ual address space.

PASI Dcache

Rem apping hardware cache t hat caches frequent ly accessed PASI D- t able ent ries used t o t ranslat e DMA request s wit h PASI D.

PASI D St at e Table

Dat a st ruct ure used by hardware t o report t o soft ware if a given PASI D is act ive at a endpoint device or not . PASI D st at e is used by soft ware t o im plem ent opt im izat ions for I OTLB invalidat ions.

SecondLevel Paging

Paging St ruct ures used for address t ranslat ion of DMA request s w it hout Process Address Space I D ( PASI D) .

Source I D

A 16- bit ident ificat ion num ber t o ident ify t he source of a DMA or int errupt request . For PCI fam ily devices t his is t he ‘Request er I D’ which consist s of PCI Bus num ber, Device num ber, and Funct ion num ber.

Root Com plex

Refers t o one or m ore hardware com ponent s t hat connect processor com plexes t o t he I / O and m em ory subsyst em s. The chipset m ay include a variet y of int egrat ed devices.

VA

Virt ual Address: Virt ual address used by soft ware on a host processor.

VMM

Virt ual Machine Monit or: a soft ware layer t hat cont rols virt ualizat ion. Also referred t o as hypervisor in t his docum ent .

x2API C

The ext ension of xAPI C archit ect ure t o support 32- bit API C addressabilit y of processors and associat ed enhancem ent s.

Re fe r e n ce s Ta ble 2 .

Re fe r e nce s D e scr ipt ion

®

I nt el 64 Archit ect ure Soft ware Developer's Manuals ht t p: / / developer.int el.com / product s/ processor/ m anuals/ index.ht m PCI - Express* Base Specificat ions ht t p: / / w w w.pcisig.com / specificat ions/ pciexpress PCI - Express Address Translat ion Services Specificat ion, Revision 1.1 ht t p: / / w w w.pcisig.com / specificat ions/ iov PCI - Express Process Address Space I D, and PASI D Translat ion ECNs PCI - Express Alt ernat ive Rout ing- I D I nt erpret at ion ( ARI ) ECN PCI - Express Single- Root I / O Virt ualizat ion and Sharing ( SR- I OV) Specificat ion, Revision 1.0 ht t p: / / w w w.pcisig.com / specificat ions/ iov ACPI Specificat ion ht t p: / / ww w.acpi.info/ PCI - Express t o PCI / PCI -X Bridge Specificat ion, Revision 1.0 ht t p: / / w w w.pcisig.com / specificat ions/ pciexpress/ bridge


2 Overview

This chapter provides a brief overview of Intel® VT, the virtualization software ecosystem it enables, and hardware support offered for processor and I/O virtualization.

2.1 Intel® Virtualization Technology Overview

Intel® VT consists of technology components that support virtualization of platforms based on Intel processors, thereby enabling the running of multiple operating systems and applications in independent partitions. Each partition behaves like a virtual machine (VM) and provides isolation and protection across partitions. This hardware-based virtualization solution, along with virtualization software, enables multiple usages such as server consolidation, activity partitioning, workload isolation, embedded management, legacy software migration, and disaster recovery.

2.2 VMM and Virtual Machines

Intel® VT supports virtual machine architectures comprised of two principal classes of software:

• Virtual-Machine Monitor (VMM): A VMM acts as a host and has full control of the processor(s) and other platform hardware. The VMM presents guest software (see below) with an abstraction of a virtual processor and allows it to execute directly on a logical processor. A VMM is able to retain selective control of processor resources, physical memory, interrupt management, and I/O.

• Guest Software: Each virtual machine is a guest software environment that supports a stack consisting of an operating system (OS) and application software. Each operates independently of other virtual machines and uses the same interface to processor(s), memory, storage, graphics, and I/O provided by a physical platform. The software stack acts as if it were running on a platform with no VMM. Software executing in a virtual machine must operate with reduced privilege so that the VMM can retain control of platform resources.

The VMM is a key component of the platform infrastructure in virtualization usages. Intel® VT can improve the reliability and supportability of virtualization infrastructure software with programming interfaces to virtualize processor hardware. It also provides a foundation for additional virtualization support for other hardware components in the platform.

2.3 Hardware Support for Processor Virtualization

Hardware support for processor virtualization enables simple, robust and reliable VMM software. VMM software relies on hardware support on operational details for the handling of events, exceptions, and resources allocated to virtual machines. Intel® VT provides hardware support for processor virtualization. For Intel® 64 processors, this support consists of a set of virtual-machine extensions (VMX) that support virtualization of processor hardware for multiple software environments by using virtual machines.


2.4 I/O Virtualization

A VMM must support virtualization of I/O requests from guest software. I/O virtualization may be supported by a VMM through any of the following models:

• Emulation: A VMM may expose a virtual device to guest software by emulating an existing (legacy) I/O device. The VMM emulates the functionality of the I/O device in software over whatever physical devices are available on the physical platform. I/O virtualization through emulation provides good compatibility (by allowing existing device drivers to run within a guest), but poses limitations with performance and functionality.

• New Software Interfaces: This model is similar to I/O emulation, but instead of emulating legacy devices, VMM software exposes a synthetic device interface to guest software. The synthetic device interface is defined to be virtualization-friendly to enable efficient virtualization compared to the overhead associated with I/O emulation. This model provides improved performance over emulation, but has reduced compatibility (due to the need for specialized guest software or drivers utilizing the new software interfaces).

• Assignment: A VMM may directly assign the physical I/O devices to VMs. In this model, the driver for an assigned I/O device runs in the VM to which it is assigned and is allowed to interact directly with the device hardware with minimal or no VMM involvement. Robust I/O assignment requires additional hardware support to ensure the assigned device accesses are isolated and restricted to resources owned by the assigned partition. The I/O assignment model may also be used to create one or more I/O container partitions that support emulation or software interfaces for virtualizing I/O requests from other guests. The I/O-container-based approach removes the need for running the physical device drivers as part of VMM privileged software.

• I/O Device Sharing: In this model, which is an extension to the I/O assignment model, an I/O device supports multiple functional interfaces, each of which may be independently assigned to a VM. The device hardware itself is capable of accepting multiple I/O requests through any of these functional interfaces and processing them utilizing the device's hardware resources.

Depending on the usage requirements, a VMM may support any of the above models for I/O virtualization. For example, I/O emulation may be best suited for virtualizing legacy devices. I/O assignment may provide the best performance when hosting I/O-intensive workloads in a guest. Using new software interfaces makes a trade-off between compatibility and performance, and I/O device sharing provides more virtual devices than the number of physical devices in the platform.

2.5 Intel® Virtualization Technology For Directed I/O Overview

A general requirement for all of the above I/O virtualization models is the ability to isolate and restrict device accesses to the resources owned by the partition managing the device. Intel® VT for Directed I/O provides VMM software with the following capabilities:

• I/O device assignment: for flexibly assigning I/O devices to VMs and extending the protection and isolation properties of VMs for I/O operations.

• DMA remapping: for supporting address translations for Direct Memory Accesses (DMA) from devices.

• Interrupt remapping: for supporting isolation and routing of interrupts from devices and external interrupt controllers to appropriate VMs.

• Interrupt posting: for supporting direct delivery of virtual interrupts from devices and external interrupt controllers to virtual processors.

• Reliability: for recording and reporting to system software DMA and interrupt errors that may otherwise corrupt memory or impact VM isolation.


2.5.1 Hardware Support for DMA Remapping

To generalize I/O virtualization and make it applicable to different processor architectures and operating systems, this document refers to domains as abstract isolated environments in the platform to which a subset of host physical memory is allocated. DMA remapping provides hardware support for isolation of device accesses to memory, and enables each device in the system to be assigned to a specific domain through a distinct set of paging structures. When the device attempts to access system memory, the DMA-remapping hardware intercepts the access and utilizes the page tables to determine whether the access can be permitted; it also determines the actual location to access. Frequently used paging structures can be cached in hardware. DMA remapping can be configured independently for each device, or collectively across multiple devices.
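The intercept, check, and translate sequence described above can be modeled in a few lines. This is a minimal sketch and not the hardware table format: the names `Mapping`, `Domain`, `RemappingUnit`, and `DmaFault`, the flat per-domain dictionary, and the 4 KiB page size are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

PAGE_SHIFT = 12                      # assumed 4 KiB pages for this sketch
PAGE_MASK = (1 << PAGE_SHIFT) - 1


@dataclass
class Mapping:
    host_frame: int                  # host physical frame number
    readable: bool
    writable: bool


@dataclass
class Domain:
    name: str
    # DMA page number -> Mapping; stands in for the domain's paging structures
    pages: dict = field(default_factory=dict)


class DmaFault(Exception):
    """Raised when the model blocks a DMA request."""


class RemappingUnit:
    def __init__(self):
        self.device_to_domain = {}   # hypothetical device id -> Domain

    def assign(self, device, domain):
        self.device_to_domain[device] = domain

    def translate(self, device, dma_addr, write):
        """Return the host physical address, or raise DmaFault if blocked."""
        domain = self.device_to_domain.get(device)
        if domain is None:
            raise DmaFault(f"{device}: not assigned to a domain")
        mapping = domain.pages.get(dma_addr >> PAGE_SHIFT)
        if mapping is None:
            raise DmaFault(f"{device}: no mapping for {dma_addr:#x}")
        allowed = mapping.writable if write else mapping.readable
        if not allowed:
            raise DmaFault(f"{device}: access to {dma_addr:#x} not permitted")
        return (mapping.host_frame << PAGE_SHIFT) | (dma_addr & PAGE_MASK)
```

Assigning several devices to the same `Domain` object corresponds to configuring remapping collectively across multiple devices; giving each device its own object corresponds to independent configuration.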

2.5.1.1 OS Usages of DMA Remapping

There are several ways in which operating systems can use DMA remapping:

• OS Protection: An OS may define a domain containing its critical code and data structures, and restrict access to this domain from all I/O devices in the system. This allows the OS to limit erroneous or unintended corruption of its data and code through incorrect programming of devices by device drivers, thereby improving OS robustness and reliability.

• Feature Support: An OS may use domains to better manage DMA from legacy devices to high memory (for example, 32-bit PCI devices accessing memory above 4GB). This is achieved by programming the I/O page-tables to remap DMA from these devices to high memory. Without such support, software must resort to data copying through OS “bounce buffers”.

• DMA Isolation: An OS may manage I/O by creating multiple domains and assigning one or more I/O devices to each domain. Each device-driver explicitly registers its I/O buffers with the OS, and the OS assigns these I/O buffers to specific domains, using hardware to enforce DMA domain protection. See Figure 2-2.

• Shared Virtual Memory: For devices supporting appropriate PCI-Express capabilities¹, the OS may use the DMA remapping hardware capabilities to share the virtual address space of application processes with I/O devices. Shared virtual memory, along with support for I/O page-faults, enables application programs to freely pass arbitrary data-structures to devices such as graphics processors or accelerators, without the overheads of pinning and marshalling of data.

[Figure: left panel, "Device DMA without isolation": I/O devices access system memory holding OS code and data and the I/O buffers of Driver A and Driver B. Right panel, "Device DMA isolated using DMA remapping hardware": system memory is split into Domain 1 and Domain 2, and DMA-remapping hardware restricts Device A and Device B to the I/O buffers of their respective drivers.]

Figure 2-2. Example OS Usage of DMA Remapping

1. Refer to Process Address Space ID (PASID) capability in PCI-Express* base specification.
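The Feature Support usage above can be illustrated with a small model in which an OS hands a 32-bit device an I/O address below 4 GiB that the I/O page table remaps to a buffer above 4 GiB, so no bounce buffer copy is needed. This is a sketch under stated assumptions: `IoPageTable`, its simple bump allocator, and the 4 KiB page size are hypothetical and do not describe any particular OS implementation.

```python
FOUR_GIB = 1 << 32
PAGE_SIZE = 4096                     # assumed page size for this sketch


class IoPageTable:
    """Toy I/O page table for one domain (hypothetical structure)."""

    def __init__(self):
        self.entries = {}            # I/O page number -> host page number
        self.next_io_page = 1        # page 0 is left unmapped in this sketch

    def map_buffer(self, host_addr, length):
        """Map a host buffer; return a device-visible address below 4 GiB."""
        first = host_addr // PAGE_SIZE
        last = (host_addr + length - 1) // PAGE_SIZE
        count = last - first + 1
        base = self.next_io_page
        if (base + count) * PAGE_SIZE > FOUR_GIB:
            raise MemoryError("no 32-bit I/O address space left")
        for i in range(count):
            self.entries[base + i] = first + i
        self.next_io_page += count
        return base * PAGE_SIZE + host_addr % PAGE_SIZE

    def translate(self, io_addr):
        """Translate a device-issued address to a host physical address."""
        host_page = self.entries[io_addr // PAGE_SIZE]
        return host_page * PAGE_SIZE + io_addr % PAGE_SIZE
```

The device only ever sees addresses that fit in 32 bits, while the translated accesses land in high memory.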


2.5.1.2 VMM Usages of DMA Remapping

The limitations of software-only methods for I/O virtualization can be improved through direct assignment of I/O devices to partitions. With this approach, the driver for an assigned I/O device runs only in the partition to which it is assigned and is allowed to interact directly with the device hardware with minimal or no VMM involvement. The hardware support for DMA remapping enables this direct device assignment without device-specific knowledge in the VMM. See Figure 2-3.

[Figure: left panel, "Example Software-based I/O Virtualization": virtual machines VM (0) through VM (n) run applications on a guest OS with drivers for virtual devices, while the VMM or hosting OS provides virtual device emulation and holds the Device A and Device B drivers. Right panel, "Direct Assignment of I/O Devices": each guest OS holds the driver for its assigned device (Device A or Device B), and DMA-remapping hardware sits between the VMM or hosting OS and the physical host hardware.]

Figure 2-3. Example Virtualization Usage of DMA Remapping

In this model, the VMM restricts itself to enabling direct assignment of devices to their partitions. Rather than invoking the VMM for all I/O requests from a partition, the VMM is invoked only when guest software accesses protected resources (such as configuration accesses, interrupt management, etc.) that impact system functionality and isolation.

To support direct assignment of I/O devices, a VMM must enforce isolation of DMA requests. I/O devices can be assigned to domains, and the remapping hardware can be used to restrict DMA from an I/O device to the physical memory presently owned by its domain. For domains that may be relocated in physical memory, the remapping hardware can be programmed to perform the necessary translation.

I/O device assignment allows other I/O sharing usages, for example, assigning an I/O device to an I/O partition that provides I/O services to other user partitions. Remapping hardware enables virtualization software to choose the right combination of device assignment and software-based methods for I/O virtualization.

2.5.1.3 DMA Remapping Usages by Guests

A guest OS running in a VM may benefit from the availability of remapping hardware to support the usages described in Section 2.5.1.1. To support such usages, the VMM may virtualize the remapping hardware to its guests. For example, the VMM may intercept guest accesses to the virtual remapping hardware registers, and manage a shadow copy of the guest remapping structures that is provided to the physical remapping hardware. On updates to the guest I/O page tables, the guest software performs appropriate virtual invalidation operations. The virtual invalidation requests may be intercepted by the VMM, to update the respective shadow page tables and perform invalidations of remapping hardware.

Due to the non-restartability of faulting DMA transactions (unlike CPU memory management virtualization), a VMM cannot perform lazy updates to its shadow remapping structures. To keep the shadow structures consistent with the guest structures, the VMM may expose virtual remapping hardware with eager pre-fetching behavior (including caching of not-present entries) or use processor memory management mechanisms to write-protect the guest remapping structures.

On hardware implementations supporting two levels of address translations (first-level translation to remap a virtual address to an intermediate (guest) physical address, and second-level translation to remap an intermediate physical address to a machine (host) physical address), a VMM may virtualize guest OS use of first-level translations (such as for Shared Virtual Memory usages) without shadowing page-tables, by instead configuring hardware to perform nested translation of the first and second levels.
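The two-level composition described above can be sketched as two successive lookups: the first level maps a guest virtual address to a guest physical address, and the second level maps that result to a host physical address. The sketch is deliberately simplified: each level is flattened to a single dictionary from page number to frame number, the 4 KiB page size is assumed, and the function names are hypothetical. It also omits the fact that, in nested translation, accesses to the first-level paging structures themselves are subject to second-level translation.

```python
PAGE_SHIFT = 12                      # assumed 4 KiB pages for this sketch
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


def walk(table, addr):
    """Translate addr through one flattened level (page number -> frame number)."""
    frame = table.get(addr >> PAGE_SHIFT)
    if frame is None:
        raise LookupError(f"no translation for {addr:#x}")
    return (frame << PAGE_SHIFT) | (addr & OFFSET_MASK)


def nested_translate(first_level, second_level, gva):
    """GVA -> GPA through the first level, then GPA -> HPA through the second."""
    gpa = walk(first_level, gva)
    return walk(second_level, gpa)
```

Because the guest owns only the first-level table and the VMM owns only the second-level table, neither needs a shadow copy of the other's structures in this model.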

2.5.1.4 Interaction with Processor Virtualization

Figure 2-4 depicts how system software interacts with hardware support for both processor-level virtualization and Intel® VT for Directed I/O.

[Figure: virtual machines (applications on guest OSes) run above the Virtual Machine Monitor (VMM). CPU accesses from logical processors reach physical memory through CPU memory virtualization, and DMA from I/O devices reaches physical memory through DMA remapping.]

Figure 2-4. Interaction Between I/O and Processor Virtualization

The VMM manages processor requests to access physical memory via the processor's memory management hardware. DMA requests to access physical memory use remapping hardware. Both processor memory management and DMA memory management are under the control of the VMM.


2.5.2 Hardware Support for Interrupt Remapping

Interrupt remapping provides hardware support for remapping and routing of interrupt requests from I/O devices (generated directly or through I/O interrupt controllers). The indirection achieved through remapping enables isolation of interrupts across partitions. The following usages are envisioned for the interrupt-remapping hardware.

2.5.2.1 Interrupt Isolation

On Intel architecture platforms, interrupt requests are identified by the Root-Complex as write transactions targeting an architectural address range (0xFEEx_xxxxh). The interrupt requests are self-describing (i.e., attributes of the interrupt request are encoded in the request address and data), allowing any DMA initiator to generate interrupt messages with arbitrary attributes. The interrupt-remapping hardware may be utilized by a Virtual Machine Monitor (VMM) to improve the isolation of external interrupt requests across domains. For example, the VMM may utilize the interrupt-remapping hardware to distinguish interrupt requests from specific devices and route them to the appropriate VMs to which the respective devices are assigned. The VMM may also utilize the interrupt-remapping hardware to control the attributes of these interrupt requests (such as destination CPU, interrupt vector, delivery mode, etc.).

Another example usage is for the VMM to use the interrupt-remapping hardware to disambiguate external interrupts from the VMM-owned inter-processor interrupts (IPIs). Software may enforce this by ensuring that none of the remapped external interrupts have attributes (such as vector number) that match the attributes of the VMM IPIs.
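As an illustration of the two checks discussed above, the sketch below tests whether a write falls inside the 0xFEEx_xxxx range and whether any vector is shared between remapped external interrupts and VMM IPIs. The function and constant names are hypothetical, and the model looks only at the address and vector; it is not a description of how the Root-Complex or a VMM implements these checks.

```python
INTERRUPT_BASE = 0xFEE0_0000   # first address matching 0xFEEx_xxxx
INTERRUPT_LIMIT = 0xFEEF_FFFF  # last address matching 0xFEEx_xxxx


def is_interrupt_request(is_write: bool, address: int) -> bool:
    """True if a write transaction targets the interrupt address range."""
    return is_write and INTERRUPT_BASE <= address <= INTERRUPT_LIMIT


def conflicting_vectors(remapped_vectors: set, vmm_ipi_vectors: set) -> set:
    """Vectors used by both remapped external interrupts and VMM IPIs."""
    return remapped_vectors & vmm_ipi_vectors
```

An empty result from `conflicting_vectors` corresponds to the condition under which external interrupts cannot be mistaken for VMM IPIs by vector number.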

2.5.2.2 Interrupt Migration

The interrupt-remapping architecture may be used to support dynamic re-direction of interrupts when the target for an interrupt request is migrated from one logical processor to another logical processor. Without interrupt-remapping hardware support, re-balancing of interrupts requires software to reprogram the interrupt sources. However, re-programming of these resources is non-atomic (requires multiple registers to be re-programmed), often complex (may require temporary masking of the interrupt source), and dependent on interrupt source characteristics (e.g. no masking capability for some interrupt sources; edge interrupts may be lost when masked on some sources, etc.).

Interrupt-remapping enables software to efficiently re-direct interrupts without re-programming the interrupt configuration at the sources. Interrupt migration may be used by OS software for balancing load across processors (such as when running I/O intensive workloads), or by the VMM when it migrates virtual CPUs of a partition with assigned devices across physical processors to improve CPU utilization.
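The benefit of the indirection can be seen in a toy model where each interrupt source carries only a handle and a software-owned table supplies the destination. The class name `InterruptRemapTable`, the handle values, and the entry layout are assumptions made for illustration and do not reflect the actual interrupt-remapping table entry format.

```python
class InterruptRemapTable:
    """Toy remapping table: handle -> (destination CPU, vector)."""

    def __init__(self):
        self.entries = {}

    def program(self, handle, cpu, vector):
        self.entries[handle] = (cpu, vector)

    def deliver(self, handle):
        """Resolve an interrupt from a source that presents this handle."""
        return self.entries[handle]

    def migrate(self, handle, new_cpu):
        """Retarget with a single table update; the source is not touched."""
        _, vector = self.entries[handle]
        self.entries[handle] = (new_cpu, vector)
```

In this model, migration replaces one table entry in a single assignment, whereas reprogramming the source would involve several register writes and possibly masking.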

2.5.2.3 x2APIC Support

Intel® 64 x2APIC architecture extends the APIC addressability to 32 bits (from 8 bits). Refer to Intel® 64 Architecture Software Developer's Manual, Volume 3B: System Programming Guide for details. Interrupt remapping enables x2APICs to support the expanded APIC addressability for external interrupts without requiring hardware changes to interrupt sources (such as I/OxAPICs and MSI/MSI-X devices).


2.5.3 Hardware Support for Interrupt Posting

Interrupt posting includes hardware support for optimized processing of interrupt requests from I/O devices (Physical Functions, or Single Root I/O Virtualization (SR-IOV) Virtual Functions) that are directly assigned to a virtual machine. The following usages are envisioned for the interrupt-posting hardware.

2.5.3.1 Interrupt Vector Scalability

Devices supporting I/O virtualization capabilities such as SR-IOV virtually increase the I/O fan-out of the platform, by allowing multiple Virtual Functions (VFs) to be enabled for a Physical Function (PF). Any of these PFs or VFs can be assigned to a virtual machine. Interrupt requests from such assigned devices are referred to as virtual interrupts as they target virtual processors of the assigned VM. Each VF requires its own independent interrupt resources, resulting in more interrupt vectors needed than otherwise required without such I/O virtualization.

Without interrupt-posting hardware support, all interrupt sources in the platform are mapped to the same physical interrupt vector space (8-bit vector space per logical CPU on Intel® 64 processors). For virtualization usages, partitioning the physical vector space across virtual processors is challenging in a dynamic environment when there is no static affinity between virtual processors and logical processors.

Hardware support for interrupt posting addresses this vector scalability problem by allowing interrupt requests from device functions assigned to virtual machines to operate in virtual vector space, thereby scaling naturally with the number of virtual machines or virtual processors.

2.5.3.2 Interrupt Virtualization Efficiency

Without hardware support for interrupt posting, interrupts from devices assigned to virtual machines are processed through the VMM software. Specifically, whenever an external interrupt destined for a virtual machine is received by the CPU, control is transferred to the VMM, requiring the VMM to process and inject the corresponding virtual interrupt to the virtual machine. The control transfers associated with such VMM processing of external interrupts incur both hardware and software overheads.

With hardware support for interrupt posting, interrupts from devices assigned to virtual machines are posted (recorded) in memory descriptors specified by the VMM, and processed based on the running state of the virtual processor targeted by the interrupt. For example, if the target virtual processor is running on any logical processor, hardware can directly deliver external interrupts to the virtual processor without any VMM intervention. Interrupts received while the target virtual processor is pre-empted (waiting for its turn to run) can be accumulated in memory by hardware for delivery when the virtual processor is later scheduled. This avoids disrupting execution of currently running virtual processors on external interrupts for non-running virtual machines. If the target virtual processor is halted (idle) at the time of interrupt arrival or if the interrupt is qualified as requiring real-time processing, hardware can transfer control to the VMM, enabling the VMM to schedule the virtual processor and have hardware directly deliver pending interrupts to that virtual processor.

This processing of interrupts based on the state of the target virtual processor reduces overall interrupt latency to virtual machines and reduces overheads otherwise incurred by the VMM for virtualizing interrupts.
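The state-dependent handling described above can be summarized as a small decision function. This is a sketch only: `VcpuState`, `PostedDescriptor`, `post_interrupt`, the `urgent` flag, and the returned action strings are hypothetical names, and the real descriptor format and notification mechanism are not modeled.

```python
from enum import Enum


class VcpuState(Enum):
    RUNNING = "running"
    PREEMPTED = "preempted"      # waiting for its turn to run
    HALTED = "halted"            # idle


class PostedDescriptor:
    """Stand-in for the in-memory descriptor specified by the VMM."""

    def __init__(self):
        self.pending = set()     # vectors recorded (posted) for the vCPU


def post_interrupt(desc, state, vector, urgent=False):
    """Record the interrupt, then pick an action from the target vCPU state."""
    desc.pending.add(vector)
    if state is VcpuState.RUNNING:
        return "deliver-directly"    # no VMM intervention
    if state is VcpuState.HALTED or urgent:
        return "notify-vmm"          # VMM schedules the vCPU
    return "accumulate"              # delivered when the vCPU is next scheduled
```

Only the `notify-vmm` outcome involves a control transfer to the VMM, which is where the latency and overhead reduction described in the text comes from.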

2.5.3.3 Virtual Interrupt Migration

To optimize overall platform utilization, VMM software may need to dynamically evaluate the optimal logical processor on which to schedule a virtual processor and, in that process, migrate virtual processors across CPUs. For virtual machines with assigned devices, migrating a virtual processor across logical processors incurs either the overhead of forwarding interrupts in software (e.g. via VMM-generated IPIs), or the complexity of independently migrating each interrupt targeting the virtual processor to the new logical processor. Hardware support for interrupt posting enables VMM software to atomically co-migrate all interrupts targeting a virtual processor when the virtual processor is scheduled to another logical processor.


3 DMA Remapping

This chapter describes the hardware architecture for DMA remapping. The architecture envisions remapping hardware to be implemented in Root-Complex components, such as the memory controller hub (MCH) or I/O hub (IOH).

3.1 Types of DMA requests

Remapping hardware classifies inbound memory requests from root-complex integrated devices and PCI-Express attached discrete devices into two categories:

• Requests without address-space-identifier: These are the normal memory requests from endpoint devices. These requests typically specify the type of access (read/write/atomics), targeted DMA address/size, and identity of the device originating the request.

• Requests with address-space-identifier: These are memory requests with additional information identifying the targeted process address space from endpoint devices supporting virtual memory capabilities. Beyond attributes in normal requests, these requests specify the targeted process address space identifier (PASID), and extended attributes such as Execute-Requested (ER) flag (to indicate reads that are instruction fetches), and Privileged-mode-Requested (PR) flag (to distinguish user versus supervisor access). For details, refer to the Process Address Space ID (PASID) Capability in the PCI-Express specifications.

For simplicity, this document refers to these categories as Requests-without-PASID and Requests-with-PASID. Previous versions of this specification supported only remapping of requests-without-PASID.
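The attributes listed for the two categories can be gathered into one record, with the presence of a PASID deciding the category. The sketch below is illustrative: `DmaRequest`, `category`, and `split_source_id` are hypothetical names, and the bus/device/function split assumes the conventional PCI Requester ID layout (8-bit bus, 5-bit device, 3-bit function) without ARI.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DmaRequest:
    source_id: int                       # 16-bit identity of the requester
    address: int                         # targeted DMA address
    access: str                          # "read", "write", or "atomic"
    pasid: Optional[int] = None          # present only for requests-with-PASID
    execute_requested: bool = False      # ER flag: read is an instruction fetch
    privileged_requested: bool = False   # PR flag: supervisor access


def category(req: DmaRequest) -> str:
    """Classify a request by whether it carries a PASID."""
    if req.pasid is not None:
        return "request-with-PASID"
    return "request-without-PASID"


def split_source_id(source_id: int):
    """Split a 16-bit Requester ID into (bus, device, function), non-ARI."""
    return (source_id >> 8) & 0xFF, (source_id >> 3) & 0x1F, source_id & 0x7
```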

3.2 Domains and Address Translation

A domain is abstractly defined as an isolated environment in the platform, to which a subset of the host physical memory is allocated. I/O devices that are allowed to access physical memory directly are allocated to a domain and are referred to as the domain's assigned devices. For virtualization usages, software may treat each virtual machine as a separate domain.

The isolation property of a domain is achieved by blocking access to its physical memory from resources not assigned to it. Multiple isolated domains are supported in a system by ensuring that all I/O devices are assigned to some domain (possibly a null domain), and that they can only access the physical resources allocated to their domain. The DMA remapping architecture facilitates flexible assignment of I/O devices to an arbitrary number of domains. Each domain has a view of physical address space that may be different than the host physical address space.

Remapping hardware treats the address in inbound requests as DMA Address. Depending on the software usage model, the DMA address space may be the Guest-Physical Address (GPA) space of the virtual machine to which the device is assigned, or application Virtual Address (VA) space defined by the PASID assigned to an application, or some abstract I/O virtual address (IOVA) space defined by software. In all cases, DMA remapping transforms the address in a DMA request issued by an I/O device to its corresponding Host-Physical Address (HPA).

For simplicity, this document refers to address in requests-without-PASID as GPA, and address in requests-with-PASID as Virtual Address (VA) (or Guest Virtual Address (GVA), if such request is from a device assigned to a virtual machine). The translated address is referred to as HPA.


Figure 3-5 illustrates DMA address translation. I/O devices 1 and 2 are assigned to domains 1 and 2, respectively. The software responsible for creating and managing the domains allocates system physical memory for both domains and sets up the DMA address translation function. DMA addresses in requests initiated by devices 1 and 2 are translated to appropriate HPAs by the remapping hardware.

[Figure: Device 1 (assigned to Domain 1) and Device 2 (assigned to Domain 2) each issue a request to (G)VA/GPA = 4000h. CPU memory management and DMA memory management translate the domains' addresses to HPA = 6000h and HPA = 3000h in physical memory spanning 0h to 10000h.]

Figure 3-5. DMA Address Translation

The host platform may support one or more remapping hardware units. Each hardware unit supports remapping DMA requests originating within its hardware scope. For example, a desktop platform may expose a single remapping hardware unit that translates all DMA transactions at the memory controller hub (MCH) component. A server platform with one or more core components may support independent translation hardware units in each component, each translating DMA requests originating within its I/O hierarchy (such as a PCI-Express root port). The architecture supports configurations in which these hardware units may either share the same translation data structures (in system memory) or use independent structures, depending on software programming.

The remapping hardware translates the address in a request to host physical address (HPA) before further hardware processing (such as address decoding, snooping of processor caches, and/or forwarding to the memory controllers).
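As a rough illustration of the per-domain translation shown in Figure 3-5, the following Python sketch models each domain's translation function as a dictionary from DMA page number to host-physical page number. The names DOMAIN_MAPS, DEVICE_DOMAIN and translate are hypothetical, and the pairing of 4000h with HPA 6000h for domain 1 and HPA 3000h for domain 2 is an assumption made for the example. Real hardware walks memory-resident paging structures instead of a dictionary.

```python
PAGE_SIZE = 0x1000  # 4-KByte pages; used only to split page number from offset

# Hypothetical per-domain maps: DMA page number -> host-physical page number.
DOMAIN_MAPS = {
    1: {0x4: 0x6},  # assumed: domain 1 maps DMA address 4000h to HPA 6000h
    2: {0x4: 0x3},  # assumed: domain 2 maps DMA address 4000h to HPA 3000h
}

# Hypothetical device-to-domain assignment.
DEVICE_DOMAIN = {"device1": 1, "device2": 2}


def translate(device, dma_addr):
    """Translate a DMA address issued by `device` to a host-physical address."""
    domain = DEVICE_DOMAIN[device]
    page, offset = divmod(dma_addr, PAGE_SIZE)
    hpa_page = DOMAIN_MAPS[domain].get(page)
    if hpa_page is None:
        # The domain has no mapping for this page, so the access is blocked.
        raise LookupError(f"no mapping for {dma_addr:#x} in domain {domain}")
    return hpa_page * PAGE_SIZE + offset
```

The same DMA address resolves to different host-physical addresses depending on the domain of the issuing device, which is the isolation property described in Section 3.2.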

3.3 Remapping Hardware - Software View

The remapping architecture allows hardware implementations supporting a single PCI segment group to expose (to software) the remapping function either as a single hardware unit covering the entire PCI segment group, or as multiple hardware units, each supporting a mutually exclusive subset of devices in the PCI segment group hierarchy. For example, an implementation may expose a remapping hardware unit that supports one or more integrated devices on the root bus, and additional remapping hardware units for devices behind one or a set of PCI-Express root ports. The platform firmware (BIOS) reports each remapping hardware unit in the platform to software. Chapter 8 describes a proposed reporting structure through ACPI constructs.

For hardware implementations supporting multiple PCI segment groups, the remapping architecture requires hardware to expose independent remapping hardware units (at least one per PCI segment group) for processing requests originating within the I/O hierarchy of each segment group.

3.4 Mapping Devices to Domains

The following sub-sections describe the DMA remapping architecture and data structures used to map I/O devices to domains.


3.4.1 Source Identifier

Each inbound request appearing at the address-translation hardware is required to identify the device originating the request. The attribute identifying the originator of an I/O transaction is referred to as the "source-id" in this document. The remapping hardware may determine the source-id of a transaction in implementation-specific ways. For example, some I/O bus protocols may provide the originating device identity as part of each I/O transaction. In other cases (for Root-Complex integrated devices, for example), the source-id may be derived based on the Root-Complex internal implementation.

For PCI-Express devices, the source-id is the requester identifier in the PCI-Express transaction layer header. The requester identifier of a device, which is composed of its PCI Bus/Device/Function number, is assigned by configuration software and uniquely identifies the hardware function that initiated the request. Figure 3-6 illustrates the requester-id (see footnote 1) as defined by the PCI-Express Specification.

Bits 15:8    Bus #
Bits 7:3     Device #
Bits 2:0     Function #

Figure 3-6. Requester Identifier Format
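The field layout in Figure 3-6 can be sketched in Python as follows. This is a minimal illustration for the non-ARI layout only; the function names decode_source_id and encode_source_id are hypothetical and not part of the specification.

```python
def decode_source_id(source_id):
    """Split a 16-bit requester-id into (bus, device, function)."""
    if not 0 <= source_id <= 0xFFFF:
        raise ValueError("source-id must fit in 16 bits")
    bus = (source_id >> 8) & 0xFF      # bits 15:8
    device = (source_id >> 3) & 0x1F   # bits 7:3
    function = source_id & 0x7         # bits 2:0
    return bus, device, function


def encode_source_id(bus, device, function):
    """Pack (bus, device, function) into a 16-bit requester-id."""
    if not (0 <= bus <= 0xFF and 0 <= device <= 0x1F and 0 <= function <= 0x7):
        raise ValueError("field out of range")
    return (bus << 8) | (device << 3) | function
```

For example, source-id 1234h decodes to bus 12h, device 6, function 4.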

The following sections describe the data structures for mapping I/O devices to domains.

3.4.2 Root-Entry & Extended-Root-Entry

The root-table functions as the top level structure to map devices to their respective domains. The location of the root-table in system memory is programmed through the Root Table Address Register described in Section 10.4.6. The root-table is 4-KByte in size and contains 256 root-entries to cover the PCI bus number space (0-255). The bus number (upper 8-bits) encoded in a request's source-id field is used to index into the root-entry structure. Each root-entry contains the following fields:

• Present flag: The present field indicates the root-entry is present and the context-table pointer (CTP) field is initialized. Software may Clear the present field for root entries corresponding to bus numbers that are either not present in the platform, or don't have any downstream devices attached. DMA requests processed through root-entries with present field Clear result in translation-fault.
• Context-table pointer: The context-table pointer references the context-table for devices on the bus identified by the root-entry. Section 3.4.3 describes context-entries in the context-table.

Section 9.1 provides the exact root-table entry format.

For implementations supporting Extended-Context-Support (ECS=1 in Extended Capability Register), the Root Table Address Register (RTADDR_REG) points to an extended-root-table when the Root-Table-Type field in the Register is Set (RTT=1). The extended-root-table is similar to the root-table (4KB in size and containing 256 extended-root-entries to cover the 0-255 PCI bus number space), but has an extended format to reference extended-context-tables. Each extended-root-entry contains the following fields:

1. For PCI-Express devices supporting Alternative Routing-ID Interpretation (ARI), bits traditionally used for the Device Number field in the Requester-id are used instead to expand the Function Number field.


• Lower Present flag: The lower-present field indicates the lower 64-bits of the extended-root-entry is present and the lower-context-table pointer (LCTP) field is initialized. Software may Clear the lower-present field for extended-root-entries corresponding to bus numbers that are either not present in the platform, or don't have downstream devices with device numbers 0-15 attached. DMA requests processed through the lower part of an extended-root-entry with the lower-present field Clear result in translation-fault.
• Lower Context-table pointer: The lower-context-table pointer references the lower-context-table for devices with device number 0-15, on the bus identified by the referencing extended-root-entry. Section 3.4.4 describes extended-context-entries in the lower-context-table.
• Upper Present flag: The upper-present field indicates the upper 64-bits of the extended-root-entry is present and the upper-context-table pointer (UCTP) field is initialized. Software may Clear the upper-present field for extended-root-entries corresponding to bus numbers that are either not present in the platform, or don't have downstream devices with device numbers 16-31 attached. DMA requests processed through the upper part of an extended-root-entry with the upper-present field Clear result in translation-fault.
• Upper Context-table pointer: The upper-context-table pointer references the upper-context-table for devices with device number 16-31, on the bus identified by the referencing extended-root-entry. Section 3.4.4 describes extended-context-entries in the upper-context-table.

Section 9.2 provides the exact extended-root-table entry format.
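The indexing described above can be sketched as byte-offset arithmetic. The 16-byte entry size follows from a 4-KByte table holding 256 entries. Placing the lower half at byte 0 and the upper half at byte 8 of an extended-root-entry is an assumption drawn from the "lower 64-bits" and "upper 64-bits" wording; Section 9.2 gives the authoritative format. All names below are hypothetical.

```python
ROOT_TABLE_BYTES = 4096
ROOT_ENTRIES = 256
ROOT_ENTRY_BYTES = ROOT_TABLE_BYTES // ROOT_ENTRIES  # 16 bytes per entry


def root_entry_offset(source_id):
    """Byte offset of the (extended-)root-entry selected by the bus number."""
    bus = (source_id >> 8) & 0xFF  # upper 8 bits of the source-id
    return bus * ROOT_ENTRY_BYTES


def extended_root_half(source_id):
    """Return which half of the extended-root-entry applies, and its offset.

    Device numbers 0-15 use the lower half, 16-31 the upper half.
    """
    device = (source_id >> 3) & 0x1F
    offset = root_entry_offset(source_id)
    if device < 16:
        return "lower", offset
    return "upper", offset + ROOT_ENTRY_BYTES // 2  # assumed layout
```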

3.4.3 Context-Entry

A context-entry maps a specific I/O device on a bus to the domain to which it is assigned, and, in turn, to the address translation structures for the domain. The context entries are programmed through memory-resident context-tables. Each root-entry in the root-table contains the pointer to the context-table for the corresponding bus number. Each context-table contains 256 entries, with each entry corresponding to a PCI device function on the bus. For a PCI device, the device and function numbers (lower 8-bits) of source-id are used to index into the context-table. Figure 3-7 illustrates device to domain mapping through root-table.

[Figure: the root-table holds root-entries 0 through 255, one per bus (Bus 0, ..., Bus N, ..., Bus 255). Root-entries point to the context-table for their bus (shown for Bus 0 and Bus N), whose context-entries 0 through 255 correspond to (Dev 0, Func 0) through (Dev 31, Func 7). Context-entries point to the address translation structures for their domain (Domain A, Domain B).]

Figure 3-7. Device to Domain Mapping Structures using Root-Table
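The two-step lookup in Figure 3-7 can be modeled with nested Python lists. This is a toy sketch: the dictionary keys, the TranslationFault class and the string used as a stand-in context-entry are all hypothetical, and real tables are memory-resident structures in the formats given in Chapter 9.

```python
class TranslationFault(Exception):
    """Raised when a request reaches a root-entry whose present field is Clear."""


def find_context_entry(root_table, source_id):
    """Select the context-entry for a source-id via root-table and context-table."""
    bus = (source_id >> 8) & 0xFF   # upper 8 bits index the root-table
    devfn = source_id & 0xFF        # lower 8 bits index the context-table
    root_entry = root_table[bus]
    if not root_entry["present"]:
        raise TranslationFault(f"root-entry {bus} is not present")
    return root_entry["context_table"][devfn]


# Toy root-table: 256 root-entries, all with the present field Clear.
root_table = [{"present": False, "context_table": None} for _ in range(256)]

# Populate bus 12h with a 256-entry context-table and one stand-in entry.
root_table[0x12] = {"present": True, "context_table": [None] * 256}
root_table[0x12]["context_table"][0x34] = "context-entry for bus 12h, dev 6, func 4"
```

A request with source-id 1234h resolves to the stand-in entry, while a request from any bus other than 12h raises TranslationFault in this toy model.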


Context-entries support only requests-without-PASID, and contain the following fields:

• Present Flag: The present field is used by software to indicate to hardware whether the context-entry is present and initialized. Software may Clear the present field for context entries corresponding to device functions that are not present in the platform. If the present field of a context-entry used to process a request is Clear, the request is blocked, resulting in a translation-fault.
• Translation Type: The translation-type field indicates what types of requests are allowed through the context-entry, and the type of the address translation that must be used for such requests.
• Address Width: The address-width field indicates the address-width of the domain to which the device corresponding to the context-entry is assigned.
• Second-level Page-table Pointer: The second-level page-table pointer field provides the host physical address of the address translation structure in system memory to be used for remapping requests-without-PASID processed through the context-entry.
• Domain Identifier: The domain-identifier is a software-assigned field in a context-entry that identifies the domain to which a device with the given source-id is assigned. Hardware may use this field to tag its caching structures. Context entries programmed with the same domain identifier must reference the same address translation structure. Context entries referencing the same address translation structures are recommended to use the same domain-identifier for best hardware efficiency.
• Fault Processing Disable Flag: The fault-processing-disable field enables software to selectively disable recording and reporting of remapping faults detected for requests processed through the context-entry.

Multiple devices may be assigned to the same domain by programming the context-entries for the devices to reference the same translation structures, and programming them with the same domain identifier. Section 9.3 provides the exact context-entry format.
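The domain-identifier rules above lend themselves to a simple software check. The sketch below is a hypothetical validator, not something the specification defines: it takes (source-id, domain-id, page-table pointer) tuples for present context-entries and reports domain identifiers that break the "must reference the same structure" rule, plus pointers that go against the "recommended to use the same domain-identifier" advice.

```python
from collections import defaultdict


def check_domain_consistency(entries):
    """Check (source_id, domain_id, table_pointer) tuples against the domain-id rules.

    Returns (violations, advisories):
      violations - domain ids whose entries reference more than one structure
      advisories - table pointers shared by entries with different domain ids
    """
    pointers_by_domain = defaultdict(set)
    domains_by_pointer = defaultdict(set)
    for _source_id, domain_id, table_pointer in entries:
        pointers_by_domain[domain_id].add(table_pointer)
        domains_by_pointer[table_pointer].add(domain_id)
    violations = sorted(d for d, ptrs in pointers_by_domain.items() if len(ptrs) > 1)
    advisories = sorted(p for p, doms in domains_by_pointer.items() if len(doms) > 1)
    return violations, advisories
```

Two devices sharing domain 1 and the same pointer pass cleanly. Giving one of them a different pointer under the same domain identifier is reported as a violation.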

3.4.4 Extended-Context-Entry

For implementations supporting Extended-Context-Support (ECS=1 in Extended Capability Register), when using extended-root-table, each extended-root-entry references a lower-context-table and an upper-context-table. The Lower-context-table is 4-KByte in size and contains 128 extended-context-entries corresponding to PCI functions in device range 0-15 on the bus. The Upper-context-table is also 4-KByte in size and contains 128 extended-context-entries corresponding to PCI functions in device range 16-31 on the bus. Figure 3-8 illustrates device to domain mapping through extended-root-table.

Extended-context-entries are capable of supporting both requests-without-PASID and requests-with-PASID. For requests-without-PASID, they support the same fields as in the regular context-entry (described above). Section 9.4 provides the exact extended-context-entry format. For requests-with-PASID, extended-context-entries contain the following additional fields:

• Extended Translation Type: The translation-type field is extended to provide additional controls to specify how requests with and without PASID should be processed. Extended-context-entries support two levels of translation, referred to as first-level translation and second-level translation. First-level translation applies to requests-with-PASID. Second-level translation applies to requests-without-PASID. When nested translation is specified in the extended-context-entry, requests-with-PASID are subject to nested first-level and second-level translation.
• Translation Structure Pointers: For first-level translation, the extended-context-entry contains a pointer to a PASID-table. Each 8-byte PASID-table-entry corresponds to a PASID value, and contains the root of first-level translation structures used to translate requests-with-PASID tagged with the respective PASID. For second-level translation, the extended-context-entry contains a pointer to the second-level page-table, which is the same as the second-level page-table pointer field in the regular context-entry (described in Section 3.4.3).


• Translation Controls: These include additional controls such as Page-Global-Enable, Write-Protect-Enable, No-Execute-Enable, Supervisor-Mode-Execute-Protection, etc. that are applied when processing requests-with-PASID.
• Memory-type Attributes: Extended-context-entries support fields such as Page-Attribute-Table, Extended-memory-type etc., that are used to compute the effective memory-type for requests-with-PASID from devices operating in the processor coherency domain.
• Page Request Enable: The page-request-enable field in the extended-context-entry allows software to selectively enable or disable page-fault requests from the device. When enabled, page-requests from the device are reported to software through a memory-resident page-request-queue. Chapter 7 provides details on page request processing.
• Deferred Invalidation Controls: The PASID-state table pointer field enables devices to communicate whether a given address-space (PASID) is active or not at the device. Software can utilize the PASID-state tables for deferred invalidation of cached mappings for inactive PASIDs in translation caches (TLBs). Chapter 6 describes the various translation caching structures and invalidation operations, including deferred invalidation support.

Figure 3-8 illustrates device to domain mapping using extended-root-table.

[Figure: the extended-root-table holds ext-root-entries 0 through 255, one per bus (Bus 0, ..., Bus N, ..., Bus 255). Each ext-root-entry references a lower-context-table, whose ext-context-entries 0 through 127 correspond to (Dev 0, Func 0) through (Dev 15, Func 7), and an upper-context-table, whose ext-context-entries 0 through 127 correspond to (Dev 16, Func 0) through (Dev 31, Func 7). Ext-context-entries reference the second-level paging structures for their domain (Domain A, Domain B). For Domain B, a PASID-table with PASID-entries (0, ..., P1, P2) references the first-level paging structures for processes P1 and P2.]

Figure 3-8. Device to Domain Mapping Structures using Extended-Root-Table
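The split of one bus across the lower- and upper-context-tables in Figure 3-8 can be sketched as follows. The 32-byte entry size is derived from a 4-KByte table holding 128 entries. Indexing the PASID-table directly by PASID value is an assumption based on the statement that each 8-byte entry corresponds to a PASID value. The function and constant names are hypothetical.

```python
EXT_CONTEXT_TABLE_BYTES = 4096
EXT_CONTEXT_ENTRIES = 128
EXT_CONTEXT_ENTRY_BYTES = EXT_CONTEXT_TABLE_BYTES // EXT_CONTEXT_ENTRIES  # 32
PASID_ENTRY_BYTES = 8


def ext_context_index(device, function):
    """Return (table, index) of the extended-context-entry for a device function."""
    if not (0 <= device < 32 and 0 <= function < 8):
        raise ValueError("device must be 0-31 and function 0-7")
    table = "lower" if device < 16 else "upper"
    index = ((device % 16) << 3) | function  # 16 devices x 8 functions per table
    return table, index


def pasid_entry_offset(pasid):
    """Byte offset of a PASID-table-entry, assuming direct indexing by PASID."""
    if pasid < 0:
        raise ValueError("PASID must be non-negative")
    return pasid * PASID_ENTRY_BYTES
```

The boundary cases match the figure: (Dev 15, Func 7) is entry 127 of the lower-context-table and (Dev 16, Func 0) is entry 0 of the upper-context-table.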


3.5 Hierarchical Translation Structures

DMA remapping uses hierarchical translation structures for both first-level translation (for requests-with-PASID) and second-level translation (for requests-without-PASID and for nested translation of requests-with-PASID).

For first-level translation, and second-level translation of requests-without-PASID, the DMA-address in the request is used as the input address. For nested translation of requests-with-PASID, any address generated by first-level translation (both addresses to access first-level translation structures and the output address from first-level translation) is used as the input address for nesting with second-level translation. Section 3.6, Section 3.7 and Section 3.8 provide more details on first-level, second-level, and nested translation respectively.

Every paging structure in the hierarchy is 4-KBytes in size, with 512 8-Byte entries. Remapping hardware uses the upper portion of the input address to identify a series of paging-structure entries. The last of these entries identifies the physical address of the region to which the input address translates (called the page frame). The lower portion of the input address (called the page offset) identifies the specific offset within that region to which the input address translates. Each paging-structure entry contains a physical address, which is either the address of another paging structure or the address of a page frame. First-level translation supports a 4-level structure. Second-level translation supports an N-level structure, where the value of N depends on the Guest Address Width (GAW) supported by an implementation as enumerated in the Capability Register.

The paging structures support a base page-size of 4-KBytes. The page-size field in paging entries enables larger page allocations. When a paging entry with the page-size field Set is encountered by hardware on a page-table walk, the translated address is formed immediately by combining the page-base-address in the paging-entry with the unused input address bits. Remapping architecture defines support for 2-MByte and 1-GByte large-page sizes. Implementations report support for large-pages and intermediate-pages through the Capability Register.

Figure 3-9 illustrates the paging structure for translating a 48-bit address to a 4-KByte page.
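The split of an input address into table indices and a page offset can be sketched in Python. With 512 entries per structure, each level consumes 9 address bits above the 12-bit offset of a 4-KByte page. The function names and the leaf_level numbering (1 for a 4-KByte page, 2 for 2-MByte, 3 for 1-GByte) are conventions chosen for this example, not terms from the specification.

```python
ENTRY_BITS = 9    # 512 entries per 4-KByte paging structure
OFFSET_BITS = 12  # 4-KByte base page size


def walk_indices(addr, levels=4):
    """Return ([index at top level, ..., index at last level], page offset)."""
    indices = []
    for level in range(levels, 0, -1):
        shift = OFFSET_BITS + ENTRY_BITS * (level - 1)
        indices.append((addr >> shift) & ((1 << ENTRY_BITS) - 1))
    return indices, addr & ((1 << OFFSET_BITS) - 1)


def large_page_offset(addr, leaf_level):
    """Unused input-address bits when the walk ends at leaf_level.

    leaf_level 1 is a 4-KByte page, 2 a 2-MByte page, 3 a 1-GByte page.
    """
    bits = OFFSET_BITS + ENTRY_BITS * (leaf_level - 1)
    return addr & ((1 << bits) - 1)
```

For a 48-bit input address and a 4-level walk, the indices come from bits 47:39, 38:30, 29:21 and 20:12, and the offset from bits 11:0. When a walk stops early at an entry with the page-size field Set, large_page_offset gives the bits that are combined with the page base address.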

[Figure: the input address is divided into four 9-bit index fields (bits 47:39, 38:30, 29:21 and 20:12) and a 12-bit page offset (bits 11:0).]