Notice: This document contains information on products in the design phase of development. The information here is subject to change without notice. Do not finalize a design with this information.
Intel technologies' features and benefits depend on system configuration and may require enabled hardware, software, or service activation. No computer system can be absolutely secure. Intel does not assume any liability for lost or stolen data or systems or any damages resulting from such losses. Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting www.intel.com/design/literature.htm. Intel, the Intel logo, and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries. Copyright © 2016, Intel Corporation. All Rights Reserved.
1 Introduction
1.1 Existing Paging in IA-32e Mode
1.2 Linear-Address Width and VMX Transitions
1.3 Existing Extended Page Tables (EPT)
2 Expanding Linear Addresses: 5-Level Paging
2.1 5-Level Paging: Introduction
2.2 Enumeration and Enabling
2.2.1 Enumeration by CPUID
2.2.2 Enabling by Software
2.3 Linear-Address Generation and Canonicality
2.4 5-Level Paging: Linear-Address Translation
2.5 Linear-Address Registers and Canonicality
2.5.1 Canonicality Checking on RIP Loads
2.5.2 Canonicality Checking on Other Loads
2.6 Interactions with TLB-Invalidation Instructions
2.7 Interactions with Intel® MPX
2.8 Interactions with Intel® SGX
3 Linear-Address Expansion and VMX Transitions
3.1 Linear-Address Expansion and VM Entries
3.2 Linear-Address Expansion and VM Exits
4 5-Level EPT
4.1 4-Level EPT: Guest-Physical-Address Limit
4.2 5-Level EPT: Enumeration and Enabling
4.2.1 Enumeration
4.2.2 Enabling by Software
4.3 5-Level EPT: Guest-Physical-Address Translation
4.4 5-Level EPT and EPTP Switching
5 Intel® Virtualization Technology for Directed I/O
Figures
1-1 Linear-Address Translation Using IA-32e Paging
2-1 Linear-Address Translation Using 5-Level Paging
Tables
2-1 Format of a PML5 Entry (PML5E) that References a PML4 Table
4-1 Format of an EPT PML5 Entry (EPT PML5E)
Initial Release
1 Introduction

This document describes planned extensions to the Intel 64 architecture to expand the size of addresses that can be translated through a processor's memory-translation hardware.

Modern operating systems use address-translation support called paging. Paging translates linear addresses (also known as virtual addresses), which are used by software, to physical addresses, which are used to access memory (or memory-mapped I/O). Section 1.1 describes the 64-bit paging hardware on Intel 64 processors. Existing processors limit linear addresses to 48 bits. Chapter 2 describes paging extensions that would relax that limit to 57 linear-address bits.

Virtual-machine monitors (VMMs) use the virtual-machine extensions (VMX) to support guest software operating in a virtual machine. VMX transitions are control-flow transfers between the VMM and guest software. VMX transitions involve the loading and storing of various processor registers. Some of these registers are defined to contain linear addresses. Because of this, the operation of VMX transitions depends in part on the linear-address width supported by the processor. Section 1.2 describes the existing treatment of linear-address registers by VMX transitions, while Chapter 3 describes the changes required to support larger linear addresses.

VMMs may also use additional address-translation support called extended page tables (EPT). When EPT is used, paging produces guest-physical addresses, which EPT translates to physical addresses. Section 1.3 describes the EPT hardware on existing Intel 64 processors, which limit guest-physical addresses to 48 bits. Chapter 4 describes EPT extensions to support 57 guest-physical-address bits.
1.1 Existing Paging in IA-32e Mode

On processors supporting Intel 64 architecture, software typically references memory using linear addresses. Most modern operating systems configure processors to use paging, which translates linear addresses to physical addresses. The processor uses the resulting physical addresses to access memory.

IA-32e mode is a mode of processor execution that extends the older 32-bit operation, known as legacy mode. Software can enter IA-32e mode with the following algorithm.
1. Use the MOV CR instruction to set CR4.PAE[bit 5]. (Physical-address extension must be enabled to enter IA-32e mode.)
2. Use the WRMSR instruction to set bit 8 (LME) of the IA32_EFER MSR (index C0000080H).
3. Use the MOV CR instruction to load CR3 with the address of a PML4 table (see below).
4. Use the MOV CR instruction to set CR0.PG[bit 31].

A logical processor is in IA-32e mode whenever CR0.PG = 1 and IA32_EFER.LME = 1. This fact is reported in IA32_EFER.LMA[bit 10]. Software cannot set this bit directly; it is always the logical-AND of CR0.PG and IA32_EFER.LME.
In IA-32e mode, linear addresses are 64 bits in size.¹ However, the corresponding paging mode (currently called IA-32e paging) does not use all 64 linear-address bits, because processors limit the size of linear addresses. This limit is enumerated by the CPUID instruction. Specifically, CPUID.80000008H:EAX[bits 15:8] enumerates the number of linear-address bits (the maximum linear-address width) supported by the processor. Existing processors enumerate this value as 48.

Note: Processors also limit the size of physical addresses and enumerate the limit using CPUID. CPUID.80000008H:EAX[bits 7:0] enumerates the number of physical-address bits supported by the processor, the maximum physical-address width. Existing processors have enumerated values up to 46. Software can use more than 32 physical-address bits only if physical-address extension has been enabled by setting CR4.PAE, bit 5 of control register CR4.

The enumerated limitation on the linear-address width implies that paging translates only the low 48 bits of each 64-bit linear address. After a linear address is generated but before it is translated, the processor confirms that the address uses only the 48 bits that the processor supports.

The limitation to 48 linear-address bits results from the nature of IA-32e paging, which is illustrated in Figure 1-1.
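As an illustration of the two CPUID fields described above, the following minimal Python sketch splits a raw EAX value from CPUID leaf 80000008H into its two width fields. The function name and the sample EAX values are hypothetical; the sketch does not execute CPUID and only decodes a value that software has already read.

```python
def decode_address_widths(eax: int) -> tuple[int, int]:
    """Split CPUID.80000008H:EAX into (physical_bits, linear_bits)."""
    physical_bits = eax & 0xFF        # EAX[bits 7:0]: maximum physical-address width
    linear_bits = (eax >> 8) & 0xFF   # EAX[bits 15:8]: maximum linear-address width
    return physical_bits, linear_bits


# Hypothetical raw value: 48 linear-address bits, 46 physical-address bits.
example_eax = (48 << 8) | 46
physical_bits, linear_bits = decode_address_widths(example_eax)
```

With the sample value, `physical_bits` is 46 and `linear_bits` is 48, matching the limits the text gives for existing processors.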
[Figure 1-1 shows the 4-level walk: CR3 locates the PML4 table; linear-address bits 47:39, 38:30, 29:21, and 20:12 select the PML4E, PDPTE, page-directory entry, and PTE, respectively; bits 11:0 are the offset within the 4-KByte page.]

Figure 1-1. Linear-Address Translation Using IA-32e Paging
¹ IA-32e mode comprises two sub-modes: compatibility mode and 64-bit mode. In compatibility mode, software uses 32-bit addresses, which the processor zero-extends to 64-bit linear addresses. In 64-bit mode, software uses 64-bit addresses directly.
The processor performs IA-32e paging by traversing a 4-level hierarchy of paging structures whose root structure resides at the physical address in control register CR3. Each paging structure is 4-KBytes in size and comprises 512 8-byte entries. The processor uses the upper 36 bits of a linear address (bits 47:12), 9 bits at a time, to select paging-structure entries from the hierarchy.

Note: Figure 1-1 illustrates the translation of a linear address to a 4-KByte page. The paging process can be configured so that the translation of some linear addresses stops one or two levels earlier, translating instead to 2-MByte pages or 1-GByte pages.

In general, bits 51:12 of each paging-structure entry contain a 4-KByte aligned physical address. For each entry except the last, this address is that of the next paging structure; in the last entry, it is the physical address of a 4-KByte page frame. The final physical address is obtained by combining this page-frame address with the page offset, bits 11:0 of the original linear address.

Because only bits 47:0 of a linear address are used in address-translation, the processor reserves bits 63:48 for future expansion using a concept known as canonicality. A linear address is canonical if bits 63:47 of the address are identical. (Put differently, a linear address is canonical only if bits 63:48 are a sign-extension of bit 47, which is the uppermost bit used in linear-address translation.)

When a 64-bit linear address is generated to access memory, the processor first confirms that the address is canonical. If the address is not canonical, the memory access causes a fault, and the processor makes no attempt to translate the address.¹

Intel 64 architecture includes numerous registers that are defined to hold linear addresses. These registers may be loaded using a variety of instructions. In most cases, these instructions cause a general-protection exception (#GP) if an attempt is made to load one of these registers with a value that is not canonical.

Physical-address bits in a paging-structure entry beyond the enumerated physical-address width are reserved. A page-fault exception (#PF) results if an attempt is made to access a linear address whose translation encounters a paging-structure entry that sets any of those bits.
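The way a linear address is carved into four 9-bit indexes and a 12-bit page offset can be sketched as follows. This is a minimal Python illustration, not a description of any hardware implementation; the function names, dictionary keys, and sample values are hypothetical, and the sketch covers only translation to a 4-KByte page.

```python
ENTRY_ADDR_MASK = ((1 << 52) - 1) & ~0xFFF   # bits 51:12 of a paging-structure entry


def split_linear_address_4level(la: int) -> dict[str, int]:
    """Return the four 9-bit table indexes and the 12-bit page offset."""
    return {
        "pml4": (la >> 39) & 0x1FF,   # bits 47:39
        "pdpt": (la >> 30) & 0x1FF,   # bits 38:30
        "pd": (la >> 21) & 0x1FF,     # bits 29:21
        "pt": (la >> 12) & 0x1FF,     # bits 20:12
        "offset": la & 0xFFF,         # bits 11:0
    }


def final_physical_address(last_entry: int, la: int) -> int:
    """Combine the page-frame address in the last entry with the page offset."""
    return (last_entry & ENTRY_ADDR_MASK) | (la & 0xFFF)
```

For the linear address 0x00007FFFFFFFF123, the PML4 index is 0xFF, the other three indexes are 0x1FF, and the offset is 0x123.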
¹ In general, an attempt to access memory using a linear address that is not canonical causes a general-protection exception (#GP). A stack-fault exception (#SS) occurs instead if the memory access was made using the SS segment.

1.2 Linear-Address Width and VMX Transitions

VM entries and VM exits manipulate numerous processor registers that contain linear addresses. The transitions respect the processor's linear-address width in a manner based on canonicality.

Certain fields in the VMCS correspond to registers that contain linear addresses. VM entries confirm that most of those fields contain values that are canonical. Some registers, such as RIP and the LDTR base address, receive special treatment.

VM exits save into the VMCS the state of certain registers, some of which contain linear addresses. Because the processor generally ensures that the values in these registers are canonical (see Section 1.1), the values that VM exits save for these registers will generally be canonical.

VM exits also load from the VMCS certain registers, some of which contain linear addresses. Each VM exit ensures that the value of each of these registers is canonical. Specifically, bits 47:0 of the register are loaded from the field in the host-state area; the value of bit 47 is then sign-extended into bits 63:48 of the register.
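The host-state load described in the last paragraph can be modeled as a simple bit operation. The following minimal Python sketch assumes a 48-bit linear-address width; the function name is hypothetical and the sketch models only the resulting register value, not the VM exit itself.

```python
def vm_exit_load_linear_address(host_field: int) -> int:
    """Model a host-state load: take bits 47:0, then sign-extend bit 47."""
    low48 = host_field & ((1 << 48) - 1)   # bits 47:0 come from the VMCS field
    if low48 & (1 << 47):                  # bit 47 set: fill bits 63:48 with ones
        return low48 | (0xFFFF << 48)
    return low48                           # bit 47 clear: bits 63:48 are zero
```

Whatever the field holds in bits 63:48 is discarded, so the loaded register value is always canonical.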
1.3 Existing Extended Page Tables (EPT)

Most Intel 64 processors supporting VMX also support an additional layer of address translation called extended page tables (EPT).

VM entry can be configured to activate EPT for guest software. When EPT is active, the addresses used and produced by paging (Section 1.1) are not used as physical addresses to reference in memory. Instead, the processor interprets them as guest-physical addresses, and translates them to physical addresses in a manner determined by the VMM. (This translation from guest-physical to physical applies not only to the output of paging but also to the addresses that the processor uses to reference the guest paging structures.)

If the EPT translation process cannot translate a guest-physical address, it causes an EPT violation. (EPT violations may also occur when an access to a guest-physical address violates the permissions established by EPT for that guest-physical address.) An EPT violation is a VMX-specific exception, usually causing a VM exit.

As noted in Section 1.1, existing processors limit physical addresses to 46 bits. That limit applies also to guest-physical addresses. As a result, guest-physical addresses that set bits beyond this limit are not translated by EPT. (For example, a page fault results if linear-address translation encounters a paging-structure entry with such an address.) Because of this, existing EPT has been limited to translating only 48 guest-physical-address bits.

The existing EPT translation process is analogous to the paging process that was illustrated earlier in Figure 1-1. Like 4-level paging, the processor implements EPT by traversing a 4-level hierarchy of 4-KByte EPT paging structures. The last EPT paging-structure entry contains the upper bits of the final physical address, while the lowest bits come from the original guest-physical address.
2 Expanding Linear Addresses: 5-Level Paging
2.1 5-Level Paging: Introduction

5-level paging is a new paging mode that will be available in IA-32e mode. As its name suggests, it will translate linear addresses by traversing a 5-level hierarchy of paging structures. Because the process is otherwise unmodified, 5-level paging extends the processor's linear-address width to 57 bits. (The additional 9 bits are used to select an entry from the fifth level of the hierarchy.) For clarity, the paging mode formerly called IA-32e paging will now be called 4-level paging.

The remainder of this chapter specifies the architectural changes that define and are entailed by 5-level paging. Section 2.2 specifies how the CPU enumerates the new feature and how it is enabled by software. Section 2.3 describes changes to the process of linear-address generation, as well as a revision to the concept of canonicality. Section 2.4 details how 5-level paging translates linear addresses. Section 2.5 clarifies how the processor treats loads of registers containing linear addresses, while Section 2.6 to Section 2.8 consider interactions with various other features. (Interactions with the virtual-machine extensions are specified in Chapter 3.)
2.2 Enumeration and Enabling

This section describes how processors enumerate to software support for 5-level paging and related features and also how software enables the processor to use that support.
2.2.1 Enumeration by CPUID

Processors supporting the Intel 64 architecture typically use the CPUID instruction to enumerate to software specific processor functionality. Those processors that support 5-level paging enumerate that fact through a new feature flag as well as through changes in how related features are reported:
• CPUID.(EAX=07H, ECX=0):ECX[bit 16] is a new feature flag that will enumerate basic support for 5-level paging. All older processors clear this bit. A processor will set this bit if and only if it supports 5-level paging.
• As noted in Section 1.1, CPUID.80000008H:EAX[bits 15:8] enumerates the maximum linear-address width supported by the processor. All older processors that support Intel 64 architecture enumerated this value as 48. Processors that support 5-level paging will instead enumerate this value as 57.
• As noted in Section 1.1, CPUID.80000008H:EAX[bits 7:0] enumerates the maximum physical-address width supported by the processor. Processors that support Intel 64 architecture have enumerated at most 46 for this value. Processors that support 5-level paging are expected to enumerate higher values, up to 52.
• CPUID.(EAX=07H, ECX=0H):ECX.MAWAU[bits 21:17] is an existing field that enumerates the user MPX address-width adjust (MAWAU). This value specifies the number of linear-address bits above 48 on which the BNDLDX and BNDSTX instructions operate in 64-bit mode when CPL = 3. Older processors that support Intel® MPX enumerated 0 for this value. Processors that support 5-level paging may enumerate either 0 or 9, depending on configuration by system software. See Section 2.7 for more details on how BNDLDX and BNDSTX use MAWAU and how system software determines its value.
• CPUID.(EAX=12H, ECX=0H):EDX[bits 15:8] is an existing field that enumerates information that specifies the maximum supported size of a 64-bit enclave. If the value enumerated is n, the maximum size is 2^n. Older processors that support Intel® SGX enumerated at most 47 for this value. Processors that support 5-level paging are expected to enumerate this value as 56.

2.2.2 Enabling by Software

Section 1.1 identified an algorithm by which software can enter IA-32e mode. On processors that do not support 5-level paging, this algorithm enables 4-level paging. On processors that support 5-level paging, it can be adapted to enable 5-level paging instead.

Processors that support 5-level paging allow software to set a new enabling bit, CR4.LA57[bit 12].¹ A logical processor in IA-32e mode (IA32_EFER.LMA = 1) uses 5-level paging if CR4.LA57 = 1. Outside of IA-32e mode (IA32_EFER.LMA = 0), the value of CR4.LA57 does not affect paging operation.

The following items detail how a logical processor determines the current paging mode.
• If CR0.PG = 0, paging is disabled.
• If IA32_EFER.LMA = 0, one of the legacy 32-bit paging modes is used (depending on the value of legacy paging-mode bits in CR4).²
• If IA32_EFER.LMA = 1 and CR4.LA57 = 0, 4-level paging is used.
• If IA32_EFER.LMA = 1 and CR4.LA57 = 1, 5-level paging is used.

Software can thus use the following algorithm to enter IA-32e mode with 5-level paging.
1. Use the MOV CR instruction to set CR4.PAE and CR4.LA57.
2. Use the WRMSR instruction to set IA32_EFER.LME.
3. Use the MOV CR instruction to load CR3 with the address of a PML5 table (see Section 2.4).
4. Use the MOV CR instruction to set CR0.PG.

The processor allows software to modify CR4.LA57 only outside of IA-32e mode. In IA-32e mode, an attempt to modify CR4.LA57 using the MOV CR instruction causes a general-protection exception (#GP).
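The enumeration bits and the paging-mode rules in this section can be summarized in a short Python sketch. It operates on raw register values that software is assumed to have read already; the constant and function names and the returned mode strings are hypothetical labels chosen for this illustration.

```python
CR0_PG = 1 << 31               # CR0.PG[bit 31]
CR4_PAE = 1 << 5               # CR4.PAE[bit 5]
CR4_LA57 = 1 << 12             # CR4.LA57[bit 12]
EFER_LME = 1 << 8              # IA32_EFER.LME[bit 8]
CPUID_7_0_ECX_LA57 = 1 << 16   # CPUID.(EAX=07H, ECX=0):ECX[bit 16]


def la57_supported(cpuid_7_0_ecx: int) -> bool:
    """True if the feature flag for 5-level paging is set."""
    return bool(cpuid_7_0_ecx & CPUID_7_0_ECX_LA57)


def mawau(cpuid_7_0_ecx: int) -> int:
    """Extract ECX.MAWAU[bits 21:17]."""
    return (cpuid_7_0_ecx >> 17) & 0x1F


def paging_mode(cr0: int, cr4: int, efer: int) -> str:
    """Classify the current paging mode from CR0, CR4, and IA32_EFER."""
    if not cr0 & CR0_PG:
        return "disabled"
    # IA32_EFER.LMA is the logical-AND of CR0.PG and IA32_EFER.LME;
    # CR0.PG is known to be 1 at this point.
    lma = bool(efer & EFER_LME)
    if not lma:
        return "legacy"        # one of the legacy 32-bit paging modes
    return "5-level" if cr4 & CR4_LA57 else "4-level"
```

Note that `paging_mode` returns "disabled" when CR0.PG = 0 even if CR4.LA57 is already set, which corresponds to the state after step 2 of the algorithm above and before step 4.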
¹ Software can set CR4.LA57 only if CPUID.(EAX=07H, ECX=0):ECX[bit 16] is enumerated as 1.
² Recall that IA32_EFER.LMA is the logical-AND of CR0.PG and IA32_EFER.LME.

2.3 Linear-Address Generation and Canonicality

As noted in Section 1.1, processors with a linear-address width of 48 bits reserve linear-address bits 63:48 for future expansion. Linear addresses that use only bits 47:0 (because bits 63:48 are a sign-extension of bit 47) are called canonical.

When a 64-bit linear address is generated to access memory, the processor first confirms that the address is canonical. If the address is not canonical, the memory access causes a fault, and the address is not translated.

Processors that support 5-level paging can translate 57-bit linear addresses when 5-level paging is enabled. But if software has enabled only 4-level paging, such a processor can translate only 48-bit linear addresses. This fact motivates the definition of two levels of canonicality.

A linear address is 48-bit canonical if bits 63:47 of the address are identical. Similarly, an address is 57-bit canonical if bits 63:56 of the address are identical. Any linear address that is 48-bit canonical is also 57-bit canonical.

When a 64-bit linear address is generated to access memory, a processor that supports 5-level paging checks for canonicality based on the current paging mode: if 4-level paging is enabled, the address must be 48-bit canonical; if 5-level paging is enabled, the address need only be 57-bit canonical. If the appropriate canonicality is not observed, the memory access causes a fault.
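The two levels of canonicality defined above reduce to one parameterized test. The following minimal Python sketch assumes the address is passed as an unsigned 64-bit integer; the function names are hypothetical.

```python
def is_canonical(la: int, width: int) -> bool:
    """True if bits 63:(width-1) of a 64-bit linear address are identical."""
    upper = la >> (width - 1)                # bits 63:(width-1), right-aligned
    all_ones = (1 << (64 - width + 1)) - 1   # the same field with every bit set
    return upper == 0 or upper == all_ones


def memory_access_allowed(la: int, five_level_active: bool) -> bool:
    """Apply the canonicality check that matches the current paging mode."""
    return is_canonical(la, 57 if five_level_active else 48)
```

For example, 0x0000800000000000 (only bit 47 set) is 57-bit canonical but not 48-bit canonical, so an access to it would fault under 4-level paging but pass the check under 5-level paging.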
2.4 5-Level Paging: Linear-Address Translation

As noted in Section 2.2.2, a logical processor uses 5-level paging if IA32_EFER.LMA = 1 and CR4.LA57 = 1.

Like 4-level paging, 5-level paging translates linear addresses using a hierarchy of in-memory paging structures. Because 5-level paging increases the linear-address width to 57 bits (from the 48 bits supported by 4-level paging), 5-level paging allows up to 128 PBytes of linear-address space to be accessed at any given time.

Also like 4-level paging, 5-level paging uses CR3 to locate the first paging-structure in the hierarchy. (CR3 has the same mode-specific format with 5-level paging as it does with 4-level paging.) The following items describe in more detail the changes that 5-level paging makes to the translation process.
• Translation begins by identifying a 4-KByte naturally aligned PML5 table. It is located at the physical address specified in bits 51:12 of CR3. A PML5 table comprises 512 64-bit entries (PML5Es). A PML5E is selected using the physical address defined as follows.
  - Bits 51:12 are from CR3.
  - Bits 11:3 are bits 56:48 of the linear address.
  - Bits 2:0 are all 0.
  Because a PML5E is identified using bits 56:48 of the linear address, it controls access to a 256-TByte region of the linear-address space. The format of a PML5E is given in Table 2-1.
Table 2-1. Format of a PML5 Entry (PML5E) that References a PML4 Table

Bit Position(s)   Contents
0 (P)             Present; must be 1 to reference a PML4 table.
1 (R/W)           Read/write; if 0, writes may not be allowed to the 256-TByte region controlled by this entry.
2 (U/S)           User/supervisor; if 0, user-mode accesses are not allowed to the 256-TByte region controlled by this entry.
3 (PWT)           Page-level write-through; indirectly determines the memory type used to access the PML4 table referenced by this entry.
4 (PCD)           Page-level cache disable; indirectly determines the memory type used to access the PML4 table referenced by this entry.
5 (A)             Accessed; indicates whether this entry has been used for linear-address translation.
6                 Ignored.
7 (PS)            Reserved (must be 0).
11:8              Ignored.
M-1:12            Physical address of 4-KByte aligned PML4 table referenced by this entry.
51:M              Reserved (must be 0).
62:52             Ignored.
63 (XD)           If IA32_EFER.NXE = 1, execute-disable (if 1, instruction fetches are not allowed from the 256-TByte region controlled by this entry); otherwise, reserved (must be 0).
• The next step of the translation process identifies a 4-KByte naturally aligned PML4 table. It is located at the physical address specified in bits 51:12 of the PML5E (see Table 2-1). A PML4 table comprises 512 64-bit entries (PML4Es). A PML4E is selected using the physical address defined as follows.
  - Bits 51:12 are from the PML5E.
  - Bits 11:3 are bits 47:39 of the linear address.
  - Bits 2:0 are all 0.
  As is normally the case when accessing a paging-structure entry, the memory type used to access the PML4E is based in part on the PCD and PWT bits in the PML5E.
  Because a PML4E is identified using bits 56:39 of the linear address, it controls access to a 512-GByte region of the linear-address space.
• Once the PML4E is identified, bits 38:0 of the linear address determine the remainder of the translation process exactly as is done for 4-level paging.

As suggested in Table 2-1, the values of bit 1, bit 2, and bit 63 of the PML5E are used normally (in combination with the corresponding bits in other paging-structure entries) to determine access rights. The accessed flag (bit 5) in the PML5E is updated as is done for other paging-structure entries.

The operation of 5-level paging is illustrated in Figure 2-1.
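The entry-selection arithmetic for the two upper levels can be written out directly from the bit lists above. The following Python sketch is illustrative only: the function and constant names are hypothetical, and it takes bits 51:12 of CR3 and of the PML5E at face value without modeling the reserved-bit or present-bit checks.

```python
ADDR_MASK_51_12 = ((1 << 52) - 1) & ~0xFFF   # bits 51:12


def pml5e_address(cr3: int, la: int) -> int:
    """Bits 51:12 from CR3, bits 11:3 from linear-address bits 56:48, bits 2:0 zero."""
    return (cr3 & ADDR_MASK_51_12) | (((la >> 48) & 0x1FF) << 3)


def pml4e_address(pml5e: int, la: int) -> int:
    """Bits 51:12 from the PML5E, bits 11:3 from linear-address bits 47:39, bits 2:0 zero."""
    return (pml5e & ADDR_MASK_51_12) | (((la >> 39) & 0x1FF) << 3)


# Sizes implied by the number of linear-address bits below each level.
PML5E_REGION_BYTES = 1 << 48     # 256 TBytes per PML5E
PML4E_REGION_BYTES = 1 << 39     # 512 GBytes per PML4E
LINEAR_SPACE_BYTES = 1 << 57     # 128 PBytes in total
```

The three size constants restate the figures quoted in the text: 512 PML5Es of 256 TBytes each cover the full 128-PByte linear-address space.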
2.5 Linear-Address Registers and Canonicality

Intel 64 architecture includes numerous registers that are defined to hold linear addresses. These registers may be loaded using a variety of instructions. As noted in Section 1.1, each of these instructions typically causes a general-protection exception (#GP) if an attempt is made to load a linear-address register with a value that is not canonical.

As noted in Section 2.3, processors that support 5-level paging use two definitions of canonicality: 48-bit canonicality and 57-bit canonicality. This section describes how such a processor checks the canonicality of the values being loaded into the linear-address registers. One approach is used for operations that load RIP (the instruction pointer; see Section 2.5.1) and another is used for those that load other registers (see Section 2.5.2).
[Figure 2-1 shows the 5-level walk: CR3 locates the PML5 table; linear-address bits 56:48 select the PML5E, after which bits 47:39, 38:30, 29:21, and 20:12 select the PML4E, PDPTE, page-directory entry, and PTE, respectively; bits 11:0 are the offset within the 4-KByte page.]

Figure 2-1. Linear-Address Translation Using 5-Level Paging

2.5.1 Canonicality Checking on RIP Loads

The RIP register contains the offset of the current instruction pointer within the CS segment. Because the processor treats the CS base address as zero in 64-bit mode, the value of the RIP register in that mode is the linear address of the instruction pointer.

Operations that load RIP (including both instructions such as JMP as well as control transfers through the IDT) check first whether the value to be loaded is canonical relative to the current paging mode. If the processor determines that the address is not canonical, the RIP load is not performed and a general-protection exception (#GP) occurs.

Note: An instruction that would load RIP with a non-canonical address faults, meaning that the return instruction pointer of the fault handler is the address of the faulting instruction and not the non-canonical address whose load was attempted.

The canonicality checking performed by these operations uses 48-bit canonicality when 4-level paging is active. When 5-level paging is active, the checking is relaxed to require only 57-bit canonicality.

The SYSCALL and SYSENTER instructions load RIP from the IA32_LSTAR and IA32_SYSENTER_EIP MSRs, respectively. On processors that support only 4-level paging, these instructions do not check that the values being loaded are canonical because the WRMSR instruction ensures that each of these MSRs contains a value that is 48-bit canonical. On processors that support 5-level paging, the checking by WRMSR is relaxed to 57-bit canonicality (see Section 2.5.2). On such processors, an execution of SYSCALL or SYSENTER with 4-level paging checks that the value being loaded into RIP is 48-bit canonical.¹

The normal advancing of the instruction pointer to the next instruction boundary may result in the RIP register holding a non-canonical address. The fetch of the next instruction from that non-canonical address will result in a general-protection exception as indicated in Section 2.3. In this case, the return instruction pointer of the fault handler will be that non-canonical address.
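The interplay between the WRMSR check and the SYSCALL check can be illustrated with a small model. In the Python sketch below, all function names are hypothetical, `max_linear_width` stands for the value enumerated in CPUID.80000008H:EAX[bits 15:8], and the boolean results stand for "no fault" versus "fault"; nothing here models the exception delivery itself.

```python
def is_canonical(value: int, width: int) -> bool:
    """True if bits 63:(width-1) of a 64-bit value are identical."""
    upper = value >> (width - 1)
    return upper == 0 or upper == (1 << (64 - width + 1)) - 1


def wrmsr_lstar_allowed(value: int, max_linear_width: int) -> bool:
    """WRMSR checks against the enumerated maximum linear-address width."""
    return is_canonical(value, max_linear_width)


def syscall_rip_load_allowed(lstar: int, five_level_active: bool) -> bool:
    """SYSCALL checks the RIP value against the current paging mode."""
    return is_canonical(lstar, 57 if five_level_active else 48)


# Hypothetical IA32_LSTAR value: 57-bit canonical but not 48-bit canonical.
lstar = 0x0080000000000000
```

On a processor enumerating a width of 57, `wrmsr_lstar_allowed(lstar, 57)` is true, yet `syscall_rip_load_allowed(lstar, False)` is false. This is the case that the additional SYSCALL check under 4-level paging exists to catch.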
2 .5 .2
Ca n on ica lit y Ch e ck in g on Ot h e r Loa ds I n addit ion t o RI P, t he CPU m aint ains num erous ot her regist ers t hat hold linear addresses: •
GDTR and I DTR ( in t heir base- address port ions) .
LDTR, TR, FS, and GS ( in t he base- address port ionsof t heir hidden descript or caches) .
The debug- address regist ers ( DR0 t hrough DR3) , which hold t he linear addresses of breakpoint s.
The x87 FPU inst ruct ion point er ( FI P) .
The user- m ode configurat ion regist er BNDCFGU, used by I nt el® MPX.
Wit h a few except ions, t he processor ensures t hat t he addresses in t hese regist ers are always canonical in t he following ways. •
Som e inst ruct ions fault on at t em pt s t o load a linear- address regist er wit h a noncanonical address: — An execut ion of t he LGDT or LI DT inst ruct ion causes a general- prot ect ion except ion ( # GP) if t he base address specified in t he inst ruct ion’s m em ory operand is not canonical. — An execut ion of t he LLDT or LTR inst ruct ion causes a # GP if t he base address t o be loaded from t he GDT is not canonical. — An execut ion of WRMSR, WRFSBASE, or WRGSBASE causes a # GP if it would load t he base address of eit her FS or GS wit h a non- canonical address. — An execut ion of WRMSR causes a # GP if it would load any of t he following MSRs wit h a non- canonical address: I A32_BNDCFGS, I A32_DS_AREA, I A32_FS_BASE, I A32_GS_BASE, I A32_KERNEL_GS_BASE, I A32_LSTAR, I A32_RTI T_ADDR0_A, I A32_RTI T_ADDR0_B, I A32_RTI T_ADDR1_A, I A32_RTI T_ADDR1_B, I A32_RTI T_ADDR2_A, I A32_RTI T_ADDR2_B, I A32_RTI T_ADDR3_A, I A32_RTI T_ADDR3_B, I A32_SYSENTER_EI P, or I A32_SYSENTER_ESP. 2
1. The SYSRET and SYSEXI T inst ruct ions, which com plem ent SYSCALL and SYSENTER, load RI P from RCX and RDX, respect ively. Even before 5- level paging, t hese inst ruct ions checked t he canonicalit y of t he value t o be loaded int o RI P. As wit h ot her inst ruct ions t hat load RI P, t his checking will be based on t he current paging m ode. 2. Such canonicalit y checking m ay apply also when t he WRMSR inst ruct ion is used t o load som e non- archit ect ural MSRs ( not list ed here) t hat hold a linear address.
— An execution of XRSTORS causes a #GP if it would load any of the following MSRs with a non-canonical address: IA32_RTIT_ADDR0_A, IA32_RTIT_ADDR0_B, IA32_RTIT_ADDR1_A, IA32_RTIT_ADDR1_B, IA32_RTIT_ADDR2_A, IA32_RTIT_ADDR2_B, IA32_RTIT_ADDR3_A, and IA32_RTIT_ADDR3_B.

This enforcement always uses the enumerated maximum linear-address width and is independent of the current paging mode. Thus, a processor that supports 5-level paging will allow the instructions mentioned above to load these registers with addresses that are 57-bit canonical but not 48-bit canonical, even if 4-level paging is active. (As a result, instructions that store these values, namely SGDT, SIDT, SLDT, STR, RDFSBASE, RDGSBASE, RDMSR, XSAVE, XSAVEC, XSAVEOPT, and XSAVES, may save addresses that are 57-bit canonical but not 48-bit canonical, even if 4-level paging is active.)

• The FXRSTOR, XRSTOR, and XRSTORS instructions ignore attempts to load some of these registers with non-canonical addresses:
— Loads of FIP ignore any bits in the memory image beyond the enumerated maximum linear-address width. The processor sign-extends the most significant bit (e.g., bit 56 on processors that support 5-level paging) to ensure that FIP is always canonical.
— Loads of BNDCFGU (by XRSTOR or XRSTORS) ignore any bits in the memory image beyond the enumerated maximum linear-address width. The processor sign-extends the most significant bit (e.g., bit 56 on processors that support 5-level paging) to ensure that BNDCFGU is always canonical.

• Every non-control x87 instruction loads FIP. The value loaded is always canonical relative to the current paging mode: 48-bit canonical if 4-level paging is active, and 57-bit canonical if 5-level paging is active.

• DR0 through DR3 can be loaded with the MOV to DR instruction. The instruction allows those registers to be loaded with non-canonical addresses. The MOV from DR instruction will return the value last loaded with the MOV to DR instruction, even if the address is not canonical. Breakpoint address matching is supported only for canonical linear addresses.
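The distinction between 48-bit and 57-bit canonicality used throughout this section can be illustrated with a short sketch. It is a software model only, not a description of processor internals; the function names `is_canonical` and `register_load_faults` are hypothetical and chosen for illustration. The second function models loads that are checked against the enumerated maximum linear-address width rather than the current paging mode.

```python
MASK64 = (1 << 64) - 1

def is_canonical(addr: int, width: int) -> bool:
    """True if bits 63:(width-1) of the 64-bit value are all identical."""
    upper = (addr & MASK64) >> (width - 1)
    return upper == 0 or upper == (1 << (65 - width)) - 1

def register_load_faults(addr: int, processor_supports_la57: bool) -> bool:
    """Model a load checked against the enumerated maximum width.

    The current paging mode is deliberately not an input: per the text,
    this enforcement is independent of it.
    """
    width = 57 if processor_supports_la57 else 48
    return not is_canonical(addr, width)

# An address that is 57-bit canonical but not 48-bit canonical.
example = 0x0080_0000_0000_0000
print(is_canonical(example, 57), is_canonical(example, 48))
```

On a processor that supports 5-level paging, `register_load_faults(example, True)` is false in this model even when 4-level paging is active, which is why a later store of the register may return a value that is not 48-bit canonical.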
2.6 Interactions with TLB-Invalidation Instructions

Intel 64 architecture includes three instructions that may invalidate TLB entries for the linear address of an instruction operand: INVLPG, INVPCID, and INVVPID. The following items describe how they are affected by linear-address width.

• The INVLPG instruction takes a memory operand. It invalidates any TLB entries that the logical processor is caching for the linear address of that operand for the current linear address space. The instruction does not fault if that address is not canonical relative to the current paging mode (e.g., is not 48-bit canonical when 4-level paging is active). However, no invalidation is performed because the processor does not cache TLB entries for addresses that are not canonical relative to the current paging mode.

• The INVPCID instruction takes a register operand (INVPCID type) and a memory operand (INVPCID descriptor). If the INVPCID type is 0, the instruction invalidates any TLB entries that the logical processor is caching for the linear address and PCID specified in the INVPCID descriptor. If the linear address is not canonical relative to the linear-address width supported by the processor, the instruction causes a general-protection exception (#GP). If the processor supports 5-level paging, the instruction will not cause such a #GP for an address that is 57-bit canonical, regardless of paging mode (that is, even if 4-level paging is active and the address is not 48-bit canonical).
• The INVVPID instruction takes a register operand (INVVPID type) and a memory operand (INVVPID descriptor). If the INVVPID type is 0, the instruction invalidates any TLB entries that the logical processor is caching for the linear address and VPID specified in the INVVPID descriptor. If the linear address is not canonical relative to the linear-address width supported by the processor, the instruction fails.¹ If the processor supports 5-level paging, the instruction will not fail for an address that is 57-bit canonical, regardless of paging mode (that is, even if 4-level paging is active and the address is not 48-bit canonical).

2.7 Interactions with Intel® MPX

The Intel® Memory Protection Extensions (Intel® MPX) define a set of 4 bound registers, each of which software can associate with a specific pointer in memory. Intel MPX includes two instructions, BNDLDX and BNDSTX, that allow software to load from or store into memory the bounds associated with a particular pointer in memory.

The BNDLDX and BNDSTX instructions each take a bound register and a memory operand (the associated pointer). Each of these parses the linear address of the memory operand to traverse a hierarchical data structure in memory. In 64-bit mode, these instructions do not necessarily use all the bits in the supplied 64-bit addresses. The number of bits used is 48 plus a value called the MPX address-width adjust (MAWA).

The value of MAWA depends on CPL; the current paging mode (4-level paging or 5-level paging); and, if 5-level paging is active, the value of a new MSR. Processors that support both Intel MPX and 5-level paging support the IA32_MPX_LAX MSR (MSR index 1000H). Only bit 0 of the MSR is defined.

If CPL < 3, the supervisor MAWA (MAWAS) is used. The value of MAWAS is determined by the setting of CR4.LA57. If CR4.LA57 = 0 (4-level paging is active; recall that MAWA is relevant only in 64-bit mode), the value of MAWAS is 0. If CR4.LA57 = 1 (5-level paging is active), the value of MAWAS is 9. The value of MAWAS is not enumerated by the CPUID instruction.

If CPL = 3, the user MAWA (MAWAU) is used. The value of MAWAU is determined as follows. If CR4.LA57 = 0 or IA32_MPX_LAX[bit 0] = 0, the value of MAWAU is 0. If CR4.LA57 = 1 and IA32_MPX_LAX[bit 0] = 1, the value of MAWAU is 9. The current value of MAWAU is enumerated in CPUID.(EAX=07H,ECX=0H):ECX.MAWAU[bits 21:17].
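The selection of MAWA described above can be summarized as a small decision function. This is an illustrative sketch only; the function and parameter names (`mawa`, `cr4_la57`, `mpx_lax_bit0`) are hypothetical labels for the architectural inputs named in the text, and the sketch applies only to 64-bit mode, where MAWA is relevant.

```python
def mawa(cpl: int, cr4_la57: bool, mpx_lax_bit0: bool) -> int:
    """Return the MPX address-width adjust for the given state.

    cpl           current privilege level (0..3)
    cr4_la57      True if CR4.LA57 = 1 (5-level paging active)
    mpx_lax_bit0  value of IA32_MPX_LAX[bit 0]
    """
    if cpl < 3:
        # Supervisor MAWA (MAWAS) depends only on CR4.LA57.
        return 9 if cr4_la57 else 0
    # User MAWA (MAWAU) also requires IA32_MPX_LAX[bit 0] = 1.
    return 9 if (cr4_la57 and mpx_lax_bit0) else 0

def address_bits_used(cpl: int, cr4_la57: bool, mpx_lax_bit0: bool) -> int:
    """Number of linear-address bits BNDLDX/BNDSTX use: 48 plus MAWA."""
    return 48 + mawa(cpl, cr4_la57, mpx_lax_bit0)
```

Note that with 5-level paging active, supervisor code always uses 57 address bits, while user code continues to use 48 bits until software sets IA32_MPX_LAX[bit 0].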
The following items specify how an execution of the BNDLDX and BNDSTX instructions in 64-bit mode parses a linear address to traverse a hierarchical data structure.

• A bound directory is located at the 4-KByte aligned linear address specified in bits 63:12 of BNDCFGx.² A BDE is selected using the LAp (linear address of pointer to a buffer) to construct a 64-bit offset as follows:
— bits 63:31+MAWA are 0;
— bits 30+MAWA:3 are LAp[bits 47+MAWA:20]; and
— bits 2:0 are 0.
The address of the BDE is the sum of the bound-directory base address (from BNDCFGx) plus this 64-bit offset.
If either BNDLDX or BNDSTX is executed inside an enclave, the instruction operates as if MAWAU = 0 (regardless of the values of CR4.LA57 and IA32_MPX_LAX[bit 0]).

• The processor uses bits 63:3 of the BDE as the 8-byte aligned address of a bound table (BT). A BTE is selected using the LAp (linear address of pointer to a buffer) to construct a 64-bit offset as follows:
— bits 63:22 are 0;
— bits 21:5 are LAp[bits 19:3]; and
— bits 4:0 are 0.
The address of the BTE is the sum of the bound-table base address (from the BDE) plus this 64-bit offset.

1. INVVPID is a VMX instruction. In response to certain conditions, execution of a VMX instruction may fail, meaning that it does not complete its normal operation. When a VMX instruction fails, control passes to the next instruction (rather than to a fault handler) and a flag is set to report the failure.
2. If CPL < 3, BNDCFGS is used; if CPL = 3, BNDCFGU is used.

A bound directory comprises 2^(28+MAWA) 64-bit entries (BDEs);¹ thus, the size of a bound directory in 64-bit mode is 2^(1+MAWA) GBytes. A bound table comprises 2^17 32-byte entries (BTEs); thus, the size of a bound table in 64-bit mode is 4 MBytes (independent of MAWA).
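The address arithmetic for bound-directory and bound-table entries can be sketched as follows. This is a minimal model of the offsets described above, not an implementation of BNDLDX or BNDSTX; the function names are hypothetical, and truncating the sum to 64 bits is an assumption made for the sketch.

```python
MASK64 = (1 << 64) - 1

def bde_address(bndcfg: int, lap: int, mawa: int) -> int:
    """Address of the bound-directory entry for pointer address lap."""
    base = bndcfg & MASK64 & ~0xFFF                  # bits 63:12 of BNDCFGx
    index = (lap >> 20) & ((1 << (28 + mawa)) - 1)   # LAp[47+MAWA:20]
    return (base + (index << 3)) & MASK64            # 8-byte entries

def bte_address(bde: int, lap: int) -> int:
    """Address of the bound-table entry for pointer address lap."""
    base = bde & MASK64 & ~0x7                       # bits 63:3 of the BDE
    index = (lap >> 3) & ((1 << 17) - 1)             # LAp[19:3]
    return (base + (index << 5)) & MASK64            # 32-byte entries

def bound_directory_bytes(mawa: int) -> int:
    """2^(28+MAWA) entries of 8 bytes each."""
    return (1 << (28 + mawa)) * 8

BOUND_TABLE_BYTES = (1 << 17) * 32                   # independent of MAWA
```

With MAWA = 0 the directory occupies 2 GBytes; with MAWA = 9 it occupies 2^10 GBytes (1 TByte). Inside an enclave, user-mode executions would call `bde_address` with `mawa=0` regardless of CR4.LA57 and IA32_MPX_LAX[bit 0].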
2.8 Interactions with Intel® SGX

Intel® Software Guard Extensions (Intel® SGX) define new processor functionality that is implemented as SGX leaf functions within the ENCLS (supervisor) and ENCLU (user) instructions.

The SGX leaf functions include memory accesses using linear addresses normally. When executed in 64-bit mode, the linear addresses are 64 bits in width and are subject to the normal treatment of accesses to memory with 64-bit linear addresses (see Section 2.3). In addition, some of the leaf functions apply specific architectural checks related to linear-address width. The following items detail these checks and how they are defined for processors that support 5-level paging.

• The ECREATE leaf function of ENCLS creates a new enclave by creating a new SGX enclave control structure (SECS). For a 64-bit enclave, the processor checks whether the enclave base linear address (specified in the SECS) is canonical, generating a general-protection exception (#GP) if it is not. On processors that support 5-level paging, this check is for 57-bit canonicality, regardless of the current paging mode.
In addition to checking the canonicality of the enclave base linear address, ECREATE confirms that the enclave size (specified in the SECS) is not greater than the maximum size supported by the processor (if the enclave size is too large, ECREATE generates a #GP). As noted in Section 2.2.1, older processors supported 64-bit enclaves with sizes up to 2^47 bytes; processors that support 5-level paging are expected to support enclaves with sizes up to 2^56 bytes.
If bits 4:3 of the enclave's XSAVE feature request mask (XFRM) are set (indicating that Intel MPX will be enabled during execution of the enclave), ECREATE generates a #GP if the enclave's size is greater than 2^48 bytes, even if the processor enumerates support for larger enclaves.
• The EENTER and ERESUME leaf functions of ENCLU transfer control flow to an entry point within a specified enclave. For entry to a 64-bit enclave, the processor checks whether certain linear addresses are canonical, generating a general-protection exception (#GP) if any one is not. The following items detail these checks.
— The linear address of the specified entry point must be canonical. If 4-level paging is active, it must be 48-bit canonical; if 5-level paging is active, it must be 57-bit canonical.
— The linear address of the asynchronous exit point (AEP, the address to which the processor transfers control on an asynchronous enclave exit) must be canonical. If 4-level paging is active, it must be 48-bit canonical; if 5-level paging is active, it must be 57-bit canonical.
— The enclave values for the base addresses of the FS and GS segments must be canonical. On processors that support 5-level paging, these checks are for 57-bit canonicality, regardless of the current paging mode.

• The EEXIT leaf function exits the currently executing enclave and branches to a specified address. For an exit from a 64-bit enclave, the processor checks whether that target linear address is canonical, generating a general-protection exception (#GP) if it is not. If 4-level paging is active, it must be 48-bit canonical; if 5-level paging is active, it need only be 57-bit canonical.

As noted in Section 2.7, executions of BNDLDX and BNDSTX in a 64-bit enclave always operate as if MAWAU = 0.

1. A bound directory used in a 64-bit enclave always comprises 2^28 64-bit BDEs and thus has a size of 2 GBytes.
3 Linear-Address Expansion and VMX Transitions

As noted in Section 1.2, VM entries and VM exits manipulate numerous processor registers that contain linear addresses. The transitions respect the processor's linear-address width in a manner based on canonicality.

As discussed in Chapter 2, processors that support 5-level paging expand the linear-address width from 48 bits to 57 bits. That expansion changes the operation of VMX transitions. Changes to VM entries are detailed in Section 3.1, while changes to VM exits are given in Section 3.2.
3.1 Linear-Address Expansion and VM Entries

Certain fields in the VMCS correspond to registers that contain linear addresses. VM entries confirm that those fields contain values that are canonical. This checking is based on the linear-address width supported by the processor (e.g., is based on 57-bit canonicality if the processor supports 5-level paging). The following are the fields to which this applies.

• In the host-state area:
— The fields for the IA32_SYSENTER_EIP and IA32_SYSENTER_ESP MSRs.
— The base-address fields for FS, GS, TR, GDTR, and IDTR.

• In the guest-state area:
— The fields for the IA32_SYSENTER_EIP and IA32_SYSENTER_ESP MSRs.
— The base-address fields for FS, GS, TR, GDTR, and IDTR.
— The base-address field for LDTR (if LDTR will be usable).
— The field for the IA32_BNDCFGS MSR (if VM entry is loading that MSR).

A VM entry to 64-bit mode also performs a check on the RIP field in the guest-state area of the current VMCS. If the VM entry would result in 4-level paging, it checks that bits 63:48 of the guest RIP field are identical; if it would result in 5-level paging, that check is on bits 63:57.¹
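The guest-RIP check is slightly weaker than a canonicality check, because it omits the bit just below the checked range (bit 47 or bit 56). The following sketch contrasts the two; the function names are hypothetical and the code models only the bit comparisons stated in this section.

```python
MASK64 = (1 << 64) - 1

def is_canonical(addr: int, width: int) -> bool:
    """True if bits 63:(width-1) are all identical."""
    upper = (addr & MASK64) >> (width - 1)
    return upper == 0 or upper == (1 << (65 - width)) - 1

def guest_rip_check_passes(rip: int, entering_5_level: bool) -> bool:
    """VM-entry check: bits 63:48 (4-level) or 63:57 (5-level) identical."""
    low = 57 if entering_5_level else 48
    upper = (rip & MASK64) >> low
    return upper == 0 or upper == (1 << (64 - low)) - 1

# Bit 47 set, bits 63:48 clear: passes the 4-level VM-entry check
# even though the value is not 48-bit canonical.
rip = 0x0000_8000_0000_0000
print(guest_rip_check_passes(rip, False), is_canonical(rip, 48))
```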
3.2 Linear-Address Expansion and VM Exits

VM exits save the state of certain registers into the guest-state area of the VMCS. Some of these registers contain linear addresses. As discussed in Section 1.1, the CPU generally ensures that the values in these registers respect the CPU's linear-address width. As a result, the values the VM exits save for these registers will do the same.
1. Note that these checks do not confirm that the guest RIP field is canonical relative to the paging mode being entered. For example, bits 63:47 are identical in a 48-bit canonical address. However, VM entry to 4-level paging may load RIP with a value in which bit 47 differs from that of bits 63:48.
There is a special case for the LDTR base address. If LDTR was not usable at the time of a VM exit, the value saved for the base address is undefined. However, this undefined value is always 48-bit canonical on processors that do not support 5-level paging and is always 57-bit canonical on processors that do support 5-level paging.

VM exits load the state of certain registers from the host-state area of the VMCS. Some of these registers contain linear addresses. Each VM exit ensures that the value of each of the following registers is canonical: the IA32_SYSENTER_EIP and IA32_SYSENTER_ESP MSRs; and the base addresses for FS, GS, TR, GDTR, and IDTR. How this is done depends on whether the processor supports 5-level paging.

• If the processor does not support 5-level paging, bits 47:0 of the register are loaded from the field in the host-state area; the value of bit 47 is then sign-extended into bits 63:48 of the register.

• If the processor does support 5-level paging, bits 56:0 of the register are loaded from the field in the host-state area; the value of bit 56 is then sign-extended into bits 63:57 of the register.

Again, there is a special case for LDTR. LDTR is always unusable after a VM exit. Its base address may be loaded with an undefined value. This undefined value is always 48-bit canonical on processors that do not support 5-level paging and is always 57-bit canonical on processors that do support 5-level paging.
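The host-state loading rule for VM exits amounts to truncation followed by sign extension. A minimal sketch, assuming a hypothetical helper name `load_host_linear_address`, is shown below; it models only the register value that results, not the VM exit itself.

```python
MASK64 = (1 << 64) - 1

def load_host_linear_address(field: int, supports_5_level_paging: bool) -> int:
    """Value loaded into a register from a host-state field on VM exit."""
    width = 57 if supports_5_level_paging else 48
    low_mask = (1 << width) - 1
    value = field & low_mask                 # bits (width-1):0 from the field
    if (value >> (width - 1)) & 1:           # bit 47 or bit 56 is set
        value |= MASK64 & ~low_mask          # sign-extend into the upper bits
    return value

# The same host-state field yields different register values depending
# on the width the processor supports.
field = 0x0000_8000_0000_0000
print(hex(load_host_linear_address(field, False)))
print(hex(load_host_linear_address(field, True)))
```

Because the width depends on processor support rather than on the paging mode, the result is always canonical relative to the supported linear-address width.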
4 5-Level EPT

5-level EPT is a new mode for EPT. As its name suggests, it will translate guest-physical addresses by traversing a 5-level hierarchy of EPT paging structures. Because the process is otherwise unmodified, 5-level EPT extends the processor's guest-physical-address width to 57 bits. (The additional 9 bits are used to select an entry from the fifth level of the hierarchy.) For clarity, the original EPT mode will now be called 4-level EPT.

The remainder of this chapter specifies architectural changes to 4-level EPT as well as those that define and are entailed by 5-level EPT. Section 4.1 describes how the expansion of the guest-physical-address width affects 4-level EPT. Section 4.2 specifies how the CPU enumerates 5-level EPT and how the feature is enabled by software. Section 4.3 details how 5-level EPT translates guest-physical addresses.
4.1 4-Level EPT: Guest-Physical-Address Limit

As explained in Section 1.3, 4-level EPT is limited to translating 48-bit guest-physical addresses. This is not a problem on existing processors, because they limit the physical-address width to 46 bits (see Section 1.1). A processor's physical-address width also limits guest-physical addresses. That means that, on existing processors, any attempt to use a guest-physical address that sets a bit above the low 48 bits will cause a page-fault exception (#PF).

Processors that support 5-level paging are expected to support 52 physical-address bits. Such processors allow use of a guest-physical address that sets bits in the range 51:48; no #PF is generated.

A guest-physical address that sets bits in the range 51:48 cannot be translated by 4-level EPT. An attempt to access such an address when 4-level EPT is active causes an EPT violation (see Section 1.3).

EPT violations generate information about the exception in a value called the exit qualification. In general, EPT violations caused by attempts to access a guest-physical address that is too wide establish the exit qualification as is currently done for other EPT violations. Exceptions are made for bits 6:3 of the exit qualification, which report the access rights for the guest-physical address. The new EPT violations always clear these bits.
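The condition that triggers these new EPT violations, and the adjustment to the exit qualification, can be sketched as below. The helper names are hypothetical, and the sketch assumes a processor with 52 physical-address bits and 4-level EPT active; it does not model how the remaining exit-qualification bits are formed.

```python
def too_wide_for_4_level_ept(gpa: int) -> bool:
    """True if the guest-physical address sets any bit above bit 47."""
    return (gpa >> 48) != 0

def exit_qualification_for_wide_gpa(qualification: int) -> int:
    """Clear bits 6:3, which would otherwise report access rights."""
    return qualification & ~(0b1111 << 3)
```

For example, a guest-physical address of 2^48 is too wide in this model, and an exit qualification of 7FH would be reported as 07H.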
4.2 5-Level EPT: Enumeration and Enabling

This section describes how processors enumerate to software support for 5-level EPT and how software enables the processor to use that support.
4.2.1 Enumeration

Processors supporting EPT enumerate details related to EPT in the IA32_VMX_EPT_VPID_CAP MSR (index 48CH). Currently, IA32_VMX_EPT_VPID_CAP[bit 6] enumerates support for 4-level EPT. Processors that also support 5-level EPT will enumerate that fact by also setting IA32_VMX_EPT_VPID_CAP[bit 7].

The guest-physical-address width supported by a processor is not enumerated using the IA32_VMX_EPT_VPID_CAP MSR. This is because that width is always the same as the processor's maximum physical-address width as enumerated by CPUID.80000008H:EAX[bits 7:0].
4.2.2 Enabling by Software

A VMM enables EPT by setting the "enable EPT" VM-execution control in the current VMCS before using the VMCS for VM entry.

Specific details of EPT operation are determined by the extended-page-table pointer field (EPTP) in the VMCS. In particular, EPTP[bits 5:3] contain a value that is 1 less than the number of levels used by the EPT. On existing processors, this value must be 3, indicating 4-level EPT. (VM entry fails if a different value is used.) Processors that also support 5-level EPT will also allow the value 4 (indicating 5-level EPT).

In summary, VM entry on a processor that supports 5-level EPT checks EPTP[bits 5:3]. If the value is 3, the VM entry activates 4-level EPT. If the value is 4, the VM entry activates 5-level EPT. With any other value, VM entry fails.
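The enumeration and enabling rules can be combined into a short sketch. The function names are hypothetical; the MSR value is passed in as a plain integer because reading an MSR requires privileged code, and only the EPTP[bits 5:3] check is modeled (other EPTP validity checks performed by VM entry are outside the scope of this sketch).

```python
from typing import Optional, Tuple

def ept_modes_enumerated(ept_vpid_cap: int) -> Tuple[bool, bool]:
    """Decode IA32_VMX_EPT_VPID_CAP: (4-level EPT, 5-level EPT) support."""
    return bool(ept_vpid_cap & (1 << 6)), bool(ept_vpid_cap & (1 << 7))

def ept_levels_for_vm_entry(eptp: int, supports_5_level_ept: bool) -> Optional[int]:
    """Number of EPT levels activated, or None if VM entry would fail."""
    field = (eptp >> 3) & 0b111          # EPTP[bits 5:3] = levels - 1
    if field == 3:
        return 4
    if field == 4 and supports_5_level_ept:
        return 5
    return None
```

A VMM would typically consult the capability bits first and only then place the value 4 in EPTP[bits 5:3].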
4.3 5-Level EPT: Guest-Physical-Address Translation

Like 4-level EPT, 5-level EPT translates guest-physical addresses using a hierarchy of in-memory paging structures. Because 5-level EPT increases the guest-physical-address width to 57 bits (from the 48 bits supported by 4-level EPT), 5-level EPT allows up to 128 PBytes of guest-physical-address space to be accessed at any given time.

The following items describe in more detail the changes that 5-level EPT makes to the translation process.

• Translation begins by identifying a 4-KByte naturally aligned EPT PML5 table. It is located at the physical address specified in bits 51:12 of EPTP. An EPT PML5 table comprises 512 64-bit entries (EPT PML5Es). An EPT PML5E is selected using the physical address defined as follows.
— Bits 63:52 are all 0.
— Bits 51:12 are from EPTP.
— Bits 11:3 are bits 56:48 of the guest-physical address.
— Bits 2:0 are all 0.
Table 4-1. Format of an EPT PML5 Entry (EPT PML5E)

Bit Position(s)  Contents
0                Read access; indicates whether reads are allowed from the 256-TByte region controlled by this entry.
1                Write access; indicates whether writes are allowed to the 256-TByte region controlled by this entry.
2                If the "mode-based execute control for EPT" VM-execution control is 0, execute access; indicates whether instruction fetches are allowed from the 256-TByte region controlled by this entry. If that control is 1, execute access for supervisor-mode linear addresses; indicates whether instruction fetches are allowed from supervisor-mode linear addresses in the 256-TByte region controlled by this entry.
7:3              Reserved (must be 0).
8                If bit 6 of EPTP is 1, accessed flag for EPT; indicates whether software has accessed the 256-TByte region controlled by this entry. Ignored if bit 6 of EPTP is 0.
9                Ignored.
10               Execute access for user-mode linear addresses. If the "mode-based execute control for EPT" VM-execution control is 1, indicates whether instruction fetches are allowed from user-mode linear addresses in the 256-TByte region controlled by this entry. If that control is 0, this bit is ignored.
11               Ignored.
M–1:12           Physical address of 4-KByte aligned EPT PML4 table referenced by this entry.
51:M             Reserved (must be 0).
63:52            Ignored.
• The next step of the translation process identifies a 4-KByte naturally aligned EPT PML4 table. It is located at the physical address specified in bits 51:12 of the EPT PML5E (see Table 4-1). An EPT PML4 table comprises 512 64-bit entries (EPT PML4Es). An EPT PML4E is selected using the physical address defined as follows.
— Bits 51:12 are from the EPT PML5E.
— Bits 11:3 are bits 47:39 of the guest-physical address.
— Bits 2:0 are all 0.
Because an EPT PML4E is identified using bits 56:39 of the guest-physical address, it controls access to a 512-GByte region of the guest-physical-address space.

• Once the EPT PML4E is identified, bits 38:0 of the guest-physical address determine the remainder of the translation process exactly as is done for 4-level EPT.

As suggested in Table 4-1, the values of bits 2:0 and bit 10 of the EPT PML5E are used normally (in combination with the corresponding bits in other EPT paging-structure entries) to determine whether EPT violations occur. The accessed flag (bit 8) in the EPT PML5E is updated as is done for other EPT paging-structure entries.
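The first two steps of the 5-level EPT walk reduce to simple bit-field composition. The sketch below computes only the physical addresses of the selected entries; the function names are hypothetical, and permission checks, reserved-bit checks, and accessed-flag updates are not modeled.

```python
ADDR_51_12 = ((1 << 52) - 1) & ~0xFFF    # mask for bits 51:12

def ept_pml5e_address(eptp: int, gpa: int) -> int:
    """Physical address of the EPT PML5E for guest-physical address gpa."""
    index = (gpa >> 48) & 0x1FF          # GPA bits 56:48
    return (eptp & ADDR_51_12) | (index << 3)

def ept_pml4e_address(pml5e: int, gpa: int) -> int:
    """Physical address of the EPT PML4E, given the selected PML5E."""
    index = (gpa >> 39) & 0x1FF          # GPA bits 47:39
    return (pml5e & ADDR_51_12) | (index << 3)

# Example: PML5 index 5 and PML4 index 7.
gpa = (5 << 48) | (7 << 39)
print(hex(ept_pml5e_address(0x1234_5000 | (4 << 3), gpa)))
print(hex(ept_pml4e_address(0xABCD_E007, gpa)))
```

Each index is 9 bits wide, matching the 512 entries per table, and is shifted left by 3 because each entry is 8 bytes.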
4.4 5-Level EPT and EPTP Switching

The value of EPTP may be modified in VMX non-root operation by invoking VM function 0 (EPTP switching). This is done by executing the VMFUNC instruction with value 0 in the EAX register.

Invocation of VM function 0 loads EPTP with a value selected from a data structure in memory. Before loading EPTP in this way, the processor first confirms that the value to be loaded is valid. The definition of a valid EPTP value depends on whether the processor supports 5-level EPT.

• If the processor does not support 5-level EPT, an EPTP value in memory is considered valid if it would not cause VM entry to fail (e.g., it does not set any reserved bits).

• If the processor does support 5-level EPT, an EPTP value in memory is considered valid only if it would not cause VM entry to fail (as above) and if its value in bits 5:3 (which controls the number of EPT levels) is the same as that of the current value of EPTP.

The implication is that an invocation of VM function 0 cannot change the EPT mode between 4-level EPT and 5-level EPT.
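The validity rule for EPTP switching can be expressed as a predicate. In this sketch the full set of VM-entry checks on EPTP is not reproduced; it is represented by a caller-supplied function, `passes_vm_entry_checks`, which is a hypothetical stand-in. All other names are likewise illustrative.

```python
from typing import Callable

def eptp_switch_value_is_valid(
    candidate: int,
    current: int,
    supports_5_level_ept: bool,
    passes_vm_entry_checks: Callable[[int], bool],
) -> bool:
    """Model the validity test applied by VM function 0 to a new EPTP."""
    if not passes_vm_entry_checks(candidate):
        return False
    if supports_5_level_ept:
        # Bits 5:3 must match the current EPTP, so the EPT mode
        # (4-level or 5-level) cannot change.
        return ((candidate >> 3) & 0b111) == ((current >> 3) & 0b111)
    return True
```

On a processor without 5-level EPT, the extra comparison is unnecessary because the VM-entry checks already require EPTP[bits 5:3] to be 3.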
5 Intel® Virtualization Technology for Directed I/O

Intel® Virtualization Technology for Directed I/O includes a feature called DMA remapping.

DMA remapping provides hardware support for isolation of device accesses to memory. When a device attempts to access system memory, DMA-remapping hardware intercepts the access and utilizes paging structures to determine whether the access can be permitted; it also determines the actual location to access.

The DMA-remapping hardware may support two levels of address translation. One level may translate a linear address to a guest-physical address, while a second level may remap the guest-physical address to a physical address.

The first-level translation uses paging structures with the same format as those used for ordinary paging. The second-level translation uses paging structures with the same format as those used for EPT.

It is expected that, on platforms that support wider linear and guest-physical addresses (using 5-level paging and 5-level EPT, respectively), the DMA-remapping hardware will be similarly enhanced to support those wider addresses with 5-level translation processes.

This enhanced support for DMA remapping will be detailed in a future revision of the Intel® Virtualization Technology for Directed I/O Architecture Specification.
