5-Level Paging and 5-Level EPT
White Paper
Revision 1.0
December 2016

Document Number: 335252-001
Notice: This document contains information on products in the design phase of development. The information here is subject to change without notice. Do not finalize a design with this information.

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software, or service activation. Learn more at intel.com, or from the OEM or retailer.

No computer system can be absolutely secure. Intel does not assume any liability for lost or stolen data or systems or any damages resulting from such losses.

You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.

Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting www.intel.com/design/literature.htm.

Intel, the Intel logo, and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries.

* Other names and brands may be claimed as the property of others.

Copyright © 2016, Intel Corporation. All Rights Reserved.
Contents

1 Introduction
  1.1 Existing Paging in IA-32e Mode
  1.2 Linear-Address Width and VMX Transitions
  1.3 Existing Extended Page Tables (EPT)
2 Expanding Linear Addresses: 5-Level Paging
  2.1 5-Level Paging: Introduction
  2.2 Enumeration and Enabling
    2.2.1 Enumeration by CPUID
    2.2.2 Enabling by Software
  2.3 Linear-Address Generation and Canonicality
  2.4 5-Level Paging: Linear-Address Translation
  2.5 Linear-Address Registers and Canonicality
    2.5.1 Canonicality Checking on RIP Loads
    2.5.2 Canonicality Checking on Other Loads
  2.6 Interactions with TLB-Invalidation Instructions
  2.7 Interactions with Intel® MPX
  2.8 Interactions with Intel® SGX
3 Linear-Address Expansion and VMX Transitions
  3.1 Linear-Address Expansion and VM Entries
  3.2 Linear-Address Expansion and VM Exits
4 5-Level EPT
  4.1 4-Level EPT: Guest-Physical-Address Limit
  4.2 5-Level EPT: Enumeration and Enabling
    4.2.1 Enumeration
    4.2.2 Enabling by Software
  4.3 5-Level EPT: Guest-Physical-Address Translation
  4.4 5-Level EPT and EPTP Switching
5 Intel® Virtualization Technology for Directed I/O

Figures

1-1 Linear-Address Translation Using IA-32e Paging
2-1 Linear-Address Translation Using 5-Level Paging

Tables

2-1 Format of a PML5 Entry (PML5E) that References a PML4 Table
4-1 Format of an EPT PML5 Entry (EPT PML5E)
Revision History

Document Number   Revision Number   Description       Date
335252-001        1.0               Initial Release   December 2016
1 Introduction

This document describes planned extensions to the Intel 64 architecture to expand the size of addresses that can be translated through a processor’s memory-translation hardware.

Modern operating systems use address-translation support called paging. Paging translates linear addresses (also known as virtual addresses), which are used by software, to physical addresses, which are used to access memory (or memory-mapped I/O). Section 1.1 describes the 64-bit paging hardware on Intel 64 processors. Existing processors limit linear addresses to 48 bits. Chapter 2 describes paging extensions that would relax that limit to 57 linear-address bits.

Virtual-machine monitors (VMMs) use the virtual-machine extensions (VMX) to support guest software operating in a virtual machine. VMX transitions are control-flow transfers between the VMM and guest software. VMX transitions involve the loading and storing of various processor registers. Some of these registers are defined to contain linear addresses. Because of this, the operation of VMX transitions depends in part on the linear-address width supported by the processor. Section 1.2 describes the existing treatment of linear-address registers by VMX transitions, while Chapter 3 describes the changes required to support larger linear addresses.

VMMs may also use additional address-translation support called extended page tables (EPT). When EPT is used, paging produces guest-physical addresses, which EPT translates to physical addresses. Section 1.3 describes the EPT hardware on existing Intel 64 processors, which limit guest-physical addresses to 48 bits. Chapter 4 describes EPT extensions to support 57 guest-physical-address bits.
1.1 Existing Paging in IA-32e Mode

On processors supporting Intel 64 architecture, software typically references memory using linear addresses. Most modern operating systems configure processors to use paging, which translates linear addresses to physical addresses. The processor uses the resulting physical addresses to access memory.

IA-32e mode is a mode of processor execution that extends the older 32-bit operation, known as legacy mode. Software can enter IA-32e mode with the following algorithm.

1. Use the MOV CR instruction to set CR4.PAE[bit 5]. (Physical-address extension must be enabled to enter IA-32e mode.)
2. Use the WRMSR instruction to set bit 8 (LME) of the IA32_EFER MSR (index C0000080H).
3. Use the MOV CR instruction to load CR3 with the address of a PML4 table (see below).
4. Use the MOV CR instruction to set CR0.PG[bit 31].

A logical processor is in IA-32e mode whenever CR0.PG = 1 and IA32_EFER.LME = 1. This fact is reported in IA32_EFER.LMA[bit 10]. Software cannot set this bit directly; it is always the logical-AND of CR0.PG and IA32_EFER.LME.
In IA-32e mode, linear addresses are 64 bits in size.¹ However, the corresponding paging mode (currently called IA-32e paging) does not use all 64 linear-address bits, because processors limit the size of linear addresses. This limit is enumerated by the CPUID instruction. Specifically, CPUID.80000008H:EAX[bits 15:8] enumerates the number of linear-address bits (the maximum linear-address width) supported by the processor. Existing processors enumerate this value as 48.

Note: Processors also limit the size of physical addresses and enumerate the limit using CPUID. CPUID.80000008H:EAX[bits 7:0] enumerates the number of physical-address bits supported by the processor, the maximum physical-address width. Existing processors have enumerated values up to 46. Software can use more than 32 physical-address bits only if physical-address extension has been enabled by setting CR4.PAE, bit 5 of control register CR4.

The enumerated limitation on the linear-address width implies that paging translates only the low 48 bits of each 64-bit linear address. After a linear address is generated but before it is translated, the processor confirms that the address uses only the 48 bits that the processor supports.

The limitation to 48 linear-address bits results from the nature of IA-32e paging, which is illustrated in Figure 1-1.
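As an illustration of the enumeration described above, the following Python sketch splits a CPUID.80000008H:EAX value into the two widths. The function name and the sample EAX value are assumptions made for the example; obtaining a real value requires executing CPUID, which the sketch does not attempt.

```python
def decode_address_widths(eax):
    """Split a CPUID.80000008H:EAX value into (linear_bits, physical_bits)."""
    physical_bits = eax & 0xFF         # EAX[bits 7:0]: maximum physical-address width
    linear_bits = (eax >> 8) & 0xFF    # EAX[bits 15:8]: maximum linear-address width
    return linear_bits, physical_bits


# Hypothetical EAX value encoding 48 linear-address bits and 46 physical-address bits.
sample_eax = (48 << 8) | 46
linear_bits, physical_bits = decode_address_widths(sample_eax)
print(linear_bits, physical_bits)
```

A processor that supports 5-level paging would be expected to report 57 in the linear-address field, as Section 2.2.1 explains.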
Figure 1-1. Linear-Address Translation Using IA-32e Paging

[Figure: the linear address is divided into PML4 (bits 47:39), Directory Ptr (bits 38:30), Directory (bits 29:21), Table (bits 20:12), and Offset (bits 11:0) fields. CR3 locates the PML4 table; the selected PML4E, PDPTE, PDE, and PTE each supply the address of the next structure, ending at a 4-KByte page.]
1. IA-32e mode comprises two sub-modes: compatibility mode and 64-bit mode. In compatibility mode, software uses 32-bit addresses, which the processor zero-extends to 64-bit linear addresses. In 64-bit mode, software uses 64-bit addresses directly.
The processor performs IA-32e paging by traversing a 4-level hierarchy of paging structures whose root structure resides at the physical address in control register CR3. Each paging structure is 4 KBytes in size and comprises 512 8-byte entries. The processor uses the upper 36 bits of a linear address (bits 47:12), 9 bits at a time, to select paging-structure entries from the hierarchy.

Note: Figure 1-1 illustrates the translation of a linear address to a 4-KByte page. The paging process can be configured so that the translation of some linear addresses stops one or two levels earlier, translating instead to 2-MByte pages or 1-GByte pages.

In general, bits 51:12 of each paging-structure entry contain a 4-KByte aligned physical address. For each entry except the last, this address is that of the next paging structure; in the last entry, it is the physical address of a 4-KByte page frame. The final physical address is obtained by combining this page-frame address with the page offset, bits 11:0 of the original linear address.

Because only bits 47:0 of a linear address are used in address translation, the processor reserves bits 63:48 for future expansion using a concept known as canonicality. A linear address is canonical if bits 63:47 of the address are identical. (Put differently, a linear address is canonical only if bits 63:48 are a sign-extension of bit 47, which is the uppermost bit used in linear-address translation.)

When a 64-bit linear address is generated to access memory, the processor first confirms that the address is canonical. If the address is not canonical, the memory access causes a fault, and the processor makes no attempt to translate the address.¹

Intel 64 architecture includes numerous registers that are defined to hold linear addresses. These registers may be loaded using a variety of instructions. In most cases, these instructions cause a general-protection exception (#GP) if an attempt is made to load one of these registers with a value that is not canonical.

Physical-address bits in a paging-structure entry beyond the enumerated physical-address width are reserved. A page-fault exception (#PF) results if an attempt is made to access a linear address whose translation encounters a paging-structure entry that sets any of those bits.
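The way a linear address is carved into table indices can be sketched in a few lines of Python. This is an illustrative model only: the function names and the dictionary keys are assumptions, and the sketch covers translation to a 4-KByte page without modeling large pages, permissions, or faults.

```python
ENTRY_INDEX_MASK = 0x1FF   # 9 bits select one of 512 entries
PAGE_OFFSET_MASK = 0xFFF   # bits 11:0 of the linear address


def split_linear_address(la):
    """Return the four 9-bit table indices and the 12-bit page offset."""
    return {
        "pml4": (la >> 39) & ENTRY_INDEX_MASK,   # bits 47:39
        "pdpt": (la >> 30) & ENTRY_INDEX_MASK,   # bits 38:30
        "pd": (la >> 21) & ENTRY_INDEX_MASK,     # bits 29:21
        "pt": (la >> 12) & ENTRY_INDEX_MASK,     # bits 20:12
        "offset": la & PAGE_OFFSET_MASK,         # bits 11:0
    }


def final_physical_address(page_frame, la):
    """Combine a 4-KByte aligned page-frame address with the page offset."""
    return (page_frame & ~PAGE_OFFSET_MASK) | (la & PAGE_OFFSET_MASK)


fields = split_linear_address(0x00007FFFFFFFF123)
print(fields)
print(hex(final_physical_address(0x12345000, 0x00007FFFFFFFF123)))
```

Four 9-bit indices plus the 12-bit offset account for exactly 48 bits, which is why 4-level paging cannot translate a wider linear address.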
1.2 Linear-Address Width and VMX Transitions

VM entries and VM exits manipulate numerous processor registers that contain linear addresses. The transitions respect the processor’s linear-address width in a manner based on canonicality.

Certain fields in the VMCS correspond to registers that contain linear addresses. VM entries confirm that most of those fields contain values that are canonical. Some registers, such as RIP and the LDTR base address, receive special treatment.

VM exits save into the VMCS the state of certain registers, some of which contain linear addresses. Because the processor generally ensures that the values in these registers are canonical (see Section 1.1), the values that VM exits save for these registers will generally be canonical.
1. In general, an attempt to access memory using a linear address that is not canonical causes a general-protection exception (#GP). A stack-fault exception (#SS) occurs instead if the memory access was made using the SS segment.
VM exits also load from the VMCS certain registers, some of which contain linear addresses. Each VM exit ensures that the value of each of these registers is canonical. Specifically, bits 47:0 of the register are loaded from the field in the host-state area; the value of bit 47 is then sign-extended into bits 63:48 of the register.
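The sign-extension step can be modeled as follows. This is a minimal sketch, assuming a hypothetical helper name; the `width` parameter is an addition for illustration, with the default of 48 matching the existing behavior described above.

```python
MASK64 = (1 << 64) - 1


def load_host_linear_address(field_value, width=48):
    """Model a VM exit loading a linear-address register from a host-state field.

    The low `width` bits are taken from the field, and the top bit of that
    range is sign-extended into the remaining upper bits of the 64-bit register.
    """
    low_mask = (1 << width) - 1
    value = field_value & low_mask
    if (value >> (width - 1)) & 1:
        value |= MASK64 & ~low_mask
    return value


# Bit 47 set: bits 63:48 become all ones regardless of what the field held there.
print(hex(load_host_linear_address(0x0000800000000000)))
# Bit 47 clear: bits 63:48 become all zeros.
print(hex(load_host_linear_address(0x12347FFFFFFFFFFF)))
```

Whatever the field held in bits 63:48 is discarded, so the loaded register value is canonical by construction.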
1.3 Existing Extended Page Tables (EPT)

Most Intel 64 processors supporting VMX also support an additional layer of address translation called extended page tables (EPT).

VM entry can be configured to activate EPT for guest software. When EPT is active, the addresses used and produced by paging (Section 1.1) are not used as physical addresses to reference in memory. Instead, the processor interprets them as guest-physical addresses, and translates them to physical addresses in a manner determined by the VMM. (This translation from guest-physical to physical applies not only to the output of paging but also to the addresses that the processor uses to reference the guest paging structures.)

If the EPT translation process cannot translate a guest-physical address, it causes an EPT violation. (EPT violations may also occur when an access to a guest-physical address violates the permissions established by EPT for that guest-physical address.) An EPT violation is a VMX-specific exception, usually causing a VM exit.

As noted in Section 1.1, existing processors limit physical addresses to 46 bits. That limit applies also to guest-physical addresses. As a result, guest-physical addresses that set bits beyond this limit are not translated by EPT. (For example, a page fault results if linear-address translation encounters a paging-structure entry with such an address.) Because of this, existing EPT has been limited to translating only 48 guest-physical-address bits.

The existing EPT translation process is analogous to the paging process that was illustrated earlier in Figure 1-1. Like 4-level paging, the processor implements EPT by traversing a 4-level hierarchy of 4-KByte EPT paging structures. The last EPT paging-structure entry contains the upper bits of the final physical address, while the lowest bits come from the original guest-physical address.
2 Expanding Linear Addresses: 5-Level Paging

2.1 5-Level Paging: Introduction

5-level paging is a new paging mode that will be available in IA-32e mode. As its name suggests, it will translate linear addresses by traversing a 5-level hierarchy of paging structures. Because the process is otherwise unmodified, 5-level paging extends the processor’s linear-address width to 57 bits. (The additional 9 bits are used to select an entry from the fifth level of the hierarchy.) For clarity, the paging mode formerly called IA-32e paging will now be called 4-level paging.

The remainder of this chapter specifies the architectural changes that define and are entailed by 5-level paging. Section 2.2 specifies how the CPU enumerates the new feature and how it is enabled by software. Section 2.3 describes changes to the process of linear-address generation, as well as a revision to the concept of canonicality. Section 2.4 details how 5-level paging translates linear addresses. Section 2.5 clarifies how the processor treats loads of registers containing linear addresses, while Section 2.6 to Section 2.8 consider interactions with various other features. (Interactions with the virtual-machine extensions are specified in Chapter 3.)
2.2 Enumeration and Enabling

This section describes how processors enumerate to software support for 5-level paging and related features and also how software enables the processor to use that support.
2.2.1 Enumeration by CPUID

Processors supporting the Intel 64 architecture typically use the CPUID instruction to enumerate to software specific processor functionality. Those processors that support 5-level paging enumerate that fact through a new feature flag as well as through changes in how related features are reported:

• CPUID.(EAX=07H,ECX=0):ECX[bit 16] is a new feature flag that will enumerate basic support for 5-level paging. All older processors clear this bit. A processor will set this bit if and only if it supports 5-level paging.
• As noted in Section 1.1, CPUID.80000008H:EAX[bits 15:8] enumerates the maximum linear-address width supported by the processor. All older processors that support Intel 64 architecture enumerated this value as 48. Processors that support 5-level paging will instead enumerate this value as 57.
• As noted in Section 1.1, CPUID.80000008H:EAX[bits 7:0] enumerates the maximum physical-address width supported by the processor. Processors that support Intel 64 architecture have enumerated at most 46 for this value. Processors that support 5-level paging are expected to enumerate higher values, up to 52.
• CPUID.(EAX=07H,ECX=0H):ECX.MAWAU[bits 21:17] is an existing field that enumerates the user MPX address-width adjust (MAWAU). This value specifies the number of linear-address bits above 48 on which the BNDLDX and BNDSTX instructions operate in 64-bit mode when CPL = 3.
  Older processors that support Intel® MPX enumerated 0 for this value. Processors that support 5-level paging may enumerate either 0 or 9, depending on configuration by system software. See Section 2.7 for more details on how BNDLDX and BNDSTX use MAWAU and how system software determines its value.
• CPUID.(EAX=12H,ECX=0H):EDX[bits 15:8] is an existing field that enumerates information that specifies the maximum supported size of a 64-bit enclave. If the value enumerated is n, the maximum size is 2^n. Older processors that support Intel® SGX enumerated at most 47 for this value. Processors that support 5-level paging are expected to enumerate this value as 56.

2.2.2 Enabling by Software

Section 1.1 identified an algorithm by which software can enter IA-32e mode. On processors that do not support 5-level paging, this algorithm enables 4-level paging. On processors that support 5-level paging, it can be adapted to enable 5-level paging instead.

Processors that support 5-level paging allow software to set a new enabling bit, CR4.LA57[bit 12].¹ A logical processor in IA-32e mode (IA32_EFER.LMA = 1) uses 5-level paging if CR4.LA57 = 1. Outside of IA-32e mode (IA32_EFER.LMA = 0), the value of CR4.LA57 does not affect paging operation.

The following items detail how a logical processor determines the current paging mode.
• If CR0.PG = 0, paging is disabled.
• If IA32_EFER.LMA = 0, one of the legacy 32-bit paging modes is used (depending on the value of legacy paging-mode bits in CR4).²
• If IA32_EFER.LMA = 1 and CR4.LA57 = 0, 4-level paging is used.
• If IA32_EFER.LMA = 1 and CR4.LA57 = 1, 5-level paging is used.
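The decision list above maps directly onto a small function. The sketch below is illustrative: the function name and the returned strings are assumptions, while the bit positions are those given in this document. The legacy case is not subdivided further.

```python
CR0_PG = 1 << 31      # CR0.PG[bit 31]
CR4_LA57 = 1 << 12    # CR4.LA57[bit 12]
EFER_LMA = 1 << 10    # IA32_EFER.LMA[bit 10]


def paging_mode(cr0, cr4, efer):
    """Classify the current paging mode from CR0, CR4, and IA32_EFER values."""
    if not cr0 & CR0_PG:
        return "disabled"
    if not efer & EFER_LMA:
        return "legacy"           # one of the legacy 32-bit paging modes
    if cr4 & CR4_LA57:
        return "5-level"
    return "4-level"


print(paging_mode(CR0_PG, CR4_LA57, EFER_LMA))
print(paging_mode(CR0_PG, 0, EFER_LMA))
```

Note that CR4.LA57 is consulted only after IA32_EFER.LMA, which reflects the statement that the bit does not affect paging operation outside IA-32e mode.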
Software can thus use the following algorithm to enter IA-32e mode with 5-level paging.

1. Use the MOV CR instruction to set CR4.PAE and CR4.LA57.
2. Use the WRMSR instruction to set IA32_EFER.LME.
3. Use the MOV CR instruction to load CR3 with the address of a PML5 table (see Section 2.4).
4. Use the MOV CR instruction to set CR0.PG.

The processor allows software to modify CR4.LA57 only outside of IA-32e mode. In IA-32e mode, an attempt to modify CR4.LA57 using the MOV CR instruction causes a general-protection exception (#GP).
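The ordering constraints in this algorithm can be exercised with a toy register model. The sketch below is not a processor emulator: the class, its method names, and the exception type are hypothetical, and it models only the two rules stated in this section (CR4.LA57 can be set only when CPUID enumerates support, and cannot be modified in IA-32e mode). All other architectural checks are omitted.

```python
CR0_PG = 1 << 31             # CR0.PG[bit 31]
CR4_PAE = 1 << 5             # CR4.PAE[bit 5]
CR4_LA57 = 1 << 12           # CR4.LA57[bit 12]
EFER_LME = 1 << 8            # IA32_EFER.LME[bit 8]
CPUID_7_ECX_LA57 = 1 << 16   # CPUID.(EAX=07H,ECX=0):ECX[bit 16]


class GeneralProtectionFault(Exception):
    """Stands in for #GP in this model."""


class LogicalProcessor:
    """Hypothetical model holding only the registers used by the algorithm."""

    def __init__(self, cpuid_leaf7_ecx):
        self.la57_supported = bool(cpuid_leaf7_ecx & CPUID_7_ECX_LA57)
        self.cr0 = self.cr3 = self.cr4 = self.efer = 0

    @property
    def lma(self):
        # IA32_EFER.LMA is the logical-AND of CR0.PG and IA32_EFER.LME.
        return bool(self.cr0 & CR0_PG) and bool(self.efer & EFER_LME)

    def write_cr4(self, value):
        if (value & CR4_LA57) and not self.la57_supported:
            raise GeneralProtectionFault("CR4.LA57 is not supported")
        if self.lma and ((value ^ self.cr4) & CR4_LA57):
            raise GeneralProtectionFault("CR4.LA57 cannot change in IA-32e mode")
        self.cr4 = value

    def write_efer(self, value):
        self.efer = value

    def write_cr3(self, value):
        self.cr3 = value

    def write_cr0(self, value):
        self.cr0 = value


def enter_ia32e_with_5level_paging(cpu, pml5_table_address):
    cpu.write_cr4(cpu.cr4 | CR4_PAE | CR4_LA57)   # step 1
    cpu.write_efer(cpu.efer | EFER_LME)           # step 2
    cpu.write_cr3(pml5_table_address)             # step 3
    cpu.write_cr0(cpu.cr0 | CR0_PG)               # step 4
```

In this model, calling `enter_ia32e_with_5level_paging` on a processor created with `LogicalProcessor(CPUID_7_ECX_LA57)` leaves `lma` true with CR4.LA57 set, and a later `write_cr4` that clears the bit raises the stand-in #GP. Setting CR4.LA57 in step 1, before paging is enabled, is what keeps the sequence within the rule that the bit changes only outside IA-32e mode.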
2.3 Linear-Address Generation and Canonicality

As noted in Section 1.1, processors with a linear-address width of 48 bits reserve linear-address bits 63:48 for future expansion. Linear addresses that use only bits 47:0 (because bits 63:48 are a sign-extension of bit 47) are called canonical.
1. Software can set CR4.LA57 only if CPUID.(EAX=07H,ECX=0):ECX[bit 16] is enumerated as 1.
2. Recall that IA32_EFER.LMA is the logical-AND of CR0.PG and IA32_EFER.LME.
When a 64-bit linear address is generated to access memory, the processor first confirms that the address is canonical. If the address is not canonical, the memory access causes a fault, and the address is not translated.

Processors that support 5-level paging can translate 57-bit linear addresses when 5-level paging is enabled. But if software has enabled only 4-level paging, such a processor can translate only 48-bit linear addresses. This fact motivates the definition of two levels of canonicality.

A linear address is 48-bit canonical if bits 63:47 of the address are identical. Similarly, an address is 57-bit canonical if bits 63:56 of the address are identical. Any linear address that is 48-bit canonical is also 57-bit canonical.

When a 64-bit linear address is generated to access memory, a processor that supports 5-level paging checks for canonicality based on the current paging mode: if 4-level paging is enabled, the address must be 48-bit canonical; if 5-level paging is enabled, the address need only be 57-bit canonical. If the appropriate canonicality is not observed, the memory access causes a fault.
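Both definitions reduce to one test parameterized by width. The following Python sketch uses assumed function names; it checks the bit pattern only and does not model the resulting fault.

```python
def is_canonical(address, width):
    """True if bits 63:(width-1) of a 64-bit address are identical.

    width=48 tests bits 63:47; width=57 tests bits 63:56.
    """
    upper = (address & ((1 << 64) - 1)) >> (width - 1)
    all_ones = (1 << (64 - (width - 1))) - 1
    return upper == 0 or upper == all_ones


def access_is_canonical(address, five_level_paging_enabled):
    """Apply the width that matches the current paging mode."""
    return is_canonical(address, 57 if five_level_paging_enabled else 48)


address = 0x0000800000000000  # bit 47 set, bits 63:48 clear
print(is_canonical(address, 48), is_canonical(address, 57))
```

The sample address fails the 48-bit test but passes the 57-bit test, so the same access would fault under 4-level paging and be translated under 5-level paging.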
2.4 5-Level Paging: Linear-Address Translation

As noted in Section 2.2.2, a logical processor uses 5-level paging if IA32_EFER.LMA = 1 and CR4.LA57 = 1.

Like 4-level paging, 5-level paging translates linear addresses using a hierarchy of in-memory paging structures. Because 5-level paging increases the linear-address width to 57 bits (from the 48 bits supported by 4-level paging), 5-level paging allows up to 128 PBytes of linear-address space to be accessed at any given time.

Also like 4-level paging, 5-level paging uses CR3 to locate the first paging structure in the hierarchy. (CR3 has the same mode-specific format with 5-level paging as it does with 4-level paging.) The following items describe in more detail the changes that 5-level paging makes to the translation process.

• Translation begins by identifying a 4-KByte naturally aligned PML5 table. It is located at the physical address specified in bits 51:12 of CR3. A PML5 table comprises 512 64-bit entries (PML5Es). A PML5E is selected using the physical address defined as follows.
  - Bits 51:12 are from CR3.
  - Bits 11:3 are bits 56:48 of the linear address.
  - Bits 2:0 are all 0.
  Because a PML5E is identified using bits 56:48 of the linear address, it controls access to a 256-TByte region of the linear-address space. The format of a PML5E is given in Table 2-1.
Table 2-1. Format of a PML5 Entry (PML5E) that References a PML4 Table

Bit Position(s)   Contents
0 (P)             Present; must be 1 to reference a PML4 table.
1 (R/W)           Read/write; if 0, writes may not be allowed to the 256-TByte region controlled by this entry.
2 (U/S)           User/supervisor; if 0, user-mode accesses are not allowed to the 256-TByte region controlled by this entry.
3 (PWT)           Page-level write-through; indirectly determines the memory type used to access the PML4 table referenced by this entry.
4 (PCD)           Page-level cache disable; indirectly determines the memory type used to access the PML4 table referenced by this entry.
5 (A)             Accessed; indicates whether this entry has been used for linear-address translation.
6                 Ignored.
7 (PS)            Reserved (must be 0).
11:8              Ignored.
M-1:12            Physical address of 4-KByte aligned PML4 table referenced by this entry.
51:M              Reserved (must be 0).
62:52             Ignored.
63                If IA32_EFER.NXE = 1, execute-disable (if 1, instruction fetches are not allowed from the 256-TByte region controlled by this entry); otherwise, reserved (must be 0).
• The next step of the translation process identifies a 4-KByte naturally aligned PML4 table. It is located at the physical address specified in bits 51:12 of the PML5E (see Table 2-1). A PML4 table comprises 512 64-bit entries (PML4Es). A PML4E is selected using the physical address defined as follows.
  - Bits 51:12 are from the PML5E.
  - Bits 11:3 are bits 47:39 of the linear address.
  - Bits 2:0 are all 0.
  As is normally the case when accessing a paging-structure entry, the memory type used to access the PML4E is based in part on the PCD and PWT bits in the PML5E. Because a PML4E is identified using bits 56:39 of the linear address, it controls access to a 512-GByte region of the linear-address space.

Once the PML4E is identified, bits 38:0 of the linear address determine the remainder of the translation process exactly as is done for 4-level paging. As suggested in Table 2-1, the values of bit 1, bit 2, and bit 63 of the PML5E are used normally (in combination with the corresponding bits in other paging-structure entries) to determine access rights. The accessed flag (bit 5) in the PML5E is updated as is done for other paging-structure entries.

The operation of 5-level paging is illustrated in Figure 2-1.
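The first two steps of the walk, together with the PML5E fields of Table 2-1, can be sketched as follows. The function names and the returned dictionary are assumptions for illustration. The sketch also takes M in Table 2-1 to be the enumerated maximum physical-address width, consistent with the statement in Section 1.1 that physical-address bits beyond that width are reserved. It computes entry addresses and decodes bits only; it does not read memory or raise faults.

```python
ADDRESS_BITS_51_12 = ((1 << 52) - 1) & ~0xFFF


def pml5e_address(cr3, la):
    """Physical address of the PML5E: CR3[51:12] | LA[56:48] << 3."""
    index = (la >> 48) & 0x1FF
    return (cr3 & ADDRESS_BITS_51_12) | (index << 3)


def pml4e_address(pml5e, la):
    """Physical address of the PML4E: PML5E[51:12] | LA[47:39] << 3."""
    index = (la >> 39) & 0x1FF
    return (pml5e & ADDRESS_BITS_51_12) | (index << 3)


def decode_pml5e(entry, max_phys_bits, nxe=True):
    """Decode the Table 2-1 fields; max_phys_bits plays the role of M."""
    reserved = 1 << 7                                            # bit 7 (PS)
    reserved |= ((1 << 52) - 1) & ~((1 << max_phys_bits) - 1)    # bits 51:M
    if not nxe:
        reserved |= 1 << 63                                      # bit 63 when NXE = 0
    return {
        "present": bool(entry & (1 << 0)),
        "writable": bool(entry & (1 << 1)),
        "user": bool(entry & (1 << 2)),
        "accessed": bool(entry & (1 << 5)),
        "execute_disable": nxe and bool(entry & (1 << 63)),
        "reserved_bits_clear": (entry & reserved) == 0,
    }


# Hypothetical values: PML5 table at 0x1000, PML4 table at 0x2000.
la = (1 << 48) | (3 << 39) | 0x123
pml5e = 0x2000 | 0b111           # P, R/W, and U/S set
print(hex(pml5e_address(0x1000, la)), hex(pml4e_address(pml5e, la)))
print(decode_pml5e(pml5e, max_phys_bits=46))
```

Because each entry is 8 bytes, the 9-bit index is shifted left by 3, which is why bits 2:0 of the entry address are always 0.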
2.5 Linear-Address Registers and Canonicality

Intel 64 architecture includes numerous registers that are defined to hold linear addresses. These registers may be loaded using a variety of instructions. As noted in Section 1.1, each of these instructions typically causes a general-protection exception (#GP) if an attempt is made to load a linear-address register with a value that is not canonical.

As noted in Section 2.3, processors that support 5-level paging use two definitions of canonicality: 48-bit canonicality and 57-bit canonicality. This section describes how such a processor checks the canonicality of the values being loaded into the linear-address registers. One approach is used for operations that load RIP (the instruction pointer; see Section 2.5.1) and another is used for those that load other registers (see Section 2.5.2).
56 PML5
Linear Address 39 38 47 30 29 PML4 Directory Ptr Directory 9
9
21 20
9
9
PDE PDPTE
PML4E
40 Page-Directory 40 Pointer Table
12 11
Table
Offset
0
12 4-KByte Page Physical Addr
40
Page Directory
40
9 PTE 40 PML5E
Page Table
40 CR3
Figur e 2 - 1 .
2 .5 .1
Lin e a r - Addr e ss Tr a n sla t ion Usin g 5 - Le ve l Pa gin g
Ca n on ica lit y Ch e ck in g on RI P Loa ds The RI P regist er cont ains t he offset of t he current inst ruct ion point er wit hin t he CS segm ent . Because t he processor t reat s t he CS base address as zero in 64- bit m ode, t he value of t he RI P regist er in t hat m ode is t he linear address of t he inst ruct ion point er. Operat ions t hat load RI P ( including bot h inst ruct ions such as JMP as well as cont rol t ransfers t hrough t he I DT) check first whet her t he value t o be loaded is canonical relat ive t o t he current paging m ode. I f t he processor det erm ines t hat t he address is not canonical, t he RI P load is not perform ed and a general- prot ect ion except ion ( # GP) occurs.
Note: An instruction that would load RIP with a non-canonical address faults, meaning that the return instruction pointer of the fault handler is the address of the faulting instruction and not the non-canonical address whose load was attempted.

The canonicality checking performed by these operations uses 48-bit canonicality when 4-level paging is active. When 5-level paging is active, the checking is relaxed to require only 57-bit canonicality.

The SYSCALL and SYSENTER instructions load RIP from the IA32_LSTAR and IA32_SYSENTER_EIP MSRs, respectively. On processors that support only 4-level paging, these instructions do not check that the values being loaded are canonical because the WRMSR instruction ensures that each of these MSRs contains a value that is 48-bit canonical. On processors that support 5-level paging, the checking by WRMSR is relaxed to 57-bit canonicality (see Section 2.5.2). On such processors, an execution of SYSCALL or SYSENTER with 4-level paging checks that the value being loaded into RIP is 48-bit canonical.¹

The normal advancing of the instruction pointer to the next instruction boundary may result in the RIP register holding a non-canonical address. The fetch of the next instruction from that non-canonical address will result in a general-protection exception as indicated in Section 2.3. In this case, the return instruction pointer of the fault handler will be that non-canonical address.
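The mode-dependent check on RIP loads can be illustrated with a short sketch. It assumes the usual definition of canonicality (an address is N-bit canonical if bits 63 through N-1 are identical). The function names `is_canonical` and `rip_load_faults` are hypothetical and do not correspond to any processor interface.

```python
def is_canonical(addr: int, width: int) -> bool:
    """True if the 64-bit value addr is width-bit canonical.

    Assumed definition: bits 63 down to width-1 are all 0 or all 1.
    """
    upper = (addr & (2**64 - 1)) >> (width - 1)   # bits 63:(width-1)
    all_ones = (1 << (64 - (width - 1))) - 1
    return upper in (0, all_ones)


def rip_load_faults(target: int, five_level_paging_active: bool) -> bool:
    """Model of the check on RIP loads: a #GP occurs if target is not
    canonical relative to the current paging mode (48-bit or 57-bit)."""
    width = 57 if five_level_paging_active else 48
    return not is_canonical(target, width)


# Bit 47 set, bits 63:48 clear: 57-bit canonical but not 48-bit canonical.
addr = 0x0000_8000_0000_0000
print(rip_load_faults(addr, five_level_paging_active=False))  # True
print(rip_load_faults(addr, five_level_paging_active=True))   # False
```

The same target address therefore faults or not depending only on which paging mode is active when the load is attempted.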
2.5.2 Canonicality Checking on Other Loads

In addition to RIP, the CPU maintains numerous other registers that hold linear addresses:

• GDTR and IDTR (in their base-address portions).
• LDTR, TR, FS, and GS (in the base-address portions of their hidden descriptor caches).
• The debug-address registers (DR0 through DR3), which hold the linear addresses of breakpoints.
• The following MSRs: IA32_BNDCFGS, IA32_DS_AREA, IA32_KERNEL_GS_BASE, IA32_LSTAR, IA32_RTIT_ADDR0_A, IA32_RTIT_ADDR0_B, IA32_RTIT_ADDR1_A, IA32_RTIT_ADDR1_B, IA32_RTIT_ADDR2_A, IA32_RTIT_ADDR2_B, IA32_RTIT_ADDR3_A, IA32_RTIT_ADDR3_B, IA32_SYSENTER_EIP, and IA32_SYSENTER_ESP.
• The x87 FPU instruction pointer (FIP).
• The user-mode configuration register BNDCFGU, used by Intel® MPX.
With a few exceptions, the processor ensures that the addresses in these registers are always canonical in the following ways.

• Some instructions fault on attempts to load a linear-address register with a non-canonical address:
  - An execution of the LGDT or LIDT instruction causes a general-protection exception (#GP) if the base address specified in the instruction’s memory operand is not canonical.
  - An execution of the LLDT or LTR instruction causes a #GP if the base address to be loaded from the GDT is not canonical.
  - An execution of WRMSR, WRFSBASE, or WRGSBASE causes a #GP if it would load the base address of either FS or GS with a non-canonical address.
  - An execution of WRMSR causes a #GP if it would load any of the following MSRs with a non-canonical address: IA32_BNDCFGS, IA32_DS_AREA, IA32_FS_BASE, IA32_GS_BASE, IA32_KERNEL_GS_BASE, IA32_LSTAR, IA32_RTIT_ADDR0_A, IA32_RTIT_ADDR0_B, IA32_RTIT_ADDR1_A, IA32_RTIT_ADDR1_B, IA32_RTIT_ADDR2_A, IA32_RTIT_ADDR2_B, IA32_RTIT_ADDR3_A, IA32_RTIT_ADDR3_B, IA32_SYSENTER_EIP, or IA32_SYSENTER_ESP.²
  - An execution of XRSTORS causes a #GP if it would load any of the following MSRs with a non-canonical address: IA32_RTIT_ADDR0_A, IA32_RTIT_ADDR0_B, IA32_RTIT_ADDR1_A, IA32_RTIT_ADDR1_B, IA32_RTIT_ADDR2_A, IA32_RTIT_ADDR2_B, IA32_RTIT_ADDR3_A, and IA32_RTIT_ADDR3_B.
  This enforcement always uses the enumerated maximum linear-address width and is independent of the current paging mode. Thus, a processor that supports 5-level paging will allow the instructions mentioned above to load these registers with addresses that are 57-bit canonical but not 48-bit canonical, even if 4-level paging is active. (As a result, instructions that store these values, namely SGDT, SIDT, SLDT, STR, RDFSBASE, RDGSBASE, RDMSR, XSAVE, XSAVEC, XSAVEOPT, and XSAVES, may save addresses that are 57-bit canonical but not 48-bit canonical, even if 4-level paging is active.)
• The FXRSTOR, XRSTOR, and XRSTORS instructions ignore attempts to load some of these registers with non-canonical addresses:
  - Loads of FIP ignore any bits in the memory image beyond the enumerated maximum linear-address width. The processor sign-extends the most significant bit (e.g., bit 56 on processors that support 5-level paging) to ensure that FIP is always canonical.
  - Loads of BNDCFGU (by XRSTOR or XRSTORS) ignore any bits in the memory image beyond the enumerated maximum linear-address width. The processor sign-extends the most significant bit (e.g., bit 56 on processors that support 5-level paging) to ensure that BNDCFGU is always canonical.
• Every non-control x87 instruction loads FIP. The value loaded is always canonical relative to the current paging mode: 48-bit canonical if 4-level paging is active, and 57-bit canonical if 5-level paging is active.

DR0 through DR3 can be loaded with the MOV to DR instruction. The instruction allows those registers to be loaded with non-canonical addresses. The MOV from DR instruction will return the value last loaded with the MOV to DR instruction, even if the address is not canonical. Breakpoint address matching is supported only for canonical linear addresses.

1. The SYSRET and SYSEXIT instructions, which complement SYSCALL and SYSENTER, load RIP from RCX and RDX, respectively. Even before 5-level paging, these instructions checked the canonicality of the value to be loaded into RIP. As with other instructions that load RIP, this checking will be based on the current paging mode.
2. Such canonicality checking may apply also when the WRMSR instruction is used to load some non-architectural MSRs (not listed here) that hold a linear address.
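The two behaviors described in this section (faulting on a non-canonical value, and ignoring upper bits by sign extension) can be sketched together. This is a minimal model under stated assumptions: all function names are hypothetical, and the single flag `supports_5_level_paging` stands in for the enumerated maximum linear-address width (57 or 48 bits). Note that the paging mode does not appear as an input.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value: int, width: int) -> int:
    """Keep bits width-1:0 of value and copy bit width-1 into bits 63:width."""
    low = value & ((1 << width) - 1)
    if low >> (width - 1):                      # most significant kept bit is 1
        low |= MASK64 ^ ((1 << width) - 1)      # set bits 63:width
    return low


def enumerated_width(supports_5_level_paging: bool) -> int:
    """Maximum linear-address width; independent of the active paging mode."""
    return 57 if supports_5_level_paging else 48


def wrmsr_faults(value: int, supports_5_level_paging: bool) -> bool:
    """Model of WRMSR to a linear-address MSR: #GP unless value is canonical
    for the enumerated width (sign extension would leave it unchanged)."""
    width = enumerated_width(supports_5_level_paging)
    return sign_extend(value, width) != value


def load_fip(image_value: int, supports_5_level_paging: bool) -> int:
    """Model of an FIP load by FXRSTOR/XRSTOR/XRSTORS: bits beyond the
    enumerated width are ignored and replaced by sign extension."""
    return sign_extend(image_value, enumerated_width(supports_5_level_paging))
```

For example, `wrmsr_faults(0x0000_8000_0000_0000, True)` is false in this model, reflecting that a processor supporting 5-level paging accepts a value that is 57-bit canonical but not 48-bit canonical.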
2.6 Interactions with TLB-Invalidation Instructions

Intel 64 architecture includes three instructions that may invalidate TLB entries for the linear address of an instruction operand: INVLPG, INVPCID, and INVVPID. The following items describe how they are affected by linear-address width.

• The INVLPG instruction takes a memory operand. It invalidates any TLB entries that the logical processor is caching for the linear address of that operand for the current linear address space. The instruction does not fault if that address is not canonical relative to the current paging mode (e.g., is not 48-bit canonical when 4-level paging is active). However, no invalidation is performed because the processor does not cache TLB entries for addresses that are not canonical relative to the current paging mode.
• The INVPCID instruction takes a register operand (INVPCID type) and a memory operand (INVPCID descriptor). If the INVPCID type is 0, the instruction invalidates any TLB entries that the logical processor is caching for the linear address and PCID specified in the INVPCID descriptor. If the linear address is not canonical relative to the linear-address width supported by the processor, the instruction causes a general-protection exception (#GP). If the processor supports 5-level paging, the instruction will not cause such a #GP for an address that is 57-bit canonical, regardless of paging mode, even if 4-level paging is active and the address is not 48-bit canonical.
• The INVVPID instruction takes a register operand (INVVPID type) and a memory operand (INVVPID descriptor). If the INVVPID type is 0, the instruction invalidates any TLB entries that the logical processor is caching for the linear address and VPID specified in the INVVPID descriptor. If the linear address is not canonical relative to the linear-address width supported by the processor, the instruction fails.¹ If the processor supports 5-level paging, the instruction will not fail for an address that is 57-bit canonical, regardless of paging mode, even if 4-level paging is active and the address is not 48-bit canonical.

2.7 Interactions with Intel® MPX

The Intel® Memory Protection Extensions (Intel® MPX) define a set of 4 bound registers, each of which software can associate with a specific pointer in memory. Intel MPX includes two instructions, BNDLDX and BNDSTX, that allow software to load from or store into memory the bounds associated with a particular pointer in memory.

The BNDLDX and BNDSTX instructions each take a bound register and a memory operand (the associated pointer). Each of these parses the linear address of the memory operand to traverse a hierarchical data structure in memory. In 64-bit mode, these instructions do not necessarily use all the bits in the supplied 64-bit addresses. The number of bits used is 48 plus a value called the MPX address-width adjust (MAWA).

The value of MAWA depends on CPL; the current paging mode (4-level paging or 5-level paging); and, if 5-level paging is active, the value of a new MSR. Processors that support both Intel MPX and 5-level paging support the IA32_MPX_LAX MSR (MSR index 1000H). Only bit 0 of the MSR is defined.

If CPL < 3, the supervisor MAWA (MAWAS) is used. The value of MAWAS is determined by the setting of CR4.LA57. If CR4.LA57 = 0 (4-level paging is active; recall that MAWA is relevant only in 64-bit mode), the value of MAWAS is 0. If CR4.LA57 = 1 (5-level paging is active), the value of MAWAS is 9. The value of MAWAS is not enumerated by the CPUID instruction.

If CPL = 3, the user MAWA (MAWAU) is used. The value of MAWAU is determined as follows. If CR4.LA57 = 0 or IA32_MPX_LAX[bit 0] = 0, the value of MAWAU is 0. If CR4.LA57 = 1 and IA32_MPX_LAX[bit 0] = 1, the value of MAWAU is 9. The current value of MAWAU is enumerated in CPUID.(EAX=07H,ECX=0H):ECX.MAWAU[bits 21:17].
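The selection of MAWA can be summarized as a small decision function. The sketch below is illustrative only; the names `mawa` and `address_bits_used` are hypothetical, and the three inputs model CPL, CR4.LA57, and IA32_MPX_LAX[bit 0] as plain values.

```python
def mawa(cpl: int, cr4_la57: bool, mpx_lax_bit0: bool) -> int:
    """Return the MPX address-width adjust used in 64-bit mode."""
    if cpl < 3:
        # Supervisor value (MAWAS): depends only on CR4.LA57.
        return 9 if cr4_la57 else 0
    # User value (MAWAU): requires both CR4.LA57 and IA32_MPX_LAX[bit 0].
    return 9 if (cr4_la57 and mpx_lax_bit0) else 0


def address_bits_used(cpl: int, cr4_la57: bool, mpx_lax_bit0: bool) -> int:
    """Number of linear-address bits used by BNDLDX/BNDSTX: 48 plus MAWA."""
    return 48 + mawa(cpl, cr4_la57, mpx_lax_bit0)
```

With 5-level paging active, supervisor code always uses 57 bits in this model, while user code uses 57 bits only if software has also set bit 0 of IA32_MPX_LAX.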
The following items specify how an execution of the BNDLDX and BNDSTX instructions in 64-bit mode parses a linear address to traverse a hierarchical data structure.

• A bound directory is located at the 4-KByte aligned linear address specified in bits 63:12 of BNDCFGx.² A BDE is selected using the LAp (linear address of pointer to a buffer) to construct a 64-bit offset as follows:
  - bits 63:31+MAWA are 0;
  - bits 30+MAWA:3 are LAp[bits 47+MAWA:20]; and
  - bits 2:0 are 0.
  The address of the BDE is the sum of the bound-directory base address (from BNDCFGx) plus this 64-bit offset. If either BNDLDX or BNDSTX is executed inside an enclave, the instruction operates as if MAWAU = 0 (regardless of the values of CR4.LA57 and IA32_MPX_LAX[bit 0]).
• The processor uses bits 63:3 of the BDE as the 8-byte aligned address of a bound table (BT). A BTE is selected using the LAp (linear address of pointer to a buffer) to construct a 64-bit offset as follows:
  - bits 63:22 are 0;
  - bits 21:5 are LAp[bits 19:3]; and
  - bits 4:0 are 0.
  The address of the BTE is the sum of the bound-table base address (from the BDE) plus this 64-bit offset.

1. INVVPID is a VMX instruction. In response to certain conditions, execution of a VMX instruction may fail, meaning that it does not complete its normal operation. When a VMX instruction fails, control passes to the next instruction (rather than to a fault handler) and a flag is set to report the failure.
2. If CPL < 3, BNDCFGS is used; if CPL = 3, BNDCFGU is used.

A bound directory comprises 2^(28+MAWA) 64-bit entries (BDEs);¹ thus, the size of a bound directory in 64-bit mode is 2^(1+MAWA) GBytes. A bound table comprises 2^17 32-byte entries (BTEs); thus, the size of a bound table in 64-bit mode is 4 MBytes (independent of MAWA).
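The BDE and BTE address computations, and the resulting structure sizes, can be checked with a few lines of arithmetic. The sketch below uses hypothetical helper names. Truncating the sums to 64 bits is an assumption, since the text only says "sum". A caller modeling execution inside an enclave would pass `mawa=0`.

```python
MASK64 = (1 << 64) - 1


def bits(value: int, hi: int, lo: int) -> int:
    """Extract value[hi:lo] as an unsigned integer."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def bde_address(bndcfg: int, lap: int, mawa: int) -> int:
    base = bndcfg & ~0xFFF                      # bits 63:12 of BNDCFGx
    offset = bits(lap, 47 + mawa, 20) << 3      # placed in bits 30+MAWA:3
    return (base + offset) & MASK64             # 64-bit wrap is an assumption


def bte_address(bde: int, lap: int) -> int:
    base = bde & ~0x7                           # bits 63:3 of the BDE
    offset = bits(lap, 19, 3) << 5              # placed in bits 21:5
    return (base + offset) & MASK64             # 64-bit wrap is an assumption


def bound_directory_bytes(mawa: int) -> int:
    return (1 << (28 + mawa)) * 8               # 2^(28+MAWA) 8-byte BDEs


BOUND_TABLE_BYTES = (1 << 17) * 32              # 2^17 32-byte BTEs
```

With MAWA = 0 the directory size evaluates to 2 GBytes, and with MAWA = 9 it evaluates to 1 TByte (2^10 GBytes), matching the 2^(1+MAWA) GBytes formula.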
2.8 Interactions with Intel® SGX

Intel® Software Guard Extensions (Intel® SGX) define new processor functionality that is implemented as SGX leaf functions within the ENCLS (supervisor) and ENCLU (user) instructions.

The SGX leaf functions perform memory accesses using linear addresses in the normal way. When executed in 64-bit mode, the linear addresses are 64 bits in width and are subject to the normal treatment of accesses to memory with 64-bit linear addresses (see Section 2.3). In addition, some of the leaf functions apply specific architectural checks related to linear-address width. The following items detail these checks and how they are defined for processors that support 5-level paging.

• The ECREATE leaf function of ENCLS creates a new enclave by creating a new SGX enclave control structure (SECS). For a 64-bit enclave, the processor checks whether the enclave base linear address (specified in the SECS) is canonical, generating a general-protection exception (#GP) if it is not. On processors that support 5-level paging, this check is for 57-bit canonicality, regardless of the current paging mode.
  In addition to checking the canonicality of the enclave base linear address, ECREATE confirms that the enclave size (specified in the SECS) is not greater than the maximum size supported by the processor (if the enclave size is too large, ECREATE generates a #GP). As noted in Section 2.2.1, older processors supported 64-bit enclaves with sizes up to 2^47 bytes; processors that support 5-level paging are expected to support enclaves with sizes up to 2^56 bytes.
  If bits 4:3 of the enclave’s XSAVE feature request mask (XFRM) are set (indicating that Intel MPX will be enabled during execution of the enclave), ECREATE generates a #GP if the enclave’s size is greater than 2^48 bytes, even if the processor enumerates support for larger enclaves.
• The EENTER and ERESUME leaf functions of ENCLU transfer control flow to an entry point within a specified enclave. For entry to a 64-bit enclave, the processor checks whether certain linear addresses are canonical, generating a general-protection exception (#GP) if any one is not. The following items detail these checks.
  - The linear address of the specified entry point must be canonical. If 4-level paging is active, it must be 48-bit canonical; if 5-level paging is active, it must be 57-bit canonical.
  - The linear address of the asynchronous exit point (AEP, the address to which the processor transfers control on an asynchronous enclave exit) must be canonical. If 4-level paging is active, it must be 48-bit canonical; if 5-level paging is active, it must be 57-bit canonical.
  - The enclave values for the base addresses of the FS and GS segments must be canonical. On processors that support 5-level paging, these checks are for 57-bit canonicality, regardless of the current paging mode.
• The EEXIT leaf function exits the currently executing enclave and branches to a specified address. For an exit from a 64-bit enclave, the processor checks whether that target linear address is canonical, generating a general-protection exception (#GP) if it is not. If 4-level paging is active, it must be 48-bit canonical; if 5-level paging is active, it need only be 57-bit canonical.

1. A bound directory used in a 64-bit enclave always comprises 2^28 64-bit BDEs and thus has a size of 2 GBytes.
As noted in Section 2.7, executions of BNDLDX and BNDSTX in a 64-bit enclave always operate as if MAWAU = 0.
3 Linear-Address Expansion and VMX Transitions

As noted in Section 1.2, VM entries and VM exits manipulate numerous processor registers that contain linear addresses. The transitions respect the processor’s linear-address width in a manner based on canonicality.

As discussed in Chapter 2, processors that support 5-level paging expand the linear-address width from 48 bits to 57 bits. That expansion changes the operation of VMX transitions. Changes to VM entries are detailed in Section 3.1, while changes to VM exits are given in Section 3.2.
3.1 Linear-Address Expansion and VM Entries

Certain fields in the VMCS correspond to registers that contain linear addresses. VM entries confirm that those fields contain values that are canonical. This checking is based on the linear-address width supported by the processor (e.g., is based on 57-bit canonicality if the processor supports 5-level paging). The following are the fields to which this applies.

• In the host-state area:
  - The fields for the IA32_SYSENTER_EIP and IA32_SYSENTER_ESP MSRs.
  - The base-address fields for FS, GS, TR, GDTR, and IDTR.
• In the guest-state area:
  - The fields for the IA32_SYSENTER_EIP and IA32_SYSENTER_ESP MSRs.
  - The base-address fields for FS, GS, TR, GDTR, and IDTR.
  - The base-address field for LDTR (if LDTR will be usable).
  - The field for the IA32_BNDCFGS MSR (if VM entry is loading that MSR).

A VM entry to 64-bit mode also performs a check on the RIP field in the guest-state area of the current VMCS. If the VM entry would result in 4-level paging, it checks that bits 63:48 of the guest RIP field are identical; if it would result in 5-level paging, that check is on bits 63:57.¹
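The guest RIP check is slightly weaker than a canonicality check, which a short sketch makes visible. The function name `guest_rip_check_passes` is hypothetical; the code models only the bit comparison described in the preceding paragraph.

```python
def guest_rip_check_passes(guest_rip: int, entering_5_level_paging: bool) -> bool:
    """Model of the VM-entry check on the guest RIP field.

    Bits 63:48 must be identical when entering 4-level paging;
    bits 63:57 must be identical when entering 5-level paging.
    """
    low = 57 if entering_5_level_paging else 48
    upper = (guest_rip & (2**64 - 1)) >> low    # bits 63:low
    all_ones = (1 << (64 - low)) - 1
    return upper in (0, all_ones)


# Bit 47 is set while bits 63:48 are clear. The value is not 48-bit
# canonical (bit 47 differs from bits 63:48), yet the check passes.
print(guest_rip_check_passes(0x0000_8000_0000_0000, False))  # True
```

The check compares one bit fewer than full canonicality would require, so it can admit a RIP value that is not canonical for the paging mode being entered.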
3.2 Linear-Address Expansion and VM Exits

VM exits save the state of certain registers into the guest-state area of the VMCS. Some of these registers contain linear addresses. As discussed in Section 1.1, the CPU generally ensures that the values in these registers respect the CPU’s linear-address width. As a result, the values the VM exits save for these registers will do the same.
1. Note that these checks do not confirm that the guest RIP field is canonical relative to the paging mode being entered. For example, bits 63:47 are identical in a 48-bit canonical address. However, VM entry to 4-level paging may load RIP with a value in which bit 47 differs from that of bits 63:48.
There is a special case for the LDTR base address. If LDTR was not usable at the time of a VM exit, the value saved for the base address is undefined. However, this undefined value is always 48-bit canonical on processors that do not support 5-level paging and is always 57-bit canonical on processors that do support 5-level paging.

VM exits load the state of certain registers from the host-state area of the VMCS. Some of these registers contain linear addresses. Each VM exit ensures that the value of each of the following registers is canonical: the IA32_SYSENTER_EIP and IA32_SYSENTER_ESP MSRs; and the base addresses for FS, GS, TR, GDTR, and IDTR. How this is done depends on whether the processor supports 5-level paging.

• If the processor does not support 5-level paging, bits 47:0 of the register are loaded from the field in the host-state area; the value of bit 47 is then sign-extended into bits 63:48 of the register.
• If the processor does support 5-level paging, bits 56:0 of the register are loaded from the field in the host-state area; the value of bit 56 is then sign-extended into bits 63:57 of the register.

Again, there is a special case for LDTR. LDTR is always unusable after a VM exit. Its base address may be loaded with an undefined value. This undefined value is always 48-bit canonical on processors that do not support 5-level paging and is always 57-bit canonical on processors that do support 5-level paging.
4 5-Level EPT

5-level EPT is a new mode for EPT. As its name suggests, it will translate guest-physical addresses by traversing a 5-level hierarchy of EPT paging structures. Because the process is otherwise unmodified, 5-level EPT extends the processor’s guest-physical-address width to 57 bits. (The additional 9 bits are used to select an entry from the fifth level of the hierarchy.) For clarity, the original EPT mode will now be called 4-level EPT.

The remainder of this chapter specifies architectural changes to 4-level EPT as well as those that define and are entailed by 5-level EPT. Section 4.1 describes how the expansion of the guest-physical-address width affects 4-level EPT. Section 4.2 specifies how the CPU enumerates 5-level EPT and how the feature is enabled by software. Section 4.3 details how 5-level EPT translates guest-physical addresses.
4.1 4-Level EPT: Guest-Physical-Address Limit

As explained in Section 1.3, 4-level EPT is limited to translating 48-bit guest-physical addresses. This is not a problem on existing processors, because they limit the physical-address width to 46 bits (see Section 1.1). A processor’s physical-address width also limits guest-physical addresses. That means that, on existing processors, any attempt to use a guest-physical address that sets a bit above the low 48 bits will cause a page-fault exception (#PF).

Processors that support 5-level paging are expected to support 52 physical-address bits. Such processors allow use of a guest-physical address that sets bits in the range 51:48; no #PF is generated.

A guest-physical address that sets bits in the range 51:48 cannot be translated by 4-level EPT. An attempt to access such an address when 4-level EPT is active causes an EPT violation (see Section 1.3).

EPT violations generate information about the exception in a value called the exit qualification. In general, EPT violations caused by attempts to access a guest-physical address that is too wide establish the exit qualification as is currently done for other EPT violations. Exceptions are made for bits 6:3 of the exit qualification, which report the access rights for the guest-physical address. The new EPT violations always clear these bits.
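As an illustration of the rule above, the following sketch classifies a guest-physical address and shows the effect on the exit qualification. Both function names are hypothetical, and the exit qualification is modeled as a plain integer whose other bits are left unchanged.

```python
ACCESS_RIGHTS_MASK = 0b1111 << 3    # bits 6:3 of the exit qualification


def too_wide_for_4_level_ept(gpa: int) -> bool:
    """True if the guest-physical address sets any bit at or above bit 48,
    so that 4-level EPT cannot translate it."""
    return (gpa >> 48) != 0


def exit_qualification_for_too_wide(qualification: int) -> int:
    """Clear bits 6:3 (the access-rights report); keep all other bits."""
    return qualification & ~ACCESS_RIGHTS_MASK
```

For instance, an address with only bit 48 set is too wide in this model, and a qualification value of 0x7F becomes 0x07 once bits 6:3 are cleared.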
4.2 5-Level EPT: Enumeration and Enabling

This section describes how processors enumerate to software support for 5-level EPT and how software enables the processor to use that support.

4.2.1 Enumeration

Processors supporting EPT enumerate details related to EPT in the IA32_VMX_EPT_VPID_CAP MSR (index 48CH). Currently, IA32_VMX_EPT_VPID_CAP[bit 6] enumerates support for 4-level EPT. Processors that also support 5-level EPT will enumerate that fact by also setting IA32_VMX_EPT_VPID_CAP[bit 7].

The guest-physical-address width supported by a processor is not enumerated using the IA32_VMX_EPT_VPID_CAP MSR. This is because that width is always the same as the processor’s maximum physical-address width as enumerated by CPUID.80000008H:EAX[bits 7:0].
4.2.2 Enabling by Software

A VMM enables EPT by setting the “enable EPT” VM-execution control in the current VMCS before using the VMCS for VM entry.

Specific details of EPT operation are determined by the extended-page-table pointer field (EPTP) in the VMCS. In particular, EPTP[bits 5:3] contain a value that is 1 less than the number of levels used by the EPT. On existing processors, this value must be 3, indicating 4-level EPT. (VM entry fails if a different value is used.) Processors that also support 5-level EPT will also allow the value 4 (indicating 5-level EPT).

In summary, VM entry on a processor that supports 5-level EPT checks EPTP[bits 5:3]. If the value is 3, the VM entry activates 4-level EPT. If the value is 4, the VM entry activates 5-level EPT. With any other value, VM entry fails.
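Enumeration and enabling can be combined in one small model. The sketch assumes the caller has already read IA32_VMX_EPT_VPID_CAP into an integer; the function names are hypothetical, and `None` is used here to represent a failed VM entry. Other VM-entry checks on EPTP are outside its scope.

```python
def supported_ept_levels(ept_vpid_cap: int) -> set:
    """Decode bits 6 and 7 of the IA32_VMX_EPT_VPID_CAP value."""
    levels = set()
    if (ept_vpid_cap >> 6) & 1:
        levels.add(4)               # 4-level EPT supported
    if (ept_vpid_cap >> 7) & 1:
        levels.add(5)               # 5-level EPT supported
    return levels


def ept_levels_at_vm_entry(eptp: int, ept_vpid_cap: int):
    """Return the number of EPT levels that VM entry would activate,
    or None if the EPTP[bits 5:3] value would make VM entry fail."""
    levels = ((eptp >> 3) & 0b111) + 1      # field holds (levels - 1)
    if levels in supported_ept_levels(ept_vpid_cap):
        return levels
    return None
```

A VMM following this model would test for 5 in `supported_ept_levels(...)` before writing the value 4 into EPTP[bits 5:3].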
4.3 5-Level EPT: Guest-Physical-Address Translation

Like 4-level EPT, 5-level EPT translates guest-physical addresses using a hierarchy of in-memory paging structures. Because 5-level EPT increases the guest-physical-address width to 57 bits (from the 48 bits supported by 4-level EPT), 5-level EPT allows up to 128 PBytes of guest-physical-address space to be accessed at any given time.

The following items describe in more detail the changes that 5-level EPT makes to the translation process.

• Translation begins by identifying a 4-KByte naturally aligned EPT PML5 table. It is located at the physical address specified in bits 51:12 of EPTP. An EPT PML5 table comprises 512 64-bit entries (EPT PML5Es). An EPT PML5E is selected using the physical address defined as follows.
  - Bits 63:52 are all 0.
  - Bits 51:12 are from EPTP.
  - Bits 11:3 are bits 56:48 of the guest-physical address.
  - Bits 2:0 are all 0.
  Because an EPT PML5E is identified using bits 56:48 of the guest-physical address, it controls access to a 256-TByte region of the guest-physical-address space. The format of an EPT PML5E is given in Table 4-1.

Table 4-1. Format of an EPT PML5 Entry (EPT PML5E)

Bit Position(s)    Contents
0                  Read access; indicates whether reads are allowed from the 256-TByte region controlled by this entry.
1                  Write access; indicates whether writes are allowed to the 256-TByte region controlled by this entry.
2                  If the “mode-based execute control for EPT” VM-execution control is 0, execute access; indicates whether instruction fetches are allowed from the 256-TByte region controlled by this entry. If that control is 1, execute access for supervisor-mode linear addresses; indicates whether instruction fetches are allowed from supervisor-mode linear addresses in the 256-TByte region controlled by this entry.
7:3                Reserved (must be 0).
8                  If bit 6 of EPTP is 1, accessed flag for EPT; indicates whether software has accessed the 256-TByte region controlled by this entry. Ignored if bit 6 of EPTP is 0.
9                  Ignored.
10                 Execute access for user-mode linear addresses. If the “mode-based execute control for EPT” VM-execution control is 1, indicates whether instruction fetches are allowed from user-mode linear addresses in the 256-TByte region controlled by this entry. If that control is 0, this bit is ignored.
11                 Ignored.
M-1:12             Physical address of 4-KByte aligned EPT PML4 table referenced by this entry.
51:M               Reserved (must be 0).
63:52              Ignored.
• The next step of the translation process identifies a 4-KByte naturally aligned EPT PML4 table. It is located at the physical address specified in bits 51:12 of the EPT PML5E (see Table 4-1). An EPT PML4 table comprises 512 64-bit entries (EPT PML4Es). An EPT PML4E is selected using the physical address defined as follows.
  - Bits 51:12 are from the EPT PML5E.
  - Bits 11:3 are bits 47:39 of the guest-physical address.
  - Bits 2:0 are all 0.
  Because an EPT PML4E is identified using bits 56:39 of the guest-physical address, it controls access to a 512-GByte region of the guest-physical-address space.

Once the EPT PML4E is identified, bits 38:0 of the guest-physical address determine the remainder of the translation process exactly as is done for 4-level EPT. As suggested in Table 4-1, the values of bits 2:0 and bit 10 of the EPT PML5E are used normally (in combination with the corresponding bits in other EPT paging-structure entries) to determine whether EPT violations occur. The accessed flag (bit 8) in the EPT PML5E is updated as is done for other EPT paging-structure entries.
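The first two steps of the 5-level EPT walk reduce to simple bit manipulation. The following sketch computes the physical addresses of the selected EPT PML5E and EPT PML4E; the function names and the mask constant are hypothetical, and memory reads, permission checks, and reserved-bit checks are omitted.

```python
ADDR_51_12 = ((1 << 52) - 1) & ~0xFFF       # mask selecting bits 51:12


def ept_pml5e_address(eptp: int, gpa: int) -> int:
    """Physical address of the EPT PML5E selected for gpa."""
    index = (gpa >> 48) & 0x1FF             # guest-physical bits 56:48
    return (eptp & ADDR_51_12) | (index << 3)


def ept_pml4e_address(ept_pml5e: int, gpa: int) -> int:
    """Physical address of the EPT PML4E selected for gpa."""
    index = (gpa >> 39) & 0x1FF             # guest-physical bits 47:39
    return (ept_pml5e & ADDR_51_12) | (index << 3)
```

Each 9-bit index selects one of 512 entries, and shifting it left by 3 accounts for the 8-byte entry size.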
4.4 5-Level EPT and EPTP Switching

The value of EPTP may be modified in VMX non-root operation by invoking VM function 0 (EPTP switching). This is done by executing the VMFUNC instruction with value 0 in the EAX register. Invocation of VM function 0 loads EPTP with a value selected from a data structure in memory.

Before loading EPTP in this way, the processor first confirms that the value to be loaded is valid. The definition of a valid EPTP value depends on whether the processor supports 5-level EPT.

• If the processor does not support 5-level EPT, an EPTP value in memory is considered valid if it would not cause VM entry to fail (e.g., it does not set any reserved bits).
• If the processor does support 5-level EPT, an EPTP value in memory is considered valid only if it would not cause VM entry to fail (as above) and if its value in bits 5:3 (which controls the number of EPT levels) is the same as that of the current value of EPTP.

The implication is that an invocation of VM function 0 cannot change the EPT mode between 4-level EPT and 5-level EPT.
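The validity rule for EPTP switching can be expressed as a predicate. In this sketch the outcome of the ordinary VM-entry checks is passed in as a boolean (`passes_vm_entry_checks`), since those checks are not modeled here; all names are hypothetical.

```python
def ept_level_field(eptp: int) -> int:
    """Return EPTP[bits 5:3], which encodes the number of EPT levels minus 1."""
    return (eptp >> 3) & 0b111


def eptp_switch_value_valid(new_eptp: int, current_eptp: int,
                            supports_5_level_ept: bool,
                            passes_vm_entry_checks: bool) -> bool:
    """Model of the validity test applied by VM function 0."""
    if not passes_vm_entry_checks:
        return False
    if supports_5_level_ept:
        # The level field must match that of the current EPTP.
        return ept_level_field(new_eptp) == ept_level_field(current_eptp)
    return True
```

Under this model a switch from an EPTP with level field 3 to one with level field 4 is rejected on a processor that supports 5-level EPT, which is the implication stated above.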
5 Intel® Virtualization Technology for Directed I/O

Intel® Virtualization Technology for Directed I/O includes a feature called DMA remapping. DMA remapping provides hardware support for isolation of device accesses to memory. When a device attempts to access system memory, DMA-remapping hardware intercepts the access and utilizes paging structures to determine whether the access can be permitted; it also determines the actual location to access.

The DMA-remapping hardware may support two levels of address translation. One level may translate a linear address to a guest-physical address, while a second level may remap the guest-physical address to a physical address. The first-level translation uses paging structures with the same format as those used for ordinary paging. The second-level translation uses paging structures with the same format as those used for EPT.

It is expected that, on platforms that support wider linear and guest-physical addresses (using 5-level paging and 5-level EPT, respectively), the DMA-remapping hardware will be similarly enhanced to support those wider addresses with 5-level translation processes. This enhanced support for DMA remapping will be detailed in a future revision of the Intel® Virtualization Technology for Directed I/O Architecture Specification.