Static Program Analysis Statistics for Dummies Engr. Michael David
Copyright © 2021 by Engr. Michael David. All rights reserved. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, including photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the publisher, except in the case of brief quotations embodied in critical reviews and certain other noncommercial uses permitted by copyright law.
Contents

Preface

1 Introduction
    Applications of Static Program Analysis
    Approximative Answers
    Undecidability of Program Correctness

2 A Tiny Imperative Programming Language
    The Syntax of TIP
    Example Programs
    Normalization
    Abstract Syntax Trees
    Control Flow Graphs

3 Type Analysis
    Types
    Type Constraints
    Solving Constraints with Unification
    Record Types
    Limitations of the Type Analysis

4 Lattice Theory
    Motivating Example: Sign Analysis
    Lattices
    Constructing Lattices
    Equations, Monotonicity, and Fixed-Points

5 Dataflow Analysis with Monotone Frameworks
    Sign Analysis, Revisited
    Constant Propagation Analysis
    Fixed-Point Algorithms
    Live Variables Analysis
    Available Expressions Analysis
    Very Busy Expressions Analysis
    Reaching Definitions Analysis
    Forward, Backward, May, and Must
    Initialized Variables Analysis
    Transfer Functions

6 Widening
    Interval Analysis
    Widening and Narrowing

7 Path Sensitivity and Relational Analysis
    Control Sensitivity using Assertions
    Paths and Relations

8 Interprocedural Analysis
    Interprocedural Control Flow Graphs
    Context Sensitivity
    Context Sensitivity with Call Strings
    Context Sensitivity with the Functional Approach

9 Control Flow Analysis
    Closure Analysis for the λ-calculus
    The Cubic Algorithm
    TIP with First-Class Function
    Control Flow in Object Oriented Languages

10 Pointer Analysis
    Allocation-Site Abstraction
    Andersen's Algorithm
    Steensgaard's Algorithm
    Interprocedural Points-To Analysis
    Null Pointer Analysis
    Flow-Sensitive Points-To Analysis
    Escape Analysis

11 Abstract Interpretation
    A Collecting Semantics for TIP
    Abstraction and Concretization
    Soundness
    Optimality
    Completeness
    Trace Semantics
Preface

Static program analysis is the art of reasoning about the behavior of computer programs without actually running them. This is useful not only in optimizing compilers for producing efficient code but also for automatic error detection and other tools that can help programmers. A static program analyzer is a program that reasons about the behavior of other programs. For anyone interested in programming, what can be more fun than writing programs that analyze programs?

As known from Turing and Rice, all nontrivial properties of the behavior of programs written in common programming languages are mathematically undecidable. This means that automated reasoning about software generally must involve approximation. It is also well known that testing, i.e. concretely running programs and inspecting the output, may reveal errors but generally cannot show their absence. In contrast, static program analysis can, with the right kind of approximations, check all possible executions of the programs and provide guarantees about their properties. One of the key challenges when developing such analyses is how to ensure high precision and efficiency to be practically useful. For example, nobody will use an analysis designed for bug finding if it reports many false positives or if it is too slow to fit into real-world software development processes.

These notes present principles and applications of static analysis of programs. We cover basic type analysis, lattice theory, control flow graphs, dataflow analysis, fixed-point algorithms, widening and narrowing, path sensitivity, relational analysis, interprocedural analysis, context sensitivity, control-flow analysis, several flavors of pointer analysis, and key concepts of semantics-based abstract interpretation. A tiny imperative programming language with pointers and first-class functions is subjected to numerous different static analyses illustrating the techniques that are presented.

We take a constraint-based approach to static analysis where suitable constraint systems conceptually divide the analysis task into a front-end that generates constraints from program code and a back-end that solves the constraints to produce the analysis results. This approach enables separating the analysis specification, which determines its precision, from the algorithmic aspects that are important for its performance. In practice when implementing analyses, we often solve the constraints on-the-fly, as they are generated, without representing them explicitly.

We focus on analyses that are fully automatic (i.e., not involving programmer guidance, for example in the form of loop invariants or type annotations) and conservative (sound but incomplete), and we only consider Turing complete languages (like most programming languages used in ordinary software development).

The analyses that we cover are expressed using different kinds of constraint systems, each with their own constraint solvers:

• term unification constraints, with an almost-linear union-find algorithm,
• conditional subset constraints, with a cubic-time algorithm, and
• monotone constraints over lattices, with variations of fixed-point solvers.

The style of presentation is intended to be precise but not overly formal. The readers are assumed to be familiar with advanced programming language concepts and the basics of compiler construction and computability theory.

The notes are accompanied by a web site that provides lecture slides, an implementation (in Scala) of most of the algorithms we cover, and additional exercises:
Chapter 1

Introduction

Static program analysis aims to automatically answer questions about the possible behaviors of programs. In this chapter, we explain why this can be useful and interesting, and we discuss the basic characteristics of analysis tools.

1.1 Applications of Static Program Analysis

Static program analysis has been used since the early 1960s in optimizing compilers. More recently, it has proven useful also for bug finding and verification tools, and in IDEs to support program development. In the following, we give some examples of the kinds of questions about program behavior that arise in these different applications.

Analysis for program optimization

Optimizing compilers (including just-in-time compilers in interpreters) need to know many different properties of the program being compiled, in order to generate efficient code. A few examples of such properties are:

• Does the program contain dead code, or more specifically, is function f unreachable from main? If so, the code size can be reduced.
• Is the value of some expression inside a loop the same in every iteration? If so, the expression can be moved outside the loop to avoid redundant computations.
• Does the value of variable x depend on the program input? If not, it could be precomputed at compile time.
• What are the lower and upper bounds of the integer variable x? The answer may guide the choice of runtime representation of the variable.
• Do p and q point to disjoint data structures in memory? That may enable parallel processing.
Analysis for program correctness

The most successful analysis tools that have been designed to detect errors (or verify absence of errors) target generic correctness properties that apply to most or all programs written in specific programming languages. In unsafe languages like C, such errors sometimes lead to critical security vulnerabilities. In safer languages like Java, such errors are typically less severe, but they can still cause program crashes. Examples of such properties are:

• Does there exist an input that leads to a null pointer dereference, division-by-zero, or arithmetic overflow?
• Are all variables initialized before they are read?
• Are arrays always accessed within their bounds?
• Can there be dangling references, i.e., use of pointers to memory that has been freed?
• Does the program terminate on every input? Even in reactive systems such as operating systems, the individual software components, for example device driver routines, are expected to always terminate.

Other correctness properties depend on specifications provided by the programmer for the individual programs (or libraries), for example:

• Are all assertions guaranteed to succeed? Assertions express program-specific correctness properties that are supposed to hold in all executions.
• Is function hasNext always called before function next, and is open always called before read? Many libraries have such so-called typestate correctness properties.
• Does the program throw an ActivityNotFoundException or a SQLiteException for some input?

With web and mobile software, information flow correctness properties have become extremely important:

• Can input values from untrusted users flow unchecked to file system operations? This would be a violation of integrity.
• Can secret information become publicly observable? Such situations are violations of confidentiality.

The increased use of concurrency (parallel or distributed computing) and event-driven execution models gives rise to more questions about program behavior:

• Are data races possible? Many errors in multi-threaded programs are caused by two threads using a shared resource without proper synchronization.
• Can the program (or parts of the program) deadlock? This is often a concern for multi-threaded programs that use locks for synchronization.
Analysis for program development

Modern IDEs perform various kinds of program analysis to support debugging, refactoring, and program understanding. This involves questions such as:

• Which functions may possibly be called on line 117, or conversely, where can function f possibly be called from? Function inlining and other refactorings rely on such information.
• At which program points could x be assigned its current value? Can the value of variable x affect the value of variable y? Such questions often arise when programmers are trying to understand large codebases and during debugging when investigating why a certain bug appears.
• What types of values can variable x have? This kind of question often arises with programming languages where type annotations are optional or entirely absent, for example OCaml, JavaScript, or Python.
1.2 Approximative Answers

Regarding correctness, programmers routinely use testing to gain confidence that their programs work as intended, but as famously stated by Dijkstra [Dij70]: "Program testing can be used to show the presence of bugs, but never to show their absence." Ideally we want guarantees about what our programs may do for all possible inputs, and we want these guarantees to be provided automatically, that is, by programs. A program analyzer is such a program that takes other programs as input and decides whether or not they have a certain property.

Reasoning about the behavior of programs can be extremely difficult, even for small programs. As an example, does the following program code terminate on every integer input n (assuming arbitrary-precision integers)?

    while (n > 1) {
      if (n % 2 == 0) // if n is even, divide it by two
        n = n / 2;
      else // if n is odd, multiply by three and add one
        n = 3 * n + 1;
    }

In 1937, Collatz conjectured that the answer is "yes". As of 2017, the conjecture has been checked for all inputs up to 87 · 2^60, but nobody has been able to prove it for all inputs [Roo19].

Even straight-line programs can be difficult to reason about. Does the following program output true for some integer inputs?

    x = input; y = input; z = input;
    output x*x*x + y*y*y + z*z*z == 42;
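Both questions can be probed by brute force, which also shows why testing does not settle them. The following Python sketch is only an illustration: the names collatz_steps and find_cubes, the step budget, and the search bound are assumptions introduced here, not part of the notes.

```python
from itertools import product
from typing import Optional, Tuple


def collatz_steps(n: int, max_steps: int = 10_000) -> Optional[int]:
    """Run the loop from the text and count its iterations.

    Returns None if the step budget runs out first, which proves nothing
    about termination for that input.
    """
    steps = 0
    while n > 1:
        if steps == max_steps:
            return None
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


def find_cubes(target: int, bound: int) -> Optional[Tuple[int, int, int]]:
    """Search all |x|, |y|, |z| <= bound for x^3 + y^3 + z^3 == target."""
    values = range(-bound, bound + 1)
    for x, y, z in product(values, repeat=3):
        if x**3 + y**3 + z**3 == target:
            return (x, y, z)
    return None
```

A call such as collatz_steps(27) terminates, and find_cubes(42, 10) finds nothing, but neither outcome says anything about all inputs: a bounded search can confirm individual cases and can never rule out a counterexample or a solution beyond the bound.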
Whether the straight-line program above can output true was an open problem from 1954 until 2019, when the answer was found after over a million hours of computing [BS19].

Rice's theorem [Ric53] is a general result from 1953 which informally states that all interesting questions about the behavior of programs (written in Turing-complete programming languages[1]) are undecidable. This is easily seen for any special case. Assume for example the existence of an analyzer that decides if a variable in a program has a constant value in any execution. In other words, the analyzer is a program A that takes as input a program T, one of T's variables x, and some value k, and decides whether or not x's value is always equal to k whenever T is executed.

[Figure: the input (T, x, k) is given to A, which asks "Is the value of variable x always equal to k when T is executed?" and answers yes or no.]

[1] From this point on, we only consider Turing complete languages.

We could then exploit this analyzer to also decide the halting problem by using as input the following program where TM(j) simulates the j'th Turing machine on empty input:

    x = 17; if (TM(j)) x = 18;

Here x has a constant value 17 if and only if the j'th Turing machine does not halt on empty input. If the hypothetical constant-value analyzer A exists, then we have a decision procedure for the halting problem, which is known to be impossible [Tur37].

At first, this seems like a discouraging result. However, this theoretical result does not prevent approximative answers. While it is impossible to build an analysis that would correctly decide a property for any analyzed program, it is often possible to build analysis tools that give useful answers for most realistic programs. As the ideal analyzer does not exist, there is always room for building more precise approximations (which is colloquially called the full employment theorem for static program analysis designers).

Approximative answers may be useful for finding bugs in programs, which may be viewed as a weak form of program verification. As a case in point, consider programming with pointers in the C language. This is fraught with dangers such as null dereferences, dangling pointers, leaking memory, and unintended aliases. Ordinary compilers offer little protection from pointer errors. Consider the following small program which may perform every kind of error:

    #include <assert.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(int argc, char *argv[]) {
      if (argc == 42) {
        char *p, *q;
        p = NULL;
        printf("%s", p);
        q = (char *)malloc(100);
        p = q;
        free(q);
        *p = 'x';
        free(p);
        p = (char *)malloc(100);
        p = (char *)malloc(100);
        q = p;
        strcat(p, q);
        assert(argc > 87);
      }
    }

Standard compiler tools such as gcc -Wall detect no errors in this program. Testing might also miss the errors (for this program, no errors are encountered unless we happen to have a test case that runs the program with exactly 42 arguments). However, if we had even approximative answers to questions about null values, pointer targets, and branch conditions then many of the above errors could be caught statically, without actually running the program.

Exercise 1.1: Describe all the pointer-related errors in the above program.

Ideally, the approximations we use are conservative (or safe), meaning that all errors lean to the same side, which is determined by our intended application. As an example, approximating the memory usage of programs is conservative if the estimates are never lower than what is actually possible when the programs are executed. Conservative approximations are closely related to the concept of soundness of program analyzers. We say that a program analyzer is sound if it never gives incorrect results (but it may answer maybe). Thus, the notion of soundness depends on the intended application of the analysis output, which may cause some confusion. For example, a verification tool is typically called sound if it never misses any errors of the kinds it has been designed to detect, but it is allowed to produce spurious warnings (also called false positives), whereas an automated testing tool is called sound if all reported errors are genuine, but it may miss errors.

Program analyses that are used for optimizations typically require soundness. If given false information, the optimization may change the semantics of the program. Conversely, if given trivial information, then the optimization fails to do anything.
Consider again the problem of determining if a variable has a constant value. If our intended application is to perform constant propagation optimization, then the analysis may only answer yes if the variable really is a constant and must answer maybe if the variable may or may not be a constant. The trivial solution is of course to answer maybe all the time, so we are facing the engineering challenge of answering yes as often as possible while obtaining a reasonable analysis performance.

[Figure: the input (T, x, k) is given to A, which asks "Is the value of variable x always equal to k when T is executed?" and answers either "yes, definitely!" or "maybe, don't know".]
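The yes/maybe behavior can be sketched for a deliberately tiny setting. The Python sketch below assumes straight-line programs given as a list of assignments whose right-hand sides are integer literals, variable names, or the string "input". This representation and the name always_equals are assumptions made for illustration, and the sketch is not the constant propagation analysis developed in later chapters.

```python
from typing import Dict, List, Optional, Tuple, Union

# One assignment: (target variable, right-hand side). The right-hand side is
# an int literal, the name of another variable, or the string "input"
# (an unknown value read at run time). This encoding is an assumption.
Assignment = Tuple[str, Union[int, str]]


def always_equals(program: List[Assignment], x: str, k: int) -> str:
    """Answer "yes" only if x certainly equals k at the end, else "maybe"."""
    known: Dict[str, Optional[int]] = {}  # None means "not a known constant"
    for target, rhs in program:
        if isinstance(rhs, int):
            known[target] = rhs             # assigned a literal constant
        elif rhs == "input":
            known[target] = None            # value depends on the input
        else:
            known[target] = known.get(rhs)  # copy what is known about rhs
    return "yes" if known.get(x) == k else "maybe"
```

The sketch never answers yes wrongly, so it is conservative in the sense described above. It is also imprecise: it answers maybe even when x is a known constant different from k, and it gives up entirely on anything read from the input.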
In the following chapters we focus on techniques for computing approximations that are conservative with respect to the semantics of the programming language. The theory of semantics-based abstract interpretation presented in Chapter 11 provides a solid mathematical framework for reasoning about analysis soundness and precision. Although soundness is a laudable goal in analysis design, modern analyzers for real programming languages often cut corners by sacrificing soundness to obtain better precision and performance, for example when modeling reflection in Java [LSS+15].
1.3 Undecidability of Program Correctness

(This section requires familiarity with the concept of universal Turing machines; it is not a prerequisite for the following chapters.)

The reduction from the halting problem presented above shows that some static analysis problems are undecidable. However, halting is often the least of the concerns programmers have about whether their programs work correctly. For example, if we wish to ensure that the programs we write cannot crash with null pointer errors, we may be willing to assume that the programs do not also have problems with infinite loops.

Using a diagonalization argument we can show a very strong result: It is impossible to build a static program analysis that can decide whether a given program may fail when executed. Moreover, this result holds even if the analysis is only required to work for programs that halt on all inputs. In other words, the halting problem is not the only obstacle; approximation is inevitably necessary.

If we model programs as deterministic Turing machines, program failure can be modeled using a special fail state.[2] That is, on a given input, a Turing machine will eventually halt in its accept state (intuitively returning "yes"), in its reject state (intuitively returning "no"), in its fail state (meaning that the correctness condition has been violated), or the machine diverges (i.e., never halts). A Turing machine is correct if its fail state is unreachable.

[2] Technically, we here restrict ourselves to safety properties; liveness properties can be addressed similarly using other models of computability.

We can show the undecidability result using an elegant proof by contradiction. Assume P is a program that can decide whether or not any given total Turing machine is correct. (If the input to P is not a total Turing machine, P's output is unspecified; we only require it to correctly analyze Turing machines that always halt.) Let us say that P halts in its accept state if and only if the given Turing machine is correct, and it halts in the reject state otherwise. Our goal is to show that P cannot exist.

If P exists, then we can also build another Turing machine, let us call it M, that takes as input the encoding e(T) of a Turing machine T and then builds the encoding e(S_T) of yet another Turing machine S_T, which behaves as follows: S_T is essentially a universal Turing machine that is specialized to simulate T on input e(T). Let w denote the input to S_T. Now S_T is constructed such that it simulates T on input e(T) for at most |w| moves. If the simulation ends in T's accept state, then S_T goes to its fail state. It is obviously possible to create S_T in such a way that this is the only way it can reach its fail state. If the simulation does not end in T's accept state (that is, |w| moves have been made, or the simulation reaches T's reject or fail state), then S_T goes to its accept state or its reject state (which one we choose does not matter). This completes the explanation of how S_T works relative to T and w. Note that S_T never diverges, and it reaches its fail state if and only if T accepts input e(T) after at most |w| moves.

After building e(S_T), M passes it to our hypothetical program analyzer P. Assuming that P works as promised, it ends in accept if S_T is correct, in which case we also let M halt in its accept state, and in reject otherwise, in which case M similarly halts in its reject state.

[Figure: M takes e(T) as input, constructs e(S_T) from e(T), and passes e(S_T) to P; M accepts if P accepts and rejects if P rejects.]
We now ask: Does M accept input e(M)? That is, what happens if we run M with T = M? If M does accept input e(M), it must be the case that P accepts input e(S_T), which in turn means that S_T is correct, so its fail state is unreachable. In other words, for any input w, no matter its length, S_T does not reach its fail state. This in turn means that T does not accept input e(T). However, we have T = M, so this contradicts our assumption that M accepts input e(M). Conversely, if M rejects input e(M), then P rejects input e(S_T), so the fail state of S_T is reachable for some input v. This means that there must exist some w such that the fail state of S_T is reached in |w| steps on input v, so T must accept input e(T), and again we have a contradiction. By construction M halts in either accept or reject on any input, but neither is possible for input e(M). In conclusion, the ideal program correctness analyzer P cannot exist.

Exercise 1.2: In the above proof, the hypothetical program analyzer P is only required to correctly analyze programs that always halt. Show how the proof can be simplified if we want to prove the following weaker property: There exists no Turing machine P that can decide whether or not the fail state is reachable in a given Turing machine. (Note that the given Turing machine is now not assumed to be total.)
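The construction of S_T from T is the only part of the proof that can actually be built, and it can be sketched in Python under a strong simplifying assumption: a machine is modelled as a function run(input, fuel) that simulates at most fuel moves and returns its halting state, or None if it has not halted yet. The names Machine, make_S and the two sample machines are hypothetical and introduced only for this illustration. The analyzer P is not modelled, since the proof shows that it cannot exist.

```python
from typing import Callable, Optional

# Assumed model: run(input, fuel) simulates at most `fuel` moves and returns
# "accept", "reject" or "fail" if the machine halts within that budget,
# and None otherwise.
Machine = Callable[[str, int], Optional[str]]


def make_S(T: Machine, encoded_T: str) -> Callable[[str], str]:
    """Build S_T: on input w, simulate T on e(T) for at most |w| moves."""
    def S(w: str) -> str:
        outcome = T(encoded_T, len(w))
        # The fail state is reached only if the bounded simulation accepts.
        # In every other case S_T halts normally; which non-fail state is
        # chosen does not matter.
        return "fail" if outcome == "accept" else "accept"
    return S


def accepts_after_three_moves(inp: str, fuel: int) -> Optional[str]:
    """Sample machine: accepts every input after exactly three moves."""
    return "accept" if fuel >= 3 else None


def never_halts(inp: str, fuel: int) -> Optional[str]:
    """Sample machine: diverges on every input."""
    return None
```

In this model S always returns, mirroring the observation that S_T never diverges, and it returns "fail" exactly when the input w is long enough for the bounded simulation of T to reach its accept state.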
Chapter 2

A Tiny Imperative Programming Language

We use a tiny imperative programming language, called TIP, throughout the following chapters. It is designed to have a minimal syntax and yet to contain all the constructions that make static analyses interesting and challenging. Different language features are relevant for the different static analysis concepts, so in each chapter we focus on a suitable fragment of the language.

2.1 The Syntax of TIP

In this section we present the formal syntax of the TIP language, expressed as a context-free grammar. TIP programs interact with the world simply by reading input from a stream of integers (for example obtained from the user's keyboard) and writing output as another stream of integers (to the user's screen). The language lacks many features known from commonly used programming languages, for example, global variables, nested functions, objects, and type annotations. We will consider some of those features in exercises in later chapters.
Basic Expressions

The basic expressions all denote integer values:

    I → 0 | 1 | -1 | 2 | -2 | ...
    X → x | y | z | ...
    E → I
      | X
      | E + E | E - E | E * E | E / E | E > E | E == E
      | ( E )
      | input

Expressions E include integer constants I and variables (identifiers) X. The input expression reads an integer from the input stream. The comparison operators yield 0 for false and 1 for true. Function calls, record operations, and pointer expressions will be added later.
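The grammar of basic expressions maps directly onto a small tree data type with a recursive evaluator. The following Python sketch is one possible encoding, not the representation used in the Scala implementation: the class names, the left-to-right evaluation order, and the use of floor division for / are all assumptions, since the notes leave these details unspecified.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Union


@dataclass
class Num:          # integer constant I
    value: int


@dataclass
class Var:          # variable (identifier) X
    name: str


@dataclass
class Input:        # the input expression
    pass


@dataclass
class BinOp:        # E op E, with op one of + - * / > ==
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Var, Input, BinOp]


def evaluate(e: Expr, env: Dict[str, int], inputs: Iterator[int]) -> int:
    """Evaluate e; operands are evaluated left to right (an assumption)."""
    if isinstance(e, Num):
        return e.value
    if isinstance(e, Var):
        return env[e.name]
    if isinstance(e, Input):
        return next(inputs)           # read the next integer from the stream
    left = evaluate(e.left, env, inputs)
    right = evaluate(e.right, env, inputs)
    if e.op == "+":
        return left + right
    if e.op == "-":
        return left - right
    if e.op == "*":
        return left * right
    if e.op == "/":
        return left // right          # floor division: an assumption
    if e.op == ">":
        return 1 if left > right else 0    # comparisons yield 1 or 0
    if e.op == "==":
        return 1 if left == right else 0
    raise ValueError(f"unknown operator {e.op!r}")
```

For instance, evaluate(BinOp("+", Var("x"), BinOp("*", Num(2), Input())), {"x": 1}, iter([5])) reads one integer from the input stream and combines it with the value of x.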
Statements

The simple statements S are familiar:

    S → X = E;
      | output E;
      | S S
      |
      | if (E) { S } [ else { S } ]?
      | while (E) { S }

We use the notation [ ... ]? to indicate optional parts. In the conditions we interpret 0 as false and all other values as true. The output statement writes an integer value to the output stream.
Functions

A function declaration F contains a function name, a list of parameters, local variable declarations, a body statement, and a return expression:

    F → X ( X,...,X ) { [ var X,...,X; ]? S return E; }

Function names and parameters are identifiers, like variables. The var block declares a collection of uninitialized local variables. Function calls are an extra kind of expression:

    E → X ( E,...,E )

We sometimes treat var blocks and return instructions as statements.
Records

A record is a collection of fields, each having a name and a value. The syntax for creating records and for reading field values looks as follows:

    E → { X:E,..., X:E }
      | E.X

To keep the language simple, records are immutable.
Pointers

To be able to build data structures and dynamically allocate memory, we introduce pointers:

    E → alloc E | &X | *E | null

The first expression allocates a new cell in the heap initialized with the value of the given expression and results in a pointer to the cell. The second expression creates a pointer to a program variable, and the third expression dereferences a pointer value. In order to assign values through pointers we allow another form of assignment:

    S → *X = E;

In such an assignment, if the variable on the left-hand side holds a pointer to a cell, then the value of the right-hand side expression is stored in that cell. Pointers and integers are distinct values, and pointer arithmetic is not possible.
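One way to picture the four pointer expressions and the pointer assignment is to model every storage location as a mutable cell, so that a pointer is just a reference to a cell. The Python sketch below follows that idea. The class Cell, the helper names, the use of None for null, and the decision to raise an error on a null dereference are assumptions for illustration, since the notes do not fix the runtime behavior.

```python
from typing import Any, Dict, Optional


class Cell:
    """A mutable storage cell; a pointer is a reference to a Cell."""

    def __init__(self, value: Any = None) -> None:
        self.value = value


def alloc(value: Any) -> Cell:
    """alloc E: allocate a new heap cell initialized with the value of E."""
    return Cell(value)


def address_of(env: Dict[str, Cell], name: str) -> Cell:
    """&X: variables live in cells, so a pointer to X is X's own cell."""
    return env[name]


def deref(pointer: Optional[Cell]) -> Any:
    """*E: read the cell that the pointer refers to (None models null)."""
    if pointer is None:
        raise RuntimeError("null pointer dereference")
    return pointer.value


def store(pointer: Optional[Cell], value: Any) -> None:
    """*X = E: write the value into the cell that X points to."""
    if pointer is None:
        raise RuntimeError("null pointer dereference")
    pointer.value = value
```

Because pointers are Cell objects rather than integers, the model has no pointer arithmetic, which matches the restriction stated above.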
Functions as Values

We also allow functions as first-class values. The name of a function can be used as a kind of variable that refers to the function, and such function values can be assigned to ordinary variables, passed as arguments to functions, and returned from functions. We add a generalized form of function calls (sometimes called computed or indirect function calls, in contrast to the simple direct calls described earlier):

    E → E ( E,...,E )

Unlike simple function calls, the function being called is now an expression that evaluates to a function value. Function values allow us to illustrate the main challenges that arise with methods in object-oriented languages and with higher-order functions in functional languages.
Programs

A complete program is just a collection of functions:

    P → F ... F

(We sometimes also refer to individual functions or statements as programs.) For a complete program, the function named main is the one that initiates execution. Its arguments are supplied in sequence from the beginning of the input stream, and the value that it returns is appended to the output stream.

To keep the presentation short, we deliberately have not specified all details of the TIP language, neither the syntax nor the semantics.
Exercise 2.1: Identify some of the under-specified parts of the TIP language, and propose meaningful choices to make it more well-defined.

In Chapter 11 we formally define semantics for a part of TIP.
2.2 Example Programs

The following TIP programs all compute the factorial of a given integer. The first one is iterative:

    iterate(n) {
      var f;
      f = 1;
      while (n>0) {
        f = f*n;
        n = n-1;
      }
      return f;
    }

The second program is recursive:

    recurse(n) {
      var f;
      if (n==0) { f=1; }
      else { f=n*recurse(n-1); }
      return f;
    }

The third program is unnecessarily complicated:

    foo(p,x) {
      var f,q;
      if (*p==0) { f=1; }
      else {
        q = alloc 0;
        *q = (*p)-1;
        f=(*p)*(x(q,x));
      }
      return f;
    }

    main() {
      var n;
      n = input;
      return foo(&n,foo);
    }

More example programs can be found in the Scala implementation of TIP.
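To see that the three programs really compute the same function, they can be transliterated into Python. This is a hand translation made for illustration, not output of any TIP tool. It assumes that a heap cell or an addressed variable can be modelled by a small Cell object, and that main receives n as a parameter instead of reading it from the input stream.

```python
class Cell:
    """Stands in for a heap cell or for a variable whose address is taken."""

    def __init__(self, value):
        self.value = value


def iterate(n):
    f = 1
    while n > 0:
        f = f * n
        n = n - 1
    return f


def recurse(n):
    if n == 0:
        f = 1
    else:
        f = n * recurse(n - 1)
    return f


def foo(p, x):
    if p.value == 0:               # *p == 0
        f = 1
    else:
        q = Cell(0)                # q = alloc 0;
        q.value = p.value - 1      # *q = (*p)-1;
        f = p.value * x(q, x)      # f = (*p)*(x(q,x)); an indirect call
    return f


def main(n):
    cell = Cell(n)                 # stands in for &n
    return foo(cell, foo)          # foo is passed to itself as a value
```

The third version exercises exactly the features that make analysis harder: values flow through heap cells, and the function being called is itself a value passed as an argument.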
Nоrmаlizаtiоn А rich аnd flеxiblе syntаx is usеful whеn writing prоgrаms, but whеn dеscribing аnd implеmеnting stаtic аnаlysеs, it is оftеn cоnvеniеnt tо wоrk with а syntаcticаlly simplеr lаnguаgе. Fоr this rеаsоn wе sоmеtimеs nоrmаlizе prоgrаms by trаnsfоrming thеm intо еquivаlеnt but syntаcticаlly simplеr оnеs. А pаrticulаrly usеful nоrmаlizаtiоn is t о flаttеn nеstеd pоintеr еxprеssiоns, such thаt pоintеr dеrеfеrеncеs аrе аlwаys оf thе fоrm X rаthеr thаn thе mоrе gеnеrаl Е, аnd similаrly, functiоn cаlls аrе аlwаys оf thе fоrm X(X,.. . ,X) rаthеr thаn Е(Е,.. . ,Е). It mаy аlsо bе usеful tо flаttеn аrithmеtic еxprеssiоns, аrgumеnts tо dirеct cаlls, brаnch cоnditiоns, аnd rеturn еxprеssiоns. Аs аn еxаmplе, x = f(y+3) 5; cаn bе nоrmаlizеd tо t1 = y+3; t2 = f(t1); x = t2 5; whеrе t1 аnd t2 аrе frеsh vаriаblеs, whеrеby еаch stаtеmеnt pеrfоrms оnly оnе оpеrаtiоn. Еxеrcisе 2.2: Аrguе thаt аny TIP pr оgrаm cаn bе nоrmаlizеd s о thаt аll еxprеssiоns, with thе еxcеptiоn оf right-hаnd sidе еxprеssiоns оf аssignmеnts, аrе vаriаblеs. (This is sоmеtimеs cаllеd А-nоrmаl fоrm [FSDF93].) Еxеrcisе 2.3: Shоw hоw thе fоllоwing stаtеmеnt cаn bе nоrmаlizеd: x = ( f)(g()+h()); Еxеrcisе 2.4: In thе currеnt syntаx fоr TIP, hеаp аssignmеnts аrе rеstrictеd tо thе fоrm X = Е. Lаnguаgеs likе C аllоw thе mоrе gеnеrаl Е1 = Е2 whеrе Е1 is аn еxprеssiоn thаt еvаluаtеs tо а (nоn-functiоn) pоintеr. Еxplаin hоw thе stаtеmеnt x= y; cаn bе nоrmаlizеd tо fit thе currеnt TIP syntаx. TIP usеs lеxicаl scоping, hоwеvеr, wе mаkе thе nоtаtiоnаlly simplifying аssumptiоn thаt аll dеclаrеd vаriаblе аnd functiоn nаmеs аrе uniquе in а prоgrаm, i.е. thаt nо idеntifiеrs is dеclаrеd mоrе thаn оncе. Еxеrcisе 2.5: Аrguе thаt аny prоgrаm cаn bе nоrmаlizеd sо thаt аll dеclаrеd idеntifiеrs аrе uniquе.
Fоr rеаl prоgrаmming lаnguаgеs, wе оftеn usе vаriаtiоns оf thе intеrmеdiаtе rеprеsеntаtiоns оf cоmpilеrs оr virtuаl mаchinеs аs fоundаtiоn fоr implеmеnting аnаlyzеrs, rаthеr thаn using thе high-lеvеl sоurcе cоdе.
Abstract Syntax Trees

Abstract syntax trees (ASTs) as known from compiler construction provide a representation of programs that is suitable for flow-insensitive analyses, for example, type analysis (Chapter 3), control flow analysis (Chapter 9), and pointer analysis (Chapter 10). Such analyses ignore the execution order of statements in a function or block, which makes ASTs a convenient representation. As an example, the AST for the iterate program can be illustrated as follows.

[Figure: AST of the iterate function. The root iterate has the children n, var f, f = 1, while, and return f; the while node has the condition n > 0 and the body statements f = f * n and n = n - 1.]
With this rеprеsеntаtiоn, it is еаsy tо еxtrаct thе sеt оf stаtеmеnts аnd thеir structurе fоr еаch functiоn in thе prоgrаm.
Control Flow Graphs

For flow-sensitive analysis, in particular dataflow analysis (Chapter 5), where statement order matters, it is more convenient to view the program as a control flow graph, which is a different representation of the program code. This idea goes back to the very first program analyzers in optimizing compilers [All70]. We first consider the subset of the TIP language consisting of a single function body without pointers. Control flow graphs for programs comprising multiple functions are treated in Chapters 8 and 9.

A control flow graph (CFG) is a directed graph, in which nodes correspond to statements and edges represent possible flow of control. For convenience, and without loss of generality, we can assume a CFG to always have a single point of entry, denoted entry, and a single point of exit, denoted exit. We may think of these as no-op statements.
If v is a node in a CFG then pred(v) denotes the set of predecessor nodes and succ(v) the set of successor nodes. For programs that are fully normalized (cf. Section 2.3), each node corresponds to only one operation. For now, we only consider simple statements, for which CFGs may be constructed in an inductive manner. The CFGs for assignments, output, return statements, and declarations look as follows:

[Figure: four single-node CFGs, labeled X = E, output E, return E, and var X.]
Fоr thе sеquеncе S1 S2, wе еliminаtе thе еxit nоdе оf S1 аnd thе еntry nоdе оf S2 аnd gluе thе stаtеmеnts tоgеthеr:
[Figure: the CFG of S1 with an edge leading into the CFG of S2.]
Similаrly, thе оthеr cоntrоl structurеs аrе mоdеlеd by inductivе grаph cоnstructiоns (sоmеtimеs with brаnch еdgеs lаbеlеd with truе аnd fаlsе):
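[Figure: CFGs for if (E) S, if (E) S1 else S2, and while (E) S. In each graph the condition node E has one outgoing edge labeled true and one labeled false. For the if statements the true edge enters S (or S1) and the false edge bypasses it (or enters S2); for the while loop the true edge enters S, whose exit leads back to E, and the false edge leaves the loop.]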
Using this systеmаtic аpprоаch, thе itеrаtivе fаctоriаl functiоn rеsults in thе fоllоwing CFG:
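[Figure: CFG of the iterative factorial function. The nodes var f, f=1, and n>0 follow each other in sequence; the true edge of n>0 leads to f=f*n and then n=n-1, which leads back to n>0; the false edge of n>0 leads to return f.]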
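The inductive construction can be sketched in Python as follows. The statement encoding and the name build_cfg are assumptions made for this illustration; in particular, nodes are identified by their statement text, which only works when every statement in the function is textually distinct.

```python
from collections import defaultdict

def build_cfg(body):
    """Build succ/pred maps for a list of statements.

    Hypothetical encoding (an assumption of this sketch): a string is a simple
    statement; ("while", cond, [body]) and ("if", cond, [then], [else]) are
    the control structures, with the condition acting as the branch node.
    """
    succ, pred = defaultdict(set), defaultdict(set)

    def edge(u, v):
        succ[u].add(v)
        pred[v].add(u)

    def seq(stmts, dangling):
        # Thread the dangling exits through each statement in order.
        for s in stmts:
            dangling = one(s, dangling)
        return dangling

    def one(s, dangling):
        if isinstance(s, str):
            for d in dangling:
                edge(d, s)
            return [s]
        if s[0] == "while":
            _, cond, loop_body = s
            for d in dangling:
                edge(d, cond)
            for d in seq(loop_body, [cond]):
                edge(d, cond)            # back edge to the condition
            return [cond]                # the false branch leaves the loop
        if s[0] == "if":
            _, cond, then_branch, else_branch = s
            for d in dangling:
                edge(d, cond)
            return seq(then_branch, [cond]) + seq(else_branch, [cond])
        raise ValueError(f"unknown statement: {s!r}")

    for d in seq(body, ["entry"]):
        edge(d, "exit")
    return succ, pred

iterate_body = [
    "var f",
    "f=1",
    ("while", "n>0", ["f=f*n", "n=n-1"]),
    "return f",
]
succ, pred = build_cfg(iterate_body)
print(sorted(succ["n>0"]))   # successors of the loop condition
print(sorted(pred["n>0"]))   # predecessors of the loop condition
```

Returning the list of dangling exits from each construct plays the role of gluing the exit of one sub-graph to the entry of the next.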
Exercise 2.6: Draw the AST and the CFG for the recurse program from Section 2.2.

Exercise 2.7: If TIP were to be extended with a do-while construct (as in do { x=x-1; } while(x>0)), what would the corresponding control flow graphs look like?
Chаptеr 3
Type Analysis

The TIP programming language does not have explicit type declarations, but of course the various operations are intended to be applied only to certain kinds of values. Specifically, the following restrictions seem reasonable:

• arithmetic operations and comparisons apply only to integers;
• conditions in control structures must be integers;
• only integers can be input and output of the main function;
• only functions can be called, and with correct number of arguments;
• the unary * operator only applies to pointers (or null); and
• field lookups are only performed on records, not on other types of values.

We assume that their violation results in runtime errors. Thus, for a given program we would like to know that these requirements hold during execution. Since this is a nontrivial question, we immediately know (Section 1.3) that it is undecidable. We resort to a conservative approximation: typability. A program is typable if it satisfies a collection of type constraints that is systematically derived, typically from the program AST. The type constraints are constructed in such a way that the above requirements are guaranteed to hold during execution, but the converse is not true. Thus, our type analysis will be conservative and reject some programs that in fact will not violate any requirements during execution.

In most mainstream programming languages with static type checking, the programmer must provide type annotations for all declared variables and functions. Type annotations serve as useful documentation, and they also make it easier to design and implement type systems. TIP does not have type annotations, so our type analysis must infer all the types, based on how the variables and functions are being used in the program.
Exercise 3.1: Type checking also in mainstream languages like Java may reject programs that cannot encounter runtime type errors. Give an example of such a program. To make the exercise nontrivial, every instruction in your program should be reachable by some input.

Exercise 3.2: Even popular programming languages may have static type systems that are unsound. Inform yourself about Java's covariant typing of arrays. Construct an example Java program that passes all of javac's type checks but generates a runtime error due to this covariant typing. (Note that, because you do receive runtime errors, Java's dynamic type system is sound, which is important to avert malicious attacks, e.g. through type confusion or memory corruption.)

The type analysis presented in this chapter is a variant of the Damas-Hindley-Milner technique [Hin69, Mil78, Dam84], which forms the basis of the type systems of many programming languages, including ML, OCaml, and Haskell. Our constraint-based approach is inspired by Wand [Wan87]. To simplify the presentation, we postpone treatment of records until Section 3.4, and we discuss other possible extensions in Section 3.5.
3.1
Typеs
We first define a language of types that will describe possible values:

    τ → int
      | &τ
      | (τ,...,τ)→τ

These type terms describe respectively integers, pointers, and functions. As an example, we can assign the type (int)→int to the iterate function from Section 2.2 and the type &int to the first parameter p of the foo function. Each kind of term is characterized by a term constructor with some arity. For example, & is a term constructor with arity 1 as it has one sub-term, and the arity of a function type constructor (...)→... is the number of function parameters plus one for the return type.

The grammar would normally generate finite types, but for recursive functions and data structures we need regular types. Those are defined as regular trees over the above constructors. (A possibly infinite tree is regular if it contains only finitely many different subtrees.) For example, we need infinite types to describe the type of the foo function from Section 2.2, since the second parameter x may refer to the foo function itself:

    (&int,(&int,(&int,(&int,...)→int)→int)→int)→int
To express such recursive types concisely, we add the µ operator and type variables α to the language of types:

    τ → µα.τ
      | α
    α → α1 | α2 | ...

A type of the form µα.τ is considered identical to the type τ[µα.τ/α].¹ With this extra notation, the type of the foo function can be expressed like this:

    µα1.(&int,α1)→int

Exercise 3.3: Explain how regular types can be represented by finite automata so that two types are equal if their automata accept the same language. Show an automaton that represents the type µα1.(&int,α1)→int.

We allow free type variables (i.e., type variables that are not bound by an enclosing µ). Such type variables are implicitly universally quantified, meaning that they represent any type. Consider for example the following function:

    store(a,b) {
      *b = a;
      return 0;
    }

It has type (α1,&α1)→int where α1 is a free type variable meaning that it can be any type, which corresponds to the polymorphic behavior of the function. Note that such type variables are not necessarily entirely unconstrained: the type of a may be anything, but it must match the type of whatever b points to. The more restricted type (int,&int)→int is also a valid type for the store function, but we are usually interested in the most general solutions.

Exercise 3.4: What are the types of recurse, f, and n in the recursive factorial program from Section 2.2?

Exercise 3.5: Write a TIP program that contains a function with type ((int)→int)→(int,int)→int.

Type variables are not only useful for expressing recursive types; we also use them in the following section to express systems of type constraints.

¹ Think of a term µα.τ as a quantifier that binds the type variable α in the sub-term τ. An occurrence of α in a term τ is free if it is not bound by an enclosing µα. The notation τ1[τ2/α] denotes a copy of τ1 where all free occurrences of α have been substituted by τ2.
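The identity between µα.τ and τ[µα.τ/α] can be made concrete with a small Python sketch that unfolds a µ term once. The tuple encoding of types and the names subst and unfold are assumptions of this illustration, and the substitution is only capture-free because the term being substituted in is closed here.

```python
def subst(t, alpha, repl):
    """Return t[repl/alpha]: replace the free occurrences of alpha in t.

    Hypothetical type encoding (an assumption of this sketch):
      "int" | ("var", name) | ("ptr", t) | ("fun", (params...), ret) | ("mu", name, body)
    """
    if t == "int":
        return t
    kind = t[0]
    if kind == "var":
        return repl if t[1] == alpha else t
    if kind == "ptr":
        return ("ptr", subst(t[1], alpha, repl))
    if kind == "fun":
        params = tuple(subst(p, alpha, repl) for p in t[1])
        return ("fun", params, subst(t[2], alpha, repl))
    if kind == "mu":
        # An inner binder for the same variable shadows alpha.
        if t[1] == alpha:
            return t
        return ("mu", t[1], subst(t[2], alpha, repl))
    raise ValueError(f"unknown type term: {t!r}")

def unfold(t):
    """Unfold mu a.body once into body[mu a.body / a]."""
    if t[0] != "mu":
        raise ValueError("expected a mu term")
    return subst(t[2], t[1], t)

# The type of foo: mu a1.(&int, a1) -> int
foo_type = ("mu", "a1", ("fun", (("ptr", "int"), ("var", "a1")), "int"))
once = unfold(foo_type)
print(once[1][1] == foo_type)   # the second parameter is again the type of foo
```

Unfolding repeatedly yields ever longer prefixes of the infinite type shown earlier in this section.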
Type Constraints

For a given program we generate a constraint system and define the program to be typable when the constraints are solvable. In our case we only need to consider equality constraints over regular type terms with variables. This class of constraints can be efficiently solved using a unification algorithm.

For each local variable, function parameter, and function name X we introduce a type variable ⟦X⟧, and for each occurrence of a non-identifier expression E a type variable ⟦E⟧. Here, E refers to a concrete node in the abstract syntax tree, not to the syntax it corresponds to. This makes our notation slightly ambiguous but simpler than a pedantically correct description. (To avoid ambiguity, one could, for example, use the notation ⟦E⟧v where v is a unique ID of the syntax tree node.) Assuming that all declared identifiers are unique (see Exercise 2.5), there is no need to use different type variables for different occurrences of the same identifier.

The constraints are systematically defined for each construct in our language:

    I:                              ⟦I⟧ = int
    E1 op E2:                       ⟦E1⟧ = ⟦E2⟧ = ⟦E1 op E2⟧ = int
    E1==E2:                         ⟦E1⟧ = ⟦E2⟧ ∧ ⟦E1==E2⟧ = int
    input:                          ⟦input⟧ = int
    X = E:                          ⟦X⟧ = ⟦E⟧
    output E:                       ⟦E⟧ = int
    if (E) S:                       ⟦E⟧ = int
    if (E) S1 else S2:              ⟦E⟧ = int
    while (E) S:                    ⟦E⟧ = int
    X(X1,...,Xn){ ... return E; }:  ⟦X⟧ = (⟦X1⟧,...,⟦Xn⟧)→⟦E⟧
    E(E1,...,En):                   ⟦E⟧ = (⟦E1⟧,...,⟦En⟧)→⟦E(E1,...,En)⟧
    alloc E:                        ⟦alloc E⟧ = &⟦E⟧
    &X:                             ⟦&X⟧ = &⟦X⟧
    null:                           ⟦null⟧ = &α
    *E:                             ⟦E⟧ = &⟦*E⟧
    *X = E:                         ⟦X⟧ = &⟦E⟧

The notation 'op' here represents any of the binary operators, except == which has its own rule. In the rule for null, α denotes a fresh type variable. (The purpose of this analysis is not to detect potential null pointer errors, so this simple model of null suffices.) Note that program variables and var blocks do not yield any constraints and that parenthesized expressions are not present in the abstract syntax.
For the program

    short() {
      var x, y, z;
      x = input;
      y = alloc x;
      *y = x;
      z = *y;
      return z;
    }

we obtain the following constraints:

    ⟦short⟧ = ()→⟦z⟧
    ⟦input⟧ = int
    ⟦x⟧ = ⟦input⟧
    ⟦alloc x⟧ = &⟦x⟧
    ⟦y⟧ = ⟦alloc x⟧
    ⟦y⟧ = &⟦x⟧
    ⟦z⟧ = ⟦*y⟧
    ⟦y⟧ = &⟦*y⟧

Most of the constraint rules are straightforward. For example, for any syntactic occurrence of E1==E2 in the program being analyzed, the two sub-expressions E1 and E2 must have the same type, and the result is always of type integer.

Exercise 3.6: Explain each of the above type constraint rules, most importantly those involving functions and pointers.

For a complete program, we add constraints to ensure that the types of the parameters and the return value of the main function are int:

    main(X1,...,Xn){ ... return E; }:   ⟦X1⟧ = ... = ⟦Xn⟧ = ⟦E⟧ = int
In this way, a given program (or fragment of a program) gives rise to a collection of equality constraints on type terms with variables, and the collection of constraints can be built by a simple traversal of the AST of the program being analyzed. The order by which the AST is traversed is irrelevant.

All term constructors furthermore satisfy the general term equality axiom:

    c(t1,...,tn) = c′(t1′,...,tn′)  ⟹  ti = ti′ for each i

where c and c′ are term constructors and each ti and ti′ is a sub-term. In the previous example two of the constraints are ⟦y⟧ = &⟦x⟧ and ⟦y⟧ = &⟦*y⟧, so by the term equality axiom we also have ⟦x⟧ = ⟦*y⟧.

Furthermore, as one would expect for an equality relation, we have reflexivity, symmetry, and transitivity:

    t1 = t1
    t1 = t2  ⟹  t2 = t1
    t1 = t2 ∧ t2 = t3  ⟹  t1 = t3

for all terms t1, t2, and t3.
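The AST traversal that collects the constraints can be sketched in Python for the small subset of TIP used by the short program. Everything about the encoding is an assumption of this illustration: ASTs and type terms are tuples, the names tv, gen_expr, gen_stmt, gen_function, and show are hypothetical, and type variables are rendered in ASCII as [[x]]. Because type variables are keyed on the syntax itself, two syntactically identical expressions share a type variable, which is the notational ambiguity mentioned in the text.

```python
def tv(node):
    """Type variable for an identifier or expression node."""
    return ("tv", node)

def gen_expr(e, out):
    # Expressions: ("id", x) | ("num", n) | ("input",) | ("alloc", e) | ("deref", e)
    kind = e[0]
    if kind in ("num", "input"):
        out.append((tv(e), "int"))
    elif kind == "alloc":                 # alloc E: [[alloc E]] = &[[E]]
        gen_expr(e[1], out)
        out.append((tv(e), ("ptr", tv(e[1]))))
    elif kind == "deref":                 # *E: [[E]] = &[[*E]]
        gen_expr(e[1], out)
        out.append((tv(e[1]), ("ptr", tv(e))))
    elif kind != "id":                    # identifiers yield no constraints
        raise ValueError(f"unsupported expression: {e!r}")

def gen_stmt(s, out):
    # Statements: ("assign", x, e) for x = e, ("store", x, e) for *x = e
    kind, x, e = s
    gen_expr(e, out)
    if kind == "assign":
        out.append((tv(("id", x)), tv(e)))
    elif kind == "store":
        out.append((tv(("id", x)), ("ptr", tv(e))))
    else:
        raise ValueError(f"unsupported statement: {s!r}")

def gen_function(name, params, body, ret):
    out = []
    for s in body:
        gen_stmt(s, out)
    gen_expr(ret, out)
    param_types = tuple(tv(("id", p)) for p in params)
    out.append((tv(("id", name)), ("fun", param_types, tv(ret))))
    return out

def src(node):
    kind = node[0]
    if kind == "id":
        return node[1]
    if kind == "num":
        return str(node[1])
    if kind == "input":
        return "input"
    prefix = "alloc " if kind == "alloc" else "*"
    return prefix + src(node[1])

def show(t):
    if t == "int":
        return "int"
    if t[0] == "tv":
        return "[[" + src(t[1]) + "]]"
    if t[0] == "ptr":
        return "&" + show(t[1])
    return "(" + ",".join(show(p) for p in t[1]) + ")->" + show(t[2])

X, Y, Z = ("id", "x"), ("id", "y"), ("id", "z")
short_body = [
    ("assign", "x", ("input",)),
    ("assign", "y", ("alloc", X)),
    ("store", "y", X),
    ("assign", "z", ("deref", Y)),
]
constraints = gen_function("short", [], short_body, Z)
for lhs, rhs in constraints:
    print(show(lhs), "=", show(rhs))
```

The constraints come out in a different order than in the listing for short, which is harmless since the traversal order is irrelevant.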
A solution assigns a type to each type variable, such that all equality constraints are satisfied.² The correctness claim for the type analysis is that the existence of a solution implies that the specified runtime errors cannot occur during execution. A solution for the identifiers in the short program is the following:

    ⟦short⟧ = ()→int
    ⟦x⟧ = int
    ⟦y⟧ = &int
    ⟦z⟧ = int

Exercise 3.7: Assume y = alloc x in the short function is changed to y = 42. Show that the resulting constraints are unsolvable.

Exercise 3.8: Give a reasonable definition of what it means for one solution to be "more general" than another. (See the store function in Section 3.1 for an example of two types where one is more general than the other.)

Exercise 3.9: This exercise demonstrates the importance of the term equality axiom. First explain what the following TIP code does when it is executed:

    var x,y;
    x = alloc 1;
    y = alloc (alloc 2);
    x = y;

Then generate the type constraints for the code, and apply the unification algorithm (by hand).

Exercise 3.10: Extend TIP with procedures, which, unlike functions, do not return anything. Show how to extend the language of types and the type constraint rules accordingly.
Solving Constraints with Unification

If solutions exist, then they can be computed in almost linear time using a unification algorithm for regular terms as explained below. Since the constraints may also be extracted in linear time, the whole type analysis is quite efficient.

² We can define precisely what we mean by "solution" as follows. A substitution is a mapping σ from type variables to types. Applying a substitution σ to a type τ, denoted τσ, means replacing each free type variable α in τ by σ(α). A solution to a set of type constraints is a substitution σ where τ1σ is identical to τ2σ for each of the type constraints τ1 = τ2.
The unification algorithm we use is based on the familiar union-find data structure (also called a disjoint-set data structure) from 1964 for representing and manipulating equivalence relations [GF64]. This data structure consists of a directed graph of nodes that each have exactly one edge to its parent node (which may be the node itself, in which case it is called a root). Two nodes are equivalent if they have a common ancestor, and each root is the canonical representative of its equivalence class. Three operations are provided:³

• MAKESET(x): adds a new node x that initially is its own parent.
• FIND(x): finds the canonical representative of x by traversing the path to the root, performing path compression on the way (meaning that the parent of each node on the traversed path is set to the canonical representative).
• UNION(x,y): finds the canonical representatives of x and y, and makes one parent of the other unless they are already equivalent.

In pseudo-code:

    procedure MAKESET(x)
      x.parent := x
    end procedure

    procedure FIND(x)
      if x.parent ≠ x then
        x.parent := FIND(x.parent)
      end if
      return x.parent
    end procedure

    procedure UNION(x, y)
      xr := FIND(x)
      yr := FIND(y)
      if xr ≠ yr then
        xr.parent := yr
      end if
    end procedure

³ We here consider a simple version of union-find without union-by-rank; for a description of the full version with almost-linear worst case time complexity see a textbook on data structures.

The unification algorithm uses union-find by associating a node with each term (including sub-terms) in the constraint system. For each term τ we initially invoke MAKESET(τ). Note that each term at this point is either a type variable or a proper type (i.e. integer, pointer, or function); µ terms are only produced for presenting solutions to constraints, as explained below. For each constraint τ1 = τ2 we invoke UNIFY(τ1, τ2), which unifies the two terms if possible and enforces the general term equality axiom by unifying sub-terms recursively:

    procedure UNIFY(τ1, τ2)
      τ1r := FIND(τ1)
      τ2r := FIND(τ2)
      if τ1r ≠ τ2r then
        if τ1r and τ2r are both type variables then
          UNION(τ1r, τ2r)
        else if τ1r is a type variable and τ2r is a proper type then
          UNION(τ1r, τ2r)
        else if τ1r is a proper type and τ2r is a type variable then
          UNION(τ2r, τ1r)
        else if τ1r and τ2r are proper types with same type constructor then
          UNION(τ1r, τ2r)
          for each pair of sub-terms τ1′ and τ2′ of τ1r and τ2r, respectively do
            UNIFY(τ1′, τ2′)
          end for
        else
          unification failure
        end if
      end if
    end procedure
Unificаtiоn fаils if аttеmpting tо unify twо tеrms with diffеrеnt cоnstructоrs (whеrе functiоn typе cоnstructоrs аrе cоnsidеrеd diffеrеnt if thеy hаvе diffеrеnt аrity). Nоtе thаt thе UNIОN(x, y) оpеrаtiоn is аsymmеtric: it аlwаys picks thе cаnоnicаl rеprеsеntаtivе оf thе rеsulting еquivаlеncе clаss аs thе оnе frоm thе sеcоnd аrgumеnt y. Аlsо, UNIFY is cаrеfully cоnstructеd such thаt thе sеcоnd аrgumеnt tо UNIОN cаn оnly bе а typе vаriаblе if thе first аrgumеnt is аlsо а typе vаriаblе. This mеаns thаt prоpеr typеs (i.е., tеrms thаt аrе nоt typе vаriаblеs) tаkе prеcеdеncе оvеr typе vаriаblеs fоr bеcоming cаnоnicаl rеprеsеntаtivеs, sо thаt it аlwаys sufficеs tо cоnsidеr оnly thе cаnоnicаl rеprеsеntаtivе instеаd оf аll tеrms in thе еquivаlеncе clаss. Rеаding thе sоlutiоn аftеr аll cоnstrаints hаvе bееn prоcеssеd is thеn еаsy. Fоr еаch prоgrаm vаriаblе оr еxprеssiоn thаt hаs аn аssоciаtеd typе vаriаblе, simply invоkе FIND tо find thе cаnоnicаl rеprеsеntаtivе оf its еquivаlеncе clаss. If thе cаnоnicаl rеprеsеntаtivе hаs sub-tеrms (fоr еxаmplе, in thе tеrm &τ wе sаy thаt τ is а sub-tеrm), find thе sоlutiоn rеcursivеly fоr еаch sub-tеrm. Thе оnly cоmplicаtiоn аrisеs if this rеcursiоn thrоugh thе sub-tеrms lеаds tо аn infinitе typе, in which cаsе wе intrоducе а µ tеrm аccоrdingly.
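The union-find operations and UNIFY can be sketched compactly in Python. This is an illustrative sketch under stated assumptions, not the code of the Scala implementation: terms are encoded as hashable tuples, structurally equal terms share one node, the class and helper names are hypothetical, and a recursive solution is detected but not rendered as a µ term.

```python
class UnificationFailure(Exception):
    pass

def is_var(t):
    return isinstance(t, tuple) and t[0] == "tv"

def subterms(t):
    # Terms: "int" | ("tv", name) | ("ptr", t) | ("fun", (params...), ret)
    if t == "int" or is_var(t):
        return ()
    if t[0] == "ptr":
        return (t[1],)
    return t[1] + (t[2],)

def constructor(t):
    # Function constructors with different arity count as different.
    return ("int", 0) if t == "int" else (t[0], len(subterms(t)))

class Unifier:
    def __init__(self):
        self.parent = {}

    def make_set(self, t):
        if t not in self.parent:
            self.parent[t] = t
            for s in subterms(t):
                self.make_set(s)

    def find(self, t):
        self.make_set(t)
        if self.parent[t] != t:
            self.parent[t] = self.find(self.parent[t])   # path compression
        return self.parent[t]

    def union(self, x, y):
        xr, yr = self.find(x), self.find(y)
        if xr != yr:
            self.parent[xr] = yr      # the representative comes from y

    def unify(self, t1, t2):
        r1, r2 = self.find(t1), self.find(t2)
        if r1 == r2:
            return
        if is_var(r1):
            self.union(r1, r2)        # variable/variable and variable/proper
        elif is_var(r2):
            self.union(r2, r1)        # proper types stay representatives
        elif constructor(r1) == constructor(r2):
            self.union(r1, r2)
            for a, b in zip(subterms(r1), subterms(r2)):
                self.unify(a, b)
        else:
            raise UnificationFailure(f"cannot unify {r1!r} and {r2!r}")

    def resolve(self, t, seen=()):
        r = self.find(t)
        if r in seen:
            raise NotImplementedError("recursive type: a mu term is needed")
        if r == "int" or is_var(r):
            return r
        subs = tuple(self.resolve(s, seen + (r,)) for s in subterms(r))
        return ("ptr", subs[0]) if r[0] == "ptr" else ("fun", subs[:-1], subs[-1])

def V(name):
    return ("tv", name)

# The constraints of the short program.
short_constraints = [
    (V("short"), ("fun", (), V("z"))),
    (V("input"), "int"),
    (V("x"), V("input")),
    (V("alloc x"), ("ptr", V("x"))),
    (V("y"), V("alloc x")),
    (V("y"), ("ptr", V("x"))),
    (V("z"), V("*y")),
    (V("y"), ("ptr", V("*y"))),
]
u = Unifier()
for lhs, rhs in short_constraints:
    u.unify(lhs, rhs)
solution = {name: u.resolve(V(name)) for name in ("short", "x", "y", "z")}
print(solution)
```

The two middle cases of UNIFY collapse into the single is_var(r1) test because UNION(r1, r2) is the right call whenever r1 is a type variable, whatever r2 is.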
Exercise 3.11: Argue that the unification algorithm works correctly, in the sense that it finds a solution to the given constraints if one exists. Additionally, argue that if multiple solutions exist, the algorithm finds the uniquely most general one (cf. Exercise 3.8). (The most general solution, when one exists, for a program expression is also called the principal type of the expression.)

The unification solver only needs to process each constraint once. This means that although we conceptually first generate the constraints and then solve them, in an implementation we might as well interleave the two phases and solve the constraints on-the-fly, as they are being generated.

The complicated factorial program from Section 2.2 generates the following constraints (duplicates omitted):

    ⟦foo⟧ = (⟦p⟧,⟦x⟧)→⟦f⟧
    ⟦*p==0⟧ = int
    ⟦*p⟧ = int
    ⟦f⟧ = ⟦1⟧
    ⟦1⟧ = int
    ⟦0⟧ = int
    ⟦p⟧ = &⟦*p⟧
    ⟦q⟧ = ⟦alloc 0⟧
    ⟦alloc 0⟧ = &⟦0⟧
    ⟦q⟧ = &⟦(*p)-1⟧
    ⟦q⟧ = &⟦*q⟧
    ⟦f⟧ = ⟦(*p)*(x(q,x))⟧
    ⟦(*p)*(x(q,x))⟧ = int
    ⟦x(q,x)⟧ = int
    ⟦x⟧ = (⟦q⟧,⟦x⟧)→⟦x(q,x)⟧
    ⟦input⟧ = int
    ⟦main⟧ = ()→⟦foo(&n,foo)⟧
    ⟦n⟧ = ⟦input⟧
    ⟦&n⟧ = &⟦n⟧
    ⟦foo⟧ = (⟦&n⟧,⟦foo⟧)→⟦foo(&n,foo)⟧
    ⟦*p⟧ = ⟦0⟧
    ⟦(*p)-1⟧ = int
    ⟦foo(&n,foo)⟧ = int

These constraints have a solution, where most variables are assigned int, except these:

    ⟦p⟧ = &int
    ⟦q⟧ = &int
    ⟦alloc 0⟧ = &int
    ⟦x⟧ = µα1.(&int,α1)→int
    ⟦foo⟧ = µα1.(&int,α1)→int
    ⟦&n⟧ = &int
    ⟦main⟧ = ()→int

As mentioned in Section 3.1, recursive types are needed for the foo function and the x parameter. Since a solution exists, we conclude that our program is type correct.

Exercise 3.12: Check (by hand or using the Scala implementation) that the constraints and the solution shown above are correct for the complicated factorial program.
Exercise 3.13: Consider this fragment of the example program shown earlier:

    x = input;
    *y = x;
    z = *y;

Explain step-by-step how the unification algorithm finds the solution, including how the union-find data structure looks in each step.

Recursive types are also required when analyzing TIP programs that manipulate data structures. The example program

    var p;
    p = alloc null;
    *p = p;

creates these constraints:

    ⟦null⟧ = &α1
    ⟦alloc null⟧ = &⟦null⟧
    ⟦p⟧ = ⟦alloc null⟧
    ⟦p⟧ = &⟦p⟧

which for ⟦p⟧ has the solution ⟦p⟧ = µα1.&α1 that can be unfolded to ⟦p⟧ = &&&... .

Exercise 3.14: Show what the union-find data structure looks like for the above example program.

Exercise 3.15: Generate and solve the constraints for the iterate example program from Section 2.2.
Exercise 3.16: Generate and solve the type constraints for this program:

    map(l,f,z) {
      var r;
      if (l==null) r=z;
      else r=f(map(*l,f,z));
      return r;
    }

    foo(i) {
      return i+1;
    }

    main() {
      var h,t,n;
      t = null;
      n = 42;
      while (n>0) {
        n = n-1;
        h = alloc null;
        *h = t;
        t = h;
      }
      return map(h,foo,0);
    }

What is the output from running the program? (Try to find the solutions manually; you can then use the Scala implementation to check that they are correct.)
Rеcоrd Typеs Tо еxtеnd thе typе аnаlysis tо аlsо wоrk fоr prоgrаms thаt usе rеcоrds, wе first еxtеnd thе lаnguаgе оf typеs with rеcоrd typеs: τ → {X:τ , . . . ,X:τ } Fоr еxаmplе, thе rеcоrd typе {а:int,b:int}dеscribеs rеcоrds thаt hаvе twо fiеlds, а аnd b, bоth with intеgеr vаluеs. Rеcоrd typеs with diffеrеnt sеts оf fiеld nаmеs аrе cоnsidеrеd аs diffеrеnt tеrm cоnstructоrs. Оur gоаl is tо bе аblе tо chеck cоnsеrvаtivеly whеthеr thе diffеrеnt kinds оf еrrоrs listеd in thе bеginning оf thе chаptеr mаy аppеаr in thе prоgrаm bеing аnаlyzеd. This mеаns thаt wе nееd tо distinguish rеcоrds frоm оthеr kinds оf vаluеs. Spеcificаlly, wе wаnt thе аnаlysis tо chеck thаt fiеld lооkups аrе оnly pеrfоrmеd оn rеcоrds, nоt оn оthеr typеs оf vаluеs. (Just likе this аnаlysis is
not designed to detect null pointer errors, it is also not a goal to check at every field lookup that the field necessarily exists in the record; we leave that to more advanced analyses.) As a first attempt, the type constraints for record construction and field lookup can be expressed as follows, inspired by our treatment of pointers and dereferences.

    { X1:E1,...,Xn:En }:   ⟦{ X1:E1,...,Xn:En }⟧ = {X1:⟦E1⟧,...,Xn:⟦En⟧}
    E.X:                   ⟦E⟧ = { ...,X:⟦E.X⟧,... }
Intuitively, the constraint rule for field lookup says that the type of E must be a record type that contains a field named X with the same type as E.X. The right-hand side of this constraint rule is, however, not directly expressible in our language of types. One way to remedy this, without requiring any modifications of our unification algorithm, is to require that every record type contains all record fields that exist in the program. Let F = {f1, f2, ..., fm} be the set of all field names. We then use the following two constraint rules instead of the ones above.

    { X1:E1,...,Xn:En }:   ⟦{ X1:E1,...,Xn:En }⟧ = {f1:γ1,...,fm:γm}
        where γi = ⟦Ej⟧ if fi = Xj for some j ∈ {1,...,n}, and γi = αi otherwise,
        for each i = 1, 2, ..., m

    E.X:                   ⟦E⟧ = {f1:γ1,...,fm:γm}
        where γi = ⟦E.X⟧ if fi = X, and γi = αi otherwise,
        for each i = 1, 2, ..., m

Here each αi denotes a fresh type variable, as in the rule for null.
As an example, the two statements

    x = {b: 42, c: 87};
    y = z.a;

generate the following constraints, assuming that a, b, and c are the only fields in the program.

    ⟦x⟧ = ⟦{b: 42, c: 87}⟧
    ⟦{b: 42, c: 87}⟧ = {a:α1, b:⟦42⟧, c:⟦87⟧}
    ⟦42⟧ = int
    ⟦87⟧ = int
    ⟦y⟧ = ⟦z.a⟧
    ⟦z⟧ = {a:⟦z.a⟧, b:α2, c:α3}

The presence of the type variables α2 and α3 in the solution for ⟦z⟧ reflects the fact that the fields b and c are irrelevant for the field lookup z.a. Similarly, α1 appears in the solution for ⟦{b: 42, c: 87}⟧, meaning that we have no constraints on the value of the a field of that record, because the analysis by design does not keep track of whether fields are present or absent.
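The padding of record types with fresh type variables can be sketched in Python. The tuple encoding, the function names, and the ASCII variable names a1, a2, ... (standing in for α1, α2, ...) are assumptions of this illustration.

```python
import itertools

def fresh_vars():
    """Yield the fresh type variables a1, a2, ... in order."""
    for i in itertools.count(1):
        yield ("tv", f"a{i}")

def record_type(all_fields, known, fresh):
    """Build the record type {f1:g1,...,fm:gm} over all field names.

    `known` maps the fields mentioned by the construct to their type terms;
    every other field receives a fresh type variable.
    """
    fields = tuple(
        (f, known[f] if f in known else next(fresh))
        for f in sorted(all_fields)
    )
    return ("rec", fields)

F = {"a", "b", "c"}          # all field names in the program
fresh = fresh_vars()

# x = {b: 42, c: 87};   right-hand side of the constraint for the record literal
literal = record_type(F, {"b": ("tv", "42"), "c": ("tv", "87")}, fresh)

# y = z.a;              right-hand side of the constraint for [[z]]
lookup = record_type(F, {"a": ("tv", "z.a")}, fresh)

print(literal)
print(lookup)
```

Sorting the field names gives every record type the same field order, so two record types can be unified position by position like any other constructor of arity m.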
Exercise 3.17: Assume we extend the TIP language with array operations. Array values are constructed using a new form of expressions (not to be confused with the syntax for records):

    E → { E,...,E }

and individual elements are read and written as follows:

    E → E[E]
    S → E[E] = E

For example, the following statements construct an array containing two integers, then overwrite the first one, and finally read both entries:

    a = { 17, 87 };
    a[0] = 42;
    x = a[0] + a[1]; // x is now 129

Arrays are constructed in the heap and passed by reference, so in the first line, the contents of the array are not copied, and a is like a pointer to the array containing the two integers. The type system is extended accordingly with an array type constructor:

    τ → τ[]

As an example, the type int[][] denotes arrays of arrays of integers. Give appropriate type constraints for array operations. Then use the type analysis to check that the following program is typable and infer the type of each program variable:

    var x,y,z,t;
    x = {2,4,8,16,32,64};
    y = x[x[3]];
    z = {{},x};
    t = z[1];
    t[2] = y;
Exercise 3.18: As mentioned in Chapter 2, TIP does not have booleans as a separate type of values at runtime, but simply represents false as 0 and true as any other integer. Nevertheless, it can be useful to have a static type analysis that rejects expressions like (x > y) * 17 and branches like if (x * 42 - y), since programmers rarely intend to use results of comparisons in integer computations, or use results of integer computations as branch conditions. As a first step, let us extend our language of types with a new type, bool:

    τ → bool

How can we now change the type rules from Section 3.2 such that the resulting type analysis rejects meaningless expressions like those above, but still accepts programs that programmers likely want to write in practice?

Exercise 3.19: Discuss how TIP could be extended with strings and operations on strings, and how the type analysis could be extended accordingly to check, for example, that the string operations are only applied to strings and not to other types of values.

Exercise 3.20: Generate and solve the type constraints for the following program:

    var a,b,c,d;
    a = {f:3, g:17};
    b = a.f;
    c = {f:alloc 5, h:15};
    d = c.f;

What happens if you change c.f to c.g in the last line?
Limitations of the Type Analysis

The type analysis is of course only approximate, which means that certain programs will be unfairly rejected. A simple example is this:

    f() {
      var x;
      x = alloc 17;
      x = 42;
      return x + 87;
    }

This program has no type errors at runtime, but it is rejected by our type checker because the analysis is flow-insensitive: the order of execution of the program
instructions is abstracted away by the analysis, so intuitively it does not know that x must be an integer at the return expression. In the following chapters we shall see how to perform flow-sensitive analysis that does distinguish between the different program points.

Another limitation, which is even more significant from a practical point of view, is the current treatment of polymorphic types. In fact, polymorphic types are not very useful in their current form. Consider this example program:

    f(x) {
      return *x;
    }

    main() {
      return f(alloc 1) + *(f(alloc(alloc 2)));
    }

It never causes an error at runtime but is not typable since it, among others, generates constraints equivalent to

    &int = ⟦x⟧ = &&int

which are clearly unsolvable. For this program, we could analyze the f function first, resulting in this polymorphic type:

    ⟦f⟧ = (&α1)→α1

When analyzing the main function, at each call to f we could then instantiate the polymorphic type according to the type of the argument: At the first call, the argument has type &int so in this case we treat f as having the type (&int)→int, and at the second call, the argument has type &&int so here we treat f as having the type (&&int)→&int. The key property of the program that enables this technique is the observation that the polymorphic function is not recursive. This idea is called let-polymorphism (and this is essentially how Damas-Hindley-Milner-style type analysis actually works in ML and related languages). In Section 8.2 we shall see a closely related program analysis mechanism called context sensitivity. The price of the increased precision of let-polymorphism in type analysis is that the worst-case complexity increases from almost-linear to exponential [KTU90, Mai90].

Even with let-polymorphism, infinitely many other examples will inevitably remain rejected. An example:

    polyrec(g,x) {
      var r;
      if (x==0) { r=g; } else { r=polyrec(2,0); }
      return r+1;
    }

    main() {
      return polyrec(null,1);
    }
With functions that are both polymorphic and recursive, type analysis becomes undecidable in the general case [Hen93, KTU93].

Exercise 3.21: Explain the runtime behavior of the polyrec program, and why it is unfairly rejected by our type analysis, and why let-polymorphism does not help.

Yet another limitation of the type system presented in this chapter is that it ignores many other kinds of runtime errors, such as dereference of null pointers, reading of uninitialized variables, division by zero, and the more subtle escaping stack cell demonstrated by this program:

    baz() {
      var x;
      return &x;
    }

    main() {
      var p;
      p = baz();
      *p = 1;
      return *p;
    }

The problem in this program is that *p denotes a stack cell that has "escaped" from the baz function. As we shall see in the following chapters, such problems can instead be handled by other kinds of static analysis.
Chаptеr 4
Lаtticе Thеоry Thе tеchniquе fоr stаtic аnаlysis thаt wе will study nеxt is bаsеd оn thе mаthеmаticаl thеоry оf lаtticеs, which wе briеfly rеviеw in this chаptеr. Thе cоnnеctiоn bеtwееn lаtticеs аnd prоgrаm аnаlysis wаs еstаblishеd in thе sеminаl wоrk by Kildаll, Kаm аnd Ullmаn [Kil73, KU77].
Motivating Example: Sign Analysis

As a motivating example, assume that we wish to design an analysis that can find out the possible signs of the integer values of variables and expressions in a given program. In concrete executions, values can be arbitrary integers. In contrast, our analysis considers an abstraction of the integer values by grouping them into the three categories, or abstract values: positive (+), negative (-), and zero (0). Similar to the analysis we considered in Chapter 3, we circumvent undecidability by introducing approximation. That is, the analysis must be prepared to handle uncertain information, in this case situations where it does not know the sign of some expression, so we add a special abstract value (⊤) representing "don't know". We must also decide what information we are interested in for the cases where the sign of some expression is, for example, positive in some executions but not in others. For this example, let us assume we are interested in definite information, that is, the analysis should only report + for a given expression if it is certain that this expression will evaluate to a positive number in every execution of that expression and ⊤ otherwise. In addition, it turns out to be beneficial to also introduce an abstract value ⊥ for expressions whose values are not numbers (but instead, say, pointers) or have no value in any execution because they are unreachable from the program entry.

Consider this program:

    var a,b,c;
    a = 42;
    b = 87;
    if (input) {
      c = a + b;
    } else {
      c = a - b;
    }

Here, the analysis could conclude that a and b are positive numbers in all possible executions at the end of the program. The sign of c is either positive or negative depending on the concrete execution, so the analysis must report ⊤ for that variable.

For this analysis we have an abstract domain consisting of the five abstract values {+, -, 0, ⊤, ⊥}, which we can organize as follows with the least precise information at the top and the most precise information at the bottom:
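[Figure: Hasse diagram of the Sign lattice, with ⊤ at the top, the three pairwise incomparable elements +, 0, and - in the middle, and ⊥ at the bottom.]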
The ordering reflects the fact that ⊥ represents the empty set of integer values and ⊤ represents the set of all integer values. Note that ⊤ may arise for different reasons: (1) In the example above, there exist executions where c is positive and executions where c is negative, so, for this choice of abstract domain, ⊤ is the only sound option. (2) Due to undecidability, imperfect precision is inevitable, so no matter how we design the analysis there will be programs where, for example, some variable can only have a positive value in any execution but the analysis is not able to show that it could not also have a negative value (recall the TM(j) example from Chapter 1).

The five-element abstract domain shown above is an example of a so-called lattice. We continue the development of the sign analysis in Section 5.1, but we first need the mathematical foundation in place.
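To make the five-element domain concrete, here is a minimal Python sketch of its ordering and of how two branch results are combined. The string encoding of the elements and the helper names `leq`, `lub`, and `alpha` are assumptions made for this illustration, not notation from the text.

```python
# Illustrative encoding of the five abstract values (names are assumptions).
BOT, ZERO, NEG, POS, TOP = "bot", "0", "-", "+", "top"

def leq(x, y):
    """x is at least as precise as y: bot is below everything, top is above everything."""
    return x == y or x == BOT or y == TOP

def lub(x, y):
    """Combine two abstract values into the most precise value covering both."""
    if leq(x, y):
        return y
    if leq(y, x):
        return x
    return TOP  # two different signs are only covered by top

def alpha(n):
    """Abstract a concrete integer to its sign."""
    return POS if n > 0 else NEG if n < 0 else ZERO

# In the example program, c is 42 + 87 in one branch and 42 - 87 in the other.
print(lub(alpha(42 + 87), alpha(42 - 87)))  # top
```

Combining the two branch values yields top, which matches the observation that the analysis must report ⊤ for c.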
Lattices

A partial order is a set S equipped with a binary relation ⊑ where the following conditions are satisfied:

• reflexivity: ∀x ∈ S : x ⊑ x
• transitivity: ∀x, y, z ∈ S : x ⊑ y ∧ y ⊑ z ⇒ x ⊑ z
• anti-symmetry: ∀x, y ∈ S : x ⊑ y ∧ y ⊑ x ⇒ x = y
When x ⊑ y we say that y is a safe approximation of x, or that x is at least as precise as y. Formally, a lattice is a pair (S, ⊑), but we sometimes use the same name for the lattice and its underlying set.

Let X ⊆ S. We say that y ∈ S is an upper bound for X, written X ⊑ y, if we have ∀x ∈ X : x ⊑ y. Similarly, y ∈ S is a lower bound for X, written y ⊑ X, if ∀x ∈ X : y ⊑ x. A least upper bound, written ⨆X, is defined by:

    X ⊑ ⨆X ∧ ∀y ∈ S : X ⊑ y ⇒ ⨆X ⊑ y

Dually, a greatest lower bound, written ⨅X, is defined by:

    ⨅X ⊑ X ∧ ∀y ∈ S : y ⊑ X ⇒ y ⊑ ⨅X

For pairs of elements, we sometimes use the infix notation x ⊔ y instead of ⨆{x, y} and x ⊓ y instead of ⨅{x, y}. We also sometimes use the subscript notation, for example writing ⨆_{a∈A} f(a) instead of ⨆{f(a) | a ∈ A}.

The least upper bound operation plays an important role in program analysis. As we shall see in Chapter 5, we use least upper bound when combining abstract information from multiple sources, for example when control flow merges after the branches of if statements.

Exercise 4.1: Let X ⊆ S. Prove that if ⨆X exists, then it must be unique.

Exercise 4.2: Prove that if x ⊔ y exists then x ⊑ y ⇔ x ⊔ y = y, and conversely, if x ⊓ y exists then x ⊑ y ⇔ x ⊓ y = x.

A lattice is a partial order in which ⨆X and ⨅X exist for all X ⊆ S.[1]
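For a finite partial order, the lattice condition can be checked by brute force over all subsets. The following Python sketch does this; the function names and the two sample orders are assumptions chosen for illustration, and the code assumes that the supplied relation really is a partial order and that no element is `None`.

```python
from itertools import combinations

def least_upper_bound(elems, leq, xs):
    """Return the least upper bound of xs in (elems, leq), or None if none exists."""
    uppers = [u for u in elems if all(leq(x, u) for x in xs)]
    least = [u for u in uppers if all(leq(u, v) for v in uppers)]
    return least[0] if least else None

def is_lattice(elems, leq):
    """Check that every subset has a least upper bound and a greatest lower bound."""
    def geq(x, y):
        return leq(y, x)  # in the reversed order, least upper bounds are greatest lower bounds
    for size in range(len(elems) + 1):
        for xs in combinations(elems, size):
            if least_upper_bound(elems, leq, xs) is None:
                return False
            if least_upper_bound(elems, geq, xs) is None:
                return False
    return True

SIGN = ["bot", "0", "-", "+", "top"]

def sign_leq(x, y):
    return x == y or x == "bot" or y == "top"

print(is_lattice(SIGN, sign_leq))                   # True
print(is_lattice(["p", "q"], lambda x, y: x == y))  # False
```

The second order consists of two incomparable elements only, so even the empty subset has no least upper bound. The check enumerates all 2^|S| subsets, so it is only practical for very small orders.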
Exercise 4.3: Argue that the abstract domain presented in Section 4.1 is indeed a lattice.

Any finite partial order may be illustrated by a Hasse diagram in which the elements are nodes and the order relation is the transitive closure of edges leading from lower to higher nodes. With this notation, all of the following partial orders are also lattices:

whereas these partial orders are not lattices:

[1] The definition of a lattice we use here is typically called a complete lattice in the literature, but we choose to use the shorter name. Many lattices are finite. For those the lattice requirements reduce to observing that ⊥ and ⊤ exist and that every pair of elements x and y has a least upper bound x ⊔ y and a greatest lower bound x ⊓ y.
Exercise 4.4: Why do these two diagrams not define lattices?
Exercise 4.5: Prove that if L is a partially ordered set, then every subset of L has a least upper bound if and only if every subset of L has a greatest lower bound. (Hint: ⨅Y = ⨆{x ∈ L | ∀y ∈ Y : x ⊑ y}.)

Every lattice has a unique largest element denoted ⊤ (pronounced top) and a unique smallest element denoted ⊥ (pronounced bottom).

Exercise 4.6: Prove that ⨆S and ⨅S are the unique largest element and the unique smallest element, respectively, in S. In other words, we have ⊤ = ⨆S and ⊥ = ⨅S.

Exercise 4.7: Prove that ⨆S = ⨅∅ and that ⨅S = ⨆∅. (Together with Exercise 4.6 we then have ⊤ = ⨅∅ and ⊥ = ⨆∅.)
The height of a lattice is defined to be the length of the longest path from ⊥ to ⊤. As an example, the height of the sign analysis lattice from Section 4.1 is 2. For some lattices the height is infinite (see Section 6.1).
Constructing Lattices

Every finite set A = {a1, a2, ..., an} defines a lattice (2^A, ⊆), where ⊥ = ∅, ⊤ = A, x ⊔ y = x ∪ y, and x ⊓ y = x ∩ y. We call this the powerset lattice for A. For a set with four elements, {0, 1, 2, 3}, the powerset lattice looks like this:
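[Figure: Hasse diagram of the powerset lattice for {0,1,2,3}. Its levels, from top to bottom, are:

    {0,1,2,3}
    {0,1,2}  {0,1,3}  {0,2,3}  {1,2,3}
    {0,1}  {0,2}  {0,3}  {1,2}  {1,3}  {2,3}
    {0}  {1}  {2}  {3}
    ∅

with an edge between two sets on adjacent levels whenever the lower one is a subset of the upper one.]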
The above powerset lattice has height 4. In general, the lattice (2^A, ⊆) has height |A|. We use powerset lattices in Chapter 5 to represent sets of variables or expressions.

The reverse powerset lattice for a finite set A is the lattice (2^A, ⊇).

Exercise 4.8: Draw the Hasse diagram of the reverse powerset lattice for the set {foo, bar, baz}.

If A is a set, then flat(A) illustrated by

[Figure: Hasse diagram of flat(A), with ⊤ at the top, the pairwise incomparable elements a1, a2, ..., an in the middle, and ⊥ at the bottom.]

is a lattice with height 2. As an example, the set Sign = {+, -, 0, ⊤, ⊥} with the ordering described in Section 4.1 forms a lattice that can also be expressed as flat({+, 0, -}).

If L1, L2, ..., Ln are lattices, then so is the product:

    L1 × L2 × ... × Ln = {(x1, x2, ..., xn) | xi ∈ Li}

where the lattice order ⊑ is defined pointwise:[2]

    (x1, x2, ..., xn) ⊑ (x'1, x'2, ..., x'n) ⇔ ∀i = 1, 2, ..., n : xi ⊑ x'i

[2] We often abuse notation by using the same symbol ⊑ for many different order relations, in this case from the n + 1 different lattices, but it should always be clear from the context which lattice it belongs to. The same applies to the other operators ⊒, ⊔, ⊓ and the top/bottom symbols ⊤, ⊥.
Products of n identical lattices may be written concisely as L^n = L × L × ... × L (with n factors).
Exercise 4.9: Show that the ⊔ and ⊓ operators for a product lattice L1 × L2 × ... × Ln can be computed pointwise (i.e. in terms of the ⊔ and ⊓ operators from L1, L2, ..., Ln).

Exercise 4.10: Show that height(L1 × ... × Ln) = height(L1) + ... + height(Ln).

If A is a set and L is a lattice, then we obtain a map lattice consisting of the set of functions from A to L, ordered pointwise:[3]

    A → L = { [a1 ↦ x1, a2 ↦ x2, ...] | A = {a1, a2, ...} ∧ x1, x2, ... ∈ L }

    f ⊑ g ⇔ ∀ai ∈ A : f(ai) ⊑ g(ai)   where f, g ∈ A → L

We have already seen that the set Sign = {+, -, 0, ⊤, ⊥} with the ordering described in Section 4.1 forms a lattice that we use for describing abstract values in the sign analysis. An example of a map lattice is StateSigns = Vars → Sign where Vars is the set of variable names occurring in the program that we wish to analyze. Elements of this lattice describe abstract states that provide abstract values for all variables. An example of a product lattice is ProgramSigns = StateSigns^n where n is the number of nodes in the CFG of the program. We shall use this lattice, which can describe abstract states for all nodes of the program CFG, in Section 5.1 for building a flow-sensitive sign analysis. This example also illustrates that the lattices we use may depend on the program being analyzed: the sign analysis depends on the set of variables that occur in the program and also on its CFG nodes.

Exercise 4.11: Show that the ⊔ and ⊓ operators for a map lattice A → L can be computed pointwise (i.e. in terms of the ⊔ and ⊓ operators from L).

Exercise 4.12: Show that if A is finite and L has finite height then the height of the map lattice A → L is height(A → L) = |A| · height(L).

If L1 and L2 are lattices, then a function f : L1 → L2 is a homomorphism if ∀x, y ∈ L1 : f(x ⊔ y) = f(x) ⊔ f(y) ∧ f(x ⊓ y) = f(x) ⊓ f(y). A bijective homomorphism is called an isomorphism. Two lattices are isomorphic if there exists an isomorphism from one to the other.

Exercise 4.13: Argue that every product lattice L^n is isomorphic to a map lattice A → L for some choice of A, and vice versa.

Note that by Exercise 4.13 the lattice StateSigns^n is isomorphic to Nodes → StateSigns where Nodes is the set of CFG nodes, so which of the two variants we use when describing the sign analysis is only a matter of preference.

[3] The notation [a1 ↦ x1, a2 ↦ x2, ...] means the function that maps a1 to x1, a2 to x2, etc.
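As an illustration of the map lattice StateSigns = Vars → Sign, the sketch below represents abstract states as Python dictionaries and compares and combines them pointwise. The dictionary representation and the helper names are assumptions for this example, and both states are assumed to be defined on the same set of variables.

```python
BOT, ZERO, NEG, POS, TOP = "bot", "0", "-", "+", "top"

def sign_leq(x, y):
    return x == y or x == BOT or y == TOP

def sign_lub(x, y):
    if sign_leq(x, y):
        return y
    if sign_leq(y, x):
        return x
    return TOP

def map_leq(f, g, leq):
    """Pointwise order on maps: f is below g if it is below g at every key."""
    return all(leq(f[a], g[a]) for a in f)

def map_lub(f, g, lub):
    """Pointwise least upper bound of two maps over the same keys."""
    return {a: lub(f[a], g[a]) for a in f}

# Abstract states after the two branches of the program from Section 4.1.
after_then = {"a": POS, "b": POS, "c": POS}
after_else = {"a": POS, "b": POS, "c": NEG}

merged = map_lub(after_then, after_else, sign_lub)
print(merged)  # {'a': '+', 'b': '+', 'c': 'top'}
print(map_leq(after_then, merged, sign_leq))  # True
```

Each branch state lies below the merged state, which is what makes the merged state a safe approximation of both.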
If L is a lattice, then so is lift(L), which is a copy of L but with a new bottom element:

It has height(lift(L)) = height(L) + 1 if L has finite height. One use of lifted lattices is for defining the lattice used in interval analysis (Section 6.1), another is for representing reachability information (Section 8.2).
Equations, Monotonicity, and Fixed-Points

Continuing the sign analysis from Section 4.1, what are the signs of the variables at each line of the following simple program?

    var a,b;        // 1
    a = 42;         // 2
    b = a + input;  // 3
    a = a - b;      // 4

We can derive a system of equality constraints (equations) with one constraint variable for each program variable and line number from the program:[4]

    a1 = ⊤           b1 = ⊤
    a2 = +           b2 = b1
    a3 = a2          b3 = a2 + ⊤
    a4 = a3 - b3     b4 = b3

For example, a2 denotes the abstract value of a at the program point immediately after line 2. The operators + and - here work on abstract values, which we return to in Section 5.1. In this constraint system, the constraint variables have values from the abstract value lattice Sign defined in Section 4.3. We can alternatively derive the following equivalent constraint system where each constraint variable instead has a value from the abstract state lattice StateSigns from Section 4.3:[5]

    x1 = [a ↦ ⊤, b ↦ ⊤]
    x2 = x1[a ↦ +]
    x3 = x2[b ↦ x2(a) + ⊤]
    x4 = x3[a ↦ x3(a) - x3(b)]

Here, each constraint variable models the abstract state at a program point; for example, x1 models the abstract state at the program point immediately after line 1.

Notice that each equation only depends on preceding ones for this example program, so in this case the solution can be found by simple substitution. However, mutually recursive equations may appear, for example for programs that contain loops (see Section 5.1).

Also notice that it is important for the analysis of this simple program that the order of statements is taken into account, which is called flow-sensitive analysis. Specifically, when a is read in line 3, the value comes from the assignment to a in line 2, not from the one in line 4.

Exercise 4.14: Give a solution to the constraint system above (that is, values for x1, ..., x4 that satisfy the four equations).

Exercise 4.15: Why is the unification solver from Chapter 3 not suitable for this kind of constraints?

We now show how to solve such constraint systems in a general setting.

A function f : L1 → L2 where L1 and L2 are lattices is monotone (or order-preserving) when ∀x, y ∈ L1 : x ⊑ y ⇒ f(x) ⊑ f(y). As the lattice order when used in program analysis represents precision of information, the intuition of monotonicity is that "more precise input does not result in less precise output".

Exercise 4.16: A function f : L → L where L is a lattice is extensive when ∀x ∈ L : x ⊑ f(x). Assume L is the powerset lattice 2^{0,1,2,3,4}. Give examples of different functions L → L that are, respectively, (a) extensive and monotone, (b) extensive but not monotone, (c) not extensive but monotone, and (d) not extensive and not monotone.

Exercise 4.17: Prove that every constant function is monotone.

[4] We use the term constraint variable to denote variables that appear in mathematical constraint systems, to avoid confusion with program variables that appear in TIP programs.

[5] The notation f[a1 ↦ x1, ..., an ↦ xn] means the function that maps ai to xi, for each i = 1, ..., n, and for all other inputs gives the same output as the function f.
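Because each of the four state equations depends only on earlier ones, they can be evaluated top to bottom. The Python sketch below does so, with abstract states as dictionaries and the map update f[a ↦ x] written as a dictionary copy with one overridden key. The single-character encoding (`b` for ⊥, `T` for ⊤) and the helper `abs_op` are assumptions for this example; the two operator tables follow the abstract + and - of Section 5.1.

```python
ELEMS = "b0-+T"  # b = bottom, T = top; row = left operand, column = right operand

PLUS  = ["bbbbb", "b0-+T", "b--TT", "b+T+T", "bTTTT"]
MINUS = ["bbbbb", "b0+-T", "b-T-T", "b++TT", "bTTTT"]

def abs_op(table, x, y):
    """Look up the abstract result of applying an operator to x and y."""
    return table[ELEMS.index(x)][ELEMS.index(y)]

# The four equations, evaluated in order by substitution.
x1 = {"a": "T", "b": "T"}                          # var a,b;
x2 = {**x1, "a": "+"}                              # a = 42;
x3 = {**x2, "b": abs_op(PLUS, x2["a"], "T")}       # b = a + input; input is top
x4 = {**x3, "a": abs_op(MINUS, x3["a"], x3["b"])}  # a = a - b;

for name, state in [("x1", x1), ("x2", x2), ("x3", x3), ("x4", x4)]:
    print(name, state)
```

This direct evaluation only works because the system has no cyclic dependencies; the fixed-point machinery introduced next handles the general case.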
Exercise 4.18: A function f : L1 → L2 where L1 and L2 are lattices is distributive when ∀x, y ∈ L1 : f(x) ⊔ f(y) = f(x ⊔ y). (a) Show that every distributive function is also monotone. (b) Show that not every monotone function is also distributive.

Exercise 4.19: Prove that a function f : L1 → L2 where L1 and L2 are lattices is monotone if and only if ∀x, y ∈ L1 : f(x) ⊔ f(y) ⊑ f(x ⊔ y).

Exercise 4.20: Prove that function composition preserves monotonicity. That is, if f : L1 → L2 and g : L2 → L3 are monotone, then so is their composition g ◦ f, which is defined by (g ◦ f)(x) = g(f(x)).

The definition of monotonicity generalizes naturally to functions with multiple arguments: for example, a function with two arguments f : L1 × L2 → L3 where L1, L2, and L3 are lattices is monotone when ∀x1, y1 ∈ L1, x2 ∈ L2 : x1 ⊑ y1 ⇒ f(x1, x2) ⊑ f(y1, x2) and ∀x1 ∈ L1, x2, y2 ∈ L2 : x2 ⊑ y2 ⇒ f(x1, x2) ⊑ f(x1, y2).

Exercise 4.21: The operators ⊔ and ⊓ can be viewed as functions. For example, x1 ⊔ x2 where x1, x2 ∈ L returns an element from L. Show that ⊔ and ⊓ are monotone.

Exercise 4.22: Let f : L^n → L^n be a function with n arguments over a lattice L. We can view such a function in different ways: either as a function with n arguments from L, or as a function with a single argument from the product lattice L^n. Argue that this does not matter for the definition of monotonicity.

Exercise 4.23: Show that set difference, X \ Y, as a function with two arguments over a powerset lattice is monotone in the first argument X but not in the second argument Y.

Exercise 4.24: Recall that f[a ↦ x] denotes the function that is identical to f except that it maps a to x. Assume f : L1 → (A → L2) and g : L1 → L2 are monotone functions where L1 and L2 are lattices and A is a set, and let a ∈ A. (Note that the codomain of f is a map lattice.) Show that the function h : L1 → (A → L2) defined by h(x) = f(x)[a ↦ g(x)] is monotone. Also show that the following claim is wrong: The map update operation preserves monotonicity in the sense that if f : L → L is monotone then so is f[a ↦ x] for any lattice L and a, x ∈ L.
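On a small finite lattice, the definition of monotonicity can be tested exhaustively. The Python sketch below does this for the powerset lattice of {0, 1, 2, 3, 4}; the helper names and the two sample functions are assumptions chosen for illustration.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of the universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_monotone(f, elems, leq):
    """Brute-force check: whenever x is below y, f(x) must be below f(y)."""
    return all(leq(f(x), f(y)) for x in elems for y in elems if leq(x, y))

U = frozenset({0, 1, 2, 3, 4})
L = powerset(U)

def subset(x, y):
    return x <= y

def add_zero(x):
    return x | {0}   # adding a fixed element preserves inclusion

def complement(x):
    return U - x     # complement reverses inclusion

print(is_monotone(add_zero, L, subset))    # True
print(is_monotone(complement, L, subset))  # False
```

The check compares all pairs of elements, so its cost is quadratic in the size of the lattice, which for a powerset lattice is exponential in the size of the underlying set.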
We say that x ∈ L is a fixed-point for f if f(x) = x. A least fixed-point x for f is a fixed-point for f where x ⊑ y for every fixed-point y for f.

Let L be a lattice. An equation system[6] over L is of the form

    x1 = f1(x1, ..., xn)
    x2 = f2(x1, ..., xn)
    ...
    xn = fn(x1, ..., xn)

where x1, ..., xn are variables and f1, ..., fn : L^n → L are functions which we call constraint functions. A solution to an equation system provides a value from L for each variable such that all equations are satisfied.

We can combine the n functions into one, f : L^n → L^n,

    f(x1, ..., xn) = (f1(x1, ..., xn), ..., fn(x1, ..., xn))

in which case the equation system looks like

    x = f(x)

where x ∈ L^n. This clearly shows that a solution to an equation system is the same as a fixed-point of its combined function f. As we aim for the most precise solutions, we want least fixed-points.

Exercise 4.25: Show that f is monotone if and only if each f1, ..., fn is monotone, where f is defined from f1, ..., fn as above.

As an example, for the equation system from earlier in this section

    x1 = [a ↦ ⊤, b ↦ ⊤]
    x2 = x1[a ↦ +]
    x3 = x2[b ↦ x2(a) + ⊤]
    x4 = x3[a ↦ x3(a) - x3(b)]

we have four constraint variables, x1, ..., x4, with constraint functions f1, ..., f4 defined as follows:

    f1(x1, ..., x4) = [a ↦ ⊤, b ↦ ⊤]
    f2(x1, ..., x4) = x1[a ↦ +]
    f3(x1, ..., x4) = x2[b ↦ x2(a) + ⊤]
    f4(x1, ..., x4) = x3[a ↦ x3(a) - x3(b)]

Exercise 4.26: Show that the four constraint functions f1, ..., f4 are monotone. (Hint: see Exercise 4.24.)

[6] We also use the broader concept of constraint systems. An equation system is a constraint system where all constraints are equalities. Later in this section we discuss other forms of constraints.
As mentioned earlier, for this simple equation system it is trivial to find a solution by substitution; however, that method is inadequate for equation systems that arise when analyzing programs more generally.

Exercise 4.27: Argue that your solution from Exercise 4.14 is the least fixed-point of the function f defined by f(x1, ..., x4) = (f1(x1, ..., x4), ..., f4(x1, ..., x4)).

The central result we need is the fixed-point theorem:[7]

    In a lattice L with finite height, every monotone function f : L → L has a unique least fixed-point denoted fix(f) defined as:

        fix(f) = ⨆_{i≥0} f^i(⊥)

(Note that when applying this theorem to the specific equation system shown above, f is a function over the product lattice L^n.)

The proof of this theorem is quite simple. Observe that ⊥ ⊑ f(⊥) since ⊥ is the least element. Since f is monotone, it follows that f(⊥) ⊑ f^2(⊥) and by induction that f^i(⊥) ⊑ f^{i+1}(⊥) for any i. Thus, we have an increasing chain:

    ⊥ ⊑ f(⊥) ⊑ f^2(⊥) ⊑ ...

Since L is assumed to have finite height, we must for some k have that f^k(⊥) = f^{k+1}(⊥), i.e. f^k(⊥) is a fixed-point for f. By Exercise 4.2, f^k(⊥) must be the least upper bound of all elements in the chain, so fix(f) = f^k(⊥). Assume now that x is another fixed-point. Since ⊥ ⊑ x it follows that f(⊥) ⊑ f(x) = x, since f is monotone, and by induction we get that fix(f) = f^k(⊥) ⊑ x. Hence, fix(f) is a least fixed-point, and by anti-symmetry of ⊑ it is also unique.

The theorem is a powerful result: It tells us not only that equation systems over lattices always have solutions, provided that the lattices have finite height and the constraint functions are monotone, but also that uniquely most precise solutions always exist. Furthermore, the careful reader may have noticed that the theorem provides an algorithm for computing the least fixed-point: simply compute the increasing chain ⊥ ⊑ f(⊥) ⊑ f^2(⊥) ⊑ ... until the fixed-point is reached. In pseudo-code, this so-called naive fixed-point algorithm looks as follows.

    procedure NaiveFixedPointAlgorithm(f)
      x := ⊥
      while x ≠ f(x) do
        x := f(x)
      end while
      return x
    end procedure

(Instead of computing f(x) both in the loop condition and in the loop body, a trivial improvement is to just compute it once in each iteration and see if the result changes.) The computation of a fixed-point can be illustrated as a walk up the lattice starting at ⊥.

[7] There are many fixed-point theorems in the literature; the one we use here is a variant of a theorem by Kleene [Kle52].
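The naive fixed-point algorithm translates almost directly into Python. The sketch below computes f(x) once per iteration, as suggested above. The sample function f over the powerset lattice of {0, 1, 2, 3} is an assumption made up for this illustration; it is monotone with respect to ⊆, and the bottom element of that lattice is the empty set.

```python
def naive_fixed_point(f, bottom):
    """Iterate f from bottom until the value stops changing.

    For a monotone f on a lattice of finite height this returns the least fixed-point.
    """
    x = bottom
    while True:
        y = f(x)       # computed once per iteration
        if y == x:
            return x
        x = y

def f(s):
    """A monotone sample function: always contains 0, plus successors below 4."""
    return frozenset({0}) | frozenset(n + 1 for n in s if n < 3)

print(sorted(naive_fixed_point(f, frozenset())))  # [0, 1, 2, 3]
```

The iteration visits ∅, {0}, {0,1}, {0,1,2}, and {0,1,2,3}, which is an increasing chain whose length matches the height 4 of this powerset lattice.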
This algorithm is called "naive" because it does not exploit the special structures that are common in analysis lattices. We shall see various less naive fixed-point algorithms in Section 5.3.

The least fixed point is the most precise possible solution to the equation system, but the equation system is (for a sound analysis) merely a conservative approximation of the actual program behavior (again, recall the TM(j) example from Chapter 1). This means that the semantically most precise possible (while still correct) answer is generally below the least fixed point in the lattice. We shall see examples of this in Chapter 5.

Exercise 4.28: Explain step-by-step how the naive fixed-point algorithm computes the solution to the equation system from Exercise 4.14.

The time complexity of computing a fixed-point with this algorithm depends on

• the height of the lattice, since this provides a bound for the number of iterations of the algorithm, and
• the cost of computing f(x) and testing equality, which are performed in each iteration.

We shall investigate other properties of this algorithm and more sophisticated variants in Section 5.3.

Exercise 4.29: Does the fixed-point theorem also hold without the assumption that f is monotone? If yes, give a proof; if no, give a counterexample.
Exercise 4.30: Does the fixed-point theorem also hold without the assumption that the lattice has finite height? If yes, give a proof; if no, give a counterexample.

We can similarly solve systems of inequations of the form

    x1 ⊑ f1(x1, ..., xn)
    x2 ⊑ f2(x1, ..., xn)
    ...
    xn ⊑ fn(x1, ..., xn)

by observing that the relation x ⊑ y is equivalent to x = x ⊓ y (see Exercise 4.2). Thus, solutions are preserved by rewriting the system into

    x1 = x1 ⊓ f1(x1, ..., xn)
    x2 = x2 ⊓ f2(x1, ..., xn)
    ...
    xn = xn ⊓ fn(x1, ..., xn)

which is just a system of equations with monotone functions as before (see Exercises 4.20 and 4.21). Conversely, constraints of the form

    x1 ⊒ f1(x1, ..., xn)
    x2 ⊒ f2(x1, ..., xn)
    ...
    xn ⊒ fn(x1, ..., xn)

can be rewritten into

    x1 = x1 ⊔ f1(x1, ..., xn)
    x2 = x2 ⊔ f2(x1, ..., xn)
    ...
    xn = xn ⊔ fn(x1, ..., xn)

by observing that the relation x ⊒ y is equivalent to x = x ⊔ y. In case we have multiple inequations for each variable, those can also easily be reorganized, for example

    x1 ⊒ f1a(x1, ..., xn)
    x1 ⊒ f1b(x1, ..., xn)

can be rewritten into

    x1 = x1 ⊔ f1a(x1, ..., xn) ⊔ f1b(x1, ..., xn)

which again preserves the solutions.
Chаptеr 5
Dataflow Analysis with Monotone Frameworks

Classical dataflow analysis starts with a CFG and a lattice with finite height. The lattice describes abstract information we wish to infer for the different CFG nodes. It may be fixed for all programs, or it may be parameterized based on the given program. To every node v in the CFG, we assign a constraint variable[1] [[v]] ranging over the elements of the lattice. For each node we then define a dataflow constraint that relates the value of the variable of the node to those of other nodes (typically the neighbors), depending on what construction in the programming language the node represents. If all the constraints for the given program happen to be equations or inequations with monotone right-hand sides, then we can use the fixed-point algorithm from Section 4.4 to compute the analysis result as the unique least solution.

The combination of a lattice and a space of monotone functions is called a monotone framework [KU77]. For a given program to be analyzed, a monotone framework can be instantiated by specifying the CFG and the rules for assigning dataflow constraints to its nodes.

An analysis is sound if all solutions to the constraints correspond to correct information about the program. The solutions may be more or less imprecise, but computing the least solution will give the highest degree of precision possible. We return to the topic of analysis correctness and precision in Chapter 11.

Throughout this chapter we use the subset of TIP without function calls, pointers, and records; those language features are studied in Chapters 9 and 10 and in Exercise 5.10.

[1] As for type analysis, we will ambiguously use the notation [[S]] for [[v]] if S is the syntax associated with node v. The meaning will always be clear from the context.
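Sign Analysis, Revisited

Continuing the example from Section 4.1, our goal is to determine the sign (positive, zero, negative) of all expressions in the given programs. We start with the tiny lattice Sign for describing abstract values:

[Figure: Hasse diagram of the Sign lattice, with ⊤ at the top, the pairwise incomparable elements +, -, and 0 in the middle, and ⊥ at the bottom.]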
We want an abstract value for each program variable, so we define the map lattice

    States = Vars → Sign

where Vars is the set of variables occurring in the given program. Each element of this lattice can be thought of as an abstract state, hence its name. For each CFG node v we assign a constraint variable [[v]] denoting an abstract state that gives the sign values for all variables at the program point immediately after v. The lattice States^n, where n is the number of CFG nodes, then models information for all the CFG nodes.

The dataflow constraints model the effects of program execution on the abstract states. For simplicity, we here focus on a subset of TIP that does not contain pointers or records, so integers are the only type of values we need to consider.

First, we define an auxiliary function JOIN(v) that combines the abstract states from the predecessors of a node v:

    JOIN(v) = ⨆_{w ∈ pred(v)} [[w]]

Note that JOIN(v) is a function of all the constraint variables [[v1]], ..., [[vn]] for the program. For example, with the following CFG, we have JOIN([[a=c+2]]) = [[c=b]] ⊔ [[c=-5]].
[Figure: CFG in which the branch node b > 5 has true and false edges leading to the nodes c=b and c=-5, both of which are followed by the node a=c+2.]
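The JOIN function can be sketched in Python as a pointwise least upper bound over the abstract states of a node's predecessors. The dictionary-based CFG encoding, the node labels used as keys, and the two sample predecessor states are assumptions made for this illustration.

```python
BOT, ZERO, NEG, POS, TOP = "bot", "0", "-", "+", "top"

def lub(x, y):
    """Least upper bound in the Sign lattice."""
    if x == y or x == BOT:
        return y
    if y == BOT:
        return x
    return TOP

def join(v, pred, state, variables):
    """Combine the abstract states of all predecessors of node v, variable by variable."""
    result = {x: BOT for x in variables}  # the least upper bound of no states is bottom
    for w in pred[v]:
        for x in variables:
            result[x] = lub(result[x], state[w][x])
    return result

variables = ["a", "b", "c"]
pred = {"a=c+2": ["c=b", "c=-5"]}
state = {  # hypothetical abstract states after the two branch nodes
    "c=b":  {"a": TOP, "b": POS, "c": POS},
    "c=-5": {"a": TOP, "b": POS, "c": NEG},
}

print(join("a=c+2", pred, state, variables))  # {'a': 'top', 'b': '+', 'c': 'top'}
```

For a node without predecessors the loop body never runs, so the result maps every variable to bottom.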
The most interesting constraint rule for this analysis is the one for assignment statements, that is, nodes v of the form X = E:

    X = E:    [[v]] = JOIN(v)[X ↦ eval(JOIN(v), E)]

This constraint rule models the fact that the abstract state after an assignment X = E is equal to the abstract state immediately before the assignment, except that the abstract value of X is the result of abstractly evaluating the expression E. The eval function performs an abstract evaluation of expression E relative to an abstract state σ:

    eval(σ, X) = σ(X)
    eval(σ, I) = sign(I)
    eval(σ, input) = ⊤
    eval(σ, E1 op E2) = ^op(eval(σ, E1), eval(σ, E2))

The function sign gives the sign of an integer constant, and ^op is an abstract evaluation of the given operator,[2] defined by the following tables:
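(In each table, the row gives the abstract value of the left operand and the column gives the abstract value of the right operand.)

    ^+ | ⊥  0  -  +  ⊤        ^- | ⊥  0  -  +  ⊤
    ---+---------------       ---+---------------
    ⊥  | ⊥  ⊥  ⊥  ⊥  ⊥        ⊥  | ⊥  ⊥  ⊥  ⊥  ⊥
    0  | ⊥  0  -  +  ⊤        0  | ⊥  0  +  -  ⊤
    -  | ⊥  -  -  ⊤  ⊤        -  | ⊥  -  ⊤  -  ⊤
    +  | ⊥  +  ⊤  +  ⊤        +  | ⊥  +  +  ⊤  ⊤
    ⊤  | ⊥  ⊤  ⊤  ⊤  ⊤        ⊤  | ⊥  ⊤  ⊤  ⊤  ⊤

    ^* | ⊥  0  -  +  ⊤        ^/ | ⊥  0  -  +  ⊤
    ---+---------------       ---+---------------
    ⊥  | ⊥  ⊥  ⊥  ⊥  ⊥        ⊥  | ⊥  ⊥  ⊥  ⊥  ⊥
    0  | ⊥  0  0  0  0        0  | ⊥  ⊥  0  0  ⊤
    -  | ⊥  0  +  -  ⊤        -  | ⊥  ⊥  ⊤  ⊤  ⊤
    +  | ⊥  0  -  +  ⊤        +  | ⊥  ⊥  ⊤  ⊤  ⊤
    ⊤  | ⊥  0  ⊤  ⊤  ⊤        ⊤  | ⊥  ⊥  ⊤  ⊤  ⊤

    ^> | ⊥  0  -  +  ⊤        ^== | ⊥  0  -  +  ⊤
    ---+---------------       ----+---------------
    ⊥  | ⊥  ⊥  ⊥  ⊥  ⊥        ⊥   | ⊥  ⊥  ⊥  ⊥  ⊥
    0  | ⊥  0  +  0  ⊤        0   | ⊥  +  0  0  ⊤
    -  | ⊥  0  ⊤  0  ⊤        -   | ⊥  0  ⊤  0  ⊤
    +  | ⊥  +  +  ⊤  ⊤        +   | ⊥  0  0  ⊤  ⊤
    ⊤  | ⊥  ⊤  ⊤  ⊤  ⊤        ⊤   | ⊥  ⊤  ⊤  ⊤  ⊤

[2] Unlike in Section 4.4, to avoid confusion we now distinguish between concrete operators and their abstract counterparts using the ^ (hat) notation.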
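The eval function and the assignment rule can be sketched in Python as follows. Expressions are encoded as nested tuples, abstract values as single characters (`b` for ⊥, `T` for ⊤), and each operator table as five strings whose row is selected by the left operand and whose column by the right operand. This encoding and the names `eval_abs` and `assign` are assumptions for this illustration.

```python
ELEMS = "b0-+T"  # b = bottom, T = top

TABLES = {
    "+":  ["bbbbb", "b0-+T", "b--TT", "b+T+T", "bTTTT"],
    "-":  ["bbbbb", "b0+-T", "b-T-T", "b++TT", "bTTTT"],
    "*":  ["bbbbb", "b0000", "b0+-T", "b0-+T", "b0TTT"],
    "/":  ["bbbbb", "bb00T", "bbTTT", "bbTTT", "bbTTT"],
    ">":  ["bbbbb", "b0+0T", "b0T0T", "b++TT", "bTTTT"],
    "==": ["bbbbb", "b+00T", "b0T0T", "b00TT", "bTTTT"],
}

def sign(i):
    """Sign of an integer constant."""
    return "+" if i > 0 else "-" if i < 0 else "0"

def eval_abs(sigma, e):
    """Abstractly evaluate expression e in abstract state sigma."""
    kind = e[0]
    if kind == "var":
        return sigma[e[1]]
    if kind == "int":
        return sign(e[1])
    if kind == "input":
        return "T"
    _, op, e1, e2 = e  # ("op", operator, left, right)
    row = ELEMS.index(eval_abs(sigma, e1))
    col = ELEMS.index(eval_abs(sigma, e2))
    return TABLES[op][row][col]

def assign(sigma, x, e):
    """Abstract state after X = E: sigma with x mapped to the abstract value of e."""
    return {**sigma, x: eval_abs(sigma, e)}

sigma = {"a": "+", "b": "+", "c": "T"}
a, b = ("var", "a"), ("var", "b")
print(assign(sigma, "c", ("op", "+", a, b)))  # {'a': '+', 'b': '+', 'c': '+'}
print(assign(sigma, "c", ("op", "-", a, b)))  # {'a': '+', 'b': '+', 'c': 'T'}
```

Storing the tables as data keeps eval itself independent of the chosen operators, so a different abstract domain only requires different tables.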
Variable declarations are modeled as follows (recall that freshly declared local variables are uninitialized, so they can have any value).

    var X1, ..., Xn:    [[v]] = JOIN(v)[X1 ↦ ⊤, ..., Xn ↦ ⊤]

For the subset of TIP we have chosen to focus on, no other kinds of CFG nodes affect the values of variables, so for the remaining nodes we have this trivial constraint rule:

    [[v]] = JOIN(v)

Exercise 5.1: In the CFGs we consider in this chapter (for TIP without function calls), entry nodes have no predecessors. (a) Argue that the constraint rule [[v]] = JOIN(v) for such nodes is equivalent to defining [[v]] = ⊥. (b) Argue that removing all equations of the form [[v]] = ⊥ from an equation system does not change its least solution.

A program with n CFG nodes, v1, ..., vn, is thus represented by n equations,

    [[v1]] = af1([[v1]], ..., [[vn]])
    [[v2]] = af2([[v1]], ..., [[vn]])
    ...
    [[vn]] = afn([[v1]], ..., [[vn]])

where afi : States^n → States for each i = 1, ..., n.

The lattice and constraints form a monotone framework. To see that all the right-hand sides of our constraints correspond to monotone functions, notice that they are all composed (see Exercise 4.20) from the ⊔ operator (see Exercise 4.21), map updates (see Exercise 4.24), and the eval function. The sign function is constant (see Exercise 4.17). Monotonicity of the abstract operators used by eval can be verified by a tedious manual inspection. For a lattice with n elements, monotonicity of an n × n table can be verified automatically in time O(n^3).

Exercise 5.2: Describe an algorithm for checking monotonicity of an operator given by an n × n table. Can you do better than O(n^3) time?

Exercise 5.3: Check that the above tables indeed define monotone operators on the Sign lattice.
Exercise 5.4: Argue that these tables are the most precise possible for the Sign lattice, given that soundness must be preserved. (An informal argument suffices for now; we shall see a more formal approach to stating and proving this property in Section 11.4.)

Exercise 5.5: The table for the abstract evaluation of == is unsound if we consider the full TIP language instead of the subset without pointers, function calls, and records. Why? And how could it be fixed?

Using the fixed-point algorithm from Section 4.4, we can now obtain the analysis result for the given program by computing fix(af) where af(x1, ..., xn) = (af1(x1, ..., xn), ..., afn(x1, ..., xn)).

Recall the example program from Section 4.1:

    var a,b,c;
    a = 42;
    b = 87;
    if (input) {
      c = a + b;
    } else {
      c = a - b;
    }

Its CFG looks as follows, with nodes {v1, ..., v8}:

[Figure: CFG of the example program with nodes v1, ..., v8. It contains the statement nodes var a,b,c, a = 42, and b = 87 in sequence, followed by the branch node input, whose true edge leads to c=a+b and whose false edge leads to c=a-b; the two branches merge at v8.]
Exercise 5.6: Generate the equation system for this example program. Then solve the equations using the fixed-point algorithm from Section 4.4. (Notice that the least upper bound operation is exactly what we need to model the merging of information at v8!)
Exercise 5.7: Write a small TIP program where the sign analysis leads to an equation system with mutually recursive constraints. Then explain step-by-step how the fixed-point algorithm from Section 4.4 computes the solution.

We lose some information in the above analysis, since for example the expressions (2>0)==1 and x-x are analyzed as ⊤, which seems unnecessarily coarse. (These are examples where the least fixed-point of the analysis equation system is not identical to the semantically best possible answer.) Also, + divided by + results in ⊤ rather than + since e.g. 1/2 is rounded down to zero. To handle some of these situations more precisely, we could enrich the sign lattice with the elements 1 (the constant 1), +0 (positive or zero), and -0 (negative or zero) to keep track of more precise abstract values:

[Figure: Hasse diagram of the extended Sign lattice with the eight elements ⊤, +0, -0, +, 0, -, 1, and ⊥.]

and consequently describe the abstract operators by 8 × 8 tables.

Exercise 5.8: Define the six operators on the extended Sign lattice (shown above) by means of 8 × 8 tables. Check that they are monotone. Does this new lattice improve precision for the expressions (2>0)==1, x-x, and 1/2?
Exercise 5.9: Show how the eval function could be improved to make the sign analysis able to show that the final value of z cannot be a negative number in the following program:

    var x,y,z;
    x = input;
    y = x*x;
    z = (x-x+1)*y;

Exercise 5.10: Explain how to extend the sign analysis to handle TIP programs that use records (see Chapter 2). One approach, called field insensitive analysis, simply mixes together the different fields of each record. Another approach, field sensitive analysis, instead uses a more elaborate lattice that keeps different abstract values for the different field names.

The results of a sign analysis could in theory be used to eliminate division-by-zero errors by rejecting programs in which denominator expressions have sign 0 or ⊤. However, the resulting analysis will probably unfairly reject too many programs to be practical. Other more powerful analysis techniques, such as interval analysis (Section 6.1) and path sensitivity (Chapter 7), would be more useful for detecting such errors.

Notice that in this analysis we use the ⊔ operation (in the definition of JOIN), but we never use the ⊓ operation. In fact, when implementing analyses with monotone frameworks, it is common that ⊓ is ignored entirely even though it mathematically exists.
Constant Propagation Analysis

An analysis related to sign analysis is constant propagation analysis, where we for every program point want to determine the variables that have a constant value. The analysis is structured just like the sign analysis, except for two modifications. First, the Sign lattice is replaced by flat(Z) where Z is the set of all integers:[3]

[Figure: Hasse diagram of flat(Z), with ⊤ at the top, the pairwise incomparable integers ..., -3, -2, -1, 0, 1, 2, 3, ... in the middle, and ⊥ at the bottom.]

Second, the abstraction of operators op ∈ {+, -, *, /, >, ==} is modified accordingly:

    a ^op b = ⊥         if a = ⊥ or b = ⊥
              ⊤         otherwise, if a = ⊤ or b = ⊤
              a op b    if a, b ∈ Z

[3] For simplicity, we assume that TIP integer values are unbounded.
Exercise 5.11: Argue that this definition of ^op leads to a sound analysis. (An informal argument suffices; we shall see a more formal approach to proving soundness in Section 11.3.)

Using constant propagation analysis, an optimizing compiler could transform the program

    var x,y,z;
    x = 27;
    y = input;
    z = 2*x+y;
    if (x < 0) { y = z-3; } else { y = 12; }
    output y;

into

    var x,y,z;
    x = 27;
    y = input;
    z = 54+y;
    if (0) { y = z-3; } else { y = 12; }
    output y;

which, following a reaching definitions analysis and dead code elimination (see Section 5.7), can be reduced to this shorter and more efficient program:

    var y;
    y = input;
    output 12;

This kind of optimization was among the first uses of static program analysis [Kil73].
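The abstract operators of constant propagation can be sketched in Python as follows, with Python integers standing for the elements of Z and two strings for ⊥ and ⊤. This encoding and the names used are assumptions for this illustration. Division is left out of the sketch because its treatment of division by zero and rounding would need assumptions that this section does not spell out.

```python
import operator

BOT, TOP = "bot", "top"  # all other abstract values are plain Python integers

CONCRETE = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    ">": lambda a, b: int(a > b),    # comparisons yield 1 or 0
    "==": lambda a, b: int(a == b),
}

def abs_op(op, a, b):
    """Abstract operator on flat(Z): bottom wins over top, constants are folded."""
    if a == BOT or b == BOT:
        return BOT
    if a == TOP or b == TOP:
        return TOP
    return CONCRETE[op](a, b)

# Abstract state before 'z = 2*x+y' in the example program.
state = {"x": 27, "y": TOP}
two_x = abs_op("*", 2, state["x"])
print(two_x)                         # 54
print(abs_op("+", two_x, state["y"]))  # top
print(abs_op(">", 0, state["x"]))      # 0
```

The first result is what allows the compiler to rewrite 2*x into 54, while the sum remains unknown because y comes from input.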
5.3 FIXED-POINT ALGORITHMS
Еxеrcisе 5.12: Аssumе thаt TIP cоmputеs with (аrbitrаry-prеcisiоn) rеаl numbеrs instеаd оf intеgеrs. Dеsign аn аnаlysis thаt finds оut which vаriаblеs аt еаch prоgrаm pоint in а givеn prоgrаm оnly hаvе intеgеr vаluеs.
Fixed-Point Algorithms

In summary, dataflow analysis works as follows. For a CFG with nodes Nodes = {v1, v2, ..., vn} we work in the lattice L^n where L is a lattice that models abstract states. Assuming that node vi generates the dataflow equation [[vi]] = fi([[v1]], ..., [[vn]]), we construct the combined function f : L^n → L^n by defining

\[ f(x_1, \ldots, x_n) = \bigl(f_1(x_1, \ldots, x_n), \ldots, f_n(x_1, \ldots, x_n)\bigr). \]

Applying the fixed-point algorithm, NaiveFixedPointAlgorithm(f) (see page 43), then gives us the desired solution for [[v1]], ..., [[vn]].

Exercise 4.28 (page 44) demonstrates why the algorithm is called "naive". In each iteration it applies all the constraint functions, f1, ..., f4, and much of that computation is redundant. For example, f2 (see page 42) depends only on x1, but the value of x1 is unchanged in most iterations.

As a step toward more efficient algorithms, the round-robin algorithm exploits the fact that our lattice has the structure L^n and that f is composed from f1, ..., fn:

  procedure RoundRobin(f1, ..., fn)
    (x1, ..., xn) := (⊥, ..., ⊥)
    while (x1, ..., xn) ≠ f(x1, ..., xn) do
      for i := 1 ... n do
        xi := fi(x1, ..., xn)
      end for
    end while
    return (x1, ..., xn)
  end procedure

(Similar to the naive fixed-point algorithm, it is trivial to avoid computing each fi(x1, ..., xn) twice in every iteration.) Notice that one iteration of the while-loop in this algorithm does not in general give the same result as one iteration of the naive fixed-point algorithm: when computing fi(x1, ..., xn), the values of x1, ..., x(i−1) have been updated by the preceding iterations of the inner loop (while the values of xi, ..., xn come from the previous iteration of the outer loop or are still ⊥, like in the naive fixed-point algorithm). Nevertheless, the algorithm always terminates and produces the same result as the naive fixed-point algorithm. Each iteration of the while-loop takes the same time as for the naive fixed-point algorithm, but the number of iterations required to reach the fixed-point may be lower.
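To make the difference between the two algorithms concrete, here is a minimal Python sketch. The three-equation system (x1 = {a}, x2 = x1 ∪ {b}, x3 = x2 ∪ {c}) is a made-up example over a powerset lattice, not one taken from the exercises, and the function names are hypothetical. Each constraint function receives the whole current vector, as in the pseudocode.

```python
def naive_fixed_point(fs, bottom):
    """Apply all f_i to the old vector until nothing changes."""
    xs = [bottom] * len(fs)
    iterations = 0
    while True:
        new = [f(xs) for f in fs]      # every f_i reads the *old* vector
        if new == xs:
            return xs, iterations
        xs = new
        iterations += 1

def round_robin(fs, bottom):
    """Update x_i in place, so later f_j see the new values at once."""
    xs = [bottom] * len(fs)
    iterations = 0
    while True:
        changed = False
        for i, f in enumerate(fs):
            y = f(xs)                  # x_1..x_{i-1} are already updated
            if y != xs[i]:
                xs[i] = y
                changed = True
        if not changed:                # a full pass without change: x = f(x)
            return xs, iterations
        iterations += 1

# Made-up monotone system over the powerset of {a, b, c}.
fs = [
    lambda xs: frozenset("a"),
    lambda xs: xs[0] | frozenset("b"),
    lambda xs: xs[1] | frozenset("c"),
]
naive_sol, naive_iters = naive_fixed_point(fs, frozenset())
rr_sol, rr_iters = round_robin(fs, frozenset())
print(naive_iters, rr_iters)
```

On this chain the naive algorithm needs three changing iterations because information moves one equation per iteration, while round-robin propagates it through all three equations in a single pass. Both reach the same solution.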
Exercise 5.13: Prove that the round-robin algorithm computes the least fixed-point of f. (Hint: see the proof of the fixed-point theorem, and consider the ascending chain that arises from the sequence of xi := fi(x1, ..., xn) operations.)

Exercise 5.14: Continuing Exercise 4.28, how many iterations are required by the naive fixed-point algorithm and the round-robin algorithm, respectively, to reach the fixed-point?

We can do better than round-robin. First, the order of the iterations i := 1 ... n is clearly irrelevant for the correctness of the algorithm (see your proof from Exercise 5.13). Second, we still apply all constraint functions in each iteration of the while-loop. What matters for correctness, as should be clear from your solution to Exercise 5.13, is that the constraint functions are applied until the fixed-point is reached for all of them. This observation leads to the chaotic-iteration algorithm:

  procedure ChaoticIteration(f1, ..., fn)
    (x1, ..., xn) := (⊥, ..., ⊥)
    while (x1, ..., xn) ≠ f(x1, ..., xn) do
      choose i nondeterministically from {1, ..., n}
      xi := fi(x1, ..., xn)
    end while
    return (x1, ..., xn)
  end procedure

This is not a practical algorithm, because its efficiency and termination depend on how i is chosen in each iteration. Additionally, naively computing the loop condition may now be more expensive than executing the loop body. However, if it terminates, the algorithm produces the right result.

Exercise 5.15: Prove that the chaotic-iteration algorithm computes the least fixed-point of f, if it terminates. (Hint: see your solution to Exercise 5.13.)

The algorithm we describe next is a practical variant of chaotic-iteration. In the general case, every constraint variable [[vi]] may depend on all other variables. Most often, however, an actual instance of fi will only read the values of a few other variables, as in the examples from Exercise 4.26 and Exercise 5.6.
We represent this information as a map dep : Nodes → 2^Nodes which for each node v tells us the subset of other nodes for which [[v]] occurs in a nontrivial manner on the right-hand side of their dataflow equations. That is,
dep(v) is the set of nodes whose information may depend on the information of v. We also define its inverse: dep^{-1}(v) = {w | v ∈ dep(w)}.

For the example from Exercise 5.6, we have, in particular, dep(v5) = {v6, v7}. This means that whenever [[v5]] changes its value during the fixed-point computation, only f6 and f7 may need to be recomputed. Armed with this information, we can present a simple work-list algorithm:

  procedure SimpleWorkListAlgorithm(f1, ..., fn)
    (x1, ..., xn) := (⊥, ..., ⊥)
    W := {v1, ..., vn}
    while W ≠ ∅ do
      vi := W.removeNext()
      y := fi(x1, ..., xn)
      if y ≠ xi then
        xi := y
        for each vj ∈ dep(vi) do
          W.add(vj)
        end for
      end if
    end while
    return (x1, ..., xn)
  end procedure

The set W is here called the work-list with operations 'add' and 'removeNext' for adding and (nondeterministically) removing an item. The work-list initially contains all nodes, so each fi is applied at least once. It is easy to see that the work-list algorithm terminates on any input: In each iteration, we either move up in the L^n lattice, or the size of the work-list decreases. As usual, we can only move up in the lattice finitely many times as it has finite height, and the while-loop terminates when the work-list is empty. Correctness follows from observing that each iteration of the algorithm has the same effect on (x1, ..., xn) as one iteration of the chaotic-iteration algorithm for some nondeterministic choice of i.

Exercise 5.16: Argue that a sound, but probably not very useful choice for the dep map is one that always returns the set of all CFG nodes.

Exercise 5.17: As stated above, we can choose dep(v5) = {v6, v7} for the example equation system from Exercise 5.6. Argue that a good strategy for the sign analysis is to define dep = succ. (We return to this topic in Section 5.8.)

Exercise 5.18: Explain step-by-step how the work-list algorithm computes the solution to the equation system from Exercise 5.6.
(Since the 'removeNext' operation is nondeterministic, there are many correct answers!)
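The simple work-list algorithm translates almost line by line into Python. The sketch below is an illustration under assumptions: it resolves the nondeterministic 'removeNext' with a FIFO queue (one of many valid choices), keeps a `queued` set so a node is never in the queue twice, and runs on a made-up three-equation chain rather than on the system from Exercise 5.6.

```python
from collections import deque

def simple_worklist(fs, dep, bottom):
    """fs[i] computes x_i from the whole vector; dep[i] lists the
    indices whose constraint functions read x_i."""
    xs = [bottom] * len(fs)
    work = deque(range(len(fs)))       # initially all nodes
    queued = set(work)                 # mirrors W being a set
    applications = 0
    while work:
        i = work.popleft()             # FIFO stands in for removeNext
        queued.discard(i)
        y = fs[i](xs)
        applications += 1
        if y != xs[i]:
            xs[i] = y
            for j in dep[i]:
                if j not in queued:
                    work.append(j)
                    queued.add(j)
    return xs, applications

# Made-up system: x1 = {a}, x2 = x1 ∪ {b}, x3 = x2 ∪ {c}.
fs = [
    lambda xs: frozenset("a"),
    lambda xs: xs[0] | frozenset("b"),
    lambda xs: xs[1] | frozenset("c"),
]
dep = {0: [1], 1: [2], 2: []}
solution, applications = simple_worklist(fs, dep, frozenset())
print(solution, applications)
```

Here each constraint function is applied exactly once, because every node is processed after the node it depends on. With a less fortunate removal order, some functions would be applied more than once, but the result would be the same.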
Exercise 5.19: When reasoning about worst-case complexity of analyses that are based on work-list algorithms, it is sometimes useful if one can bound the number of predecessors |pred(v)| or successors |succ(v)| for all nodes v.
(a) Describe a family of TIP functions where the maximum number of successors |succ(v)| for the nodes v in each function grows linearly in the number of CFG nodes.
(b) Now let us modify the CFG construction slightly, such that a dummy "no-op" node is inserted at the merge point after the two branches of each if block. This will increase the number of CFG nodes by at most a constant factor. Argue that we now have |pred(v)| ≤ 2 and |succ(v)| ≤ 2 for all nodes v.

Assuming that |dep(v)| and |dep^{-1}(v)| are bounded by a constant for all nodes v, the worst-case time complexity of the simple work-list algorithm can be expressed as

  O(n · h · k)

where n is the number of CFG nodes in the program being analyzed, h is the height of the lattice L for abstract states, and k is the worst-case time required to compute a constraint function fi(x1, ..., xn).

Exercise 5.20: Prove the above statement about the worst-case time complexity of the simple work-list algorithm. (It is reasonable to assume that the work-list operations 'add' and 'removeNext' take constant time.)

Exercise 5.21: Another useful observation when reasoning about worst-case complexity of dataflow analyses is that normalizing a program (see Section 2.3) may increase the number of CFG nodes by more than a constant factor, but represented as an AST or as textual source code, the size of the program increases by at most a constant factor. Explain why this claim is correct.

Exercise 5.22: Estimate the worst-case time complexity of the sign analysis with the simple work-list algorithm, using the formula above. (As this formula applies to any dataflow analysis implemented with the simple work-list algorithm, the actual worst-case complexity of this specific analysis may be asymptotically better!)

Further algorithmic improvements are possible. It may be beneficial to handle in separate turns the strongly connected components of the graph induced by the dep map, and the work-list set could be changed into a priority queue allowing us to exploit domain-specific knowledge about a particular dataflow problem. Also, for some analyses, the dependence information can be made more precise
5.4 LIVE VARIABLES ANALYSIS
by аllоwing dеp tо cоnsidеr thе currеnt vаluе оf (x1, . . . , xn) in аdditiоn tо thе nоdе v.
Live Variables Analysis

A variable is live at a program point if there exists an execution where its value is read later in the execution without it being written to in between. Clearly undecidable, this property can be approximated by a static analysis called live variables analysis (or liveness analysis). The typical use of live variables analysis is optimization: there is no need to store the value of a variable that is not live. For this reason, we want the analysis to be conservative in the direction where the answer "not live" can be trusted and "live" is the safe but useless answer.

We use a powerset lattice where the elements are the variables occurring in the given program. This is an example of a parameterized lattice, that is, one that depends on the specific program being analyzed. For the example program
  var x,y,z;
  x = input;
  while (x>1) {
    y = x/2;
    if (y>3) x = x-y;
    z = x-4;
    if (z>0) x = x/2;
    z = z-1;
  }
  output x;

the lattice modeling abstract states is thus (see footnote 4):

  States = (2^{x,y,z}, ⊆)

The corresponding CFG looks as follows:

4 A word of caution: For historical reasons, some textbooks and research papers describe dataflow analyses using the lattices "upside down". This makes no difference whatsoever to the analysis results (because of the lattice dualities discussed in Chapter 4), but it can be confusing.
[Figure: control flow graph of the example program. Its nodes are var x,y,z; x = input; x>1; y = x/2; y>3; x = x−y; z = x−4; z>0; x = x/2; z = z−1; and output x.]
For every CFG node v we introduce a constraint variable [[v]] denoting the subset of program variables that are live at the program point before that node. The analysis will be conservative, since the computed set may be too large. We use the auxiliary definition

\[ JOIN(v) = \bigcup_{w \in succ(v)} [\![w]\!] \]

Unlike the JOIN function from sign analysis, this one combines abstract states from the successors instead of the predecessors. We have defined the order relation as ⊑ = ⊆, so ⊔ = ∪.

As in sign analysis, the most interesting constraint rule is the one for assignments:

  X = E:   [[v]] = JOIN(v) \ {X} ∪ vars(E)

This rule models the fact that the set of live variables before the assignment is the same as the set after the assignment, except for the variable being written to and the variables that are needed to evaluate the right-hand-side expression.

Exercise 5.23: Explain why the constraint rule for assignments, as defined above, is sound.

Branch conditions and output statements are modelled as follows:

  if (E):
  while (E):    [[v]] = JOIN(v) ∪ vars(E)
  output E:

where vars(E) denotes the set of variables occurring in E. For variable declarations and exit nodes:

  var X1, ..., Xn:   [[v]] = JOIN(v) \ {X1, ..., Xn}
  [[exit]] = ∅

For all other nodes:

  [[v]] = JOIN(v)

Exercise 5.24: Argue that the right-hand sides of the constraints define monotone functions.

Our example program yields these constraints:

  [[var x,y,z]] = [[x=input]] \ {x, y, z}
  [[x=input]]   = [[x>1]] \ {x}
  [[x>1]]       = ([[y=x/2]] ∪ [[output x]]) ∪ {x}
  [[y=x/2]]     = ([[y>3]] \ {y}) ∪ {x}
  [[y>3]]       = [[x=x-y]] ∪ [[z=x-4]] ∪ {y}
  [[x=x-y]]     = ([[z=x-4]] \ {x}) ∪ {x, y}
  [[z=x-4]]     = ([[z>0]] \ {z}) ∪ {x}
  [[z>0]]       = [[x=x/2]] ∪ [[z=z-1]] ∪ {z}
  [[x=x/2]]     = ([[z=z-1]] \ {x}) ∪ {x}
  [[z=z-1]]     = ([[x>1]] \ {z}) ∪ {z}
  [[output x]]  = [[exit]] ∪ {x}
  [[exit]]      = ∅

whose least solution is:

  [[entry]]     = ∅
  [[var x,y,z]] = ∅
  [[x=input]]   = ∅
  [[x>1]]       = {x}
  [[y=x/2]]     = {x}
  [[y>3]]       = {x, y}
  [[x=x-y]]     = {x, y}
  [[z=x-4]]     = {x}
  [[z>0]]       = {x, z}
  [[x=x/2]]     = {x, z}
  [[z=z-1]]     = {x, z}
  [[output x]]  = {x}
  [[exit]]      = ∅

From this information a clever compiler could deduce that y and z are never live at the same time, and that the value written in the assignment z=z-1 is never read. Thus, the program may safely be optimized into the following one, which saves the cost of one assignment and could result in better register allocation:

  var x,yz;
  x = input;
  while (x>1) {
    yz = x/2;
    if (yz>3) x = x-yz;
    yz = x-4;
    if (yz>0) x = x/2;
  }
  output x;

Exercise 5.25: Consider the following program:

  main() {
    var x,y,z;
    x = input;
    y = input;
    z = x;
    output y;
  }

Show for each program point the set of live variables, as computed by our live variables analysis. (Do not forget the entry and exit points.)

Exercise 5.26: An analysis is distributive if all its constraint functions are distributive according to the definition from Exercise 4.18. Show that live variables analysis is distributive.

Exercise 5.27: As Exercise 5.25 demonstrates, live variables analysis is not ideal for locating code that can safely be removed when building an optimizing compiler. Let us define that a variable is useless at a given program point if it is dead (i.e. not live) or its value is only used to compute values of useless variables. A variable is strongly live if it is not useless.
(a) Show how the live variables analysis can be modified to compute strongly live variables.
(b) Show for each program point in the program from Exercise 5.25 the set of strongly live variables, as computed by your new analysis.

We can estimate the worst-case time complexity of the live variables analysis, with for example the naive fixed-point algorithm from Section 4.4. We first observe that if the program has n CFG nodes and b variables, then the lattice (2^Vars)^n has height b · n, which bounds the number of iterations we can perform. Each lattice element can be represented as a bitvector of length b · n. Using the observation from Exercise 5.19 we can ensure that |succ(v)| ≤ 2 for any node v. For each iteration we therefore have to perform O(n) union, difference, or equality operations on sets of size b, which can be done in time O(b · n). Thus, we reach a time complexity of O(b^2 · n^2).
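The live variables constraints for the while-loop example earlier in this section can be checked mechanically. The following Python sketch is an illustration under assumptions: the CFG is written out by hand as a dictionary, each node is described by the variables it writes and reads (the `kill` and `gen` names are conventional shorthand, not terminology from this chapter), and a round-robin style loop is used to reach the least fixed-point.

```python
# node -> (successors, variables written ("kill"), variables read ("gen"))
CFG = {
    "var x,y,z": (["x=input"],            {"x", "y", "z"}, set()),
    "x=input":   (["x>1"],                {"x"},           set()),
    "x>1":       (["y=x/2", "output x"],  set(),           {"x"}),
    "y=x/2":     (["y>3"],                {"y"},           {"x"}),
    "y>3":       (["x=x-y", "z=x-4"],     set(),           {"y"}),
    "x=x-y":     (["z=x-4"],              {"x"},           {"x", "y"}),
    "z=x-4":     (["z>0"],                {"z"},           {"x"}),
    "z>0":       (["x=x/2", "z=z-1"],     set(),           {"z"}),
    "x=x/2":     (["z=z-1"],              {"x"},           {"x"}),
    "z=z-1":     (["x>1"],                {"z"},           {"z"}),
    "output x":  (["exit"],               set(),           {"x"}),
    "exit":      ([],                     set(),           set()),
}

def live_variables(cfg):
    """Least solution of [[v]] = (JOIN(v) \\ kill) ∪ gen, where
    JOIN(v) is the union over the successors of v."""
    live = {v: set() for v in cfg}         # start at bottom: empty sets
    changed = True
    while changed:
        changed = False
        for v, (succ, kill, gen) in cfg.items():
            join = set().union(*(live[w] for w in succ))
            new = (join - kill) | gen
            if new != live[v]:
                live[v] = new
                changed = True
    return live

live = live_variables(CFG)
for node, variables in live.items():
    print(node, sorted(variables))
```

All constraint rules of the analysis fit the single form (JOIN(v) \ kill) ∪ gen, which is why one loop body suffices. The computed sets agree with the least solution listed above, and no set contains both y and z, which is the observation that justifies merging them into one variable.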
5.5 AVAILABLE EXPRESSIONS ANALYSIS
Exercise 5.28: Can you obtain an asymptotically better bound on the worst-case time complexity of live variables analysis with the naive fixed-point algorithm, if exploiting properties of the structures of TIP CFGs and the analysis constraints?

Exercise 5.29: Recall from Section 5.3 that the work-list algorithm relies on a function dep(v) for avoiding recomputation of constraint functions that are guaranteed not to change outputs. What would be a good strategy for defining dep(v) in general for live variables analysis of any given program?

Exercise 5.30: Estimate the worst-case time complexity of the live variables analysis with the simple work-list algorithm, by using the formula from page 58.
Available Expressions Analysis

A nontrivial expression in a program is available at a program point if its current value has already been computed earlier in the execution. Such information is useful for program optimization. The set of available expressions for all program points can be approximated using a dataflow analysis. The lattice we use has as elements all expressions occurring in the program. To be useful for program optimization purposes, an expression may be included at a given program point only if it is definitely available no matter how the computation arrived at that program point, so we choose the lattice to be ordered by reverse subset inclusion. For the program

  var x,y,z,a,b;
  z = a+b;
  y = a*b;
  while (y > a+b) {
    a = a+1;
    x = a+b;
  }

we have four different nontrivial expressions, so our lattice for abstract states is

  States = (2^{a+b, a*b, y>a+b, a+1}, ⊇)

which looks like this:
[Figure: Hasse diagram of the lattice, drawn with ∅ at the top and {a+b, a*b, y>a+b, a+1} at the bottom; each level down adds one expression to the sets on the level above.]
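The top element of our lattice is ∅, which corresponds to the trivial information that no expressions are known to be available. The CFG for the above program looks as follows:

[Figure: control flow graph with nodes var x,y,z,a,b; z = a+b; y = a*b; y > a+b; a = a+1; and x = a+b. The true branch of y > a+b leads to a = a+1, and the false branch leaves the loop.]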
Аs usuаl in dаtаflоw аnаlysis, fоr еаch CFG nоdе v wе intrоducе а cоnstrаint vаriаblе [ v] rаnging оvеr Stаtеs. Оur intеntiоn is thаt it shоuld cоntаin thе subsеt оf еxprеssiоns thаt аrе guаrаntееd аlwаys tо bе аvаilаblе аt thе prоgrаm pоint аftеr thаt nоdе. Fоr еxаmplе, thе еxprеssiоn а+b is аvаilаblе аt thе cоnditiоn in thе lооp, but it is nоt аvаilаblе аt thе finаl аssignmеnt in thе lооp. Оur аnаlysis will bе cоnsеrvаtivе in thе sеnsе thаt thе cоmputеd sеts mаy bе tоо smаll but nеvеr tоо lаrgе. Nеxt wе dеfinе thе dаtаflоw cоnstrаints. Thе intuitiоn is thаt аn еxprеssiоn is аvаilаblе аt а nоdе v if it is аvаilаblе frоm аll incоming еdgеs оr is cоmputеd by v, unlеss its vаluе is dеstrоyеd by аn аssignmеnt stаtеmеnt. Thе JОIN functiоn usеs ∩ (bеcаusе thе lаtticе оrdеr is nоw ⊇) аnd prеd
(because availability of expressions depends on information from the past):

\[ JOIN(v) = \bigcap_{w \in pred(v)} [\![w]\!] \]
Assignments are modeled as follows:

  X = E:   [[v]] = (JOIN(v) ∪ exps(E)) ↓ X

Here, the function ↓X removes all expressions that contain the variable X, and exps collects all nontrivial expressions:

  exps(X) = ∅
  exps(I) = ∅
  exps(input) = ∅
  exps(E1 op E2) = {E1 op E2} ∪ exps(E1) ∪ exps(E2)

No expressions are available at entry nodes:

  [[entry]] = ∅

Branch conditions and output statements accumulate more available expressions:

  if (E):
  while (E):    [[v]] = JOIN(v) ∪ exps(E)
  output E:
For all other kinds of nodes, the collected sets of expressions are simply propagated from the predecessors:

  [[v]] = JOIN(v)

Again, the right-hand sides of all constraints are monotone functions.

Exercise 5.31: Explain informally why the constraints are monotone and the analysis is sound.

For the example program, we generate the following constraints:

  [[entry]]         = ∅
  [[var x,y,z,a,b]] = [[entry]]
  [[z=a+b]]         = exps(a+b) ↓ z
  [[y=a*b]]         = ([[z=a+b]] ∪ exps(a*b)) ↓ y
  [[y>a+b]]         = ([[y=a*b]] ∩ [[x=a+b]]) ∪ exps(y>a+b)
  [[a=a+1]]         = ([[y>a+b]] ∪ exps(a+1)) ↓ a
  [[x=a+b]]         = ([[a=a+1]] ∪ exps(a+b)) ↓ x
  [[exit]]          = [[y>a+b]]

Using one of our fixed-point algorithms, we obtain the minimal solution:
  [[entry]]         = ∅
  [[var x,y,z,a,b]] = ∅
  [[z=a+b]]         = {a+b}
  [[y=a*b]]         = {a+b, a*b}
  [[y>a+b]]         = {a+b, y>a+b}
  [[a=a+1]]         = ∅
  [[x=a+b]]         = {a+b}
  [[exit]]          = {a+b, y>a+b}

The expressions available at the program point before a node v can be computed from this solution as JOIN(v). In particular, the solution confirms our previous observations about a+b. With this knowledge, an optimizing compiler could systematically transform the program into a (slightly) more efficient version:

  var x,y,z,a,b,aplusb;
  aplusb = a+b;
  z = aplusb;
  y = a*b;
  while (y > aplusb) {
    a = a+1;
    aplusb = a+b;
    x = aplusb;
  }

Exercise 5.32: Estimate the worst-case time complexity of available expressions analysis, assuming that the naive fixed-point algorithm is used.
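The solution for the example program can be reproduced with a short Python sketch. It rests on a few assumptions that are not part of the text: expressions are represented as strings, the ↓X function is approximated by a substring test (which only works here because every variable is a single letter), and the general assignment rule is applied to every assignment node. Since the lattice is ordered by ⊇, the iteration starts from the set of all expressions.

```python
ALL = {"a+b", "a*b", "y>a+b", "a+1"}      # bottom element under ⊇

def remove_var(exprs, x):
    """The ↓x function. Substring test: valid only for 1-letter names."""
    return {e for e in exprs if x not in e}

# node -> (predecessors, exps of the node's expression, assigned variable)
CFG = {
    "entry":         ([],                   set(),            None),
    "var x,y,z,a,b": (["entry"],            set(),            None),
    "z=a+b":         (["var x,y,z,a,b"],    {"a+b"},          "z"),
    "y=a*b":         (["z=a+b"],            {"a*b"},          "y"),
    "y>a+b":         (["y=a*b", "x=a+b"],   {"y>a+b", "a+b"}, None),
    "a=a+1":         (["y>a+b"],            {"a+1"},          "a"),
    "x=a+b":         (["a=a+1"],            {"a+b"},          "x"),
    "exit":          (["y>a+b"],            set(),            None),
}

def available_expressions(cfg):
    avail = {v: set(ALL) for v in cfg}
    changed = True
    while changed:
        changed = False
        for v, (pred, exps, assigned) in cfg.items():
            if not pred:
                new = set()                # [[entry]] = ∅
            else:
                join = set.intersection(*(avail[w] for w in pred))
                new = join | exps
                if assigned is not None:
                    new = remove_var(new, assigned)
            if new != avail[v]:
                avail[v] = new
                changed = True
    return avail

avail = available_expressions(CFG)
for node, exprs in avail.items():
    print(node, sorted(exprs))
```

The result matches the solution above: a+b is available after the loop condition, but nothing is available after a = a+1 because every expression in the program mentions a.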
Very Busy Expressions Analysis

An expression is very busy if it will definitely be evaluated again before its value changes. To approximate this property, we can use the same lattice and auxiliary functions as for available expressions analysis. For every CFG node v the variable [[v]] denotes the set of expressions that at the program point before the node definitely are busy.

An expression is very busy if it is evaluated in the current node or will be evaluated in all future executions unless an assignment changes its value. For this reason, the JOIN is defined by

\[ JOIN(v) = \bigcap_{w \in succ(v)} [\![w]\!] \]

and assignments are modeled using the following constraint rule:

  X = E:   [[v]] = JOIN(v) ↓ X ∪ exps(E)
RЕАCHING DЕFINITIОNS АNАLYSIS
No expressions are very busy at exit nodes:

  [[exit]] = ∅

The rules for the remaining nodes, including branch conditions and output statements, are the same as for available expressions analysis. On the example program:

  var x,a,b;
  x = input;
  a = x-1;
  b = x-2;
  while (x>0) {
    output a*b-x;
    x = x-1;
  }
  output a*b;

the analysis reveals that a*b is very busy inside the loop. The compiler can perform code hoisting and move the computation to the earliest program point where it is very busy. This would transform the program into this more efficient version:

  var x,a,b,atimesb;
  x = input;
  a = x-1;
  b = x-2;
  atimesb = a*b;
  while (x>0) {
    output atimesb-x;
    x = x-1;
  }
  output atimesb;
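As a rough check of the claim about a*b, the analysis can be run on the example program with a small Python sketch. The assumptions are the same kind as one would make for available expressions: expressions are strings, ↓X is approximated by a substring test that is only valid for single-letter variable names, and the CFG is encoded by hand.

```python
ALL = {"x-1", "x-2", "x>0", "a*b-x", "a*b"}   # bottom element under ⊇

# node -> (successors, exps of the node's expression, assigned variable)
CFG = {
    "var x,a,b":    (["x=input"],                     set(),            None),
    "x=input":      (["a=x-1"],                       set(),            "x"),
    "a=x-1":        (["b=x-2"],                       {"x-1"},          "a"),
    "b=x-2":        (["x>0"],                         {"x-2"},          "b"),
    "x>0":          (["output a*b-x", "output a*b"],  {"x>0"},          None),
    "output a*b-x": (["x=x-1"],                       {"a*b-x", "a*b"}, None),
    "x=x-1":        (["x>0"],                         {"x-1"},          "x"),
    "output a*b":   (["exit"],                        {"a*b"},          None),
    "exit":         ([],                              set(),            None),
}

def very_busy(cfg):
    busy = {v: set(ALL) for v in cfg}
    changed = True
    while changed:
        changed = False
        for v, (succ, exps, assigned) in cfg.items():
            if not succ:
                new = set()                    # [[exit]] = ∅
            else:
                new = set.intersection(*(busy[w] for w in succ))
                if assigned is not None:
                    # JOIN(v) ↓ X, via a substring test on 1-letter names
                    new = {e for e in new if assigned not in e}
                new = new | exps
            if new != busy[v]:
                busy[v] = new
                changed = True
    return busy

busy = very_busy(CFG)
for node, exprs in busy.items():
    print(node, sorted(exprs))
```

In this encoding a*b is very busy before the loop condition x>0 and before both output statements, but not before b = x-2, since that assignment changes the value of a*b. This is consistent with placing atimesb = a*b directly after b = x-2 in the transformed program.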
Reaching Definitions Analysis

The reaching definitions for a given program point are those assignments that may have defined the current values of variables. For this analysis we need a powerset lattice of all assignments (represented as CFG nodes) occurring in the program. For the example program from before:

  var x,y,z;
  x = input;
  while (x>1) {
    y = x/2;
    if (y>3) x = x-y;
    z = x-4;
    if (z>0) x = x/2;
    z = z-1;
  }
  output x;

the lattice modeling abstract states becomes:

  States = (2^{x=input, y=x/2, x=x-y, z=x-4, x=x/2, z=z-1}, ⊆)

For every CFG node v the variable [[v]] denotes the set of assignments that may define values of variables at the program point after the node. We define

\[ JOIN(v) = \bigcup_{w \in pred(v)} [\![w]\!] \]
For assignments the constraint is:

  X = E:   [[v]] = JOIN(v) ↓ X ∪ {X = E}

where this time the ↓X function removes all assignments to the variable X. For all other nodes we define:

  [[v]] = JOIN(v)

This analysis can be used to construct a def-use graph, which is like a CFG except that edges go from definitions (i.e. assignments) to possible uses. Here is the def-use graph for the example program:

[Figure: def-use graph over the nodes x = input; x>1; y = x/2; y>3; x = x−y; z = x−4; z>0; x = x/2; z = z−1; and output x.]
Thе dеf-usе grаph is а furthеr аbstrаctiоn оf thе prоgrаm аnd is thе bаsis оf widеly usеd оptimizаtiоns such аs dеаd cоdе еliminаtiоn аnd cоdе mоtiоn. Еxеrcisе 5.33: Shоw thаt thе dеf-usе grаph is аlwаys а subgrаph оf thе trаnsitivе clоsurе оf thе CFG.
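The reaching definitions constraints can be sketched in Python as follows. This is an illustration under assumptions: the CFG of the example program is encoded by hand, entry and exit nodes are omitted for brevity, and each assignment is identified by the name of its CFG node, as the text suggests.

```python
# node -> (predecessors, variable assigned by the node or None)
CFG = {
    "var x,y,z": ([],                    None),
    "x=input":   (["var x,y,z"],         "x"),
    "x>1":       (["x=input", "z=z-1"],  None),
    "y=x/2":     (["x>1"],               "y"),
    "y>3":       (["y=x/2"],             None),
    "x=x-y":     (["y>3"],               "x"),
    "z=x-4":     (["y>3", "x=x-y"],      "z"),
    "z>0":       (["z=x-4"],             None),
    "x=x/2":     (["z>0"],               "x"),
    "z=z-1":     (["z>0", "x=x/2"],      "z"),
    "output x":  (["x>1"],               None),
}

def reaching_definitions(cfg):
    reach = {v: set() for v in cfg}        # start at bottom: empty sets
    changed = True
    while changed:
        changed = False
        for v, (pred, assigned) in cfg.items():
            new = set().union(*(reach[w] for w in pred))
            if assigned is not None:
                # ↓X: drop earlier assignments to X, then add this one.
                new = {d for d in new if cfg[d][1] != assigned} | {v}
            if new != reach[v]:
                reach[v] = new
                changed = True
    return reach

reach = reaching_definitions(CFG)

# Definitions of x that may reach the use of x in "output x".
x_defs_at_output = {d for d in reach["output x"] if CFG[d][1] == "x"}
print(sorted(x_defs_at_output))
```

The last step hints at how def-use edges arise: for each use of a variable at a node, the incoming edges come from the reaching definitions of that variable. For output x, those are the three assignments to x in the program.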
FОRWАRD, BАCKWАRD, MАY, АND MUST
Forward, Backward, May, and Must

As illustrated in the previous sections, a dataflow analysis is specified by providing the lattice and the constraint rules. Some patterns are emerging from the examples, which makes it possible to classify dataflow analyses in various ways.

A forward analysis is one that for each program point computes information about the past behavior. Examples of this are sign analysis and available expressions analysis. They can be characterized by the right-hand sides of constraints only depending on predecessors of the CFG node. Thus, the analysis essentially starts at the entry node and propagates information forward in the CFG. For such analyses, the JOIN function is defined using pred, and dep (if using the work-list algorithm) can be defined by succ.

A backward analysis is one that for each program point computes information about the future behavior. Examples of this are live variables analysis and very busy expressions analysis. They can be characterized by the right-hand sides of constraints only depending on successors of the CFG node. Thus, the analysis starts at the exit node and moves backward in the CFG. For such analyses, the JOIN function is defined using succ, and dep can be defined by pred.

The distinction between forward and backward applies to any flow-sensitive analysis. For analyses that are based on a powerset lattice, we can also distinguish between may and must analysis.

A may analysis is one that describes information that may possibly be true and, thus, computes an over-approximation. Examples of this are live variables analysis and reaching definitions analysis. They can be characterized by the lattice order being ⊆ and constraint functions that use the ∪ operator to combine information.

Conversely, a must analysis is one that describes information that must definitely be true and, thus, computes an under-approximation. Examples of this are available expressions analysis and very busy expressions analysis. They can be characterized by the use of ⊇ as lattice order and constraint functions that use ∩ to combine information.

Thus, our four examples that are based on powerset lattices show every possible combination, as illustrated by this diagram:
          Forward                  Backward
  May     Reaching Definitions     Live Variables
  Must    Available Expressions    Very Busy Expressions
Thеsе clаssificаtiоns аrе mоstly bоtаnicаl, but аwаrеnеss оf thеm mаy prоvidе inspirаtiоn fоr cоnstructing nеw аnаlysеs.
Exercise 5.34: Which among the following analyses are distributive, if any?
(a) Available expressions analysis.
(b) Very busy expressions analysis.
(c) Reaching definitions analysis.
(d) Sign analysis.
(e) Constant propagation analysis.
Exercise 5.35: Let us design a flow-sensitive type analysis for TIP. In the simple version of TIP we focus on in this chapter, we only have integer values at runtime, but for the analysis we can treat the results of the comparison operators > and == as a separate type: boolean. The results of the arithmetic operators +, -, *, / can similarly be treated as type integer. As lattice for abstract states we choose

  States = Vars → 2^{integer, boolean}

such that the analysis can keep track of the possible types for every variable.
(a) Specify constraint rules for the analysis.
(b) After analyzing a given program, how can we check using the computed abstract states whether the branch conditions in if and while statements are guaranteed to be booleans? Similarly, how can we check that the arguments to the arithmetic operators +, -, *, / are guaranteed to be integers? As an example, for the following program two warnings should be emitted:

  main(a,b) {
    var x,y;
    x = a+b;
    if (x) {        // warning: using integer as branch condition
      output 17;
    }
    y = a>b;
    return y+3;     // warning: using boolean in addition
  }
INITIАLIZЕD VАRIАBLЕS АNАLYSIS
Exercise 5.36: Assume we want to build an optimizing compiler for TIP (without pointers, function calls, and records). As part of this, we want to safely approximate the possible values for each variable to be able to pick appropriate runtime representations: bool (can represent only the two integer values 0 and 1), byte (8 bit signed integers), char (16 bit unsigned integers), int (32 bit signed integers), or bigint (any integer). Naturally, we do not want to waste space, so we prefer, for example, bool to int if we can guarantee that the value of the variable can only be 0 or 1. As an extra feature, we introduce a cast operation in TIP: an expression of the form (T)E where T is one of the five types and E is an expression. At runtime, a cast expression evaluates to the same value as E, except that it aborts program execution if the value does not fit into T.
(a) Define a suitable lattice for describing abstract states.
(b) Specify the constraint rules for your analysis.
(c) Write a small but nontrivial TIP program that gives rise to several different types, and argue briefly what result your analysis will produce for that program.
Initialized Variables Analysis

Let us try to define an analysis that ensures that variables are initialized (i.e. written to) before they are read. (A similar analysis is performed by Java compilers to check that every local variable has a definitely assigned value when any access of its value occurs.) This can be achieved by computing for every program point the set of variables that are guaranteed to be initialized. We need definite information, which implies a must analysis. Consequently, we choose as abstract state lattice the powerset of variables occurring in the given program, ordered by the superset relation. Initialization is a property of the past, so we need a forward analysis. This means that our constraints are phrased in terms of predecessors and intersections. On this basis, the constraint rules more or less give themselves.

Exercise 5.37: What is the JOIN function for initialized variables analysis?

Exercise 5.38: Specify the constraint rule for assignments.

No other statements than assignments affect which variables are initialized, so the constraint rule for all other kinds of nodes is the same as, for example, in sign analysis (see page 50). Using the results from initialized variables analysis, a programming error
detection tool could now check for every use of a variable that it is contained in the computed set of initialized variables, and emit a warning otherwise. A warning would be emitted for this trivial example program:

  main() {
    var x;
    return x;
  }

Exercise 5.39: Write a TIP program where such an error detection tool would emit a spurious warning. That is, in your program there are no reads from uninitialized variables in any execution but the initialized variables analysis is too imprecise to show it.

Exercise 5.40: An alternative way to formulate initialized variables analysis would be to use the following map lattice instead of the powerset lattice:

  States = Vars → Init

where Init is a lattice with two elements {Initialized, NotInitialized}.
(a) How should we order the two elements? That is, which one is ⊤ and which one is ⊥?
(b) How should the constraint rule for assignments be modified to fit with this alternative lattice?
Transfer Functions

Observe that in all the analyses presented in this chapter, all constraint functions are of the form

  [[v]] = t_v(JOIN(v))

for some function t_v : L → L, where L is the lattice modeling abstract states and

\[ JOIN(v) = \bigsqcup_{w \in dep^{-1}(v)} [\![w]\!]. \]

The function t_v is called the transfer function for the CFG node v and specifies how the analysis models the statement at v as an abstract state transformer. For a forward analysis, which is the most common kind of dataflow analysis, the input to the transfer function represents the abstract state at the program point immediately before the statement, and its output represents the abstract state at the program point immediately after the statement (and conversely for a backward analysis). When specifying constraints for a dataflow analysis, it thus suffices to provide the transfer functions for all CFG nodes. As an example, in sign analysis where L = Vars → Sign, the transfer function for assignment nodes X = E is:

  t_{X=E}(s) = s[X ↦ eval(s, E)]
5.10 TRАNSFЕR FUNCTIОNS
In the simple work-list algorithm, JOIN(v) = ⊔_{w ∈ dep^{-1}(v)} [[w]] is computed in each iteration of the while-loop. However, often [[w]] has not changed since last time v was processed, so much of that computation may be redundant. (When we introduce inter-procedural analysis in Chapter 8, we shall see that dep^{-1}(v) may become large.) We now present another work-list algorithm based on transfer functions that avoids some of that redundancy. With this algorithm, for a forward analysis each variable xi denotes the abstract state for the program point before the corresponding CFG node vi, in contrast to the other fixed-point solvers we have seen previously where xi denotes the abstract state for the program point after vi (and conversely for a backward analysis).

  procedure PropagationWorkListAlgorithm(t1, ..., tn)
    (x1, ..., xn) := (⊥, ..., ⊥)
    W := {v1, ..., vn}
    while W ≠ ∅ do
      vi := W.removeNext()
      y := t_vi(xi)
      for each vj ∈ dep(vi) do
        z := xj ⊔ y
        if xj ≠ z then
          xj := z
          W.add(vj)
        end if
      end for
    end while
    return (x1, ..., xn)
  end procedure

Compared to the simple work-list algorithm, this variant typically avoids many redundant least-upper-bound computations. In each iteration of the while-loop, the transfer function of the current node vi is applied, and the resulting abstract state is propagated (hence the name of the algorithm) to all dependencies. Those that change are added to the work-list. We thereby have t_v([[v]]) ⊑ [[w]] for all nodes v, w where w ∈ succ(v).

Exercise 5.41: Prove that PropagationWorkListAlgorithm computes the same solution as the other fixed-point solvers. (Hint: recall the discussion from page 45 about solving systems of inequations.)
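A minimal Python sketch of the propagation work-list algorithm is shown below. It makes several assumptions for illustration: nodes are numbered from 0, a FIFO queue stands in for the nondeterministic 'removeNext', the least upper bound is passed in as a `join` function, and the three-node CFG with a back edge is made up. Each transfer function simply adds one label to the incoming set, as in a forward may analysis.

```python
from collections import deque

def propagation_worklist(transfer, dep, bottom, join):
    """xs[i] is the abstract state *before* node i (forward analysis).
    transfer[i] maps the state before node i to the state after it."""
    xs = [bottom] * len(transfer)
    work = deque(range(len(transfer)))
    while work:
        i = work.popleft()
        y = transfer[i](xs[i])         # state after node i
        for j in dep[i]:
            z = join(xs[j], y)         # propagate into each dependency
            if z != xs[j]:
                xs[j] = z
                work.append(j)         # duplicates in the queue are harmless
    return xs

# Made-up CFG: 0 -> 1 -> 2, with a back edge 2 -> 1.
transfer = [
    lambda s: s | frozenset("a"),
    lambda s: s | frozenset("b"),
    lambda s: s | frozenset("c"),
]
dep = {0: [1], 1: [2], 2: [1]}
before = propagation_worklist(transfer, dep, frozenset(),
                              lambda p, q: p | q)
print([sorted(s) for s in before])
```

Note that no JOIN over all predecessors is ever computed: each node only joins its own output into the states of its successors. To obtain the state after a node, as computed by the earlier solvers, apply its transfer function once more to the returned state.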
Chаptеr 6
Widеning А cеntrаl limitаtiоn оf thе mоnоtоnе frаmеwоrks аpprоаch prеsеntеd in Chаptеr 5 is thе rеquirеmеnt thаt thе lаtticеs hаvе finitе hеight. In this chаptеr wе dеscribе а tеchniquе cаllеd widеning thаt оvеrcоmеs thаt limitаtiоn (аnd а rеlаtеd tеchniquе cаllеd nаrrоwing), intrоducеd by Cоusоt аnd Cоusоt [CC77].
Interval Analysis

An interval analysis computes for every integer variable a lower and an upper bound for its possible values. Intervals are interesting analysis results, since sound answers can be used for optimizations and bug detection related to array bounds checking, numerical overflows, and integer representations. This example involves a lattice of infinite height, and we must use a special technique described in Section 6.2 to ensure convergence toward a fixed-point.

The lattice describing a single abstract value is defined as

  $Intervals = lift(\{[l, h] \mid l, h \in N \wedge l \le h\})$

where $N = \{-\infty, \ldots, -2, -1, 0, 1, 2, \ldots, \infty\}$ is the set of integers extended with infinite endpoints and the order on intervals is defined by inclusion:

  $[l_1, h_1] \sqsubseteq [l_2, h_2] \iff l_2 \le l_1 \wedge h_1 \le h_2$

This lattice looks as follows:
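[Figure: Hasse diagram of the Intervals lattice. The top element is [−∞, ∞]. Below it come the half-infinite intervals such as [−∞, 0] and [0, ∞], then finite intervals such as [−2, 2], [−2, 1], [−1, 2], [−2, 0], [−1, 1], [0, 2], [−2, −1], [−1, 0], [0, 1], [1, 2], then the singleton intervals [−2, −2], [−1, −1], [0, 0], [1, 1], [2, 2], and finally the bottom element ⊥.]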
This lattice does not have finite height, since it contains for example the following infinite chain:

  $[0, 0] \sqsubseteq [0, 1] \sqsubseteq [0, 2] \sqsubseteq [0, 3] \sqsubseteq [0, 4] \sqsubseteq [0, 5] \sqsubseteq \cdots$

This carries over to the lattice for abstract states:

  $States = Vars \to Intervals$

Before we specify the constraint rules, we define a function eval that performs an abstract evaluation of expressions:

  $eval(\sigma, X) = \sigma(X)$
  $eval(\sigma, I) = [I, I]$
  $eval(\sigma, \mathtt{input}) = [-\infty, \infty]$
  $eval(\sigma, E_1 \ op \ E_2) = \widehat{op}(eval(\sigma, E_1), eval(\sigma, E_2))$

The abstract arithmetical operators are all defined by:

  $\widehat{op}([l_1, h_1], [l_2, h_2]) = [\min_{x \in [l_1,h_1],\, y \in [l_2,h_2]} x \ op \ y,\ \max_{x \in [l_1,h_1],\, y \in [l_2,h_2]} x \ op \ y]$

For example, $\widehat{+}([1, 10], [-5, 7]) = [1 - 5, 10 + 7] = [-4, 17]$.

Exercise 6.1: Explain why the definition of eval given above is a conservative approximation compared to evaluating TIP expressions concretely. Give an example of how the definition could be modified to make the analysis more precise (and still sound).
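For addition and subtraction, the minimum and maximum in the definition above are attained at the interval end points, so the abstract operators can be computed directly. The following Python sketch illustrates this for a tiny expression representation. The tuple encoding of expressions, the names `add`, `sub` and `eval_expr`, and the omission of ⊥ are assumptions made for this illustration. It also assumes that a lower bound is never +∞ and an upper bound is never −∞, so no undefined ∞ − ∞ arises.

```python
import math

INF = math.inf

def add(a, b):
    """Abstract +: the sum is smallest/largest at the matching end points."""
    return (a[0] + b[0], a[1] + b[1])

def sub(a, b):
    """Abstract -: smallest value is l1 - h2, largest is h1 - l2."""
    return (a[0] - b[1], a[1] - b[0])

OPS = {"+": add, "-": sub}

def eval_expr(sigma, expr):
    """Abstract evaluation of a hypothetical tuple-encoded expression.

    ("var", name), ("const", n), ("input",), (op, e1, e2) with op in OPS.
    """
    kind = expr[0]
    if kind == "var":
        return sigma[expr[1]]
    if kind == "const":
        return (expr[1], expr[1])
    if kind == "input":
        return (-INF, INF)
    return OPS[kind](eval_expr(sigma, expr[1]), eval_expr(sigma, expr[2]))

example = add((1, 10), (-5, 7))   # the example from the text
shifted = eval_expr({"x": (1, 10)}, ("-", ("var", "x"), ("const", 3)))
unknown = eval_expr({}, ("+", ("input",), ("const", 1)))
```

Operators such as multiplication or comparison need more care, since the extreme values are not always attained at the same pair of end points.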
Exercise 6.2: This general definition of $\widehat{op}$ looks simple in math, but it is nontrivial to implement efficiently. Your task is to write pseudo-code for an implementation of the abstract greater-than operator $\widehat{>}$. (To be usable in practice, the execution time of your implementation should be less than linear in the input numbers!) It is acceptable to sacrifice optimal precision, but see how precise you can make it.

In Chapter 11 we provide a more formal treatment of the topics of soundness and precision.

The JOIN function is the usual one for forward analyses:

  $JOIN(v) = \bigsqcup_{w \in pred(v)} [\![w]\!]$
We can now specify the constraint rule for assignments:

  X = E:   $[\![v]\!] = JOIN(v)[X \mapsto eval(JOIN(v), E)]$

For all other nodes the constraint is the trivial one:

  $[\![v]\!] = JOIN(v)$

Exercise 6.3: Argue that the constraint functions are monotone.

The interval analysis lattice has infinite height, so applying the naive fixed-point algorithms may never terminate: for the lattice $L^n$, the sequence of approximants $f^i(\bot, \ldots, \bot)$ need never converge. A powerful technique to address this kind of problem is introduced in the next section.¹

Exercise 6.4: Give an example of a TIP program where none of the fixed-point algorithms terminate for the interval analysis as presented above.
¹ In the preceding chapter, we defined the analysis result for each analysis as the least solution to the analysis constraints. The fixed-point theorem (see page 43) tells us that this is well-defined: a unique least solution always exists for such analyses. However, for the interval analysis defined in this section, that fixed-point theorem does not apply, so the careful reader may wonder: even if we disregard the fact that the fixed-point algorithms cannot compute the solutions for all programs, do the interval analysis constraints actually have a least solution for any given program? The answer is affirmative. A variant of the fixed-point theorem that relies on transfinite iteration sequences holds without the finite-height assumption [CC79a].

Widening and Narrowing

To obtain convergence of the interval analysis presented in Section 6.1 we shall use a technique called widening. This technique generally works for any analysis that can be expressed using monotone equation systems, but it is typically used in flow-sensitive analyses with infinite-height lattices.

Let $f : L \to L$ denote the function from the fixed-point theorem and the naive fixed-point algorithm (Section 4.4). A particularly simple form of widening, which often suffices in practice, introduces a function $\omega : L \to L$ so that the sequence

  $(\omega \circ f)^i(\bot)$ for $i = 0, 1, \ldots$

is guaranteed to converge on a fixed-point that is larger than or equal to each approximant $f^i(\bot)$ of the naive fixed-point algorithm and thus represents sound information about the program. To ensure this property, it suffices that $\omega$ is monotone and extensive (see Exercise 4.16), and that the image $\omega(L) = \{y \in L \mid \exists x \in L : y = \omega(x)\}$ has finite height. The fixed-point algorithms can easily be adapted to use widening by applying $\omega$ in each iteration.

The widening function $\omega$ will intuitively coarsen the information sufficiently to ensure termination. For our interval analysis, $\omega$ is defined pointwise down to single intervals. It operates relative to a set B that consists of a finite set of integers together with $-\infty$ and $\infty$. Typically, B could be seeded with all the integer constants occurring in the given program, but other heuristics could also be used. For single intervals we define the function $\omega' : Intervals \to Intervals$ by

  $\omega'([l, h]) = [\max\{i \in B \mid i \le l\},\ \min\{i \in B \mid h \le i\}]$
  $\omega'(\bot) = \bot$

which finds the best fitting interval among the ones that are allowed.

As explained in Section 6.1, in the interval analysis the lattice L that the naive fixed-point algorithm works on is $L = States^n = (Vars \to Intervals)^n$ where n is the number of nodes in the program CFG. The widening function $\omega : L \to L$ then simply applies $\omega'$ to every interval in the given abstract states:

  $\omega(\sigma_1, \ldots, \sigma_n) = (\sigma'_1, \ldots, \sigma'_n)$ where $\sigma'_i(X) = \omega'(\sigma_i(X))$ for $i = 1, \ldots, n$ and $X \in Vars$

Exercise 6.5: Show that interval analysis with widening, using this definition of $\omega$, always terminates and yields a solution that is a safe approximation of the ideal result.

Widening generally shoots above the target, but a subsequent technique called narrowing may improve the result. If we define

  $fix = \bigsqcup_i f^i(\bot)$        $fix_\omega = \bigsqcup_i (\omega \circ f)^i(\bot)$

then we have $fix \sqsubseteq fix_\omega$. However, we also have that $fix \sqsubseteq f(fix_\omega) \sqsubseteq fix_\omega$, which means that a subsequent application of f may improve our result and still produce sound information. This technique, called narrowing, may in fact be iterated arbitrarily many times.
Exercise 6.6: Show that $\forall i : fix \sqsubseteq f^{i+1}(fix_\omega) \sqsubseteq f^i(fix_\omega) \sqsubseteq fix_\omega$.

An example will demonstrate the benefits of these techniques. Consider this program:

  y = 0; x = 7; x = x+1;
  while (input) {
    x = 7;
    x = x+1;
    y = y+1;
  }

Without widening, the naive fixed-point algorithm will produce the following diverging sequence of approximants for the program point after the while-loop:

  $[x \mapsto \bot, y \mapsto \bot]$
  $[x \mapsto [8, 8], y \mapsto [0, 1]]$
  $[x \mapsto [8, 8], y \mapsto [0, 2]]$
  $[x \mapsto [8, 8], y \mapsto [0, 3]]$
  ...

If we apply widening, based on the set $B = \{-\infty, 0, 1, 7, \infty\}$ seeded with the constants occurring in the program, then we obtain a converging sequence:

  $[x \mapsto \bot, y \mapsto \bot]$
  $[x \mapsto [7, \infty], y \mapsto [0, 1]]$
  $[x \mapsto [7, \infty], y \mapsto [0, 7]]$
  $[x \mapsto [7, \infty], y \mapsto [0, \infty]]$

However, the result for x is discouraging. Fortunately, a few iterations of narrowing quickly improve the result:

  $[x \mapsto [8, 8], y \mapsto [0, \infty]]$

Exercise 6.7: Exactly how many narrowing steps are necessary to reach this solution?

This result is really the best we could hope for, for this program. For that reason, further narrowing has no effect. However, in general, the decreasing sequence

  $fix_\omega \sqsupseteq f(fix_\omega) \sqsupseteq f^2(fix_\omega) \sqsupseteq f^3(fix_\omega) \sqsupseteq \cdots$

is not guaranteed to converge, so heuristics must determine how many times to apply narrowing.

Exercise 6.8: Give an example of a TIP program where the narrowing sequence diverges for the interval analysis, when using widening followed by narrowing.
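The interplay of simple widening and narrowing can be tried out on a heavily simplified model of the example. The Python sketch below is an assumption-laden illustration: instead of one constraint variable per CFG node, it collapses the loop into a single abstract state with a hypothetical function `f` (x is always 7+1, y is either 0 or the previous y plus 1). The intermediate approximants therefore differ from the sequences listed above, and the number of narrowing steps in this collapsed model says nothing about the full constraint system.

```python
import math

INF = math.inf
B = [-INF, 0, 1, 7, INF]   # the set B from the text
BOT = None                 # bottom element of Intervals

def join(a, b):
    """Least upper bound of two intervals."""
    if a is BOT:
        return b
    if b is BOT:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def omega1(iv):
    """Simple widening on one interval: snap both ends outward to B."""
    if iv is BOT:
        return BOT
    l, h = iv
    return (max(i for i in B if i <= l), min(i for i in B if h <= i))

def omega(state):
    return {var: omega1(iv) for var, iv in state.items()}

def f(state):
    """Collapsed one-state model of the example loop (an assumption)."""
    y = state["y"]
    y_after_body = BOT if y is BOT else (y[0] + 1, y[1] + 1)
    return {"x": (8, 8), "y": join((0, 0), y_after_body)}

def iterate(step, state):
    """Apply step until the state no longer changes."""
    while True:
        nxt = step(state)
        if nxt == state:
            return state
        state = nxt

bottom = {"x": BOT, "y": BOT}
fix_w = iterate(lambda s: omega(f(s)), bottom)   # widening phase
narrowed = f(fix_w)                              # one narrowing step
```

Here `fix_w` is imprecise for x because 8 is not in B, and a single application of `f` recovers the interval [8, 8] while y stays at [0, ∞].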
The simple kind of widening discussed above is sometimes unnecessarily aggressive: widening every interval in every abstract state in each iteration of the fixed-point algorithm is not necessary to ensure convergence. For this reason, traditional widening takes a more sophisticated approach that may lead to better analysis precision. It involves a binary operator, $\nabla$:

  $\nabla : L \times L \to L$

The widening operator $\nabla$ (usually written with infix notation) must satisfy $\forall x, y \in L : x \sqsubseteq x \nabla y \wedge y \sqsubseteq x \nabla y$ (meaning that it is an upper bound operator) and that for any increasing sequence $z_0 \sqsubseteq z_1 \sqsubseteq z_2 \sqsubseteq \cdots$, the sequence $y_0, y_1, y_2, \ldots$ defined by $y_0 = z_0$ and $y_{i+1} = y_i \nabla z_{i+1}$ for $i = 0, 1, \ldots$ converges after a finite number of steps. With such an operator, we can obtain a safe approximation of the least fixed-point of f by computing the following sequence:

  $x_0 = \bot$
  $x_{i+1} = x_i \nabla f(x_i)$

This sequence eventually converges, that is, for some k we have $x_{k+1} = x_k$. Furthermore, the result is a safe approximation of the ordinary fixed-point: $fix \sqsubseteq x_k$.

Exercise 6.9: Prove that if $\nabla$ is a widening operator (satisfying the criteria defined above), then $x_{k+1} = x_k$ and $fix \sqsubseteq x_k$ for some k.

This leads us to the following variant of the naive fixed-point algorithm with (traditional) widening:

  procedure NaiveFixedPointAlgorithmWithWidening(f)
    x := ⊥
    while x ≠ f(x) do
      x := x ∇ f(x)
    end while
    return x
  end procedure

The other fixed-point algorithms (Section 5.3) can be extended with this form of widening in a similar manner. Note that if we choose as a special case $\nabla = \sqcup$, the computation of $x_0, x_1, \ldots$ proceeds exactly as with the ordinary naive fixed-point algorithm.

Exercise 6.10: Show that $\sqcup$ is a widening operator (albeit perhaps not a very useful one) if L has finite height.

The idea of using the binary widening operator $\nabla$ is that it allows us to combine abstract information from the previous and the current iteration of the fixed-point computation (corresponding to the left-hand argument and the right-hand argument, respectively), and only coarsen abstract values that are unstable.

For the interval analysis we can for example define $\nabla$ as follows. We first define a widening operator $\nabla' : Intervals \times Intervals \to Intervals$ on single intervals:

  $\bot \nabla' y = y$
  $x \nabla' \bot = x$
  $[l_1, h_1] \nabla' [l_2, h_2] = [l_3, h_3]$

where

  $l_3 = \begin{cases} l_1 & \text{if } l_1 \le l_2 \\ \max\{i \in B \mid i \le l_2\} & \text{otherwise} \end{cases}$

and

  $h_3 = \begin{cases} h_1 & \text{if } h_2 \le h_1 \\ \min\{i \in B \mid h_2 \le i\} & \text{otherwise} \end{cases}$
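The operator ∇' and the iteration $x_{i+1} = x_i \nabla f(x_i)$ can be sketched in Python as follows. As an assumption for illustration, the sketch reuses B = {−∞, 0, 1, 7, ∞} and a hypothetical one-state model `f` of the earlier example loop (x is always 7+1, y is 0 or the previous y plus 1) rather than a full CFG-based constraint system. The loop stops when $x \nabla f(x) = x$, which is the convergence criterion $x_{k+1} = x_k$ stated above.

```python
import math

INF = math.inf
B = [-INF, 0, 1, 7, INF]   # assumed set of widening thresholds
BOT = None                 # bottom element of Intervals

def join(a, b):
    if a is BOT:
        return b
    if b is BOT:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen1(a, b):
    """Binary widening on single intervals: only unstable ends are coarsened."""
    if a is BOT:
        return b
    if b is BOT:
        return a
    (l1, h1), (l2, h2) = a, b
    l3 = l1 if l1 <= l2 else max(i for i in B if i <= l2)
    h3 = h1 if h2 <= h1 else min(i for i in B if h2 <= i)
    return (l3, h3)

def widen(s, t):
    """Pointwise lifting of widen1 to abstract states."""
    return {var: widen1(s[var], t[var]) for var in s}

def f(state):
    """Collapsed one-state model of the example loop (an assumption)."""
    y = state["y"]
    y_after_body = BOT if y is BOT else (y[0] + 1, y[1] + 1)
    return {"x": (8, 8), "y": join((0, 0), y_after_body)}

def fixpoint_with_widening(f, bottom):
    x = bottom
    while True:
        nxt = widen(x, f(x))
        if nxt == x:           # x_{k+1} = x_k: the sequence has converged
            return x
        x = nxt

solution = fixpoint_with_widening(f, {"x": BOT, "y": BOT})
```

In this model the interval for x is stable from the first iteration and is therefore never coarsened, so no narrowing phase is needed to obtain [8, 8].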
Compared to the definition of $\omega'$ for simple widening (see page 78), we now coarsen the interval end points only if they are unstable compared to the last iteration. Intuitively, an interval that does not become larger during an iteration of the fixed-point computation cannot be responsible for divergence.

Now we can define $\nabla$ based on $\nabla'$, similarly to how we previously defined $\omega$ pointwise in terms of $\omega'$:

  $(\sigma_1, \ldots, \sigma_n) \nabla (\sigma'_1, \ldots, \sigma'_n) = (\sigma''_1, \ldots, \sigma''_n)$ where $\sigma''_i(X) = \sigma_i(X) \nabla' \sigma'_i(X)$ for $i = 1, \ldots, n$ and $X \in Vars$

Exercise 6.11: Show that this definition of $\nabla$ for the interval analysis satisfies the requirements for being a widening operator.

With this more advanced form of widening but without using narrowing, for the small example program from page 79 we obtain the same analysis result as with the combination of simple widening and narrowing we looked at earlier.

Exercise 6.12: Explain why the "simple" form of widening (using the unary $\omega$ operator) is just a special case of the "traditional" widening mechanism (using the binary $\nabla$ operator).

With the simple form of widening, the analysis effectively just uses a finite subset of L. In contrast, the traditional form of widening is fundamentally more powerful: although each program being analyzed uses only finitely many elements of L, no finite-height subset suffices for all programs [CC92].

We can be even more clever by observing that divergence can only appear in presence of recursive dataflow constraints (see Section 5.1) and apply widening
only at, for example, CFG nodes that are loop heads.² In the above definition of $\nabla$, this means changing the definition of $\sigma''_i$ to

  $\sigma''_i(X) = \begin{cases} \sigma_i(X) \nabla' \sigma'_i(X) & \text{if node } i \text{ is a loop head} \\ \sigma'_i(X) & \text{otherwise} \end{cases}$

Exercise 6.13: Argue why applying widening only at CFG loop heads suffices for guaranteeing convergence of the fixed-point computation. Then give an example of a program where this improves precision for the interval analysis, compared to widening at all CFG nodes.

Exercise 6.14: We can define another widening operator for interval analysis that does not require a set B of integer constants. In the definition of $\nabla'$ and $\nabla$ from page 81, we simply change $l_3$ and $h_3$ as follows:

  $l_3 = \begin{cases} l_1 & \text{if } l_1 \le l_2 \\ -\infty & \text{otherwise} \end{cases}$

and

  $h_3 = \begin{cases} h_1 & \text{if } h_2 \le h_1 \\ \infty & \text{otherwise} \end{cases}$

Intuitively, this widening coarsens unstable intervals to $\pm\infty$.
(a) Argue that after this change, $\nabla$ still satisfies the requirements for being a widening operator.
(b) Give an example of a program that is analyzed less precisely after this change.

² As long as we ignore function calls and only analyze individual functions, the loop heads are the while nodes in CFGs for TIP programs. If we also consider interprocedural analysis (Chapter 8) then recursive function calls must also be taken into account.
Chаptеr 7
Path Sensitivity and Relational Analysis

Until now, we have ignored the values of branch and loop conditions by simply treating if- and while-statements as a nondeterministic choice between the two branches, which is called control insensitive analysis. Such analyses are also path insensitive, because they do not distinguish different paths that lead to a given program point. The information about branches and paths can be important for precision. Consider for example the following program:

  x = input;
  y = 0;
  z = 0;
  while (x > 0) {
    z = z+x;
    if (17 > y) { y = y+1; }
    x = x-1;
  }

The previous interval analysis (with widening) will conclude that after the while-loop, the variable x is in the interval $[-\infty, \infty]$, y is in the interval $[0, \infty]$, and z is in the interval $[-\infty, \infty]$. However, in view of the conditionals being used, this result is too pessimistic.

Exercise 7.1: What would be the ideal (i.e., most precise, yet sound) analysis result for x, y, and z at the exit program point in the example above, when using the Intervals lattice to describe abstract values? (Later in this chapter we shall see an improved interval analysis that obtains that result.)
Control Sensitivity using Assertions

To exploit the information available in conditionals, we shall extend the language with an artificial statement, assert(E), where E is a boolean expression. This statement will abort execution at runtime if E is false and otherwise have no effect. However, we shall only insert it at places where E is guaranteed to be true. In the interval analysis, the constraints for these new statements will narrow the intervals for the various variables by exploiting information in conditionals.

For the example program, the meanings of the conditionals can be encoded by the following program transformation:

  x = input;
  y = 0;
  z = 0;
  while (x > 0) {
    assert(x > 0);
    z = z+x;
    if (17 > y) { assert(17 > y); y = y+1; }
    x = x-1;
  }
  assert(!(x > 0));

(We here also extend TIP with a unary negation operator !.) It is always safe to ignore the assert statements, which amounts to this trivial constraint rule:

  assert(E):   $[\![v]\!] = JOIN(v)$

With that constraint rule, no extra precision is gained. It requires insight into the specific static analysis to define nontrivial and sound constraints for assertions.

For the interval analysis, extracting the information carried by general conditions, or predicates, such as $E_1 > E_2$ or $E_1 == E_2$ relative to the lattice elements is complicated and in itself an area of considerable study. For simplicity, let us consider conditions only of the two kinds X > E and E > X. The former kind of assertion can be handled by the constraint rule

  assert(X > E):   $[\![v]\!] = JOIN(v)[X \mapsto gt(JOIN(v)(X), eval(JOIN(v), E))]$

where gt models the greater-than operator:

  $gt([l_1, h_1], [l_2, h_2]) = [l_1, h_1] \sqcap [l_2, \infty]$
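The rule for assert(X > E) can be sketched in Python as follows. The tuple representation of intervals, the use of `None` for ⊥, and the helper names `meet` and `assert_gt` are assumptions made for this illustration. The text does not say how gt treats ⊥ arguments, so returning ⊥ in that case is also an assumption.

```python
import math

INF = math.inf
BOT = None   # bottom element of Intervals

def meet(a, b):
    """Greatest lower bound of two intervals (their intersection)."""
    if a is BOT or b is BOT:
        return BOT
    l, h = max(a[0], b[0]), min(a[1], b[1])
    return (l, h) if l <= h else BOT

def gt(a, b):
    """gt([l1,h1],[l2,h2]) = [l1,h1] meet [l2, inf], as defined in the text."""
    if a is BOT or b is BOT:
        return BOT            # assumption: bottom in, bottom out
    return meet(a, (b[0], INF))

def assert_gt(state, var, bound):
    """Constraint for assert(var > E), where bound = eval(state, E)."""
    refined = dict(state)
    refined[var] = gt(state[var], bound)
    return refined

before = {"x": (-INF, INF), "y": (0, 20)}
after = assert_gt(before, "x", (0, 0))        # models assert(x > 0)
infeasible = gt((0, 5), (10, 20))             # no value in [0,5] exceeds 10
```

Note that, following the definition literally, the lower bound becomes $l_2$ rather than $l_2 + 1$.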
Еxеrcisе 7.2: Аrguе thаt this cоnstrаint fоr аssеrt is sоund аnd mоnоtоnе. Еxеrcisе 7.3: Spеcify а cоnstrаint rulе fоr аssеrt(Е > X).
Negated conditions are handled in similar fashions, and all other conditions are given the trivial constraint by default. With this refinement, the interval analysis of the above example will conclude that after the while-loop the variable x is in the interval $[-\infty, 0]$, y is in the interval $[0, 17]$, and z is in the interval $[0, \infty]$.

Exercise 7.4: Discuss how more conditions may be given nontrivial constraints for assert to improve analysis precision further.

As the analysis now takes the information in the branch conditions into account, this kind of analysis is called control sensitive (or branch sensitive). An alternative approach to control sensitivity that does not involve assert statements is to model each branch node in the CFG using two constraint variables instead of just one, corresponding to the two different outcomes of the evaluation of the branch condition. Another approach is to associate dataflow constraints with CFG edges instead of nodes. The technical details of such approaches will be different compared to the approach taken here, but the overall idea is the same.
Paths and Relations

Control sensitivity is insufficient for reasoning about relational properties that can arise due to branches in the programs. Here is a typical example:

  if (condition) {
    open();
    flag = 1;
  } else {
    flag = 0;
  }
  ...
  if (flag) {
    close();
  }

We here assume that open and close are built-in functions for opening and closing a specific file. (A more realistic setting with multiple files can be handled using techniques presented in Chapter 10.) The file is initially closed, condition is some complex expression, and the "..." consists of statements that do not call open or close or modify flag. We wish to design an analysis that can check that close is only called if the file is currently open, that open is only called if the file is currently closed, and that the file is definitely closed at the program exit. In this example program, the critical property is that the branch containing the call to close is taken only if the branch containing the call to open was taken earlier in the execution.

As a starting point, we use this powerset lattice for modeling the open/closed status of the file:

  $L = 2^{\{open, closed\}}$

(The lattice order is implicitly $\subseteq$ for a powerset lattice.) For example, the lattice element {open} means that the file is definitely not closed, and {open, closed} means that the status of the file is unknown. For every CFG node v the variable $[\![v]\!]$ denotes the possible status of the file at the program point after the node. For open and close statements the constraints are:

  $[\![\mathtt{open()}]\!] = \{open\}$
  $[\![\mathtt{close()}]\!] = \{closed\}$

For the entry node, we define:

  $[\![entry]\!] = \{closed\}$

and for every other node, which does not modify the file status, the constraint is simply

  $[\![v]\!] = JOIN(v)$

where JOIN is defined as usual for a forward, may analysis:

  $JOIN(v) = \bigcup_{w \in pred(v)} [\![w]\!]$
In the example program, the close function is clearly called if and only if open is called, but the current analysis fails to discover this.

Exercise 7.5: Write the constraints being produced for the example program and show that the solution for $[\![\mathtt{flag}]\!]$ (the node for the last if condition) is {open, closed}.

Arguing that the program has the desired property obviously involves the flag variable, which the lattice above ignores. So, we can try with a slightly more sophisticated lattice, a product lattice of two powerset lattices that keeps track of both the status of the file and the value of the flag:

  $L' = 2^{\{open, closed\}} \times 2^{\{flag = 0,\ flag \neq 0\}}$

(The lattice order is implicitly defined as the pointwise subset ordering of the two powersets.) For example, the lattice element {flag ≠ 0} in the right-most sub-lattice means that flag is definitely not 0, and {flag = 0, flag ≠ 0} means that the value of flag is unknown. Additionally, we insert assert statements to model the conditionals:

  if (condition) {
    assert(condition);
    open();
    flag = 1;
  } else {
    assert(!condition);
    flag = 0;
  }
  ...
  if (flag) {
    assert(flag);
    close();
  } else {
    assert(!flag);
  }

This is still insufficient, though. At the program point after the first if-else statement, the analysis only knows that open may have been called and flag may be 0.

Exercise 7.6: Specify the constraints that fit with the L' lattice. Then show that the analysis produces the lattice element ({open, closed}, {flag = 0, flag ≠ 0}) at the program point after the first if-else statement.

The present analysis is also called an independent attribute analysis as the abstract value of the file is independent of the abstract value of the boolean flag. What we need is a relational analysis that can keep track of relations between variables. One approach to achieve this is by generalizing the analysis to maintain multiple abstract states per program point. If L is the original lattice as defined above, we replace it by the map lattice

  $L'' = Paths \to L$

where Paths is a finite set of path contexts. A path context is typically a predicate over the program state.¹ (For instance, a condition expression in TIP defines such a predicate.) In general, each statement is then analyzed in |Paths| different path contexts, each describing a set of paths that lead to the statement, which is why this kind of analysis is called path sensitive. For the example above, we can use Paths = {flag = 0, flag ≠ 0}.

The constraints for open, close, and entry can now be defined as follows.²

  $[\![\mathtt{open()}]\!] = \lambda p.\{open\}$
  $[\![\mathtt{close()}]\!] = \lambda p.\{closed\}$
  $[\![entry]\!] = \lambda p.\{closed\}$

¹ Another way to select Paths is to use sequences of branch nodes.
² We here use the lambda abstraction notation to denote a function: if $f = \lambda x.e$ then $f(x) = e$. Thus, $\lambda p.\{open\}$ is the function that returns {open} for any input p.

The constraints for assignments make sure that flag gets special treatment:
  flag = 0:   $[\![v]\!] = [flag = 0 \mapsto \bigcup_{p \in Paths} JOIN(v)(p),\ flag \neq 0 \mapsto \emptyset]$

  flag = I:   $[\![v]\!] = [flag \neq 0 \mapsto \bigcup_{p \in Paths} JOIN(v)(p),\ flag = 0 \mapsto \emptyset]$

  flag = E:   $[\![v]\!] = \lambda q. \bigcup_{p \in Paths} JOIN(v)(p)$

Here, I is an integer constant other than 0 and E is a non-integer-constant expression. The definition of JOIN follows from the lattice structure and from the analysis being forward:

  $JOIN(v)(p) = \bigcup_{w \in pred(v)} [\![w]\!](p)$
The constraint for the case flag = 0 models the fact that flag is definitely 0 after the statement, so the open/closed information is obtained from the predecessors, independent of whether flag was 0 or not before the statement. Also, the open/closed information is set to the bottom element $\emptyset$ for flag ≠ 0 because that path context is infeasible at the program point after flag = 0. The constraint for flag = I is dual, and the last constraint covers the cases where flag is assigned an unknown value.

For assert statements, we also give special treatment to flag:

  assert(flag):   $[\![v]\!] = [flag \neq 0 \mapsto JOIN(v)(flag \neq 0),\ flag = 0 \mapsto \emptyset]$

Notice the small but important difference compared to the constraint for flag = 1 statements. As before, the case for negated expressions is similar.

Exercise 7.7: Give an appropriate constraint for assert(!flag).

Finally, for any other node v, including other assert statements, the constraint keeps the dataflow information for different path contexts apart but otherwise simply propagates the information from the predecessors in the CFG:

  $[\![v]\!] = \lambda p. JOIN(v)(p)$

Although this is sound, we could make more precise constraints for assert nodes by recognizing other patterns that fit into the abstraction given by the lattice.

For our example program, the following constraints are generated:

  $[\![entry]\!] = \lambda p.\{closed\}$
  $[\![\mathtt{condition}]\!] = [\![entry]\!]$
  $[\![\mathtt{assert(condition)}]\!] = [\![\mathtt{condition}]\!]$
  $[\![\mathtt{open()}]\!] = \lambda p.\{open\}$
  $[\![\mathtt{flag = 1}]\!] = [flag \neq 0 \mapsto \bigcup_{p \in Paths} [\![\mathtt{open()}]\!](p),\ flag = 0 \mapsto \emptyset]$
  $[\![\mathtt{assert(!condition)}]\!] = [\![\mathtt{condition}]\!]$
  $[\![\mathtt{flag = 0}]\!] = [flag = 0 \mapsto \bigcup_{p \in Paths} [\![\mathtt{assert(!condition)}]\!](p),\ flag \neq 0 \mapsto \emptyset]$
  $[\![\mathtt{...}]\!] = \lambda p.([\![\mathtt{flag = 1}]\!](p) \cup [\![\mathtt{flag = 0}]\!](p))$
  $[\![\mathtt{flag}]\!] = [\![\mathtt{...}]\!]$
  $[\![\mathtt{assert(flag)}]\!] = [flag \neq 0 \mapsto [\![\mathtt{flag}]\!](flag \neq 0),\ flag = 0 \mapsto \emptyset]$
  $[\![\mathtt{close()}]\!] = \lambda p.\{closed\}$
  $[\![\mathtt{assert(!flag)}]\!] = [flag = 0 \mapsto [\![\mathtt{flag}]\!](flag = 0),\ flag \neq 0 \mapsto \emptyset]$
  $[\![exit]\!] = \lambda p.([\![\mathtt{close()}]\!](p) \cup [\![\mathtt{assert(!flag)}]\!](p))$

The minimal solution is, for each $[\![v]\!](p)$:

                            flag = 0     flag ≠ 0
  [[entry]]                 {closed}     {closed}
  [[condition]]             {closed}     {closed}
  [[assert(condition)]]     {closed}     {closed}
  [[open()]]                {open}       {open}
  [[flag = 1]]              ∅            {open}
  [[assert(!condition)]]    {closed}     {closed}
  [[flag = 0]]              {closed}     ∅
  [[...]]                   {closed}     {open}
  [[flag]]                  {closed}     {open}
  [[assert(flag)]]          ∅            {open}
  [[close()]]               {closed}     {closed}
  [[assert(!flag)]]         {closed}     ∅
  [[exit]]                  {closed}     {closed}

The analysis produces the lattice element [flag = 0 ↦ {closed}, flag ≠ 0 ↦ {open}] for the program point after the first if-else statement. The constraint for the assert(flag) statement will eliminate the possibility that the file is closed at that point. This ensures that close is only called if the file is open, as desired.
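Since the constraints generated for this example contain no cycles, their least solution can be obtained by evaluating them once in program order. The Python sketch below does exactly that. The string encoding of path contexts and the helper names (`const`, `assign_flag`, `assert_flag`, `join`) are assumptions made for this illustration, and the "..." node is modeled as a plain join of its two predecessors.

```python
PATHS = ("flag=0", "flag!=0")      # the two path contexts
OPEN, CLOSED = "open", "closed"
EMPTY = frozenset()

def const(status):
    """lambda p. status: the same file status in every path context."""
    return {p: frozenset(status) for p in PATHS}

def union_all(state):
    """Union of the file status over all path contexts."""
    return frozenset().union(*state.values())

def assign_flag(pred, zero):
    """flag = 0 (zero=True) or flag = I for a nonzero constant I."""
    target = "flag=0" if zero else "flag!=0"
    return {p: (union_all(pred) if p == target else EMPTY) for p in PATHS}

def assert_flag(pred, negated):
    """assert(flag) or assert(!flag): keep one context, empty the other."""
    keep = "flag=0" if negated else "flag!=0"
    return {p: (pred[p] if p == keep else EMPTY) for p in PATHS}

def join(a, b):
    return {p: a[p] | b[p] for p in PATHS}

sol = {}
sol["entry"] = const({CLOSED})
sol["condition"] = sol["entry"]
sol["assert(condition)"] = sol["condition"]
sol["open()"] = const({OPEN})
sol["flag = 1"] = assign_flag(sol["open()"], zero=False)
sol["assert(!condition)"] = sol["condition"]
sol["flag = 0"] = assign_flag(sol["assert(!condition)"], zero=True)
sol["..."] = join(sol["flag = 1"], sol["flag = 0"])
sol["flag"] = sol["..."]
sol["assert(flag)"] = assert_flag(sol["flag"], negated=False)
sol["close()"] = const({CLOSED})
sol["assert(!flag)"] = assert_flag(sol["flag"], negated=True)
sol["exit"] = join(sol["close()"], sol["assert(!flag)"])
```

The entry `sol["assert(flag)"]`, which describes the state just before the call to close, contains only the status open, and `sol["exit"]` contains only closed in both path contexts.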
Exercise 7.8: For the present example, the basic lattice L is defined as a powerset of a finite set A = {open, closed}.
(a) Show that $Paths \to 2^A$ is isomorphic to $2^{Paths \times A}$ for any finite set A. (This explains why such analyses are called relational: each element of $2^{Paths \times A}$ is a (binary) relation between Paths and A.)
(b) Reformulate the analysis using the lattice $2^{\{flag = 0,\ flag \neq 0\} \times \{open, closed\}}$ instead of L'' (without affecting the analysis precision).

Exercise 7.9: Describe a variant of the example program above where the present analysis would be improved if combining it with constant propagation.

In general, the program analysis designer is left with the choice of Paths. Often, Paths consists of combinations of predicates that appear in conditionals in the program. This quickly results in an exponential blow-up: for k predicates,
each statement may need to be analyzed in $2^k$ different path contexts. In practice, however, there is usually much redundancy in these analysis steps. Thus, in addition to the challenge of reasoning about the assert predicates relative to the lattice elements, it requires a considerable effort to avoid too many redundant computations in path sensitive analysis. One approach is iterative refinement where Paths is initially a single universal path context, which is then iteratively refined by adding relevant predicates until either the desired properties can be established or disproved or the analysis is unable to select relevant predicates and hence gives up [BR02].

Exercise 7.10: Assume that we change the rule for open from

  $[\![\mathtt{open()}]\!] = \lambda p.\{open\}$

to

  $[\![\mathtt{open()}]\!] = \lambda p.\ \text{if } JOIN(v)(p) = \emptyset \text{ then } \emptyset \text{ else } \{open\}$

Argue that this is sound and for some programs more precise than the original rule.

Exercise 7.11: The following is a variant of the previous example program:

  if (condition) {
    flag = 1;
  } else {
    flag = 0;
  }
  ...
  if (flag) {
    open();
  }
  ...
  if (flag) {
    close();
  }

(Again, assume that "..." are statements that do not call open or close or modify flag.) Is the path sensitive analysis described in this section capable of showing also for this program that close is called only if the file is open?
Exercise 7.12: Construct yet another variant of the open/close example program where the desired property can only be established with a choice of Paths that includes a predicate that does not occur as a conditional expression in the program source. (Such a program may be challenging to handle with iterative refinement techniques.)
Exercise 7.13: The following TIP code computes the absolute value of x:

  if (x < 0) {
    sgn = -1;
  } else {
    sgn = 1;
  }
  y = x * sgn;

Design an analysis (i.e., define a lattice and describe the relevant constraint rules) that is able to show that y is always positive or zero after the last assignment in this program.
Chаptеr 8
Interprocedural Analysis

So far, we have only analyzed the body of individual functions, which is called intraprocedural analysis. We now consider interprocedural analysis of whole programs containing multiple functions and function calls.
8.1
Intеrprоcеdurаl Cоntrоl Flоw Grаphs
We use the subset of the TIP language containing functions, but still ignore pointers and functions as values. As we shall see, the CFG for an entire program is then quite simple to obtain. It becomes more complicated when adding function values, which we discuss in Chapter 9.

First we construct the CFGs for all individual function bodies as usual. All that remains is then to glue them together to reflect function calls properly. We need to take care of parameter passing, return values, and values of local variables across calls. For simplicity we assume that all function calls are performed in connection with assignments:

  X = f(E1, ..., En);

Exercise 8.1: Show how any program can be normalized (cf. Section 2.3) to have this form.

In the CFG, we represent each function call statement using two nodes: a call node representing the connection from the caller to the entry of f, and an after-call node where execution resumes after returning from the exit of f:

[Figure: the call node, labeled "= f(E1,...,En)", followed by the after-call node, labeled "X =".]

Next, we represent each return statement return E; as an assignment using a special variable named result:

  result = E

As discussed in Section 2.5, CFGs can be constructed such that there is always a unique entry node and a unique exit node for each function. We can now glue together the caller and the callee as follows:

[Figure: the call node "= f(E1,...,En)" is connected to the entry node of f(b1, ..., bn), and the exit of f, reached through the node "result = E", is connected to the after-call node "X =".]

The connection between the call node and its after-call node is represented by a special edge (not in succ and pred), which we need for propagating abstract values for local variables of the caller.

With this interprocedural CFG in place, we can apply the monotone framework. Examples are given in the following sections.
Exercise 8.2: How many edges may the interprocedural CFG contain in a program with n CFG nodes?

Recall the intraprocedural sign analysis from Sections 4.1 and 5.1. That analysis models values with the lattice Sign:

[Figure: Hasse diagram of the Sign lattice, with ⊤ at the top, the mutually incomparable elements +, −, and 0 in the middle, and ⊥ at the bottom.]

and abstract states are represented by the map lattice $States = Vars \to Sign$. For any program point, the abstract state only provides information about variables that are in scope; all other variables can be set to $\bot$.

To make the sign analysis interprocedural, we define constraints for function entries and exits. For an entry node v of a function f(b1, ..., bn) we consider the abstract states for all callers pred(v) and model the passing of parameters:

  $[\![v]\!] = \bigsqcup_{w \in pred(v)} s_w$

where¹

  $s_w = \bot[b_1 \mapsto eval([\![w]\!], E^w_1), \ldots, b_n \mapsto eval([\![w]\!], E^w_n)]$

where $E^w_i$ is the i'th argument at the call node w.

¹ In this expression, $\bot$ denotes the bottom element of $Vars \to Sign$, that is, it maps every variable to the bottom element of Sign.

As discussed in Section 4.4, constraints can be expressed using inequations instead of equations. The constraint rule above can be reformulated as follows, where v is a function entry node and $w \in pred(v)$ is a caller:

  $[\![v]\!] \sqsupseteq s_w$

Intuitively, this shows how information flows from the call node (the right-hand side of $\sqsupseteq$) to the function entry node (the left-hand side of $\sqsupseteq$).

Exercise 8.3: Explain why these two formulations of the constraint rule for function entry nodes are equivalent.

For the entry node v of the main function with parameters b1, ..., bn we have this special rule that models the fact that main is implicitly called with unknown arguments:

  $[\![v]\!] = \bot[b_1 \mapsto \top, \ldots, b_n \mapsto \top]$
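The two entry-node rules above can be sketched in Python as follows. This is an illustration under several assumptions: Sign elements are encoded as strings, an abstract state is a dictionary that leaves ⊥-valued variables implicit, and arguments at call sites are restricted to integer constants or variable names. All function names are hypothetical.

```python
TOP, BOT, POS, NEG, ZERO = "T", "bot", "+", "-", "0"

def join(a, b):
    """Least upper bound in the Sign lattice."""
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP

def sign_of(n):
    return POS if n > 0 else NEG if n < 0 else ZERO

def eval_arg(state, arg):
    """Abstractly evaluate an argument: an int constant or a variable name."""
    if isinstance(arg, int):
        return sign_of(arg)
    return state.get(arg, BOT)    # variables not in the state are bottom

def entry_state(params, calls):
    """Join, over all call nodes w, of s_w = bot[b_i -> eval([[w]], E_i^w)].

    calls is a list of (caller_state, argument_list) pairs.
    """
    result = {b: BOT for b in params}
    for caller_state, args in calls:
        s_w = {b: eval_arg(caller_state, a) for b, a in zip(params, args)}
        result = {b: join(result[b], s_w[b]) for b in params}
    return result

def main_entry_state(params):
    """main is implicitly called with unknown arguments."""
    return {b: TOP for b in params}

same_sign = entry_state(["a"], [({}, [3]), ({}, [5])])
mixed = entry_state(["z"], [({}, [0]), ({"k": POS}, ["k"])])
```

When all callers pass arguments of the same sign, that sign is preserved at the entry. As soon as two callers disagree, the parameter is mapped to ⊤, since the entry state merges all call sites.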
For an after-call node $v$ that stores the return value in the variable $X$ and where $v'$ is the accompanying call node and $w \in \mathit{pred}(v)$ is the function exit node, the dataflow can be modeled by the following constraint:

\[ [\![v]\!] = [\![v']\!][X \mapsto [\![w]\!](\mathit{result})] \]

The constraint obtains the abstract values of the local variables from the call node $v'$ and the abstract value of result from $w$. With this approach, no constraints are needed for call nodes and exit nodes. In a backward analysis, one would consider the call nodes and the function exit nodes rather than the function entry nodes and the after-call nodes. Also notice that we exploit the fact that the variant of the TIP language we use in this chapter does not have global variables, a heap, nested functions, or higher-order functions.

Exercise 8.4: Write and solve the constraints that are generated by the interprocedural sign analysis for the following program:

    inc(a) {
      return a+1;
    }
    main() {
      var x,y;
      x = inc(17);
      y = inc(87);
      return x+y;
    }
Exercise 8.5: Assume we extend TIP with global variables. Such variables are declared before all functions and their scope covers all functions. Write a TIP program with global variables that is analyzed incorrectly (that is, unsoundly) with the current analysis. Then show how the constraint rules above should be modified to accommodate this language feature.

Function entry nodes may have many predecessors, and similarly, function exit nodes may have many successors. For this reason, algorithms like PropagationWorkListAlgorithm (Section 5.10) are often preferred for interprocedural dataflow analysis.

Exercise 8.6: For the interprocedural sign analysis, how can we choose dep(v) when v is a call node, an after-call node, a function entry node, or a function exit node?
8.2 Context Sensitivity

The approach to interprocedural analysis as presented in the previous sections is called context insensitive, because it does not distinguish between different calls to the same function. As an example, consider the sign analysis applied to this program:

    f(z) {
      return z*42;
    }
    main() {
      var x,y;
      x = f(0);  // call 1
      y = f(87); // call 2
      return x + y;
    }

Due to the first call to f the parameter z may be 0, and due to the second call it may be a positive number, so in the abstract state at the entry of f, the abstract value of z is $\top$. That value propagates through the body of f and back to the callers, so both x and y also become $\top$. This is an example of dataflow along interprocedurally invalid paths: according to the analysis constraints, dataflow from one call node propagates through the function body and returns not only at the matching after-call node but at all after-call nodes. Although the analysis is still sound, the resulting loss of precision may be unacceptable.

A naive solution to this problem is to use function cloning. In this specific example we could clone f and let the two calls invoke different but identical functions. A similar effect would be obtained by inlining the function body at each call. More generally this may, however, increase the program size significantly, and in case of (mutually) recursive functions it would result in infinitely large programs. As we shall see next, we can instead encode the relevant information to distinguish the different calls by the use of more expressive lattices, much like the path-sensitivity approach in Chapter 7.

As discussed in the previous section, a basic context-insensitive dataflow analysis can be expressed using a lattice $\mathit{States}^n$ where $\mathit{States}$ is the lattice describing abstract states and $n = |\mathit{Nodes}|$ (or equivalently, using a lattice $\mathit{Nodes} \to \mathit{States}$). Context-sensitive analysis instead uses a lattice of the form

\[ \big(\mathit{Contexts} \to \mathit{lift}(\mathit{States})\big)^n \]

(or equivalently, $\mathit{Contexts} \to (\mathit{lift}(\mathit{States}))^n$ or $\mathit{Nodes} \to \mathit{Contexts} \to \mathit{lift}(\mathit{States})$ or $\mathit{Contexts} \times \mathit{Nodes} \to \mathit{lift}(\mathit{States})$) where $\mathit{Contexts}$ is a set of call contexts. The reason for using the lifted sub-lattice $\mathit{lift}(\mathit{States})$ (as defined in Section 4.3) is that $\mathit{Contexts}$ may be large so we only want to infer abstract states for call contexts that may be feasible. The bottom element of $\mathit{lift}(\mathit{States})$, denoted unreachable, is used for call contexts that are unreachable from the program entry. (Of course,
in analyses where $\mathit{States}$ already provides similar information, we do not need the lifted version.)

In the following sections we present different ways of choosing the set of call contexts. A trivial choice is to let $\mathit{Contexts}$ be a singleton set, which amounts to context-insensitive analysis. Another extreme we shall investigate is to pick $\mathit{Contexts} = \mathit{States}$, which allows full context sensitivity. These ideas originate from the work by Sharir and Pnueli [SP81].

Dataflow for CFG nodes that do not involve function calls and returns is modeled as usual, except that we now have an abstract state (or the extra lattice element unreachable) for each call context. This means that the constraint variables now range over $\mathit{Contexts} \to \mathit{lift}(\mathit{States})$ rather than just $\mathit{States}$. For example, the constraint rule for assignments X=E in intraprocedural sign analysis from Section 5.1,

\[ X = E\colon \quad [\![v]\!] = \mathit{JOIN}(v)[X \mapsto \mathit{eval}(\mathit{JOIN}(v), E)] \]

becomes

\[ X = E\colon \quad [\![v]\!](c) = \begin{cases} s[X \mapsto \mathit{eval}(s, E)] & \text{if } s = \mathit{JOIN}(v, c) \in \mathit{States} \\ \mathit{unreachable} & \text{if } \mathit{JOIN}(v, c) = \mathit{unreachable} \end{cases} \]

where

\[ \mathit{JOIN}(v, c) = \bigsqcup_{w \in \mathit{pred}(v)} [\![w]\!](c) \]

to match the new lattice with context sensitivity. Note that information for different call contexts is kept apart, and that the reachability information is propagated along. How to model the dataflow at call nodes, after-call nodes, function entry nodes, and function exit nodes depends on the context sensitivity strategy, as described in the following sections.
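The context-parameterized assignment rule can be sketched in Python as follows. This is only an illustration under stated assumptions: signs are encoded as strings, `None` stands for unreachable, abstract states are dictionaries, and the context names "eps", "c1" and "c2" as well as the helper names are hypothetical.

```python
UNREACHABLE = None  # the extra bottom element of lift(States)
BOT, TOP = "bot", "top"

def join_sign(a, b):
    if a == b or b == BOT:
        return a
    return b if a == BOT else TOP

def join_state(s, t):
    """Join in lift(States); UNREACHABLE is the neutral element."""
    if s is UNREACHABLE:
        return t
    if t is UNREACHABLE:
        return s
    return {x: join_sign(s.get(x, BOT), t.get(x, BOT)) for x in s.keys() | t.keys()}

def JOIN(pred_values, c):
    """JOIN(v, c): join of [[w]](c) over all predecessors w of v."""
    result = UNREACHABLE
    for value in pred_values:
        result = join_state(result, value.get(c, UNREACHABLE))
    return result

def assign(pred_values, contexts, x, eval_expr):
    """Constraint for X = E: each context is handled separately."""
    out = {}
    for c in contexts:
        s = JOIN(pred_values, c)
        out[c] = UNREACHABLE if s is UNREACHABLE else {**s, x: eval_expr(s)}
    return out

# One predecessor; contexts c1 and c2 are reachable, eps is not.
pred = [{"c1": {"z": "0"}, "c2": {"z": "+"}}]
result = assign(pred, ["eps", "c1", "c2"], "y", lambda s: s["z"])  # models y = z
print(result)
```

The information for c1 and c2 is never mixed, and the unreachable context eps stays unreachable after the assignment.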
8.3 Context Sensitivity with Call Strings

Let $\mathit{Calls}$ be the set of call nodes in the CFG. The call string approach to context sensitivity defines²

\[ \mathit{Contexts} = \mathit{Calls}^{\le k} \]

where $k$ is a positive integer. With this choice of call contexts, we can obtain a similar effect as function cloning or inlining, but without actually changing the CFG. The idea is that a tuple $(c_1, c_2, \ldots, c_m) \in \mathit{Calls}^{\le k}$ identifies the topmost $m$ call sites on the call stack. If $(e_1, \ldots, e_n) \in (\mathit{Contexts} \to \mathit{States})^n$ is a lattice element, then $e_i(c_1, c_2, \ldots, c_m)$ provides an abstract state that approximates the runtime states that may appear at the $i$'th CFG node, assuming that the function containing that node was called from $c_1$, and the function containing $c_1$ was called from $c_2$, etc.

² We here use the notation $A^{\le k}$ meaning the set of tuples of $k$ or fewer elements from the set $A$, or more formally: $A^{\le k} = \bigcup_{i=0,\ldots,k} A^i$. The symbol $\varepsilon$ denotes the empty tuple.
The worst-case complexity of the analysis is evidently affected by the choice of $k$.

Exercise 8.7: What is the height of the lattice $(\mathit{Contexts} \to \mathit{States})^n$ when $\mathit{Contexts} = \mathit{Calls}^{\le k}$ and $\mathit{States} = \mathit{Vars} \to \mathit{Sign}$, expressed in terms of $k$ (the call string bound), $n = |\mathit{Nodes}|$, and $b = |\mathit{Vars}|$?

To demonstrate the call string approach we again consider sign analysis applied to the program from Section 8.2. Let $c_1$ and $c_2$ denote the two call nodes in the main function in the program. For simplicity, we focus on the case $k = 1$, meaning that $\mathit{Contexts} = \{\varepsilon, c_1, c_2\}$, so the analysis only tracks the top-most call site. When execution is initiated at the main function, the current context is described by the empty call string $\varepsilon$. We can now define the analysis constraints such that, in particular, at the entry of the function f, we obtain the lattice element

\[ \big[\varepsilon \mapsto \mathit{unreachable},\; c_1 \mapsto [x \mapsto \bot, y \mapsto \bot, z \mapsto 0],\; c_2 \mapsto [x \mapsto \bot, y \mapsto \bot, z \mapsto +]\big] \]

which has different abstract values for z depending on the caller. Notice that the information for the context $\varepsilon$ is unreachable, since f is not the main function but is always executed from $c_1$ or $c_2$.

The constraint rule for an entry node $v$ of a function $f(b_1, \ldots, b_n)$ models parameter passing in the same way as in context-insensitive analysis, but it now takes the call context $c$ at the function entry and the call context $c'$ at each call node into account:

\[ [\![v]\!](c) = \bigsqcup_{w \in \mathit{pred}(v) \,\wedge\, c = w \,\wedge\, c' \in \mathit{Contexts}} s_w^{c'} \]

where $s_w^{c'}$ denotes the abstract state created from the call at node $w$ in context $c'$:

\[ s_w^{c'} = \begin{cases} \mathit{unreachable} & \text{if } [\![w]\!](c') = \mathit{unreachable} \\ \bot[b_1 \mapsto \mathit{eval}([\![w]\!](c'), E^w_1), \ldots, b_n \mapsto \mathit{eval}([\![w]\!](c'), E^w_n)] & \text{otherwise} \end{cases} \]

Compared to the context-insensitive variant, the abstract state at $v$ is now parameterized by the context $c$, and we only include information from the call nodes that match $c$. In this simple case where $k = 1$ there is no direct connection between $c$ (the context at the entry of the function being called) and $c'$ (the context at the call node), but for larger values of $k$ it is necessary to express how the call site is pushed onto the stack (represented by the call string). In addition, the definition of $s_w^{c'}$ models the fact that no new dataflow can appear from call node $w$ in context $c'$ if that combination of node and context is unreachable (perhaps because the analysis has not yet encountered any dataflow to that node and context).
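As a rough illustration of the entry-node rule for k = 1, the following Python sketch computes the lattice element at the entry of f for the example program. The encoding (strings for signs and call nodes, `None` for unreachable, the empty tuple for the empty call string) and all helper names are assumptions of this sketch. Arguments are restricted to integer constants, so the abstract state at each call node only matters for reachability.

```python
BOT, ZERO, POS, NEG, TOP = "bot", "0", "+", "-", "top"
UNREACHABLE = None   # extra bottom element of lift(States)
EPS = ()             # the empty call string

def sign_of_constant(n):
    return POS if n > 0 else NEG if n < 0 else ZERO

def join_sign(a, b):
    if a == b or b == BOT:
        return a
    return b if a == BOT else TOP

def join_state(s, t):
    if s is UNREACHABLE:
        return t
    if t is UNREACHABLE:
        return s
    return {x: join_sign(s[x], t[x]) for x in s}

def entry_element(variables, params, contexts, calls):
    """[[v]](c) = join of s_w^{c'} over callers w with c = w and all c'.

    `calls` maps each call node w to ([[w]] as a dict from contexts to
    states, list of constant arguments).
    """
    element = {}
    for c in contexts:
        state = UNREACHABLE
        for w, (value_at_w, args) in calls.items():
            if c != w:            # k = 1: the entry context is the call node
                continue
            for c_prime in contexts:
                if value_at_w.get(c_prime) is UNREACHABLE:
                    continue      # s_w^{c'} = unreachable contributes nothing
                s = {x: BOT for x in variables}
                for b, n in zip(params, args):
                    s[b] = sign_of_constant(n)
                state = join_state(state, s)
        element[c] = state
    return element

# main is only reachable in context EPS; the placeholder state is not inspected.
in_main = {EPS: {"x": TOP, "y": TOP}}
calls = {"c1": (in_main, [0]), "c2": (in_main, [87])}
element = entry_element(["x", "y", "z"], ["z"], [EPS, "c1", "c2"], calls)
print(element)
```

The result maps the empty call string to unreachable and keeps z = 0 (from call 1) apart from z = + (from call 2).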
Exercise 8.8: Verify that this constraint rule for function entry nodes indeed leads to the lattice element shown above for the example program.

Expressed using inequations instead, the constraint rule for a function entry node $v$ where $w \in \mathit{pred}(v)$ is a caller and $c' \in \mathit{Contexts}$ is a call context can be written as follows, which may be more intuitively clear.

\[ [\![v]\!](w) \sqsupseteq s_w^{c'} \]

Informally, for any call context $c'$ at the call node $w$, an abstract state $s_w^{c'}$ is built by evaluating the function arguments and propagated to call context $w$ at the function entry node $v$.

Exercise 8.9: Give a constraint rule for the entry node of the special function main. (Remember that main is always reachable in context $\varepsilon$ and that the values of its parameters can be any integers.)

Assume $v$ is an after-call node that stores the return value in the variable $X$, and that $v'$ is the associated call node and $w \in \mathit{pred}(v)$ is the function exit node. The constraint rule for $v$ merges the abstract state from $v'$ and the return value from $w$, now taking the call contexts and reachability into account:

\[ [\![v]\!](c) = \begin{cases} \mathit{unreachable} & \text{if } [\![v']\!](c) = \mathit{unreachable} \,\vee\, [\![w]\!](v') = \mathit{unreachable} \\ [\![v']\!](c)[X \mapsto [\![w]\!](v')(\mathit{result})] & \text{otherwise} \end{cases} \]

Notice that with this kind of context sensitivity, $v'$ is both a call node and a call context, and the abstract value of result is obtained from the exit node $w$ in call context $v'$.

Exercise 8.10: Write and solve the constraints that are generated by the interprocedural sign analysis for the program from Exercise 8.4, this time with context sensitivity using the call string approach with $k = 1$. (Even though this program does not need context sensitivity to be analyzed precisely, it illustrates the mechanism behind the call string approach.)
Exercise 8.11: Assume we have analyzed a program P with $\mathit{Calls} = \{c_1, c_2\}$ using the interprocedural sign analysis with call-string context sensitivity with $k = 2$, and the analysis result contains the following lattice element for the exit node of a function named foo:

\[ \big[\varepsilon \mapsto \mathit{unreachable},\; (c_1) \mapsto \mathit{unreachable},\; (c_2) \mapsto \mathit{unreachable},\; (c_1, c_1) \mapsto [\mathit{result} \mapsto -],\; (c_2, c_1) \mapsto \mathit{unreachable},\; (c_1, c_2) \mapsto [\mathit{result} \mapsto +],\; (c_2, c_2) \mapsto \mathit{unreachable}\big] \]

Explain informally what this tells us about the program P.

Exercise 8.12: Write a TIP program that needs the call string bound $k = 2$ or higher to be analyzed with optimal precision using the sign analysis. That is, some variable in the program is assigned the abstract value $\top$ by the analysis if and only if $k < 2$.

Exercise 8.13: Generalize the constraint rules shown above to work with any $k \ge 1$, not just $k = 1$.

In summary, the call string approach distinguishes calls to the same function based on the call sites that appear in the call stack. In practice, $k = 1$ sometimes gives inadequate precision, and $k \ge 2$ is generally too expensive. For this reason, a useful strategy is to select $k$ individually for each call site, based on heuristics.
8.4 Context Sensitivity with the Functional Approach

Consider this variant of the program from Section 8.2:

    f(z) {
      return z*42;
    }
    main() {
      var x,y;
      x = f(42); // call 1
      y = f(87); // call 2
      return x + y;
    }
The call string approach with $k \ge 1$ will analyze the f function twice, which is unnecessary because the abstract value of the argument is + at both calls. Rather than distinguishing calls based on information about control flow from the call stack, the functional approach to context sensitivity distinguishes calls based on the data from the abstract states at the calls. In the most general form, the functional approach uses

\[ \mathit{Contexts} = \mathit{States} \]

although a subset often suffices. With this set of call contexts, the analysis lattice becomes

\[ \big(\mathit{States} \to \mathit{lift}(\mathit{States})\big)^n \]

which clearly leads to a significant increase of the theoretical worst-case complexity compared to context-insensitive analysis.

Exercise 8.14: What is the height of this lattice, expressed in terms of $h = \mathit{height}(\mathit{States})$ and $s = |\mathit{States}|$?

The idea is that a lattice element for a CFG node $v$ is a map $m : \mathit{States} \to \mathit{lift}(\mathit{States})$ such that $m(s)$ approximates the possible states at $v$ given that the current function containing $v$ was entered in a state that matches $s$. The situation $m(s) = \mathit{unreachable}$ means that there is no execution of the program where the function is entered in a state that matches $s$ and $v$ is reached. If $v$ is the exit node of a function $f$, the map $m$ is a summary of $f$, mapping abstract entry states to abstract exit states, much like a transfer function (see Section 5.10) models the effect of executing a single instruction but now for an entire function.

Returning to the example program from Section 8.2 (page 97), we will now define the analysis constraints such that, in particular, we obtain the following lattice element at the exit of the function f:³

\[ \big[\bot[z \mapsto 0] \mapsto \bot[z \mapsto 0, \mathit{result} \mapsto 0],\; \bot[z \mapsto +] \mapsto \bot[z \mapsto +, \mathit{result} \mapsto +],\; \text{all other contexts} \mapsto \mathit{unreachable}\big] \]

This information shows that the exit of f is unreachable unless z is 0 or + at the entry of the function, and that the sign of result at the exit is the same as the sign of z at the input. In particular, the context where z is - maps to unreachable because f is never called with negative inputs in the program.

³ We here use the map update notation described on page 40 and the fact that the bottom element of a map lattice maps all inputs to the bottom element of the codomain, so $\bot[z \mapsto 0]$ denotes the function that maps all variables to $\bot$, except z which is mapped to 0.

The constraint rule for an entry node $v$ of a function $f(b_1, \ldots, b_n)$ is the same as in the call strings approach, except for the condition on $c$:

\[ [\![v]\!](c) = \bigsqcup_{w \in \mathit{pred}(v) \,\wedge\, c = s_w^{c'} \,\wedge\, c' \in \mathit{Contexts}} s_w^{c'} \]
(The abstract state $s_w^{c'}$ is defined as in Section 8.3.) In this constraint rule, the abstract state computed for the call context $c$ at the entry node $v$ only includes information from the calls that produce an entry state $s_w^{c'}$ if the context $c$ is identical to the entry state $s_w^{c'}$, for any context $c'$ at the call node $w$.

Exercise 8.15: Verify that this constraint rule for function entry nodes indeed leads to the lattice element shown above for the example program.

Expressed using inequations, the constraint rule can be written as follows, where $v$ is a function entry node, $w \in \mathit{pred}(v)$ is a call to the function, and $c' \in \mathit{Contexts}$:

\[ [\![v]\!](s_w^{c'}) \sqsupseteq s_w^{c'} \]

This rule shows that at the call $w$ in context $c'$, the abstract state $s_w^{c'}$ is propagated to the function entry node $v$ in a context that is identical to $s_w^{c'}$.

Exercise 8.16: Give a constraint rule for the entry node of the special function main. (Remember that main is always reachable and that the values of its parameters can be any integers.)

Assume $v$ is an after-call node that stores the return value in the variable $X$, and that $v'$ is the associated call node and $w \in \mathit{pred}(v)$ is the function exit node. The constraint rule for $v$ merges the abstract state from $v'$ and the return value from $w$, while taking the call contexts and reachability into account:

\[ [\![v]\!](c) = \begin{cases} \mathit{unreachable} & \text{if } [\![v']\!](c) = \mathit{unreachable} \,\vee\, [\![w]\!](s_{v'}^{c}) = \mathit{unreachable} \\ [\![v']\!](c)[X \mapsto [\![w]\!](s_{v'}^{c})(\mathit{result})] & \text{otherwise} \end{cases} \]

To find the relevant context for the function exit node, this rule builds the same abstract state as the one built at the call node.

Exercise 8.17: Assume we have analyzed a program P using context sensitive interprocedural sign analysis with the functional approach, and the analysis result contains the following lattice element for the exit node of a function named foo:

\[ \big[[x \mapsto -, y \mapsto -, \mathit{result} \mapsto \bot] \mapsto [x \mapsto +, y \mapsto +, \mathit{result} \mapsto +],\; [x \mapsto +, y \mapsto +, \mathit{result} \mapsto \bot] \mapsto [x \mapsto -, y \mapsto -, \mathit{result} \mapsto -],\; \text{all other contexts} \mapsto \mathit{unreachable}\big] \]

Explain informally what this tells us about the program P. What could foo look like?

Exercise 8.18: Write and solve the constraints that are generated by the interprocedural sign analysis for the program from Exercise 8.4, this time with context sensitivity using the functional approach.
Context sensitivity with the functional approach as presented here gives optimal precision, in the sense that it is as precise as if inlining all function calls (even recursive ones). This means that it completely avoids the problem with dataflow along interprocedurally invalid paths.

Exercise 8.19: Show that this claim about the precision of the functional approach is correct.

Due to the high worst-case complexity, in practice the functional approach is often applied selectively, either only on some functions or using call contexts that only consider some of the program variables. One choice is parameter sensitivity where the call contexts are defined by the abstract values of the function parameters but not other parts of the program state. In the version of TIP used in this chapter, there are no pointers or global variables, so the entire program state at function entries is defined by the values of the parameters, which means that the analysis presented in this section coincides with parameter sensitivity. When analyzing object oriented programs, a popular choice is object sensitivity, which is essentially a variant of the functional approach that distinguishes calls not on the entire abstract states at function entries but only on the abstract values of the receiver objects.
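The idea of a function summary keyed by abstract entry states can be sketched in Python for the function f(z) { return z*42; }. The sketch hard-codes the body of f and only handles constant arguments; the string encoding of signs and the names `times` and `summarize_f` are assumptions made for this illustration.

```python
BOT, ZERO, POS, NEG, TOP = "bot", "0", "+", "-", "top"

def sign_of_constant(n):
    return POS if n > 0 else NEG if n < 0 else ZERO

def times(a, b):
    """Abstract multiplication on signs."""
    if BOT in (a, b):
        return BOT
    if ZERO in (a, b):
        return ZERO
    if TOP in (a, b):
        return TOP
    return POS if a == b else NEG

def summarize_f(call_arguments):
    """Summary of f: abstract entry state -> abstract exit state.

    Entry states not present in the dictionary are unreachable. The body of
    f is analyzed only once per distinct abstract entry state.
    """
    summary = {}
    analyses = 0
    for n in call_arguments:
        entry = (("z", sign_of_constant(n)),)   # hashable abstract entry state
        if entry not in summary:
            analyses += 1
            z = dict(entry)["z"]
            summary[entry] = {"z": z, "result": times(z, sign_of_constant(42))}
    return summary, analyses

first, n_first = summarize_f([0, 87])     # program from Section 8.2
second, n_second = summarize_f([42, 87])  # variant from this section
print(first, n_first)
print(second, n_second)
```

For the calls f(0) and f(87) the summary has two reachable entry states, whereas for f(42) and f(87) both calls share the entry state where z is +, so the body of f is analyzed only once.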
Chapter 9

Control Flow Analysis

If we introduce functions as values (and thereby higher-order functions), or objects with methods, then control flow and dataflow suddenly become intertwined. At each call site, it is no longer trivial to see which code is being called. The task of control flow analysis is to conservatively approximate the interprocedural control flow, also called the call graph, for such programs.
9.1 Closure Analysis for the λ-calculus

Control flow analysis in its purest form can best be illustrated by the classical λ-calculus:

\[ E \to \lambda X.E \mid X \mid E\,E \]

(In Section 9.3 we demonstrate this analysis technique on the TIP language.) For simplicity we assume that all λ-bound variables are distinct. To construct a CFG for a term in this calculus, we need to approximate for every expression $E$ the set of closures to which it may evaluate. A closure can be modeled by a symbol of the form $\lambda X$ that identifies a concrete λ-abstraction. This problem, called closure analysis, can be solved using the techniques from Chapters 4 and 5. However, since the intraprocedural control flow is trivial in this language, we might as well perform the analysis directly on the AST.

The lattice we use is the powerset of closures occurring in the given term ordered by subset inclusion. For every AST node $v$ we introduce a constraint variable $[\![v]\!]$ denoting the set of resulting closures. For an abstraction $\lambda X.E$ we have the constraint

\[ \lambda X \in [\![\lambda X.E]\!] \]
(the function may certainly evaluate to itself), and for an application $E_1\,E_2$ we have the conditional constraint

\[ \lambda X \in [\![E_1]\!] \;\Rightarrow\; \big([\![E_2]\!] \subseteq [\![X]\!] \,\wedge\, [\![E]\!] \subseteq [\![E_1\,E_2]\!]\big) \]

for every closure $\lambda X.E$, which models that the actual argument may flow into the formal argument and that the value of the function body is among the possible results of the function call.

Exercise 9.1: Show how the resulting constraints can be expressed as monotone constraints and solved by a fixed-point computation, with an appropriate choice of lattice.
9.2 The Cubic Algorithm

The constraints for closure analysis are an instance of a general class that can be solved in cubic time. Many problems fall into this category, so we will investigate the algorithm more closely.

We have a finite set of tokens $\{t_1, \ldots, t_k\}$ and a finite set of variables $x_1, \ldots, x_n$ whose values are sets of tokens. Our task is to read a collection of constraints of the form $t \in x$ or $t \in x \Rightarrow y \subseteq z$ and produce the minimal solution.

Exercise 9.2: Show that a unique minimal solution exists, since solutions are closed under intersection.

The algorithm is based on a simple data structure. Each variable is mapped to a node in a directed acyclic graph (DAG). Each node has an associated bitvector belonging to $\{0, 1\}^k$, initially defined to be all 0's. Each bit has an associated list of pairs of variables, which is used to model conditional constraints. The edges in the DAG reflect inclusion constraints. An example graph may look like:

[Figure: an example DAG with nodes for the variables x1, x2, x3 and x4, where one bit carries the pair list (x2, x4).]

Constraints are added one at a time, and the bitvectors will at all times directly represent the minimal solution of the constraints seen so far.

A constraint of the form $t \in x$ is handled by looking up the node associated with $x$ and setting the corresponding bit to 1. If its list of pairs was not empty,
then an edge between the nodes corresponding to $y$ and $z$ is added for every pair $(y, z)$ and the list is emptied.

A constraint of the form $t \in x \Rightarrow y \subseteq z$ is handled by first testing if the bit corresponding to $t$ in the node corresponding to $x$ has value 1. If this is so, then an edge between the nodes corresponding to $y$ and $z$ is added. Otherwise, the pair $(y, z)$ is added to the list for that bit.

If a newly added edge forms a cycle, then all nodes on that cycle can be merged into a single node, which implies that their bitvectors are unioned together and their pair lists are concatenated. The map from variables to nodes is updated accordingly. In any case, to reestablish all inclusion relations we must propagate the values of each newly set bit along all edges in the graph.

To analyze the time complexity of this algorithm, we assume that the numbers of tokens and variables are both $O(n)$. This is clearly the case in closure analysis of a program of size $n$.

Merging DAG nodes on cycles can be done at most $O(n)$ times. Each merger involves at most $O(n)$ nodes and the union of their bitvectors is computed in time at most $O(n^2)$. The total for this part is $O(n^3)$.

New edges are inserted at most $O(n^2)$ times. Constant sets are included at most $O(n^2)$ times, once for each $t \in x$ constraint.

Finally, to limit the cost of propagating bits along edges, we imagine that each pair of corresponding bits along an edge are connected by a tiny bitwire. Whenever the source bit is set to 1, that value is propagated along the bitwire which then is broken:
[Figure: two bitvectors joined by an edge, with bitwires connecting each pair of corresponding bits.]
Since we have at most $n^3$ bitwires, the total cost for propagation is $O(n^3)$. Adding up, the total cost for the algorithm is also $O(n^3)$. The fact that this seems like a lower bound as well is referred to as the cubic time bottleneck.

The kinds of constraints covered by this algorithm are a simple case of the more general set constraints, which allow richer constraints on sets of finite terms. General set constraints are also solvable but in time $O(2^{2^n})$.
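The following Python sketch shows the core of such an incremental solver. It is a simplification and not the full algorithm described above: Python sets replace the bitvectors, cycles are not merged, and the class and method names are hypothetical. It does keep the two constraint forms, the per-bit lists of pending pairs, and the propagation of each newly added token along inclusion edges.

```python
from collections import defaultdict

class SimpleCubicSolver:
    """Incremental solver for t in x and t in x => y subset-of z."""

    def __init__(self):
        self.sol = defaultdict(set)    # variable -> set of tokens
        self.succ = defaultdict(set)   # inclusion edges: y -> {z | y subset-of z}
        self.cond = defaultdict(list)  # (token, variable) -> pending (y, z) pairs
        self.worklist = []             # newly added (token, variable) facts

    def _add_token(self, t, x):
        if t not in self.sol[x]:
            self.sol[x].add(t)
            self.worklist.append((t, x))

    def _add_edge(self, y, z):
        if z not in self.succ[y]:
            self.succ[y].add(z)
            for t in list(self.sol[y]):
                self._add_token(t, z)

    def _propagate(self):
        while self.worklist:
            t, x = self.worklist.pop()
            # Conditional constraints waiting for this bit now become edges.
            for y, z in self.cond.pop((t, x), []):
                self._add_edge(y, z)
            # Push the new token along all outgoing inclusion edges.
            for z in list(self.succ[x]):
                self._add_token(t, z)

    def add_constant(self, t, x):
        """Constraint t in x."""
        self._add_token(t, x)
        self._propagate()

    def add_conditional(self, t, x, y, z):
        """Constraint t in x => y subset-of z."""
        if t in self.sol[x]:
            self._add_edge(y, z)
        else:
            self.cond[(t, x)].append((y, z))
        self._propagate()

solver = SimpleCubicSolver()
solver.add_constant("a", "x")
solver.add_conditional("a", "x", "x", "y")   # fires immediately
solver.add_conditional("b", "y", "y", "z")   # pending until b reaches y
solver.add_constant("b", "x")
print(dict(solver.sol))
```

After every call the sets hold the minimal solution of the constraints added so far, mirroring the invariant stated for the bitvectors.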
9.3 TIP with First-Class Function

Consider now our tiny language TIP where we allow functions as values. For a computed function call

\[ E \to E(E_1, \ldots, E_n) \]

we cannot see directly from the syntax which functions may be called. A coarse but sound CFG could be obtained by assuming that any function with the right
number of arguments could be called. However, we can do much better by performing a control flow analysis. Our lattice is the powerset of the set of tokens containing $X$ for every function name $X$, ordered by subset inclusion. For every syntax tree node $v$ we introduce a constraint variable $[\![v]\!]$ denoting the set of functions $v$ could point to. For a function named $f$ we have the constraint

\[ f \in [\![f]\!] \]

for assignments X=E we have the constraint

\[ [\![E]\!] \subseteq [\![X]\!] \]

and, finally, for computed function calls $E(E_1, \ldots, E_n)$ we have for every definition of a function $f$ with arguments $a_f^1, \ldots, a_f^n$ and return expression $E'_f$ this constraint:

\[ f \in [\![E]\!] \;\Rightarrow\; \big([\![E_1]\!] \subseteq [\![a_f^1]\!] \wedge \cdots \wedge [\![E_n]\!] \subseteq [\![a_f^n]\!] \wedge [\![E'_f]\!] \subseteq [\![E(E_1, \ldots, E_n)]\!]\big) \]

A still more precise analysis could be obtained if we restrict ourselves to typable programs and only generate constraints for those functions $f$ for which the call would be type correct.

Given this inferred information, we can construct a CFG as before but with edges between a call site and all possible target functions according to the control flow analysis. Consider the following example program:

    inc(i) { return i+1; }
    dec(j) { return j-1; }
    ide(k) { return k; }

    foo(n,f) {
      var r;
      if (n==0) { f=ide; }
      r = f(n);
      return r;
    }

    main() {
      var x,y;
      x = input;
      if (x>0) { y = foo(x,inc); } else { y = foo(x,dec); }
      return y;
    }

The control flow analysis generates the following constraints:

    $\mathit{inc} \in [\![\mathit{inc}]\!]$
    $\mathit{dec} \in [\![\mathit{dec}]\!]$
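    $\mathit{ide} \in [\![\mathit{ide}]\!]$
    $[\![\mathit{ide}]\!] \subseteq [\![f]\!]$
    $[\![f(n)]\!] \subseteq [\![r]\!]$
    $\mathit{inc} \in [\![f]\!] \Rightarrow [\![n]\!] \subseteq [\![i]\!] \wedge [\![i{+}1]\!] \subseteq [\![f(n)]\!]$
    $\mathit{dec} \in [\![f]\!] \Rightarrow [\![n]\!] \subseteq [\![j]\!] \wedge [\![j{-}1]\!] \subseteq [\![f(n)]\!]$
    $\mathit{ide} \in [\![f]\!] \Rightarrow [\![n]\!] \subseteq [\![k]\!] \wedge [\![k]\!] \subseteq [\![f(n)]\!]$
    $[\![\mathit{input}]\!] \subseteq [\![x]\!]$
    $[\![\mathit{foo}(x,\mathit{inc})]\!] \subseteq [\![y]\!]$
    $[\![\mathit{foo}(x,\mathit{dec})]\!] \subseteq [\![y]\!]$
    $\mathit{foo} \in [\![\mathit{foo}]\!]$
    $\mathit{foo} \in [\![\mathit{foo}]\!] \Rightarrow [\![x]\!] \subseteq [\![n]\!] \wedge [\![\mathit{inc}]\!] \subseteq [\![f]\!] \wedge [\![r]\!] \subseteq [\![\mathit{foo}(x,\mathit{inc})]\!]$
    $\mathit{foo} \in [\![\mathit{foo}]\!] \Rightarrow [\![x]\!] \subseteq [\![n]\!] \wedge [\![\mathit{dec}]\!] \subseteq [\![f]\!] \wedge [\![r]\!] \subseteq [\![\mathit{foo}(x,\mathit{dec})]\!]$

The nonempty values of the least solution are:

    $[\![\mathit{inc}]\!] = \{\mathit{inc}\}$
    $[\![\mathit{dec}]\!] = \{\mathit{dec}\}$
    $[\![\mathit{ide}]\!] = \{\mathit{ide}\}$
    $[\![f]\!] = \{\mathit{inc}, \mathit{dec}, \mathit{ide}\}$
    $[\![\mathit{foo}]\!] = \{\mathit{foo}\}$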
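The least solution above can be reproduced with a naive fixed-point iteration, sketched here in Python. This is not the cubic algorithm; it simply re-applies all enabled inclusions until nothing changes. The encoding of constraint variables as strings and the function name `solve` are assumptions of this sketch.

```python
from collections import defaultdict

def solve(constants, subsets, conditionals):
    """Least solution of token-in-variable, subset, and conditional constraints."""
    sol = defaultdict(set)
    for token, var in constants:
        sol[var].add(token)
    changed = True
    while changed:
        changed = False
        active = list(subsets)
        # A conditional constraint contributes its inclusions once its
        # premise (token in variable) holds in the current solution.
        for token, var, consequences in conditionals:
            if token in sol[var]:
                active.extend(consequences)
        for sub, sup in active:
            if not sol[sub] <= sol[sup]:
                sol[sup] |= sol[sub]
                changed = True
    return {var: tokens for var, tokens in sol.items() if tokens}

constants = [("inc", "inc"), ("dec", "dec"), ("ide", "ide"), ("foo", "foo")]
subsets = [("ide", "f"), ("f(n)", "r"), ("input", "x"),
           ("foo(x,inc)", "y"), ("foo(x,dec)", "y")]
conditionals = [
    ("inc", "f", [("n", "i"), ("i+1", "f(n)")]),
    ("dec", "f", [("n", "j"), ("j-1", "f(n)")]),
    ("ide", "f", [("n", "k"), ("k", "f(n)")]),
    ("foo", "foo", [("x", "n"), ("inc", "f"), ("r", "foo(x,inc)")]),
    ("foo", "foo", [("x", "n"), ("dec", "f"), ("r", "foo(x,dec)")]),
]
solution = solve(constants, subsets, conditionals)
print(solution)
```

Only the variables for inc, dec, ide, f and foo receive tokens; all integer-valued expressions keep the empty set.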
On this basis, we can construct the following monovariant interprocedural CFG for the program:
[Figure: the monovariant interprocedural CFG for the example program. The computed call f(n) in foo (call-3) has call and return edges to inc, dec and ide, and the two calls in main (call-1 and call-2) have call and return edges to foo.]
which then can be used as basis for subsequent interprocedural dataflow analyses.

Exercise 9.3: Consider the following TIP program:

    f(y,z) {
      return y(z);
    }
    g(x) {
      return x+1;
    }
    main(a) {
      return f(g,f(g,a));
    }

(a) Write the constraints that are produced by the control-flow analysis for this program.
(b) Compute the least solution to the constraints.
Exercise 9.4: As an alternative to the analysis described above, design a flow-sensitive control flow analysis as a monotone framework (i.e., in the style of Chapters 5 and 8). (Hint: choose a suitable lattice and define appropriate dataflow constraints.) Then explain briefly how your analysis handles the following TIP program, compared to using the flow-insensitive analysis described above.

    inc(x) {
      return x+1;
    }
    dec(y) {
      return y-1;
    }
    main(a) {
      var t;
      t = inc;
      a = t(a);
      t = dec;
      a = t(a);
      return a;
    }
9.4 Control Flow in Object Oriented Languages

A language with functions as values must use the kind of control flow analysis described in the previous sections to obtain a reasonably precise CFG. For common object-oriented languages, such as Java or C#, it is also useful, but the added structure provided by the class hierarchy and the type system permits some simpler alternatives. In the object-oriented setting the question is which method implementations may be executed at a given method invocation site:

    x.m(a,b,c)

The simplest solution is to scan the class library and select any method named m whose signature accepts the types of the actual arguments. A better choice, called Class Hierarchy Analysis (CHA), is to consider only the part of the class hierarchy that is spanned by the declared type of x. A further refinement, called Rapid Type Analysis (RTA), is to restrict further to the classes of which objects are actually allocated. Yet another technique, called Variable Type Analysis (VTA), performs intraprocedural control flow analysis while making conservative assumptions about the remaining program.

These techniques are of course much faster than full-blown control flow analysis, and for real-life programs they are often sufficiently precise.
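The difference between CHA and RTA can be illustrated with a small Python sketch. The class hierarchy (Shape, Circle, Square, Unit), the method name area, and the set of allocated classes are hypothetical, and the sketch ignores method signatures and interfaces.

```python
def subtypes(cls, parent):
    """The class itself plus all its transitive subclasses."""
    result = {cls}
    changed = True
    while changed:
        changed = False
        for c, p in parent.items():
            if p in result and c not in result:
                result.add(c)
                changed = True
    return result

def lookup(cls, method, parent, declares):
    """Find the implementation a receiver of class cls would execute."""
    while cls is not None:
        if method in declares.get(cls, set()):
            return f"{cls}.{method}"
        cls = parent[cls]
    return None

def cha_targets(declared_type, method, parent, declares):
    """CHA: every class below the declared type is a possible receiver."""
    found = {lookup(c, method, parent, declares)
             for c in subtypes(declared_type, parent)}
    return found - {None}

def rta_targets(declared_type, method, parent, declares, allocated):
    """RTA: like CHA, restricted to classes that are actually allocated."""
    found = {lookup(c, method, parent, declares)
             for c in subtypes(declared_type, parent) if c in allocated}
    return found - {None}

# Hypothetical hierarchy: Circle and Square extend Shape, Unit extends Square.
parent = {"Shape": None, "Circle": "Shape", "Square": "Shape", "Unit": "Square"}
declares = {"Shape": {"area"}, "Circle": {"area"}, "Square": {"area"}, "Unit": set()}

cha = cha_targets("Shape", "area", parent, declares)
rta = rta_targets("Shape", "area", parent, declares, allocated={"Circle", "Unit"})
print(sorted(cha))  # ['Circle.area', 'Shape.area', 'Square.area']
print(sorted(rta))  # ['Circle.area', 'Square.area']
```

With the declared type Shape, CHA reports three possible targets, while RTA drops Shape.area because no object of class Shape itself is allocated in this hypothetical program.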
Chapter 10

Pointer Analysis

The final extension of the TIP language introduces pointers and dynamic memory allocation. For simplicity, we ignore records in this chapter.

To illustrate the problem with pointers, assume we wish to perform a sign analysis of TIP code like this:

    ...
    *x = 42;
    *y = -87;
    z = *x;

Here, the value of z depends on whether or not x and y are aliases, meaning that they point to the same cell. Without knowledge of such aliasing information, it quickly becomes impossible to produce useful dataflow and control-flow analysis results.
10.1 Allocation-Site Abstraction

We first focus on intraprocedural analysis and postpone treatment of function calls to Section 10.4.

The most important information that must be obtained is the set of possible memory cells that the pointers may point to. There are of course arbitrarily many possible cells during execution, so we must select some finite abstraction. A common choice, called allocation-site abstraction [CWZ90], is to introduce an abstract cell X for every program variable named X and an abstract cell alloc-i, where i is a unique index, for each occurrence of an alloc operation in the program. Each abstract cell represents the set of cells at runtime that are allocated at the same source location, hence the name allocation-site abstraction. We use $\mathit{Cells}$ to denote the set of abstract cells for the given program.
The first analyses that we shall study are flow-insensitive. The end result of such an analysis is a function $pt : \mathit{Vars} \to 2^{\mathit{Cells}}$ that for each pointer variable X returns the set $pt(X)$ of cells it may point to. We wish to perform a conservative analysis that computes sets that may be too large but never too small.

Given such points-to information, many other facts can be approximated. If we wish to know whether pointer variables x and y may be aliases, then a safe answer is obtained by checking whether $pt(x) \cap pt(y)$ is nonempty.

The initial values of local variables are undefined in TIP programs. For these flow-insensitive points-to analyses, however, we assume that all the variables are initialized before they are used. (In other words, these analyses are sound only for programs that never read from uninitialized variables.)

An almost-trivial analysis, called address taken, is to simply return all possible abstract cells, except that X is only included if the expression &X occurs in the given program. This only suffices for very simple applications, so more ambitious approaches are usually preferred. If we restrict ourselves to typable programs, then any points-to analysis could be improved by removing those cells whose types do not match the type of the pointer variable.
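Both the address-taken analysis and the may-alias check are simple enough to sketch in a few lines of Python. The function names and the example variables are assumptions of this sketch; the points-to function pt is represented as a dictionary from variables to sets of abstract cells.

```python
def address_taken(variables, alloc_sites, address_taken_vars):
    """pt(X) = all alloc-i cells plus every variable Y such that &Y occurs."""
    cells = set(alloc_sites) | (set(variables) & set(address_taken_vars))
    return {x: set(cells) for x in variables}

def may_alias(pt, x, y):
    """Safe answer: x and y may be aliases if pt(x) and pt(y) intersect."""
    return bool(pt[x] & pt[y])

# Hypothetical program with variables p, q, x, one alloc, and one use of &x.
coarse = address_taken(["p", "q", "x"], ["alloc-1"], ["x"])
print(coarse["p"])                  # contains 'alloc-1' and 'x'
print(may_alias(coarse, "p", "q"))  # True

# A more precise points-to function can rule the alias out.
precise = {"p": {"alloc-1"}, "q": {"x"}}
print(may_alias(precise, "p", "q"))  # False
```

The address-taken result is the same for every variable, which is why it reports almost every pair of pointers as possible aliases.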
10.2 Andersen's Algorithm

One approach to points-to analysis, called Andersen's algorithm [And94], is quite similar to control flow analysis. For each cell $c$ we introduce a constraint variable $[\![c]\!]$ ranging over sets of cells. The analysis assumes that the program has been normalized so that every pointer operation is of one of these six kinds:

• X = alloc P where P is null or an integer constant
• X1 = &X2
• X1 = X2
• X1 = *X2
• *X1 = X2
• X = null

Exercise 10.1: Explain how this normalization can be performed systematically by introducing fresh temporary variables.

Exercise 10.2: Normalize the single statement **x = **y.

For each of these pointer operations we then generate constraints:
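$$
\begin{aligned}
X = \texttt{alloc}\ P &: \quad \texttt{alloc-}i \in \llbracket X \rrbracket \\
X_1 = \&X_2 &: \quad X_2 \in \llbracket X_1 \rrbracket \\
X_1 = X_2 &: \quad \llbracket X_2 \rrbracket \subseteq \llbracket X_1 \rrbracket \\
X_1 = {*}X_2 &: \quad c \in \llbracket X_2 \rrbracket \implies \llbracket c \rrbracket \subseteq \llbracket X_1 \rrbracket \ \text{ for each } c \in \mathit{Cells} \\
{*}X_1 = X_2 &: \quad c \in \llbracket X_1 \rrbracket \implies \llbracket X_2 \rrbracket \subseteq \llbracket c \rrbracket \ \text{ for each } c \in \mathit{Cells}
\end{aligned}
$$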
The null assignment is ignored, since it corresponds to the trivial constraint ∅ ⊆ ⟦X⟧. Notice that these constraints match the requirements of the cubic algorithm from Section 9.2. The resulting points-to function is defined simply as pt(p) = ⟦p⟧.
Consider the following example program fragment.
p = alloc null;
x = y;
x = z;
*p = z;
p = q;
q = &y;
x = *p;
p = &z;
Andersen's algorithm generates these constraints:
$$
\begin{aligned}
& \texttt{alloc-1} \in \llbracket p \rrbracket \\
& \llbracket y \rrbracket \subseteq \llbracket x \rrbracket \\
& \llbracket z \rrbracket \subseteq \llbracket x \rrbracket \\
& c \in \llbracket p \rrbracket \implies \llbracket z \rrbracket \subseteq \llbracket c \rrbracket \ \text{ for each } c \in \mathit{Cells} \\
& \llbracket q \rrbracket \subseteq \llbracket p \rrbracket \\
& y \in \llbracket q \rrbracket \\
& c \in \llbracket p \rrbracket \implies \llbracket c \rrbracket \subseteq \llbracket x \rrbracket \ \text{ for each } c \in \mathit{Cells} \\
& z \in \llbracket p \rrbracket
\end{aligned}
$$
where Cells = {p, q, x, y, z, alloc-1}. The least solution is quite precise in this case (here showing only the nonempty values):
pt(p) = {alloc-1, y, z}
pt(q) = {y}
Note that although this algorithm is flow insensitive, the directionality of the constraints implies that the dataflow is still modeled with some accuracy.
Exercise 10.3: Use Andersen's algorithm to compute the points-to sets for the variables in the following program fragment:
a = &d;
b = &e;
a = b;
a = alloc null;
Еxеrcisе 10.4: Usе Аndеrsеn‟s аlgоrithm tо cоmputе thе pоints-tо sеts fоr thе vаriаblеs in thе fоllоwing prоgrаm frаgmеnt: z = &x; w = &а; а = 42; if (а > b) { z = &а; y = &b; } еlsе { x = &b; y = w; }
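The constraints that Andersen's algorithm generates for the example fragment in this section can be solved mechanically. The following is a minimal sketch, assuming a naive fixed-point iteration rather than the cubic algorithm from Section 9.2; the function name `solve_andersen` and the tuple encoding of normalized statements are assumptions made for this illustration only.

```python
def solve_andersen(cells, statements):
    """Solve Andersen-style constraints by naive fixed-point iteration.

    Each statement is a triple (kind, lhs, rhs) in normalized form:
      ("alloc", X, "alloc-i")   X = alloc P
      ("addr",  X1, X2)         X1 = &X2
      ("copy",  X1, X2)         X1 = X2
      ("load",  X1, X2)         X1 = *X2
      ("store", X1, X2)         *X1 = X2
    Null assignments generate no constraint and are simply omitted.
    """
    pt = {c: set() for c in cells}
    changed = True
    while changed:
        changed = False
        for kind, lhs, rhs in statements:
            if kind in ("alloc", "addr"):
                updates = [(lhs, {rhs})]
            elif kind == "copy":
                updates = [(lhs, set(pt[rhs]))]
            elif kind == "load":
                # c in [[rhs]] implies [[c]] is a subset of [[lhs]]
                updates = [(lhs, set(pt[c])) for c in sorted(pt[rhs])]
            elif kind == "store":
                # c in [[lhs]] implies [[rhs]] is a subset of [[c]]
                updates = [(c, set(pt[rhs])) for c in sorted(pt[lhs])]
            else:
                raise ValueError(f"unknown statement kind: {kind}")
            for target, new_cells in updates:
                if not new_cells <= pt[target]:
                    pt[target] |= new_cells
                    changed = True
    return pt


# The example fragment from this section, in normalized form.
cells = ["p", "q", "x", "y", "z", "alloc-1"]
program = [
    ("alloc", "p", "alloc-1"),  # p = alloc null
    ("copy", "x", "y"),         # x = y
    ("copy", "x", "z"),         # x = z
    ("store", "p", "z"),        # *p = z
    ("copy", "p", "q"),         # p = q
    ("addr", "q", "y"),         # q = &y
    ("load", "x", "p"),         # x = *p
    ("addr", "p", "z"),         # p = &z
]
points_to = solve_andersen(cells, program)
```

Because every constraint only ever adds cells to a set, the iteration reaches the least solution; the order of the statements in the list does not influence the result, which reflects the flow-insensitivity of the analysis.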
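Steensgaard's Algorithm
An interesting alternative is Steensgaard's algorithm [Ste96], which performs a coarser analysis essentially by viewing assignments as being bidirectional. The analysis can be expressed elegantly using term unification. We use a term variable ⟦c⟧ for every cell c and a term constructor &t representing a pointer to t. (Notice the change in notation compared to Section 10.2: here, ⟦c⟧ is a term variable and does not directly denote a set of abstract cells.)
$$
\begin{aligned}
X = \texttt{alloc}\ P &: \quad \llbracket X \rrbracket = \&\llbracket \texttt{alloc-}i \rrbracket \\
X_1 = \&X_2 &: \quad \llbracket X_1 \rrbracket = \&\llbracket X_2 \rrbracket \\
X_1 = X_2 &: \quad \llbracket X_1 \rrbracket = \llbracket X_2 \rrbracket \\
X_1 = {*}X_2 &: \quad \llbracket X_2 \rrbracket = \&\alpha \ \wedge\ \llbracket X_1 \rrbracket = \alpha \\
{*}X_1 = X_2 &: \quad \llbracket X_1 \rrbracket = \&\alpha \ \wedge\ \llbracket X_2 \rrbracket = \alpha
\end{aligned}
$$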
Each α here denotes a fresh term variable. As usual, term constructors satisfy the general term equality axiom:
$$
\&\alpha_1 = \&\alpha_2 \implies \alpha_1 = \alpha_2
$$
The resulting points-to function is defined as:
$$
pt(p) = \{\, t \in \mathit{Cells} \mid \llbracket p \rrbracket = \&\llbracket t \rrbracket \,\}
$$
For the example program from Section 10.2, Steensgaard's algorithm generates the following constraints:
$$
\begin{aligned}
& \llbracket p \rrbracket = \&\llbracket \texttt{alloc-1} \rrbracket \\
& \llbracket x \rrbracket = \llbracket y \rrbracket \\
& \llbracket x \rrbracket = \llbracket z \rrbracket \\
& \llbracket p \rrbracket = \&\alpha_1 \qquad \llbracket z \rrbracket = \alpha_1 \\
& \llbracket p \rrbracket = \llbracket q \rrbracket \\
& \llbracket q \rrbracket = \&\llbracket y \rrbracket \\
& \llbracket p \rrbracket = \&\alpha_2 \qquad \llbracket x \rrbracket = \alpha_2 \\
& \llbracket p \rrbracket = \&\llbracket z \rrbracket
\end{aligned}
$$
This in turn implies that
pt(p) = pt(q) = {alloc-1, y, z}
which is less precise than the result of Andersen's algorithm, but is obtained using a faster algorithm.
Exercise 10.5: Use Steensgaard's algorithm to compute the points-to sets for the two programs from Exercise 10.3 and Exercise 10.4.
Exercise 10.6: Can the constraint rule for X1 = *X2 be simplified from
⟦X2⟧ = &α ∧ ⟦X1⟧ = α
to
⟦X2⟧ = &⟦X1⟧
without affecting the analysis results? Similarly, can the constraint rule for *X1 = X2 be simplified from
⟦X1⟧ = &α ∧ ⟦X2⟧ = α
to
⟦X1⟧ = &⟦X2⟧
without affecting the analysis results?
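The unification view of Steensgaard's algorithm can be made concrete with a union-find structure in which each equivalence class of terms records at most one pointee. The sketch below is one possible encoding, not the book's implementation: the class name `Steensgaard`, its method names, and the three-statement fragment it is run on are all assumptions chosen for illustration.

```python
class Steensgaard:
    """Union-find over term variables; each class has at most one pointee."""

    def __init__(self):
        self.parent = {}
        self.pointee = {}  # class representative -> term it points to
        self.fresh = 0

    def find(self, t):
        self.parent.setdefault(t, t)
        while self.parent[t] != t:
            self.parent[t] = self.parent[self.parent[t]]  # path halving
            t = self.parent[t]
        return t

    def unify(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        self.parent[a] = b
        pa = self.pointee.pop(a, None)
        pb = self.pointee.get(b)
        if pa is not None and pb is not None:
            self.unify(pa, pb)  # term equality axiom: &a1 = &a2 implies a1 = a2
        elif pa is not None:
            self.pointee[b] = pa

    def deref(self, x):
        """Return a term alpha with [[x]] = &alpha, creating a fresh one if needed."""
        r = self.find(x)
        if r not in self.pointee:
            self.fresh += 1
            self.pointee[r] = f"$alpha{self.fresh}"
        return self.pointee[r]

    def addr(self, x1, x2):    # X1 = &X2 (or X = alloc P with x2 = alloc-i)
        self.unify(self.deref(x1), x2)

    def copy(self, x1, x2):    # X1 = X2
        self.unify(x1, x2)

    def load(self, x1, x2):    # X1 = *X2
        self.unify(x1, self.deref(x2))

    def store(self, x1, x2):   # *X1 = X2
        self.unify(self.deref(x1), x2)

    def pt(self, p, cells):
        r = self.find(p)
        if r not in self.pointee:
            return set()
        target = self.find(self.pointee[r])
        return {t for t in cells if self.find(t) == target}


# Hypothetical fragment: a = &b; c = &d; a = c;
cells = ["a", "b", "c", "d"]
s = Steensgaard()
s.addr("a", "b")
s.addr("c", "d")
s.copy("a", "c")
pt_a = s.pt("a", cells)
pt_c = s.pt("c", cells)
```

On this fragment the assignment a = c merges the classes of a and c, which forces b and d into one class as well, so both variables receive the set {b, d}. Andersen's subset constraints would keep pt(c) = {d}, which illustrates the loss of precision caused by treating assignments as bidirectional.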
Interprocedural Points-To Analysis
In languages with both function values and pointers, functions may be stored in the heap, which makes it difficult to perform control flow analysis before points-to analysis. But it is also difficult to perform interprocedural points-to analysis without the information from a control flow analysis. For example, the following function call uses a function value accessed via a pointer dereference and also passes a pointer as argument:
(*x)(x);
The solution to this chicken-and-egg problem is to perform control flow analysis and points-to analysis simultaneously. To express the combined algorithm, we assume that all function calls are normalized to the form
X = X'(X1, ..., Xn);
so that the involved expressions are all variables. Similarly, all return expressions are assumed to be just variables.
Exercise 10.7: Show how to perform such normalization in a systematic manner.
Andersen's algorithm is already similar to control flow analysis, and it can simply be extended with the appropriate constraints. A reference to a constant function f generates the constraint:
$$
f \in \llbracket f \rrbracket
$$
The computed function call generates the constraint
$$
f \in \llbracket X' \rrbracket \implies \big( \llbracket X_1 \rrbracket \subseteq \llbracket X'_1 \rrbracket \wedge \dots \wedge \llbracket X_n \rrbracket \subseteq \llbracket X'_n \rrbracket \wedge \llbracket X'' \rrbracket \subseteq \llbracket X \rrbracket \big)
$$
for every occurrence of a function definition with n parameters
f(X'1, ..., X'n) { ... return X''; }
This will maintain the precision of the control flow analysis.
Exercise 10.8: Design a context-sensitive variant of the Andersen-style points-to analysis. (Hint: see Sections 8.2–8.4.)
Exercise 10.9: Continuing Exercise 10.8, can we still use the cubic algorithm (Section 9.2) to solve the analysis constraints? If so, is the analysis time still O(n^3) where n is the size of the program being analyzed?
Null Pointer Analysis
We are now also able to define an analysis that detects null dereferences. Specifically, we want to ensure that *X is only executed when X is not null. Let us consider intraprocedural analysis, so we can ignore function calls. As before, we assume that the program is normalized, so that all pointer manipulations are of the six kinds described in Section 10.2.
The basic lattice we use, called Null, is:
[Figure: the two-element lattice Null, with ⊤ above NN]
where the bottom element NN means definitely not null and the top element ⊤ represents values that may be null. We then form the following map lattice for abstract states:
States = Cells → Null
For every CFG node v we introduce a constraint variable ⟦v⟧ denoting an element from the map lattice. We shall use each constraint variable to describe an abstract state for the program point immediately after the node. For all nodes that do not involve pointer operations we have the constraint:
$$
\llbracket v \rrbracket = \mathit{JOIN}(v)
$$
where
$$
\mathit{JOIN}(v) = \bigsqcup_{w \in \mathit{pred}(v)} \llbracket w \rrbracket
$$
For a heap load operation X1 = *X2 we need to model the change of the program variable X1. Our abstraction has a single abstract cell for X1. With the assumption of intraprocedural analysis, that abstract cell represents a single concrete cell. (With an interprocedural analysis, we would need to take into account that each stack frame at runtime has an instance of the variable.) For the expression *X2 we can ask the points-to analysis for the possible cells pt(X2). With these observations, we can give a constraint for heap load operations:
$$
X_1 = {*}X_2: \quad \llbracket v \rrbracket = \mathit{load}(\mathit{JOIN}(v), X_1, X_2)
$$
where
$$
\mathit{load}(\sigma, X_1, X_2) = \sigma\Big[X_1 \mapsto \bigsqcup_{\alpha \in pt(X_2)} \sigma(\alpha)\Big]
$$
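Similar reasoning gives constraints for the other operations that affect pointer variables:
$$
\begin{aligned}
X = \texttt{alloc}\ P &: \quad \llbracket v \rrbracket = \mathit{JOIN}(v)[X \mapsto \mathit{NN},\ \texttt{alloc-}i \mapsto \top] \\
X_1 = \&X_2 &: \quad \llbracket v \rrbracket = \mathit{JOIN}(v)[X_1 \mapsto \mathit{NN}] \\
X_1 = X_2 &: \quad \llbracket v \rrbracket = \mathit{JOIN}(v)[X_1 \mapsto \mathit{JOIN}(v)(X_2)] \\
X = \texttt{null} &: \quad \llbracket v \rrbracket = \mathit{JOIN}(v)[X \mapsto \top]
\end{aligned}
$$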
Exercise 10.10: Explain why the above four constraints are monotone and sound.
For a heap store operation *X1 = X2 we need to model the change of whatever X1 points to. That may be multiple abstract cells, namely pt(X1). Moreover, each abstract heap cell alloc-i may describe multiple concrete cells. In the constraint for heap store operations, we must therefore join the new abstract value into the existing one for each affected cell in pt(X1):
$$
{*}X_1 = X_2: \quad \llbracket v \rrbracket = \mathit{store}(\mathit{JOIN}(v), X_1, X_2)
$$
where
$$
\mathit{store}(\sigma, X_1, X_2) = \sigma\big[\alpha \mapsto \sigma(\alpha) \sqcup \sigma(X_2) \ \text{ for each } \alpha \in pt(X_1)\big]
$$
The situation we here see at heap store operations where we model an assignment by joining the new abstract value into the existing one is called a weak update. In contrast, in a strong update the new abstract value overwrites the existing one, which we see in the null pointer analysis at all operations that modify program variables. Strong updates are obviously more precise than weak updates in general, but they may require more elaborate analysis abstractions to detect situations where strong update can be applied soundly.
After performing the null pointer analysis of a given program, a pointer dereference *X at a program point v is guaranteed to be safe if
$$
\mathit{JOIN}(v)(X) = \mathit{NN}
$$
The precision of this analysis depends of course on the quality of the underlying points-to analysis.
Consider the following buggy example program:
p = alloc null;
q = &p;
n = null;
*q = n;
*p = n;
Andersen's algorithm computes the following points-to sets:
pt(p) = {alloc-1}
pt(q) = {p}
pt(n) = ∅
Based on this information, the null pointer analysis generates the following constraints:
$$
\begin{aligned}
\llbracket \texttt{p = alloc null} \rrbracket &= \bot[p \mapsto \mathit{NN},\ \texttt{alloc-1} \mapsto \top] \\
\llbracket \texttt{q = \&p} \rrbracket &= \llbracket \texttt{p = alloc null} \rrbracket[q \mapsto \mathit{NN}] \\
\llbracket \texttt{n = null} \rrbracket &= \llbracket \texttt{q = \&p} \rrbracket[n \mapsto \top] \\
\llbracket \texttt{*q = n} \rrbracket &= \llbracket \texttt{n = null} \rrbracket[p \mapsto \llbracket \texttt{n = null} \rrbracket(p) \sqcup \llbracket \texttt{n = null} \rrbracket(n)] \\
\llbracket \texttt{*p = n} \rrbracket &= \llbracket \texttt{*q = n} \rrbracket[\texttt{alloc-1} \mapsto \llbracket \texttt{*q = n} \rrbracket(\texttt{alloc-1}) \sqcup \llbracket \texttt{*q = n} \rrbracket(n)]
\end{aligned}
$$
The least solution is:
$$
\begin{aligned}
\llbracket \texttt{p = alloc null} \rrbracket &= [p \mapsto \mathit{NN},\ q \mapsto \mathit{NN},\ n \mapsto \mathit{NN},\ \texttt{alloc-1} \mapsto \top] \\
\llbracket \texttt{q = \&p} \rrbracket &= [p \mapsto \mathit{NN},\ q \mapsto \mathit{NN},\ n \mapsto \mathit{NN},\ \texttt{alloc-1} \mapsto \top] \\
\llbracket \texttt{n = null} \rrbracket &= [p \mapsto \mathit{NN},\ q \mapsto \mathit{NN},\ n \mapsto \top,\ \texttt{alloc-1} \mapsto \top] \\
\llbracket \texttt{*q = n} \rrbracket &= [p \mapsto \top,\ q \mapsto \mathit{NN},\ n \mapsto \top,\ \texttt{alloc-1} \mapsto \top] \\
\llbracket \texttt{*p = n} \rrbracket &= [p \mapsto \top,\ q \mapsto \mathit{NN},\ n \mapsto \top,\ \texttt{alloc-1} \mapsto \top]
\end{aligned}
$$
By inspecting this information, an analysis could statically detect that when *q = n is evaluated, which is immediately after n = null, the variable q is definitely non-null. On the other hand, when *p = n is evaluated, we cannot rule out the possibility that p may contain null.
Exercise 10.11: Show an alternative constraint for heap load operations using weak update, together with an example program where the modified analysis then gives a result that is less precise than the analysis presented above.
Exercise 10.12: Show an (unsound) alternative constraint for heap store operations using strong update, together with an example program where the modified analysis then gives a wrong result.
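The transfer functions of the null pointer analysis are small enough to execute directly. The sketch below applies them to the buggy example program of this section; since that program is straight-line code, JOIN(v) is simply the state after the preceding statement. The names `transfer`, `join`, and the tuple encoding of statements are assumptions for illustration, and the points-to sets are the ones quoted above from Andersen's algorithm.

```python
NN, TOP = "NN", "T"  # the two elements of the lattice Null


def join(a, b):
    """Least upper bound in Null: the result is T as soon as either side is T."""
    return TOP if TOP in (a, b) else NN


def transfer(state, stmt, pt):
    """Abstract state after one normalized statement, given the state before it."""
    kind, lhs, rhs = stmt
    new = dict(state)
    if kind == "alloc":      # X = alloc P, rhs names the abstract cell alloc-i
        new[lhs] = NN
        new[rhs] = TOP
    elif kind == "addr":     # X1 = &X2
        new[lhs] = NN
    elif kind == "copy":     # X1 = X2
        new[lhs] = state[rhs]
    elif kind == "null":     # X = null
        new[lhs] = TOP
    elif kind == "load":     # X1 = *X2, strong update of X1
        value = NN           # NN is the bottom element
        for cell in pt[rhs]:
            value = join(value, state[cell])
        new[lhs] = value
    elif kind == "store":    # *X1 = X2, weak update of every cell in pt(X1)
        for cell in pt[lhs]:
            new[cell] = join(state[cell], state[rhs])
    else:
        raise ValueError(f"unknown statement kind: {kind}")
    return new


pt = {"p": {"alloc-1"}, "q": {"p"}, "n": set()}
program = [
    ("alloc", "p", "alloc-1"),  # p = alloc null
    ("addr", "q", "p"),         # q = &p
    ("null", "n", None),        # n = null
    ("store", "q", "n"),        # *q = n
    ("store", "p", "n"),        # *p = n
]

state = {cell: NN for cell in ["p", "q", "n", "alloc-1"]}  # bottom of States
states = []  # states[i] is the abstract state immediately after program[i]
for stmt in program:
    state = transfer(state, stmt, pt)
    states.append(state)

# A dereference *X is safe when X maps to NN in the state before the node.
deref_q_safe = states[2]["q"] == NN  # state before *q = n
deref_p_safe = states[3]["p"] == NN  # state before *p = n
```

The store case never overwrites an existing value, which is exactly the weak update discussed above, whereas every case that assigns to a program variable replaces the old value.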
Flow-Sensitive Points-To Analysis
Note that we can produce interesting heap structures with TIP programs, even without records. An example of a nontrivial heap is
[Figure: a heap with structures reachable from the program variables x, y, and z]
where x, y, and z are program variables. We will seek to answer questions about disjointness of the structures contained in program variables. In the example above, x and y are not disjoint whereas y and z are. Such information may be useful, for example, to automatically parallelize execution in an optimizing compiler. For such analysis, flow-insensitive reasoning is sometimes too imprecise. However, we can create a flow-sensitive variant of Andersen's analysis.
We use a lattice of points-to graphs, which are directed graphs in which the nodes are the abstract cells for the given program and the edges correspond to possible pointers. Points-to graphs are ordered by inclusion of their sets of edges. Thus, ⊥ is the graph without edges and ⊤ is the completely connected graph. Formally, our lattice for abstract states is then
States = 2^(Cells × Cells)
ordered by the usual subset inclusion. For every CFG node v we introduce a constraint variable ⟦v⟧ denoting a points-to graph that describes all possible stores at that program point. For the nodes corresponding to the various pointer manipulations we have these constraints:
$$
\begin{aligned}
X = \texttt{alloc}\ P &: \quad \llbracket v \rrbracket = \mathit{JOIN}(v){\downarrow}X \cup \{(X, \texttt{alloc-}i)\} \\
X_1 = \&X_2 &: \quad \llbracket v \rrbracket = \mathit{JOIN}(v){\downarrow}X_1 \cup \{(X_1, X_2)\} \\
X_1 = X_2 &: \quad \llbracket v \rrbracket = \mathit{assign}(\mathit{JOIN}(v), X_1, X_2) \\
X_1 = {*}X_2 &: \quad \llbracket v \rrbracket = \mathit{load}(\mathit{JOIN}(v), X_1, X_2) \\
{*}X_1 = X_2 &: \quad \llbracket v \rrbracket = \mathit{store}(\mathit{JOIN}(v), X_1, X_2) \\
X = \texttt{null} &: \quad \llbracket v \rrbracket = \mathit{JOIN}(v){\downarrow}X
\end{aligned}
$$
and for all other nodes:
$$
\llbracket v \rrbracket = \mathit{JOIN}(v)
$$
where
$$
\begin{aligned}
\mathit{JOIN}(v) &= \bigcup_{w \in \mathit{pred}(v)} \llbracket w \rrbracket \\
\sigma{\downarrow}x &= \{(s, t) \in \sigma \mid s \neq x\} \\
\mathit{assign}(\sigma, x, y) &= \sigma{\downarrow}x \cup \{(x, t) \mid (y, t) \in \sigma\} \\
\mathit{load}(\sigma, x, y) &= \sigma{\downarrow}x \cup \{(x, t) \mid (y, s) \in \sigma,\ (s, t) \in \sigma\} \\
\mathit{store}(\sigma, x, y) &= \sigma \cup \{(s, t) \mid (x, s) \in \sigma,\ (y, t) \in \sigma\}
\end{aligned}
$$
Notice that the constraint for heap store operations uses weak update.
Exercise 10.13: Explain the above constraints.
Consider now the following program:
var x,y,n,p,q;
x = alloc null; y = alloc null;
*x = null; *y = y;
n = input;
while (n>0) {
  p = alloc null; q = alloc null;
  *p = x; *q = y;
  x = p; y = q;
  n = n-1;
}
After the loop, the analysis produces the following points-to graph:
[Figure: points-to graph over the nodes p, q, x, y, alloc-1, alloc-2, alloc-3, and alloc-4]
Frоm this rеsult wе cаn sаfеly cоncludе thаt x аnd y will аlwаys bе disjоint. Nоtе thаt this аnаlysis аlsо cоmputеs а flоw sеnsitivе pоints-tо mаp thаt fоr еаch prоgrаm pоint v is dеfinеd by: pt (p) = {t | (p, t) ∈ [ v] } This аnаlysis is mоrе prеcisе thаn Аndеrsеn‟s аlgоrithm, but clеаrly аlsо mоrе еxpеnsivе tо pеrfоrm. Аs аn еxаmplе, cоnsidеr thе prоgrаm:
x = &y;
x = &z;
After these statements, Andersen's algorithm would predict that pt(x) = {y, z} whereas the flow-sensitive analysis computes pt(x) = {z} for the final program point.
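The set-based definitions of the flow-sensitive transfer functions translate almost literally into code. The following sketch represents a points-to graph as a Python set of edge pairs and replays the two-statement program above; the function names mirror the definitions in this section, while `kill` (for σ↓x), `addr`, and `pt_at` are helper names assumed for this illustration.

```python
def kill(sigma, x):
    """sigma restricted to edges that do not start at x (written sigma↓x above)."""
    return {(s, t) for (s, t) in sigma if s != x}


def addr(sigma, x1, x2):
    """X1 = &X2 (also usable for X = alloc P with x2 = alloc-i)."""
    return kill(sigma, x1) | {(x1, x2)}


def assign(sigma, x, y):
    """X = Y: strong update of x with everything y points to."""
    return kill(sigma, x) | {(x, t) for (s, t) in sigma if s == y}


def load(sigma, x, y):
    """X = *Y: x points to whatever the targets of y point to."""
    targets_of_y = {s for (src, s) in sigma if src == y}
    return kill(sigma, x) | {(x, t) for (s, t) in sigma if s in targets_of_y}


def store(sigma, x, y):
    """*X = Y: weak update, edges are only added."""
    targets_of_x = {s for (src, s) in sigma if src == x}
    targets_of_y = {t for (src, t) in sigma if src == y}
    return sigma | {(s, t) for s in targets_of_x for t in targets_of_y}


def pt_at(sigma, p):
    """Flow-sensitive points-to set of p in the graph sigma."""
    return {t for (s, t) in sigma if s == p}


sigma0 = set()                    # graph before the first statement
sigma1 = addr(sigma0, "x", "y")   # x = &y
sigma2 = addr(sigma1, "x", "z")   # x = &z
final_pt_x = pt_at(sigma2, "x")
```

The second statement removes the edge from x to y before adding the edge to z, which is why the flow-sensitive result at the final program point is smaller than the flow-insensitive one.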
10.7
Еscаpе Аnаlysis
We earlier lamented the escaping stack cell error displayed by the following program, which was beyond the scope of the type analysis.
baz() {
  var x;
  return &x;
}
main() {
  var p;
  p=baz();
  *p=1;
  return *p;
}
Having performed a points-to analysis, we can easily perform an escape analysis to catch such errors. We just need to check that the possible cells for return expressions in the points-to graph cannot reach arguments or variables defined in the function itself, since all other locations must then necessarily reside in earlier frames on the invocation stack.
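The check described above amounts to a reachability query on the points-to graph. The following is a minimal sketch under assumed conventions: the graph is a dictionary from abstract cells to the sets of cells they may point to, and the cell name `baz.x` (the variable x of baz) is a naming scheme invented for this example.

```python
def reachable(graph, start):
    """All cells reachable from the cells in start by following points-to edges."""
    seen, stack = set(), list(start)
    while stack:
        cell = stack.pop()
        if cell not in seen:
            seen.add(cell)
            stack.extend(graph.get(cell, ()))
    return seen


def escaping_cells(graph, returned_cells, local_cells):
    """Cells of the function's own frame reachable from its return expression."""
    return reachable(graph, returned_cells) & set(local_cells)


# baz returns &x, so the possible cells of its return expression are {baz.x},
# and baz.x is a variable defined in baz itself.
graph = {"baz.x": set()}
escaped = escaping_cells(graph, {"baz.x"}, {"baz.x"})

# A function that returns a freshly allocated cell does not leak its frame.
safe = escaping_cells({"alloc-1": set()}, {"alloc-1"}, {"f.tmp"})
```

A nonempty result signals a potential escaping stack cell; an empty result means every cell reachable from the returned value lives outside the function's own frame.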
Chаptеr 11
Abstract Interpretation
In the preceding chapters we have used the term soundness of an analysis only informally: if an analysis is sound, the properties it infers for a given program hold in all actual executions of the program. The theory of abstract interpretation provides a solid mathematical foundation for what it means for an analysis to be sound, by relating the analysis specification to the formal semantics of the programming language. Another use of abstract interpretation is for understanding whether an analysis design, or a part of an analysis design, is as precise as possible relative to a choice of analysis lattice and where imprecision may arise. The fundamental ideas of abstract interpretation were introduced by Cousot and Cousot in the 1970s [CC76, CC77, CC79b].
A Collecting Semantics for TIP
We begin by defining formal semantics of the same subset of TIP that we used for the sign analysis in Sections 4.1 and 5.1, meaning that we ignore function calls, pointers, and records. In this process we clarify some of the under-specified parts of TIP as discussed in Exercise 2.1. Instead of using traditional styles of semantics, such as operational semantics or denotational semantics, we choose a constraint-based approach that aligns well with our formulations of the analyses presented in the preceding chapters. Moreover, we choose to define the semantics based on the CFG representation of TIP programs. These choices allow us to more concisely relate the semantics and the analysis. What matters is that the semantics captures the meaning of programs in ordinary executions, without any approximations. The semantics specifies how a concrete program execution works, whereas our analyses can be thought of as abstract interpreters.¹
¹ The research literature on abstract interpretation sometimes refers to what we here call semantics as the "concrete semantics" and the analysis specification is called the "abstract semantics".
A concrete state is a partial map from program variables to integers:²
ConcreteStates = Vars ↪ Z
For every CFG node v we have a constraint variable that ranges over sets of concrete states:
{[v]} ⊆ ConcreteStates
The idea is that {[v]} shall denote the set of concrete states that are possible at the program point immediately after the instruction represented by v, in some execution of the program. This is called a collecting semantics, because it "collects" the possible states. In the exercises at the end of Section 11.6 we shall study other kinds of collecting semantics that collect relevant information suitable for other analysis, such as live variables analysis. We choose to focus on the program point immediately after the instruction of the CFG node, instead of the program point before, to align with our sign analysis and the other forward analyses from Chapter 5.
Consider this simple program as an example:
var x;
x = 0;
while (input) {
  x = x + 2;
}
Its CFG looks as follows, where the bullets represent the program points that have associated constraint variables.
[Figure: CFG with the nodes entry, var x, x = 0, input, x = x + 2, and exit; the input node has a true edge to x = x + 2 and a false edge to exit]
The solution we are interested in maps the constraint variable {[x = 0]} to the single state where x is zero, {[x = x + 2]} is mapped to the set of all states where x is a positive even number, and similarly for the other program points.
As a first step, we define some useful auxiliary functions, ceval, csucc, and CJOIN, that have a close connection to the auxiliary functions used in the specification of the sign analysis, but now for concrete executions instead of abstract executions.
The function ceval : ConcreteStates × E → 2^Z gives the semantics of evaluating an expression E relative to a concrete state ρ ∈ ConcreteStates, which results in a set of possible integer values, defined inductively as follows:³
$$
\begin{aligned}
\mathit{ceval}(\rho, X) &= \{\rho(X)\} \\
\mathit{ceval}(\rho, I) &= \{I\} \\
\mathit{ceval}(\rho, \texttt{input}) &= \mathbb{Z} \\
\mathit{ceval}(\rho, E_1 + E_2) &= \{z_1 + z_2 \mid z_1 \in \mathit{ceval}(\rho, E_1) \wedge z_2 \in \mathit{ceval}(\rho, E_2)\} \\
\mathit{ceval}(\rho, E_1 / E_2) &= \{z_1 / z_2 \mid z_1 \in \mathit{ceval}(\rho, E_1) \wedge z_2 \in \mathit{ceval}(\rho, E_2)\}
\end{aligned}
$$
Evaluation of the other binary operators is defined similarly. In this simple subset of TIP we consider here, evaluating an expression cannot affect the values of the program variables. Also note that division by zero simply results in the empty set of values.
We overload ceval such that it also works on sets of concrete states, ceval : 2^ConcreteStates × E → 2^Z:
$$
\mathit{ceval}(R, E) = \bigcup_{\rho \in R} \mathit{ceval}(\rho, E)
$$
The function csucc : ConcreteStates × Nodes → 2^Nodes gives the set of possible successors of a CFG node relative to a concrete state. If v is an if or while node with branch condition E and ρ ∈ ConcreteStates, then csucc(ρ, v) contains v's true successor in the CFG if z ∈ ceval(ρ, E) for some z ≠ 0, it contains v's false successor if 0 ∈ ceval(ρ, E), and it contains no other nodes. (Since evaluation of expressions does not affect the values of the program variables, as noted above, conveniently it does not matter whether the states given as first argument to csucc belong to the program point immediately before or immediately after the if/while node.) Note that csucc may return two successors, even for a single concrete state (if the branch condition contains input), and it also may return zero successors (in case of division by zero). For all other kinds of nodes, let csucc(ρ, v) = succ(v). Similar to the definition of ceval, we also overload csucc to work on sets of concrete states, csucc : 2^ConcreteStates × Nodes → 2^Nodes:
$$
\mathit{csucc}(R, v) = \bigcup_{\rho \in R} \mathit{csucc}(\rho, v)
$$
For a CFG node v, CJOIN(v) denotes the set of states at the program point immediately before the instruction represented by v, relative to the states at the program points after the relevant other nodes according to the csucc function:⁴
$$
\mathit{CJOIN}(v) = \{\rho \in \mathit{ConcreteStates} \mid \exists w \in \mathit{Nodes} : \rho \in \{[w]\} \wedge v \in \mathit{csucc}(\rho, w)\}
$$
² We use the notation A ↪ B to denote the set of partial functions from A to B.
³ We slightly abuse notation by using E both as an arbitrary expression and as the set of all expressions, and similarly for the other syntactic categories. We also let I denote both an arbitrary syntactic numeral and the mathematical integer it describes, and for simplicity we do not restrict the numeric computations to, for example, 64 bit signed integers.
⁴ Note that CJOIN(v) is a function of all the constraint variables {[v1]}, ..., {[vn]} for the entire program, just like JOIN(v) is a function of all the constraint variables ⟦v1⟧, ..., ⟦vn⟧.
Exercise 11.1: Convince yourself that this definition of CJOIN makes sense, especially for the cases where v is a node with multiple incoming edges (like input in the example on page 126) or it is the first node of a branch (like x = x + 2 in the example).
The semantics of a node v that represents an assignment statement X = E can now be expressed as the following constraint rule:
$$
\{[X = E]\} = \{\rho[X \mapsto z] \mid \rho \in \mathit{CJOIN}(v) \wedge z \in \mathit{ceval}(\rho, E)\}
$$
This rule formalizes the runtime behavior of assignments: for every state ρ that may appear immediately before executing the assignment, the state after the assignment is obtained by overwriting the value of X with the result of evaluating E.
If v is a variable declaration, var X1, ..., Xn, we use this rule:
$$
\{[\texttt{var } X_1, \dots, X_n]\} = \{\rho[X_1 \mapsto z_1, \dots, X_n \mapsto z_n] \mid \rho \in \mathit{CJOIN}(v) \wedge z_1 \in \mathbb{Z} \wedge \dots \wedge z_n \in \mathbb{Z}\}
$$
The only possible initial state at entry nodes is the partial map that is undefined for all program variables, denoted []:
$$
\{[\mathit{entry}]\} = \{[]\}
$$
For all other kinds of nodes, we have this trivial constraint rule:
$$
\{[v]\} = \mathit{CJOIN}(v)
$$
Notice the resemblance with the analysis constraints from Section 5.1.
Exercise 11.2: Define a suitable constraint rule that expresses the semantics of assert statements (see Section 7.1) in our collecting semantics. (For this exercise, think of assert statements as written explicitly by the programmers anywhere in the programs, not just for use in control sensitive analysis.)
Conveniently, the set of valuations of each constraint variable forms a powerset lattice (see Section 4.3), 2^ConcreteStates ordered by subset. For the entire program, we thus have the product lattice (2^ConcreteStates)^n where n is the number of CFG nodes for the program. The powerset of concrete values, 2^Z, similarly forms a lattice.
A program with n CFG nodes, v1, ..., vn, is thus represented by n equations,
$$
\begin{aligned}
\{[v_1]\} &= \mathit{cf}_1(\{[v_1]\}, \dots, \{[v_n]\}) \\
\{[v_2]\} &= \mathit{cf}_2(\{[v_1]\}, \dots, \{[v_n]\}) \\
&\ \ \vdots \\
\{[v_n]\} &= \mathit{cf}_n(\{[v_1]\}, \dots, \{[v_n]\})
\end{aligned}
$$
much like the analysis of a program is expressed as an equation system in Sections 4.4 and 5.1. In the same way, we can combine the n functions into one, cf : (2^ConcreteStates)^n → (2^ConcreteStates)^n, defined by
$$
\mathit{cf}(x_1, \dots, x_n) = \big(\mathit{cf}_1(x_1, \dots, x_n), \dots, \mathit{cf}_n(x_1, \dots, x_n)\big)
$$
in which case the equation system looks like
$$
x = \mathit{cf}(x)
$$
where x ∈ (2^ConcreteStates)^n.
Moreover, all the constraints defined by the rules above (in particular the function cf) are not just monotone but continuous. A function f : L1 → L2 where L1 and L2 are lattices is continuous if
$$
f\Big(\bigsqcup A\Big) = \bigsqcup_{a \in A} f(a) \quad \text{for every } A \subseteq L_1
$$
If f is continuous it is also monotone. For finite lattices, continuity coincides with distributivity as defined in Exercise 4.18.
Exercise 11.3: Prove that every continuous function is also monotone.
With these definitions and observations, we can define the semantics of a given program as the least solution to the generated constraints. (A solution to a constraint system is, as usual, a valuation of the constraint variables that satisfies all the constraints; in other words, a fixed-point of cf.)
To motivate why we are interested in the least solution, consider again the example program from page 126. It only contains one variable, so Vars = {x}. Here are two different solutions to the semantic constraints:

                   solution 1                              solution 2
{[entry]}          {[]}                                    {[]}
{[var x]}          {[x ↦ z] | z ∈ Z}                       {[x ↦ z] | z ∈ Z}
{[x = 0]}          {[x ↦ 0]}                               {[x ↦ 0]}
{[input]}          {[x ↦ z] | z ∈ {0, 2, 4, ...}}          {[x ↦ z] | z ∈ Z}
{[x = x + 2]}      {[x ↦ z] | z ∈ {2, 4, ...}}             {[x ↦ z] | z ∈ Z}
{[exit]}           {[x ↦ z] | z ∈ {0, 2, 4, ...}}          {[x ↦ z] | z ∈ Z}
Naturally, for this particular program we want a solution where x in the loop and at the exit point can only be a nonnegative even integer, not an arbitrary integer.
Exercise 11.4:
(a) Check that both of the above solutions are indeed solutions to the constraints for this particular program.
(b) Give an example of yet another solution.
(c) Argue that solution 1 is the least of all solutions.
As readers familiar with theory of programming language semantics know, the fixed-point theorem from Chapter 4 (page 43) also holds for infinite-height lattices provided that f is continuous. This tells us that a unique least solution always exists, even though the solution, of course, generally cannot be computed using the naive fixed-point algorithm.⁵
Exercise 11.5: Prove that if L is a lattice (not necessarily with finite height) and the function f : L → L is continuous, then
$$
\mathit{fix}(f) = \bigsqcup_{i \geq 0} f^i(\bot)
$$
is a unique least fixed-point for f.
Exercise 11.6: Show that the constraints defined by the rules above are indeed continuous. (Note that for each of the n constraint variables, the associated constraint is a function cf_v : (2^ConcreteStates)^n → 2^ConcreteStates.) Then show that the combined constraint function cf is also continuous.
Exercise 11.7: Show that the lattice (2^ConcreteStates)^n has infinite height.
Recall the example program from Sections 4.1 and 5.1:
var a,b,c;
a = 42;
b = 87;
if (input) {
  c = a + b;
} else {
  c = a - b;
}
For this program, at the program points immediately after the assignment b = 87, immediately after the assignment c = a - b (at the end of the else branch), and the exit, the following sets of concrete states are possible according to the collecting semantics:
$$
\begin{aligned}
\{[\texttt{b = 87}]\} &= \{[a \mapsto 42, b \mapsto 87, c \mapsto z] \mid z \in \mathbb{Z}\} \\
\{[\texttt{c = a - b}]\} &= \{[a \mapsto 42, b \mapsto 87, c \mapsto -45]\} \\
\{[\mathit{exit}]\} &= \{[a \mapsto 42, b \mapsto 87, c \mapsto 129],\ [a \mapsto 42, b \mapsto 87, c \mapsto -45]\}
\end{aligned}
$$
Exercise 11.8: Check that the least fixed point of the semantic constraints for the program is indeed these sets, for the three program points.
In comparison, the sign analysis specified in Section 5.1 computes the following abstract states for the same program points:
$$
\begin{aligned}
\llbracket \texttt{b = 87} \rrbracket &= [a \mapsto +, b \mapsto +, c \mapsto \top] \\
\llbracket \texttt{c = a - b} \rrbracket &= [a \mapsto +, b \mapsto +, c \mapsto \top] \\
\llbracket \mathit{exit} \rrbracket &= [a \mapsto +, b \mapsto +, c \mapsto \top]
\end{aligned}
$$
⁵ See also Exercise 4.30.
In this specific case the analysis result is almost the best we could hope for, with that choice of analysis lattice. Notice that the abstract value of c at the program point after c = a - b is ⊤, although the only possible value in concrete executions is −45. This is an example of a conservative analysis result.
Exercise 11.9: For readers familiar with traditional operational semantics or denotational semantics: Specify the semantics for the same subset of TIP as considered above, but this time using operational semantics or denotational semantics. Your semantics and the one specified above should be equivalent in the following sense: For every program P, at every program point p in P, the set of concrete states that may appear at p in some execution of P is the same for the two different styles of semantics. Then prove that your semantics has this property. (We continue with this topic in Section 11.6.)
For use later in this chapter, let us introduce the notation {[P]} = fix(cf) and ⟦P⟧ = fix(af) where cf is the semantic constraint function and af is the analysis constraint function for a given program P. In other words, {[P]} denotes the semantics of P, and ⟦P⟧ denotes the analysis result for P.
Abstraction and Concretization
To clarify the connection between concrete information and abstract information for the sign analysis example, let us consider three different abstraction functions that tell us how each element from the semantic lattices is most precisely described by an element in the analysis lattices. The functions map sets of concrete values, sets of concrete states, and n-tuples of sets of concrete states to their abstract counterparts:
αa : 2^Z → Sign
αb : 2^ConcreteStates → States
αc : (2^ConcreteStates)^n → States^n
As before, 2^Z is the powerset lattice defined over the set of integers ordered by subset, Sign is the sign lattice from Section 4.1, we define ConcreteStates = Vars ↪ Z and States = Vars → Sign as in Sections 11.1 and 4.1, respectively, and n is the number of CFG nodes. The functions are defined as follows, to precisely capture the informal descriptions given earlier:
$$
\alpha_a(D) =
\begin{cases}
\bot & \text{if } D \text{ is empty} \\
+ & \text{if } D \text{ is nonempty and contains only positive integers} \\
- & \text{if } D \text{ is nonempty and contains only negative integers} \\
0 & \text{if } D \text{ is nonempty and contains only the integer } 0 \\
\top & \text{otherwise}
\end{cases}
\qquad \text{for any } D \in 2^{\mathbb{Z}}
$$
$$
\begin{aligned}
\alpha_b(R) &= \sigma \ \text{ where } \sigma(X) = \alpha_a(\{\rho(X) \mid \rho \in R\}) && \text{for any } R \subseteq \mathit{ConcreteStates} \text{ and } X \in \mathit{Vars} \\
\alpha_c(R_1, \dots, R_n) &= (\alpha_b(R_1), \dots, \alpha_b(R_n)) && \text{for any } R_1, \dots, R_n \subseteq \mathit{ConcreteStates}
\end{aligned}
$$
It is a natural condition that abstraction functions are monotone. Intuitively, a larger set of concrete values or states should not be represented by a smaller abstract element in the lattice order.
Exercise 11.10: Argue that the three functions αa, αb, and αc defined above for the sign analysis are monotone.
Dually we may define concretization functions that express the meaning of the analysis lattice elements in terms of the concrete values, states, and n-tuples of states:
γa : Sign → 2^Z
γb : States → 2^ConcreteStates
γc : States^n → (2^ConcreteStates)^n
defined by
$$
\gamma_a(s) =
\begin{cases}
\emptyset & \text{if } s = \bot \\
\{1, 2, 3, \dots\} & \text{if } s = + \\
\{-1, -2, -3, \dots\} & \text{if } s = - \\
\{0\} & \text{if } s = 0 \\
\mathbb{Z} & \text{if } s = \top
\end{cases}
\qquad \text{for any } s \in \mathit{Sign}
$$
$$
\begin{aligned}
\gamma_b(\sigma) &= \{\rho \in \mathit{ConcreteStates} \mid \rho(X) \in \gamma_a(\sigma(X)) \text{ for all } X \in \mathit{Vars}\} && \text{for any } \sigma \in \mathit{States} \\
\gamma_c(\sigma_1, \dots, \sigma_n) &= (\gamma_b(\sigma_1), \dots, \gamma_b(\sigma_n)) && \text{for any } (\sigma_1, \dots, \sigma_n) \in \mathit{States}^n
\end{aligned}
$$
Concretization functions are, like abstraction functions, naturally monotone.
Exercise 11.11: Argue that the three functions γa, γb, and γc from the sign analysis example are monotone.
Furthermore, abstraction functions and concretization functions that arise naturally when developing program analyses are closely connected. If L1 and L2 are lattices, α : L1 → L2 is an abstraction function, and γ : L2 → L1 is a concretization function, then α and γ usually have the following properties:
• γ ◦ α is extensive (meaning that x ⊑ γ(α(x)) for all x ∈ L1; see Exercise 4.16), and
• α ◦ γ is reductive (meaning that α(γ(y)) ⊑ y for all y ∈ L2).
The pair of monotone functions, α and γ, is called a Galois connection if it satisfies these two properties.
The intuition of the first property is that abstraction may lose precision but must be safe. One way to interpret the second property is that the abstraction function should always give the most precise possible abstract description for any element in the semantic lattice. In many cases, α ◦ γ is the identity function. The two properties can be illustrated as follows, using αc and γc from the sign analysis as an example:
[Figure: two diagrams relating (2^ConcreteStates)^n and States^n through αc and γc, one starting from an element x of (2^ConcreteStates)^n and one starting from an element y of States^n]
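The functions αa and γa for the sign lattice can be explored on small inputs. In the sketch below, αa is computed for finite sets of integers only, and γa is represented by a membership test because the concretizations of +, −, and ⊤ are infinite; the names `alpha_a`, `in_gamma_a`, and `leq`, as well as the string encoding of the lattice elements, are assumptions made for this illustration.

```python
BOT, POS, NEG, ZERO, TOP = "bot", "+", "-", "0", "top"


def alpha_a(values):
    """Most precise sign describing a finite set of integers."""
    values = set(values)
    if not values:
        return BOT
    if all(v > 0 for v in values):
        return POS
    if all(v < 0 for v in values):
        return NEG
    if values == {0}:
        return ZERO
    return TOP


def in_gamma_a(sign, z):
    """Membership test: is the integer z an element of gamma_a(sign)?"""
    return {BOT: False, POS: z > 0, NEG: z < 0, ZERO: z == 0, TOP: True}[sign]


def leq(s1, s2):
    """Partial order of the sign lattice: bot below everything, top above everything."""
    return s1 == s2 or s1 == BOT or s2 == TOP


# gamma ∘ alpha is extensive: every element of D lies in gamma_a(alpha_a(D)).
samples = [set(), {3}, {1, 2, 7}, {-4, -1}, {0}, {0, 5}, {-2, 9}]
extensive_on_samples = all(
    in_gamma_a(alpha_a(d), z) for d in samples for z in d
)

# Monotonicity on the samples: a larger set never gets a smaller abstract value.
monotone_on_samples = all(
    leq(alpha_a(d1), alpha_a(d2))
    for d1 in samples for d2 in samples if d1 <= d2
)
```

The set {0, 5} shows why γ ◦ α is not the identity in general: its abstraction is ⊤, whose concretization is all of Z, so precision is lost while safety is preserved.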
Еxеrcisе 11.12: Shоw thаt аll thrее pаirs оf аbstrаctiоn аnd cоncrеtizаtiоn functiоns (αа, γа), (αb, γb), аnd (αc, γc ) frоm thе sign аnаlysis еxаmplе аrе Gаlоis cоnnеctiоns.
Еxеrcisе 11.13: Shоw thаt αа ◦γа , αb ◦γb , аnd αc ◦γc аrе аll еquаl tо thе idеntity functiоn, fоr thе thrее pаirs оf аbstrаctiоn аnd cоncrеtizаtiоn functiоns frоm thе sign аnаlysis.
Еxеrcisе 11.14: Аrguе thаt γ ◦ α is typicаlly nоt thе idеntity functiоn, whеn α : L1 → L2 is аn аbstrаctiоn functiоn аnd γ : L2 → L1 is thе аssоciаtеd cоncrеtizаtiоn functiоn fоr sоmе аnаlysis. (Hint: cоnsidеr αа аnd γа frоm thе sign аnаlysis еxаmplе.)
Exercise 11.15: Give an example of an analysis with abstraction function α and concretization function γ, such that α ◦ γ is not the identity function.
Exercise 11.16: Prove the following theorem, which provides an alternative definition of Galois connections.
α and γ are monotone, γ ◦ α is extensive, and α ◦ γ is reductive if and only if
$$
\forall x \in L_1,\ y \in L_2 : \ \alpha(x) \sqsubseteq y \iff x \sqsubseteq \gamma(y)
$$
where L1 and L2 are lattices, α : L1 → L2, and γ : L2 → L1.
This theorem is useful for some of the later exercises in this section.
Exercise 11.17: Show that if α : L1 → L2 and γ : L2 → L1 form a Galois connection, then α is continuous, i.e.
$$
\alpha\Big(\bigsqcup A\Big) = \bigsqcup_{a \in A} \alpha(a) \quad \text{for every } A \subseteq L_1
$$
(Hint: see Exercise 11.16.)
We shall use this result in the soundness argument in Section 11.3. Not surprisingly, the dual property also holds: γ satisfies
$$
\gamma\Big(\bigsqcap B\Big) = \bigsqcap_{b \in B} \gamma(b) \quad \text{for every } B \subseteq L_2
$$
when α and γ form a Galois connection.
Exercise 11.18: Show that if $\alpha$ and $\gamma$ form a Galois connection, then $\alpha(\bot) = \bot$ and $\gamma(\top) = \top$. (Hint: see Exercises 4.7 and 11.17.)
We have argued that the Galois connection property is natural for any reasonable pair of an abstraction function and a concretization function, including those that appear in our sign analysis example. The following exercise tells us that it always suffices to specify either $\alpha$ or $\gamma$: the other is then uniquely determined by the requirement that the two together form a Galois connection.
Exercise 11.19: Prove the following properties about Galois connections: If $L_1$ and $L_2$ are lattices and the functions $\alpha : L_1 \to L_2$ and $\gamma : L_2 \to L_1$ form a Galois connection, then $\gamma$ is uniquely determined by $\alpha$:
$$\gamma(y) = \bigsqcup_{x \in L_1 \text{ where } \alpha(x) \sqsubseteq y} x$$
for all $y \in L_2$. Conversely, $\alpha$ is uniquely determined by $\gamma$:
$$\alpha(x) = \bigsqcap_{y \in L_2 \text{ where } x \sqsubseteq \gamma(y)} y$$
for all $x \in L_1$. (Hint: see Exercise 11.16.)
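The two formulas of Exercise 11.19 can be evaluated directly when the lattices are finite. The following sketch assumes the five-element Sign lattice and a bounded universe `U` in place of $\mathbb{Z}$; the table `GAMMA` and the helper names are hypothetical. It recomputes $\gamma$ from $\alpha$ and $\alpha$ from $\gamma$, so the results can be compared with the hand-written definitions.

```python
from itertools import combinations

BOT, NEG, ZERO, POS, TOP = "bot", "-", "0", "+", "top"
SIGNS = [BOT, NEG, ZERO, POS, TOP]
U = range(-2, 3)  # assumption: finite stand-in for Z


def leq(a, b):
    """Order of the Sign lattice."""
    return a == b or a == BOT or b == TOP


def alpha(ds):
    """Hand-written abstraction of a set of integers."""
    signs = {NEG if z < 0 else ZERO if z == 0 else POS for z in ds}
    if not signs:
        return BOT
    return signs.pop() if len(signs) == 1 else TOP


# Hand-written concretization, restricted to U.
GAMMA = {BOT: frozenset(), NEG: frozenset({-2, -1}), ZERO: frozenset({0}),
         POS: frozenset({1, 2}), TOP: frozenset(U)}


def subsets(universe):
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def gamma_from_alpha(y):
    """Join (here: union) of all x whose abstraction is below y."""
    result = frozenset()
    for x in subsets(U):
        if leq(alpha(x), y):
            result |= x
    return result


def alpha_from_gamma(x):
    """Meet of all y with x inside GAMMA[y]; in this example the meet
    is attained by the least such y."""
    candidates = [y for y in SIGNS if x <= GAMMA[y]]
    return next(y for y in candidates if all(leq(y, c) for c in candidates))


print(all(gamma_from_alpha(y) == GAMMA[y] for y in SIGNS))
print(all(alpha_from_gamma(x) == alpha(x) for x in subsets(U)))
```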
Thе rеsult frоm Еxеrcisе 11.19 mеаns thаt оncе thе аnаlysis dеsignеr hаs spеcifiеd thе cоllеcting sеmаntics аnd thе аnаlysis lаtticе аnd cоnstrаint rulеs, thеn thе rеlаtiоn bеtwееn thе sеmаntic dоmаin аnd thе аnаlysis dоmаin mаy bе spеcifiеd using аn аbstrаctiоn functiоn α (rеsp. а cоncrеtizаtiоn functiоn γ), аnd thеn thе аssоciаtеd cоncrеtizаtiоn functiоn γ (rеsp. аbstrаctiоn functiоn α) is uniquеly dеtеrminеd – prоvidеd thаt оnе еxists such thаt thе twо functiоns fоrm а Gаlоis cоnnеctiоn. This rаisеs аn intеrеsting quеstiоn: Undеr whаt cоnditiоns dоеs α (rеsp. γ) hаvе а cоrrеspоnding γ (rеsp. α) such thаt α аnd γ fоrm а Gаlоis cоnnеctiоn? Оnе аnswеr is thаt thе cоnvеrsе оf thе prоpеrty shоwn in Еxеrcisе 11.17 hоlds tоо, аs shоwn in thе fоllоwing еxеrcisе.
Exercise 11.20: Show that if $L_1$ and $L_2$ are lattices and $\alpha : L_1 \to L_2$ is continuous, then there exists a function $\gamma : L_2 \to L_1$ such that $\alpha$ and $\gamma$ form a Galois connection. The dual property also holds: if $\gamma$ satisfies $\gamma(\bigsqcap B) = \bigsqcap_{b \in B} \gamma(b)$ for every $B \subseteq L_2$ then there exists a function $\alpha : L_1 \to L_2$ such that $\alpha$ and $\gamma$ form a Galois connection.
Thе fоllоwing еxеrcisе dеmоnstrаtеs thаt thе Gаlоis cоnnеctiоn prоpеrty cаn bе usеd аs а “sаnity chеck” whеn dеsigning аnаlysis lаtticеs.
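Exercise 11.21: Instead of using the usual Sign lattice from Section 5.1, assume we chose to design our sign analysis based on this lattice:

[Lattice diagram: $\top$ at the top, the two incomparable elements 0- and 0+ in the middle, and $\bot$ at the bottom.]

with the meaning of the elements expressed by this concretization function:
$$\gamma_a(s) = \begin{cases} \emptyset & \text{if } s = \bot \\ \{0, 1, 2, 3, \dots\} & \text{if } s = \texttt{0+} \\ \{0, -1, -2, -3, \dots\} & \text{if } s = \texttt{0-} \\ \mathbb{Z} & \text{if } s = \top \end{cases}$$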
At first, this may seem like a reasonable lattice for an analysis, but there is something strange about it: How should we define $\mathit{eval}(\sigma, 0)$? Or equivalently, how should we define $\alpha_a(\{0\})$? We could somewhat arbitrarily choose either 0+ or 0-. Show that, with this choice of lattice, there does not exist an abstraction function $\alpha_a$ such that $\alpha_a$ and $\gamma_a$ form a Galois connection.
Despite the lack of a Galois connection in the example in Exercise 11.21, in this specific case we could go ahead and design a variant of the sign analysis based on this alternative lattice, without sacrificing the soundness or termination properties. However, the analysis would inevitably be less precise than the ordinary sign analysis, and moreover, the approach for proving analysis soundness that we present in Section 11.3 would not immediately apply.
Еxеrcisе 11.22: Cоntinuing Еxеrcisе 11.21, lеt us аdd а lаtticе еlеmеnt 0, bеlоw 0- аnd 0+ аnd аbоvе ⊥, with γа(0) = {0}. Shоw thаt with this mоdificаtiоn, аn аbstrаctiоn functiоn αа еxists such thаt αа аnd γа fоrm а Gаlоis cоnnеctiоn.
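The difference between the lattices of Exercises 11.21 and 11.22 can be explored with a small search. By Exercise 11.19, if an abstraction function exists, it must map each set $x$ to the least lattice element $y$ with $x \subseteq \gamma_a(y)$. The sketch below looks for sets that have no such least element. It is a bounded experiment under two stated assumptions: $\mathbb{Z}$ is replaced by the finite set `U`, and the lattice order is taken to coincide with inclusion of the $\gamma_a$ images. The helper names are hypothetical.

```python
from itertools import combinations

U = frozenset(range(-2, 3))  # assumption: finite stand-in for Z

# Concretization for the four-element lattice of Exercise 11.21 ...
GAMMA_4 = {"bot": frozenset(), "0+": frozenset({0, 1, 2}),
           "0-": frozenset({0, -1, -2}), "top": U}
# ... and for the variant of Exercise 11.22 with the extra element 0.
GAMMA_5 = {**GAMMA_4, "0": frozenset({0})}


def subsets(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def least_description(x, gamma):
    """Least y with x inside gamma[y], or None if no least one exists."""
    candidates = [y for y in gamma if x <= gamma[y]]
    least = [y for y in candidates
             if all(gamma[y] <= gamma[c] for c in candidates)]
    return least[0] if least else None


def missing(gamma):
    """Sets of integers that have no least abstract description."""
    return [x for x in subsets(U) if least_description(x, gamma) is None]


print(missing(GAMMA_4))
print(missing(GAMMA_5))
```

For the four-element lattice the search reports the set `{0}`, which matches the question raised in Exercise 11.21 about how to define $\alpha_a(\{0\})$; for the five-element variant no set is reported.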
11.3 SOUNDNESS
Exercise 11.23: In your solution to Exercise 5.36, you may have chosen the following lattice for abstract values:

[Lattice diagram: bigint at the top, int below it, byte and char below int, bool below both byte and char, and $\bot$ at the bottom.]

where the lattice elements model the different sets of integers, for example, $\gamma(\texttt{char}) = \{0, 1, \dots, 65535\}$. When defining the analysis constraints, you probably encountered some design choices and made some more or less arbitrary choices. For example, the abstract addition of two bool values could be modeled as either byte or char. Does there exist an abstraction function $\alpha$ such that $\alpha$ and $\gamma$ form a Galois connection for the above lattice?
Soundness

We are now in position to formally define what we mean by soundness of an analysis. Let $\alpha : L_1 \to L_2$ be an abstraction function where $L_1$ is the lattice for a collecting semantics and $L_2$ is the lattice for an analysis. As an example, $\alpha_c : (2^{\mathit{ConcreteStates}})^n \to \mathit{States}^n$ defined in Section 11.2 is such a function for the sign analysis. An analysis is sound with respect to the semantics and the abstraction function for a given program $P$ if:
$$\alpha(\{\![P]\!\}) \sqsubseteq [\![P]\!]$$
In other words, soundness means that the analysis result over-approximates the abstraction of the semantics of the program. For the sign analysis, the property can be illustrated like this:
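[Figure: $\alpha_c$ maps $\{\![P]\!\} \in (2^{\mathit{ConcreteStates}})^n$ into $\mathit{States}^n$, where its image lies below $[\![P]\!]$.]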
For the simple TIP example program considered in Section 11.1, the soundness property is indeed satisfied (here showing the information just for the program point immediately after the c = a - b statement):
$$\begin{aligned} \alpha_c(\dots, \{\![\texttt{c = a - b}]\!\}, \dots) &= \alpha_c(\dots, \{[a \mapsto 42, b \mapsto 87, c \mapsto -45]\}, \dots) \\ &\sqsubseteq (\dots, [a \mapsto +, b \mapsto +, c \mapsto \top], \dots) \\ &= (\dots, [\![\texttt{c = a - b}]\!], \dots) \end{aligned}$$
If we specify the relation between the two domains using concretization functions instead of using abstraction functions, we may dually define soundness as the property that the concretization of the analysis result over-approximates the semantics of the program:
$$\{\![P]\!\} \sqsubseteq \gamma([\![P]\!])$$
For the sign analysis, which uses the concretization function $\gamma_c : \mathit{States}^n \to (2^{\mathit{ConcreteStates}})^n$, this property can be illustrated as follows:

[Figure: $\gamma_c$ maps $[\![P]\!] \in \mathit{States}^n$ into $(2^{\mathit{ConcreteStates}})^n$, where its image lies above $\{\![P]\!\}$.]
Еxеrcisе 11.24: Shоw thаt if α аnd γ fоrm а Gаlоis cоnnеctiоn, thеn thе twо dеfinitiоns оf sоundnеss stаtеd аbоvе аrе еquivаlеnt. (Hint: sее Еxеrcisе 11.16.)
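The soundness condition for the single program point discussed above (the point after c = a - b) can be replayed in a few lines. This is a minimal sketch: abstract states are represented as Python dictionaries, the abstraction of a set of concrete states is taken variable by variable, and the names `alpha_b`, `state_leq`, `collecting`, and `analysis` are illustrative assumptions rather than definitions from the text.

```python
BOT, NEG, ZERO, POS, TOP = "bot", "-", "0", "+", "top"


def leq(a, b):
    """Order of the Sign lattice."""
    return a == b or a == BOT or b == TOP


def alpha(ds):
    """Least sign that describes every integer in ds."""
    signs = {NEG if z < 0 else ZERO if z == 0 else POS for z in ds}
    if not signs:
        return BOT
    return signs.pop() if len(signs) == 1 else TOP


def alpha_b(states, variables):
    """Abstract a collection of concrete states, one variable at a time."""
    return {x: alpha({rho[x] for rho in states}) for x in variables}


def state_leq(s1, s2):
    """Pointwise order on abstract states."""
    return all(leq(s1[x], s2[x]) for x in s1)


# Values from the example: the only reachable concrete state after
# c = a - b, and the abstract state computed by the sign analysis.
collecting = [{"a": 42, "b": 87, "c": -45}]
analysis = {"a": POS, "b": POS, "c": TOP}

abstracted = alpha_b(collecting, ["a", "b", "c"])
print(abstracted)
print(state_leq(abstracted, analysis))
```

The abstraction of the concrete state maps c to the sign `-`, which lies below the `top` reported by the analysis, so the inequality holds but is strict at this program point.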
We often use the term soundness of analyses without mentioning specific programs. An analysis is sound if it is sound for every program.

In Section 11.2 we established the relations between the concrete domains from the semantics and the abstract domains from the analysis. To prove that an analysis is sound, we also need to relate the semantic constraints with the analysis constraints. In the following, we outline the steps involved in such a proof for the sign analysis. We assume that the relations between the domains are specified using abstraction functions; if concretization functions are used instead, the properties that need to be established are dual, similar to the above definition of soundness based on concretization functions.

First, $\mathit{eval}$ is a sound abstraction of $\mathit{ceval}$, in the sense that the following property holds for every expression $E$ and every set of concrete states $R \subseteq \mathit{ConcreteStates}$:
$$\alpha_a(\mathit{ceval}(R, E)) \sqsubseteq \mathit{eval}(\alpha_b(R), E)$$

Exercise 11.25: Prove that $\mathit{eval}$ is a sound abstraction of $\mathit{ceval}$, in the sense defined above. Hint: Use induction in the structure of the TIP expression. As part of the proof, you need to show that each abstract operator is a sound abstraction of the corresponding concrete operator, for example for the addition operator:
$$\alpha_a(\{z_1 + z_2 \mid z_1 \in D_1 \land z_2 \in D_2\}) \sqsubseteq \alpha_a(D_1) \mathbin{\hat{+}} \alpha_a(D_2)$$
for all sets $D_1, D_2 \subseteq \mathbb{Z}$.

The $\mathit{succ}$ function is a sound abstraction of $\mathit{csucc}$:
$$\mathit{csucc}(R, v) \subseteq \mathit{succ}(v)$$
for any $R \subseteq \mathit{ConcreteStates}$, and $\mathit{JOIN}$ (defined on page 48) is a sound abstraction of $\mathit{CJOIN}$, meaning that
$$\alpha_b(\mathit{CJOIN}(v)) \sqsubseteq \mathit{JOIN}(v)$$
for every CFG node $v \in \mathit{Nodes}$ whenever the constraint variables that are used in the definitions of $\mathit{JOIN}$ and $\mathit{CJOIN}$ satisfy $\alpha_b(\{\![w]\!\}) \sqsubseteq [\![w]\!]$ for all $w \in \mathit{Nodes}$.

Exercise 11.26: Prove that $\mathit{succ}$ is a sound abstraction of $\mathit{csucc}$ and that $\mathit{JOIN}$ is a sound abstraction of $\mathit{CJOIN}$, in the sense defined above.
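The soundness condition for the addition operator can be tested exhaustively on a bounded universe. The sketch below is a sanity check only and does not replace the proof asked for in Exercise 11.25. It assumes that `abs_add` reproduces the abstract addition table of the sign analysis from Section 5.1, and it uses the finite set `U` in place of $\mathbb{Z}$; both are assumptions of this example.

```python
from itertools import combinations

BOT, NEG, ZERO, POS, TOP = "bot", "-", "0", "+", "top"
U = range(-2, 3)  # assumption: finite stand-in for Z


def leq(a, b):
    """Order of the Sign lattice."""
    return a == b or a == BOT or b == TOP


def alpha(ds):
    """Least sign that describes every integer in ds."""
    signs = {NEG if z < 0 else ZERO if z == 0 else POS for z in ds}
    if not signs:
        return BOT
    return signs.pop() if len(signs) == 1 else TOP


def abs_add(a, b):
    """Abstract addition on signs (assumed to match the Section 5.1 table)."""
    if BOT in (a, b):
        return BOT
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    return a if a == b else TOP


def subsets(universe):
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def addition_is_sound():
    """alpha of the concrete sums must lie below the abstract sum."""
    return all(
        leq(alpha({z1 + z2 for z1 in d1 for z2 in d2}),
            abs_add(alpha(d1), alpha(d2)))
        for d1 in subsets(U) for d2 in subsets(U))


print(addition_is_sound())
```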
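Let $\mathit{cf}_v : (2^{\mathit{ConcreteStates}})^n \to 2^{\mathit{ConcreteStates}}$ and $\mathit{af}_v : \mathit{States}^n \to \mathit{States}$ denote $v$'s constraint function from the semantics and the analysis, respectively, for every CFG node $v$. For example, if $v$ represents an assignment statement $X = E$, we have:
$$\mathit{cf}_v(\{\![v_1]\!\}, \dots, \{\![v_n]\!\}) = \big\{ \rho[X \mapsto z] \;\big|\; \rho \in \mathit{CJOIN}(v) \land z \in \mathit{ceval}(\rho, E) \big\}$$
$$\mathit{af}_v([\![v_1]\!], \dots, [\![v_n]\!]) = \sigma[X \mapsto \mathit{eval}(\sigma, E)] \quad \text{where } \sigma = \mathit{JOIN}(v)$$
The function $\mathit{af}_v$ is a sound abstraction of $\mathit{cf}_v$ if the following property holds for all $R_1, \dots, R_n \subseteq \mathit{ConcreteStates}$:
$$\alpha_b(\mathit{cf}_v(R_1, \dots, R_n)) \sqsubseteq \mathit{af}_v(\alpha_b(R_1), \dots, \alpha_b(R_n))$$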
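If we consider the combined constraint functions for the entire program,
$$\mathit{cf}(\{\![v_1]\!\}, \dots, \{\![v_n]\!\}) = \big(\mathit{cf}_{v_1}(\{\![v_1]\!\}, \dots, \{\![v_n]\!\}), \dots, \mathit{cf}_{v_n}(\{\![v_1]\!\}, \dots, \{\![v_n]\!\})\big)$$
and
$$\mathit{af}([\![v_1]\!], \dots, [\![v_n]\!]) = \big(\mathit{af}_{v_1}([\![v_1]\!], \dots, [\![v_n]\!]), \dots, \mathit{af}_{v_n}([\![v_1]\!], \dots, [\![v_n]\!])\big)$$
then $\mathit{af}$ being a sound abstraction of $\mathit{cf}$ means that
$$\alpha_c(\mathit{cf}(R_1, \dots, R_n)) \sqsubseteq \mathit{af}(\alpha_c(R_1, \dots, R_n))$$
which can be illustrated like this:

[Figure: diagram with $\mathit{cf}$ acting on $(2^{\mathit{ConcreteStates}})^n$, $\mathit{af}$ acting on $\mathit{States}^n$, and $\alpha_c$ mapping from the former to the latter.]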
Exercise 11.27: Prove that for each kind of CFG node $v$, the sign analysis constraint $\mathit{af}_v$ is a sound abstraction of the semantic constraint $\mathit{cf}_v$. (The most interesting case is the one where $v$ is an assignment node.) Then use that result to prove that $\mathit{af}$ is a sound abstraction of $\mathit{cf}$.

We can provide a general definition of soundness of abstractions, covering all the variants above, as follows. Assume $L_1$ and $L_1'$ are lattices used in a collecting semantics and $L_2$ and $L_2'$ are lattices used for an analysis. Let $\alpha : L_1 \to L_2$ and $\alpha' : L_1' \to L_2'$ be abstraction functions, and let $\gamma : L_2 \to L_1$ and $\gamma' : L_2' \to L_1'$ be concretization functions such that $\alpha$, $\gamma$, $\alpha'$, and $\gamma'$ form two Galois connections. Consider two functions $\mathit{cg} : L_1 \to L_1'$ and $\mathit{ag} : L_2 \to L_2'$. We say that $\mathit{ag}$ is a sound abstraction of $\mathit{cg}$ if $\alpha' \circ \mathit{cg} \sqsubseteq \mathit{ag} \circ \alpha$.

Exercise 11.28: Prove that $\alpha' \circ \mathit{cg} \sqsubseteq \mathit{ag} \circ \alpha$ holds if and only if $\mathit{cg} \circ \gamma \sqsubseteq \gamma' \circ \mathit{ag}$ holds. (This provides an alternative definition of soundness that is based on concretization functions instead of abstraction functions.)

As an example for our sign analysis, $\mathit{eval}$ being a sound abstraction of $\mathit{ceval}$ means that $\alpha_a \circ \mathit{ceval} \sqsubseteq \mathit{eval} \circ \alpha_b$ (for a fixed expression), which can be illustrated as follows:
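[Figure: diagram relating $\mathit{ceval} : 2^{\mathit{ConcreteStates}} \to 2^{\mathbb{Z}}$ and $\mathit{eval} : \mathit{States} \to \mathit{Sign}$ through $\alpha_b : 2^{\mathit{ConcreteStates}} \to \mathit{States}$ and $\alpha_a : 2^{\mathbb{Z}} \to \mathit{Sign}$.]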
Intuitively, when starting from a set of concrete states, if we first abstract the states and then evaluate abstractly with $\mathit{eval}$ we get an abstract value that over-approximates the one we get if we first evaluate concretely with $\mathit{ceval}$ and then abstract the values.

With the result of Exercise 11.27, it follows from the fixed-point theorems (see pages 43 and 130) and the general properties of Galois connections shown in Section 11.2 that the sign analysis is sound with respect to the semantics and the abstraction functions, as shown next. Recall that the analysis result for a given program $P$ is computed as $[\![P]\!] = \mathit{fix}(\mathit{af}) = \bigsqcup_{i \geq 0} \mathit{af}^i(\bot)$. By the fixed-point theorem from Exercise 11.5, the semantics of $P$ is similarly given by $\{\![P]\!\} = \mathit{fix}(\mathit{cf}) = \bigsqcup_{i \geq 0} \mathit{cf}^i(\bot)$. According to the definition of soundness, we thus need to show that $\alpha(\mathit{fix}(\mathit{cf})) \sqsubseteq \mathit{fix}(\mathit{af})$.

The central result we need is the following soundness theorem: Let $L_1$ and $L_2$ be lattices where $L_2$ has finite height, assume $\alpha : L_1 \to L_2$ and $\gamma : L_2 \to L_1$ form a Galois connection, $\mathit{cf} : L_1 \to L_1$ is continuous, and $\mathit{af} : L_2 \to L_2$ is monotone. If $\mathit{af}$ is a sound abstraction of $\mathit{cf}$, then $\alpha(\mathit{fix}(\mathit{cf})) \sqsubseteq \mathit{fix}(\mathit{af})$.

Applying this theorem to the sign analysis amounts to setting $L_1 = (2^{\mathit{ConcreteStates}})^n$, $L_2 = \mathit{States}^n$, $\alpha = \alpha_c$, and $\gamma = \gamma_c$.

To prove the theorem, we first show that $\alpha(\mathit{cf}^i(\bot)) \sqsubseteq \mathit{af}^i(\bot)$ for all $i \geq 0$ by induction in $i$. The base case follows trivially from the general property of Galois connections shown in Exercise 11.18. In the inductive step, assume $\alpha(\mathit{cf}^i(\bot)) \sqsubseteq \mathit{af}^i(\bot)$ holds. Using the facts that $\mathit{af}$ is a sound abstraction of $\mathit{cf}$ and $\mathit{af}$ is monotone immediately gives us that $\alpha(\mathit{cf}^{i+1}(\bot)) \sqsubseteq \mathit{af}(\alpha(\mathit{cf}^i(\bot))) \sqsubseteq \mathit{af}(\mathit{af}^i(\bot)) = \mathit{af}^{i+1}(\bot)$ as
desired. As shown in Exercise 11.17, $\alpha$ is continuous because $\alpha$ and $\gamma$ form a Galois connection. Together with the property $\alpha(\mathit{cf}^i(\bot)) \sqsubseteq \mathit{af}^i(\bot)$ for all $i \geq 0$, we conclude that $\alpha(\mathit{fix}(\mathit{cf})) = \bigsqcup_{i \geq 0} \alpha(\mathit{cf}^i(\bot)) \sqsubseteq \bigsqcup_{i \geq 0} \mathit{af}^i(\bot) = \mathit{fix}(\mathit{af})$.

To summarize, a general recipe for specifying and proving soundness of an analysis consists of the following steps:

1. Specify the analysis, i.e. the analysis lattice and the constraint generation rules, and check that all the analysis constraint functions are monotone (as we did for the sign analysis example in Section 5.1).

2. Specify the collecting semantics, and check that the semantic constraint functions are continuous (as we did for the sign analysis example in Section 11.1). The collecting semantics must capture the desired aspects of concrete execution, such that it formalizes the ideal properties that the analysis is intended to approximate. For the sign analysis example, we designed the collecting semantics such that it collects the reachable states for every program point; other analyses may need other kinds of collecting semantics.

3. Establish the connection between the semantic lattice and the analysis lattice (as we did for the sign analysis example in Section 11.2), either by an abstraction function or by a concretization function. For the sign analysis example, the lattices are defined in three layers, leading to the definitions of $\alpha_a$, $\alpha_b$, $\alpha_c$ and $\gamma_a$, $\gamma_b$, $\gamma_c$. Whether one chooses to specify this connection using abstraction functions or using concretization functions is only a matter of taste, thanks to the dualities we have seen in Section 11.2. Then check, for example using the property from Exercise 11.20, that the function pairs form Galois connections.

4. Show that each constituent of the analysis constraints is a sound abstraction of the corresponding constituent of the semantic constraints, for all programs (as we did for the sign analysis example in Exercises 11.25, 11.26, and 11.27).

5. Soundness then follows from the soundness theorem stated above.

The requirements that the analysis constraint functions are monotone, the semantic constraint functions are continuous, and the abstraction and concretization functions form Galois connections are rarely restrictive in practice but can be considered as “sanity checks” that the design is meaningful. As we have seen in Exercises 11.21 and 11.23, it is possible to design sound analyses that do not have all these nice properties, but the price is usually less precision or more complicated soundness proofs. Another restriction of the soundness theorem above is that it requires $L_2$ to have finite height. However, the theorem and proof can easily be extended to analyses with infinite-height lattices and widenings.

Exercise 11.29: Prove that the interval analysis (Section 6.1) with widening (using the definition of $\nabla$ from page 81) is sound with respect to the collecting semantics from Section 11.1.
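The statement of the soundness theorem can be tried out on a toy instance. The following sketch computes both fixed points by Kleene iteration and compares them. Everything specific in it is a hypothetical example chosen for illustration: the concrete function `cf` (start at 1 and keep adding 1, capped at 5), its abstract counterpart `af`, and the helper names are not taken from the text, and `abs_add` is assumed to match the sign addition table of Section 5.1.

```python
BOT, NEG, ZERO, POS, TOP = "bot", "-", "0", "+", "top"


def leq(a, b):
    """Order of the Sign lattice."""
    return a == b or a == BOT or b == TOP


def join(a, b):
    """Least upper bound in the Sign lattice."""
    if leq(a, b):
        return b
    if leq(b, a):
        return a
    return TOP


def alpha(ds):
    """Least sign that describes every integer in ds."""
    signs = {NEG if z < 0 else ZERO if z == 0 else POS for z in ds}
    if not signs:
        return BOT
    return signs.pop() if len(signs) == 1 else TOP


def abs_add(a, b):
    """Abstract addition on signs (assumed to match the Section 5.1 table)."""
    if BOT in (a, b):
        return BOT
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    return a if a == b else TOP


def kleene_fix(f, bottom):
    """Iterate f from bottom until the value stops changing."""
    x = bottom
    while f(x) != x:
        x = f(x)
    return x


def cf(s):
    """Toy concrete function on sets of integers (hypothetical)."""
    return frozenset({1}) | frozenset(z + 1 for z in s if z < 5)


def af(s):
    """Toy abstract counterpart of cf on signs (hypothetical)."""
    return join(POS, abs_add(s, POS))


concrete = kleene_fix(cf, frozenset())
abstract = kleene_fix(af, BOT)
print(sorted(concrete), abstract, leq(alpha(concrete), abstract))
```

The iteration terminates here because both lattices involved are finite; for a concrete lattice of infinite height the least fixed point is in general only reached in the limit.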
ОPTIMАLITY
Proving soundness of realistic analyses for real-world programming languages is a major endeavor [JLB+15]. A pragmatic light-weight alternative is soundness testing [AMN17], which is the process of running a given program concretely a number of times, with as high coverage as possible, and testing that all observed runtime facts are over-approximated by the static analysis result.
Optimality

Assume we are developing a new analysis, and that we have chosen an analysis lattice and the rules for generating analysis constraints for the various programming language constructs. To enable formal reasoning about the soundness and precision of the analysis, we have also provided a suitable collecting semantics for the programming language (as in Section 11.1) and abstraction/concretization functions that define the meaning of the analysis lattice elements (as in Section 11.2). Furthermore, assume we have proven that the analysis is sound using the approach from Section 11.3. We may now ask: Are our analysis constraints as precise as possible, relative to the chosen analysis lattice?

As in the previous section, let $\alpha : L_1 \to L_2$ be an abstraction function where $L_1$ is the lattice for a collecting semantics and $L_2$ is the lattice for an analysis, such that $\alpha$ and $\gamma$ form a Galois connection, and consider two functions $\mathit{cf} : L_1 \to L_1$ and $\mathit{af} : L_2 \to L_2$ that represent, respectively, the semantic constraints and the analysis constraints for a given program. We say that $\mathit{af}$ is the optimal⁶ abstraction of $\mathit{cf}$ if $\mathit{af} = \alpha \circ \mathit{cf} \circ \gamma$ (which can also be written: $\mathit{af}(b) = \alpha(\mathit{cf}(\gamma(b)))$ for all $b \in L_2$). Using the lattices and abstraction/concretization functions from the sign analysis example, this property can be illustrated as follows.
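[Figure: diagram of the optimality property, with $\mathit{cf}$ acting on $(2^{\mathit{ConcreteStates}})^n$, $\mathit{af}$ acting on $\mathit{States}^n$, $\gamma_c$ mapping from $\mathit{States}^n$ to $(2^{\mathit{ConcreteStates}})^n$, and $\alpha_c$ mapping back.]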
(Compare this with the illustration of soundness from page 140.)

To see that $\alpha \circ \mathit{cf} \circ \gamma$ is indeed the most precise monotone function that is a sound abstraction of $\mathit{cf}$, assume $g : L_2 \to L_2$ is some monotone function that is a sound abstraction of $\mathit{cf}$, that is, $\alpha(\mathit{cf}(a)) \sqsubseteq g(\alpha(a))$ for all $a \in L_1$. Then for all $a' \in L_2$,
$$\alpha(\mathit{cf}(\gamma(a'))) \sqsubseteq g(\alpha(\gamma(a'))) \sqsubseteq g(a').$$
The first inequality is the soundness of $g$ instantiated with $a = \gamma(a')$. The last inequality holds because $\alpha \circ \gamma$ is reductive and $g$ is monotone. Thus, $\alpha \circ \mathit{cf} \circ \gamma \sqsubseteq g$, meaning that $\alpha \circ \mathit{cf} \circ \gamma$ is the most precise such function.

⁶ In the literature on abstract interpretation, the term “best” is sometimes used instead of “optimal”.

Notice what the optimality condition tells us: We can obtain the best possible (i.e., most precise yet sound) abstraction of $\mathit{cf}$ for a given analysis lattice element $y$ by first concretizing $y$, then applying the semantic function $\mathit{cf}$, and finally abstracting. Unfortunately this observation does not automatically give us practical algorithms for computing optimal abstractions, but it enables us to reason about the precision of our manually specified analysis constraints.

The above definition of optimality focuses on $\mathit{af}$ and $\mathit{cf}$, but it can be generalized to all levels of the analysis as follows. Assume $L_1$ and $L_1'$ are lattices used in a collecting semantics and $L_2$ and $L_2'$ are lattices used for an analysis. Let $\alpha : L_1 \to L_2$ and $\alpha' : L_1' \to L_2'$ be abstraction functions, and let $\gamma : L_2 \to L_1$ and $\gamma' : L_2' \to L_1'$ be concretization functions such that $\alpha$, $\gamma$, $\alpha'$, and $\gamma'$ form two Galois connections. Consider two functions $\mathit{cg} : L_1 \to L_1'$ and $\mathit{ag} : L_2 \to L_2'$. We say that $\mathit{ag}$ is the optimal abstraction of $\mathit{cg}$ if $\alpha' \circ \mathit{cg} \circ \gamma = \mathit{ag}$.

Let us look at some examples from the sign analysis. First, it is easy to see that our definition of abstract multiplication $\mathbin{\hat{*}}$ (page 49) is the optimal abstraction of concrete multiplication, denoted “$\cdot$”:
$$s_1 \mathbin{\hat{*}} s_2 = \alpha_a\big(\gamma_a(s_1) \cdot \gamma_a(s_2)\big)$$
for any $s_1, s_2 \in \mathit{Sign}$, where we overload the $\cdot$ operator to work on sets of integers: $D_1 \cdot D_2 = \{z_1 \cdot z_2 \mid z_1 \in D_1 \land z_2 \in D_2\}$ for any $D_1, D_2 \subseteq \mathbb{Z}$.

Exercise 11.30: Prove that all the abstract operators ($\hat{+}$, $\hat{-}$, $\hat{>}$, etc.) defined in Section 5.1 are optimal abstractions of their concrete counterparts. (This exercise can be seen as a formal version of Exercise 5.4.)
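The recipe "concretize, apply the concrete operation, abstract" can be executed literally when the concretization is cut down to a finite set. The sketch below derives an abstract multiplication this way and compares it with a hand-written one. It rests on two assumptions: `abs_mul` is meant to reproduce the multiplication table of the sign analysis, and the bounded universe `U` is an adequate stand-in for $\mathbb{Z}$, which is plausible here because the sign of a product depends only on the signs of its factors.

```python
BOT, NEG, ZERO, POS, TOP = "bot", "-", "0", "+", "top"
SIGNS = [BOT, NEG, ZERO, POS, TOP]
U = range(-2, 3)  # assumption: finite stand-in for Z


def leq(a, b):
    """Order of the Sign lattice."""
    return a == b or a == BOT or b == TOP


def alpha(ds):
    """Least sign that describes every integer in ds."""
    signs = {NEG if z < 0 else ZERO if z == 0 else POS for z in ds}
    if not signs:
        return BOT
    return signs.pop() if len(signs) == 1 else TOP


def gamma(s):
    """Integers of U that are described by the sign s."""
    return frozenset(z for z in U if leq(alpha({z}), s))


def abs_mul(a, b):
    """Hand-written abstract multiplication (assumed sign table)."""
    if BOT in (a, b):
        return BOT
    if ZERO in (a, b):
        return ZERO
    if TOP in (a, b):
        return TOP
    return POS if a == b else NEG


def optimal_mul(s1, s2):
    """alpha applied to the pointwise products of the concretizations."""
    return alpha({z1 * z2 for z1 in gamma(s1) for z2 in gamma(s2)})


agree = all(optimal_mul(a, b) == abs_mul(a, b) for a in SIGNS for b in SIGNS)
print(agree)
```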
Despite the result of the previous exercise, the $\mathit{eval}$ function from Section 5.1 is not the optimal abstraction of $\mathit{ceval}$. Here is a simple counterexample: Let $\sigma \in \mathit{States}$ such that $\sigma(x) = \top$ and consider the TIP expression x - x. We then have
$$\mathit{eval}(\sigma, \texttt{x - x}) = \top$$
while
$$\alpha_a\big(\mathit{ceval}(\gamma_b(\sigma), \texttt{x - x})\big) = 0$$
(This is essentially the same observation as the one in Exercise 5.9, but this time stated more formally.) Interestingly, defining the $\mathit{eval}$ function inductively and compositionally in terms of optimal abstractions does not make the function itself optimal.
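The x - x counterexample can be reproduced with a few lines of code. In this sketch the compositional result is computed with hand-written abstract operators, while the optimal result is computed by concretizing the abstract value of x, evaluating x - x on each concrete value, and abstracting again. The operators `abs_add`, `abs_neg`, and `abs_sub` are assumptions intended to mirror the sign analysis, and `U` is a finite stand-in for $\mathbb{Z}$.

```python
BOT, NEG, ZERO, POS, TOP = "bot", "-", "0", "+", "top"
U = range(-2, 3)  # assumption: finite stand-in for Z


def leq(a, b):
    """Order of the Sign lattice."""
    return a == b or a == BOT or b == TOP


def alpha(ds):
    """Least sign that describes every integer in ds."""
    signs = {NEG if z < 0 else ZERO if z == 0 else POS for z in ds}
    if not signs:
        return BOT
    return signs.pop() if len(signs) == 1 else TOP


def gamma(s):
    """Integers of U that are described by the sign s."""
    return frozenset(z for z in U if leq(alpha({z}), s))


def abs_add(a, b):
    """Abstract addition on signs (assumed sign table)."""
    if BOT in (a, b):
        return BOT
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    return a if a == b else TOP


def abs_neg(a):
    """Abstract negation: swaps + and -, leaves the rest unchanged."""
    return {NEG: POS, POS: NEG}.get(a, a)


def abs_sub(a, b):
    """Abstract subtraction expressed through addition and negation."""
    return abs_add(a, abs_neg(b))


sigma = {"x": TOP}

# Compositional evaluation: each occurrence of x is looked up separately.
compositional = abs_sub(sigma["x"], sigma["x"])

# Optimal evaluation: every concrete state assigns one value z to x.
optimal = alpha({z - z for z in gamma(sigma["x"])})

print(compositional, optimal)
```

The compositional evaluation treats the two occurrences of x as unrelated values, which is exactly the information the concrete evaluation retains.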
11.5 COMPLETENESS
Exercise 11.31: Assume we only work with normalized TIP programs (as in Exercise 2.2). Give an alternative computable definition of $\mathit{eval}$ for sign analysis (i.e., an algorithm for computing $\mathit{eval}(\sigma, E)$ for any abstract state $\sigma$ and normalized TIP expression $E$), such that $\mathit{eval}$ is the optimal abstraction of $\mathit{ceval}$.

Exercise 11.32: Is it possible to solve Exercise 11.31 without the normalization assumption?

Exercise 11.33: Which of the abstractions used in interval analysis (Section 6.1) are optimal?

To be able to reason about optimality of the abstractions used in, for example, live variables analysis or reaching definitions analysis, we first need a style of collecting semantics that is suitable for those analyses, which we return to in Section 11.6.
Completeness

As usual in logics, the dual of soundness is completeness. In Section 11.3 we defined soundness of an analysis for a program $P$ as the property $\alpha(\{\![P]\!\}) \sqsubseteq [\![P]\!]$. Consequently, it is natural to define that an analysis is complete for $P$ if:
$$[\![P]\!] \sqsubseteq \alpha(\{\![P]\!\})$$
An analysis is complete if it is complete for all programs. If an analysis is both sound and complete⁷ for $P$ we have $\alpha(\{\![P]\!\}) = [\![P]\!]$.

In Section 11.4 we studied the notion of optimality of abstractions, motivated by the interest in defining analysis constraints to be as precise as possible, relative to the chosen analysis lattice. We can similarly ask, is the analysis result $[\![P]\!]$ for a program $P$ as precise as possible for the currently used analysis lattice? Stated more formally, the question is whether $\alpha(\{\![P]\!\}) = [\![P]\!]$ holds; in other words, this property coincides with the analysis being sound and complete for $P$.

Even if we manage to solve Exercise 11.31 and obtain an optimal definition of $\mathit{eval}$ for sign analysis, the analysis is not sound and complete for all (normalized) programs, as demonstrated by the following counterexample:

    x = input;
    y = x;
    z = x - y;

⁷ The literature on abstract interpretation often uses the term “complete” for what we call “sound and complete”, by working under the assumption of sound analyses.
Let $\sigma$ denote the abstract state after the statement y = x such that $\sigma(x) = \sigma(y) = \top$. Any sound abstraction of the semantics of the single statement z = x - y will result in an abstract state that maps z to $\top$, but the answer 0 would be more precise and still sound in the analysis result for the final program point. Intuitively, the analysis does not know about the correlation between x and y.

For this specific example program, we could in principle improve analysis precision by changing the constraint generation rules to recognize the special pattern consisting of the statement y = x followed by z = x - y. Instead of such an ad hoc approach to gain precision, relational analysis (see Chapter 7) is usually a more viable solution.

In Section 11.3 we observed (Exercise 11.24) that analysis soundness could equivalently be defined as the property $\{\![P]\!\} \sqsubseteq \gamma([\![P]\!])$. However, a similar equivalence does not hold for completeness, as shown by the following two exercises.

Exercise 11.34: We know from Exercise 11.16 that if $\alpha$ and $\gamma$ form a Galois connection then $\alpha(x) \sqsubseteq y \iff x \sqsubseteq \gamma(y)$ for all $x, y$. Prove (by showing a counterexample) that the converse property does not hold, i.e. $\alpha(x) \sqsupseteq y \;\not\!\!\iff x \sqsupseteq \gamma(y)$. (Hint: consider $\alpha_a$ and $\gamma_a$ from the sign analysis.)

Exercise 11.35: Give an example of a program $P$ such that $[\![P]\!] \sqsubseteq \alpha(\{\![P]\!\})$ and $\gamma([\![P]\!]) \not\sqsubseteq \{\![P]\!\}$ for sign analysis.

Thus, $\{\![P]\!\} = \gamma([\![P]\!])$ is a (much) stronger property than $\alpha(\{\![P]\!\}) = [\![P]\!]$. If $\{\![P]\!\} = \gamma([\![P]\!])$ is satisfied, the analysis captures exactly the semantics of $P$ without any approximation; we say that the analysis is exact for $P$. Every nontrivial abstraction loses information and therefore no interesting analysis is exact for all programs.⁸ (Still, the property may hold for some programs.)
⁸ These observations show that we could in principle have chosen to define the concept of completeness using the concretization function $\gamma$ instead of using the abstraction function $\alpha$, but that would have been much less useful.

Having established a notion of analysis completeness (and a less interesting notion of analysis exactness), we proceed by defining a notion of completeness of the individual abstractions used in an analysis, to understand where imprecision may arise.

As in the preceding sections, assume $L_1$ and $L_1'$ are lattices used in a collecting semantics and $L_2$ and $L_2'$ are lattices used for an analysis. Let $\alpha : L_1 \to L_2$ and $\alpha' : L_1' \to L_2'$ be abstraction functions, and let $\gamma : L_2 \to L_1$ and $\gamma' : L_2' \to L_1'$ be concretization functions such that $\alpha$, $\gamma$, $\alpha'$, and $\gamma'$ form two Galois connections. Consider two functions $\mathit{cg} : L_1 \to L_1'$ and $\mathit{ag} : L_2 \to L_2'$. We say that $\mathit{ag}$ is a complete abstraction of $\mathit{cg}$ if $\mathit{ag} \circ \alpha \sqsubseteq \alpha' \circ \mathit{cg}$. (Compare this with the definition of soundness of abstractions from page 140.)

Again, let us consider sign analysis as example. In Section 11.4 we saw that abstract multiplication is optimal. In fact, it is also complete:
$$\alpha_a(D_1) \mathbin{\hat{*}} \alpha_a(D_2) \sqsubseteq \alpha_a(D_1 \cdot D_2)$$
for any $D_1, D_2 \subseteq \mathbb{Z}$. This tells us that the analysis, perhaps surprisingly, never loses any precision at multiplications. For abstract addition, the situation is different, as shown in the next exercise.

Exercise 11.36: Abstract addition in sign analysis is not complete. Give an example of two sets $D_1, D_2 \subseteq \mathbb{Z}$ where $\alpha_a(D_1) \mathbin{\hat{+}} \alpha_a(D_2) \not\sqsubseteq \alpha_a(D_1 + D_2)$. (We here overload the + operator to work on sets of integers, like we did earlier for multiplication.) Is it possible to change the definition of abstract addition to make it sound and complete (without changing the analysis lattice)?

Exercise 11.37: For which of the operators -, /, >, and == is sign analysis complete? What about input and output E?

Exercise 11.38: In sign analysis, is the analysis constraint function for assignments $\mathit{af}_{X=E}$ a complete abstraction of the corresponding semantic constraint function $\mathit{cf}_{X=E}$, given that $E$ is an expression for which $\mathit{eval}$ is complete?

For abstractions that are sound, completeness implies optimality (but not vice versa, cf. Exercises 11.30 and 11.36):

Exercise 11.39: Prove that if $\mathit{ag}$ is sound and complete with respect to $\mathit{cg}$, it is also optimal.

We have seen in Section 11.4 that for any Galois connection, there exists an optimal (albeit often non-computable) abstraction of every concrete operation. The same does not hold for soundness and completeness, as shown in the following exercise.

Exercise 11.40: Prove that there exists a sound and complete abstraction $\mathit{ag}$ of a given concrete operation $\mathit{cg}$ if and only if the optimal abstraction of $\mathit{cg}$ is sound and complete. (We have seen examples of abstractions that are optimal but not sound and complete, so this result implies that sound and complete abstractions do not always exist.)

The following exercise provides a variant of the soundness theorem from page 141.
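The completeness condition for a binary operator can be checked by brute force over a bounded universe. The sketch below applies one generic check to multiplication and to addition. It is an experiment rather than a proof: `abs_mul` and `abs_add` are assumed to reproduce the operator tables of the sign analysis, `U` replaces $\mathbb{Z}$, and the helper `is_complete` is a hypothetical name.

```python
from itertools import combinations
from operator import add, mul

BOT, NEG, ZERO, POS, TOP = "bot", "-", "0", "+", "top"
U = range(-2, 3)  # assumption: finite stand-in for Z


def leq(a, b):
    """Order of the Sign lattice."""
    return a == b or a == BOT or b == TOP


def alpha(ds):
    """Least sign that describes every integer in ds."""
    signs = {NEG if z < 0 else ZERO if z == 0 else POS for z in ds}
    if not signs:
        return BOT
    return signs.pop() if len(signs) == 1 else TOP


def abs_add(a, b):
    """Abstract addition on signs (assumed sign table)."""
    if BOT in (a, b):
        return BOT
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    return a if a == b else TOP


def abs_mul(a, b):
    """Abstract multiplication on signs (assumed sign table)."""
    if BOT in (a, b):
        return BOT
    if ZERO in (a, b):
        return ZERO
    if TOP in (a, b):
        return TOP
    return POS if a == b else NEG


def subsets(universe):
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_complete(abs_op, conc_op):
    """abs_op(alpha(D1), alpha(D2)) must lie below alpha(D1 op D2)."""
    return all(
        leq(abs_op(alpha(d1), alpha(d2)),
            alpha({conc_op(z1, z2) for z1 in d1 for z2 in d2}))
        for d1 in subsets(U) for d2 in subsets(U))


print(is_complete(abs_mul, mul), is_complete(abs_add, add))
```

On this universe the check succeeds for multiplication and fails for addition, in line with the claims in the text.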
Exercise 11.41: Prove that if $\mathit{af}$ is sound and complete with respect to $\mathit{cf}$ then $\alpha(\{\![P]\!\}) = [\![P]\!]$, where $\mathit{cf}$ is the semantic constraint function and $\mathit{af}$ is the analysis constraint function for a given program $P$, and $\alpha$ is the abstraction function.

This “soundness and completeness theorem” is mostly a theoretical result, however, because $\mathit{af}$ is rarely sound and complete with respect to $\mathit{cf}$ in realistic analyses.

Exercise 11.42: Which of the abstractions used in interval analysis (Section 6.1) are complete? In particular, is abstract addition complete?

Abstractions that are incomplete may be complete in some situations; for example, abstract addition in sign analysis is not complete in general (Exercise 11.36), but it is complete in situations where, for example, both arguments are positive values. For this reason, even though few analyses are sound and complete for all programs, many analyses are sound and complete for some programs or program fragments.

Exercise 11.43: Prove that abstract addition in sign analysis is complete if both arguments are positive values. That is, show that $\alpha_a(D_1) \mathbin{\hat{+}} \alpha_a(D_2) \sqsubseteq \alpha_a(D_1 + D_2)$ for all $D_1, D_2 \subseteq \{1, 2, 3, \dots\}$.

There are not many programs for which our simple sign analysis is complete and gives a nontrivial analysis result, so to be able to demonstrate how these observations may be useful, let us modify the analysis to use the following slightly different lattice instead of the usual Sign lattice from Section 5.1.

[Lattice diagram: $\top$ at the top, the two incomparable elements 0- and 0+ below it, the element 0 below both of them, and $\bot$ at the bottom.]

The meaning of the elements is expressed by this concretization function $\gamma_a$:
$$\gamma_a(s) = \begin{cases} \emptyset & \text{if } s = \bot \\ \{0\} & \text{if } s = \texttt{0} \\ \{0, 1, 2, 3, \dots\} & \text{if } s = \texttt{0+} \\ \{0, -1, -2, -3, \dots\} & \text{if } s = \texttt{0-} \\ \mathbb{Z} & \text{if } s = \top \end{cases}$$
From Exercise 11.22 we know that an abstraction function $\alpha_a$ exists such that $\alpha_a$ and $\gamma_a$ form a Galois connection.
11.6 TRACE SEMANTICS
Exercise 11.44: Give a definition of $\mathit{eval}$ that is optimal for expressions of the form t * t where t is any program variable (recall Exercise 11.31).

The remaining abstract operators can be defined similarly, and the rest of the $\mathit{eval}$ function and other analysis constraints can be reused from the ordinary sign analysis. The modified sign analysis concludes that output is 0+ for the following small program:

    x1 = input;
    x2 = input;
    y1 = x1 * x1;
    y2 = x2 * x2;
    output y1 + y2;

Exercise 11.45: Explain why the modified sign analysis is sound and complete for this program.

Assume the analysis is built to raise an alarm if the output of the analyzed program is a negative value for some input. In this case, it will not raise an alarm for this program, and because we know the analysis is sound (over-approximating all possible behaviors of the program), this must be the correct result. Now assume instead that the analysis is built to raise an alarm if the output of the analyzed program is a positive value for some input. In this case, it does raise an alarm, and because we know the analysis is complete for this program (though not for all programs), this is again the correct result: there must exist an execution of the program that outputs a positive value, so we can trust that the alarm is not a false positive.

Exercise 11.46: Design a relational sign analysis that is sound and complete for the three-line program from page 145.

Exercises 11.42 and 11.46 might suggest that increasing analysis precision generally makes an analysis complete for more programs, but that is not the case: The trivial analysis that uses a one-element analysis lattice is sound and complete for all programs, but it is obviously useless because its abstraction discards all information about any analyzed program.

For further discussion about the notion of completeness, see Giacobazzi et al. [GRS00, GLR15].
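The claim that the modified sign analysis reports 0+ for the five-line program can be replayed in a small experiment. The sketch below is one possible way to set it up, not the definition the exercises ask for: the abstract operators `abs_square` and `abs_add` are obtained by concretizing over a bounded universe `U`, applying the concrete operation, and abstracting again. The universe and all function names are assumptions of this example.

```python
U = range(-3, 4)  # assumption: finite stand-in for Z


def alpha(ds):
    """Least element of the modified lattice describing all of ds."""
    if not ds:
        return "bot"
    if all(z == 0 for z in ds):
        return "0"
    if all(z >= 0 for z in ds):
        return "0+"
    if all(z <= 0 for z in ds):
        return "0-"
    return "top"


def gamma(s):
    """Integers of U described by the lattice element s."""
    test = {"bot": lambda z: False, "0": lambda z: z == 0,
            "0+": lambda z: z >= 0, "0-": lambda z: z <= 0,
            "top": lambda z: True}[s]
    return {z for z in U if test(z)}


def abs_square(a):
    """Abstract value of t * t when the abstract value of t is a."""
    return alpha({z * z for z in gamma(a)})


def abs_add(a, b):
    """Abstract addition derived from the concrete sums over U."""
    return alpha({z1 + z2 for z1 in gamma(a) for z2 in gamma(b)})


# Abstract run of: x1 = input; x2 = input; y1 = x1 * x1; y2 = x2 * x2;
# output y1 + y2;
x1 = x2 = "top"
y1, y2 = abs_square(x1), abs_square(x2)
output = abs_add(y1, y2)

# Concrete outputs for all inputs drawn from U, abstracted afterwards.
concrete_outputs = {a * a + b * b for a in U for b in U}

print(output, alpha(concrete_outputs))
```

Both printed values agree on this bounded input range, which is consistent with the analysis being sound and complete for the program; the general argument is the subject of Exercise 11.45.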
Trace Semantics

The TIP semantics presented in Section 11.1 is called a reachable states collecting semantics, because it for each program point collects the set of states that are
possible when program execution reaches that point for some input. As we have seen, this precisely captures the meaning of TIP programs in a way that allows us to prove soundness of, for example, sign analysis. For other analyses, however, the reachable states collecting semantics is insufficient because it does not capture all the information about how TIP programs execute. As a trivial example, for the program

    main(x) { return x; }

the reachable states collecting semantics will only tell us that the set of states at the function entry and the set of states at the function exit are both $\{[x \mapsto z] \mid z \in \mathbb{Z}\}$, but it does not tell us that the return value is always the same as the input value. In other words, the reachable states collecting semantics does not provide information about how one state at a program point is related to states at other program points. To capture such aspects of TIP program execution, we can instead use a trace semantics that expresses the meaning of a TIP program as the set of traces that can appear when the program runs. A trace is a finite sequence of pairs of program points (represented as CFG nodes) and states:⁹
$$\mathit{Traces} = (\mathit{Nodes} \times \mathit{ConcreteStates})^*$$
We first define the semantics of single CFG nodes as functions from concrete states to sets of concrete states (as concrete counterparts to the transfer functions from Section 5.10): $\mathit{ct}_v : \mathit{ConcreteStates} \to 2^{\mathit{ConcreteStates}}$. For assignment nodes, $\mathit{ct}_v$ can be defined as follows:
$$\mathit{ct}_{X=E}(\rho) = \{\rho[X \mapsto z] \mid z \in \mathit{ceval}(\rho, E)\}$$
The semantics of variable declaration nodes can be defined similarly. All other kinds of nodes do not change the state:
$$\mathit{ct}_v(\rho) = \{\rho\}$$
The trace semantics of a program $P$ is a set of finite traces, written $(\![P]\!) \in 2^{\mathit{Traces}}$. We can define $(\![P]\!)$ as the set of finite traces that start at the program entry point and in each step proceed according to the CFG. (We do not require that the traces reach the program exit.)
More formally, we define $\langle[P]\rangle$ as the least solution to the following two constraints (similarly to how we defined $[\![P]\!]$ and $\{[P]\}$ earlier):

$$(\mathit{entry}, []) \in \langle[P]\rangle$$

$$\pi \cdot (v, \rho) \in \langle[P]\rangle \;\wedge\; v' \in \mathit{csucc}(\rho, v) \;\wedge\; \rho' \in ct_{v'}(\rho) \implies \pi \cdot (v, \rho) \cdot (v', \rho') \in \langle[P]\rangle$$

The first constraint says that the program entry point is always reachable in the empty state. The second constraint says that if the program has a trace that ends
⁹ For a set X, the Kleene star operation X* denotes the set of all finite sequences of elements from X, including the empty sequence.
at node $v$ in state $\rho$ such that $v'$ is a possible successor node and $\rho'$ is a state we may get if executing $v'$ from state $\rho$, then the trace that is extended with the extra pair $(v', \rho')$ is also possible.

As an example, if $P$ is the trivial program above, we have

$$\langle[P]\rangle = \{\; (\mathit{entry}, [x \mapsto 0]) \cdot (\texttt{return x}, [x \mapsto 0]) \cdot (\mathit{exit}, [x \mapsto 0]),\;\; (\mathit{entry}, [x \mapsto 1]) \cdot (\texttt{return x}, [x \mapsto 1]) \cdot (\mathit{exit}, [x \mapsto 1]),\;\; \dots \;\}$$

which contains the information that the value of x is the same at the entry and the exit, in any execution.

Exercise 11.47: What is the trace semantics of the program from page 126?

Interestingly, the relation between the reachable states collecting semantics and the trace semantics can be expressed as a Galois connection induced by an abstraction function

$$\alpha_t : 2^{\mathit{Traces}} \to (2^{\mathit{ConcreteStates}})^n$$

defined by

$$\alpha_t(T) = (R_1, \dots, R_n) \quad\text{where}\quad R_i = \{\rho \mid \cdots \cdot (v_i, \rho) \cdot \cdots \in T\}$$

for each $i = 1, \dots, n$. Intuitively, given a set of traces, $\alpha_t$ simply extracts the set of reachable states for each CFG node. The set $2^{\mathit{Traces}}$ forms a powerset lattice, and $\alpha_t$ is continuous, so by Exercise 11.20 we know that a concretization function $\gamma_t$ exists such that $\alpha_t$ and $\gamma_t$ form a Galois connection.

Exercise 11.48: Show that $\alpha_t$ (as defined above) is indeed continuous.

The existence of this Galois connection shows that the domain of the reachable states collecting semantics is in some sense an abstraction of the domain of the trace semantics. The following exercise shows that composition of Galois connections leads to new Galois connections.

Exercise 11.49: Let $\alpha_1 : L_1 \to L_2$, $\gamma_1 : L_2 \to L_1$, $\alpha_2 : L_2 \to L_3$, and $\gamma_2 : L_3 \to L_2$. Assume both $(\alpha_1, \gamma_1)$ and $(\alpha_2, \gamma_2)$ are Galois connections. Prove that $(\alpha_2 \circ \alpha_1, \gamma_1 \circ \gamma_2)$ is then also a Galois connection.
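Returning to the trivial program `main(x) { return x; }` and the abstraction function $\alpha_t$, the following Python sketch enumerates a bounded version of the trace semantics and then applies $\alpha_t$ to it. It is only an illustration under stated assumptions: the inputs are restricted to the hypothetical finite tuple `INPUTS` instead of all of $\mathbb{Z}$, the CFG is hand-coded in the hypothetical table `SUCC`, states are frozen into sorted tuples so they can be stored in sets, and, following the example in the text, every trace starts at the entry node with `x` already bound to an input value.

```python
# Hypothetical hand-coded CFG for: main(x) { return x; }
SUCC = {"entry": ["return x"], "return x": ["exit"], "exit": []}
INPUTS = (0, 1, 2)  # assumption: finite stand-in for the integers

def freeze(rho):
    """Turn a state dictionary into a hashable tuple of (variable, value) pairs."""
    return tuple(sorted(rho.items()))

def traces():
    """Compute the least set of traces closed under the extension rule."""
    initial = [(("entry", freeze({"x": z})),) for z in INPUTS]
    result = set(initial)
    work = list(initial)
    while work:
        pi = work.pop()
        v, rho = pi[-1]
        for v2 in SUCC[v]:
            # No node of this program changes the state, so the only
            # possible next state is rho itself.
            extended = pi + ((v2, rho),)
            if extended not in result:
                result.add(extended)
                work.append(extended)
    return result

def alpha_t(T, nodes):
    """Extract, for each node, the set of states paired with it in some trace."""
    return {v: {rho for pi in T for (w, rho) in pi if w == v} for v in nodes}

T = traces()
R = alpha_t(T, list(SUCC))
```

Every trace in `T` carries the same state at its first and last position, which is the relation between input and return value discussed above. After applying `alpha_t`, the entry and exit nodes are both mapped to the same set of three states, and the information about which entry state leads to which exit state is no longer present.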
In Section 11.2 we have established a Galois connection between the domain $(2^{\mathit{ConcreteStates}})^n$ of the reachable states collecting semantics and the domain $\mathit{States}^n$ of the sign analysis, and we have now also established a Galois connection between the domain $2^{\mathit{Traces}}$ of the trace semantics and the domain $(2^{\mathit{ConcreteStates}})^n$. Applying the result from Exercise 11.49 then gives us a Galois connection between $2^{\mathit{Traces}}$ and $\mathit{States}^n$, which can be illustrated like this:
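[Diagram: the three domains $2^{\mathit{Traces}}$, $(2^{\mathit{ConcreteStates}})^n$, and $\mathit{States}^n$. The abstraction functions $\alpha_t$ and $\alpha_c$ and the concretization functions $\gamma_c$ and $\gamma_t$ connect neighboring domains; the compositions $\alpha_c \circ \alpha_t$ and $\gamma_t \circ \gamma_c$ connect $2^{\mathit{Traces}}$ and $\mathit{States}^n$ directly.]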
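The composed abstraction $\alpha_c \circ \alpha_t$ can be made concrete with a small Python sketch. It assumes that the sign abstraction maps a set of concrete states to the least upper bound of the signs of each variable's values, with the lattice elements written as the strings `"bot"`, `"-"`, `"0"`, `"+"`, and `"top"`. The function names, the string encoding, and the hand-written trace set `T` are illustrative assumptions, not definitions taken from the text.

```python
def sign(z):
    """Sign of a single integer."""
    return "0" if z == 0 else ("+" if z > 0 else "-")

def join(a, b):
    """Least upper bound in the sign lattice (bot below -, 0, +; all below top)."""
    if a == "bot":
        return b
    if b == "bot":
        return a
    return a if a == b else "top"

def alpha_t(traces, nodes):
    """Traces -> reachable states per node."""
    return {v: {rho for pi in traces for (w, rho) in pi if w == v} for v in nodes}

def alpha_c(reachable, variables):
    """Reachable states per node -> abstract sign state per node."""
    result = {}
    for v, states in reachable.items():
        abstract = {x: "bot" for x in variables}
        for rho in states:
            for x, z in rho:
                abstract[x] = join(abstract[x], sign(z))
        result[v] = abstract
    return result

def alpha(traces, nodes, variables):
    """The composition alpha_c after alpha_t."""
    return alpha_c(alpha_t(traces, nodes), variables)

# Hypothetical trace set; states are tuples of (variable, value) pairs.
NODES = ["entry", "return x", "exit"]
s0, s1, s2 = (("x", 0),), (("x", 1),), (("x", 2),)
T = {
    (("entry", s1), ("return x", s1), ("exit", s1)),
    (("entry", s2), ("return x", s2)),
    (("entry", s0),),
}
abstract_states = alpha(T, NODES, ["x"])
```

In this example the entry node sees the values 0, 1, and 2 for `x` and is therefore abstracted to `"top"`, while the other two nodes only see positive values and are abstracted to `"+"`. A node that occurs in no trace keeps `"bot"` for every variable.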
Exercise 11.50: Prove that the reachable states collecting semantics is sound with respect to the trace semantics. (Even though the collecting semantics is not a computable analysis, we can still apply the notion of soundness and the proof techniques from Section 11.3.) Hint: You need a variant of the soundness theorem from page 141 that works for infinite-height lattices.

Exercise 11.51: Use the approach from Section 11.3 to prove that the reaching definitions analysis from Section 5.7 is sound. As part of this, you need to specify an appropriate collecting semantics that formally captures what we mean by an assignment being a "reaching definition" at a given program point (see the informal definition in Section 5.7).

Exercise 11.52: Use the approach from Section 11.3 to prove that the available expressions analysis from Section 5.5 is sound. (This is more tricky than Exercise 11.51, because available expressions analysis is a "must" analysis!)

Exercise 11.53: Use the approach from Section 11.3 to prove that the live variables analysis from Section 5.4 is sound. As part of this, you need to specify an appropriate collecting semantics that formally captures what it means for a variable to be live (see the informal definition in Section 5.4). (This is more tricky than Exercise 11.51, because live variables analysis is a "backward" analysis!)

Exercise 11.54: Investigate for some of the abstractions used in analyses presented in the preceding chapters (for example, live variables analysis or reaching definitions analysis) whether or not they are optimal and/or complete.