For your convenience Apress has placed some of the front matter material after the index. Please use the Bookmarks and Contents at a Glance links to access them.
Contents at a Glance

About the Authors
About the Technical Reviewer
Acknowledgments
Introduction
Chapter 1: Getting Started
Chapter 2: Application Fundamentals
Chapter 3: Depth Image Processing
Chapter 4: Skeleton Tracking
Chapter 5: Advanced Skeleton Tracking
Chapter 6: Gestures
Chapter 7: Speech
Chapter 8: Beyond the Basics
Appendix: Kinect Math
Index
Introduction

It is customary to preface a work with an explanation of the author's aim, why he wrote the book, and the relationship in which he believes it to stand to other earlier or contemporary treatises on the same subject. In the case of a technical work, however, such an explanation seems not only superfluous but, in view of the nature of the subject matter, even inappropriate and misleading. In this sense, a technical book is similar to a book about anatomy. We are quite sure that we do not as yet possess the subject matter itself, the content of the science, simply by reading around it, but must in addition exert ourselves to know the particulars by examining real cadavers and by performing real experiments. Technical knowledge requires a similar exertion in order to achieve any level of competence.

Besides the reader's desire to be hands-on rather than heads-down, a book about Kinect development offers some additional challenges due to its novelty. The Kinect seemed to arrive ex nihilo in November of 2010, and attempts to interface with the Kinect technology, originally intended only to be used with the XBOX gaming system, began almost immediately. The popularity of these efforts to hack the Kinect appears to have taken even Microsoft unawares.

Several frameworks for interpreting the raw feeds from the Kinect sensor were released prior to Microsoft's official reveal of the Kinect SDK in July of 2011, including libfreenect, developed by the OpenKinect community, and OpenNI, developed primarily by PrimeSense, vendors of one of the key technologies used in the Kinect sensor. The surprising nature of the Kinect's release, as well as Microsoft's apparent failure to anticipate the overwhelming desire on the part of developers, hobbyists, and even research scientists to play with the technology, may give the impression that the Kinect SDK is a hodgepodge or even a briefly flickering fad.

The gesture recognition capabilities made affordable by the Kinect, however, have been researched at least since the late 70s. A brief search on YouTube for the phrase "put that there" will bring up Chris Schmandt's 1979 work with the MIT Media Lab demonstrating key Kinect concepts such as gesture tracking and speech recognition. The influence of Schmandt's work can be seen in Mark Lucente's work with gesture and speech recognition in the 90s for IBM Research on a project called DreamSpace. These early concepts came together in the central image from Steven Spielberg's 2002 film Minority Report that captured viewers' imaginations concerning what the future should look like. That image was of Tom Cruise waving his arms and manipulating his computer screens without touching either the monitors or any input devices. In the middle of an otherwise dystopic society filled with robotic spiders, ubiquitous marketing, and panopticon police surveillance, Steven Spielberg offered us a vision not only of a possible technological future but of a future we wanted.

Although Minority Report was intended as a vision of technology 50 years in the future, the first concept videos for the Kinect, code-named Project Natal, started appearing only seven years after the movie's release. One of the first things people noticed about the technology with respect to its cinematic predecessor was that the Kinect did not require Tom Cruise's three-fingered, blue-lit gloves to function. We had not only caught up to the future as envisioned by Minority Report in record time but had even surpassed it.

The Kinect is only new in the sense that it has recently become affordable and fit for mass production. As pointed out above, it has been anticipated in research circles for over 40 years. The principal concepts of gesture recognition have not changed substantially in that time. Moreover, the cinematic exploration of gesture-recognition devices demonstrates that the technology has succeeded in making a deep connection with people's imaginations, filling a need we did not know we had.

In the near future, readers can expect to see Kinect sensors built into monitors and laptops as gesture-based interfaces gain ground in the marketplace. Over the next few years, Kinect-like technology will begin appearing in retail stores, public buildings, malls, and multiple locations in the home. As the hardware improves and becomes ubiquitous, the authors anticipate that the Kinect SDK will become the leading software platform for working with it. Although slow out of the gate with the Kinect SDK, Microsoft's expertise in platform development, the fact that it owns the technology, as well as its intimate experience with the Kinect for game development afford it remarkable advantages over the alternatives. While predictions about the future of technology have been shown, over the past few years, to be a treacherous endeavor, the authors posit with some confidence that skills gained in developing with the Kinect SDK will not become obsolete in the near future.

Even more important, however, developing with the Kinect SDK is fun in a way that typical development is not. The pleasure of building your first skeleton tracking program is difficult to describe. It is in order to share this ineffable experience, one familiar to anyone who still remembers his or her first software program and became a software developer in the belief that this sense of joy and accomplishment was repeatable, that we have written this book.
About This Book

This book is for the inveterate tinkerer who cannot resist playing with code samples before reading the instructions on why the samples are written the way they are. After all, you bought this book in order to find out how to play with the Kinect sensor and replicate some of the exciting scenarios you may have seen online. We understand if you do not want to initially wade through detailed explanations before seeing how far you can get with the samples on your own. At the same time, we have included in-depth information about why the Kinect SDK works the way it does and provided guidance on the tricks and pitfalls of working with the SDK. You can always go back and read this information at a later point as it becomes important to you.

The chapters are provided in roughly sequential order, with each chapter building upon the chapters that went before. They begin with the basics, move on to image processing and skeleton tracking, then address more sophisticated scenarios involving complex gestures and speech recognition. Finally they demonstrate how to combine the SDK with other code libraries in order to build complex effects. The appendix offers an overview of mathematical and kinematic concepts that you will want to become familiar with as you plan out your own unique Kinect applications.
Chapter Overview

Chapter 1: Getting Started
Your imagination is running wild with ideas and cool designs for applications. There are a few things to know first, however. This chapter will cover the surprisingly long history that led up to the creation of the Kinect for Windows SDK. It will then provide step-by-step instructions for downloading and installing the necessary libraries and tools needed to develop applications for the Kinect.

Chapter 2: Application Fundamentals
This chapter guides the reader through the process of building a Kinect application. At the completion of this chapter, the reader will have the foundation needed to write relatively sophisticated Kinect applications using the Microsoft SDK. This includes getting data from the Kinect to display a live image feed as well as a few tricks to manipulate the image stream. The basic code introduced here is common to virtually all Kinect applications.

Chapter 3: Depth Image Processing
The depth stream is at the core of Kinect technology. This code-intensive chapter explains the depth stream in detail: what data the Kinect sensor provides and what can be done with this data. Examples include creating images where users are identified and their silhouettes are colored, as well as simple tricks using the silhouettes to determine the distance of the user from the Kinect and other users.

Chapter 4: Skeleton Tracking
By using the data from the depth stream, the Microsoft SDK can determine human shapes. This is called skeleton tracking. The reader will learn how to get skeleton tracking data, what that data means, and how to use it. At this point, you will know enough to have some fun. Walkthroughs include visually tracking skeleton joints and bones, and creating some basic games.

Chapter 5: Advanced Skeleton Tracking
There is more to skeleton tracking than just creating avatars and skeletons. Sometimes reading and processing raw Kinect data is not enough. It can be volatile and unpredictable. This chapter provides tips and tricks to smooth out this data to create more polished applications. In this chapter we will also move beyond the depth image and work with the live image. Using the data produced by the depth image and the visual of the live image, we will work with an augmented reality application.

Chapter 6: Gestures
The next level in Kinect development is processing skeleton tracking data to detect user gestures. Gestures make interacting with your application more natural. In fact, there is a whole field of study dedicated to natural user interfaces. This chapter will introduce NUI and show how it affects application development. Kinect is so new that well-established gesture libraries and tools are still lacking. This chapter will give guidance to help define what a gesture is and how to implement a basic gesture library.

Chapter 7: Speech
The Kinect is more than just a sensor that sees the world. It also hears it. The Kinect has an array of microphones that allows it to detect and process audio. This means that the user can use voice commands as well as gestures to interact with an application. In this chapter, you will be introduced to the Microsoft Speech Recognition SDK and shown how it is integrated with the Kinect microphone array.

Chapter 8: Beyond the Basics
This chapter introduces the reader to much more complex development that can be done with the Kinect. It addresses useful tools and ways to manipulate depth data to create complex applications and advanced Kinect visuals.

Appendix A: Kinect Math
Basic math skills and formulas needed when working with Kinect. Gives only the practical information needed for development tasks.
What You Need to Use This Book

The Kinect SDK requires the Microsoft .NET Framework 4.0. To build applications with it, you will need either Visual Studio 2010 Express or another version of Visual Studio 2010. The Kinect SDK may be downloaded at http://www.kinectforwindows.org/download/.

The samples in this book are written with WPF 4 and C#. The Kinect SDK merely provides a way to read and manipulate the sensor streams from the Kinect device. Additional technology is required in order to display this data in interesting ways. For this book we have selected WPF, the preeminent vector graphic platform in the Microsoft stack as well as a platform generally familiar to most developers working with Microsoft technologies. C#, in turn, is the .NET language with the greatest penetration among developers.
About the Code Samples

The code samples in this book have been written for version 1.0 of the Kinect for Windows SDK released on February 1st, 2012. You are invited to copy any of the code and use it as you will, but the authors hope you will actually improve upon it. Book code, after all, is not real code. Each project and snippet found in this book has been selected for its ability to illustrate a point rather than its efficiency in performing a task. Where possible we have attempted to provide best practices for writing performant Kinect code, but whenever good code collided with legible code, legibility tended to win.

More painful to us, given that both the authors work for a design agency, was the realization that the book you hold in your hands needed to be about Kinect code rather than about Kinect design. To this end, we have reined in our impulse to build elaborate presentation layers in favor of spare, workmanlike designs.

The source code for the projects described in this book is available for download at http://www.apress.com/9781430241041. This is the official homepage of the book. You can also check for errata and find related Apress titles here.
CHAPTER 1
Getting Started

In this chapter, we explain what makes Kinect special and how Microsoft got to the point of providing a Kinect for Windows SDK, something that Microsoft apparently did not envision when it released what was thought of as a new kind of "controller-free" controller for the Xbox. We take you through the steps involved in installing the Kinect for Windows SDK, plugging in your Kinect sensor, and verifying that everything is working the way it should in order to start programming for Kinect. We then navigate through the samples provided with the SDK and describe their significance in demonstrating how to program for the Kinect.
The Kinect Creation Story

The history of Kinect begins long before the device itself was conceived. Kinect has roots in decades of thinking and dreaming about user interfaces based upon gesture and voice. The hit 2002 movie The Minority Report added fuel to the fire with its futuristic depiction of a spatial user interface. Rivalry between competing gaming consoles brought the Kinect technology into our living rooms. It was the hacker ethic of unlocking anything intended to be sealed, however, that eventually opened up the Kinect to developers.
Pre-History

Bill Buxton has been talking over the past few years about something he calls the Long Nose of Innovation. A play on Chris Anderson's notion of the Long Tail, the Long Nose describes the decades of incubation time required to produce a "revolutionary" new technology apparently out of nowhere. The classic example is the invention and refinement of a device central to the GUI revolution: the mouse.

The first mouse prototype was built by Douglas Engelbart and Bill English, then at the Stanford Research Institute, in 1963. They even gave the device its murine name. Bill English developed the concept further when he took it to Xerox PARC in 1973. With Jack Hawley, he added the famous mouse ball to the design of the mouse. During this same time period, Telefunken in Germany was independently developing its own rollerball mouse device called the Telefunken Rollkugel. By 1982, the first commercial mouse began to find its way to the market. Logitech began selling one for $299. It was somewhere in this period that Steve Jobs visited Xerox PARC and saw the mouse working with a WIMP interface (windows, icons, menus, pointers). Sometime after that, Jobs invited Bill Gates to see the mouse-based GUI interface he was working on. Apple released the Lisa in 1983 with a mouse, and then equipped the Macintosh with the mouse in 1984. Microsoft announced its Windows OS shortly after the release of the Lisa and began selling Windows 1.0 in 1985. It was not until 1995, with the release of Microsoft's Windows 95 operating system, that the mouse became ubiquitous. The Long Nose describes the 30-year span required for devices like the mouse to go from invention to ubiquity.
A similar 30-year Long Nose can be sketched out for Kinect. Starting in the late 70s, about halfway into the mouse's development trajectory, Chris Schmandt at the MIT Architecture Machine Group started a research project called Put-That-There, based on an idea by Richard Bolt, which combined voice and gesture recognition as input vectors for a graphical interface. The Put-That-There installation lived in a sixteen-foot by eleven-foot room with a large projection screen against one wall. The user sat in a vinyl chair about eight feet in front of the screen and had a magnetic cube hidden up one wrist for spatial input as well as a head-mounted microphone. With these inputs, and some rudimentary speech parsing logic around pronouns like "that" and "there," the user could create and move basic shapes around the screen. Bolt suggests in his 1980 paper describing the project, "Put-That-There: Voice and Gesture at the Graphics Interface," that eventually the head-mounted microphone should be replaced with a directional mic. Subsequent versions of Put-That-There allowed users to guide ships through the Caribbean and place colonial buildings on a map of Boston.

Another MIT Media Labs research project from 1993 by David Koons, Kristinn Thorisson, and Carlton Sparrell, again directed by Bolt, called The Iconic System refined the Put-That-There concept to work with speech and gesture as well as a third input modality: eye-tracking. Also, instead of projecting input onto a two-dimensional space, the graphical interface was a computer-generated three-dimensional space. In place of the magnetic cubes used for Put-That-There, the Iconic System included special gloves to facilitate gesture tracking.

Towards the late 90s, Mark Lucente developed an advanced user interface for IBM Research called DreamSpace, which ran on a variety of platforms including Windows NT. It even implemented the Put-That-There syntax of Chris Schmandt's 1979 project. Unlike any of its predecessors, however, DreamSpace did not use wands or gloves for gesture recognition. Instead, it used a vision system. Moreover, Lucente envisioned DreamSpace not only for specialized scenarios but also as a viable alternative to standard mouse and keyboard inputs for everyday computing. Lucente helped to popularize speech and gesture recognition by demonstrating DreamSpace at trade shows between 1997 and 1999.

In 1999 John Underkoffler, also with MIT Media Labs and a coauthor with Mark Lucente on a paper a few years earlier on holography, was invited to work on a new Steven Spielberg project called The Minority Report. Underkoffler eventually became the Science and Technology Advisor on the film and, with Alex McDowell, the film's Production Designer, put together the user interface Tom Cruise uses in the movie. Some of the design concepts from The Minority Report UI eventually ended up in another project Underkoffler worked on called G-Speak.

Perhaps Underkoffler's most fascinating design contribution to the film was a suggestion he made to Spielberg to have Cruise accidentally put his virtual desktop into disarray when he turns and reaches out to shake Colin Farrell's hand. It is a scene that captures the jarring acknowledgment that even "smart" computer interfaces are ultimately still reliant on conventions and that these conventions are easily undermined by the uncanny facticity of real life.

The Minority Report was released in 2002. The film visuals immediately seeped into the collective unconscious, hanging in the zeitgeist like a promissory note. A mild discontent over the prevalence of the mouse in our daily lives began to be felt, and the press as well as popular attention began to turn toward what we came to call the Natural User Interface (NUI). Microsoft began working on its innovative multitouch platform Surface in 2003, began showing it in 2007, and eventually released it in 2008. Apple unveiled the iPhone in 2007. The iPad began selling in 2010. As each NUI technology came to market, it was accompanied by comparisons to The Minority Report.
The Minority Report

So much ink has been spilled about the obvious influence of The Minority Report on the development of Kinect that at one point I insisted to my co-author that we should try to avoid ever using the words "minority" and "report" together on the same page. In this endeavor I have failed miserably and concede that avoiding mention of The Minority Report when discussing Kinect is virtually impossible.

One of the more peculiar responses to the movie was the movie critic Roger Ebert's opinion that it offered an "optimistic preview" of the future. The Minority Report, based loosely on a short story by Philip K. Dick, depicts a future in which police surveillance is pervasive to the point of predicting crimes before they happen and incarcerating those who have not yet committed the crimes. It includes massively pervasive marketing in which retinal scans are used in public places to target advertisements to pedestrians based on demographic data collected on them and stored in the cloud. Genetic experimentation results in monstrously carnivorous plants, robot spiders that roam the streets, a thriving black market in body parts that allows people to change their identities and, perhaps the most jarring future prediction of all, policemen wearing rocket packs.

Perhaps what Ebert responded to was the notion that the world of The Minority Report was a believable future, extrapolated from our world, demonstrating that through technology our world can actually change and not merely be more of the same. Even if it introduces new problems, science fiction reinforces the idea that technology can help us leave our current problems behind. In the 1958 book The Human Condition, the author and philosopher Hannah Arendt characterizes the role of science fiction in society by saying, "…science has realized and affirmed what men anticipated in dreams that were neither wild nor idle…buried in the highly non-respectable literature of science fiction (to which, unfortunately, nobody yet has paid the attention it deserves as a vehicle of mass sentiments and mass desires)." While we may not all be craving rocket packs, we do all at least have the aspiration that technology will significantly change our lives.

What is peculiar about The Minority Report and, before that, science fiction series like the Star Trek franchise, is that they do not always merely predict the future but can even shape that future. When I first walked through automatic sliding doors at a local convenience store, I knew this was based on the sliding doors on the USS Enterprise. When I held my first flip phone in my hands, I knew it was based on Captain Kirk's communicator and, moreover, would never have been designed this way had Star Trek never aired on television.

If The Minority Report drove the design and adoption of the gesture recognition system on Kinect, Star Trek can be said to have driven the speech recognition capabilities of Kinect. In interviews with Microsoft employees and executives, there are repeated references to the desire to make Kinect work like the Star Trek computer or the Star Trek holodeck. There is a sense in those interviews that if the speech recognition portion of the device was not solved (and occasionally there were discussions about dropping the feature as it fell behind schedule), the Kinect sensor would not have been the future device everyone wanted.
Microsoft’sSecretProject Inthegamingworld,Nintendothrewdownthegauntletatthe2005TokyoGameShowconferencewith theunveilingoftheWiiconsole.TheconsolewasaccompaniedbyanewgamingdevicecalledtheWii Remote.LikethemagneticcubesfromtheoriginalPut-That-Thereproject,theWiiRemotecandetect movementalongthreeaxes.Additionally,theremotecontainsanopticalsensorthatdetectswhereitis pointing.Itisalsobatterypowered,eliminatinglongcordstotheconsolecommontootherplatforms. FollowingthereleaseoftheWiiin2006,PeterMoore,thenheadofMicrosoft’sXboxdivision, demandedworkstartonacompetitiveWiikiller.ItwasalsoaroundthistimethatAlexKipman,headof anincubationteaminsidetheXboxdivision,metthefoundersofPrimeSenseatthe2006Electronic EntertainmentExpo.MicrosoftcreatedtwocompetingteamstocomeupwiththeintendedWiikiller: oneworkingwiththePrimeSensetechnologyandtheotherworkingwithtechnologydevelopedbya companycalled3DV.ThoughtheoriginalgoalwastounveilsomethingatE32007,neitherteamseemed tohaveanythingsufficientlypolishedintimefortheexposition.Thingswerethrownabitmoreofftrack in2007whenPeterMooreannouncedthathewasleavingMicrosofttogoworkforElectronicArts.
It is clear that by the summer of 2007 the secret work being done inside the Xbox team was gaining momentum internally at Microsoft. At the D: All Things Digital conference that year, Bill Gates was interviewed side-by-side with Steve Jobs. During that interview, in response to a question about Microsoft Surface and whether multitouch would become mainstream, Gates began talking about vision recognition as the step beyond multitouch:

Gates: Software is doing vision. And so, imagine a game machine where you just can pick up the bat and swing it or pick up the tennis racket and swing it.

Interviewer: We have one of those. That's Wii.

Gates: No. No. That's not it. You can't pick up your tennis racket and swing it. You can't sit there with your friends and do those natural things. That's a 3-D positional device. This is video recognition. This is a camera seeing what's going on. In a meeting, when you are on a conference, you don't know who's speaking when it's audio only…the camera will be ubiquitous…software can do vision, it can do it very, very inexpensively…and that means this stuff becomes pervasive. You don't just talk about it being in a laptop device. You talk about it being a part of the meeting room or the living room…

Amazingly the interviewer, Walt Mossberg, cut Gates off during his fugue about the future of technology and turned the conversation back to what was most important in 2007: laptops! Nevertheless, Gates revealed in this interview that Microsoft was already thinking of the new technology being developed in the Xbox team as something more than merely a gaming device. It was already thought of as a device for the office as well.
Gates:Softwareisdoingv ision.Ands o,imagine agamemachinewherey oujustcan pickupthebatandswingitorpickupthetennisracketandswingit. Interviewer:Wehaveoneofthose.That’sWii. Gates:No.No.That’snotit.Youcan’t pick upyour tennisracketandswingit.You can’tsitth erewithyourfri endsandd othosenaturalthings.That’sa3-Dpo sitional device.This isvideor ecognition.This is a camera seeing what’s going on. In a meeting,wh enyouar eon ac onference,youdon’t k nowwho’ ss peakingwh enit’s audioonly …the camerawillb eubiquitous …softwar e cand ovisi on,it candoit very,veryinexpensively…andthatmeansthisstuffbecomespervasive.Youdon’tjust talkab outit beinginala ptopd evice. Yout alkab outitbeinga part ofth em eeting roomorthelivingroom… Amazinglytheinterviewer,WaltMossberg,cutGatesoffduringhisfugueaboutthefutureof technologyandturnedtheconversationbacktowhatwasmostimportantin2007:laptops! Nevertheless,GatesrevealedinthisinterviewthatMicrosoftwasalreadythinkingofthenewtechnology beingdevelopedintheXboxteamassomethingmorethanmerelyagamingdevice.Itwasalready thoughtofasadevicefortheofficeaswell. FollowingMoore’sdeparture,DonMatricktookupthereigns,guidingtheXboxteam.In2008,he revivedthesecretvideorecognitionprojectaroundthePrimeSensetechnology.While3DV’stechnology apparentlynevermadeitintothefinalKinect,Microsoftboughtthecompanyin2009for$35million. ThiswasapparentlydoneinordertodefendagainstpotentialpatentdisputesaroundKinect.Alex Kipman,amanagerwithMicrosoftsince2001,wasmadeGeneralManagerofIncubationandputin chargeofcreatingthenewProjectNataldevicetoincludedepthrecognition,motiontracking,facial recognition,andspeechrecognition.
Note: What's in a name? Microsoft has traditionally, if not consistently, given city names to large projects as their code names. Alex Kipman dubbed the secret Xbox project Natal, after his hometown in Brazil.
The reference device created by PrimeSense included an RGB camera, an infrared sensor, and an infrared light source. Microsoft licensed PrimeSense's reference design and PS1080 chip design, which processed depth data at 30 frames per second. Importantly, it processed depth data in an innovative way that drastically cut the price of depth recognition compared to the prevailing method at the time called "time of flight," a technique that tracks the time it takes for a beam of light to leave and then return to the sensor. The PrimeSense solution was to project a pattern of infrared dots across the room and use the size and spacing between dots to form a 320 x 240 pixel depth map analyzed by the PS1080 chip. The chip also automatically aligned the information for the RGB camera and the infrared camera, providing RGBD data to higher systems.

Microsoft added a four-piece microphone array to this basic structure, effectively providing a directional microphone for speech recognition that would be effective in a large room. Microsoft already had years of experience with speech recognition, which has been available on its operating systems since Windows XP.

Kudo Tsunada, recently hired away from Electronic Arts, was also brought on the project, leading his own incubation team, to create prototype games for the new device. He and Kipman had a deadline of August 18, 2008, to show a group of Microsoft executives what Project Natal could do. Tsunada's team came up with 70 prototypes, some of which were shown to the execs. The project got the green light and the real work began. They were given a launch date for Project Natal: Christmas of 2010.
Microsoft Research

While the hardware problem was mostly solved thanks to PrimeSense (all that remained was to give the device a smaller form factor), the software challenges seemed insurmountable. First, a responsive motion recognition system had to be created based on the RGB and depth data streams coming from the device. Next, serious scrubbing had to be performed in order to make the audio feed workable with the underlying speech platform. The Project Natal team turned to Microsoft Research (MSR) to help solve these problems.

MSR is a multibillion-dollar annual investment by Microsoft. The various MSR locations are typically dedicated to pure research in computer science and engineering rather than to trying to come up with new products for their parent. It must have seemed strange, then, when the Xbox team approached various branches of Microsoft Research to not only help them come up with a product but to do so according to the rhythms of a very short product cycle.

In late 2008, the Project Natal team contacted Jamie Shotton at the MSR office in Cambridge, England, to help with their motion-tracking problem. The motion tracking solution Kipman's team came up with had several problems. First, it relied on the player getting into an initial T-shaped pose to allow the motion capture software to discover him. Next, it would occasionally lose the player during motion, obligating the player to reinitialize the system by once again assuming the T position. Finally, the motion tracking software would only work with the particular body type it was designed for: that of Microsoft executives.

On the other hand, the depth data provided by the sensor already solved several major problems for motion tracking. The depth data allows easy filtering of any pixels that are not the player. Extraneous information such as the color and texture of the player's clothes are also filtered out by the depth camera data. What is left is basically a player blob represented in pixel positions, as shown in Figure 1-1. The depth camera data, additionally, provides information about the height and width of the player in meters.
Figure 1-1. The player blob

The challenge for Shotton was to turn this outline of a person into something that could be tracked. The problem, as he saw it, was to break up the player blob provided by the depth stream into recognizable body parts. From these body parts, joints can be identified, and from these joints, a skeleton can be reconstructed. Working with Andrew Fitzgibbon and Andrew Blake, Shotton arrived at an algorithm that could distinguish 31 body parts (see Figure 1-2). Out of these parts, the version of Kinect demonstrated at E3 in 2009 could produce 48 joints (the Kinect SDK, by contrast, exposes 20 joints).
Figure 1-2. Player parts
To get around the initial T-pose required of the player for calibration, Shotton decided to appeal to the power of computer learning. With lots and lots of data, the image recognition software could be trained to break up the player blob into usable body parts. Teams were sent out to videotape people in their homes performing basic physical motions. Additional data was collected in a Hollywood motion capture studio of people dancing, running, and performing acrobatics. All of this video was then passed through a distributed computation engine called Dryad that had been developed by another branch of Microsoft Research in Mountain View, California, in order to begin generating a decision tree classifier that could map any given pixel of Kinect's RGBD stream onto one of the 31 body parts. This was done for 12 different body types and repeatedly tweaked to improve the decision software's ability to identify a person without an initial pose, without breaks in recognition, and for different kinds of people.

This took care of The Minority Report aspect of Kinect. To handle the Star Trek portion, Alex Kipman turned to Ivan Tashev of the Microsoft Research group based in Redmond. Tashev and his team had worked on the microphone array implementation on Windows Vista. Just as being able to filter out non-player pixels is a large part of the skeletal recognition solution, filtering out background noise on a microphone array situated much closer to a stereo system than it is to the speaker was the biggest part of making speech recognition work on Kinect. Using a combination of patented technologies (provided to us for free in the Kinect for Windows SDK), Tashev's team came up with innovative noise suppression and echo cancellation tricks that improved the audio processing pipeline many times over the standard that was available at the time.

Based on this audio scrubbing, a distributed computer learning program of a thousand computers spent a week building an acoustical model for Kinect based on various American regional accents and the peculiar acoustic properties of the Kinect microphone array. This model became the basis of the Tell Me feature included with the Xbox as well as the Kinect for Windows Runtime Language Pack used with the Kinect for Windows SDK. Cutting things very close, the acoustical model was not completed until September 26, 2010. Shortly after, on November 4, the Kinect sensor was released.
The Race to Hack Kinect

The release of the Kinect sensor was met with mixed reviews. Gaming sites generally acknowledged that the technology was cool but felt that players would quickly grow tired of the gameplay. This did not slow down Kinect sales, however. The device sold an average of 133 thousand units a day for the first 60 days after the launch, breaking the sales records of both the iPhone and the iPad and setting a new Guinness world record. It wasn't that the gaming review sites were wrong about the novelty factor of Kinect; it was just that people wanted Kinect anyway, whether they played with it every day or only for a few hours. It was a piece of the future they could have in their living rooms.

The excitement in the consumer market was matched by the excitement in the computer hacking community. The hacking story starts with Johnny Chung Lee, the man who originally hacked a Wii Remote to implement finger tracking and was later hired onto the Project Natal team to work on gesture recognition. Frustrated by the failure of internal efforts at Microsoft to publish a public driver, Lee approached AdaFruit, a vendor of open-source electronic kits, to host a contest to hack Kinect. The contest, announced on the day of the Kinect launch, was built around an interesting hardware feature of the Kinect sensor: it uses a standard USB connector to talk to the Xbox. This same USB connector can be plugged into the USB port of any PC or laptop. The first person to successfully create a driver for the device and write an application converting the data streams from the sensor into video and depth displays would win the $1,000 bounty that Lee had put up for the contest.

On the same day, Microsoft made the following statement in response to the AdaFruit contest: "Microsoft does not condone the modification of its products…With Kinect, Microsoft built in numerous hardware and software safeguards designed to reduce the chances of product tampering. Microsoft will continue to make advances in these types of safeguards and work closely with law enforcement and product safety groups to keep Kinect tamper-resistant." Lee and AdaFruit responded by raising the bounty to $2,000.

By November 6, Joshua Blake, Seth Sandler, Kyle Machulis, and others had created the OpenKinect mailing list to help coordinate efforts around the contest. Their notion was that the driver problem was solvable but that the longevity of the Kinect hacking effort for the PC would involve sharing information and building tools around the technology. They were already looking beyond the AdaFruit contest and imagining what would come after. In a November 7 post to the list, they even proposed sharing the bounty with the OpenKinect community, if someone on the list won the contest, in order to look past the money and toward what could be done with the Kinect technology. Their mailing list would go on to be the home of the Kinect hacking community for the next year.

Simultaneously on November 6, a hacker known as AlexP was able to control Kinect's motors and read its accelerometer data. The AdaFruit bounty was raised to $3,000. On Monday, November 8, AlexP posted video showing that he could pull both RGB and depth data streams from the Kinect sensor and display them. He could not collect the prize, however, because of concerns about open sourcing his code. On the 8th, Microsoft also clarified its previous position in a way that appeared to allow the ongoing efforts to hack Kinect as long as it wasn't called "hacking":
Kinect for Xbox 360 has not been hacked—in any way—as the software and hardware that are part of Kinect for Xbox 360 have not been modified. What has happened is someone has created drivers that allow other devices to interface with the Kinect for Xbox 360. The creation of these drivers, and the use of Kinect for Xbox 360 with other devices, is unsupported. We strongly encourage customers to use Kinect for Xbox 360 with their Xbox 360 to get the best experience possible.

On November 9, AdaFruit finally received a USB analyzer, the Beagle 480, in the mail and set to work publishing USB data dumps coming from the Kinect sensor. The OpenKinect community, calling themselves "Team Tiger," began working on this data over an IRC channel and had made significant progress by Wednesday morning before going to sleep. At the same time, however, Hector Martin, a computer science major in Bilbao, Spain, had just purchased Kinect and had begun going through the AdaFruit data. Within a few hours he had written the driver and application to display RGB and depth video. The AdaFruit prize had been claimed in only seven days.

Martin became a contributor to the OpenKinect group and a new library, libfreenect, became the basis of the community's hacking efforts. Joshua Blake announced Martin's contribution to the OpenKinect mailing list in the following post:
I got a hold of Hector on IRC just after he posted the video and talked to him about this group. He said he'd be happy to join us (and in fact has already subscribed). After he sleeps to recover, we'll talk some more about integrating his work and our work.

This is when the real fun started. Throughout November, people started to post videos on the Internet showing what they could do with Kinect. Kinect-based artistic displays, augmented reality experiences, and robotics experiments started showing up on YouTube. Sites like KinectHacks.net sprang up to track all the things people were building with Kinect. By November 20, someone had posted a video of a lightsaber simulator using Kinect: another movie aspiration checked off. Microsoft, meanwhile, was not idle. The company watched with excitement as hundreds of Kinect hacks made their way to the web.
8
CHAPTER1GETTINGSTARTED
overwhatwasthenpossiblewithlibfreenectandprojectsthatrequiredintegrationofRGBanddepth databeganmigratingovertotheOpenNItechnologystackthatPrimeSensehadmadeavailable.Without thekeyMicrosoftResearchtechnologies,however,skeletontrackingwithOpenNIstillrequiredthe awkwardT-posetoinitializeskeletonrecognition. OnJune17,2011,MicrosoftfinallyreleasedtheKinectSDKbetatothepublicunderanoncommerciallicenseafterdemonstratingitforseveralweeksateventslikeMIX.Aspromised,itincluded theskeletonrecognitionalgorithmsthatmakeaninitialposeunnecessaryaswellastheAECtechnology andacousticmodelsrequiredtomakeKinectspeechrecognitionsystemworkinalargeroom.Every developernowhadaccesstothesametoolsMicrosoftusedinternallyfordevelopingKinectapplications forthecomputer.
TheKinectforWindowsSDK TheKinectforWindowsSDKisthesetoflibrariesthatallowsustoprogramapplicationsonavarietyof MicrosoftdevelopmentplatformsusingtheKinectsensorasinput.Withit,wecanprogramWPF applications,WinFormsapplications,XNAapplicationsand,withalittlework,evenbrowser-based applicationsrunningontheWindowsoperatingsystem—though,oddlyenough,wecannotcreateXbox gameswiththeKinectforWindowsSDK.DeveloperscanusetheSDKwiththeXboxKinectSensor.In ordertouseKinect'snearmodecapabilities,however,werequiretheofficialKinectforWindows hardware.Additionally,theKinectforWindowssensorisrequiredforcommercialdeployments.
Understanding the Hardware

The Kinect for Windows SDK takes advantage of and is dependent upon the specialized components included in all planned versions of the Kinect device. In order to understand the capabilities of the SDK, it is important to first understand the hardware it talks to. The glossy black case for the Kinect components includes a head as well as a base, as shown in Figure 1-3. The head is 12 inches by 2.5 inches by 1.5 inches. The attachment between the base and the head is motorized. The case hides an infrared projector, two cameras, four microphones, and a fan.
Figure 1-3. The Kinect case
I do not recommend ever removing the Kinect case. In order to show the internal components, however, I have removed the case, as shown in Figure 1-4. On the front of Kinect, from left to right respectively when facing Kinect, you will find the sensors and light source that are used to capture RGB and depth data. To the far left is the infrared light source. Next to this is the LED ready indicator. Next is the color camera used to collect RGB data, and finally, on the right (toward the center of the Kinect head), is the infrared camera used to capture depth data. The color camera supports a maximum resolution of 1280 x 960 while the depth camera supports a maximum resolution of 640 x 480.
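Both of these maximum resolutions are selectable when a stream is enabled from code. As a small sketch of our own (not one of the chapter's listings), assuming a KinectSensor instance named sensor, the stream-enable overloads look like this:

// color at its maximum 1280 x 960 resolution (delivered at 12 fps)
sensor.ColorStream.Enable(ColorImageFormat.RgbResolution1280x960Fps12);

// depth at its maximum 640 x 480 resolution (delivered at 30 fps)
sensor.DepthStream.Enable(DepthImageFormat.Resolution640x480Fps30);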
Figure 1-4. The Kinect components

On the underside of Kinect is the microphone array. The microphone array is composed of four different microphones. One is located to the left of the infrared light source. The other three are evenly spaced to the right of the depth camera.

If you bought a Kinect sensor without an Xbox bundle, the Kinect comes with a Y-cable, which extends the USB connector wire on Kinect as well as providing additional power to Kinect. The USB extender is required because the male connector that comes off of Kinect is not a standard USB connector. The additional power is required to run the motors on the Kinect.

If you buy a new Xbox bundled with Kinect, you will likely not have a Y-cable included with your purchase. This is because the newer Xbox consoles have a proprietary female USB connector that works with Kinect as is and does not require additional power for the Kinect servos. This is a problem, and a source of enormous confusion, if you intend to use Kinect for PC development with the Kinect SDK. You will need to purchase the Y-cable separately if you did not get it with your Kinect. It is typically marketed as a Kinect AC Adapter or Kinect Power Source. Software built using the Kinect SDK will not work without it.

A final piece of interesting Kinect hardware, sold by Nyko rather than by Microsoft, is called the Kinect Zoom. The base Kinect hardware performs depth recognition between 0.8 and 4 meters. The Kinect Zoom is a set of lenses that fit over Kinect, allowing the Kinect sensor to be used in rooms smaller than the standard dimensions Microsoft recommends. It is particularly appealing for users of the Kinect SDK who might want to use it for specialized functionality such as custom finger tracking logic or productivity tool implementations involving a person sitting down in front of Kinect. From experimentation, it actually turns out to not be very good for playing games, perhaps due to the quality of the lenses.
Kinect for Windows SDK Hardware and Software Requirements

Unlike other Kinect libraries, the Kinect for Windows SDK, as its name suggests, only runs on Windows operating systems. Specifically, it runs on x86 and x64 versions of Windows 7. It has been shown to also work on early versions of Windows 8. Because Kinect was designed for Xbox hardware, it requires roughly similar hardware on a PC to run effectively.
Hardware Requirements

• Computer with a dual-core, 2.66-GHz or faster processor
• Windows 7–compatible graphics card that supports Microsoft DirectX 9.0c capabilities
• 2 GB of RAM (4 GB of RAM recommended)
• Kinect for Xbox 360 sensor
• Kinect USB power adapter
Use the free Visual Studio 2010 Express or other VS 2010 editions to program against the Kinect for Windows SDK. You will also need to have the DirectX 9.0c runtime installed. Later versions of DirectX are not backwards compatible. You will also, of course, want to download and install the latest version of the Kinect for Windows SDK. The Kinect SDK installer will install the Kinect drivers and the Microsoft.Kinect assembly, as well as code samples.
Software Requirements

• Microsoft Visual Studio 2010 Express or other Visual Studio 2010 edition: http://www.microsoft.com/visualstudio/en-us/products/2010-editions/express
• Microsoft .NET Framework 4
• The Kinect for Windows SDK (x86 or x64): http://www.kinectforwindows.com
• For C++ SkeletalViewer samples:
  • DirectX Software Development Kit, June 2010 or later version: http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=6812
  • DirectX End-User Runtime Web Installer: http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=35
To take full advantage of the audio capabilities of Kinect, you will also need additional Microsoft speech recognition software: the Speech Platform API, the Speech Platform SDK, and the Kinect for Windows Runtime Language Pack. Fortunately, the install for the SDK automatically installs these additional components for you. Should you ever accidentally uninstall these speech components, however, it is important to be aware that the other Kinect features, such as depth processing and skeleton tracking, are fully functional even without the speech components.
Step-By-Step Installation

Before installing the Kinect for Windows SDK:

1. Verify that your Kinect device is not plugged into the computer you are installing to.
2. Verify that Visual Studio is closed during the installation process.
If you have other Kinect drivers on your computer such as those provided by PrimeSense, you should consider removing these. They will not run side-by-side with the SDK, and the Kinect drivers provided by Microsoft will not interoperate with other Kinect libraries such as OpenNI or libfreenect. It is possible to install and uninstall the SDK on top of other Kinect platforms and switch back and forth by repeatedly uninstalling and reinstalling the SDK. However, this has also been known to cause inconsistencies, as the wrong driver can occasionally be loaded when performing this procedure. If you plan to go back and forth between different Kinect stacks, installing on separate machines is the safest path.

To uninstall other drivers, including previous versions of those provided with the SDK, go to Programs and Features in the Control Panel, select the name of the driver you wish to remove, and click Uninstall.

Download the appropriate installation msi (x86 or x64) for your computer. If you are uncertain whether your version of Windows is 32-bit or 64-bit, you can right-click on the Windows icon on your desktop and go to Properties in order to find out. You can also access your system information by going to the Control Panel and selecting System. Your operating system architecture will be listed next to the title System type. If your OS is 64-bit, you should install the x64 version. Otherwise, install the x86 version of the msi.

Run the installer once it is successfully downloaded to your machine. Follow the Setup wizard prompts until installation of the SDK is complete. Make sure that Kinect's extra power supply is also plugged into a power source. You can now plug your Kinect device into a USB port on your computer. On first connecting the Kinect to your PC, Windows will recognize the device and begin loading the Kinect drivers. You may see a message on your Windows taskbar indicating that this is occurring. When the drivers have finished loading, the LED light on your Kinect will turn a solid green.

You may want to verify that the drivers installed successfully. This is typically a troubleshooting procedure in case you encounter any problems as you run the SDK samples or begin working through the code in this book. In order to verify that the drivers are installed correctly, open the Control Panel and select Device Manager. As Figure 1-5 shows, the Microsoft Kinect node in Device Manager should list three items if the drivers were correctly installed: the Microsoft Kinect Audio Array Control, Microsoft Kinect Camera, and Microsoft Kinect Security Control.
Figure 1-5. Kinect drivers

You will also want to verify that Kinect's microphone array was correctly recognized during installation. To do so, go to the Control Panel and then the Device Manager again. As Figure 1-6 shows, the listing for Kinect USB Audio should be present under the Sound, video and game controllers node.
Figure 1-6. Microphone array

If you find that any of the four devices mentioned above do not appear in Device Manager, you should uninstall the SDK and attempt to install it again. The most common problems seem to occur around having the Kinect device accidentally plugged into the PC during install or forgetting to plug in the Kinect adapter when connecting the Kinect to the PC for the first time. You may also find that other USB devices, such as a webcam, stop working once Kinect starts working. This occurs because Kinect may conflict with other USB devices connected to the same host controller. You can work around this by trying other USB ports. A PC or laptop typically has one host controller for the ports on the front or side of the computer and another host controller at the back. Also use different USB host controllers if you attempt to daisy chain multiple Kinect devices for the same application.

To work with speech recognition, install the Microsoft Speech Platform Server Runtime (x86), the Speech Platform SDK (x86), and the Kinect for Windows Language Pack. These installs should occur in the order listed. While the first two components are not specific to Kinect and can be used for general speech recognition development, the Kinect language pack contains the acoustic models specific to the Kinect. For Kinect development, the Kinect language pack cannot be replaced with another language pack, and the Kinect language pack will not be useful to you when developing speech recognition applications without Kinect.
Elements of a Kinect Visual Studio Project

If you are already familiar with the development experience using Visual Studio, then the basic steps for implementing a Kinect application should seem fairly straightforward. You simply have to:

1. Create a new project.
2. Reference the Microsoft.Kinect.dll.
3. Declare the appropriate Kinect namespace.
The main hurdle in programming for Kinect is getting used to the idea that windows, the main UI container of .NET programs, are not used for input as they are in typical applications. Instead, windows are used to display information only, while all input is derived from the Kinect sensor. A second hurdle is getting used to the notion that input from Kinect is continuous and constantly changing. A Kinect program does not wait for a discrete event such as a button press. Instead, it repeatedly processes information from the RGB, depth, and skeleton streams and rearranges the UI container appropriately.

The Kinect SDK supports three kinds of managed applications (applications that use C# or Visual Basic rather than C++): Console applications, WPF applications, and Windows Forms applications. Console applications are actually the easiest to get started with, as they do not create the expectation that we must interact with UI elements like buttons, dropdowns, or checkboxes.

To create a new Kinect application, open Visual Studio and select File ➤ New ➤ Project. A dialog window will appear offering you a choice of project templates. Under Visual C# ➤ Windows, select Console Application and either accept the default name for the project or create your own project name.

You will now want to add a reference to the Kinect assembly you installed in the steps above. In the Visual Studio Solution pane, right-click on the References folder, as shown in Figure 1-7. Select Add Reference. A new dialog window will appear listing various assemblies you can add to your project. Find the Microsoft.Kinect assembly and add it to your project.
Figure 1-7. Add a reference to the Kinect library

At the top of the Program.cs file for your application, add the namespace declaration for the Microsoft.Kinect namespace. This namespace encapsulates all of the Kinect functionality for both NUI and audio.
using Microsoft.Kinect;

Three additional steps are standard for Kinect applications that take advantage of the data from the cameras. The KinectSensor object must be instantiated, initialized, and then started. To build an extremely trivial application to display the bit stream flowing from the depth camera, we will instantiate a new KinectSensor object according to the example in Listing 1-1. In this case, we assume there is only one camera in the KinectSensors array. We initialize the sensor by enabling the data streams we wish to use. Enabling data streams we do not intend to use would cause unnecessary performance overhead. Next we add an event handler for the DepthFrameReady event, and then create a loop that waits until the spacebar is pressed before ending the application. As a final step, just before the application exits, we follow good practice and disable the depth stream reader.

Listing 1-1. Instantiate and Initialize the Runtime

static void Main(string[] args)
{
    // instantiate the sensor instance
    KinectSensor sensor = KinectSensor.KinectSensors[0];

    // initialize the cameras
    sensor.DepthStream.Enable();
    sensor.DepthFrameReady += sensor_DepthFrameReady;

    // make it look like The Matrix
    Console.ForegroundColor = ConsoleColor.Green;

    // start the data streaming
    sensor.Start();

    while (Console.ReadKey().Key != ConsoleKey.Spacebar)
    { }

    // disable the depth stream reader before exiting
    sensor.DepthStream.Disable();
}
The heart of any Kinect app is not the code above, which is primarily boilerplate, but rather what we choose to do with the data passed by the DepthFrameReady event. All of the cool Kinect applications you have seen on the Internet use the data from the DepthFrameReady, ColorFrameReady, and SkeletonFrameReady events to accomplish the remarkable effects that have brought you to this book. In Listing 1-2, we will finish off the application by simply writing the image bits from the depth camera to the console window to see something similar to what the early Kinect hackers saw and got excited about back in November of 2010.
Listing 1-2. First Peek at the Kinect Depth Stream Data

static void sensor_DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
    using (var depthFrame = e.OpenDepthImageFrame())
    {
        if (depthFrame == null)
            return;

        short[] bits = new short[depthFrame.PixelDataLength];
        depthFrame.CopyPixelDataTo(bits);

        foreach (var bit in bits)
            Console.Write(bit);
    }
}

As you wave your arms in front of the Kinect sensor, you will experience the first oddity of developing with Kinect. You will repeatedly have to push your chair away from the Kinect sensor as you test your applications. If you do this in an open space with co-workers, you will receive strange looks. I highly recommend programming for Kinect in a private, secluded space to avoid these strange looks. In my experience, people generally view a software developer wildly swinging his arms with concern and, more often, suspicion.
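The other streams mentioned above follow the same frame-ready pattern as Listing 1-2. As a rough sketch of the color-stream equivalent (our own illustration rather than one of the book's listings, and assuming sensor.ColorStream.Enable() has been called and this handler attached to the ColorFrameReady event):

static void sensor_ColorFrameReady(object sender, ColorImageFrameReadyEventArgs e)
{
    using (var colorFrame = e.OpenColorImageFrame())
    {
        // frames can be null when the application falls behind the sensor
        if (colorFrame == null)
            return;

        byte[] pixels = new byte[colorFrame.PixelDataLength];
        colorFrame.CopyPixelDataTo(pixels);

        // pixels now holds the frame's 32-bit BGRX color data, ready to be
        // written to a bitmap or, in a console app, simply summarized
        Console.WriteLine("color frame {0}: {1} bytes", colorFrame.FrameNumber, pixels.Length);
    }
}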
The Kinect SDK Sample Applications

The Kinect for Windows SDK installs several reference applications and samples. These applications provide a starting point for working with the SDK. They are written in a combination of C# and C++ and serve the sometimes contrary objectives of showing in a clear way how to use the Kinect SDK and presenting best practices for programming with the SDK. While this book does not delve into the details of programming in C++, it is still useful to examine these examples, if only to remind ourselves that the Kinect SDK is based on a C++ library that was originally written for game developers working in C++. The C# classes are often merely wrappers for these underlying libraries and, at times, expose leaky abstractions that make sense only when we consider their C++ underpinnings.

A word should be said about the difference between sample applications and reference applications. The code for this book is sample code. It demonstrates in the easiest way possible how to perform given tasks related to the data received from the Kinect sensor. It should rarely be used as is in your own applications. The code in reference applications, on the other hand, has the additional burden of showing the best way to organize code to make it robust and to embody good architectural principles. One of the greatest myths in the software industry is perhaps the implicit belief that good architecture is also readable and, consequently, easily maintainable. This is often not the case. Good architecture can often be an end in itself. Most of the code provided with the Kinect SDK embodies good architecture and should be studied with this in mind. The code provided with this book, on the other hand, is typically written to illustrate concepts in the most straightforward way possible. You should study both code samples as well as reference code to become an effective Kinect developer. In the following sections, we will introduce you to some of these samples and highlight parts of the code worth familiarizing yourself with.
Kinect Explorer

Kinect Explorer is a WPF project written in C#. It demonstrates the basic programming model for retrieving the color, depth, and skeleton streams and displaying them in a window (more or less the original criteria set for the AdaFruit Kinect hacking contest). Figure 1-8 shows the UI for the reference application. The video and depth streams are each used to populate and update a different image control in real time while the skeleton stream is used to create a skeletal overlay on these images. Besides the depth stream, video stream, and skeleton, the application also provides a running update of the frames per second processed by the depth stream. While the goal is 30 fps, this will tend to vary depending on the specifications of your computer.
Figure 1-8. Kinect Explorer reference application

The sample exposes some key concepts for working with the different data streams. The DepthFrameReady event handler, for instance, takes each image provided sequentially by the depth stream and parses it in order to distinguish player pixels from background pixels. Each image is broken down into a byte array. Each byte is then inspected to determine if it is associated with a player image or not. If it does belong to a player, the pixel is replaced with a flat color. If not, it is grayscaled. The bytes are then recast to a bitmap object and set as the source for an image control in the UI. Then the process begins again for the next image in the depth stream. One would expect that individually inspecting every byte in this stream would take a remarkably long time but, as the fps indicator shows, in fact it does not. This is actually the prevailing technique for manipulating both the color and depth streams. We will go into greater detail concerning the depth and color streams in Chapter 2 and Chapter 3 of this book.

Kinect Explorer is particularly interesting because it demonstrates how to break up the different capabilities of the Kinect sensor into reusable components. Instead of a central controlling process, each of the distinct viewer controls for video, color, skeleton, and audio independently controls its own access to its respective data stream. This distributed structure allows the various Kinect capabilities to be added independently and ad hoc to any application.

Beyond this interesting modular design, there are three specific pieces of functionality in Kinect Explorer that should be included in any Kinect application. The first is the way Kinect Explorer implements sensor discovery. As Listing 1-3 shows, the technique implemented in the reference application waits for Kinect sensors to be connected to a USB port on the computer. It defers any initialization of the streams until Kinect has been connected and is able to support multiple Kinects. This code effectively acts as a gatekeeper that prevents any problems that might occur when there is a disruption in the data streams caused by tripping over a wire or even simply forgetting to plug in the Kinect sensor.

Listing 1-3. Kinect Sensor Discovery

private void KinectStart()
{
    //listen to any status change for Kinects.
    KinectSensor.KinectSensors.StatusChanged += Kinects_StatusChanged;

    //show status for each sensor that is found now.
    foreach (KinectSensor kinect in KinectSensor.KinectSensors)
    {
        ShowStatus(kinect, kinect.Status);
    }
}

A second noteworthy feature of Kinect Explorer is the way it manages the Kinect sensor's motor controlling the sensor's angle of elevation. In early efforts to program with Kinect prior to the arrival of the SDK, it was uncommon to use software to raise and lower the angle of the Kinect head. In order to place Kinect cameras correctly while programming, developers would manually lift and lower the angle of the Kinect head. This typically produced a loud and slightly frightening click but was considered a necessary evil as developers experimented with Kinect. Unfortunately, Kinect's internal motors were not built to handle this kind of stress. The rather sophisticated code provided with Kinect Explorer demonstrates how to perform this necessary task in a more genteel manner.

The final piece of functionality deserving of careful study is the way skeletons from the skeleton stream are selected. The SDK only tracks full skeletons for two players at a time. By default, it uses a complicated set of rules to determine which players should be tracked in this way. However, the SDK also allows this default set of rules to be overwritten by the Kinect developer. Kinect Explorer demonstrates how to overwrite the basic rules and also provides several alternative algorithms for determining which players should receive full skeleton tracking, for instance by closest players and by most physically active players.
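Before moving on, it is worth making the byte-inspection technique described earlier concrete. The following is a minimal player-masking DepthFrameReady handler in the spirit of Kinect Explorer; it is our own simplification rather than the reference code, and it assumes the skeleton stream has been enabled, since the sensor only populates the player-index bits when skeleton tracking is active.

void sensor_DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
    using (var depthFrame = e.OpenDepthImageFrame())
    {
        if (depthFrame == null)
            return;

        short[] depthPixels = new short[depthFrame.PixelDataLength];
        depthFrame.CopyPixelDataTo(depthPixels);

        // four bytes (BGR32) for every depth pixel
        byte[] colorPixels = new byte[depthFrame.PixelDataLength * 4];

        for (int i = 0; i < depthPixels.Length; i++)
        {
            // the low three bits carry the player index; the rest is depth in millimeters
            int player = depthPixels[i] & DepthImageFrame.PlayerIndexBitmask;
            int depth = depthPixels[i] >> DepthImageFrame.PlayerIndexBitmaskWidth;

            if (player > 0)
            {
                // player pixels become a flat color (green here)
                colorPixels[i * 4 + 1] = 0xFF;
            }
            else
            {
                // background pixels become a gray shade derived from depth
                int clamped = Math.Min(Math.Max(depth, 0), 4000);
                byte intensity = (byte)(255 - (255 * clamped / 4000));
                colorPixels[i * 4] = intensity;
                colorPixels[i * 4 + 1] = intensity;
                colorPixels[i * 4 + 2] = intensity;
            }
        }

        // colorPixels is now ready to be copied into a WriteableBitmap
        // that backs an image control in the UI
    }
}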
ShapeGame TheShapeGamereferenceapp,alsoaWPFapplicationwritteninC#,isanambitiousprojectthatties togetherskeletontracking,speechrecognition,andbasicphysicssimulation.Italsosupportsuptotwo playersatthesametime.TheShapeGameintroducestheconceptofagameloop.Thoughnotdealtwith explicitlyinthisbook,gameloopsareacentralconceptingamedevelopmentthatyouwillwantto becomefamiliarwithinordertopresentshapesconstantlyfallingfromthetopofthescreen.InShape Game,thegameloopisaC#whilelooprunningintheGameThreadmethod,asshowninListing1-4.The GameThreadmethodtweakstherateofthegamelooptoachievetheoptimalframerate.Onevery
18
CHAPTER1GETTINGSTARTED
iteration of the while loop, the HandleGameTimer method is called to move shapes down the screen, add new shapes, and detect collisions between the skeleton hand joints and the falling shapes.

Listing 1-4. A Basic Game Loop

private void GameThread()
{
    runningGameThread = true;
    predNextFrame = DateTime.Now;
    actualFrameTime = 1000.0 / targetFramerate;

    while (runningGameThread)
    {
        . . .
        Dispatcher.Invoke(DispatcherPriority.Send, new Action<int>(HandleGameTimer), 0);
    }
}

The result is the game interface shown in Figure 1-9. While the ShapeGame sample uses primitive shapes for game components, such as lines and ellipses for the skeleton, it is also fairly easy to replace these shapes with images in order to create a more engaging experience.
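The elided body of the loop is where the frame-rate regulation happens. A minimal sketch of the idea (this is not the actual ShapeGame code; the timing math here is an illustrative assumption):

while (runningGameThread)
{
    //Wait until the predicted time for the next frame arrives.
    DateTime now = DateTime.Now;
    if (now < predNextFrame)
    {
        Thread.Sleep((int)(predNextFrame - now).TotalMilliseconds);
    }
    predNextFrame = DateTime.Now.AddMilliseconds(actualFrameTime);

    //Marshal the per-frame work (moving shapes, collision checks) onto the UI thread.
    Dispatcher.Invoke(DispatcherPriority.Send, new Action<int>(HandleGameTimer), 0);
}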
Figure 1-9. ShapeGame

The ShapeGame also integrates speech recognition into the gameplay. The logic for the speech recognition is contained in the project's Recognizer class. It recognizes phrases of up to five words, with approximately 15 possible choices for each word, potentially supporting a grammar of up to 700,000 phrases. The combination of gesture and speech recognition provides a way to experiment with mixed-modal gameplay with Kinect, something not widely used in Kinect games for the Xbox but around which there is considerable excitement. This book delves into the speech recognition capabilities of Kinect in Chapter 7.
Note: The skeleton tracking in the ShapeGame sample provided with the Kinect for Windows SDK highlights a common problem with straightforward rendering of joint coordinates. When a particular body joint falls outside of the camera's view, the joint behavior becomes erratic. This is most noticeable with the legs. A best practice is to create default positions and movements for in-game avatars. The default positions should only be overridden when the skeletal data for particular joints is valid.
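One way to apply this best practice is to gate rendering on each joint's tracking state. The following sketch assumes a Skeleton object named skeleton, and the RenderJoint and GetDefaultPosition helpers are hypothetical stand-ins for your own rendering code:

foreach (Joint joint in skeleton.Joints)
{
    if (joint.TrackingState == JointTrackingState.Tracked)
    {
        //The joint data is valid; render the live position.
        RenderJoint(joint.JointType, joint.Position);
    }
    else
    {
        //Inferred or untracked joints fall back to the avatar's default pose.
        RenderJoint(joint.JointType, GetDefaultPosition(joint.JointType));
    }
}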
RecordAudio

The RecordAudio sample is the C# version of some of the features demonstrated in AudioCaptureRaw, MFAudioFilter, and MicArrayEchoCancellation. It is a C# console application that records and saves the raw audio from Kinect as a WAV file. It also applies the source localization functionality shown in MicArrayEchoCancellation to indicate the source of the audio, in radians, with respect to the Kinect sensor. It introduces an important concept for working with WAV data: the WAVEFORMATEX struct. This structure is native to C++ and has been reimplemented as a C# struct in RecordAudio, as shown in Listing 1-5. It contains all the information, and only the information, required to define a WAV audio file. There are also multiple C# implementations of it all over the web, since it seems to be reinvented every time someone needs to work with WAV files in managed code.

Listing 1-5. The WAVEFORMATEX Struct

struct WAVEFORMATEX
{
    public ushort wFormatTag;
    public ushort nChannels;
    public uint nSamplesPerSec;
    public uint nAvgBytesPerSec;
    public ushort nBlockAlign;
    public ushort wBitsPerSample;
    public ushort cbSize;
}
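To make the fields concrete: the Kinect microphone array delivers 16 kHz, 16-bit, mono PCM audio, so a WAVEFORMATEX describing that stream would be populated roughly as follows (a sketch; the arithmetic for the derived fields is the point here):

WAVEFORMATEX format = new WAVEFORMATEX();
format.wFormatTag      = 1;      //WAVE_FORMAT_PCM
format.nChannels       = 1;      //mono
format.nSamplesPerSec  = 16000;  //16 kHz
format.wBitsPerSample  = 16;
format.nBlockAlign     = (ushort)(format.nChannels * (format.wBitsPerSample / 8)); //2 bytes
format.nAvgBytesPerSec = format.nSamplesPerSec * format.nBlockAlign;               //32,000
format.cbSize          = 0;      //no extra format data for plain PCM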
Speech Sample

The Speech sample application demonstrates how to use Kinect with the speech recognition engine provided in the Microsoft.Speech assembly. Speech is a console application written in C#. Whereas the MFAudioFilter sample used a WMA file as its sink, the Speech application uses the speech recognition engine as a sink in its audio processing pipeline.

The sample is fairly straightforward, demonstrating the concepts of Grammar objects and Choices objects, as shown in Listing 1-6, which have been a part of speech recognition programming since Windows XP. These objects are constructed to create custom lexicons of words and phrases that the application is configured to recognize. In the case of the Speech sample, this includes only three words: red, green, and blue.
Listing 1-6. Grammars and Choices

var colors = new Choices();
colors.Add("red");
colors.Add("green");
colors.Add("blue");

var gb = new GrammarBuilder();
gb.Culture = ri.Culture;
gb.Append(colors);

var g = new Grammar(gb);

The sample also introduces some widely used boilerplate code that uses C# LINQ syntax to instantiate the speech recognition engine, as illustrated in Listing 1-7. Instantiating the speech recognition engine requires using pattern matching to identify a particular string. The speech recognition engine effectively loops through all the recognizers installed on the computer until it finds one whose Id property matches the magic string. In this case, we use a LINQ expression to perform the loop. If the correct recognizer is found, it is then used to instantiate the speech recognition engine. If it is not found, the speech recognition engine cannot be used.

Listing 1-7. Finding the Kinect Recognizer

private static RecognizerInfo GetKinectRecognizer()
{
    Func<RecognizerInfo, bool> matchingFunc = r =>
    {
        string value;
        r.AdditionalInfo.TryGetValue("Kinect", out value);
        return "True".Equals(value, StringComparison.InvariantCultureIgnoreCase)
            && "en-US".Equals(r.Culture.Name, StringComparison.InvariantCultureIgnoreCase);
    };

    return SpeechRecognitionEngine.InstalledRecognizers()
                                  .Where(matchingFunc)
                                  .FirstOrDefault();
}

Although simple, the Speech sample is a good starting point for exploring the Microsoft.Speech API. A productive way to use the sample is to begin adding additional word choices to the limited three-word target set. Then try to create the Tell Me style functionality found on the Xbox by ignoring any phrase that does not begin with the word "Xbox." Then try to create a grammar that includes complex grammatical structures, with verbs, subjects, and objects, as the ShapeGame SDK sample does.

This is, after all, the chief utility of the sample applications provided with the Kinect for Windows SDK. They provide code blocks that you can copy directly into your own code. They also offer a way to begin learning how to get things done with Kinect without necessarily understanding all of the concepts behind the Kinect API right away. I encourage you to play with this code as soon as possible. When you hit a wall, return to this book to learn more about why the Kinect API works the way it does and how to get further in implementing the specific scenarios you are interested in.
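As a starting point for the "Xbox" exercise suggested above, a grammar can require the keyword as a prefix by appending it to the GrammarBuilder before the choices. This sketch builds on Listing 1-6 and assumes the same ri (RecognizerInfo) variable as Listing 1-7:

var colors = new Choices();
colors.Add("red");
colors.Add("green");
colors.Add("blue");

var gb = new GrammarBuilder();
gb.Culture = ri.Culture;
gb.Append("Xbox");   //every recognized phrase must now start with "Xbox"
gb.Append(colors);

var g = new Grammar(gb);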
Summary

In this chapter, you learned about the surprisingly long history of gesture tracking as a distinct mode of natural user interface. You also learned about the central role Alex Kipman played in bringing Kinect technology to the Xbox and how Microsoft Research, Microsoft's research and development group, was used to bring Kinect to market. You found out about the momentum online communities like OpenKinect added toward popularizing Kinect development beyond Xbox gaming, opening up a new trend in Kinect development on the PC. You learned how to install and start programming for the Microsoft Kinect for Windows SDK. Finally, you learned about the various pieces installed with the Kinect for Windows SDK and how to use them as a springboard for your own programming aspirations.
CHAPTER 2
Application Fundamentals

Every Kinect application has certain basic elements. The application must detect or discover attached Kinect sensors. It must then initialize the sensor. Once initialized, the sensor produces data, which the application then processes. Finally, when the application finishes using the sensor, it must properly uninitialize the sensor.

In the first section of this chapter, we cover sensor discovery, initialization, and uninitialization. These fundamental topics are critical to all forms of Kinect applications using Microsoft's Kinect for Windows SDK. The first section presents several code examples showing code that is necessary for virtually any Kinect application you write and is required by every coding project in this book. The coding demonstrations in the subsequent chapters do not explicitly show the sensor discovery and initialization code. Instead, they simply mention this task as a project requirement.

Once initialized, Kinect generates data based on input gathered by the different cameras. This data is available to applications through data streams. The concept is similar to the IO streams found in the System.IO namespace. The second section of this chapter details stream basics and demonstrates how to pull data from Kinect using the ColorImageStream. This stream produces pixel data, which allows an application to create a color image, like a basic photo or video camera. We show how to manipulate the stream data in fun and interesting ways, and we explain how to save stream data to enhance your Kinect application's user experience.

The final section of this chapter compares and contrasts the two application architecture models (event and polling) available with Microsoft's Kinect for Windows SDK. We detail how to use each architectural structure and why. This includes code examples, which can serve as templates for your next project.

This chapter is a necessary read, as it lays the foundation for the remaining chapters of the book and covers the basics of the entire SDK. After reading this chapter, finding your way through the rest of the SDK is easy. The adventure in Kinect application development begins now. Have fun!
The KinectSensor

Kinect application development starts with the KinectSensor. This object directly represents the Kinect hardware. It is from the KinectSensor object that you access the data streams for the video (color) and depth images as well as skeleton tracking. In this chapter, we explore only the ColorImageStream. The DepthImageStream and SkeletonStream warrant entire chapters to themselves.

The most common method of data retrieval from the sensor's streams is from a set of events on the KinectSensor object. Each stream has an associated event, which fires when the stream has a frame of data available for processing. Each stream packages data in what is termed a frame. For example, the ColorFrameReady event fires when the ColorImageStream has new data. We examine each of these events in more depth when covering the particular sensor stream. More general eventing details are provided later in this chapter, when discussing the two different data retrieval architecture models.

Each of the data streams (color, depth, and skeleton) returns data points in a different coordinate system, as will become clearer as we explore each data stream in detail. It is a common task to translate data points generated in one stream to data points in another. Later in this chapter we demonstrate how and why point translations are needed. The KinectSensor object has a set of methods to perform these data stream point translations, including MapDepthToColorImagePoint, MapDepthToSkeletonPoint, and MapSkeletonPointToDepth. Before we are able to work with any Kinect data, we must find an attached Kinect. The process of discovering connected sensors is easy, but requires some explanation.
Discovering a Connected Sensor

The KinectSensor object does not have a public constructor and cannot be created by an application. Instead, the SDK creates KinectSensor objects when it detects an attached Kinect. The application must discover or be notified when a Kinect is attached to the computer. The KinectSensor class has a static property named KinectSensors. This property is of type KinectSensorCollection. The KinectSensorCollection object inherits from ReadOnlyCollection and is simple, as it consists only of an indexer and an event named StatusChanged.

The indexer provides access to KinectSensor objects. The collection count is always equal to the number of attached Kinects. Yes, this means it is possible to build applications that use more than one Kinect! Your application can use as many Kinects as desired. You are only limited by the muscle of the computer running your application, because the SDK does not restrict the number of devices. Because of the power and bandwidth needs of Kinect, each device requires a separate USB controller. Additionally, when using multiple Kinects on a single computer, the CPU and memory demands necessitate some serious hardware. Given this, we consider multi-Kinect applications an advanced topic that is beyond the scope of this book. Throughout this book, we only ever consider using a single Kinect device. All code examples in this book are written to use a single device and ignore all other attached Kinects.

Finding an attached Kinect is as easy as iterating through the collection; however, just the presence of a KinectSensor in the collection does not mean it is directly usable. The KinectSensor object has a property named Status, which indicates the device's state. The property's type is KinectStatus, which is an enumeration. Table 2-1 lists the different status values and explains their meaning.
Table 2-1. KinectStatus Values and Significance

KinectStatus            What it means
Undefined               The status of the attached device cannot be determined.
Connected               The device is attached and is capable of producing data from its streams.
DeviceNotGenuine        The attached device is not an authentic Kinect sensor.
Disconnected            The USB connection with the device has been broken.
Error                   Communication with the device produces errors.
Initializing            The device is attached to the computer and is going through the process of connecting.
InsufficientBandwidth   Kinect cannot initialize, because the USB connector does not have the necessary bandwidth required to operate the device.
NotPowered              Kinect is not fully powered. The power provided by a USB connection is not sufficient to power the Kinect hardware. An additional power adapter is required.
NotReady                Kinect is attached, but has yet to enter the Connected state.
A KinectSensor cannot be initialized until it reaches the Connected status. During an application's lifespan, a sensor can change state, which means the application must monitor the status changes of attached devices and react appropriately to the status change and to the needs of the user experience. For example, if the USB cable is removed from the computer, the sensor's status changes to Disconnected. In general, an application should pause and notify the user to plug the Kinect back into the computer. An application must not assume the Kinect will be connected and ready for use at startup, or that the sensor will maintain connectivity throughout the life of the application.

Create a new WPF project using Visual Studio so that we can properly demonstrate the discovery process. Add a reference to Microsoft.Kinect.dll, and update the MainWindow.xaml.cs code, as shown in Listing 2-1. The code listing shows the basic code to detect and monitor a Kinect sensor.

Listing 2-1. Detecting a Kinect Sensor

public partial class MainWindow : Window
{
    #region Member Variables
    private KinectSensor _Kinect;
    #endregion Member Variables

    #region Constructor
    public MainWindow()
    {
        InitializeComponent();
        this.Loaded += (s, e) => { DiscoverKinectSensor(); };
        this.Unloaded += (s, e) => { this.Kinect = null; };
    }
    #endregion Constructor

    #region Methods
    private void DiscoverKinectSensor()
    {
        KinectSensor.KinectSensors.StatusChanged += KinectSensors_StatusChanged;
        this.Kinect = KinectSensor.KinectSensors
                                  .FirstOrDefault(x => x.Status == KinectStatus.Connected);
    }

    private void KinectSensors_StatusChanged(object sender, StatusChangedEventArgs e)
    {
        switch(e.Status)
        {
            case KinectStatus.Connected:
                if(this.Kinect == null)
                {
                    this.Kinect = e.Sensor;
                }
                break;

            case KinectStatus.Disconnected:
                if(this.Kinect == e.Sensor)
                {
                    this.Kinect = null;
                    this.Kinect = KinectSensor.KinectSensors
                                              .FirstOrDefault(x => x.Status == KinectStatus.Connected);

                    if(this.Kinect == null)
                    {
                        //Notify the user that the sensor is disconnected
                    }
                }
                break;

            //Handle all other statuses according to needs
        }
    }
    #endregion Methods

    #region Properties
    public KinectSensor Kinect
    {
        get { return this._Kinect; }
        set
        {
            if(this._Kinect != value)
            {
                if(this._Kinect != null)
                {
                    //Uninitialize
                    this._Kinect = null;
                }

                if(value != null && value.Status == KinectStatus.Connected)
                {
                    this._Kinect = value;
                    //Initialize
                }
            }
        }
    }
    #endregion Properties
}

An examination of the code in Listing 2-1 begins with the member variable _Kinect and the property named Kinect. An application should always maintain a local reference to the KinectSensor objects it uses. There are several reasons for this, which become more obvious as you proceed through the book; however, at the very least, a reference is needed to uninitialize the KinectSensor when the application is finished using the sensor. The property serves as a wrapper for the member variable. The primary purpose of using a property is to ensure all sensor initialization and uninitialization is in a common place and executed in a structured way. Notice in the property's setter how the member variable is not set unless the incoming value has a status of KinectStatus.Connected. When going through the sensor discovery process, an application should only be concerned with connected devices. Besides, any attempt to initialize a sensor that does not have a connected status results in an InvalidOperationException.

In the constructor are two anonymous methods, one to respond to the Loaded event and the other for the Unloaded event. When unloaded, the application sets the Kinect property to null, which uninitializes the sensor used by the application. In response to the window's Loaded event, the application attempts to discover a connected sensor by calling the DiscoverKinectSensor method. The primary motivation for using the Loaded and Unloaded events of the Window is that they serve as solid points to begin and end Kinect processing. If the application fails to discover a valid Kinect, it can visually notify the user.

The DiscoverKinectSensor method has only two lines of code, but they are important. The first line subscribes to the StatusChanged event of the KinectSensors object. The second line of code uses a lambda expression to find the first KinectSensor object in the collection with a status of KinectStatus.Connected. The result is assigned to the Kinect property. The property setter code initializes any non-null sensor object.

The StatusChanged event handler (KinectSensors_StatusChanged) is straightforward and self-explanatory. However, it is worth mentioning the code for when the status is equal to KinectStatus.Connected. The function of the if statement is to limit the application to one sensor. The application ignores any subsequent Kinects connected once one sensor is discovered and initialized by the application.
The code in Listing 2-1 illustrates the minimal code required to discover and maintain a reference to a Kinect device. The needs of each individual application are likely to necessitate additional code or processing, but the core remains. As your applications become more advanced, controls or other classes will contain code similar to this. It is important to ensure thread safety and to release resources properly for garbage collection to prevent memory leaks.
Starting the Sensor

Once discovered, Kinect must be initialized before it can begin producing data for your application. The initialization process consists of three steps. First, your application must enable the streams it needs. Each stream has an Enable method, which initializes the stream. Each stream is uniquely different and as such has settings that require configuring before it is enabled. In some cases, these settings are properties, and in others they are parameters on the Enable method. Later in this chapter, we cover initializing the ColorImageStream. Chapter 3 details the initialization process for the DepthImageStream, and Chapter 4 gives the particulars on the SkeletonStream.

The next step is determining how your application retrieves the data from the streams. The most common means is through a set of events on the KinectSensor object. There is an event for each stream (ColorFrameReady for the ColorImageStream, DepthFrameReady for the DepthImageStream, and SkeletonFrameReady for the SkeletonStream), and the AllFramesReady event, which synchronizes the frame data of all the streams so that all frames are available at once. Individual frame-ready events fire only when the particular stream is enabled, whereas the AllFramesReady event fires when one or more streams are enabled.

Finally, the application must start the KinectSensor object by calling the Start method. Almost immediately after calling the Start method, the frame-ready events begin to fire. Ensure that your application is prepared to handle incoming Kinect data before starting the KinectSensor.
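Condensed into code, the three steps look like this for the color stream (the same pattern appears in full in Listing 2-2 later in this chapter):

sensor.ColorStream.Enable();                      //1. enable the stream(s) you need
sensor.ColorFrameReady += Kinect_ColorFrameReady; //2. choose a retrieval model (events here)
sensor.Start();                                   //3. start the sensor; events begin firing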
Stopping the Sensor

Once started, the KinectSensor is stopped by calling the Stop method. All data production stops; however, you can expect the frame-ready events to fire one last time, so remember to add checks for null frame objects in your frame-ready event handlers. The process to stop the sensor is straightforward enough, but the motivations for doing so add potential complexity, which can affect the architecture of your application.

It is too simplistic to think that the only reason for having the Stop method is that every on switch must also have an off position. The KinectSensor object and its streams use system resources, and all well-behaved applications should properly release these resources when they are no longer needed. In this case, the application would not only stop the sensor, but also would unsubscribe from the frame-ready event handlers. Be careful not to call the Dispose method on the KinectSensor or the streams. This prevents your application from accessing the sensor again. The application must be restarted, or the sensor must be unplugged and plugged in again, before the disposed sensor is again available for use.
The ColorImageStream

Kinect has two cameras: an IR camera and a normal video camera. The video camera produces a basic color video feed like any off-the-shelf video camera or webcam. This stream is the least complex of the three, both in the data it produces and in its configuration settings. Therefore, it serves perfectly as an introduction to using a Kinect data stream.

Working with a Kinect data stream is a three-step process. The stream must first be enabled. Once enabled, the application extracts frame data from the stream, and finally the application processes the
frame data. The last two steps continue over and over for as long as frame data is available. Continuing with the code from Listing 2-1, we add code to initialize the ColorImageStream, as shown in Listing 2-2.

Listing 2-2. Enabling the ColorImageStream

public KinectSensor Kinect
{
    get { return this._Kinect; }
    set
    {
        if(this._Kinect != value)
        {
            if(this._Kinect != null)
            {
                UninitializeKinectSensor(this._Kinect);
                this._Kinect = null;
            }

            if(value != null && value.Status == KinectStatus.Connected)
            {
                this._Kinect = value;
                InitializeKinectSensor(this._Kinect);
            }
        }
    }
}

private void InitializeKinectSensor(KinectSensor sensor)
{
    if(sensor != null)
    {
        sensor.ColorStream.Enable();
        sensor.ColorFrameReady += Kinect_ColorFrameReady;
        sensor.Start();
    }
}

private void UninitializeKinectSensor(KinectSensor sensor)
{
    if(sensor != null)
    {
        sensor.Stop();
        sensor.ColorFrameReady -= Kinect_ColorFrameReady;
    }
}

The first part of Listing 2-2 shows the Kinect property with updates in bold. The two new lines call two new methods, which initialize and uninitialize the KinectSensor and the ColorImageStream. The InitializeKinectSensor method enables the ColorImageStream, subscribes to the ColorFrameReady
event, and starts the sensor. Once started, the sensor continually calls the frame-ready event handler when a new frame of data is available, which in this instance is 30 times per second.

At this point, our project is incomplete and fails to compile. We need to add the code for the Kinect_ColorFrameReady event handler. Before doing this, we need to add some code to the XAML. Each time the frame-ready event handler is called, we want to create a bitmap image from the frame's data, and we need some place to display the image. Listing 2-3 shows the XAML needed to serve our needs.

Listing 2-3. Displaying a Color Frame Image
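A minimal sketch of the markup, consistent with the code-behind in Listing 2-4 (the window class name and layout values are illustrative assumptions; the Image element must be named ColorImageElement):

<Window x:Class="ColorStreamApp.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="Color Stream" Height="600" Width="800">
    <Grid>
        <Image x:Name="ColorImageElement" />
    </Grid>
</Window>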
Listing 2-4 contains the frame-ready event handler. The processing of frame data begins by getting or opening the frame. The OpenColorImageFrame method on the ColorImageFrameReadyEventArgs object returns the current ColorImageFrame object. The frame object is disposable, which is why the code wraps the call to OpenColorImageFrame in a using statement. Extracting pixel data from the frame first requires us to create a byte array to hold the data. The PixelDataLength property on the frame object gives the exact size of the data and, subsequently, the size of the array. Calling the CopyPixelDataTo method populates the array with pixel data. The last line of code creates a bitmap image from the pixel data and displays the image on the UI.

Listing 2-4. Processing ColorImageFrame Data

private void Kinect_ColorFrameReady(object sender, ColorImageFrameReadyEventArgs e)
{
    using(ColorImageFrame frame = e.OpenColorImageFrame())
    {
        if(frame != null)
        {
            byte[] pixelData = new byte[frame.PixelDataLength];
            frame.CopyPixelDataTo(pixelData);
            ColorImageElement.Source = BitmapImage.Create(frame.Width, frame.Height, 96, 96,
                                                          PixelFormats.Bgr32, null, pixelData,
                                                          frame.Width * frame.BytesPerPixel);
        }
    }
}

With the code in place, compile and run. The result should be a live video feed from Kinect. You would see this same output from a webcam or any other video camera. This alone is nothing special. The difference is that it is coming from Kinect, and as we know, Kinect can see things that a webcam or generic video camera cannot.
Better Image Performance

The code in Listing 2-4 creates a new bitmap image for each color image frame. An application using this code and the default image format creates 30 bitmap images per second. Thirty times per second, memory for a new bitmap object is allocated, initialized, and populated with pixel data. The memory of the previous frame's bitmap is also marked for garbage collection thirty times per second, which means the garbage collector is likely working harder than in most applications. In short, there is a lot of work being done for each frame. In simple applications, there is no discernible performance loss; however, for more complex and performance-demanding applications, this is unacceptable. Fortunately, there is a better way.

The solution is to use the WriteableBitmap object. This object is part of the System.Windows.Media.Imaging namespace and was built to handle frequent updates of image pixel data. When creating the WriteableBitmap, the application must define the image properties such as the width, height, and pixel format. This allows the WriteableBitmap object to allocate the memory once and just update pixel data as needed.

The code changes necessary to use the WriteableBitmap are only minor. Listing 2-5 begins by declaring three new member variables. The first is the actual WriteableBitmap object, and the other two are used when updating pixel data. The values of the image rectangle and image stride do not change from frame to frame, so we can calculate them once when creating the WriteableBitmap.

Listing 2-5 also shows changes, in bold, to the InitializeKinect method. These new lines of code create the WriteableBitmap object and prepare it to receive pixel data. The image rectangle and stride calculations are included. With the WriteableBitmap created and initialized, it is set as the image source for the UI Image element (ColorImageElement). At this point, the WriteableBitmap contains no pixel data, so the UI image is blank.

Listing 2-5. Create a Frame Image More Efficiently

private WriteableBitmap _ColorImageBitmap;
private Int32Rect _ColorImageBitmapRect;
private int _ColorImageStride;

private void InitializeKinect(KinectSensor sensor)
{
    if(sensor != null)
    {
        ColorImageStream colorStream = sensor.ColorStream;
        colorStream.Enable();

        this._ColorImageBitmap     = new WriteableBitmap(colorStream.FrameWidth,
                                                         colorStream.FrameHeight, 96, 96,
                                                         PixelFormats.Bgr32, null);
        this._ColorImageBitmapRect = new Int32Rect(0, 0, colorStream.FrameWidth,
                                                   colorStream.FrameHeight);
        this._ColorImageStride     = colorStream.FrameWidth * colorStream.FrameBytesPerPixel;
        ColorImageElement.Source   = this._ColorImageBitmap;

        sensor.ColorFrameReady += Kinect_ColorFrameReady;
        sensor.Start();
    }
}
To complete this upgrade, we need to replace one line of code in the ColorFrameReady event handler. Listing 2-6 shows the event handler with the new line of code in bold. First, delete the code that created a new bitmap from the frame data. The code updates the image pixels by calling the WritePixels method on the WriteableBitmap object. The method takes in the desired image rectangle, an array of bytes representing the pixel data, the image stride, and an offset. The offset is always zero, because we are replacing every pixel in the image.

Listing 2-6. Updating the Image Pixels

private void Kinect_ColorFrameReady(object sender, ColorImageFrameReadyEventArgs e)
{
    using(ColorImageFrame frame = e.OpenColorImageFrame())
    {
        if(frame != null)
        {
            byte[] pixelData = new byte[frame.PixelDataLength];
            frame.CopyPixelDataTo(pixelData);
            this._ColorImageBitmap.WritePixels(this._ColorImageBitmapRect, pixelData,
                                               this._ColorImageStride, 0);
        }
    }
}
Any Kinect application that displays image frame data from either the ColorImageStream or the DepthImageStream should use the WriteableBitmap to display frame images. In the best case, the color stream produces 30 frames per second, which places a large demand on memory resources. The WriteableBitmap reduces the memory consumption and, subsequently, the number of memory allocation and deallocation operations needed to support the demands of constant image updates. After all, the display of frame data is likely not the primary function of your application. Therefore, you want image generation to be as performant as possible.
Simple Image Manipulation

Each ColorImageFrame returns raw pixel data in the form of an array of bytes. An application must explicitly create an image from this data. This means that, if so inspired, we can alter the pixel data before creating the image for display. Let's do a quick experiment and have some fun. Add the code in bold in Listing 2-7 to the Kinect_ColorFrameReady event handler.

Listing 2-7. Seeing Shades of Red

private void Kinect_ColorFrameReady(object sender, ColorImageFrameReadyEventArgs e)
{
    using(ColorImageFrame frame = e.OpenColorImageFrame())
    {
        if(frame != null)
        {
            byte[] pixelData = new byte[frame.PixelDataLength];
            frame.CopyPixelDataTo(pixelData);

            for(int i = 0; i < pixelData.Length; i += frame.BytesPerPixel)
            {
                pixelData[i]     = 0x00;  //Blue
                pixelData[i + 1] = 0x00;  //Green
            }

            this._ColorImageBitmap.WritePixels(this._ColorImageBitmapRect, pixelData,
                                               this._ColorImageStride, 0);
        }
    }
}
This experiment turns off the blue and green channels of each pixel. The for loop in Listing 2-7 iterates through the bytes such that i is always the position of the first byte of each pixel. Since the pixel data is in Bgr32 format, the first byte is the blue channel, followed by green and red. The two lines of code inside the loop set the blue and green byte values of each pixel to zero. The output is an image with only shades of red. This is a very basic example of image processing.

Our loop manipulates the color of each pixel. That manipulation is similar in function to a pixel shader: an algorithm, often very complex, that manipulates the colors of each pixel. Chapter 8 takes a deeper look at using pixel shaders with Kinect. In the meantime, try the simple pseudo-pixel shaders in the following list. All you have to do is replace the code inside the for loop. I encourage you to experiment on your own and to research pixel effects and shaders. Be mindful that this type of processing can be very resource intensive and the performance of your application could suffer. Pixel shading is generally a low-level process performed by the GPU on the computer's graphics card, and not often by high-level languages such as C#.

• Inverted colors – Before digital cameras, there was film. This is how a picture looked on the film before it was processed onto paper.

    pixelData[i]     = (byte) ~pixelData[i];
    pixelData[i + 1] = (byte) ~pixelData[i + 1];
    pixelData[i + 2] = (byte) ~pixelData[i + 2];

• Apocalyptic zombie – Invert the red value and swap the blue and green values. (Note that the swap needs a temporary variable; assigning the green byte to the blue byte and then the blue byte back to green would simply duplicate the green value.)

    byte blue        = pixelData[i];
    pixelData[i]     = pixelData[i + 1];
    pixelData[i + 1] = blue;
    pixelData[i + 2] = (byte) ~pixelData[i + 2];

• Grayscale

    byte gray        = Math.Max(pixelData[i], pixelData[i + 1]);
    gray             = Math.Max(gray, pixelData[i + 2]);
    pixelData[i]     = gray;
    pixelData[i + 1] = gray;
    pixelData[i + 2] = gray;

• Grainy black-and-white movie

    byte gray        = Math.Min(pixelData[i], pixelData[i + 1]);
    gray             = Math.Min(gray, pixelData[i + 2]);
    pixelData[i]     = gray;
    pixelData[i + 1] = gray;
    pixelData[i + 2] = gray;

• Washed-out colors

    double gray          = (pixelData[i] * 0.11) + (pixelData[i + 1] * 0.59) +
                           (pixelData[i + 2] * 0.3);
    double desaturation  = 0.75;
    pixelData[i]     = (byte) (pixelData[i] + desaturation * (gray - pixelData[i]));
    pixelData[i + 1] = (byte) (pixelData[i + 1] + desaturation * (gray - pixelData[i + 1]));
    pixelData[i + 2] = (byte) (pixelData[i + 2] + desaturation * (gray - pixelData[i + 2]));

• High saturation – Also try reversing the logic so that when the if condition is true, the color is turned on (0xFF), and when false, it is turned off (0x00).

    if(pixelData[i] < 0x33 || pixelData[i] > 0xE5)
    {
        pixelData[i] = 0x00;
    }
    else
    {
        pixelData[i] = 0xFF;
    }

    if(pixelData[i + 1] < 0x33 || pixelData[i + 1] > 0xE5)
    {
        pixelData[i + 1] = 0x00;
    }
    else
    {
        pixelData[i + 1] = 0xFF;
    }

    if(pixelData[i + 2] < 0x33 || pixelData[i + 2] > 0xE5)
    {
        pixelData[i + 2] = 0x00;
    }
    else
    {
        pixelData[i + 2] = 0xFF;
    }
Taking a Snapshot

A fun thing to do in any Kinect application is to capture pictures from the video camera. Because of the gestural nature of Kinect applications, people are often in awkward and funny positions. Taking snapshots works especially well if your application is some form of augmented reality. Several Xbox Kinect games take snapshots of players at different points in the game. This provides an extra source of fun,
because the pictures are viewable after the game has ended. Players are given an additional form of entertainment by laughing at themselves. What is more, these images are sharable. Your application can upload the images to social sites such as Facebook, Twitter, or Flickr. Saving frames during the game or providing a "Take a picture" button adds to the experience of all Kinect applications. The best part is that it is really simple to code. The first step is to add a button to our XAML, as shown in Listing 2-8.

Listing 2-8. Add a Button to Take a Snapshot
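A minimal sketch of the markup (layout values are illustrative assumptions; the Click handler name must match the code-behind in Listing 2-9):

<Grid>
    <Image x:Name="ColorImageElement" />
    <Button Content="Take Picture"
            HorizontalAlignment="Left" VerticalAlignment="Top" Margin="10"
            Click="TakePictureButton_Click" />
</Grid>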
In the code-behind for the MainWindow, add a using statement to reference System.IO. Among other things, this namespace contains objects that read and write files to the hard drive. Next, create the TakePictureButton_Click event handler, as shown in Listing 2-9.

Listing 2-9. Taking a Picture

private void TakePictureButton_Click(object sender, RoutedEventArgs e)
{
    string fileName = "snapshot.jpg";

    if(File.Exists(fileName))
    {
        File.Delete(fileName);
    }

    using(FileStream savedSnapshot = new FileStream(fileName, FileMode.CreateNew))
    {
        BitmapSource image = (BitmapSource) ColorImageElement.Source;

        JpegBitmapEncoder jpgEncoder = new JpegBitmapEncoder();
        jpgEncoder.QualityLevel      = 70;
        jpgEncoder.Frames.Add(BitmapFrame.Create(image));
        jpgEncoder.Save(savedSnapshot);

        savedSnapshot.Flush();
        savedSnapshot.Close();
        savedSnapshot.Dispose();
    }
}
The first few lines of code in Listing 2-9 remove any existing snapshot. This is just to make the save process easy. There are more robust ways of handling or maintaining saved files, but we leave such document management details for you to address. This is just a simple approach to saving a snapshot. Once the old snapshot is deleted, we open a FileStream to create a new file. The JpegBitmapEncoder object translates the image from the UI to a standard JPEG file. After saving the JPEG to the open FileStream object, we flush, close, and dispose of the file stream. These last three actions are unnecessary, because we have the FileStream wrapped in the using statement, but we generally like to write this code explicitly to ensure that the file handle and other resources are released.

Test it out! Run the application, strike a pose, and click the "Take Picture" button. Your snapshot should be sitting in the same directory in which the application is running. The location where the image is saved depends on the value of the fileName variable, so you ultimately control where the file is saved. This demonstration is meant to be simple, but a more complex application would likely require more thought as to where to save the images. Have fun with the snapshot feature. Go back through the image manipulation section, and take snapshots of you and your friends with the different image treatments.
Reflecting on the Objects

To this point, we have accomplished discovering and initializing a Kinect sensor, and we have created color images from the Kinect's video camera. Let us take a moment to visualize the essence of these classes and how they relate to each other. This also gives us a second to reflect on what we've learned so far, as we quickly brushed over a couple of important objects. Figure 2-1 is called a class diagram. The class diagram is a tool software developers use for object modeling. The purpose of this diagram is to illustrate the interconnected nature of classes, enumerations, and structures. It also shows all class members (fields, methods, and properties).
Figure 2-1. The ColorImageStream object model

We know from working with the ColorImageStream (Listing 2-2) that it is a property of a KinectSensor object. The color stream, like all streams on the KinectSensor, must be enabled for it to produce frames of data. The ColorImageStream has an overloaded Enable method. The default method takes no parameters, while the other provides a means for defining the image format of each frame through a single parameter. The parameter's type is ColorImageFormat, which is an enumeration. Table 2-2 lists the values of the ColorImageFormat enumeration. The default Enable method of the ColorImageStream sets the image format to RGB with a resolution of 640×480 at 30 frames per second. Once enabled, the image format is available by way of the Format property.
Table 2-2. ColorImageStream Formats

ColorImageFormat                What it means
Undefined                       The image resolution is indeterminate.
RgbResolution640x480Fps30       The image resolution is 640×480; pixel data is RGB32 format at 30 frames per second.
RgbResolution1280x960Fps12      The image resolution is 1280×960; pixel data is RGB32 format at 12 frames per second.
YuvResolution640x480Fps15       The image resolution is 640×480; pixel data is YUV format at 15 frames per second.
RawYuvResolution640x480Fps15    The image resolution is 640×480; pixel data is raw YUV format at 15 frames per second.
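For example, requesting the higher-resolution RGB format instead of the default is a one-line change:

//Trade frame rate (12 fps instead of 30) for resolution (1280x960 instead of 640x480).
sensor.ColorStream.Enable(ColorImageFormat.RgbResolution1280x960Fps12);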
The ColorImageStream has five properties that provide measurements of the camera's field of view. The properties all begin with Nominal, because the values scale to the resolution set when the stream is enabled. Some applications need to perform calculations based on the camera's optical properties, such as field of view and focal length. Coding against these properties makes an application more robust to resolution changes and to future hardware updates that provide greater image resolutions. In Chapter 3, we demonstrate uses for these properties, but using the DepthImageStream.

The ImageStream class is the base for the ColorImageStream (and the DepthImageStream, too). As such, the ColorImageStream inherits the four properties that describe the pixel data generated by each frame produced by the stream. We used these properties in Listing 2-5 to create a WriteableBitmap object. The values of these properties depend on the ColorImageFormat specified when the stream is enabled.

To complete the coverage of the ImageStream class, the class defines a property and a method not discussed up to now, named IsEnabled and Disable, respectively. The IsEnabled property is read-only. It returns true after the stream is enabled and false after a call to the Disable method. The Disable method deactivates the stream. Frame production ceases and, as a result, the ColorFrameReady event on the KinectSensor object stops firing.

When enabled, the ColorImageStream produces ColorImageFrame objects. The ColorImageFrame object is simple. It has a property named Format, which is the ColorImageFormat value of the parent stream. It has a single non-inherited method, CopyPixelDataTo, which copies image pixel data to a specified byte array. The read-only PixelDataLength property defines the size of the array. The value of the PixelDataLength property is calculated by multiplying the values of the Width, Height, and BytesPerPixel properties. These properties are all inherited from the abstract class ImageFrame.
BYTES PER PIXEL

The stream's Format determines the pixel format and therefore the meaning of the bytes. If the stream is enabled using the format ColorImageFormat.RgbResolution640x480Fps30, the pixel format is Bgr32. This means that there are 32 bits (4 bytes) per pixel. The first byte is the blue channel value, the second is green, and the third is red. The fourth byte is unused. The Bgra32 pixel format is also valid to use when working with an RGB resolution (RgbResolution640x480Fps30 or RgbResolution1280x960Fps12). In the Bgra32 pixel format, the fourth byte determines the alpha, or opacity, of the pixel.

If the image size is 640×480, then the byte array will have 1,228,800 bytes (height × width × BytesPerPixel = 640 × 480 × 4).

As a side note, when working with images you will see the term stride. The stride is the number of bytes for a row of pixels; multiplying the image width by the bytes per pixel calculates the stride.

In addition to having properties that describe the pixel data, the ColorImageFrame object has a couple of properties that describe the frame itself. Each frame produced by the stream is numbered. The frame number increases sequentially as time progresses. An application should not expect the frame number always to be exactly one greater than the previous frame, but rather only to be greater than the previous frame, because it is almost impossible for an application not to skip frames during normal execution. The other property that describes the frame is Timestamp. The Timestamp is the number of milliseconds since the KinectSensor was started (that is, since the Start method was called). The Timestamp value resets to zero each time the KinectSensor starts.
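A short sketch of reading these two properties inside the frame-ready handler, here used to detect skipped frames (the _LastFrameNumber field is an illustrative assumption):

private int _LastFrameNumber = -1;

private void Kinect_ColorFrameReady(object sender, ColorImageFrameReadyEventArgs e)
{
    using(ColorImageFrame frame = e.OpenColorImageFrame())
    {
        if(frame != null)
        {
            if(this._LastFrameNumber >= 0 && frame.FrameNumber > this._LastFrameNumber + 1)
            {
                //One or more frames were skipped since the last event.
            }

            this._LastFrameNumber = frame.FrameNumber;
            long msSinceStart     = frame.Timestamp;  //milliseconds since Start was called
        }
    }
}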
Data Retrieval: Events and Polling

The projects throughout this chapter rely on events from the KinectSensor object to deliver frame data for processing. Events are prolific in WPF and well understood by developers as the primary means of notifying application code when data or state changes occur. For most Kinect-based applications, the event model is sufficient; however, it is not the only means of retrieving frame data from a stream. An application can manually poll a Kinect data stream for a new frame of data.

Polling is a simple process by which an application manually requests a frame of data from a stream. Each Kinect data stream has a method named OpenNextFrame. When calling the OpenNextFrame method, the application specifies a timeout value, which is the amount of time the application is willing to wait for a new frame. The timeout is measured in milliseconds. The method attempts to retrieve a new frame of data from the sensor before the timeout expires. If the timeout expires, the method returns a null frame.

When using the event model, the application subscribes to the stream's frame-ready event. Each time the event fires, the event handler calls a method on the event arguments to get the next frame. For example, when working with the color stream, the event handler calls OpenColorImageFrame on the ColorImageFrameReadyEventArgs to get the ColorImageFrame object. The event handler should always test the frame for a null value, because there are circumstances when the event fires but there is no available frame. Beyond that, the event model requires no extra sanity checks or exception handling.

By contrast, the OpenNextFrame method has three conditions under which it raises an InvalidOperationException. An application can expect an exception when the KinectSensor is not running, when the stream is not enabled, or when using events to retrieve frame data. An application
must choose either events or polling for a given stream, but cannot use both. However, an application can use events for one stream and polling for another. For example, an application could subscribe to the ColorFrameReady event, but poll the SkeletonStream for frame data. The one caveat is that the AllFramesReady event covers all streams, meaning that if an application subscribes to the AllFramesReady event, any attempt to poll any of the streams results in an InvalidOperationException.

Before demonstrating how to code the polling model, it is good to understand the circumstances under which an application would require polling. The most basic reason for using polling is performance. Polling removes the innate overhead associated with events. When an application takes on frame retrieval, it is rewarded with performance gains. The drawback is that polling is more complicated to implement than using events.

Besides performance, the application type can necessitate using polling. When using the Kinect for Windows SDK, an application is not required to use WPF. The SDK also works with XNA, which has a different application loop than WPF and is not event driven. If the needs of the application dictate using XNA, for example when building games, you have to use polling. Using the Kinect for Windows SDK, it is also possible to create console applications, which have no user interface at all. Imagine creating a robot that uses Kinect for eyes. The application that drives the functions of the robot does not need a user interface. It continually pulls and processes the frame data to provide input to the robot. In this use case, polling is the best option.

As cool as it would be for this book to provide you with a project that builds a robot to demonstrate polling, it is impossible. Instead, we modestly re-create the previous project, which displays color image frames. To begin, create a new project, add a reference to Microsoft.Kinect.dll, and update MainWindow.xaml to include an Image element named ColorImageElement. Listing 2-10 shows the base code for MainWindow.

Listing 2-10. Preparing the Foundation for a Polling Application

#region Member Variables
private KinectSensor _Kinect;
private WriteableBitmap _ColorImageBitmap;
private Int32Rect _ColorImageBitmapRect;
private int _ColorImageStride;
private byte[] _ColorImagePixelData;
#endregion Member Variables

#region Constructor
public MainWindow()
{
    InitializeComponent();
    CompositionTarget.Rendering += CompositionTarget_Rendering;
}
#endregion Constructor

#region Methods
private void CompositionTarget_Rendering(object sender, EventArgs e)
{
    DiscoverKinectSensor();
    PollColorImageStream();
}
#endregion Methods
The member variable declarations in Listing 2-10 are the same as in previous projects in this chapter. Polling changes how an application retrieves data, but many of the other aspects of working with the data are the same as when using the event model. Any polling-based project needs to discover and initialize a KinectSensor object just as an event-based application does. This project also uses a WriteableBitmap to create frame images. The primary difference is that in the constructor we subscribe to the Rendering event on the CompositionTarget object. But, wait!

What is the CompositionTarget object? What causes the Rendering event to fire? Aren't we technically still using an event model? These are all valid questions to ask. The CompositionTarget object is a representation of the drawable surface of an application. The Rendering event fires once per rendering cycle. To poll Kinect for new frame data, we need a loop. There are two ways to create this loop. One method is to use a thread, which we do in our next project, but another, simpler technique is to use a built-in loop. The Rendering event of the CompositionTarget provides a loop with the least amount of work. Using the CompositionTarget is similar to the gaming loop of a gaming engine such as XNA. There is one drawback to using the CompositionTarget: any long-running process in the Rendering event handler can cause performance issues in the UI, because the event handler executes on the main UI thread. With this in mind, do not attempt to do too much work in this event handler.

The code within the Rendering event handler needs to do four tasks. It must discover a connected KinectSensor, initialize the sensor, respond to any status change in the sensor, and ultimately poll for and process a new frame of data. We break these four tasks into two methods. The code in Listing 2-11 performs the first three tasks. This code is almost identical to code we previously wrote in Listing 2-1.
Listing 2-11. Discover and Initialize a KinectSensor for Polling

private void DiscoverKinectSensor()
{
    if(this._Kinect != null && this._Kinect.Status != KinectStatus.Connected)
    {
        //If the sensor is no longer connected, we need to discover a new one.
        this._Kinect = null;
    }

    if(this._Kinect == null)
    {
        //Find the first connected sensor
        this._Kinect = KinectSensor.KinectSensors
                                   .FirstOrDefault(x => x.Status == KinectStatus.Connected);

        if(this._Kinect != null)
        {
            //Initialize the found sensor
            this._Kinect.ColorStream.Enable();
            this._Kinect.Start();

            ColorImageStream colorStream  = this._Kinect.ColorStream;
            this._ColorImageBitmap        = new WriteableBitmap(colorStream.FrameWidth,
                                                                colorStream.FrameHeight,
                                                                96, 96, PixelFormats.Bgr32,
                                                                null);
            this._ColorImageBitmapRect    = new Int32Rect(0, 0, colorStream.FrameWidth,
                                                          colorStream.FrameHeight);
            this._ColorImageStride        = colorStream.FrameWidth *
                                            colorStream.FrameBytesPerPixel;
            this.ColorImageElement.Source = this._ColorImageBitmap;
            this._ColorImagePixelData     = new byte[colorStream.FramePixelDataLength];
        }
    }
}
Listing 2-12 provides the code for the PollColorImageStream method. Refer back to Listing 2-6 and notice that the code is virtually the same. In Listing 2-12, we test to ensure we have a sensor to work with, and we call the OpenNextFrame method to get a color image frame. The code to get the frame and update the WriteableBitmap is wrapped in a try-catch statement, because of the possibility of the OpenNextFrame method call throwing an exception. The 100-millisecond timeout passed to the OpenNextFrame call is fairly arbitrary. A well-chosen timeout ensures the application continues to operate smoothly even if a frame or two is skipped. You will also want your application to maintain as close to 30 frames per second as possible.
Listing 2-12. Polling for a Frame of Stream Data

private void PollColorImageStream()
{
    if(this._Kinect == null)
    {
        //Display a message to plug-in a Kinect.
    }
    else
    {
        try
        {
            using(ColorImageFrame frame = this._Kinect.ColorStream.OpenNextFrame(100))
            {
                if(frame != null)
                {
                    frame.CopyPixelDataTo(this._ColorImagePixelData);
                    this._ColorImageBitmap.WritePixels(this._ColorImageBitmapRect,
                                                       this._ColorImagePixelData,
                                                       this._ColorImageStride, 0);
                }
            }
        }
        catch(Exception ex)
        {
            //Report an error message
        }
    }
}

Now run the project. Overall, the polling model should perform better than the event model, although, due to the simplicity of these examples, any change in performance may be marginal at best. This example of using polling, however, continues to suffer from another problem. By using the CompositionTarget object, the application remains tied to WPF's UI thread. Any long-running data processing, or a poorly chosen timeout for the OpenNextFrame method, can cause slow, choppy, or unresponsive behavior in the application, because the code executes on the UI thread. The solution is to fork a new thread and implement all polling and data processing on the secondary thread.

To contrast polling with the CompositionTarget against polling on a background thread, create a new WPF project in Visual Studio. In .NET, working with threads is made easy with the BackgroundWorker class. Using a BackgroundWorker object, developers do not have to concern themselves with the tedious work of managing the thread; starting and stopping the thread becomes trivial. Developers are only responsible for writing the code that executes on the thread. To use the BackgroundWorker class, add a using statement for System.ComponentModel to MainWindow.xaml.cs. When finished, add the code from Listing 2-13.
Listing 2-13. Polling on a Separate Thread

#region Member Variables
private KinectSensor _Kinect;
private WriteableBitmap _ColorImageBitmap;
private Int32Rect _ColorImageBitmapRect;
private int _ColorImageStride;
private byte[] _ColorImagePixelData;
private BackgroundWorker _Worker;
#endregion Member Variables

public MainWindow()
{
    InitializeComponent();

    this._Worker = new BackgroundWorker();
    this._Worker.WorkerSupportsCancellation = true;  //required for CancelAsync to work
    this._Worker.DoWork += Worker_DoWork;
    this._Worker.RunWorkerAsync();

    this.Unloaded += (s, e) => { this._Worker.CancelAsync(); };
}

private void Worker_DoWork(object sender, DoWorkEventArgs e)
{
    BackgroundWorker worker = sender as BackgroundWorker;

    if(worker != null)
    {
        while(!worker.CancellationPending)
        {
            DiscoverKinectSensor();
            PollColorImageStream();
        }
    }
}

First, notice that the set of member variables in this project is the same as in the previous project (Listing 2-10), with the addition of the BackgroundWorker variable _Worker. In the constructor, we create an instance of the BackgroundWorker class, subscribe to the DoWork event, and start the new thread. The last line of code in the constructor creates an anonymous event handler to stop the BackgroundWorker when the window closes. Also included in Listing 2-13 is the code for the DoWork event handler.

The DoWork event fires when the thread starts. The event handler loops until it is notified that the thread has been cancelled. During each pass through the loop, it calls the DiscoverKinectSensor and PollColorImageStream methods. Copy the code for these methods from the previous project (Listing 2-11 and Listing 2-12), but do not attempt to run the application. If you do, you will quickly notice the application fails with an InvalidOperationException. The error message reads, "The calling thread cannot access this object because a different thread owns it."
The polling of data occurs on a background thread, but we want to update UI elements, which exist on another thread. This is the joy of working across threads. To update the UI elements from the background thread, we use the Dispatcher object. Each UI element in WPF has a Dispatcher, which is used to execute units of work on the same thread as the UI element. Listing 2-14 contains the updated versions of the DiscoverKinectSensor and PollColorImageStream methods. The changes necessary to update the UI thread are shown in bold.

Listing 2-14. Updating the UI Thread

private void DiscoverKinectSensor()
{
    if(this._Kinect != null && this._Kinect.Status != KinectStatus.Connected)
    {
        this._Kinect = null;
    }

    if(this._Kinect == null)
    {
        this._Kinect = KinectSensor.KinectSensors
                                   .FirstOrDefault(x => x.Status == KinectStatus.Connected);

        if(this._Kinect != null)
        {
            this._Kinect.ColorStream.Enable();
            this._Kinect.Start();

            ColorImageStream colorStream = this._Kinect.ColorStream;

            this.ColorImageElement.Dispatcher.BeginInvoke(new Action(() =>
            {
                this._ColorImageBitmap        = new WriteableBitmap(colorStream.FrameWidth,
                                                                    colorStream.FrameHeight,
                                                                    96, 96, PixelFormats.Bgr32,
                                                                    null);
                this._ColorImageBitmapRect    = new Int32Rect(0, 0, colorStream.FrameWidth,
                                                              colorStream.FrameHeight);
                this._ColorImageStride        = colorStream.FrameWidth *
                                                colorStream.FrameBytesPerPixel;
                this._ColorImagePixelData     = new byte[colorStream.FramePixelDataLength];
                this.ColorImageElement.Source = this._ColorImageBitmap;
            }));
        }
    }
}

private void PollColorImageStream()
{
    if(this._Kinect == null)
    {
        //Notify that there are no available sensors.
    }
    else
    {
        try
        {
            using(ColorImageFrame frame = this._Kinect.ColorStream.OpenNextFrame(100))
            {
                if(frame != null)
                {
                    frame.CopyPixelDataTo(this._ColorImagePixelData);

                    this.ColorImageElement.Dispatcher.BeginInvoke(new Action(() =>
                    {
                        this._ColorImageBitmap.WritePixels(this._ColorImageBitmapRect,
                                                           this._ColorImagePixelData,
                                                           this._ColorImageStride, 0);
                    }));
                }
            }
        }
        catch(Exception ex)
        {
            //Report an error message
        }
    }
}

With these code additions, you now have two functional examples of polling. Neither of these polling examples is a complete, robust application. In fact, both require some work to properly manage and clean up resources. For example, neither uninitializes the KinectSensor. Such tasks remain your responsibility when building real-world Kinect-based applications.

These examples nevertheless provide the foundation from which you can build any application driven by a polling architecture. Polling has distinct advantages over the event model, but at the cost of additional work for the developer and potential complexity in the application code. In most applications, the event model is adequate and should be used instead of polling; the primary exception is when your application is not written for WPF. For example, any console, XNA, or other application using a custom application loop should employ the polling architecture model. It is recommended that all WPF-based applications initially use the frame-ready events on the KinectSensor and only transition to polling if performance concerns warrant it.
Summary

In software development, patterns persist everywhere. Every application of a certain type or category is fundamentally the same and contains similar structures and architectures. In this chapter, we presented core code for building Kinect-driven applications. Each Kinect application you develop will contain the same lines of code to discover, initialize, and uninitialize the Kinect sensor.

We explored the process of working with frame data generated by the ColorImageStream. Additionally, we dove deeper and studied the ColorImageStream's base ImageStream class as well as the
ImageFrame class. The ImageStream and ImageFrame are also the base classes for the DepthImageStream and DepthImageFrame classes, which we introduce in the next chapter.

The mechanisms to retrieve raw data from Kinect's data streams are the same regardless of which stream you use. Architecturally speaking, all Kinect applications use either events or polling to retrieve frame data from a Kinect data stream. The easiest to use is the event model, which is also the de facto architecture. When using the event model, the Kinect SDK does the work of polling stream data for you. However, if the needs of the application dictate, the Kinect SDK allows developers to poll data from the Kinect manually.

These are the fundamentals. Are you ready to see the Kinect SDK in greater depth?
CHAPTER 3
Depth Image Processing

The production of three-dimensional data is the primary function of Kinect. It is up to you to create exciting experiences with the data. A precondition to building a Kinect application is having an understanding of the output of the hardware: beyond simply knowing what the 1s and 0s are, you need a comprehension of what they mean. Image-processing techniques exist today that detect the shapes and contours of objects within an image. The Kinect SDK uses image processing to track user movements in the skeleton tracking engine. Depth image processing can also detect non-human objects, such as a chair or coffee cup. There are numerous commercial labs and universities actively studying techniques to perform this level of object detection from depth images.

There are so many different uses and fields of study around depth input that it would be impossible to cover them all, or to cover any one topic in real depth, in this book, much less in a single chapter. The goal of this chapter is to detail the depth data down to the meaning of each bit, and to introduce you to the impact that adding just one additional dimension can have on an application. In this chapter, we discuss some basic concepts of depth image processing and simple techniques for using this data in your applications.
Seeing Through the Eyes of the Kinect

Kinect is different from all other input devices, because it provides a third dimension. It does this using an infrared emitter and camera. Unlike other Kinect SDKs, such as OpenNI or libfreenect, the Microsoft SDK does not provide raw access to the IR stream. Instead, the Kinect SDK processes the IR data returned by the infrared camera to produce a depth image. Depth image data comes from a DepthImageFrame, which is produced by the DepthImageStream.

Working with the DepthImageStream is similar to working with the ColorImageStream. The DepthImageStream and ColorImageStream both share the same parent class, ImageStream. We create images from a frame of depth data just as we did with the color stream data. Begin to see the depth stream images by following these steps, which by now should look familiar. They are the same as in the previous chapter, where we worked with the color stream.

1. Create a new WPF Application project.
2. Add a reference to Microsoft.Kinect.dll.
3. Add an Image element to MainWindow.xaml and name it "DepthImage".
4. Add the necessary code to detect and initialize a KinectSensor object. Refer to Chapter 2 as needed.
5. Update the code that initializes the KinectSensor object so that it matches Listing 3-1.
Listing 3-1. Initializing the Depth Stream

this._KinectDevice.DepthStream.Enable();
this._KinectDevice.DepthFrameReady += KinectDevice_DepthFrameReady;

6. Add the DepthFrameReady event handler code, as shown in Listing 3-2. For the sake of keeping the code listing brief, we are not using the WriteableBitmap to create depth images. We leave this as a refactoring exercise for you to undertake. Refer to Listing 2-5 of Chapter 2 as needed.

Listing 3-2. DepthFrameReady Event Handler

using(DepthImageFrame frame = e.OpenDepthImageFrame())
{
    if(frame != null)
    {
        short[] pixelData = new short[frame.PixelDataLength];
        frame.CopyPixelDataTo(pixelData);

        int stride = frame.Width * frame.BytesPerPixel;
        DepthImage.Source = BitmapSource.Create(frame.Width, frame.Height, 96, 96,
                                                PixelFormats.Gray16, null, pixelData, stride);
    }
}

7. Run the application!

When Kinect has a new depth image frame available for processing, the KinectSensor fires the DepthFrameReady event. Our event handler simply takes the image data and creates a bitmap, which is then displayed in the UI window. The screenshot in Figure 3-1 is an example of the depth stream image. Objects near Kinect are a dark shade of gray or black. The farther an object is from Kinect, the lighter the gray.
Figure 3-1. Raw depth image frame
Measuring Depth

The IR or depth camera has a field of view just like any other camera. The field of view of Kinect is limited, as illustrated in Figure 3-2. The original purpose of Kinect was to play video games within the confines of a game room or living room space. Kinect's normal depth vision ranges from around two and a half feet (800mm) to just over 13 feet (4000mm). However, a usage range of 3 feet to 12 feet is recommended, as the reliability of the depth values degrades at the edges of the field of view.
Figure 3-2. Kinect field of view

Like any camera, the field of view of the depth camera is pyramid shaped. Objects farther away from the camera have a greater lateral range than objects nearer to Kinect. This means that height and width pixel dimensions, such as 640×480, do not correspond with a physical location in the camera's field of view. The depth value of each pixel, however, does map to a physical distance in the field of view. Each pixel represented in a depth frame is 16 bits, making the BytesPerPixel property of each frame a value of two. The depth value of each pixel is only 13 of the 16 bits, as shown in Figure 3-3.
Figure 3-3. Layout of the depth bits

Getting the distance of each pixel is easy, but not obvious. It requires some bit manipulation, which sadly, as developers, we do not get to do much of these days. It is quite possible that some developers have never used or even heard of bitwise operators. This is unfortunate, because bit manipulation is fun and can be an art. At this point, you are likely thinking something to the effect of, "this guy's a nerd," and
you'd be right. Included in the appendix are more instructions and examples of using bit manipulation and math.

As Figure 3-3 shows, the depth value is stored in bits 3 to 15. To get the depth value into a workable form, we have to shift the bits to the right to remove the player index bits. We discuss the significance of the player index bits later. Listing 3-3 shows sample code to get the depth value of a pixel. Refer to Appendix A for a thorough explanation of the bitwise operations used in Listing 3-3. In the listing, the pixelData variable is assumed to be an array of short values originating from the depth frame. The pixelIndex variable is calculated based on the position of the desired pixel. The Kinect for Windows SDK defines a constant on the DepthImageFrame class named PlayerIndexBitmaskWidth, which specifies the number of bits to shift right to get the depth value. Applications should use this constant instead of hard-coded literals, as the number of bits reserved for players may increase in future releases of the Kinect hardware and the SDK.

Listing 3-3. Bit Manipulation to Get Depth

int pixelIndex = pixelX + (pixelY * frame.Width);
int depth = pixelData[pixelIndex] >> DepthImageFrame.PlayerIndexBitmaskWidth;

An easy way to see the depth data is to display the actual numbers. Let us update our code to output the depth value of a pixel at a particular location. This demonstration uses the position of the mouse pointer when the mouse is clicked on the depth image. The first step is to create a place to display the depth value. Update the MainWindow.xaml to look like Listing 3-4.

Listing 3-4. New TextBlock to Display Depth Values
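A minimal version of this markup might look like the following sketch. The element names (PixelDepth, DepthImage) match those referenced by the code in this chapter, while the layout and font size are our own choices:

<Grid x:Name="LayoutRoot">
    <StackPanel>
        <!-- Displays the depth value of the clicked pixel -->
        <TextBlock x:Name="PixelDepth" FontSize="48" HorizontalAlignment="Left"/>
        <!-- Hard-coded size keeps mouse coordinates aligned with depth frame pixels -->
        <Image x:Name="DepthImage" Width="640" Height="480"/>
    </StackPanel>
</Grid>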
Listing 3-5 shows the code for the mouse-up event handler. Before adding this code, there are a couple of changes to note. The code in Listing 3-5 assumes the project has been refactored to use a WriteableBitmap. The code changes specific to this demonstration start by creating a private member variable named _LastDepthFrame. In the KinectDevice_DepthFrameReady event handler, set the value of the _LastDepthFrame member variable to the current frame each time the DepthFrameReady event fires. Because we need to keep a reference to the last depth frame, the event handler code does not immediately dispose of the frame object. Next, subscribe to the MouseLeftButtonUp event on the DepthImage image object. When the user clicks the depth image, the DepthImage_MouseLeftButtonUp event handler executes, which locates the correct pixel by the mouse coordinates. The last step is to display the value in the TextBlock named PixelDepth created in Listing 3-4.
Listing 3-5. Response to a Mouse Click

private void KinectDevice_DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
    if(this._LastDepthFrame != null)
    {
        this._LastDepthFrame.Dispose();
        this._LastDepthFrame = null;
    }

    this._LastDepthFrame = e.OpenDepthImageFrame();

    if(this._LastDepthFrame != null)
    {
        this._LastDepthFrame.CopyPixelDataTo(this._DepthImagePixelData);
        this._RawDepthImage.WritePixels(this._RawDepthImageRect, this._DepthImagePixelData,
                                        this._RawDepthImageStride, 0);
    }
}

private void DepthImage_MouseLeftButtonUp(object sender, MouseButtonEventArgs e)
{
    Point p = e.GetPosition(DepthImage);

    if(this._DepthImagePixelData != null && this._DepthImagePixelData.Length > 0)
    {
        int pixelIndex = (int) (p.X + ((int) p.Y * this._LastDepthFrame.Width));
        int depth = this._DepthImagePixelData[pixelIndex] >>
                    DepthImageFrame.PlayerIndexBitmaskWidth;
        int depthInches = (int) (depth * 0.0393700787);
        int depthFt = depthInches / 12;
        depthInches = depthInches % 12;

        PixelDepth.Text = string.Format("{0}mm ~ {1}'{2}\"", depth, depthFt, depthInches);
    }
}
It is important to point out a few particulars with this code. Notice that the Width and Height properties of the Image element are hard-coded (Listing 3-4). If these values are not hard-coded, then the Image element naturally scales with the size of its parent container. If the Image element's dimensions were sized differently from the depth frame dimensions, this code would return incorrect data or, more likely, throw an exception when the image is clicked. The pixel array in the frame is a fixed size based on the DepthImageFormat value given to the Enable method of the DepthImageStream. Not setting the image size means that it will scale with the size of its parent container, which, in this case, is the application window. If you let the image scale automatically, you then have to perform extra calculations to translate the mouse position to the depth frame dimensions, as sketched below. This type of scaling exercise is actually quite common, as we will see later in this chapter and the chapters that follow, but here we keep it simple and hard-code the output image size.
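For reference, that extra translation would look something like the following sketch. The helper name and the truncation to integers are our own; it assumes the Image element named DepthImage scales uniformly with its container:

private Point TranslateMousePointToDepthFrame(Point mousePoint, DepthImageFrame frame)
{
    //Scale the mouse position from UI coordinates to depth frame coordinates.
    double x = mousePoint.X * frame.Width / DepthImage.ActualWidth;
    double y = mousePoint.Y * frame.Height / DepthImage.ActualHeight;

    return new Point((int) x, (int) y);
}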
We calculate the pixel location within the byte array using the position of the mouse within the image and the size of the image. With the pixel's starting byte located, convert the depth value using the logic from Listing 3-3. For completeness, we display the depth in feet and inches in addition to millimeters. All of the local variables exist only to make the code more readable on these pages and do not materially affect the execution of the code.

Figure 3-4 shows the output produced by the code. The depth frame image displays on the screen and provides a point of reference for the user to target. In this screenshot, the mouse is positioned over the palm of the user's hand. On mouse click, the position of the mouse cursor is used to find the depth value of the pixel at that position. With the pixel located, it is easy to extract the depth value.
Figure 3-4. Displaying the depth value for a pixel
Note A depth value of zero means that the Kinect was unable to determine the depth of the pixel. When processing depth data, treat zero depth values as a special case; in most instances, you will disregard them. Expect a zero depth value for any pixel where there is an object too close to the Kinect.
Enhanced Depth Images

Before going any further, we need to address the look of the depth image. It is naturally difficult to see. The shades of gray fall on the darker end of the spectrum. In fact, the images in Figures 3-1 and 3-4 had to be altered with an image-editing tool to be printable in the book! In the next set of exercises, we manipulate the image bits just as we did in the previous chapter. However, there will be a few differences, because, as we know, the data for each pixel is different. Following that, we examine how we can colorize the depth images to provide even greater depth resolution.
Better Shades of Gray

The easiest way to improve the appearance of the depth image is to invert the bits. The color of each pixel is based on the depth value, which starts from zero. In the digital color spectrum, black is 0 and 65535 (16-bit grayscale) is white. This means that most depths fall into the darker end of the spectrum. Additionally, do not forget that all undeterminable depths are set to zero. Inverting or complementing the bits shifts the bias towards the lighter end of the spectrum. A depth of zero is now white.

We keep the original depth image in the UI for comparison with the enhanced depth image. Update MainWindow.xaml to include a new StackPanel and Image element, as shown in Listing 3-6. Notice the adjustment to the window's size to ensure that both images are visible without having to resize the window.

Listing 3-6. Updated UI for New Depth Image
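A sketch of what this updated markup might look like follows; the element names match the code listings, while the side-by-side panel layout is an assumption:

<StackPanel Orientation="Horizontal">
    <!-- Natural depth image output -->
    <Image x:Name="DepthImage" Width="640" Height="480"/>
    <!-- Output of the enhanced depth image processing -->
    <Image x:Name="EnhancedDepthImage" Width="640" Height="480"/>
</StackPanel>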
Listing 3-7 shows the code to flip the depth bits to create a better depth image. Add this method to your project code, and call it from the KinectDevice_DepthFrameReady event handler. The simple function of this code is to create a new array and do a bitwise complement of the bits. Also, notice this method filters out some bits by distance. Because we know depth data becomes inaccurate at the edges of the depth range, we set the pixels outside of our threshold range to a fixed value. In this example, any pixel closer than 4 feet or farther than 10 feet is set to 0xFF, which renders as nearly black in the 16-bit grayscale.
Listing 3-7. A Light Shade of Gray Depth Image

private void CreateLighterShadesOfGray(DepthImageFrame depthFrame, short[] pixelData)
{
    int depth;
    int loThreshold = 1220;
    int hiThreshold = 3048;
    short[] enhPixelData = new short[depthFrame.Width * depthFrame.Height];

    for(int i = 0; i < pixelData.Length; i++)
    {
        depth = pixelData[i] >> DepthImageFrame.PlayerIndexBitmaskWidth;

        if(depth < loThreshold || depth > hiThreshold)
        {
            enhPixelData[i] = 0xFF;
        }
        else
        {
            enhPixelData[i] = (short) ~pixelData[i];
        }
    }

    EnhancedDepthImage.Source = BitmapSource.Create(depthFrame.Width, depthFrame.Height, 96, 96,
                                                    PixelFormats.Gray16, null, enhPixelData,
                                                    depthFrame.Width * depthFrame.BytesPerPixel);
}
Note that a separate method is doing the image manipulation, whereas up to now all frame processing has been performed in the event handlers. Event handlers should contain as little code as possible and should delegate the work to other methods. There may be instances, mostly driven by performance considerations, where the processing work will have to be done in a separate thread. Having the code broken out into methods like this makes these types of changes easy and painless.

Figure 3-5 shows the application output. The two depth images are shown side by side for contrast. The image on the left is the natural depth image output, while the image on the right is produced by the code in Listing 3-7. Notice the distinct inversion of grays.
Figure 3-5. Lighter shades of gray

While this image is better, the range of grays is limited. We created a lighter shade of gray and not a better shade of gray. To create a richer set of grays, we expand the image from being a 16-bit grayscale to 32 bits and color. The color gray occurs when the red, green, and blue values are the same. This gives us a range from 0 to 255. Zero is black, 255 is white, and everything else in between is a shade of gray. To make it easier to switch between the two processed depth images, we create a new version of the method, as shown in Listing 3-8.

Listing 3-8. The Depth Image in a Better Shade of Gray

private void CreateBetterShadesOfGray(DepthImageFrame depthFrame, short[] pixelData)
{
    int depth;
    int gray;
    int loThreshold = 1220;
    int hiThreshold = 3048;
    int bytesPerPixel = 4;
    byte[] enhPixelData = new byte[depthFrame.Width * depthFrame.Height * bytesPerPixel];

    for(int i = 0, j = 0; i < pixelData.Length; i++, j += bytesPerPixel)
    {
        depth = pixelData[i] >> DepthImageFrame.PlayerIndexBitmaskWidth;

        if(depth < loThreshold || depth > hiThreshold)
        {
            gray = 0xFF;
        }
        else
        {
            gray = (255 * depth / 0xFFF);
        }

        enhPixelData[j] = (byte) gray;
        enhPixelData[j + 1] = (byte) gray;
        enhPixelData[j + 2] = (byte) gray;
    }

    EnhancedDepthImage.Source = BitmapSource.Create(depthFrame.Width, depthFrame.Height, 96, 96,
                                                    PixelFormats.Bgr32, null, enhPixelData,
                                                    depthFrame.Width * bytesPerPixel);
}
The code in bold in Listing 3-8 represents the differences between this processing of the depth image and the previous attempt (Listing 3-7). The color image format changes to Bgr32, which means there are a total of 32 bits (4 bytes per pixel). Each color gets 8 bits, and there are 8 unused bits. This limits the number of possible grays to 256. Any value outside of the threshold range is set to the color white. All other depths are represented in shades of gray. The intensity of the gray is the result of dividing the depth by 4095 (0xFFF), which is the largest possible depth value, and then multiplying by 255. Figure 3-6 shows the three different depth images demonstrated so far in the chapter.
Figure 3-6. Different visualizations of the depth image, from left to right: raw depth image, depth image from Listing 3-7, and depth image from Listing 3-8
Color Depth

The enhanced depth image produces a shade of gray for each depth value. The range of grays is only 0 to 255, which is much less than our range of depth values. Using colors to represent each depth value gives more depth to the depth image. While there are certainly more advanced techniques for doing this, a simple method is to convert the depth values into hue and saturation values. Listing 3-9 shows an example of one way to colorize a depth image.
Listing 3-9. Coloring the Depth Image

private void CreateColorDepthImage(DepthImageFrame depthFrame, short[] pixelData)
{
    int depth;
    double hue;
    int loThreshold = 1220;
    int hiThreshold = 3048;
    int bytesPerPixel = 4;
    byte[] rgb = new byte[3];
    byte[] enhPixelData = new byte[depthFrame.Width * depthFrame.Height * bytesPerPixel];

    for(int i = 0, j = 0; i < pixelData.Length; i++, j += bytesPerPixel)
    {
        depth = pixelData[i] >> DepthImageFrame.PlayerIndexBitmaskWidth;

        if(depth < loThreshold || depth > hiThreshold)
        {
            enhPixelData[j] = 0x00;
            enhPixelData[j + 1] = 0x00;
            enhPixelData[j + 2] = 0x00;
        }
        else
        {
            hue = ((360 * depth / 0xFFF) + loThreshold);
            ConvertHslToRgb(hue, 100, 100, rgb);

            enhPixelData[j] = rgb[2];        //Blue
            enhPixelData[j + 1] = rgb[1];    //Green
            enhPixelData[j + 2] = rgb[0];    //Red
        }
    }

    EnhancedDepthImage.Source = BitmapSource.Create(depthFrame.Width, depthFrame.Height, 96, 96,
                                                    PixelFormats.Bgr32, null, enhPixelData,
                                                    depthFrame.Width * bytesPerPixel);
}

Hue values are measured in degrees of a circle and range from 0 to 360. The hue value is calculated from the depth value and offset by the low threshold. The ConvertHslToRgb method uses a common algorithm to convert the HSL values to RGB values, and is included in the downloadable code for this book. This example sets the saturation and lightness values to 100%.

The running application generates a depth image like the last image in Figure 3-7. The first image in the figure is the raw depth image, and the middle image is generated from Listing 3-8. Depths closer to the camera are shades of blue. The shades of blue transition to purple, and then to red, the farther the object is from Kinect. The values continue along this scale.
Figure 3-7. Color depth image compared to grayscale

You will notice that the performance of the application is suddenly markedly sluggish. It takes a copious amount of work to convert each pixel (640 × 480 = 307,200 pixels!) into a color value using this method. We do not recommend you do this work on the UI thread as we have in this example. A better approach is to do this work on a background thread. Each time the KinectSensor fires the frame-ready event, your code stores the frame in a queue. A background thread continuously converts the next frame in the queue to a color image. After the conversion, the background thread uses WPF's Dispatcher to update the Image source on the UI thread. This type of application architecture is very common in Kinect-based applications, because the work necessary to process the depth data is performance intensive. It is bad application design to do this type of work on the UI thread, as it will lower the frame rate and ultimately create a bad user experience.
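A minimal sketch of this producer/consumer architecture follows. The member and method names (_FrameQueue, _KeepProcessing, ConvertToColor, and so on) are our own, not part of the SDK; ConvertToColor stands in for logic like Listing 3-9, and the code assumes using directives for System.Collections.Concurrent and System.Threading:

private readonly ConcurrentQueue<short[]> _FrameQueue = new ConcurrentQueue<short[]>();
private volatile bool _KeepProcessing = true;

private void KinectDevice_DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
    using(DepthImageFrame frame = e.OpenDepthImageFrame())
    {
        if(frame != null)
        {
            //Copy the pixel data and hand it off to the worker thread.
            short[] pixelData = new short[frame.PixelDataLength];
            frame.CopyPixelDataTo(pixelData);
            this._FrameQueue.Enqueue(pixelData);
        }
    }
}

//Runs on a background thread, e.g., new Thread(ProcessFrames).Start();
private void ProcessFrames()
{
    short[] pixelData;

    while(this._KeepProcessing)
    {
        if(this._FrameQueue.TryDequeue(out pixelData))
        {
            byte[] colorPixelData = ConvertToColor(pixelData);

            //Marshal the bitmap update back to the UI thread.
            this.Dispatcher.BeginInvoke((Action) delegate
            {
                this._EnhDepthImage.WritePixels(this._EnhDepthImageRect, colorPixelData,
                                                this._EnhDepthImageStride, 0);
            });
        }
        else
        {
            Thread.Sleep(1);
        }
    }
}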
Simple Depth Image Processing

To this point, we have extracted the depth value of each pixel and created images from the data. In previous examples, we filtered out pixels that were beyond certain threshold values. This is a form of image processing, not surprisingly called thresholding. Our use of thresholding, while crude, suits our needs. More advanced processes use machine learning to calculate threshold values for each frame.
Note Kinect returns 4096 (0 to 4095) possible depth values. Since a zero value always means the depth is undeterminable, it can always be filtered out. Microsoft recommends using only depths from 4 to 12.5 feet. Before doing any other depth processing, you can build thresholds into your application and only process depths ranging from 1220 (4') to 3810 (12.5').
Using statistics is common when processing depth image data. Thresholds can be calculated based on the mean or median of depth values. Probabilities help determine if a pixel is noise, a shadow, or something of greater meaning, such as being part of a user's hand. If you allow your mind to forget the visual meaning of a pixel, it transitions into raw data, at which point data mining techniques become applicable. The motivation behind processing depth pixels is to perform shape and object recognition. With this information, applications can determine where a user is in relation to Kinect, where that user's hand is, and if that hand is in the act of waving.
Histograms

The histogram is a tool for determining statistical distributions of data. Our concern is the distribution of depth data. Histograms visually tell the story of how recurrent certain data values are for a given data set. From a histogram we discern how frequently and how tightly grouped depth values are. With this information, it is possible to make decisions that determine thresholds and other filtering techniques, which ultimately reveal the contents of the depth image. To demonstrate this, we next build and display a histogram from a depth frame, and then use simple techniques to filter unwanted pixels.

Let's start fresh and create a new project. Perform the standard steps of discovering and initializing a KinectSensor object for depth-only processing, including subscribing to the DepthFrameReady event. Before adding the code to build the depth histogram, update the MainWindow.xaml with the code shown in Listing 3-10.

Listing 3-10. Depth Histogram UI
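The markup can be as simple as the following sketch; the DepthHistogram StackPanel is the element the code below populates with Rectangles, and the element sizes are our own choices:

<Grid x:Name="LayoutRoot">
    <StackPanel>
        <!-- Target of CreateBetterShadesOfGray (Listing 3-8) -->
        <Image x:Name="EnhancedDepthImage" Width="640" Height="480"/>
        <!-- Histogram bars are added to this panel in code -->
        <StackPanel x:Name="DepthHistogram" Orientation="Horizontal" Height="100"/>
    </StackPanel>
</Grid>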
Our approach to creating the histogram is simple. We create a series of Rectangle elements and add them to the DepthHistogram (a StackPanel element). While the graph will not have a high fidelity for this demonstration, it serves us well. Most applications calculate histogram data and use it for internal processing only. However, if our intent were to include the histogram data within the UI, we would certainly put more effort into the look and feel of the graph. The code to build and display the histogram is shown in Listing 3-11.

Listing 3-11. Building a Depth Histogram

private void KinectDevice_DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
    using(DepthImageFrame frame = e.OpenDepthImageFrame())
    {
        if(frame != null)
        {
            frame.CopyPixelDataTo(this._DepthPixelData);
            CreateBetterShadesOfGray(frame, this._DepthPixelData);  //See Listing 3-8
            CreateDepthHistogram(frame, this._DepthPixelData);
        }
    }
}

private void CreateDepthHistogram(DepthImageFrame depthFrame, short[] pixelData)
{
    int depth;
    int[] depths = new int[4096];
    int maxValue = 0;
    double chartBarWidth = DepthHistogram.ActualWidth / depths.Length;

    DepthHistogram.Children.Clear();

    //First pass - Count the depths.
    for(int i = 0; i < pixelData.Length; i++)
    {
        depth = pixelData[i] >> DepthImageFrame.PlayerIndexBitmaskWidth;

        if(depth != 0)
        {
            depths[depth]++;
        }
    }

    //Second pass - Find the max depth count to scale the histogram to the space available.
    //              This is only to make the UI look nice.
    for(int i = 0; i < depths.Length; i++)
    {
        maxValue = Math.Max(maxValue, depths[i]);
    }

    //Third pass - Build the histogram.
    for(int i = 0; i < depths.Length; i++)
    {
        if(depths[i] > 0)
        {
            Rectangle r = new Rectangle();
            r.Fill = Brushes.Black;
            r.Width = chartBarWidth;
            r.Height = DepthHistogram.ActualHeight * (depths[i] / (double) maxValue);
            r.Margin = new Thickness(1, 0, 1, 0);
            r.VerticalAlignment = System.Windows.VerticalAlignment.Bottom;
            DepthHistogram.Children.Add(r);
        }
    }
}
Building the histogram starts by creating an array to hold a count for each depth. The array size is 4096, which is the number of possible depth values. The first step is to iterate through the depth image
pixels, extract the depth value, and increment the depth count in the depths array for each depth value encountered. Depth values of zero are ignored, because they represent out-of-range depths. Figure 3-8 shows a depth image with a histogram of the depth values. The depth values are along the X-axis. The Y-axis represents the frequency of the depth value in the image.
Figure 3-8. Depth image with histogram

As you interact with the application, it is interesting (and cool) to see how the graph flows and changes as you move closer to and farther away from Kinect. Grab a friend and see the results as multiple users are in view. Another test is to add different objects, the larger the better, and place them in the view area to see how this affects the histogram. Take notice of the two spikes at the end of the graph in Figure 3-8. These spikes represent the wall in this picture. The wall is about seven feet from Kinect, whereas the user is roughly five feet away. This is an example of when to employ thresholding. In this instance, it is undesirable to include the wall. The images shown in Figure 3-9 are the result of hard-coding a threshold range of 3 to 6.5 feet. Notice how the distribution of depth changes.
Figure 3-9. Depth images and histogram with the wall filtered out of the image; the second image (right) shows the user holding a newspaper approximately two feet in front of the user

While watching the undulations in the graph change in real time is interesting, you quickly begin wondering what the next steps are. What else can we do with this data, and how can it be useful in an application? Analysis of the histogram can reveal peaks and valleys in the data. By applying image-processing techniques, such as thresholding to filter out data, the histogram data can reveal more about the image. Further application of other data processing techniques can reduce noise or normalize the data, lessening the differences between the peaks and valleys. As a result of the processing, it then becomes possible to detect edges of shapes and blobs of pixels. The blobs begin to take on recognizable shapes, such as people, chairs, or walls.
Further Reading

A study of image-processing techniques falls far beyond the scope of this chapter and book. The purpose here is to show that raw depth data is available to you, and to help you understand possible uses of the data. More than likely, your Kinect application will not need to process depth data extensively. For applications that require depth data processing, it quickly becomes necessary to use tools like the OpenCV library. Depth image processing is often resource intensive and needs to be executed at a lower level than is achievable with a language like C#.
Note The OpenCV (Open Source Computer Vision – opencv.willowgarage.com) library is a collection of commonly used algorithms for processing and manipulating images. This group is also involved in the Point Cloud Library (PCL) and Robot Operating System (ROS), both of which involve intensive processing of depth data. Anyone looking beyond beginner's material should research OpenCV.
The more common reason an application would process raw depth data is to determine the positions of users in Kinect's view area. While the Microsoft Kinect SDK actually does much of this work for you through skeleton tracking, your application's needs may go beyond what the SDK provides. In the next section, we walk through the process of easily detecting the pixels that belong to users. Before moving on, you are encouraged to research and study image-processing techniques. Below are several topics to help further your research:

• Image Processing (general)
    • Thresholding
    • Segmentation
• Edge/Contour Detection
    • Gaussian filters
    • Sobel, Prewitt, and Kirsch
    • Canny edge detector
    • Roberts' Cross operator
    • Hough Transforms
• Blob Detection
    • Laplacian of the Gaussian
    • Hessian operator
    • k-means clustering
Depth and Player Indexing

The SDK has a feature that analyzes depth image data and detects human or player shapes. It recognizes as many as six players at a time. The SDK assigns a number to each tracked player. The number or player index is stored in the first three bits of the depth pixel data (Figure 3-10). As discussed in an earlier section of this chapter, each pixel is 16 bits. Bits 0 to 2 hold the player index value, and bits 3 to 15 hold the depth value. A bit mask of 7 (00000111) gets the player index from the depth value. For a detailed explanation of bit masks, refer to Appendix A. Fortunately, the Kinect SDK defines a pair of constants focused on the player index bits. They are DepthImageFrame.PlayerIndexBitmaskWidth and DepthImageFrame.PlayerIndexBitmask. The value of the former is 3 and the latter is 7. Your application should use these constants and not the literal values, as the values may change in future versions of the SDK. A quick sketch of the two constants in use follows.
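Taken together, the two constants split a raw depth pixel into its two components. A minimal sketch, assuming pixelData and pixelIndex as defined in Listing 3-3:

short rawPixel = pixelData[pixelIndex];
int playerIndex = rawPixel & DepthImageFrame.PlayerIndexBitmask;      //Bits 0-2: player index (0 = no player)
int depth = rawPixel >> DepthImageFrame.PlayerIndexBitmaskWidth;      //Bits 3-15: depth in millimeters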
Figure 3-10. Depth and player index bits

A pixel with a player index value of zero means no player is at that pixel; otherwise, players are numbered 1 to 6. However, enabling only the depth stream does not activate player tracking. Player tracking requires skeleton tracking. When initializing the KinectSensor object and the DepthImageStream, you must also enable the SkeletonStream. Only with the SkeletonStream enabled will player index values appear in the depth pixel bits. Your application does not need to subscribe to the SkeletonFrameReady event to get player index values.

Let's explore the player index bits. Create a new project that discovers and initializes a KinectSensor object. Enable both the DepthImageStream and SkeletonStream, and subscribe to the DepthFrameReady event on the KinectSensor object. In the MainWindow.xaml add two Image elements named RawDepthImage and EnhDepthImage. Add the member variables and code to support creating images using the WriteableBitmap. Finally, add the code in Listing 3-12. This example changes the value of all pixels associated with a player to black and all other pixels to white. Figure 3-11 shows the output of this code. For contrast, the figure shows the raw depth image on the left.
Listing 3-12. Displaying Users in Black and White

private void KinectDevice_DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
    using(DepthImageFrame frame = e.OpenDepthImageFrame())
    {
        if(frame != null)
        {
            frame.CopyPixelDataTo(this._RawDepthPixelData);
            this._RawDepthImage.WritePixels(this._RawDepthImageRect, this._RawDepthPixelData,
                                            this._RawDepthImageStride, 0);
            GeneratePlayerDepthImage(frame, this._RawDepthPixelData);
        }
    }
}

private void GeneratePlayerDepthImage(DepthImageFrame depthFrame, short[] pixelData)
{
    int playerIndex;
    int depthBytePerPixel = 4;
    byte[] enhPixelData = new byte[depthFrame.Height * this._EnhDepthImageStride];

    for(int i = 0, j = 0; i < pixelData.Length; i++, j += depthBytePerPixel)
    {
        playerIndex = pixelData[i] & DepthImageFrame.PlayerIndexBitmask;

        if(playerIndex == 0)
        {
            enhPixelData[j] = 0xFF;
            enhPixelData[j + 1] = 0xFF;
            enhPixelData[j + 2] = 0xFF;
        }
        else
        {
            enhPixelData[j] = 0x00;
            enhPixelData[j + 1] = 0x00;
            enhPixelData[j + 2] = 0x00;
        }
    }

    this._EnhDepthImage.WritePixels(this._EnhDepthImageRect, enhPixelData,
                                    this._EnhDepthImageStride, 0);
}
Figure 3-11. Raw depth image (left) and processed depth image with player indexing (right)

There are several possibilities for enhancing this code with code we wrote earlier in this chapter. For example, you can apply a grayscale to the player pixels based on the depth and black out all other pixels. In such a project, you could build a histogram of the player's depth values and then determine the grayscale value of each depth in relation to the histogram. Another common exercise is to apply a solid color to each different player, where Player 1's pixels are red, Player 2's blue, Player 3's green, and so on (a sketch of this appears at the end of this section). The Kinect Explorer sample application that comes with the SDK does this. You could, of course, apply the color intensity for each pixel based on the depth value, too. Since the depth data is the differentiating element of Kinect, you should use the data wherever and as much as possible.

As a word of caution, do not code to specific player indexes, as they are volatile. The actual player index number is not always consistent and does not coordinate with the actual number of visible users. For example, a single user might be in view of Kinect, but Kinect will return a player index of three for that user's pixels. To demonstrate this, update the code to display a list of player indexes for all visible users. You will notice that sometimes when there is only a single user, Kinect does not always identify that user as player 1. To test this out, walk out of view, wait for about 5 seconds, and walk back in. Kinect will identify you as a new player. Grab several friends to further test this by keeping one person in view at all times and having the others walk in and out of view. Kinect continually tracks users, but once a user has left the view area, it forgets about them. This is just something to keep in mind as you develop your Kinect application.
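As a starting point for the solid-color exercise mentioned above, the else branch of Listing 3-12 could index into a color table keyed by player index. A sketch, with colors of our own choosing (bytes in BGR order, matching the pixel formats used in this chapter):

byte[,] playerColors = new byte[,]
{
    { 0xFF, 0xFF, 0xFF },   //Index 0 - no player (white)
    { 0x00, 0x00, 0xFF },   //Player 1 - red
    { 0xFF, 0x00, 0x00 },   //Player 2 - blue
    { 0x00, 0xFF, 0x00 },   //Player 3 - green
    { 0x00, 0xFF, 0xFF },   //Player 4 - yellow
    { 0xFF, 0x00, 0xFF },   //Player 5 - magenta
    { 0xFF, 0xFF, 0x00 }    //Player 6 - cyan
};

playerIndex = pixelData[i] & DepthImageFrame.PlayerIndexBitmask;
enhPixelData[j] = playerColors[playerIndex, 0];       //Blue
enhPixelData[j + 1] = playerColors[playerIndex, 1];   //Green
enhPixelData[j + 2] = playerColors[playerIndex, 2];   //Red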
Taking Measure

An interesting exercise is to measure the pixels of the user. As discussed in the Measuring Depth section of this chapter, the X and Y positions of the pixels do not correspond to actual width or height measurements; however, it is possible to calculate them. Every camera has a field of view. The focal length and size of the camera's sensor determine the angles of the field. The Microsoft Kinect SDK Programming Guide tells us that the view angles are 57 degrees horizontal and 43 degrees vertical. Since we know the depth values, we can determine the width and height of a player using trigonometry, as illustrated in Figure 3-12, where we calculate a player's width.
Figure 3-12. Finding the player's real-world width

The process described below is not perfect and in certain circumstances can result in inaccurate and distorted values; however, so too is the data returned by Kinect. The inaccuracy is due to the simplicity of the calculations, which do not take into account other physical attributes of the player and space. Despite this, the values are accurate enough for most uses. The motivation here is to provide an introductory example of how Kinect data maps to the real world. You are encouraged to research the physics behind camera optics and field of view so that you can update this code to ensure the output is more accurate.

Let us walk through the math before diving into the code. As Figure 3-12 shows, the angle of view of the camera is an isosceles triangle with the player's depth position forming the base. The actual depth value is the height of the triangle. We can evenly split the triangle in half to create two right triangles, which allows us to calculate the width of the base. Once we know the width of the base, we translate pixel widths into real-world widths. For example, if we calculate the base of the triangle to have a width of 1500mm (59 in), the player's pixel width to be 100, and the pixel width of the image to be 320, then the result is a player width of 468.75mm (18.45 in). For us to perform the calculation, we need to know the player's depth and the number of pixels wide the player spans. We take an average of depths for each of the player's pixels. This normalizes the depth, because in reality no person is completely flat. If that were true, it certainly would make our calculations much easier. The calculation is the same for the player's height, but with a different angle and image dimension.

Now that we know the logic we need to perform, let us walk through the code. Create a new project that discovers and initializes a KinectSensor object. Enable both the DepthStream and SkeletonStream, and subscribe to the DepthFrameReady event on the KinectSensor object. Code the MainWindow.xaml to match Listing 3-13.
Listing 3-13. The UI for Measuring Players
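A sketch of what this markup might contain follows. The ItemsControl name matches the code in Listing 3-14; the bindings target properties of the PlayerDepthData class built later in this section, and the exact template is an assumption:

<Grid x:Name="LayoutRoot">
    <Image x:Name="DepthImage"/>
    <ItemsControl x:Name="PlayerDepthData" VerticalAlignment="Bottom">
        <ItemsControl.ItemTemplate>
            <DataTemplate>
                <StackPanel Orientation="Horizontal">
                    <TextBlock Text="{Binding PlayerId}" Margin="5"/>
                    <TextBlock Text="{Binding RealWidth}" Margin="5"/>
                    <TextBlock Text="{Binding RealHeight}" Margin="5"/>
                </StackPanel>
            </DataTemplate>
        </ItemsControl.ItemTemplate>
    </ItemsControl>
</Grid>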
The purpose of the ItemsControl is to display player measurements. Our approach is to create an object to collect player depth data and perform the calculations determining the real width and height values of the user. The application maintains an array of these objects, and the array becomes the ItemsSource for the ItemsControl. The UI defines a template to display relevant data for each player depth object, which we will call PlayerDepthData. Before creating this class, let's review the code that interfaces with the class to see how it is used. Listing 3-14 shows a method named CalculatePlayerSize, which is called from the DepthFrameReady event handler.
Listing 3-14. Calculating Player Sizes

private void KinectDevice_DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
    using(DepthImageFrame frame = e.OpenDepthImageFrame())
    {
        if(frame != null)
        {
            frame.CopyPixelDataTo(this._DepthPixelData);
            CreateBetterShadesOfGray(frame, this._DepthPixelData);
            CalculatePlayerSize(frame, this._DepthPixelData);
        }
    }
}

private void CalculatePlayerSize(DepthImageFrame depthFrame, short[] pixelData)
{
    int depth;
    int playerIndex;
    int pixelIndex;
    int bytesPerPixel = depthFrame.BytesPerPixel;
    PlayerDepthData[] players = new PlayerDepthData[6];

    //First pass - Calculate stats from the pixel data
    for(int row = 0; row < depthFrame.Height; row++)
    {
        for(int col = 0; col < depthFrame.Width; col++)
        {
            pixelIndex = col + (row * depthFrame.Width);
            depth = pixelData[pixelIndex] >> DepthImageFrame.PlayerIndexBitmaskWidth;

            if(depth != 0)
            {
                playerIndex = (pixelData[pixelIndex] & DepthImageFrame.PlayerIndexBitmask);
                playerIndex -= 1;

                if(playerIndex > -1)
                {
                    if(players[playerIndex] == null)
                    {
                        players[playerIndex] = new PlayerDepthData(playerIndex + 1,
                                                                   depthFrame.Width,
                                                                   depthFrame.Height);
                    }

                    players[playerIndex].UpdateData(col, row, depth);
                }
            }
        }
    }

    PlayerDepthData.ItemsSource = players;
}

The bold lines of code in Listing 3-14 reference uses of the PlayerDepthData object in some way. The logic of the CalculatePlayerSize method goes pixel by pixel through the depth image and extracts the depth and player index values. The algorithm ignores any pixel with a depth value of zero or not associated with a player. For any pixel belonging to a player, the code calls the UpdateData method on the PlayerDepthData object of that player. After processing all pixels, the code sets the players array to be the source for the ItemsControl named PlayerDepthData. The real work of calculating each player's size is encapsulated within the PlayerDepthData object, which we'll turn our attention to now.

Create a new class named PlayerDepthData. The code is shown in Listing 3-15. This object is the workhorse of the project. It holds and maintains player depth data, and calculates the real-world width accordingly.

Listing 3-15. Object to Hold and Maintain Player Depth Data

public class PlayerDepthData
{
    #region Member Variables
    private const double MillimetersPerInch = 0.0393700787;
    private static readonly double HorizontalTanA = Math.Tan(28.5 * Math.PI / 180);
    private static readonly double VerticalTanA = Math.Abs(Math.Tan(21.5 * Math.PI / 180));

    private int _DepthSum;
    private int _DepthCount;
    private int _LoWidth;
    private int _HiWidth;
    private int _LoHeight;
    private int _HiHeight;
    #endregion Member Variables

    #region Constructor
    public PlayerDepthData(int playerId, double frameWidth, double frameHeight)
    {
        this.PlayerId = playerId;
        this.FrameWidth = frameWidth;
        this.FrameHeight = frameHeight;

        this._LoWidth = int.MaxValue;
        this._HiWidth = int.MinValue;
        this._LoHeight = int.MaxValue;
        this._HiHeight = int.MinValue;
    }
    #endregion Constructor

    #region Methods
    public void UpdateData(int x, int y, int depth)
    {
        this._DepthCount++;
        this._DepthSum += depth;
        this._LoWidth = Math.Min(this._LoWidth, x);
        this._HiWidth = Math.Max(this._HiWidth, x);
        this._LoHeight = Math.Min(this._LoHeight, y);
        this._HiHeight = Math.Max(this._HiHeight, y);
    }
    #endregion Methods

    #region Properties
    public int PlayerId { get; private set; }
    public double FrameWidth { get; private set; }
    public double FrameHeight { get; private set; }

    public double Depth
    {
        get { return this._DepthSum / (double) this._DepthCount; }
    }

    public int PixelWidth
    {
        get { return this._HiWidth - this._LoWidth; }
    }

    public int PixelHeight
    {
        get { return this._HiHeight - this._LoHeight; }
    }

    public double RealWidth
    {
        get
        {
            double opposite = this.Depth * HorizontalTanA;
            return this.PixelWidth * 2 * opposite / this.FrameWidth * MillimetersPerInch;
        }
    }

    public double RealHeight
    {
        get
        {
            double opposite = this.Depth * VerticalTanA;
            return this.PixelHeight * 2 * opposite / this.FrameHeight * MillimetersPerInch;
        }
    }
    #endregion Properties
}
The primary reason the PlayerDepthData class exists is to encapsulate the measurement calculations and make the process easier to understand. The class accomplishes this by having two input points and two outputs. The constructor and the UpdateData method are the two forms of input, and the RealWidth and RealHeight properties are the output. The code behind each of the output properties calculates the result based on the formulas detailed in Figure 3-12. Each formula relies on a normalized depth value, a measurement of the frame (width or height), and the total pixels consumed by the player. The normalized depth and total pixel measure derive from data passed to the UpdateData method. The real width and height values are only as good as the data supplied to the UpdateData method.

Figure 3-13 shows the results of this project. Each frame exhibits a user in different poses. The images show a UI different from the one in our project in order to better illustrate the player measurement calculations. The width and height calculations adjust for each altered posture. Note that the width and height values are only for the visible area. Take the first frame of Figure 3-13. The user's height is not actually 42 inches, but the height of the user seen by Kinect is 42 inches. The user's real height is 74 inches, which means that only just over half of the user is visible. The width value has a similar caveat.
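Plugging the example numbers from Figure 3-12 through the same math the RealWidth property uses makes a handy sanity check. A sketch (the depth value here is back-calculated from the 1500mm base and is our own assumption):

double halfAngleTan = Math.Tan(28.5 * Math.PI / 180);   //Half of the 57-degree horizontal view
double depth = 750 / halfAngleTan;                      //≈ 1381mm puts the triangle base at 1500mm
double baseWidth = 2 * depth * halfAngleTan;            //1500mm
double realWidth = 100 * baseWidth / 320;               //468.75mm, or about 18.45 inches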
Figure 3-13. Player measurements in different poses
Aligning Depth and Video Images

In our previous examples, we altered the pixels of the depth image to better indicate which pixels belong to users. We colored the player pixels and altered the color of the non-player pixels. However, there are instances where you want to alter the pixels in the video image based on the player pixels. There is an effect used by moviemakers called green screening or, more technically, chroma keying. This is where an actor stands in front of a green backdrop and acts out a scene. Later, the backdrop is edited out of the scene and replaced with some other type of background. This is common in sci-fi movies, where it is impossible to send actors to Mars, for example, to perform a scene. We can create this same type of effect with Kinect, and the Microsoft SDK makes this easy. The code to write this type of application is not much different from what we have already coded in this chapter.
Note This type of application is a basic example of an augmented reality experience. Augmented reality applications are extremely fun and captivatingly immersive experiences. Many artists are using Kinect to create augmented reality interactive exhibits. Additionally, these types of experiences are used as tools for advertising and marketing.
We know how to get Kinect to tell us which pixels belong to users, but only for the depth image. Unfortunately, the pixels of the depth image do not translate one-to-one with those created by the color stream, even if you set the resolutions of each stream to the same value. The pixels of the two cameras are not aligned, because they are positioned on Kinect just like the eyes on your face. Your eyes see in stereo, in what is called stereo vision. Close your left eye and notice how your view of the world is different. Now close your right eye and open your left. What you see is different from what you saw when only your right eye was open. When both of your eyes are open, your brain does the work to merge the images you see from each eye into one.

The calculations required to translate pixels from one camera to the other are not trivial either. Fortunately, the SDK provides methods that do the work for us. The methods are located on the KinectSensor and are named MapDepthToColorImagePoint, MapDepthToSkeletonPoint, MapSkeletonPointToColor, and MapSkeletonPointToDepth. The DepthImageFrame object has methods with slightly different names, but which function the same (MapFromSkeletonPoint, MapToColorImagePoint, and MapToSkeletonPoint). For this project, we use the MapDepthToColorImagePoint method to translate a depth image pixel position into a pixel position on a color image. In case you are wondering, there is not a method to get the depth pixel based on the coordinates of a color pixel.

Create a new project and add two Image elements to the MainWindow.xaml layout. The first image is the background and can be hard-coded to whatever image you want. The second image is the foreground and is the image we will create. Listing 3-16 shows the XAML for this project.
Listing 3-16. Green Screen App UI
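A sketch of this layout follows; GreenScreenImage matches the name used in Listing 3-17, and the background image source is a placeholder of our own:

<Grid x:Name="LayoutRoot">
    <!-- Hard-coded background; replace the source with any image you like -->
    <Image x:Name="BackgroundImage" Source="/Images/Background.jpg"/>
    <!-- The player cutout rendered by RenderGreenScreen -->
    <Image x:Name="GreenScreenImage"/>
</Grid>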
In this project, we employ polling to ensure that the color and depth frames are as closely aligned as possible. The cutout is more accurate the closer the frames are in timestamp, and every millisecond counts. While it is possible to use the AllFramesReady event on the KinectSensor object, this does not guarantee that the frames given by the event arguments are close in time with one another. The frames will never be in complete synchronization, but the polling model gets the frames as close as possible. Listing 3-17 shows the infrastructure code to discover a device, enable the streams, and poll for frames.

Listing 3-17. Polling Infrastructure

#region Member Variables
private KinectSensor _KinectDevice;
private WriteableBitmap _GreenScreenImage;
private Int32Rect _GreenScreenImageRect;
private int _GreenScreenImageStride;
private short[] _DepthPixelData;
private byte[] _ColorPixelData;
#endregion Member Variables

private void CompositionTarget_Rendering(object sender, EventArgs e)
{
    DiscoverKinect();

    if(this.KinectDevice != null)
    {
        try
        {
            ColorImageStream colorStream = this.KinectDevice.ColorStream;
            DepthImageStream depthStream = this.KinectDevice.DepthStream;

            using(ColorImageFrame colorFrame = colorStream.OpenNextFrame(100))
            {
                using(DepthImageFrame depthFrame = depthStream.OpenNextFrame(100))
                {
                    RenderGreenScreen(this.KinectDevice, colorFrame, depthFrame);
                }
            }
        }
        catch(Exception)
        {
            //Handle exception as needed
        }
    }
}
private void DiscoverKinect()
{
    if(this._KinectDevice != null && this._KinectDevice.Status != KinectStatus.Connected)
    {
        this._KinectDevice.ColorStream.Disable();
        this._KinectDevice.DepthStream.Disable();
        this._KinectDevice.SkeletonStream.Disable();
        this._KinectDevice.Stop();
        this._KinectDevice = null;
    }

    if(this._KinectDevice == null)
    {
        this._KinectDevice = KinectSensor.KinectSensors
                                         .FirstOrDefault(x => x.Status == KinectStatus.Connected);

        if(this._KinectDevice != null)
        {
            this._KinectDevice.SkeletonStream.Enable();
            this._KinectDevice.DepthStream.Enable(DepthImageFormat.Resolution640x480Fps30);
            this._KinectDevice.ColorStream.Enable(ColorImageFormat.RgbResolution1280x960Fps12);

            DepthImageStream depthStream = this._KinectDevice.DepthStream;
            this._GreenScreenImage = new WriteableBitmap(depthStream.FrameWidth,
                                                         depthStream.FrameHeight,
                                                         96, 96, PixelFormats.Bgra32, null);
            this._GreenScreenImageRect = new Int32Rect(0, 0, depthStream.FrameWidth,
                                                       depthStream.FrameHeight);
            this._GreenScreenImageStride = depthStream.FrameWidth * 4;
            this.GreenScreenImage.Source = this._GreenScreenImage;

            this._DepthPixelData = new short[depthStream.FramePixelDataLength];
            this._ColorPixelData = new byte[this._KinectDevice.ColorStream.FramePixelDataLength];

            this._KinectDevice.Start();
        }
    }
}
The basic implementation of the polling model in Listing 3-17 should be common and straightforward by now. There are a few lines of code of note, which are marked in bold. The first line of code in bold is the method call to RenderGreenScreen. Comment out this line of code for now. We implement it next. The next two lines of code in bold, which enable the color and depth streams, factor into the quality of our background subtraction process. When mapping between the color and depth images, it is best that the color image resolution be twice that of the depth stream, to ensure the best possible pixel translation.

The RenderGreenScreen method does the actual work of this project. It creates a new color image by removing the non-player pixels from the color image. The algorithm starts by iterating over each pixel of the depth image and determines if the pixel has a valid player index value. The next step is to get the corresponding color pixel for any pixel belonging to a player, and add that pixel to a new byte array of pixel data. All other pixels are discarded. The code for this method is shown in Listing 3-18.

Listing 3-18. Performing Background Subtraction

private void RenderGreenScreen(KinectSensor kinectDevice, ColorImageFrame colorFrame,
                               DepthImageFrame depthFrame)
{
    if(kinectDevice != null && depthFrame != null && colorFrame != null)
    {
        int depthPixelIndex;
        int playerIndex;
        int colorPixelIndex;
        ColorImagePoint colorPoint;
        int colorStride = colorFrame.BytesPerPixel * colorFrame.Width;
        int bytesPerPixel = 4;
        byte[] playerImage = new byte[depthFrame.Height * this._GreenScreenImageStride];
        int playerImageIndex = 0;

        depthFrame.CopyPixelDataTo(this._DepthPixelData);
        colorFrame.CopyPixelDataTo(this._ColorPixelData);

        for(int depthY = 0; depthY < depthFrame.Height; depthY++)
        {
            for(int depthX = 0; depthX < depthFrame.Width; depthX++, playerImageIndex += bytesPerPixel)
            {
                depthPixelIndex = depthX + (depthY * depthFrame.Width);
                playerIndex = this._DepthPixelData[depthPixelIndex] &
                              DepthImageFrame.PlayerIndexBitmask;

                if(playerIndex != 0)
                {
                    colorPoint = kinectDevice.MapDepthToColorImagePoint(depthX, depthY,
                                     this._DepthPixelData[depthPixelIndex],
                                     colorFrame.Format, depthFrame.Format);
                    colorPixelIndex = (colorPoint.X * colorFrame.BytesPerPixel) +
                                      (colorPoint.Y * colorStride);

                    playerImage[playerImageIndex] = this._ColorPixelData[colorPixelIndex];         //Blue
                    playerImage[playerImageIndex + 1] = this._ColorPixelData[colorPixelIndex + 1]; //Green
                    playerImage[playerImageIndex + 2] = this._ColorPixelData[colorPixelIndex + 2]; //Red
                    playerImage[playerImageIndex + 3] = 0xFF;                                      //Alpha
                }
            }
        }

        this._GreenScreenImage.WritePixels(this._GreenScreenImageRect, playerImage,
                                           this._GreenScreenImageStride, 0);
    }
}
The byte array playerImage holds the color pixels belonging to players. Since the depth image is the source of our player data input, it becomes the lowest common denominator. The image created from these pixels is the same size as the depth image. Unlike the depth image, which uses two bytes per pixel, the player image uses four bytes per pixel: blue, green, red, and alpha. The alpha bits are important to this project, as they determine the transparency of each pixel. The player pixels get an alpha value of 255 (0xFF), meaning they are fully opaque, whereas the non-player pixels get a value of zero and are transparent.

The MapDepthToColorImagePoint method takes in the depth pixel coordinates and the depth, and returns the color coordinates. The format of the depth value requires mentioning. The mapping method requires the raw depth value, including the player index bits; otherwise the returned result is incorrect.

The remaining code of Listing 3-18 extracts the color pixel values and stores them in the playerImage array. After processing all depth pixels, the code updates the pixels of the player bitmap. Run this program, and it is quickly apparent the effect is not perfect. It works well when the user stands still. However, if the user moves quickly, the process breaks down, because the depth and color frames cannot stay aligned. Notice in Figure 3-14 that the pixels on the user's left side are not crisp and show noise. It is possible to fix this, but the process is non-trivial. It requires smoothing of the pixels around the player. For the best results, it is necessary to merge several frames of images into one. We pick this project back up in Chapter 8 to demonstrate how tools like OpenCV can do this work for us.
Figure 3-14. A visit to wine country
Depth Near Mode

The original purpose for Kinect was to serve as a game control for the Xbox. The Xbox is primarily played in a living room space where the user is a few feet away from the TV screen and Kinect. After the initial release, developers all over the world began building applications using Kinect on PCs. Several of these PC-based applications require Kinect to see or focus at a much closer range than is available with the original hardware. The developer community called on Microsoft to update Kinect so that Kinect could return depth data for distances nearer than 800mm (31.5 inches).

Microsoft answered by releasing new hardware specially configured for use on PCs. The new hardware goes by the name Kinect for Windows, and the original hardware by Kinect for Xbox. The Kinect for Windows SDK has a number of API elements specific to the new hardware. The Range property sets the view range of the Kinect sensor. The Range property is of type DepthRange, an enumeration with two options, as shown in Table 3-1. All depth ranges are inclusive.
Table 3-1. DepthRange Values

DepthRange    What it is
Normal        Sets the viewable depth range to 800mm (2'7.5") – 4000mm (13'1.48"). Has an integer value of 0.
Near          Sets the viewable depth range to 400mm (1'3.75") – 3000mm (9'10.11"). Has an integer value of 1.
The Range property can be changed dynamically while the DepthImageStream is enabled and producing frames. This allows for dynamic and quick changes in focus as needed, without having to restart the KinectSensor or the DepthImageStream. However, the Range property is sensitive to the type of Kinect hardware being used. Any change of the Range property to DepthRange.Near when using Kinect for Xbox hardware results in an InvalidOperationException with a message of, "The feature is not supported by this version of the hardware." Near mode viewing is only supported by Kinect for Windows hardware.

Two additional properties accompany the near depth range feature. They are MinDepth and MaxDepth. These properties describe the boundaries of Kinect's depth range. Both values update on any change to the Range property value.

One final feature of note with the depth stream is the special treatment of depth values that exceed the boundaries of the depth range. The DepthImageStream defines two properties named TooFarDepth and TooNearDepth, which give the application more information about the out-of-range depth. There are instances when a depth is completely indeterminate, and it is given a value equal to the UnknownDepth property on the DepthImageStream.
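A minimal sketch of switching ranges at runtime, guarding for Kinect for Xbox hardware, using the Range property and exception described above:

try
{
    //Switch to near mode while the stream keeps producing frames.
    this._KinectDevice.DepthStream.Range = DepthRange.Near;
}
catch(InvalidOperationException)
{
    //Kinect for Xbox hardware does not support near mode:
    //"The feature is not supported by this version of the hardware."
}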
Summary

Depth is fundamental to Kinect. Depth is what differentiates it from all other input devices. Understanding how to work with Kinect's depth data is equally fundamental to developing Kinect experiences. Any Kinect application that does not incorporate depth is underutilizing the hardware, and ultimately limiting the user experience from reaching its fullest potential. While not every application needs to access and process the raw depth data, as a developer or application architect you need to know this data is available and how to exploit it to the benefit of the user experience. Further, while your application may not process the data directly, it will receive a derivative of the data. Kinect processes the original depth data to determine which pixels belong to each user. The skeleton tracking engine component of the SDK performs more extensive processing of depth data to produce user skeleton information.

It is less frequent for a real-world Kinect experience to use the raw depth data directly. It is more common to use third-party tools such as OpenCV to process this data, as we will show in Chapter 9. Processing of raw depth data is not always a trivial process. It also can have extreme performance demands. This alone means that a managed language like C# is not always the best tool for the job. This is not to say it is impossible, but it often requires a lower level of processing than C# can provide. If the kind of depth image processing you want to do is unachievable with an existing third-party library, create your own C/C++ library to do the processing. Your WPF application can then use it.
wecantellwhenaplayerisinviewareaofKinectormorespecifically,wheretheyareinrelationtothe entireviewingarea.Usingthedatayourapplicationcouldperformsomeactionlikeplayasoundclipof applausewhenKinectdetectsanewuser,oraseries“boo”soundswhenauserleavestheviewarea. However,beforeyourunoffandstartwritingcodethatdoesthis,waituntilnextchapterwhenwe introduceskeletontracking.TheKinectforWindowsSDK’sskeletontrackingenginemakesthisan easiertask.Thepointisthatthedataisavailableforyoutouseifyourapplicationneedsit. Calculatingthedimensionsofobjectsinreal-worldspaceisonereasontoprocessdepthdata.In ordertodothisyoumustunderstandthephysicsbehindcameraoptics,andbeproficientin trigonometry.TheviewanglesofKinectcreatetriangles.Sinceweknowtheanglesofthesetrianglesand thedepth,wecanmeasureanythingintheviewfield.Asanaside,imaginehowproudyourhighschool trigteacherwouldbetoknowyouareusingtheskillsshetaughtyou. Thelastprojectofthischapterconverteddepthpixelcoordinatesintocolorstreamcoordinatesto performbackgroundsubtractiononanimage.Theexamplecodedemonstratedaverysimpleand practicalusecase.Ingaming,itismorecommontouseanavatartorepresenttheuser.However,many otherKinectexperiencesincorporatethevideocameraanddepthdata,withaugmentedrealityconcepts beingthemostcommon. Finally,thischaptercoveredtheneardepthmodeavailableonKinectforWindowshardware.Using asimplesetofproperties,anapplicationcandynamicallychangethedepthrangeviewablebyKinect. Thisconcludescoverageofthedepthstream.Nowlet’smoveontoreviewtheskeletonstream.
CHAPTER 4
Skeleton Tracking

The raw depth data produced by Kinect has limited uses. To build truly interactive, fun, and memorable experiences with Kinect, we need more information beyond just the depth of each pixel. This is where skeleton tracking comes in. Skeleton tracking is the processing of depth image data to establish the positions of various skeleton joints on a human form. For example, skeleton tracking determines where a user's head, hands, and center of mass are. Skeleton tracking provides X, Y, and Z values for each of these skeleton points. In the previous chapter, we explored elementary depth image processing techniques. Skeleton tracking systems go beyond our introductory image processing routines. They analyze depth images employing complicated algorithms that use matrix transforms, machine learning, and other means to calculate skeleton points.

In the first section of this chapter, we build an application that works with all of the major objects of the skeleton tracking system. What follows is a thorough examination of the skeleton tracking object model. It is important to know what data the skeleton tracking engine provides you. We next proceed to building a complete game using the Kinect and skeleton tracking. We use everything learned in this chapter to build the game, which will serve as a springboard for other Kinect experiences. The chapter concludes with an examination of a hardware feature that can improve the quality of the skeleton tracking.

The analogy that you must walk before you run applies to this book. Up to and including this chapter we have been learning to walk. After this chapter we run. The fundamentals of skeleton tracking learned here create a foundation for the next two chapters and every Kinect application you write going forward. You will find that in virtually every application you create using Kinect, the vast majority of your code will focus on the skeleton tracking objects. After completing this chapter, we will have covered all components of the Kinect for Windows SDK dealing with Kinect's cameras. We start with an application that draws stick figures from skeleton data produced by the SDK's skeleton stream.
Seeking Skeletons

Our goal is to write an application that draws the skeleton of every user in Kinect's view area. Before jumping into code and working with skeleton data, we should first walk through the basic options and see how to get skeleton data. It is also helpful to know the format of the data so that we can perform any necessary data manipulation. However, the intent is for this examination to be brief, so that we understand just enough of the skeleton objects and data to draw skeletons.

Skeleton data comes from the SkeletonStream. Data from this stream is accessible either from events or by polling, similarly to the color and depth streams. In this walkthrough, we use events, simply because it takes less code and is a more common and basic approach. The KinectSensor object has an event named SkeletonFrameReady, which fires each time new skeleton data becomes available. Skeleton data is also available from the AllFramesReady event. We look at the skeleton tracking object model in greater detail shortly, but for now, we only concern ourselves with getting skeleton data from the
stream. Each frame of the SkeletonStream produces a collection of Skeleton objects. Each Skeleton object contains data that describes the location of the skeleton and the skeleton's joints. Each joint has an identity (head, shoulder, elbow, etc.) and a 3D vector.

Now let's write some code. Create a new project with a reference to the Microsoft.Kinect dll, and add the basic boilerplate code for capturing a connected Kinect sensor. Before starting the sensor, enable the SkeletonStream and subscribe to the SkeletonFrameReady event. Our first project to introduce skeleton tracking does not use the video or depth streams. The initialization should appear as in Listing 4-1.

Listing 4-1. Simple Skeleton Tracking Initialization

#region Member Variables
private KinectSensor _KinectDevice;
private readonly Brush[] _SkeletonBrushes;
private Skeleton[] _FrameSkeletons;
#endregion Member Variables

#region Constructor
public MainWindow()
{
    InitializeComponent();

    this._SkeletonBrushes = new [] { Brushes.Black, Brushes.Crimson, Brushes.Indigo,
                                     Brushes.DodgerBlue, Brushes.Purple, Brushes.Pink };

    KinectSensor.KinectSensors.StatusChanged += KinectSensors_StatusChanged;
    this.KinectDevice = KinectSensor.KinectSensors
                                    .FirstOrDefault(x => x.Status == KinectStatus.Connected);
}
#endregion Constructor

#region Methods
private void KinectSensors_StatusChanged(object sender, StatusChangedEventArgs e)
{
    switch (e.Status)
    {
        case KinectStatus.Initializing:
        case KinectStatus.Connected:
            this.KinectDevice = e.Sensor;
            break;
        case KinectStatus.Disconnected:
            //TODO: Give the user feedback to plug-in a Kinect device.
            this.KinectDevice = null;
            break;
        default:
            //TODO: Show an error state
            break;
    }
}
#endregion Methods
#region Properties
public KinectSensor KinectDevice
{
    get { return this._KinectDevice; }
    set
    {
        if(this._KinectDevice != value)
        {
            //Uninitialize
            if(this._KinectDevice != null)
            {
                this._KinectDevice.Stop();
                this._KinectDevice.SkeletonFrameReady -= KinectDevice_SkeletonFrameReady;
                this._KinectDevice.SkeletonStream.Disable();
                this._FrameSkeletons = null;
            }

            this._KinectDevice = value;

            //Initialize
            if(this._KinectDevice != null)
            {
                if(this._KinectDevice.Status == KinectStatus.Connected)
                {
                    this._KinectDevice.SkeletonStream.Enable();
                    this._FrameSkeletons = new
                        Skeleton[this._KinectDevice.SkeletonStream.FrameSkeletonArrayLength];
                    this.KinectDevice.SkeletonFrameReady += KinectDevice_SkeletonFrameReady;
                    this._KinectDevice.Start();
                }
            }
        }
    }
}
#endregion Properties

Take note of the _FrameSkeletons array and how the array memory is allocated during stream initialization. The number of skeletons tracked by Kinect is constant. This allows us to create the array once and use it throughout the life of the application. Conveniently, the SDK defines a constant for the array size on the SkeletonStream. The code in Listing 4-1 also defines an array of brushes. These brushes will be used to color the lines connecting skeleton joints. You are welcome to customize the brush colors to be your favorite colors instead of the ones in the code listing.

The code in Listing 4-2 shows the event handler for the SkeletonFrameReady event. Each time the event handler executes, it retrieves the current frame by calling the OpenSkeletonFrame method on the event argument parameter. The remaining code iterates over the frame's array of Skeleton objects and draws lines on the UI that connect the skeleton joints. This creates a stick figure for each skeleton. The UI for our application is simple. It is only a Grid element named "LayoutRoot" with the background set to white.
Listing 4-2. Producing Stick Figures

private void KinectDevice_SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
    using(SkeletonFrame frame = e.OpenSkeletonFrame())
    {
        if(frame != null)
        {
            Polyline figure;
            Brush userBrush;
            Skeleton skeleton;
            JointType[] joints;

            LayoutRoot.Children.Clear();
            frame.CopySkeletonDataTo(this._FrameSkeletons);

            for(int i = 0; i < this._FrameSkeletons.Length; i++)
            {
                skeleton = this._FrameSkeletons[i];

                if(skeleton.TrackingState == SkeletonTrackingState.Tracked)
                {
                    userBrush = this._SkeletonBrushes[i % this._SkeletonBrushes.Length];

                    //Draws the skeleton's head and torso
                    joints = new [] { JointType.Head, JointType.ShoulderCenter,
                                      JointType.ShoulderLeft, JointType.Spine,
                                      JointType.ShoulderRight, JointType.ShoulderCenter,
                                      JointType.HipCenter, JointType.HipLeft, JointType.Spine,
                                      JointType.HipRight, JointType.HipCenter };
                    LayoutRoot.Children.Add(CreateFigure(skeleton, userBrush, joints));

                    //Draws the skeleton's left leg
                    joints = new [] { JointType.HipLeft, JointType.KneeLeft,
                                      JointType.AnkleLeft, JointType.FootLeft };
                    LayoutRoot.Children.Add(CreateFigure(skeleton, userBrush, joints));

                    //Draws the skeleton's right leg
                    joints = new [] { JointType.HipRight, JointType.KneeRight,
                                      JointType.AnkleRight, JointType.FootRight };
                    LayoutRoot.Children.Add(CreateFigure(skeleton, userBrush, joints));

                    //Draws the skeleton's left arm
                    joints = new [] { JointType.ShoulderLeft, JointType.ElbowLeft,
                                      JointType.WristLeft, JointType.HandLeft };
                    LayoutRoot.Children.Add(CreateFigure(skeleton, userBrush, joints));

                    //Draws the skeleton's right arm
                    joints = new [] { JointType.ShoulderRight, JointType.ElbowRight,
                                      JointType.WristRight, JointType.HandRight };
                    LayoutRoot.Children.Add(CreateFigure(skeleton, userBrush, joints));
                }
            }
        }
    }
}
Each time we process a skeleton, our first step is to determine if we have an actual skeleton. One way to do this is with the TrackingState property of the skeleton. Only those users actively tracked by the skeleton tracking engine are drawn. We ignore processing any skeletons that are not tracking a user (that is, the TrackingState is not equal to SkeletonTrackingState.Tracked). While Kinect can detect up to six users, it only tracks joint positions for two. We explore this and the TrackingState property in greater depth later in the chapter.

The processing performed on the skeleton data is simple. We select a brush to color the stick figure based on the position of the player in the collection. Next, we draw the stick figure. You can find the methods that create the actual UI elements in Listing 4-3. The CreateFigure method draws the skeleton stick figure for a single skeleton object. The GetJointPoint method is critical to drawing the stick figure. This method takes the position vector of the joint and calls the MapSkeletonPointToDepth method on the KinectSensor instance to convert the skeleton coordinates to the depth image coordinates. Later in the chapter, we discuss why this conversion is necessary and define the coordinate systems involved. At this point, the simple explanation is that the skeleton coordinates are not the same as depth or video image coordinates, or even the UI coordinates. The conversion of coordinate systems or scaling from one coordinate system to another is quite common when building Kinect applications. The central takeaway is the GetJointPoint method converts the skeleton joint from the skeleton coordinate system into the UI coordinate system and returns a point in the UI where the joint should be.

Listing 4-3. Drawing Skeleton Joints

private Polyline CreateFigure(Skeleton skeleton, Brush brush, JointType[] joints)
{
    Polyline figure = new Polyline();
    figure.StrokeThickness = 8;
    figure.Stroke = brush;

    for(int i = 0; i < joints.Length; i++)
    {
        figure.Points.Add(GetJointPoint(skeleton.Joints[joints[i]]));
    }

    return figure;
}

private Point GetJointPoint(Joint joint)
{
    DepthImagePoint point = this.KinectDevice.MapSkeletonPointToDepth(joint.Position,
                                             this.KinectDevice.DepthStream.Format);
    point.X = (int) (point.X * this.LayoutRoot.ActualWidth / this.KinectDevice.DepthStream.FrameWidth);
    point.Y = (int) (point.Y * this.LayoutRoot.ActualHeight / this.KinectDevice.DepthStream.FrameHeight);

    return new Point(point.X, point.Y);
}
It is also important to point out we are discarding the Z value. It seems a waste for Kinect to do a bunch of work to produce a depth value for every joint and then for us to not use this data. In actuality, we are using the Z value, but not explicitly. It is just not used in the user interface. The coordinate space conversion requires the depth value. Test this yourself by calling the MapSkeletonPointToDepth method, passing in the X and Y values of the joint, and setting the Z value to zero. The outcome is that the depth X and depth Y values always return as 0. As an additional exercise, use the depth value to apply a ScaleTransform to the skeleton figures based on the Z value. The scale values are inversely proportional to the depth value. This means that the smaller the depth value, the larger the scale value, so that the closer a user is to the Kinect, the larger the skeleton. A sketch of this exercise follows below.

Compile and run the project. The output should be similar to that shown in Figure 4-1. The application displays a colored stick figure for each skeleton. Grab a friend and watch it track your movements. Study how the skeleton tracking reacts to your movements. Try different poses and gestures. Move to where only half of your body is in Kinect's view area and note how the skeleton drawings become jumbled. Walk in and out of the view area and notice that the skeleton colors change. The sometimes-strange behavior of the skeleton drawings becomes clear in the next section where we explore the skeleton API in greater depth.
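The following is a minimal sketch of the scale exercise, not part of the original project code. It assumes a Polyline named figure (as produced by CreateFigure) and a tracked Skeleton named skeleton; the 2.0f reference depth is an arbitrary illustration value.

//Hypothetical helper: scale a stick figure inversely to the skeleton's depth.
private void ApplyDepthScale(Polyline figure, Skeleton skeleton)
{
    const float referenceDepth = 2.0f; //meters; illustrative only
    float depth = skeleton.Position.Z;

    if(depth > 0f)
    {
        //Smaller depth values (closer users) produce larger scale values.
        double scale = referenceDepth / depth;
        figure.RenderTransform = new ScaleTransform(scale, scale);
    }
}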
Figure 4-1. Stick figures generated from skeleton data
The Skeleton Object Model

There are more objects, structures, and enumerations associated with skeleton tracking than any other feature of the SDK. In fact, skeleton tracking accounts for over a third of the entire SDK. Skeleton tracking is obviously a significant component. Figure 4-2 illustrates the primary elements of skeleton tracking. There are four major elements (SkeletonStream, SkeletonFrame, Skeleton and Joint) and several supporting parts. The subsections to follow describe the major objects and structures in detail.
Figure 4-2. Skeleton object model
SkeletonStream

The SkeletonStream generates SkeletonFrame objects. Retrieving frame data from the SkeletonStream is similar to the ColorStream and DepthStream. Applications retrieve skeleton data either from the SkeletonFrameReady event, the AllFramesReady event, or from the OpenNextFrame method. Note that calling the OpenNextFrame method after subscribing to the SkeletonFrameReady event of the KinectSensor object results in an InvalidOperationException exception.
Enabling and Disabling

The SkeletonStream does not produce any data until enabled. By default, it is disabled. To activate the SkeletonStream so that it begins generating data, call its Enable method. There is also a method named Disable, which suspends the production of skeleton data. The SkeletonStream object also has a property named IsEnabled, which describes the current state of skeleton data production. The SkeletonFrameReady event on a KinectSensor object does not fire until the SkeletonStream is enabled. If choosing to employ a polling architecture, the SkeletonStream must be enabled before calling the OpenNextFrame method; otherwise, the call throws an InvalidOperationException exception.

In most applications, once enabled, the SkeletonStream is unlikely to be disabled during the lifetime of the application. However, there are instances where it is desirable to disable the stream. One example is when using multiple Kinects in an application, an advanced topic that is not covered in this book. Note that only one Kinect can report skeleton data for each process. This means that even with multiple Kinects running, the application is still limited to two skeletons. The application must then choose on which Kinect to enable skeleton tracking. During application execution, it is then possible to change which Kinect is actively tracking skeletons by disabling the SkeletonStream of one Kinect and enabling that of the other.

Another reason to disable skeleton data production is for performance. Skeleton processing is an expensive operation. This is obvious by watching the CPU usage of an application with skeleton tracking enabled. To see this for yourself, open Windows Task Manager and run the stick figure application from the first section of this chapter. The stick figure does very little yet it has a relatively high CPU usage. This is a result of skeleton tracking. Disabling skeleton tracking is useful when your application does not need skeleton data, and in some instances, disabling skeleton tracking may be necessary. For example, in a game some event might trigger a complex animation or cutscene video. For the duration of the animation or video sequence skeleton data is not needed. Disabling skeleton tracking might also be necessary to ensure a smooth animation sequence or video playback.

There is a side effect to disabling the SkeletonStream. All stream data production stops and restarts when the SkeletonStream changes state. This is not the case when the color or depth streams are disabled. A change in the SkeletonStream state causes the sensor to reinitialize. This process resets the TimeStamp and FrameNumber of all frames to zero. There is also a slight lag when the sensor reinitializes, but it is only a few milliseconds.
Smoothing

As you gain experience working with skeletal data, you notice skeleton movement is often jumpy. There are several possible causes of this, ranging from poor application performance to a user's behavior (many factors can cause a person to shake or not move smoothly), to the performance of the Kinect hardware. The variance of a joint's position can be relatively large from frame to frame, which can negatively affect an application in different ways. In addition to creating an awkward user experience
and being aesthetically displeasing, it is confusing to users when their avatar or hand cursor appears shaky or, worse, convulsive.

The SkeletonStream has a way to solve this problem by normalizing position values, reducing the variance in joint positions from frame to frame. Enable smoothing by calling the overloaded Enable method and passing in a TransformSmoothParameters structure. For reference, the SkeletonStream has two read-only properties for smoothing named IsSmoothingEnabled and SmoothParameters. The IsSmoothingEnabled property is set to true when the stream is enabled with a TransformSmoothParameters and false when the default Enable method is used. The SmoothParameters property stores the defined smoothing parameters. The TransformSmoothParameters structure defines these properties:
• Correction – Takes a float ranging from 0 to 1.0. The lower the number, the more correction is applied.
• JitterRadius – Sets the radius of correction. If a joint position "jitters" outside of the set radius, it is corrected to be at the radius. The property is a float value measured in meters.
• MaxDeviationRadius – Use this setting in conjunction with the JitterRadius setting to determine the outer bounds of the jitter radius. Any point that falls outside of this radius is not considered a jitter, but a valid new position. The property is a float value measured in meters.
• Prediction – Returns the number of frames predicted.
• Smoothing – Determines the amount of smoothing applied while processing skeletal frames. It is a float type with a range of 0 to 1.0. The higher the value, the more smoothing applied. A zero value does not alter the skeleton data.
Smoothing skeleton jitters comes at a cost. The more smoothing applied, the more adversely it affects the application's performance. Setting the smoothing parameters is more of an art than a science. There is not a single or best set of smoothing values. You have to test and tweak your application during development and user testing to see what values work best. It is likely that your application uses multiple smoothing settings at different points in the application's execution. The snippet below shows one way to enable smoothing.
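As a minimal sketch, assuming an initialized KinectSensor named kinectSensor, enabling smoothing looks like the following. The parameter values are illustrative starting points only, not recommendations.

//Illustrative values only; tune these against your own application.
TransformSmoothParameters smoothingParams = new TransformSmoothParameters
{
    Smoothing          = 0.5f,   //0 leaves the data unaltered; values near 1.0 smooth heavily
    Correction         = 0.5f,
    Prediction         = 0.5f,
    JitterRadius       = 0.05f,  //meters
    MaxDeviationRadius = 0.04f   //meters
};

kinectSensor.SkeletonStream.Enable(smoothingParams);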
Note The SDK uses the Holt Double Exponential Smoothing procedure to reduce the jitters from skeletal joint data. Exponential smoothing applies to data generated in relation to time, which is called time series data. Skeleton data is time series data, because the skeleton engine generates a frame of skeleton data for some interval of time.(1) This smoothing process uses statistical analysis to create a moving average, which reduces the noise or extremes from the data set. This type of data processing was originally applied to financial market and economic data forecasting.(2)
1. Wikipedia, "Exponential smoothing," http://en.wikipedia.org/wiki/Exponential_smoothing, 2011.
2. Paul Goodwin, "The Holt-Winters Approach to Exponential Smoothing: 50 Years Old and Going Strong," http://forecasters.org/pdfs/foresight/free/Issue19_goodwin.pdf, Spring 2010.
Choosing Skeletons

By default, the skeleton engine selects which available skeletons to actively track. The skeleton engine chooses the first two skeletons available for tracking, which is not always desirable, largely because the selection process is unpredictable. If you so choose, you have the option to select which skeletons to track using the AppChoosesSkeletons property and ChooseSkeletons method. The AppChoosesSkeletons property is false by default and so the skeleton engine selects skeletons for tracking. To manually select which skeletons to track, set the AppChoosesSkeletons property to true and call the ChooseSkeletons method passing in the TrackingIDs of the skeletons you want to track. The ChooseSkeletons method accepts one, two, or no TrackingIDs. The skeleton engine stops tracking all skeletons when the ChooseSkeletons method is passed no parameters. A short sketch of these calls follows the list below. There are some nuances to selecting skeletons:
• A call to ChooseSkeletons when AppChoosesSkeletons is false results in an InvalidOperationException exception.
• If AppChoosesSkeletons is set to true before the SkeletonStream is enabled, no skeletons are actively tracked until manually selected by calling ChooseSkeletons.
• Skeletons automatically selected for tracking before AppChoosesSkeletons is set to true continue to be actively tracked until the skeleton leaves the scene or is manually replaced. If the automatically selected skeleton leaves the scene, it is not automatically replaced.
• Any skeletons manually chosen for tracking continue to be tracked after AppChoosesSkeletons is set to false until the skeleton leaves the scene. It is at this point that the skeleton engine selects another skeleton, if any are available.
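The following is a minimal sketch of manual selection, assuming an initialized KinectSensor named kinectSensor and a trackingId taken from a Skeleton object you previously inspected.

//Take over skeleton selection from the engine.
kinectSensor.SkeletonStream.AppChoosesSkeletons = true;

//Track one specific user; a second trackingId could be passed as well.
kinectSensor.SkeletonStream.ChooseSkeletons(trackingId);

//Passing no arguments stops tracking all skeletons.
kinectSensor.SkeletonStream.ChooseSkeletons();

//Return selection control to the skeleton engine.
kinectSensor.SkeletonStream.AppChoosesSkeletons = false;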
SkeletonFrame

The SkeletonStream produces SkeletonFrame objects. When using the event model, the application retrieves a SkeletonFrame object from the event arguments by calling the OpenSkeletonFrame method, or from the OpenNextFrame method on the SkeletonStream when polling. The SkeletonFrame object holds skeleton data for a moment in time. The frame's skeleton data is available by calling the CopySkeletonDataTo method. This method populates an array passed to it with skeleton data. The SkeletonFrame has a property named SkeletonArrayLength, which gives the number of skeletons it holds data for. The array always returns fully populated even when there are no users in the Kinect's view area.
Marking Time

The FrameNumber and Timestamp fields mark the moment in time in which the frame was recorded. FrameNumber is an integer that is the frame number of the depth image used to generate the skeleton frame. The frame numbers are not always sequential, but each frame number will always be greater than that of the previous frame. It is possible for the skeleton engine to skip depth frames during execution. The reasons for this vary based on overall application performance and frame rate. For example, long running processes within any of the stream event handlers can slow processing. If an application uses polling instead of the event model, it is dependent on the application to determine how frequently the skeleton engine generates data, and effectively from which depth frame skeleton data derives.

The Timestamp field is the number of milliseconds since the KinectSensor was initialized. You do not need to worry about long running applications reaching the maximum FrameNumber or Timestamp. The FrameNumber is a 32-bit integer whereas the Timestamp is a 64-bit integer. Your application would have to
run continuously at 30 frames per second for just over two and a quarter years before reaching the FrameNumber maximum, and this would be way before the Timestamp was close to its ceiling. Additionally, the FrameNumber and Timestamp start over at zero each time the KinectSensor is initialized. You can rely on the FrameNumber and Timestamp values to be unique.

At this stage in the lifecycle of the SDK and overall Kinect development, these fields are important as they are used to process or analyze frames, for instance when smoothing joint values. Gesture processing is another example, and the most common, of using this data to sequence frame data. The current version of the SDK does not include a gesture engine. Until a future version of the SDK includes gesture tracking, developers have to code their own gesture recognition algorithms, which may depend on knowing the sequence of skeleton frames.
Frame Descriptors
The FloorClipPlane field is a 4-tuple (Tuple<float, float, float, float>) with each element being a coefficient of the floor plane. The general equation for the floor plane is Ax + By + Cz + D = 0, which means the first tuple element corresponds to A, the second to B, and so on. The D variable in the floor plane equation is always the negative of the height of Kinect in meters from the floor. When possible, the SDK uses image-processing techniques to determine the exact coefficient values; however, this is not always possible, and the values have to be estimated. The FloorClipPlane is a zero plane (all elements have a value of zero) when the floor is undeterminable.
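A minimal sketch of reading the floor plane, assuming a SkeletonFrame named frame; the variable names are illustrative. Per the equation above, the sensor height is recovered by negating D.

Tuple<float, float, float, float> floor = frame.FloorClipPlane;
float a = floor.Item1;
float b = floor.Item2;
float c = floor.Item3;
float d = floor.Item4;

//An all-zero plane means the floor could not be determined.
bool floorDetected = (a != 0f || b != 0f || c != 0f || d != 0f);

if(floorDetected)
{
    //D is the negative of the sensor's height above the floor, in meters.
    float sensorHeightInMeters = -d;
}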
Skeleton

The Skeleton class defines a set of fields to identify the skeleton, describe the position of the skeleton, and possibly the positions of the skeleton's joints. Skeleton objects are available by passing an array to the CopySkeletonDataTo method on a SkeletonFrame. The CopySkeletonDataTo method has an unexpected behavior, which may affect memory usage and references to Skeleton objects. The Skeleton objects returned are unique to the array and not to the application. Take the following code snippet:

Skeleton[] skeletonsA = new Skeleton[frame.SkeletonArrayLength];
Skeleton[] skeletonsB = new Skeleton[frame.SkeletonArrayLength];

frame.CopySkeletonDataTo(skeletonsA);
frame.CopySkeletonDataTo(skeletonsB);

bool resultA = skeletonsA[0] == skeletonsB[0];                       //This is false
bool resultB = skeletonsA[0].TrackingId == skeletonsB[0].TrackingId; //This is true

The Skeleton objects in the arrays are not the same. The data is the same, but there are two unique instances of the objects. The CopySkeletonDataTo method creates a new Skeleton object for each null slot in the array. However, if the array slot is not null, it updates the data on the existing Skeleton object.
TrackingID

The skeleton tracking engine assigns each skeleton a unique identifier. This identifier is an integer, which incrementally grows with each new skeleton. Do not expect the value assigned to the next new skeleton to grow sequentially, but rather the next value will be greater than the previous. Additionally, the next assigned value is not predictable. If the skeleton engine loses the ability to track a user (for example, the user walks out of view), the tracking identifier for that skeleton is retired. When Kinect detects a new skeleton, it assigns a new tracking identifier. A tracking identifier of zero means that the
Skeleton object is not representing a user, but is just a placeholder in the collection. Think of it as a null skeleton. Applications use the TrackingID to specify which skeletons the skeleton engine should actively track. Call the ChooseSkeletons method on the SkeletonStream object to initiate the tracking of a specific skeleton.
TrackingState

This field provides insight into what skeleton data is available, if any at all. Table 4-2 lists all values of the SkeletonTrackingState enumeration.

Table 4-2. SkeletonTrackingState Values

NotTracked – The Skeleton object does not represent a tracked user. The Position field of the Skeleton and every Joint in the joints collection is a zero point (a SkeletonPoint where the X, Y and Z values all equal zero).

PositionOnly – The skeleton is detected, but is not actively being tracked. The Position field has a non-zero point, but the position of each Joint in the joints collection is a zero point.

Tracked – The skeleton is actively being tracked. The Position field and all Joint objects in the joints collection have non-zero points.
Position

The Position field is of type SkeletonPoint and is the center of mass of the skeleton. The center of mass is roughly the same position as the spine joint. This field provides a fast and simple means of determining a user's position whether or not the user is actively being tracked. In some applications, this value is sufficient and the positions of specific joints are unnecessary. This value can also serve as criteria for manually selecting skeletons (SkeletonStream.ChooseSkeletons) to track. For example, an application may want to actively track the two skeletons closest to Kinect, as the sketch below does.
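A minimal sketch of that selection, assuming AppChoosesSkeletons is true, an initialized KinectSensor named kinectSensor, and an array named frameSkeletons populated by CopySkeletonDataTo. It requires a using directive for System.Linq.

//Choose the two detected skeletons with the smallest depth (closest to the sensor).
int[] closestTwo = frameSkeletons
    .Where(s => s.TrackingState != SkeletonTrackingState.NotTracked)
    .OrderBy(s => s.Position.Z)
    .Take(2)
    .Select(s => s.TrackingId)
    .ToArray();

if(closestTwo.Length == 2)
{
    kinectSensor.SkeletonStream.ChooseSkeletons(closestTwo[0], closestTwo[1]);
}
else if(closestTwo.Length == 1)
{
    kinectSensor.SkeletonStream.ChooseSkeletons(closestTwo[0]);
}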
ClippedEdges

The ClippedEdges field describes which parts of the skeleton are out of the Kinect's view. This provides a macro insight into the skeleton's position. Use it to adjust the elevation angle programmatically or to message users to reposition themselves in the center of the view area. The property is of type FrameEdges, which is an enumeration decorated with the FlagsAttribute attribute. This means the ClippedEdges field could have one or more FrameEdges values. Refer to Appendix A for more information about working with bit fields such as this one. Table 4-3 lists the possible FrameEdges values.
Table 4-3. FrameEdges Values

Bottom – The user has one or more body parts below Kinect's field of view.

Left – The user has one or more body parts off the left edge of Kinect's field of view.

Right – The user has one or more body parts off the right edge of Kinect's field of view.

Top – The user has one or more body parts above Kinect's field of view.

None – The user is completely in view of the Kinect.
It is possible to improve the quality of skeleton data if any part of the user's body is out of the view area. The easiest solution is to present the user with a message asking them to adjust their position until the clipping is either resolved or in an acceptable state. For example, an application may not be concerned that the user is bottom clipped, but messages the user if they become clipped on the left or right. The other solution is to physically adjust the tilt of the Kinect. Kinect has a built-in motor that tilts the camera head up and down. The angle of the tilt is adjustable by changing the value of the ElevationAngle property on the KinectSensor object. If an application is more concerned with the feet joints of the skeleton, it needs to ensure the user is not bottom clipped. Adjusting the tilt angle of the sensor helps keep the user's bottom joints in view.

The ElevationAngle is measured in degrees. The KinectSensor object properties MinElevationAngle and MaxElevationAngle define the value range. Any attempt to set the angle value outside of these values results in an ArgumentOutOfRangeException exception. Microsoft warns not to change the tilt angle repeatedly as it may wear out the tilt motor. To help save developers from mistakes and to help preserve the motor, the SDK limits the number of value changes to one per second. Further, it enforces a 20-second break after 15 consecutive changes, in that the SDK will not honor any value changes for 20 seconds following the 15th change. The sketch below shows one way to react to clipping.
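A minimal sketch, assuming a tracked Skeleton named skeleton and an initialized KinectSensor named kinectSensor; the five-degree step is an arbitrary illustration value. Given the tilt-motor limits just described, code like this should run sparingly.

if(skeleton.ClippedEdges.HasFlag(FrameEdges.Top))
{
    //Tilt up, but never beyond the supported range.
    kinectSensor.ElevationAngle = Math.Min(kinectSensor.ElevationAngle + 5,
                                           kinectSensor.MaxElevationAngle);
}
else if(skeleton.ClippedEdges.HasFlag(FrameEdges.Bottom))
{
    //Tilt down, but never beyond the supported range.
    kinectSensor.ElevationAngle = Math.Max(kinectSensor.ElevationAngle - 5,
                                           kinectSensor.MinElevationAngle);
}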
Joints

Each Skeleton object has a property named Joints. This property is of type JointsCollection and contains a set of Joint structures that describe the trackable joints (head, hands, elbow and others) of a skeleton. An application references specific joints by using the indexer on the JointsCollection, where the identifier is a value from the JointType enumeration, as in the snippet below. The JointsCollection is always fully populated and returns a Joint structure for any JointType even when there are no users in view.
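For illustration, assuming a Skeleton named skeleton, a joint lookup through the indexer looks like this:

//Look up the head joint and read its position.
Joint head = skeleton.Joints[JointType.Head];
SkeletonPoint headPosition = head.Position;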
Joint

The skeleton tracking engine follows and reports on twenty points, or joints, on each user. The Joint structure represents the tracking data with three properties. The JointType property of the Joint is a value from the JointType enumeration. Figure 4-3 illustrates all trackable joints.
Figure 4-3. Illustrated skeleton joints

Each joint has a Position, which is of type SkeletonPoint, that reports the X, Y, and Z of the joint. The X and Y values are relative to the skeleton space, which is not the same as the depth or video space. The KinectSensor has a set of methods (described later in the Space and Transforms section) that convert skeleton points to depth points. Finally, there is the JointTrackingState property, which describes if the joint is being tracked and how. Table 4-4 lists the different tracking states.
Table 4-4. JointTrackingState Values

Inferred – The skeleton engine cannot see the joint in the depth frame pixels, but has made a calculated determination of the position of the joint.

NotTracked – The position of the joint is indeterminable. The Position value is a zero point.

Tracked – The joint is detected and actively followed.
Kinect the Dots

Going through basic exercises that illustrate parts of a larger concept is one thing. Building fully functional, usable applications is another. We dive into using the skeleton engine by building a game called Kinect the Dots. Every child grows up with coloring books and connect-the-dots drawing books. A child takes a crayon and draws a line from one dot to another in a specific sequence. A number next to each dot defines the sequence. We will build this game, but instead of using a crayon, the children (in all of us) use their hands.

This obviously is not an action-packed, first-person shooter or MMO that will consume the tech hipsters or hardcore readers of Slashdot or TechCrunch, but it is perfect for our purposes. We want to create a real application with the skeleton engine that uses joint data for some other use besides rendering to the UI. This game presents opportunities to introduce Natural User Interface (NUI) design concepts, and a means of developing a common Kinect user interface: hand tracking. Kinect the Dots is an application we can build with no production assets (images, animations, hi-fidelity designs), but instead using only core WPF drawing tools. However, with a little production effort on your part, it can be a fully polished application. And, yes, while you and I may not derive great entertainment from it, grab a son, daughter, little brother, sister, niece, or nephew, and watch how much fun they have.

Before coding any project, we need to define our feature set. Kinect the Dots is a puzzle game where a user draws an image by following a sequence of dots. Immediately, we can identify an entity object: a puzzle. Each puzzle consists of a series of dots or points. The order of the points defines the sequence. We will create a class named DotPuzzle that has a collection of Point objects. Initially, it might seem unnecessary to have the class (why can't we just have an array member variable of points?), but later it will make adding new features easy. The application uses the puzzle points in two ways, the first being to draw the dots on the screen. The second is to detect when a user makes contact with a dot.

When a user makes contact with a dot, the application begins drawing a line, with the starting point of the line anchored at the dot. The end-point of the line follows the user's hand until the hand makes contact with the next dot in the sequence. The next sequential dot becomes the anchor for the end-point, and a new line starts. This continues until the user draws a line from the last point back to the first point. The puzzle is then complete and the game is over.

With the rules of the game defined, we are ready to code. As we progress through the project, we will find new ideas for features and add them as needed. Start by creating a new WPF project and referencing the SDK library Microsoft.Kinect.dll. Add code to detect and initialize a single sensor. This should include subscribing to the SkeletonFrameReady event.
The User Interface

The code in Listing 4-4 is the XAML for the project, and there are a couple of important observations to note. The Polyline element renders the line drawn from dot to dot. As the user moves his hand from dot to dot, the application adds points to the line. The PuzzleBoardElement Canvas element holds the UI elements for the dots. The order of the UI elements in the LayoutRoot Grid is intentional. We use a layered approach so that our hand cursor, represented by the Image element, is always in front of the dots and lines. The other reason for putting these UI elements in their own container is that resetting the current puzzle or starting a new puzzle is easy. All we have to do is clear the child elements of the PuzzleBoardElement and the CrayonElement, and the other UI elements are unaffected.

The Viewbox and Grid elements are critical to the UI looking as the user expects it to. We know that the values of each skeletal joint are based in skeleton space. This means that we have to translate the joint vectors to be in our UI space. For this project, we will hard-code the UI space and not allow it to float based on the size of the UI window. The Grid element defines the UI space as 1920x1200. We are using the exact dimensions of 1920x1200, because that is a common full screen size; also, it is proportional to the depth image sizes of Kinect. This makes coordinate system transforms clearer and provides for smoother cursor movements.

Listing 4-4. XAML For Kinect the Dots
Having a hard-coded UI space also makes it easier on us, the developers. We want to make the process of translating from skeleton space to our UI space quick and easy with as few lines of code as possible. Further, reacting to window size changes adds more work for us that is not relevant to our main task. We can be lazy and let WPF do the scaling work for us by wrapping the Grid in a Viewbox control. The Viewbox scales its children based on their size in relation to the available size of the window.

The final UI element to point out is the Image. This element is the hand cursor. In this project, we use a simple image of a hand, but you can find your own image (it doesn't have to be shaped like a hand) or you can create some other UI element for the cursor, such as an Ellipse. The image in this example is a right hand. In the code that follows, we give the user the option of using his left or right hand. If the
user motions with his left hand, we flip the image so that it looks like a left hand using the ScaleTransform. The ScaleTransform helps to make the graphic look and feel right.
Hand Tracking

People interact with their hands, so knowing where and what the hands are doing is paramount to a successful and engaging Kinect application. The location and movements of the hands is the basis for virtually all gestures. Tracking the movements of the hands is the most common use of the data returned by Kinect. This is certainly the case with our application, as we ignore all other joints.

When drawing in a connect-the-dots book, a person normally draws with a pencil or crayon, using a single hand. One hand controls the crayon to draw lines from one dot to another. Our application replicates single-handed drawing with a crayon on a paper interface. This user interface is natural and already known to users. Further, it requires little instruction to play and the user quickly becomes immersed in the experience. As a result, the application inherently becomes more enjoyable. It is crucial to the success of any Kinect application that the application be as intuitive and non-invasive to the user's natural form of interaction as possible. Best of all, it requires minimal coding effort on our part.

Users naturally extend or reach their arms towards Kinect. Within this application, whichever hand is closest to Kinect (farthest from the user) becomes the drawing or primary hand. The user has the option of switching hands at any time in the game. This allows both lefties and righties to play the game comfortably. Coding the application to these features creates the crayon-on-paper analogy and satisfies our goal of creating a natural user interface. The code starts with Listing 4-5.

Listing 4-5. SkeletonFrameReady Event Handler

private void KinectDevice_SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
    using(SkeletonFrame frame = e.OpenSkeletonFrame())
    {
        if(frame != null)
        {
            frame.CopySkeletonDataTo(this._FrameSkeletons);
            Skeleton skeleton = GetPrimarySkeleton(this._FrameSkeletons);

            if(skeleton == null)
            {
                HandCursorElement.Visibility = Visibility.Collapsed;
            }
            else
            {
                Joint primaryHand = GetPrimaryHand(skeleton);
                TrackHand(primaryHand);
            }
        }
    }
}

private static Skeleton GetPrimarySkeleton(Skeleton[] skeletons)
{
    Skeleton skeleton = null;

    if(skeletons != null)
    {
        //Find the closest skeleton
        for(int i = 0; i < skeletons.Length; i++)
        {
            if(skeletons[i].TrackingState == SkeletonTrackingState.Tracked)
            {
                if(skeleton == null)
                {
                    skeleton = skeletons[i];
                }
                else
                {
                    if(skeleton.Position.Z > skeletons[i].Position.Z)
                    {
                        skeleton = skeletons[i];
                    }
                }
            }
        }
    }

    return skeleton;
}
Each time the event handler executes, we find the first valid skeleton. The application does not lock in on a single skeleton, because it does not need to track or follow a single user. If there are two visible users, the user closest to Kinect becomes the primary user. This is the function of the GetPrimarySkeleton method. If there are no detectable users, then the application hides the hand cursor; otherwise, we find the primary hand and update the hand cursor. The code for finding the primary hand is in Listing 4-6.

The primary hand is always the hand closest to Kinect. However, the code is not as simple as checking the Z value of the left and right hands and taking the lower value. Remember that a Z value of zero means the depth value is indeterminable. Because of this, we have to do more validation on the joints. Checking the TrackingState of each joint tells us the conditions under which the position data was calculated. The left hand is the default primary hand, for no other reason than that the author is left-handed. The right hand then has to be tracked explicitly (JointTrackingState.Tracked) or implicitly (JointTrackingState.Inferred) for us to consider it as a replacement for the left hand.
Tip When working with Joint data, always check the TrackingState. Not doing so often leads to unexpected position values, a misbehaving UI, or exceptions.
Listing 4-6. Getting the Primary Hand and Updating the Cursor Position

private static Joint GetPrimaryHand(Skeleton skeleton)
{
    Joint primaryHand = new Joint();

    if(skeleton != null)
    {
        primaryHand = skeleton.Joints[JointType.HandLeft];
        Joint rightHand = skeleton.Joints[JointType.HandRight];

        if(rightHand.TrackingState != JointTrackingState.NotTracked)
        {
            if(primaryHand.TrackingState == JointTrackingState.NotTracked)
            {
                primaryHand = rightHand;
            }
            else
            {
                if(primaryHand.Position.Z > rightHand.Position.Z)
                {
                    primaryHand = rightHand;
                }
            }
        }
    }

    return primaryHand;
}
With the primary hand known, the next action is to update the position of the hand cursor (see Listing 4-7). If the hand is not tracked, then the cursor is hidden. In more professionally finished applications, hiding the cursor might be done with a nice animation, such as a fade or zoom out. For this project, it suffices to simply set the Visibility property to Visibility.Collapsed. When tracking a hand, we ensure the cursor is visible, calculate the X, Y position of the hand in our UI space, update its screen position, and set the ScaleTransform (HandCursorScale) based on the hand being left or right. The calculation to determine the position of the cursor is interesting, and requires further examination. This code is similar to code we wrote in the stick figure example (Listing 4-3). We cover transformations later, but for now just know that the skeleton data is in another coordinate space than the UI elements, and we need to convert the position values from one coordinate space to another.

Listing 4-7. Updating the Position of the Hand Cursor

private void TrackHand(Joint hand)
{
    if(hand.TrackingState == JointTrackingState.NotTracked)
    {
        HandCursorElement.Visibility = System.Windows.Visibility.Collapsed;
    }
    else
    {
        HandCursorElement.Visibility = System.Windows.Visibility.Visible;

        DepthImagePoint point = this.KinectDevice.MapSkeletonPointToDepth(hand.Position,
                                                 DepthImageFormat.Resolution640x480Fps30);
        double x = (point.X * LayoutRoot.ActualWidth / this.KinectDevice.DepthStream.FrameWidth) -
                   (HandCursorElement.ActualWidth / 2.0);
        double y = (point.Y * LayoutRoot.ActualHeight / this.KinectDevice.DepthStream.FrameHeight) -
                   (HandCursorElement.ActualHeight / 2.0);

        Canvas.SetLeft(HandCursorElement, x);
        Canvas.SetTop(HandCursorElement, y);

        if(hand.JointType == JointType.HandRight)
        {
            HandCursorScale.ScaleX = 1;
        }
        else
        {
            HandCursorScale.ScaleX = -1;
        }
    }
}

At this point, compile and run the application. The application produces output similar to that shown in Figure 4-4, which shows a skeleton stick figure to better illustrate the hand movements. With hand tracking in place and functional, we move to the next phase of the project to begin gameplay implementation.
Figure 4-4. Hand tracking with a stick figure waving the left hand
Drawing the Puzzle

Listing 4-8 shows the DotPuzzle class. It is simple to the point that you may question its need, but it serves as a basis for later expansion. The primary function of this class is to hold a collection of points that compose the puzzle. The position of each point in the Dots collection determines its sequence in the puzzle. The composition of the class lends itself to serialization. Because it is easily serializable, we could expand our application to read puzzles from XML files.

Listing 4-8. DotPuzzle Class

public class DotPuzzle
{
    public DotPuzzle()
    {
        this.Dots = new List<Point>();
    }

    public List<Point> Dots { get; set; }
}
With the UI laid out and our primary entity object defined, we move on to creating our first puzzle. In the constructor of the MainWindow class, create a new instance of the DotPuzzle class and define some points. The code in bold in Listing 4-9 shows how this is done. The code listing also shows the member variables we will use for this application. The variable _PuzzleDotIndex is used to track the user's progress in solving the puzzle. We initially set the _PuzzleDotIndex variable to -1 to indicate that the user has not started the puzzle.

Listing 4-9. MainWindow Member Variables and Constructor

private DotPuzzle _Puzzle;
private int _PuzzleDotIndex;

public MainWindow()
{
    InitializeComponent();

    //Sample puzzle
    this._Puzzle = new DotPuzzle();
    this._Puzzle.Dots.Add(new Point(200, 300));
    this._Puzzle.Dots.Add(new Point(1600, 300));
    this._Puzzle.Dots.Add(new Point(1650, 400));
    this._Puzzle.Dots.Add(new Point(1600, 500));
    this._Puzzle.Dots.Add(new Point(1000, 500));
    this._Puzzle.Dots.Add(new Point(1000, 600));
    this._Puzzle.Dots.Add(new Point(1200, 700));
    this._Puzzle.Dots.Add(new Point(1150, 800));
    this._Puzzle.Dots.Add(new Point(750, 800));
    this._Puzzle.Dots.Add(new Point(700, 700));
    this._Puzzle.Dots.Add(new Point(900, 600));
    this._Puzzle.Dots.Add(new Point(900, 500));
    this._Puzzle.Dots.Add(new Point(200, 500));
    this._Puzzle.Dots.Add(new Point(150, 400));

    this._PuzzleDotIndex = -1;

    this.Loaded += MainWindow_Loaded;
}

The last step in completing the UI is to draw the puzzle points. This code is in Listing 4-10. We create a method named DrawPuzzle, which we call when the MainWindow loads (MainWindow_Loaded event handler). The DrawPuzzle method iterates over each dot in the puzzle, creating UI elements to represent the dot. It then places that dot in the PuzzleBoardElement. An alternative to building the UI in code is to build it in the XAML. We could have attached the DotPuzzle object to the ItemsSource property of an ItemsControl object. The ItemsControl's ItemTemplate property would then define the look and placement of each dot. This design is more elegant, because it allows for theming of the user interface. The design demonstrated in these pages was chosen to keep the focus on the Kinect code and not the general WPF code, as well as to reduce the number of lines of code in print. You are highly encouraged to refactor the code to leverage WPF's powerful data binding and styling systems and the ItemsControl.
Listing 4-10. Drawing the Puzzle

private void MainWindow_Loaded(object sender, RoutedEventArgs e)
{
    KinectSensor.KinectSensors.StatusChanged += KinectSensors_StatusChanged;
    this.KinectDevice = KinectSensor.KinectSensors.FirstOrDefault(x => x.Status == KinectStatus.Connected);

    DrawPuzzle(this._Puzzle);
}

private void DrawPuzzle(DotPuzzle puzzle)
{
    PuzzleBoardElement.Children.Clear();

    if(puzzle != null)
    {
        for(int i = 0; i < puzzle.Dots.Count; i++)
        {
            Grid dotContainer = new Grid();
            dotContainer.Width = 50;
            dotContainer.Height = 50;
            dotContainer.Children.Add(new Ellipse() { Fill = Brushes.Gray });

            TextBlock dotLabel = new TextBlock();
            dotLabel.Text = (i + 1).ToString();
            dotLabel.Foreground = Brushes.White;
            dotLabel.FontSize = 24;
            dotLabel.HorizontalAlignment = System.Windows.HorizontalAlignment.Center;
            dotLabel.VerticalAlignment = System.Windows.VerticalAlignment.Center;
            dotContainer.Children.Add(dotLabel);

            //Position the UI element centered on the dot point
            Canvas.SetTop(dotContainer, puzzle.Dots[i].Y - (dotContainer.Height / 2));
            Canvas.SetLeft(dotContainer, puzzle.Dots[i].X - (dotContainer.Width / 2));
            PuzzleBoardElement.Children.Add(dotContainer);
        }
    }
}
Solving the Puzzle

Up to this point, we have built a user interface and created a base infrastructure for puzzle data; we are visually tracking a user's hand movements and drawing puzzles based on that data. The final code to add draws lines from one dot to another. When the user moves her hand over a dot, we establish that dot as an anchor for the line. The line's end-point is wherever the user's hand is. As the user moves her hand around the screen, the line follows. The code for this functionality is in the TrackPuzzle method shown in Listing 4-11.
The majority of the code in this block is dedicated to drawing lines on the UI. The other parts enforce the rules of the game, such as following the correct sequence of the dots. However, one section of code does neither of these things. Its function is to make the application more user-friendly. The code calculates the length difference between the next dot in the sequence and the hand position, and checks to see if that distance is less than 25 pixels. The number 25 is arbitrary, but it is a good solid number for our UI. Kinect never reports completely smooth joint positions, even with smoothing parameters applied. Additionally, users rarely have a steady hand. Therefore, it is important for applications to have a hit zone larger than the actual UI element target. This is a design principle common in touch interfaces and applies to Kinect as well. If the user comes close to the hit zone, we give her credit for hitting the target.
Note The calculations to get the results stored in the point dotDiff and length are examples of vector math. This type of math can be quite common in Kinect applications. Grab an old grade school math book or use the built-in vector routines of .NET.
Finally, add one line of code to the SkeletonFrameReady event handler to call the TrackPuzzle method. The call fits perfectly right after the call to TrackHand in Listing 4-5. Adding this code completes the application. Compile. Run. Kinect the dots!
Listing 4-11. Drawing the Lines to Kinect the Dots

using Nui = Microsoft.Kinect;

private void TrackPuzzle(SkeletonPoint position)
{
    if(this._PuzzleDotIndex == this._Puzzle.Dots.Count)
    {
        //Do nothing - Game is over
    }
    else
    {
        Point dot;

        if(this._PuzzleDotIndex + 1 < this._Puzzle.Dots.Count)
        {
            dot = this._Puzzle.Dots[this._PuzzleDotIndex + 1];
        }
        else
        {
            dot = this._Puzzle.Dots[0];
        }

        DepthImagePoint point = this.KinectDevice.MapSkeletonPointToDepth(position,
                                                  DepthImageFormat.Resolution640x480Fps30);
        point.X = (int) (point.X * LayoutRoot.ActualWidth / this.KinectDevice.DepthStream.FrameWidth);
        point.Y = (int) (point.Y * LayoutRoot.ActualHeight / this.KinectDevice.DepthStream.FrameHeight);
        Point handPoint = new Point(point.X, point.Y);

        //Calculate the length between the two points. This can be done manually
        //as shown here or by using the System.Windows.Vector object to get the length.
        //System.Windows.Media.Media3D.Vector3D is available for 3D vector math.
        Point dotDiff = new Point(dot.X - handPoint.X, dot.Y - handPoint.Y);
        double length = Math.Sqrt(dotDiff.X * dotDiff.X + dotDiff.Y * dotDiff.Y);

        int lastPoint = this.CrayonElement.Points.Count - 1;

        if(length < 25)
        {
            //Cursor is within the hit zone
            if(lastPoint > 0)
            {
                //Remove the working end point
                this.CrayonElement.Points.RemoveAt(lastPoint);
            }

            //Set line end point
            this.CrayonElement.Points.Add(new Point(dot.X, dot.Y));

            //Set new line start point
            this.CrayonElement.Points.Add(new Point(dot.X, dot.Y));

            //Move to the next dot
            this._PuzzleDotIndex++;

            if(this._PuzzleDotIndex == this._Puzzle.Dots.Count)
            {
                //Notify the user that the game is over
            }
        }
        else
        {
            if(lastPoint > 0)
            {
                //To refresh the Polyline visual you must remove the last point,
                //update and add it back.
                Point lineEndpoint = this.CrayonElement.Points[lastPoint];
                this.CrayonElement.Points.RemoveAt(lastPoint);
                lineEndpoint.X = handPoint.X;
                lineEndpoint.Y = handPoint.Y;
                this.CrayonElement.Points.Add(lineEndpoint);
            }
        }
    }
}
Successfully building and running the application yields results like that of Figure 4-5. The hand cursor tracks the user's movements and begins drawing connecting lines after making contact with the next dot in the sequence. The application validates to ensure the dot connections are made in sequence. For example, if the user in Figure 4-5 moves their hand to dot 11, the application does not create a connection. This concludes the project and successfully demonstrates basic skeleton processing in a fun way. Read the next section for ideas on expanding Kinect the Dots to take it beyond a simple walkthrough-level project.
Figure 4-5. Kinect the Dots in action
Expanding the Game

Kinect the Dots is functionally complete. A user can start the application and move his hands around to solve the puzzle. However, it is far from a polished application. It needs some fit-and-finish. The most obvious tweak is to add smoothing. You should have noticed that the hand cursor is jumpy. The second most obvious feature to add is a way to reset the puzzle. As it stands, once the user solves the puzzle there is nothing more to do but kill the application, and that's no fun! Your users are chanting "More! More! More!"

One option is to create a hotspot in the upper left corner and label it "Reset." When the user's hand enters this area, the application resets the puzzle by setting _PuzzleDotIndex to -1 and clearing the points from the CrayonElement. It would be smart to create a private method named ResetPuzzle that does this work. This makes the reset code more reusable.

Here are more features you are highly encouraged to add to the game to make it a complete experience:
• Create more puzzles! Make the application smarter so that when it initially loads it reads a collection of puzzles from an XML file. Then randomly present the user with a puzzle. Or…
• Give the user the option to select which puzzle she wants to solve. The user selects a puzzle and it draws. At this point, this is an advanced feature to add to the application. It requires the user to select an option from a list. If you are ambitious, go for it! After reading the chapter on gestures, this will be easy. A quick solution is to build the menu so that it works with touch or a mouse. Kinect and touch work very well together.
• Advance the user to a new puzzle once she has completed the current puzzle.
• Add extra data, such as a title and background image, to each puzzle. Display the title at the top of the screen. For example, if the puzzle is a fish, the background can be an underwater scene with a sea floor, a starfish, mermaids, and other fish. The background image would be defined as a property on the DotPuzzle object.
• Add automated user assistance to help users struggling to find the next dot. In the code, start a timer when the user connects with a dot. Each time the user connects with a dot, reset the timer. If the timer goes off, it means the user is having trouble finding the next dot. At this point, the application displays a pop-up message pointing to the next dot.
• Reset the puzzle when a user leaves the game. Suppose a user has to leave the game for some reason to answer the phone, get a drink of water, or go for a bathroom break. When Kinect no longer detects any users, start a timer. When the timer expires, reset the puzzle. You did remember to put the reset code in its own method so that it can be called from multiple places, right?
• Reward the user for completing a puzzle by playing an exciting animation. You could even have an animation each time the user successfully connects a dot. The dot-for-dot animations should be subtle yet rewarding. You do not want them to annoy the user.
• After solving a puzzle, let the user color the screen. Give him the option of selecting a color from a metaphoric color box. After selecting a color, wherever his hand goes on the screen the application draws color.
Space and Transforms

In each of the example projects in this chapter, we processed and manipulated the Position point of Joints. In almost all circumstances, this data is unusable when raw. Skeleton points are measured differently from depth or video data. Each individual set of data (skeleton, depth, and video) is defined within a specific geometric coordinate plane or space. Depth and video space are measured in pixels, with the zero X and Y positions being at the upper left corner. The Z dimension in the depth space is measured in millimeters. However, the skeleton space is measured in meters with the zero X and Y positions at the center of the depth sensor. Skeleton space uses a right-handed coordinate space where the positive values of the X-axis extend to the right and the positive Y-axis extends upward. The X-axis ranges from -2.2 to 2.2 (7.22 feet) for a total span of 4.4 meters or 14.44 feet; the Y-axis ranges from -1.6 to 1.6 (5.25 feet); and the Z-axis from 0 to 4 (13.12 feet). Figure 4-6 illustrates the coordinate space of the skeleton stream.
Figure 4-6. Skeleton space
Space Transformations

Kinect experiences are all about the user interacting with a virtual space. The more interactions an application makes possible, the more engaging and entertaining the experience. We want users to swing virtual rackets, throw virtual bowling balls, push buttons, and swipe through menus. In the case of Kinect the Dots, we want the user to move her hand within range of a dot. For us to know that a user is connecting one dot to another, we have to know that the user's hand is over a dot. This determination is only possible by transforming the skeleton data to our visual UI space. Since the SDK does not give skeleton data to us in a form directly usable with our user interface and visual elements, we have to do some work.

Converting or transforming skeleton space values to depth space is easy. The SDK provides a couple of helper methods to transform between the two spaces. The KinectSensor object has a method named MapSkeletonPointToDepth to transform skeleton points into points usable for UI transformations. There is also a method named MapDepthToSkeletonPoint, which performs the conversion in reverse. The MapSkeletonPointToDepth method takes a SkeletonPoint and a DepthImageFormat. The skeleton point can come from the Position property on the Skeleton or the Position property of a Joint on the skeleton. While the name of the method has the word "Depth" in it, this is not to be taken literally. The destination space does not have to be a Kinect depth image. In fact, the DepthStream does not have to be enabled. The transform is to any spatial plane of the dimensions supported by the DepthImageFormat. Once the skeleton point is mapped to the depth space, it can be scaled to any desired dimension.

In the stick figure exercise, the GetJointPoint method (Listing 4-3) transforms each skeleton point to a pixel in the space of the LayoutRoot element, because this is the space in which we want to draw skeleton joints. In the Kinect the Dots project, we perform this transform twice. The first is in the TrackHand method (Listing 4-7). In this instance, we calculated the transform and then adjusted the position so that the hand cursor is centered at that point. The other instance is in the TrackPuzzle method (Listing 4-11), which draws the lines based on the user's hand movements. Here the calculation is simple and just transforms to the LayoutRoot element's space. Both calculations are the same in that they transform to the same UI element's space.
Tip Create helper methods to perform space transforms. There are always five variables to consider: source vector, destination width, destination height, destination width offset, and destination height offset. A sketch of such a helper follows.
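Here is a minimal sketch of that helper, not from the original project; it assumes an initialized KinectSensor named kinectSensor, and the method name is illustrative.

private Point MapSkeletonPointToUi(SkeletonPoint position, double destWidth,
                                   double destHeight, double offsetX, double offsetY)
{
    //First map into depth space, then scale to the destination dimensions.
    DepthImagePoint point = kinectSensor.MapSkeletonPointToDepth(position,
                                         kinectSensor.DepthStream.Format);

    double x = (point.X * destWidth / kinectSensor.DepthStream.FrameWidth) - offsetX;
    double y = (point.Y * destHeight / kinectSensor.DepthStream.FrameHeight) - offsetY;

    return new Point(x, y);
}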
Looking in the Mirror

As you may have noticed from the projects, the skeleton data is mirrored. Under most circumstances, this is acceptable. In Kinect the Dots, it works because users expect the cursor to mimic their hand movements exactly. This also works for augmented reality experiences where the user interface is based on the video image and the user expects a mirrored effect. Many games represent the user with an avatar where the avatar's back faces the user. This is a third-person perspective view. However, there are instances where the mirrored data is not conducive to the UI presentation. Applications or games that have a front-facing avatar, where the user sees the face of the avatar, do not want the mirrored effect. When the user waves his left arm, the left arm of the avatar should wave. Without making a small manipulation to the skeleton data, the avatar's right arm waves, which is clearly incorrect.

Unfortunately, the SDK does not have an option or property to set that causes the skeleton engine to produce non-mirrored data. This work is the responsibility of the developer, but luckily, it is a trivial operation due to the nature of skeleton data. The non-mirrored effect works by inverting the X value of each skeleton position vector. To calculate the inverse of any vector component, multiply it by -1. Experiment with the stick figure project by updating the GetJointPoint method (originally shown in Listing 4-3) as shown in Listing 4-12. With this change in place, when the user raises his left arm, the arm on the right side of the stick figure will raise.

Listing 4-12. Reversing the Mirror

private Point GetJointPoint(Joint joint)
{
    DepthImagePoint point = this.KinectDevice.MapSkeletonPointToDepth(joint.Position,
                                             DepthImageFormat.Resolution640x480Fps30);
    //Invert the X value to reverse the mirror effect.
    double depthX = point.X * -1 * (this.LayoutRoot.ActualWidth / KinectDevice.DepthStream.FrameWidth);
    double depthY = point.Y * (this.LayoutRoot.ActualHeight / KinectDevice.DepthStream.FrameHeight);

    return new Point(depthX, depthY);
}
SkeletonViewer User Control

As you work with Kinect and the Kinect for Windows SDK to build interactive experiences, you find that during development it is very helpful to actually see a visual representation of skeleton and joint data. When debugging an application, it is helpful to see and understand the raw input data, but in the production version of the application, you do not want to see this information. One option is to take the code we wrote in the skeleton viewer exercise and copy and paste it into each application. After a while, this becomes tedious and clutters your code base unnecessarily. For these reasons and others, it is helpful to refactor this code so that it is reusable.
Our goal is to take the skeleton viewer code and add to it so that it provides us more helpful debugging information. We accomplish this by creating a user control, which we name SkeletonViewer. In this user control, the skeleton and joint UI elements are drawn to the UI root of the user control. The SkeletonViewer control can then be a child of any panel we want. Start by creating a user control and replace the root Grid element with the code in Listing 4-13.

Listing 4-13. SkeletonViewer XAML
The SkeletonsPanel is where we will draw the stick figures just as we did earlier in this chapter. The JointInfoPanel is where the additional debugging information will go. We'll go into more detail on this later. The next step is to link the user control with a KinectSensor object. For this, we create a DependencyProperty, which allows us to use data binding if we desire. Listing 4-14 has the code for this property. The KinectDeviceChanged static method is critical to the function and performance of any application using this control. The first step unsubscribes the event handler from the SkeletonFrameReady event for any previously associated KinectSensor object. Not removing the event handler causes memory leaks. An even better approach is to use the weak event handler pattern, the details of which are beyond our scope. The other half of this method subscribes to the SkeletonFrameReady event when the KinectDevice property is set to a non-null value.

Listing 4-14. Runtime Dependency Property

#region KinectDevice
protected const string KinectDevicePropertyName = "KinectDevice";

public static readonly DependencyProperty KinectDeviceProperty =
    DependencyProperty.Register(KinectDevicePropertyName, typeof(KinectSensor),
                                typeof(SkeletonViewer),
                                new PropertyMetadata(null, KinectDeviceChanged));

private static void KinectDeviceChanged(DependencyObject owner,
                                        DependencyPropertyChangedEventArgs e)
{
    SkeletonViewer viewer = (SkeletonViewer) owner;

    if(e.OldValue != null)
    {
        KinectSensor sensor;
        sensor = (KinectSensor) e.OldValue;
        sensor.SkeletonFrameReady -= viewer.KinectDevice_SkeletonFrameReady;
    }

    if(e.NewValue != null)
    {
        viewer.KinectDevice = (KinectSensor) e.NewValue;
        viewer.KinectDevice.SkeletonFrameReady += viewer.KinectDevice_SkeletonFrameReady;
    }
}
public KinectSensor KinectDevice
{
    get { return (KinectSensor) GetValue(KinectDeviceProperty); }
    set { SetValue(KinectDeviceProperty, value); }
}
#endregion KinectDevice

Now that the user control is receiving new skeleton data from the KinectSensor, we can start drawing skeletons. Listing 4-15 shows the SkeletonFrameReady event handler. Much of this code is the same as that used in the previous exercises. The skeleton processing logic is wrapped in an if statement so that it only executes when the control's IsEnabled property is set to true. This allows the application to turn the function of this control on and off easily. There are two operations performed on each skeleton. The method draws the skeleton by calling the DrawSkeleton method. DrawSkeleton and its helper methods (CreateFigure and GetJointPoint) are the same methods that we used in the stick figure example. You can copy and paste the code to this source file.

The new lines of code to call the TrackJoint method are shown in bold. This method displays the extra joint information. The source for this method is also part of Listing 4-15. TrackJoint draws a circle at the location of the joint and displays the X, Y, and Z values next to the joint. The X and Y values are in pixels and are relative to the width and height of the user control. The Z value is the depth converted to feet. If you are not a lazy American and know the metric system, you can leave the value in meters.
Listing 4-15. Drawing Skeleton Joints and Information

private void KinectDevice_SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
    SkeletonsPanel.Children.Clear();
    JointInfoPanel.Children.Clear();

    if(this.IsEnabled)
    {
        using(SkeletonFrame frame = e.OpenSkeletonFrame())
        {
            if(frame != null)
            {
                Brush brush;
                Skeleton skeleton;

                frame.CopySkeletonDataTo(this._FrameSkeletons);

                for(int i = 0; i < this._FrameSkeletons.Length; i++)
                {
                    skeleton = this._FrameSkeletons[i];
                    brush = this._SkeletonBrushes[i % this._SkeletonBrushes.Length];

                    DrawSkeleton(skeleton, brush);

                    TrackJoint(skeleton.Joints[JointType.HandLeft], brush);
                    TrackJoint(skeleton.Joints[JointType.HandRight], brush);
                    //You can track all the joints if you want
                }
            }
        }
    }
}

private void TrackJoint(Joint joint, Brush brush)
{
    if(joint.TrackingState != JointTrackingState.NotTracked)
    {
        Canvas container = new Canvas();
        Point jointPoint = GetJointPoint(joint);

        //FeetPerMeters is a class constant of 3.2808399f;
        double z = joint.Position.Z * FeetPerMeters;

        Ellipse element = new Ellipse();
        element.Height = 10;
        element.Width = 10;
        element.Fill = brush;
        Canvas.SetLeft(element, 0 - (element.Width / 2));
        Canvas.SetTop(element, 0 - (element.Height / 2));
        container.Children.Add(element);

        TextBlock positionText = new TextBlock();
        positionText.Text = string.Format("<{0:0}, {1:0}, {2:0.00}>", jointPoint.X, jointPoint.Y, z);
        positionText.Foreground = brush;
        positionText.FontSize = 24;
        Canvas.SetLeft(positionText, 0 - (positionText.Width / 2));
        Canvas.SetTop(positionText, 25);
        container.Children.Add(positionText);

        Canvas.SetLeft(container, jointPoint.X);
        Canvas.SetTop(container, jointPoint.Y);

        JointInfoPanel.Children.Add(container);
    }
}
Adding the SkeletonViewer to an application is quick and easy. Since it is a UserControl, simply add it to the XAML, and then in the main application set the KinectDevice property of the SkeletonViewer to the desired sensor object. Listing 4-16 demonstrates this by showing the code from the Kinect the Dots project that initializes the KinectSensor object. Figure 4-7, which follows Listing 4-16, is a screenshot of Kinect the Dots with the SkeletonViewer enabled.

Listing 4-16. Initializing the SkeletonViewer

if(this._KinectDevice != value)
{
    //Uninitialize
    if(this._KinectDevice != null)
    {
        this._KinectDevice.Stop();
        this._KinectDevice.SkeletonFrameReady -= KinectDevice_SkeletonFrameReady;
        this._KinectDevice.SkeletonStream.Disable();
        SkeletonViewerElement.KinectDevice = null;
    }

    this._KinectDevice = value;

    //Initialize
    if(this._KinectDevice != null)
    {
        if(this._KinectDevice.Status == KinectStatus.Connected)
        {
            this._KinectDevice.SkeletonStream.Enable();
            this._KinectDevice.Start();
            SkeletonViewerElement.KinectDevice = this.KinectDevice;
            this.KinectDevice.SkeletonFrameReady += KinectDevice_SkeletonFrameReady;
        }
    }
}
Figure 4-7. Kinect the Dots using the SkeletonViewer user control
Summary

As you may have noticed with the Kinect the Dots application, the Kinect code was only a small part of the application. This is common. Kinect is just another input device. When building applications driven by touch or mouse, the code focused on the input device is much less than the other application code. The difference is that the extraction of the input data and the work to process that data is handled largely for you by the .NET framework. As Kinect matures, it is possible that it too will be integrated into the .NET framework. Imagine having hand cursors built into the framework or OS just like the mouse cursor. Until that day happens, we have to write this code ourselves. The important point of focus is on doing stuff with the Kinect data, and not on how to extract data from the input device.

In this chapter, we examined every class, property, and method focused on skeleton tracking. In our first example, the application demonstrated how to draw a stick figure from the skeleton points, whereas the second project was a more functional real-world application experience. In it, we found practical uses for the joint data, in which the user, for the first time, actually interacted with the application. The user's natural movements provided input to the application. This chapter concludes our exploration of the fundamentals of the SDK. From here on out, we experiment and build functional applications to find the boundaries of what is possible with Kinect.
CHAPTER 5
Advanced Skeleton Tracking

This chapter marks the beginning of the second half of the book. The first set of chapters focused on the fundamental camera-centric features of the Kinect SDK. We explored and experimented with every method and property of every object focused on these features. These are the nuts and bolts of Kinect development. You now have the technical knowledge necessary to write applications using Kinect and the SDK. However, knowing the SDK and understanding how to use it as a tool to build great applications and experiences are substantially different matters. The remaining chapters of the book change tone and course in their coverage of the SDK. Moving forward we discuss how to use the SDK in conjunction with WPF and other third-party tools and libraries to build Kinect-driven experiences. We will use all the information you learned in the previous chapters to progress to more advanced and complex topics.

At its core, Kinect only emits and detects the reflection of infrared light. From the reflection of the light, it calculates depth values for each pixel of the view area. The first derivative of the depth data is the ability to detect blobs and shapes. The player index bits of each depth pixel are a form of a first derivative. The second derivative determines which of these shapes matches the human form, and then calculates the location of each significant axis point on the human body. This is skeleton tracking, which we covered in the previous chapter.

While the infrared image and the depth data are critical and core to Kinect, they are less prominent than skeleton tracking. In fact, they are a means to an end. As the Kinect and other depth cameras become more prevalent in everyday computer use, the raw depth data will receive less direct attention from developers, and become merely trivia or part of passing conversation. We are almost there now. The Microsoft Kinect SDK does not give the developer access to the infrared image stream of Kinect, but other Kinect SDKs make it available. It is likely that most developers will never use the raw depth data, but will only ever work with the skeleton data. However, once pose and gesture recognition become standardized and integrated into the Kinect SDK, developers likely will not even access the skeleton data.

We hope to advance this movement, because it signifies the maturation of Kinect as a technology. This chapter keeps the focus on skeleton tracking, but the approach to the skeleton data is different. We focus on Kinect as an input device with the same classification as a mouse, stylus, or touch, but uniquely different because of its ability to see depth. Microsoft pitched Kinect for Xbox with, "You are the controller," or more technically, you are the input device. With skeleton data, applications can do the same things a mouse or touch device can. The difference is the depth component allows the user and the application to interact as never before. Let us explore the mechanics through which the Kinect can control and interact with user interfaces.
User Interaction

Computers and the applications that run on them require input. Traditionally, user input comes from a keyboard and mouse. The user interacts directly with these hardware devices, which, in turn, transmit data to the computer. The computer takes the data from the input device and creates some type of visual effect. It is common knowledge that every computer with a graphical user interface has a cursor, which is often referred to as the mouse cursor, because the mouse was the original vehicle for the cursor. However, calling it a mouse cursor is no longer as accurate as it once was. Touch or stylus devices also control the same cursor as the mouse. When a user moves the mouse or drags his or her finger across a touch screen, the cursor reacts to these movements. If a user moves the cursor over a button, more often than not the button changes visually to indicate that the cursor is hovering over it. The button gives another type of visual indicator when the user presses the mouse button while hovering over it, and still another when the user releases the mouse button while remaining over it. This process may seem trivial to think through step by step, but how much of it do you really understand? If you had to, could you write the code necessary to track changes in the mouse's position, hover states, and button clicks?

These are user interactions developers often take for granted, because with user interface platforms like WPF, interacting with input devices is extremely easy. When developing web pages, the browser handles user interactions, and the developer simply defines the visual treatments, like mouse hover states, using style sheets. However, Kinect is different. It is an input device that is not integrated into WPF. Therefore you, as the developer, are responsible for doing all of the work that the OS and WPF otherwise would do for you.

At a low level, a mouse, stylus, or touch essentially produces X and Y coordinates, which the OS translates into the coordinate space of the computer screen. This process is similar to that discussed in the previous chapter (Space Transformations). It is the operating system's responsibility to extract data from the input device and make it available to the graphical user interface and to applications. The graphical user interface of the OS displays a mouse cursor and moves the cursor around the screen in reaction to user input. In some instances, this work is not trivial and requires a thorough understanding of the GUI platform, which in our instance is WPF. WPF does not provide native support for the Kinect as it does for the mouse and other input devices. The burden falls on the developer to pull the data from the Kinect SDK and do the work necessary to interact with the Buttons, ListBoxes, and other interface controls. Depending on the complexity of your application or user interface, this can be a sizable task, and potentially one that is non-trivial and requires intimate knowledge of WPF.
A Brief Understanding of the WPF Input System

When building an application in WPF, developers do not have to concern themselves with the mechanics of user input. It is handled for us, allowing us to focus more on reacting to user input. After all, as developers, we are more concerned with doing things with the user's input rather than reinventing the wheel each time just to collect it. If an application needs a button, the developer adds a Button control to the screen, wires an event handler to the control's Click event, and is done. In most circumstances, the developer will style the button to have a unique look and feel and to react visually to different mouse interactions such as hover and mouse down. WPF handles all of the low-level work to determine when the mouse is hovering over the button, or when the button is clicked.

WPF has a robust input system that constantly gathers input from attached devices and distributes that input to the affected controls. This system starts with the API defined in the System.Windows.Input namespace (PresentationCore.dll). The entities defined within work directly with the operating system to get data from the input devices. For example, there are classes named Keyboard, Mouse, Stylus, Touch, and Cursor. The one class that is responsible for managing the input from the different input devices and marshalling that input to the rest of the presentation framework is the InputManager.
The other component to the WPF input system is a set of four classes in the System.Windows namespace (PresentationCore.dll). These classes are UIElement, ContentElement, FrameworkElement, and FrameworkContentElement. FrameworkElement inherits from UIElement, and FrameworkContentElement inherits from ContentElement. These classes are the base classes for all visual elements in WPF, such as Button, TextBlock, and ListBox.
Note: For more detailed information about WPF's input system, refer to the MSDN documentation at http://msdn.microsoft.com/en-us/library/ms754010.aspx.
The InputManager tracks all device input and uses a set of methods and events to notify UIElement and ContentElement objects that the input device is performing some action related to the visual element. For example, WPF raises the MouseEnterEvent event when the mouse cursor enters the visual space of a visual element. There is also a virtual OnMouseEnter method in the UIElement and ContentElement classes, which WPF also calls when the mouse enters the visual space of the object. This allows other objects, which inherit from the UIElement or ContentElement classes, to directly receive data from input devices. WPF calls these methods on the visual elements before it raises any input events. There are several other similar types of events and methods on the UIElement and ContentElement classes to handle the various types of interactions, including MouseEnter, MouseLeave, MouseLeftButtonDown, MouseLeftButtonUp, TouchEnter, TouchLeave, TouchUp, and TouchDown, to name a few.

Developers have direct access to the mouse and other input devices as needed. The InputManager object has a property named PrimaryMouseDevice, which returns a MouseDevice object. Using the MouseDevice object, you can get the position of the mouse at any time through a method named GetScreenPosition. Additionally, the MouseDevice has a method named GetPosition, which takes in a user interface element and returns the mouse position within the coordinate space of that element. This information is crucial when determining mouse interactions such as the mouse hover event. With each new SkeletonFrame generated by the Kinect SDK, we are given the position of each skeleton joint in relation to skeleton space; we then have to perform coordinate space transforms to translate the joint positions to be usable with visual elements. The GetScreenPosition and GetPosition methods on the MouseDevice object do this work for the developer for mouse input.

In some ways, Kinect is comparable with the mouse, but the comparisons abruptly break down. Skeleton joints enter and leave visual elements similar to a mouse. In other words, joints hover like a mouse cursor. However, the click and mouse button up and down interactions do not exist. As we will see in the next chapter, there are gestures that simulate a click through a push gesture. The button push metaphor is weak when applied to Kinect, and so the comparison with the mouse ends with the hover.

Kinect does not have much in common with touch input either. Touch input is available from the Touch and TouchDevice classes. Single touch input is similar to mouse input, whereas multiple touch input is akin to Kinect. The mouse has only a single interaction point (the point of the mouse cursor), but touch input can have multiple input points, just as Kinect can have multiple skeletons, and each skeleton has twenty input points. Kinect is more informative, because we know which input points belong to which user. With touch input, the application has no way of knowing how many users are actually touching the screen. If the application receives ten touch inputs, is it one person pressing all ten fingers, or is it ten people pressing one finger each? While touch input has multiple input points, it is still a two-dimensional input like the mouse or stylus. To be fair, touch input does have breadth, meaning it includes a location (X, Y) of the point and the bounding area of the contact point. After all, a user pressing a finger on a touch screen is never as precise as a mouse pointer or stylus; it always covers more than one pixel.
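For contrast, asking WPF's input system where the mouse is takes a single call. The fragment below is illustrative only (it is not part of this chapter's project); it shows the service that Kinect code must recreate for skeleton joints.

private static Point GetMousePositionWithin(IInputElement element)
{
    //Mouse.PrimaryDevice exposes the same MouseDevice the InputManager manages.
    //Kinect code has to compute the equivalent point for each skeleton joint itself.
    return Mouse.PrimaryDevice.GetPosition(element);
}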
While there are similarities, Kinect input clearly does not neatly fit the form of any input device supported by WPF. It has a unique set of interactions and user interface metaphors. It has yet to be determined if Kinect should function in the same way as other input devices. At the core, the mouse, touch, or stylus report a single pixel point location. The input system then determines the location of the pixel point in the context of a visual element, and that visual element reacts accordingly. Current Kinect user interfaces attempt to use the hand joints as alternatives to mouse or touch input, but it is not clear yet if this is how Kinect should be used, or if the designer and developer community is simply trying to make Kinect conform to known forms of user input.

The expectation is that at some point Kinect will be fully integrated into WPF. Until WPF 4.0, touch input was a separate component. Touch was first introduced with Microsoft's Surface. The Surface SDK included a special set of WPF controls like SurfaceButton, SurfaceCheckBox, and SurfaceListBox. If you wanted a button that responded to touch events, you had to use the SurfaceButton control.

One can speculate that if Kinect input were to be assimilated into WPF, there might be a class named SkeletonDevice, which would look similar to the SkeletonFrame object of the Kinect SDK. Each Skeleton object would have a method named GetJointPoint, which would function like the GetPosition method on MouseDevice or the GetTouchPoint on TouchDevice. Additionally, the core visual elements (UIElement, ContentElement, FrameworkElement, and FrameworkContentElement) would have events and methods to notify and handle skeleton joint interactions. For example, there might be JointEnter, JointLeave, and JointHover events. Further, just as touch input has the ManipulationStarted and ManipulationEnded events, there might be GestureStarted and GestureEnded events associated with Kinect input.

For now, the Kinect SDK is a separate entity from WPF, and as such, it does not natively integrate with the input system. It is the responsibility of the developer to track skeleton joint positions and determine when joint positions intersect with user interface elements. When a skeleton joint is within the coordinate space of a visual element, we must then manually alter the appearance of the element to react to the interaction. Woe is the life of a developer when working with a new technology.
Detecting User Interaction

Before we can determine if a user has interacted with visual elements on the screen, we must define what it means for the user to interact with a visual element. Looking at a mouse- or cursor-driven application, there are two well-known interactions: a mouse hovers over a visual element, and it clicks. These interactions break down even further into other, more granular interactions. For a cursor to hover, it must enter the coordinate space of the visual element. The hover interaction ends when the cursor leaves the coordinate space of the visual element. In WPF, the MouseEnter and MouseLeave events fire when the user performs these interactions. A click is the act of the mouse button being pressed down (MouseDown) and released (MouseUp).

There is another common mouse interaction beyond a click and hover. If a user hovers over a visual element, presses down the left mouse button, and then moves the cursor around the screen, we call this a drag. The drop interaction happens when the user releases the mouse button. Drag and drop is a complex interaction, much like a gesture.

For the purpose of this chapter, we focus on the first set of simple interactions, where the cursor hovers, enters, and leaves the space of a visual element. In the Kinect the Dots project from the previous chapter, we had to determine when the user's hand was in the vicinity of a dot before drawing a connecting line. In that project, the application did not interact with the user interface as much as the user interface reacted to the user. This distinction is important. The application generated the locations of the dots within a coordinate space that was the same as the screen size, but these points were not derived from the screen space. They were just data stored in variables. We fixed the screen size to make it easy. Upon receipt of each new skeleton frame, the position of the skeleton hand was translated into the coordinate space of the dots, after which we determined if the position of the hand was the same as the current dot in the sequence. Technically, this application could function without a user interface. The user interface was created dynamically from data. In that application, the user is interacting with the data and not the user interface.
Hit Testing

Determining when a user's hand is near a dot is not as simple as checking if the coordinates of the hand match exactly the position of the dot. Each dot is just a single pixel, and it would be impossible for a user to place their hand easily and routinely in the same pixel position. To make the application usable, we do not require the position of the hand to be the same as the dot, but rather within a certain range. We created a circle with a set radius around the dot, with the dot being the center of the circle. The user just has to break the plane of the proximity circle for the hand to be considered hovering over the dot. Figure 5-1 illustrates this. The white dot within the visual element circle is the actual dot point and the dotted circle is the proximity circle. The hand image is centered to the hand point (white dot within the hand icon). It is therefore possible for the hand image to cross the proximity circle, but the hand point to be outside the dot. The process of checking to see if the hand point breaks the plane of the dot is called hit testing.
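In code, the proximity test of Figure 5-1 reduces to a distance comparison. The following is a minimal sketch of such a test; the method and parameter names are illustrative and not taken from the Kinect the Dots source.

private static bool IsHandOverDot(Point handPoint, Point dotPoint, double proximityRadius)
{
    //The hand "hits" the dot when the hand point falls inside a circle of the
    //given radius centered on the dot. Comparing squared distances avoids a
    //square root per frame.
    double dx = handPoint.X - dotPoint.X;
    double dy = handPoint.Y - dotPoint.Y;

    return (dx * dx) + (dy * dy) <= proximityRadius * proximityRadius;
}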
Figure 5-1. Dot proximity testing

Again, in the Kinect the Dots project, the user interface reacts to the data. The dots are drawn on the screen according to the generated coordinates. The application performs hit testing using the dot data and not the size and layout of the visual element. Most applications and games do not function this way. The user interfaces are more complex and often dynamic. Take, for example, the ShapeGame application (Figure 5-2) that comes with the Kinect for Windows SDK. It generates shapes that drop from the sky. The shapes pop and disappear when the user "touches" them.
Figure 5-2. Microsoft SDK sample ShapeGame

An application like ShapeGame requires a more complex hit testing algorithm than that of Kinect the Dots. WPF provides some tools to help hit test visual objects. The VisualTreeHelper class (System.Windows.Media namespace) has a method named HitTest. There are multiple overloads for this method, but the primary method signature takes in a Visual object and a point. It returns the top-most visual object within the specified visual object's visual tree at that point. If that seems complicated and it is not inherently obvious what this means, do not worry. A simple explanation is that WPF has a layered visual output. More than one visual element can occupy the same relative space. If more than one visual element is at the specified point, the HitTest method returns the element at the top layer. Due to WPF's styling and templating system, which allows controls to be composites of one or more visual elements and other controls, more often than not there are multiple visual elements at any given coordinate point.

Figure 5-3 helps to illustrate the layering of visual elements. There are three elements: a Rectangle, a Button, and an Ellipse. All three are in a Canvas panel. The ellipse and the button sit on top of the rectangle. In the first frame, the mouse is over the ellipse, and a hit test at this point returns the ellipse. A hit test in the second frame returns the rectangle, even though it is the bottom layer. While the rectangle is at the bottom, it is the only visual element occupying the pixel at the mouse's cursor position. In the third frame, the cursor is over the button. Hit testing at this point returns a TextBlock element. If the cursor were not on the text in the button, a hit test would return a ButtonChrome element. The button's visual representation is composed of one or more visual controls, and is customizable. In fact, a Button is a visual element that inherently has no visual representation. The button shown in Figure 5-3 uses the default style, which is in part made up of a TextBlock and a ButtonChrome. It is important to understand that hit testing on a control does not necessarily return the desired or expected visual element or control, as is the case with the Button. In this example, we always get one of the elements that compose the button visual, but never the actual button control.
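To illustrate the behavior just described, here is a hedged snippet of the basic HitTest call; the helper name and its use are illustrative, not part of this chapter's project.

private Visual GetTopMostVisualAt(Visual container, Point point)
{
    //Returns the top-most visual under the point. For a Button in the default
    //style, this is often a TextBlock or ButtonChrome rather than the Button itself.
    HitTestResult result = VisualTreeHelper.HitTest(container, point);
    return (result != null) ? result.VisualHit as Visual : null;
}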
Figure 5-3. Layered UI elements

To make hit testing more convenient, WPF provides other methods to assist with it. The UIElement class defines an InputHitTest method, which takes in a Point and returns an IInputElement that is at the specified point. The UIElement and ContentElement classes both implement the IInputElement interface. This means that virtually all user interface elements within WPF are covered. The VisualTreeHelper class also has a set of HitTest methods, which can be used more generically.
Note: The MSDN documentation for the UIElement.InputHitTest method states, "This method typically is not called from your application code. Calling this method is only appropriate if you intend to re-implement a substantial amount of the low level input features that are already present, such as recreating mouse device logic." Kinect is not integrated into WPF's "low-level input features"; therefore, it is necessary to recreate mouse device logic.
In WPF, hit testing depends on two variables: a visual element and a point. The test determines if the specified point lies within the coordinate space of the visual element. Let's use Figure 5-4 to better understand the coordinate spaces of visual elements. Each visual element in WPF, regardless of shape and size, has what is called a bounding box: a rectangular shape around the visual element that defines the width and height of the visual element. This bounding box is used by the layout system to determine the overall dimensions of the visual element and how to arrange it on the screen. While the Canvas arranges its children based on values specified by the developer, an element's bounding box is fundamental to the layout algorithm of other panels such as the Grid and StackPanel. The bounding box is not visually shown to the user, but is represented in Figure 5-4 by the dotted box surrounding each visual element. Additionally, each element has an X and Y position that defines the element's location within its parent container. To obtain the bounding box and position of an element, call the GetLayoutSlot method of the static LayoutInformation class (System.Windows.Controls.Primitives).

Take, for example, the triangle. The top-left corner of the bounding box is point (0, 0) of the visual element. The width and height of the triangle are each 200 pixels. The three points of the triangle within the bounding box are at (100, 0), (200, 200), and (0, 200). A hit test is only successful for points within the triangle and not for all points within the bounding box. A hit test for point (0, 0) is unsuccessful, whereas a test at the center of the triangle, point (100, 100), is successful.
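For reference, retrieving an element's bounding box is a one-line call. This small helper is illustrative and not part of the project code.

private static Rect GetBoundingBox(FrameworkElement element)
{
    //The layout slot is the rectangle the layout system reserved for the element.
    //For the triangle in Figure 5-4, this is a 200x200 rectangle, even though
    //only points inside the triangle itself pass a hit test.
    return LayoutInformation.GetLayoutSlot(element);
}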
Figure 5-4. Layout space and bounding boxes

Hit testing results depend on the layout of the visual elements. In all of our projects, we used the Canvas panel to hold our visual elements. The Canvas panel is the one visual element container that gives the developer complete control over the placement of the visual elements, which can be especially useful when working with Kinect. Basic functions like hand tracking are possible with other WPF panels, but require more work and do not perform as well as the Canvas panel. With the Canvas panel, the developer explicitly sets the X and Y position (Canvas.Left and Canvas.Top, respectively) of the child visual element. Coordinate space translation, as we have seen, is straightforward with the Canvas panel, which means less code to write and better performance because there is less processing needed.

The disadvantage to using a Canvas is the same reason for using it. The developer has complete control over the placement of visual elements and therefore is also responsible for things like updating element positions when the window resizes or arranging complex layouts. Panels such as the Grid and StackPanel make UI layout updates and resizing painless to the developer. However, these panels increase the complexity of hit testing by increasing the size of the visual tree and by adding additional coordinate spaces. The more coordinate spaces, the more point translations needed. These panels also honor the alignment (horizontal and vertical) and margin properties of the FrameworkElement, which further complicates the calculations necessary for hit testing. If there is any possibility that a visual element will have RenderTransforms, you would be smart to use the WPF hit testing and not attempt to do this testing yourself.
A hybrid approach is to place visual elements that change frequently based on skeleton joint positions, such as hand cursors, in a Canvas, and place all other UI elements in other panels. Such a layout scheme requires more coordinate space transforms, which can affect performance and possibly introduce bugs related to improper transform calculations. The hybrid method is at times the more appropriate choice, because it takes full advantage of the WPF layout system. Refer to the MSDN documentation on WPF's layout system, panels, and hit testing for a thorough understanding of these concepts.
Responding to Input

Hit testing only tells us that the user input point is within the coordinate space of a visual element. One of the important functions of a user interface is to give users feedback on their actions. When we move our mouse over a button, we expect the button to change visually in some way (change color, grow in size, animate, reveal a background glow), telling the user the button is clickable. Without this feedback, the user experience is not only flat and uninteresting, but also possibly confusing and frustrating. A failed application experience means the application as a whole is a failure, even if it technically functions flawlessly.

WPF has a fantastic system for notifying and responding to user input. The styling and template system makes developing user interfaces that properly respond to user input easy to build and highly customizable, but only if your user input comes from a mouse, stylus, or a touch device. Kinect developers have two options: do not use WPF's system and do everything manually, or create special controls that respond to Kinect input. The latter, while not overly difficult, is not a beginner's task.

With this in mind, we move to the next section, where we build a game that applies hit testing and manually responds to user input. Before moving on, consider a question, which we have purposefully not addressed until now. What does it mean for a Kinect skeleton to interact with the user interface? The core mouse interactions are: enter, leave, click. Touch input has enter, leave, down, and up interactions. A mouse has a single position point. Touch can have multiple position points, but there is always a primary point. A Kinect skeleton has twenty possible position points. Which of these is the primary point? Should there be a primary point? Should a visual element, such as a button, react when any one skeleton point enters the element's coordinate space, or should it react to only certain joint points, for instance the hands?

There is no one answer to all of these questions. It largely depends on the function and design of your user interface. These types of questions are part of a broader subject called Natural User Interface design, which is a significant topic in the next chapter. For most Kinect applications, including the projects in this chapter, the only joints that interact with the user interface are the hands. The starting interactions are enter and leave. Interactions beyond these become complicated quickly. We cover more complicated interactions later in the chapter and in all of the next chapter, but for now the focus is on the basics.
Simon Says

To demonstrate working with Kinect as an input device, we start our next project, which uses the hand joints as if they were a cross between a mouse and touch input. The project's goal is to give a practical, but introductory, example of how to perform hit testing and create user interactions with WPF visual elements. The project is a game named Simon Says.

Growing up during my early grade school years, we played a game named Simon Says. In this game, one person plays the role of Simon and gives instructions to the other players. A typical instruction is, "Put your left hand on top of your head." Players perform the instruction only if it is preceded by the words, "Simon says." For example, "Simon says, 'stomp your feet'" in contrast to "stomp your feet." Any player caught following an instruction not preceded by "Simon says" is out of the game. These are the game's rules. Did you play Simon Says as a child? Do kids still play this game? Look it up if you do not know the game.
Tip: The traditional version of Simon Says makes a fun drinking game, but only if you are old enough to drink. Please drink responsibly.
In the late 70s and early 80s, the game company Milton Bradley created a hand-held electronic version of Simon Says named Simon. This game consisted of four colored (red, blue, green, and yellow) buttons. The electronic version of the game works with the computer giving the player a sequence of buttons to press. When giving the instructions, the computer lights each button in the correct sequence. The player must then repeat the button sequence. After the player successfully repeats the button sequence, the computer presents another. The sequences become progressively more challenging. The game ends when the player cannot repeat the sequence.

We attempt to recreate the electronic version of Simon Says using Kinect. It is a perfect introductory example of using skeleton tracking to interact with user interface elements. The game also has a simple set of game rules, which we can quickly implement. Figure 5-5 illustrates our desired user interface. It consists of four rectangles, which serve as game buttons or targets. We have a game title at the top of the screen, and an area in the middle of the screen for game instructions.
Figure 5-5. Simon Says user interface
Our version of Simon Says works by tracking the player's hands; when a hand makes contact with one of the colored squares, we consider this a button press. It is common in Kinect applications to use hover or press gestures to interact with buttons. For now, our approach to player interactions remains simple. The game starts with the player placing her hands over the hand markers in the red boxes. Immediately after both hands are on the markers, the game begins issuing instructions. The game is over and returns to this state when the player fails to repeat the sequence. At this point, we have a basic understanding of the game's concept, rules, and look. Now we write code.
SimonSays,“DesignaUserInterface” Startbybuildingtheuserinterface.Listing5-1showstheXAMLfortheMainWindow.Aswithourprevious examples,wewrapourmainUIelementsinaViewboxcontroltohandlescalingtodifferentmonitor resolutions.OurUIdimensionsaresetto1920x1080.TherearefoursectionsofourUI:titleand instructions,gameinterface,gamestartinterface,andcursorsforhandtracking.ThefirstTextBlock holdsthetitleandtheinstructionUIelementsareintheStackPanelthatfollows.TheseUIcomponents serveonlytohelptheplayerknowthecurrentstateofthegame.Theyhavenootherfunctionandare notrelatedtoKinectorskeletontracking.However,theotherUIelementsare. TheGameCanvas,ControlCanvas,andHandCanvasallholdUIelements,whichtheapplication interactswith,basedonthepositionoftheplayer’shands.Thehandpositionsobviouslycomefrom skeletontracking.Takingtheseitemsinreverseorder,theHandCanvasshouldbefamiliar.The applicationhastwocursorsthatfollowthemovementsoftheplayer’shands,aswesawintheprojects fromthepreviouschapter.TheControlCanvasholdstheUIelementsthattriggerthestartofthegame, andtheGameCanvasholdstheblocks,whichtheplayerpressesduringthegame.Thedifferentinteractive componentsarebrokenintomultiplecontainers,makingtheuserinterfaceeasiertomanipulatein code.Forexample,whentheuserstartsthegame,wewanttohidetheControlCanvas.Itismucheasier tohideonecontainerthantowritecodetoshowandhideallofthechildrenindividually. AfterupdatingtheMainWindow.xamlfilewiththecodeinListingin5-1,runtheapplication.The screenshouldlooklikeFigure5-1.
Listing 5-1. Simon Says User Interface
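(A minimal sketch of the markup this listing describes follows. Only the element names, the Viewbox wrapper, and the 1920x1080 dimensions come from the surrounding text; the colors, sizes, positions, and image source are illustrative assumptions, not the book's actual listing.)

<Window x:Class="SimonSays.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="Simon Says">
    <Viewbox>
        <Grid x:Name="LayoutRoot" Width="1920" Height="1080" Background="White">
            <TextBlock Text="Simon Says" FontSize="120"
                       HorizontalAlignment="Center" VerticalAlignment="Top"/>

            <StackPanel HorizontalAlignment="Center" VerticalAlignment="Center">
                <TextBlock x:Name="GameStateElement" FontSize="72" TextAlignment="Center"/>
                <TextBlock x:Name="GameInstructionsElement" FontSize="48" TextAlignment="Center"/>
            </StackPanel>

            <Canvas x:Name="GameCanvas">
                <Rectangle x:Name="RedBlock" Fill="Red" Width="400" Height="400"
                           Canvas.Left="100" Canvas.Top="100" Opacity="0.2"/>
                <Rectangle x:Name="BlueBlock" Fill="Blue" Width="400" Height="400"
                           Canvas.Left="1420" Canvas.Top="100" Opacity="0.2"/>
                <Rectangle x:Name="GreenBlock" Fill="Green" Width="400" Height="400"
                           Canvas.Left="100" Canvas.Top="580" Opacity="0.2"/>
                <Rectangle x:Name="YellowBlock" Fill="Yellow" Width="400" Height="400"
                           Canvas.Left="1420" Canvas.Top="580" Opacity="0.2"/>
            </Canvas>

            <Canvas x:Name="ControlCanvas">
                <Border x:Name="LeftHandStartElement" Background="Red" Width="200" Height="200"
                        Canvas.Left="660" Canvas.Top="700"/>
                <Border x:Name="RightHandStartElement" Background="Red" Width="200" Height="200"
                        Canvas.Left="1060" Canvas.Top="700"/>
            </Canvas>

            <Canvas x:Name="HandCanvas">
                <Image x:Name="LeftHandElement" Source="Images/hand.png"
                       Width="100" Height="100" Visibility="Collapsed"/>
                <Image x:Name="RightHandElement" Source="Images/hand.png"
                       Width="100" Height="100" Visibility="Collapsed"/>
            </Canvas>
        </Grid>
    </Viewbox>
</Window>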
SimonSays,“BuildtheInfrastructure” WiththeUIinplace,weturnourfocusonthegame’sinfrastructure.UpdatetheMainWindow.xaml.csfile toincludethenecessarycodetoreceiveSkeletonFrameReadyevents.IntheSkeletonFrameReadyevent handler,addthecodetotrackplayerhandmovements.ThebaseofthiscodeisinListing5-2.TrackHand isarefactoredversionofListing4-7,wherethemethodtakesintheUIelementforthecursorandthe parentelementthatdefinesthelayoutspace.
Listing 5-2. Initial SkeletonFrameReady Event Handler

private void KinectDevice_SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
    using(SkeletonFrame frame = e.OpenSkeletonFrame())
    {
        if(frame != null)
        {
            frame.CopySkeletonDataTo(this._FrameSkeletons);
            Skeleton skeleton = GetPrimarySkeleton(this._FrameSkeletons);

            if(skeleton == null)
            {
                LeftHandElement.Visibility = Visibility.Collapsed;
                RightHandElement.Visibility = Visibility.Collapsed;
            }
            else
            {
                TrackHand(skeleton.Joints[JointType.HandLeft], LeftHandElement, LayoutRoot);
                TrackHand(skeleton.Joints[JointType.HandRight], RightHandElement, LayoutRoot);
            }
        }
    }
}

private static Skeleton GetPrimarySkeleton(Skeleton[] skeletons)
{
    Skeleton skeleton = null;

    if(skeletons != null)
    {
        //Find the closest skeleton
        for(int i = 0; i < skeletons.Length; i++)
        {
            if(skeletons[i].TrackingState == SkeletonTrackingState.Tracked)
            {
                if(skeleton == null)
                {
                    skeleton = skeletons[i];
                }
                else
                {
                    if(skeleton.Position.Z > skeletons[i].Position.Z)
                    {
                        skeleton = skeletons[i];
                    }
                }
            }
        }
    }

    return skeleton;
}

For most games, using a polling architecture is the better and more common approach. Normally, a game has what is called a gaming loop, which would manually get the next skeleton frame from the skeleton stream. However, this project uses the event model to reduce the code base and complexity. For the purposes of this book, less code means that it is easier to present to you, the reader, and easier to understand without getting bogged down in the complexities of gaming loops and possibly threading. The event system also provides us a cheap gaming loop, which, again, means we have to write less code. However, be careful when using the event system in place of a true gaming loop. Besides performance concerns, events are often not reliable enough to operate as a true gaming loop, which may result in your application being buggy or not performing as expected.
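The TrackHand refactoring mentioned above is not reprinted from Chapter 4, so the following is a sketch of what the method plausibly looks like after the refactor. It assumes the GetJointPoint helper from Listing 4-3 and centers the cursor image on the joint.

private void TrackHand(Joint hand, FrameworkElement cursorElement, FrameworkElement parent)
{
    //Sketch only: hide the cursor when the joint is lost, otherwise position
    //the cursor element over the joint within the parent's coordinate space.
    if(hand.TrackingState == JointTrackingState.NotTracked)
    {
        cursorElement.Visibility = Visibility.Collapsed;
    }
    else
    {
        cursorElement.Visibility = Visibility.Visible;

        Point point = GetJointPoint(this.KinectDevice, hand, parent.RenderSize, new Point());
        Canvas.SetLeft(cursorElement, point.X - (cursorElement.ActualWidth / 2));
        Canvas.SetTop(cursorElement, point.Y - (cursorElement.ActualHeight / 2));
    }
}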
SimonSays,“AddGamePlayInfrastructure” ThegameSimonSaysbreaksdownintothreephases.Theinitialphase,whichwewillcallGameOver, meansnogameisactivelybeingplayed.Thisisthedefaultstateofthegame.Itisalsothestatetowhich thegamerevertswhenKinectstopsdetectingplayers.ThegameloopsfromSimongivinginstructionsto theplayerrepeatingorperformingtheinstructions.Thiscontinuesuntiltheplayercannotcorrectly performtheinstructions.Theapplicationdefinesanenumerationtodescribethegamephasesanda membervariabletotrackthegamestate.Additionally,weneedamembervariabletotrackthecurrent roundorlevelofthegame.Thevalueoftheleveltrackingvariableincrementseachtimetheplayer successfullyrepeatsSimon’sinstructions.Listing5-3detailsthegamephaseenumerationandmember variables.Themembervariablesinitializationisintheclassconstructor.
Listing 5-3. Game Play Infrastructure

public enum GamePhase
{
    GameOver          = 0,
    SimonInstructing  = 1,
    PlayerPerforming  = 2
}

public partial class MainWindow : Window
{
    #region Member Variables
    private KinectSensor _KinectDevice;
    private Skeleton[] _FrameSkeletons;
    private GamePhase _CurrentPhase;
    private int _CurrentLevel;
    #endregion Member Variables

    #region Constructor
    public MainWindow()
    {
        InitializeComponent();

        //Any other constructor code such as sensor initialization goes here.
        this._CurrentPhase = GamePhase.GameOver;
        this._CurrentLevel = 0;
    }
    #endregion Constructor

    #region Methods
    //Code from listing 5-2 and any additional supporting methods
    #endregion Methods
}

We now revisit the SkeletonFrameReady event handler, which needs to determine what action to take based on the state of the application. The code in Listing 5-4 details the code changes. Update the SkeletonFrameReady event handler with this code and stub out the ChangePhase, ProcessGameOver, and ProcessPlayerPerforming methods. We cover the functional code of these methods later. The first method takes only a GamePhase enumeration value, while the latter two have a single parameter of Skeleton type.

When the application cannot find a primary skeleton, the game ends and enters the game over phase. This happens when the user leaves the view area of Kinect. When Simon is giving instructions to the user, the game hides the hand cursors; otherwise, it updates the position of the hand cursors. When the game is in either of the other two phases, the game calls special processing methods based on the particular game phase.
Listing 5-4. SkeletonFrameReady Event Handler

private void KinectDevice_SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
    using(SkeletonFrame frame = e.OpenSkeletonFrame())
    {
        if(frame != null)
        {
            frame.CopySkeletonDataTo(this._FrameSkeletons);
            Skeleton skeleton = GetPrimarySkeleton(this._FrameSkeletons);

            if(skeleton == null)
            {
                ChangePhase(GamePhase.GameOver);
            }
            else
            {
                if(this._CurrentPhase == GamePhase.SimonInstructing)
                {
                    LeftHandElement.Visibility = Visibility.Collapsed;
                    RightHandElement.Visibility = Visibility.Collapsed;
                }
                else
                {
                    TrackHand(skeleton.Joints[JointType.HandLeft], LeftHandElement, LayoutRoot);
                    TrackHand(skeleton.Joints[JointType.HandRight], RightHandElement, LayoutRoot);

                    switch(this._CurrentPhase)
                    {
                        case GamePhase.GameOver:
                            ProcessGameOver(skeleton);
                            break;

                        case GamePhase.PlayerPerforming:
                            ProcessPlayerPerforming(skeleton);
                            break;
                    }
                }
            }
        }
    }
}
Starting a New Game

The application has a single function when in the GameOver phase: detect when the user wants to play the game. The game starts when the player places her hands in the respective hand markers. The left hand needs to be within the space of the LeftHandStartElement, and the right hand needs to be within the space of the RightHandStartElement. For this project, we use WPF's built-in hit testing functionality. Our UI is small and simple. The number of UI elements available for processing in an InputHitTest method call is extremely limited; therefore, there are no performance concerns. Listing 5-5 contains the code for the ProcessGameOver method and the GetHitTarget helper method. The GetHitTarget method is used in other places in the application.

Listing 5-5. Detecting When the User Is Ready to Start the Game

private void ProcessGameOver(Skeleton skeleton)
{
    //Determine if the user triggers the start of a new game
    if(GetHitTarget(skeleton.Joints[JointType.HandLeft], LeftHandStartElement) != null &&
       GetHitTarget(skeleton.Joints[JointType.HandRight], RightHandStartElement) != null)
    {
        ChangePhase(GamePhase.SimonInstructing);
    }
}

private IInputElement GetHitTarget(Joint joint, UIElement target)
{
    Point targetPoint = GetJointPoint(this.KinectDevice, joint, LayoutRoot.RenderSize, new Point());
    targetPoint = LayoutRoot.TranslatePoint(targetPoint, target);

    return target.InputHitTest(targetPoint);
}
The logic of the ProcessGameOver method is simple and straightforward: if each of the player's hands is in the space of its respective target, change the state of the game. The GetHitTarget method is responsible for testing if the joint is in the target space. It takes in the source joint and the desired target, and returns the specific IInputElement occupying the coordinate point of the joint. While the method only has three lines of code, it is important to understand the logic behind the code.

Our hit testing algorithm consists of three basic steps. The first step gets the coordinates of the joint within the coordinate space of the LayoutRoot. The GetJointPoint method does this for us. This is the same method from the previous chapter. Copy the code from Listing 4-3 and paste it into this project. Next, the joint point in the LayoutRoot coordinate space is translated to the coordinate space of the target using the TranslatePoint method. This method is defined in the UIElement class, of which Grid (LayoutRoot) is a descendent. Finally, with the point translated into the coordinate space of the target, we call the InputHitTest method, also defined in the UIElement class. If the point is within the coordinate space of the target, the InputHitTest method returns the exact UI element in the target's visual tree. Any non-null value means the hit test was successful.

It is important to note that the simplicity of this logic only works due to the simplicity of our UI layout. Our application consumes the entire screen and is not meant to be resizable. Having a static and fixed UI size dramatically simplifies the number of calculations. Additionally, by using Canvas elements to contain all interactive UI elements, we effectively have a single coordinate space. By using other panel types to contain the interactive UI elements, or using automated layout features such as the HorizontalAlignment, VerticalAlignment, or Margin properties, you possibly increase the complexity of the hit testing logic. In short, the more complicated the UI, the more complicated the hit testing logic, which also adds more performance concerns.
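Since Listing 4-3 is not reprinted here, the following sketch shows a GetJointPoint consistent with how it is called in Listing 5-5. It assumes the SDK v1 skeleton-to-depth mapping API and scales the depth coordinates to the container's size; treat the details as an approximation of the Chapter 4 helper rather than a verbatim copy.

private static Point GetJointPoint(KinectSensor kinectDevice, Joint joint,
                                   Size containerSize, Point offset)
{
    //Map the skeleton-space joint into depth-image coordinates, then scale
    //from the depth image's resolution to the container's render size.
    DepthImagePoint point = kinectDevice.MapSkeletonPointToDepth(joint.Position,
                                             kinectDevice.DepthStream.Format);

    double x = (double) point.X / kinectDevice.DepthStream.FrameWidth * containerSize.Width - offset.X;
    double y = (double) point.Y / kinectDevice.DepthStream.FrameHeight * containerSize.Height - offset.Y;

    return new Point(x, y);
}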
Changing Game State

Compile and run the application. If all goes well, your application should look like Figure 5-6. The application should track the player's hand movements and change the game phase from GameOver to SimonInstructing when the player moves his hands into the start position. The next task is to implement the ChangePhase method, as shown in Listing 5-6. This code is not related to Kinect. In fact, we could just as easily have implemented this same game using touch or mouse input, and this code would still be required.
Figure 5-6. Starting a new game of Simon Says

The function of ChangePhase is to manipulate the UI to denote a change in the game's state and maintain any data necessary to track the progress of the game. Specifically, the GameOver phase fades out the blocks, changes the game instructions, and presents the buttons to start a new game. The code for the SimonInstructing phase goes beyond updating the UI. It calls two methods, which generate the instruction sequence (GenerateInstructions) and display these instructions to the player (DisplayInstructions). Following Listing 5-6 is the source code and further explanation for these methods, as well as the definition of the _InstructionPosition member variable.
Listing 5-6. Controlling the Game State

private void ChangePhase(GamePhase newPhase)
{
    if(newPhase != this._CurrentPhase)
    {
        this._CurrentPhase = newPhase;

        switch(this._CurrentPhase)
        {
            case GamePhase.GameOver:
                this._CurrentLevel           = 0;
                RedBlock.Opacity             = 0.2;
                BlueBlock.Opacity            = 0.2;
                GreenBlock.Opacity           = 0.2;
                YellowBlock.Opacity          = 0.2;

                GameStateElement.Text        = "GAME OVER!";
                ControlCanvas.Visibility     = System.Windows.Visibility.Visible;
                GameInstructionsElement.Text = "Place hands over the targets to start a new game.";
                break;

            case GamePhase.SimonInstructing:
                this._CurrentLevel++;
                GameStateElement.Text        = string.Format("Level {0}", this._CurrentLevel);
                ControlCanvas.Visibility     = System.Windows.Visibility.Collapsed;
                GameInstructionsElement.Text = "Watch for Simon's instructions";
                GenerateInstructions();
                DisplayInstructions();
                break;

            case GamePhase.PlayerPerforming:
                this._InstructionPosition    = 0;
                GameInstructionsElement.Text = "Repeat Simon's instructions";
                break;
        }
    }
}
Presenting Simon's Commands

Listing 5-7 details a new set of member variables and the GenerateInstructions method. The member variable _InstructionSequence holds a set of UIElements, which comprise Simon's instructions. The player must move his hand over each UIElement in the sequence order defined by the array positions. The instruction set is randomly chosen for each level, with the number of instructions based on the current level or round. For example, round five has five instructions. Also included in this code listing is the DisplayInstructions method, which creates and then begins a storyboard animation to change the opacity of each block in the correct sequence.
Listing 5-7. Generating and Displaying Instructions

private int _InstructionPosition;
private UIElement[] _InstructionSequence;
private Random rnd = new Random();

private void GenerateInstructions()
{
    this._InstructionSequence = new UIElement[this._CurrentLevel];

    for(int i = 0; i < this._CurrentLevel; i++)
    {
        //Random.Next's upper bound is exclusive, so 5 yields values 1 through 4.
        switch(rnd.Next(1, 5))
        {
            case 1:
                this._InstructionSequence[i] = RedBlock;
                break;

            case 2:
                this._InstructionSequence[i] = BlueBlock;
                break;

            case 3:
                this._InstructionSequence[i] = GreenBlock;
                break;

            case 4:
                this._InstructionSequence[i] = YellowBlock;
                break;
        }
    }
}

private void DisplayInstructions()
{
    Storyboard instructionsSequence = new Storyboard();
    DoubleAnimationUsingKeyFrames animation;

    for(int i = 0; i < this._InstructionSequence.Length; i++)
    {
        animation = new DoubleAnimationUsingKeyFrames();
        animation.FillBehavior = FillBehavior.Stop;
        animation.BeginTime = TimeSpan.FromMilliseconds(i * 1500);
        Storyboard.SetTarget(animation, this._InstructionSequence[i]);
        Storyboard.SetTargetProperty(animation, new PropertyPath("Opacity"));
        instructionsSequence.Children.Add(animation);

        animation.KeyFrames.Add(new EasingDoubleKeyFrame(0.3,
            KeyTime.FromTimeSpan(TimeSpan.Zero)));
        animation.KeyFrames.Add(new EasingDoubleKeyFrame(1,
            KeyTime.FromTimeSpan(TimeSpan.FromMilliseconds(500))));
        animation.KeyFrames.Add(new EasingDoubleKeyFrame(1,
            KeyTime.FromTimeSpan(TimeSpan.FromMilliseconds(1000))));
        animation.KeyFrames.Add(new EasingDoubleKeyFrame(0.3,
            KeyTime.FromTimeSpan(TimeSpan.FromMilliseconds(1300))));
    }

    instructionsSequence.Completed += (s, e) => { ChangePhase(GamePhase.PlayerPerforming); };
    instructionsSequence.Begin(LayoutRoot);
}

Running the application now, we can see the application starting to come together. The player can start the game, which then causes Simon to begin issuing instructions.
Doing as Simon Says

The final aspect of the game is to implement the functionality to capture the player acting out the instructions. Notice that when the storyboard completes animating Simon's instructions, the application calls the ChangePhase method to transition the application into the PlayerPerforming phase. Refer back to Listing 5-4, which has the code for the SkeletonFrameReady event handler. When in the PlayerPerforming phase, the application executes the ProcessPlayerPerforming method. On the surface, implementing this method should be easy. The logic is such that a player successfully repeats an instruction when one of his hands enters the space of the target user interface element. Essentially, this is the same hit testing logic we already implemented to trigger the start of the game (Listing 5-5). However, instead of testing against two static UI elements, we test for the next UI element in the instruction array. Add the code in Listing 5-8 to the application. Compile and run it. You will quickly notice that the application works, but is very unfriendly to the user. In fact, the game is unplayable. Our user interface is broken.
Listing 5-8. Processing Player Movements When Repeating Instructions

private void ProcessPlayerPerforming(Skeleton skeleton)
{
    IInputElement leftTarget;
    IInputElement rightTarget;
    FrameworkElement correctTarget;

    correctTarget = this._InstructionSequence[this._InstructionPosition];
    leftTarget   = GetHitTarget(skeleton.Joints[JointType.HandLeft], GameCanvas);
    rightTarget  = GetHitTarget(skeleton.Joints[JointType.HandRight], GameCanvas);

    if(leftTarget != null && rightTarget != null)
    {
        ChangePhase(GamePhase.GameOver);
    }
    else if(leftTarget == null && rightTarget == null)
    {
        //Do nothing - neither hand is over a target
    }
    else if((leftTarget == correctTarget && rightTarget == null) ||
            (rightTarget == correctTarget && leftTarget == null))
    {
        this._InstructionPosition++;

        if(this._InstructionPosition >= this._InstructionSequence.Length)
        {
            ChangePhase(GamePhase.SimonInstructing);
        }
    }
    else
    {
        ChangePhase(GamePhase.GameOver);
    }
}

Before breaking down the flaws in the logic, let's understand essentially what this code attempts to accomplish. The first lines of code get the target element, which is the current instruction in the sequence. Then, through hit testing, it gets the UI elements at the points of the left and right hands. The rest of the code evaluates these three variables. If both hands are over UI elements, then the game is over. Our game is simple and only allows a single block at a time. When neither hand is over a UI element, then there is nothing for us to do. If one of the hands matches the expected target, then we increment our instruction position in the sequence. The process continues while there are more instructions, until the player reaches the end of the sequence. When this happens, the game phase changes to SimonInstructing and moves the player to the next round. For any other condition, the application transitions to the GameOver phase.

This works fine, as long as the user is heroically fast, because the instruction position increments as soon as the user enters the UI element. The user is given no time to clear their hand from the UI element's space before their hand's position is evaluated against the next instruction in the sequence. It is impossible for any player to get past level two. As soon as the player successfully repeats the first instruction of round two, the game abruptly ends. This obviously ruins the fun and challenge of the game.

We solve this problem by waiting to advance to the next instruction in the sequence until after the user's hand has cleared the UI element. This gives the user an opportunity to get her hands into a neutral position before the application begins evaluating the next instruction. We need to track when the user's hand enters and leaves a UI element.

In WPF, each UIElement object has events that fire when a mouse enters and leaves the space of a UI element: MouseEnter and MouseLeave, respectively. Unfortunately, as noted, WPF does not natively support UI interactions with skeleton joints produced by Kinect. This project would be a whole lot easier if each UIElement had events named JointEnter and JointLeave that fired each time a skeleton joint interacts with a UIElement. Since we are not afforded this luxury, we have to write the code ourselves. Implementing the same reusable, elegant, and low-level tracking of joint movements that exists for the mouse is non-trivial, and impossible to match given the accessibility of certain class members. This type of development is also quite beyond the level and scope of this book. Instead, we code specifically for our problem.

The fix for the game play problem is easy to make. We add a couple of new member variables to track the UI element over which the player's hand last hovered. When the player's hand enters the space of a UI element, we update the tracking target variable. With each new skeleton frame, we check the position of the player's hand; if it is determined to have left the space of the UI element, then we process the UI element. Listing 5-9 shows the updated code for the ProcessPlayerPerforming method. The key changes are the new tracking variables and the comparisons against them.

Listing 5-9. Detecting Users' Movements During Game Play

private FrameworkElement _LeftHandTarget;
private FrameworkElement _RightHandTarget;

private void ProcessPlayerPerforming(Skeleton skeleton)
{
    UIElement correctTarget   = this._InstructionSequence[this._InstructionPosition];
    IInputElement leftTarget  = GetHitTarget(skeleton.Joints[JointType.HandLeft], GameCanvas);
    IInputElement rightTarget = GetHitTarget(skeleton.Joints[JointType.HandRight], GameCanvas);

    if((leftTarget != this._LeftHandTarget) || (rightTarget != this._RightHandTarget))
    {
        if(leftTarget != null && rightTarget != null)
        {
            ChangePhase(GamePhase.GameOver);
        }
        else if((this._LeftHandTarget == correctTarget && this._RightHandTarget == null) ||
                (this._RightHandTarget == correctTarget && this._LeftHandTarget == null))
        {
            this._InstructionPosition++;

            if(this._InstructionPosition >= this._InstructionSequence.Length)
            {
                ChangePhase(GamePhase.SimonInstructing);
            }
        }
        else if(leftTarget != null || rightTarget != null)
        {
            //Do nothing - a hand is still over a target
        }
        else
        {
            ChangePhase(GamePhase.GameOver);
        }

        if(leftTarget != this._LeftHandTarget)
        {
            AnimateHandLeave(this._LeftHandTarget);
            AnimateHandEnter(leftTarget as FrameworkElement);
            this._LeftHandTarget = leftTarget as FrameworkElement;
        }

        if(rightTarget != this._RightHandTarget)
        {
            AnimateHandLeave(this._RightHandTarget);
            AnimateHandEnter(rightTarget as FrameworkElement);
            this._RightHandTarget = rightTarget as FrameworkElement;
        }
    }
}

With these code changes in place, the application is fully functional. There are two new method calls, which execute when updating the tracking target variable: AnimateHandLeave and AnimateHandEnter. These functions only exist to initiate some visual effect signaling to the user that she has entered or left a user interface element. These types of visual cues or indicators are important to having a successful user experience in your application, and are yours to implement. Use your creativity to construct any animation you want. For example, you could mimic the behavior of a standard WPF button or change the size or opacity of the rectangle.
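As one possibility (this is not the book's implementation), the two animation methods could simply pulse the target's opacity, returning it to the resting value that ChangePhase assigns.

private void AnimateHandEnter(FrameworkElement element)
{
    //Illustrative: brighten the block while a hand is over it.
    if(element != null)
    {
        element.BeginAnimation(UIElement.OpacityProperty,
            new DoubleAnimation(1.0, TimeSpan.FromMilliseconds(250)));
    }
}

private void AnimateHandLeave(FrameworkElement element)
{
    //Illustrative: fade the block back to its resting 0.2 opacity when the hand leaves.
    if(element != null)
    {
        element.BeginAnimation(UIElement.OpacityProperty,
            new DoubleAnimation(0.2, TimeSpan.FromMilliseconds(250)));
    }
}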
Enhancing Simon Says

This project is a good first start in building interactive Kinect experiences, but it could use some improvements. There are three areas of improvement: the user experience, game play, and presentation. We discuss possible enhancements, but the development is up to you. Grab friends and family and have them play the game. Notice how users move their arms and reach for the game squares. Come up with your own enhancements based on these observations. Make sure to ask them questions, because this feedback is always beneficial to making a better experience.
User Experience

Kinect-based applications and games are extremely new, and until they mature, building good user experiences will consist of many trials and an extreme number of errors. The user interface in this project has much room for improvement. Simon Says users can accidentally interact with a game square, and this is most obvious at the start of the game when the user extends his hands to the game start targets. Once both hands are within the targets, the game begins issuing instructions. If the user does not quickly drop his hands, he could accidentally hit one of the game targets. One change is to give the user time to reset his hands by his side before issuing instructions. Because people naturally drop their hands to their side, an easy change is simply to delay instruction presentation for a number of seconds. This same delay is necessary in between rounds. A new round of instructions begins immediately after completing the instruction set. The user should be given time to clear his hands from the game targets.
Game Play

The logic to generate the instruction sequence is simple: the round number determines the number of instructions, and the targets are chosen at random. In the original game, each new round added a new instruction to the instruction set of the previous round. For example, round one might be red. Round two would be red then blue. Round three would add green, so the instruction set would be red, blue, and green. Another change could be to not increase the number of instructions incrementally by one each round. Rounds one through three could have instruction sets that equal the round number in the instruction count, but after that, the instruction set is twice the round number. A fun aspect of software development is that the application code can be refactored so that we can have multiple algorithms to generate instruction sequences. The game could allow the user to pick an instruction generation algorithm. For simplicity, the algorithms could be named easy, medium, or hard. While the base logic for generating instruction sequences gets longer with each round, the instructions display at a constant rate. To increase the difficulty of the game even more, decrease the amount of time the instruction is visible to the user when presenting the instruction set.
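A sketch of the "classic" rule suggested above, appending one new random block per round, might look like the following. The block and field names follow this chapter's code, but the method itself is hypothetical, not part of the project.

private void AppendInstruction()
{
    //Keep the previous round's sequence and append one new random block,
    //as in the original Simon game. Assumes _CurrentLevel was just incremented.
    UIElement[] blocks = { RedBlock, BlueBlock, GreenBlock, YellowBlock };
    UIElement[] newSequence = new UIElement[this._CurrentLevel];

    if(this._InstructionSequence != null)
    {
        Array.Copy(this._InstructionSequence, newSequence, this._InstructionSequence.Length);
    }

    newSequence[newSequence.Length - 1] = blocks[rnd.Next(0, blocks.Length)];
    this._InstructionSequence = newSequence;
}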
Presentation

The presentation of each project in this book is straightforward and easy. Creating visually attractive and amazing applications requires more attention to the presentation than is afforded in these pages. We want to focus more on the mechanics of Kinect development and less on application aesthetics. It is your duty to make gorgeous applications. With a little effort, you can polish the UI of these projects to make them dazzle and engage users. For instance, create nice animation transitions when delivering instructions, and when a user enters and leaves a target area. When users get instruction sets correct, display an animation to reward them. Likewise, have an animation when the game is over. At the very least, create more attractive game targets. Even the simplest games and applications can be engaging to users. An application's allure and charisma comes from its presentation and not from the game play.
Reflecting on Simon Says

This project illustrates the basics of user interaction. It tracks the movements of the user's hands on the screen with two cursors, and performs hit tests with each skeleton frame to determine if a hand has entered or left a user interface element. Hit testing is critical to user interaction regardless of the input device. Since Kinect is not integrated into WPF as the mouse, stylus, or touch are, Kinect developers have to do more work to fully implement user interaction into their applications. The Simon Says project serves as an example, demonstrating the concepts necessary to build more robust user interfaces. The demonstration is admittedly shallow, and more is possible to create reusable components.
Depth-Based User Interaction

Our projects working with skeleton data so far (Chapters 4 and 5) utilize only the X and Y values of each skeleton joint. However, the aspect of Kinect that differentiates it from all other input devices is not utilized. Each joint comes with a depth value, and every Kinect application should make use of the depth data. Do not forget the Z. The next project explores uses for skeleton data, and examines a basic approach to integrating depth data into a Kinect application.
Without using the 3D capabilities of WPF, there are a few ways to layer visual elements, giving them depth. The layout system ultimately determines the layout order of the visual elements. Using elements of different sizes along with layout system layering gives the illusion of depth. Our new project uses a Canvas and the Canvas.ZIndex property to set the layering of visual elements. Alternatively, it uses both manual sizing and a ScaleTransform to control dynamic scaling for changes in depth. The user interface of this project consists of a number of circles, each representing a certain depth. The application tracks the user's hands with cursors (hand images), which change in scale depending on the depth of the user. The closer the user is to the screen, the larger the cursors, and the farther from Kinect, the smaller the scale.

In Visual Studio, create a new project and add the necessary Kinect code to handle skeleton tracking. Update the XAML in MainWindow.xaml to match that shown in Listing 5-10. Much of the XAML is common to our previous projects, or consists of obvious additions based on the project requirements just described. The main layout panel is the Canvas element. It contains five Ellipses, along with an accompanying TextBlock for each. The TextBlocks are labels for the circles. Each circle is randomly placed around the screen, but given specific Canvas.ZIndex values. A detailed explanation behind the values comes later. The Canvas also contains two images that represent the hand cursors. Each defines a ScaleTransform. The image used for the screenshots is that of a right hand. The -1 ScaleX value flips the image to make it look like a left hand.
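As a sketch of the scaling idea (the depth range and scale bounds here are assumptions, not the project's actual values), the cursor's ScaleTransform can be driven directly by the hand joint's Z value.

private static void ScaleCursor(ScaleTransform transform, double jointZ)
{
    //Map roughly 1m-4m of joint depth onto a 1.5x-0.5x cursor scale, preserving
    //the sign of ScaleX so the left-hand image keeps its -1 horizontal flip.
    double scale = Math.Max(0.5, Math.Min(1.5, 2.0 - (jointZ / 2.0)));

    transform.ScaleX = scale * Math.Sign(transform.ScaleX);
    transform.ScaleY = scale;
}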
Listing 5-10. DeepUITargets XAML