Table of contents:

Cover
Table of Contents
Title Page

Introduction
    GenAI APPLICATIONS AND LARGE LANGUAGE MODELS
    IMPORTANCE OF COST OPTIMIZATION
    MICRO CASE STUDIES
    WHO IS THIS BOOK FOR?
    SUMMARY

1 Introduction
    OVERVIEW OF GenAI APPLICATIONS AND LARGE LANGUAGE MODELS
    PATHS TO PRODUCTIONIZING GenAI APPLICATIONS
    THE IMPORTANCE OF COST OPTIMIZATION
    SUMMARY

2 Tuning Techniques for Cost Optimization
    FINE‐TUNING AND CUSTOMIZABILITY
    PARAMETER‐EFFICIENT FINE‐TUNING METHODS
    COST AND PERFORMANCE IMPLICATIONS OF PEFT METHODS
    SUMMARY

3 Inference Techniques for Cost Optimization
    INTRODUCTION TO INFERENCE TECHNIQUES
    PROMPT ENGINEERING
    CACHING WITH VECTOR STORES
    CHAINS FOR LONG DOCUMENTS SUMMARIZATION
    BATCH PROMPTING FOR EFFICIENT INFERENCE
    MODEL OPTIMIZATION METHODS
    PARAMETER‐EFFICIENT FINE‐TUNING METHODS
    COST AND PERFORMANCE IMPLICATIONS
    SUMMARY
    REFERENCES

4 Model Selection and Alternatives
    INTRODUCTION TO MODEL SELECTION
    MOTIVATING EXAMPLE: THE TALE OF TWO MODELS
    THE ROLE OF COMPACT AND NIMBLE MODELS
    EXAMPLES OF SUCCESSFUL SMALLER MODELS
    DOMAIN‐SPECIFIC MODELS
    THE POWER OF PROMPTING WITH GENERAL‐PURPOSE MODELS
    SUMMARY

5 Infrastructure and Deployment Tuning Strategies
    INTRODUCTION TO TUNING STRATEGIES
    HARDWARE UTILIZATION AND BATCH TUNING
    INFERENCE ACCELERATION TOOLS
    MONITORING AND OBSERVABILITY
    SUMMARY

CONCLUSION
    BALANCING PERFORMANCE AND COST
    FUTURE TRENDS IN GenAI APPLICATIONS
    SUMMARY

INDEX
Copyright
Dedication
ABOUT THE AUTHOR
ABOUT THE TECHNICAL EDITOR
End User License Agreement