Table of contents:

Preface
  Who Should Read This Book
  Why We Wrote This Book
  Navigating This Book
  Conventions Used in This Book
  Using Code Examples
  O’Reilly Online Learning
  How to Contact Us
  Acknowledgments
  Hubert
  Stephen

1. Data Mesh Introduction
  Data Divide
  Data Mesh Pillars
  Data Ownership
  Data as a Product
  Federated Computational Data Governance
  Self-Service Data Platform
  Data Mesh Diagram
  Other Similar Architectural Patterns
  Data Fabric
  Data Gateways and Data Services
  Data Democratization
  Data Virtualization
  Focusing on Implementation
  Apache Kafka
  AsyncAPI

2. Streaming Data Mesh Introduction
  The Streaming Advantage
  Streaming Enables Real-Time Use Cases
  Streaming Enables Data Optimization Advantages
  Reverse ETL
  The Kappa Architecture
  Lambda Architecture Introduction
  Kappa Architecture Introduction
  Summary

3. Domain Ownership
  Identifying Domains
  Discernible Domains
  Geographic Regions
  Hybrid Architecture
  Multicloud
  Avoiding Ambiguous Domains
  Domain-Driven Design
  Domain Model
  Domain Logic
  Bounded Context
  The Ubiquitous Language
  Data Mesh Domain Roles
  Data Product Engineer
  Data Product Owner or Data Steward
  Streaming Data Mesh Tools and Platforms to Consider
  Domain Charge-Backs
  Summary

4. Streaming Data Products
  Defining Data Product Requirements
  Identifying Data Product Derivatives
  Derivatives from Other Domains
  Ingesting Data Product Derivatives with Kafka Connect
  Consumability
  Synchronous Data Sources
  Asynchronous Data Sources and Change Data Capture
  Debezium Connectors
  Transforming Data Derivatives to Data Products
  Data Standardization
  Protecting Sensitive Information
  SQL
  Extract, Transform, and Load
  Publishing Data Products with AsyncAPI
  Registering the Streaming Data Product
  Building an AsyncAPI YAML Document
  Assigning Data Tags
  Versioning
  Monitoring
  Summary

5. Federated Computational Data Governance
  Data Governance in a Streaming Data Mesh
  Data Lineage Graph
  Streaming Data Catalog to Organize Data Products
  Metadata
  Schemas
  Lineage
  Security
  Scalability
  Generating the Data Product Page from AsyncAPI
  Apicurio Registry
  Access Workflow
  Centralized Versus Decentralized
  Centralized Engineers
  Decentralized (Domain) Engineers
  Summary

6. Self-Service Data Infrastructure
  Streaming Data Mesh CLI
  Resource-Related Commands
  Cluster-Related Commands
  Topic-Related Commands
  The domain Commands
  The connect Commands
  The streaming Commands
  Publishing a Streaming Data Product
  Data Governance-Related Services
  Security Services
  Standards Services
  Lineage Services
  SaaS Services and APIs
  Summary

7. Architecting a Streaming Data Mesh
  Infrastructure
  Two Architecture Solutions
  Dedicated Infrastructure
  Multitenant Infrastructure
  Streaming Data Mesh Central Architecture
  The Domain Agent (aka Sidecar)
  Data Plane
  Control Plane
  Summary

8. Building a Decentralized Data Team
  The Traditional Data Warehouse Structure
  Introducing the Decentralized Team Structure
  Empowering People
  Working Processes
  Fostering Collaboration
  Data-Driven Automation
  New Roles in Data Domains
  New Roles in the Data Plane
  New Roles in Data Science and Business Intelligence

9. Feature Stores
  Separating Data Engineering from Data Science
  Online and Offline Data Stores
  Apache Feast Introduction
  Summary

10. Streaming Data Mesh in Practice
  Streaming Data Mesh Example
  Deploying an On-Premises Streaming Data Mesh
  Installing a Connector
  Deploying Clickstream Connector and Auto-Creating Tables
  Deploying the Debezium Postgres CDC Connector
  Enrichment of Streaming Data
  Publishing the Data Product
  Consuming Streaming Data Products
  Fully Managed SaaS Services
  Summary and Considerations

Index
About the Authors