This book is for architects and senior managers who are responsible for building a strategy around their current data architecture, and it helps them identify the need for a Data Lake implementation in an enterprise context. Readers will need a good knowledge of master data management and information lifecycle management, as well as experience with Big Data technologies.
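This is a guide for architects and senior managers on how to implement a Data Lake strategy in an enterprise environment. The book begins by presenting the advantages of a Data Lake, including flexibility, raw data storage, rapid data enrichment, a comprehensive view of the data, and the ability for data scientists to access raw data. The author then discusses the data ingestion process in detail, including data collection, data cleansing, data validation, and data loading.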
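The book also explores future trends for the Data Lake in depth, in particular how deep learning and the Data Lake can be combined to change the future of the enterprise. The author also points out the limitations of traditional data architectures and explains how a Data Lake overcomes them.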