Data Warehousing:
This term describes the practice of centrally collecting and transforming structured data from diverse platforms so that it can be analyzed to generate insights.
Data Warehouse:
A Data Warehouse is a relatively costly repository for structured data, designed for easy access and analysis through SQL and visualization tools.
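As a rough illustration of the SQL-based access described above, the sketch below loads a few structured rows into a table and aggregates them with a query. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in for a warehouse engine; the sales table, its columns, and the revenue_by_region helper are hypothetical names chosen for the example.

```python
import sqlite3


def revenue_by_region(rows):
    """Load (region, amount) rows into a table and total them per region with SQL."""
    # An in-memory SQLite database stands in for a real warehouse (assumption).
    conn = sqlite3.connect(":memory:")
    try:
        # Structured data: the schema is fixed before any row is stored.
        conn.execute(
            "CREATE TABLE sales (region TEXT NOT NULL, amount REAL NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO sales (region, amount) VALUES (?, ?)", rows
        )
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


sample_rows = [("EU", 120.0), ("US", 200.0), ("EU", 80.0)]
print(revenue_by_region(sample_rows))
```

The point of the sketch is only that the data has a declared schema up front, which is what makes plain SQL and visualization tools sufficient for analysis.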
Data Lake:
In contrast, a Data Lake is a more economical repository for structured, semi-structured, and unstructured data. It can be analyzed with a range of tools, including Python, R, Scala, Java, and SQL. Its advantage is the variety of storage formats it accepts; the challenge is that it can become disorganized, which makes it harder to manage and to derive insights from.
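The mix of formats described above can be sketched with a temporary directory standing in for the lake. The example writes one structured (CSV), one semi-structured (JSON), and one unstructured (text) file, then picks a reader per file at read time. The file names and the write_sample_lake and read_lake helpers are hypothetical, and a real lake would normally sit on object storage rather than a local folder.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_sample_lake(root: Path) -> None:
    """Drop three files of different shapes into the same folder."""
    (root / "orders.csv").write_text(
        "order_id,amount\n1,10.5\n2,4.0\n", encoding="utf-8"
    )
    (root / "events.json").write_text(
        json.dumps([{"user": "a", "action": "click"}]), encoding="utf-8"
    )
    (root / "notes.txt").write_text(
        "free-form support ticket text", encoding="utf-8"
    )


def read_lake(root: Path) -> dict:
    """Choose a parser per file extension; structure is applied only on read."""
    contents = {}
    for path in sorted(root.iterdir()):
        if path.suffix == ".csv":
            with path.open(newline="", encoding="utf-8") as handle:
                contents[path.name] = list(csv.DictReader(handle))
        elif path.suffix == ".json":
            contents[path.name] = json.loads(path.read_text(encoding="utf-8"))
        else:
            # Unstructured data is returned as raw text.
            contents[path.name] = path.read_text(encoding="utf-8")
    return contents


with tempfile.TemporaryDirectory() as tmp:
    lake_root = Path(tmp)
    write_sample_lake(lake_root)
    for name, data in read_lake(lake_root).items():
        print(name, "->", data)
```

Nothing in the folder enforces a schema or records what each file means, which is the disorganization risk the definition mentions.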
Delta Lake:
Delta Lake is a Data Lake enriched with a transaction log. The log provides ACID transactional guarantees, which ensure a level of data integrity and reliability within the repository.
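To make the role of the transaction log more concrete, here is a deliberately simplified toy model, not the actual Delta Lake protocol or API. Data files are written first, and they only become visible once a numbered commit entry listing them is added to a log folder. The _toy_log folder name, the add and remove action shape, and the commit and visible_files helpers are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path

LOG_DIR = "_toy_log"  # hypothetical folder name for this sketch


def commit(table: Path, version: int, actions: list) -> None:
    """Record one commit as a numbered JSON file in the log folder."""
    log_dir = table / LOG_DIR
    log_dir.mkdir(parents=True, exist_ok=True)
    entry = log_dir / f"{version:020d}.json"
    # Mode "x" raises FileExistsError if this version was already committed,
    # so two writers cannot both claim the same version number.
    with entry.open("x", encoding="utf-8") as handle:
        json.dump(actions, handle)


def visible_files(table: Path) -> list:
    """Replay the log in version order to find which data files are live."""
    log_dir = table / LOG_DIR
    if not log_dir.exists():
        return []
    live = set()
    # Zero-padded names sort lexicographically in version order.
    for entry in sorted(log_dir.glob("*.json")):
        for action in json.loads(entry.read_text(encoding="utf-8")):
            if action["op"] == "add":
                live.add(action["file"])
            elif action["op"] == "remove":
                live.discard(action["file"])
    return sorted(live)


with tempfile.TemporaryDirectory() as tmp:
    table = Path(tmp)
    (table / "part-0.json").write_text("[1, 2]", encoding="utf-8")
    commit(table, 0, [{"op": "add", "file": "part-0.json"}])
    # This file is written but never committed, so readers ignore it.
    (table / "part-1.json").write_text("[3]", encoding="utf-8")
    print(visible_files(table))
```

Readers that trust only the log never see a half-finished write, which is the intuition behind the integrity guarantee. A production implementation has to handle much more (concurrent readers, partially written log entries, compaction) than this sketch attempts.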
Data Lakehouse(ing):
This concept involves centrally transforming data from multiple platforms simultaneously to generate insights. Its layers (Bronze, Silver, and Gold) may optionally incorporate transaction logs, which turns the Data Lakehouse into a Delta Lakehouse. Notably, a Data Lakehouse built with Azure and Databricks is now Delta by default.
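The Bronze, Silver, and Gold layers can be sketched as successive transformations over the same records. The example assumes the common reading of the three layers (Bronze holds raw data as ingested, Silver holds cleaned and deduplicated data, Gold holds aggregates ready for reporting), which the text above does not spell out. The sample records and the to_silver and to_gold helpers are hypothetical, and plain Python lists stand in for stored tables.

```python
from collections import defaultdict

# Bronze: raw records exactly as ingested, including a duplicate and a bad value.
bronze = [
    {"order_id": "1", "region": "EU", "amount": "120.0"},
    {"order_id": "2", "region": "US", "amount": "200.0"},
    {"order_id": "2", "region": "US", "amount": "200.0"},  # duplicate row
    {"order_id": "3", "region": "EU", "amount": "n/a"},    # malformed amount
]


def to_silver(raw):
    """Silver: drop malformed rows, remove duplicates, and apply types."""
    seen = set()
    cleaned = []
    for record in raw:
        try:
            amount = float(record["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        if record["order_id"] in seen:
            continue  # keep only the first occurrence of each order
        seen.add(record["order_id"])
        cleaned.append(
            {
                "order_id": int(record["order_id"]),
                "region": record["region"],
                "amount": amount,
            }
        )
    return cleaned


def to_gold(silver_rows):
    """Gold: aggregate cleaned rows into a per-region total."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(silver)
print(gold)
```

In a Delta Lakehouse each of these layers would be stored as a table backed by a transaction log rather than held in memory.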