Must read for data enthusiasts – https://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dw-bi-lifecycle-method/
Why may Cosmos DB not be apt for building a Data Warehouse?
Well, the question is slightly wrong until the context is specified, because it is possible to build a Modern Data Warehouse that includes Cosmos DB in its architecture. This is very relevant today because data is no longer only straightforward, structured content with human-readable entities and relations; it can also be unstructured and/or streaming. The pace of data flow, and of business requirements, is also moving toward near real-time.
See a reference architecture below:
In this blog, the context is the Traditional Data Warehouse, where you model the data, specify relationships, and so on. Let us look at the definition of a Data Warehouse given in the Oracle Docs:
“A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing.”
Now let us ask the right question: why may Cosmos DB not be apt as the data store in a Data Warehouse? It is not apt because Cosmos DB is a NoSQL database, in which it is not easy to draw relationships between entities/tables/data. Here is what an MSDN blog said about this:
“Cosmos DB is not a relational database. You cannot just take your relational database and expect it to run in Cosmos DB. You could move tables of data into Cosmos, but not the relational aspects of your existing data structures.”
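To make the contrast concrete, here is a minimal sketch. It is only an illustration under assumptions: Python's built-in sqlite3 stands in for a relational warehouse, plain dictionaries stand in for Cosmos DB documents (no Cosmos DB SDK is called), and the table names, fields and sample values are all made up.

```python
import sqlite3

# Relational side: a tiny, hypothetical star schema (one dimension, one fact).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (
        customer_id INTEGER PRIMARY KEY,
        region      TEXT NOT NULL
    );
    CREATE TABLE fact_sales (
        sale_id     INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES dim_customer(customer_id),
        amount      REAL NOT NULL
    );
    INSERT INTO dim_customer VALUES (1, 'EU'), (2, 'US');
    INSERT INTO fact_sales VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 70.0);
""")

# The relationship is declared in the schema, so analysis is a plain join.
relational_totals = dict(conn.execute("""
    SELECT d.region, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_customer AS d ON d.customer_id = f.customer_id
    GROUP BY d.region
"""))

# Document side (assumed shape): each sale is a self-contained item with the
# customer attributes copied into it, since nothing links one item to another.
sale_documents = [
    {"id": "10", "amount": 100.0, "customer": {"id": 1, "region": "EU"}},
    {"id": "11", "amount": 50.0, "customer": {"id": 1, "region": "EU"}},
    {"id": "12", "amount": 70.0, "customer": {"id": 2, "region": "US"}},
]

# The aggregation works only because the data was denormalised up front.
document_totals = {}
for doc in sale_documents:
    region = doc["customer"]["region"]
    document_totals[region] = document_totals.get(region, 0.0) + doc["amount"]

print(relational_totals)
print(document_totals)
```

Both sides reach the same totals, but on the document side the customer's region is repeated in every sale. If a customer moves region, every copy has to be rewritten, which is the kind of relational aspect the quote says does not carry over.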
As of today, this is the conclusion. But we cannot say what will happen to these concepts tomorrow, because Cosmos DB is becoming more powerful, and I am already in love with it.
You can read about common scenarios (use cases) where you, or other companies, can use Cosmos DB here.
Do you have different thoughts on this? Please comment.