OLTPs / Traditional database designs are perhaps one of the most common database types used. Data warehouse as a concept was introduced in 1980, but widespread adoption came a good few years later.
Since OLTP designs are very common and well known, It helps to understand the difference in purpose between traditional OLTP DBs and data warehouses.
Traditional DBs are primarily a persistence layer of an application – where data is changed very frequently requiring a quick operational response time. It’s all about the speed at which CRUD operations can be performed, and ensuring high availability of data for other processes.
To make that happen, these DBs are supposed to be:
- Very quick in inserting / updating new records, those being the most frequent operations.
- The DB can be accessed by different operations and therefore locking needs to be minimized.
- Typical read operations usually fetch a small amount of data – so querying typically should be optimized and not heavy on processing
- It’s crucial the data is consistent, entries match and are linked properly, which leads to:
- Normalization – which helps with reducing redundancy, maintaining consistency making updates easier.
Data warehouses have a completely different purpose. OLTPs are used to maintain data for ‘system of records’, Data warehouses are meant to serve as a single source of enterprise wide data for reporting. That means – OLTPs have small, quick and very frequent updates. Data warehouses (DW) are used for querying very large data sets and typically ‘batch’ updates which don’t have to be that performance critical.
To elaborate further this is how data warehouses are different in contrast with OLTPs:
- The most important purpose of a DW is retrieval of large data sets through queries instead of very limited data sets
- Since the primary job is retrieval of data – performance of ‘read’ operations is essential
- Typically you might not have many operations trying to access the DW, therefore locking is not an issue
- DW are a copy of OLTP data – that means we typically will not have frequent inserts or updates like an OLTP system since those are the customer facing ones
- Since insert / update will not be as frequent, therefore performance on write operations is not a concern