Inside of Parallel Data Warehouse

At first glance, PDW looks like the usual federated distributed database that we are used to working with. However, PDW shields us from the complicated parts, such as distributed partitioned views and data replication, which is very convenient. All PDW functionality is built on the ordinary SQL Server 2008 R2 engine, wrapped in a set of new special-purpose modules: a backup module, a restore module, bulk load, a distributed query collector, and so on. As a result, the overall performance that a PDW solution delivers with a basic configuration of 10 boxes is really impressive. Moreover, according to Microsoft, PDW was developed without any dependency on specific hardware. The only point where PDW depends on hardware is filegroup management, because the PDW engine does not let you manage filegroups, which are closely coupled with the storage structure. Given the complexity of the PDW infrastructure, it is easier to understand why Microsoft sells PDW only through vendors. From my perspective, though, Microsoft sells PDW through vendors not because of some specific hardware, but simply to be sure that everything works smoothly and optimally.

The list of new commands makes it evident that the principles for dividing data in PDW are the same as in a usual federated distributed database: tables are split horizontally. However, all management of distributed queries has been moved to the primary PDW node. Moreover, the old-fashioned idea of manual data distribution has been built into the T-SQL syntax. It is still crucially important to design the database structure properly; otherwise, all your investment will be wasted. For this reason, it would be very interesting to compare the performance of a federated distributed database and Parallel Data Warehouse, because in theory they are quite similar.
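
To make the idea of horizontal splitting more concrete, here is a minimal Python sketch of the general technique, not of PDW's actual internals. It assumes rows are assigned to nodes by hashing a distribution key, and that a coordinating node collects partial results from every node and combines them. The node count, the CRC32 hash, the sample rows, and all function names are hypothetical and chosen only for illustration.

```python
import zlib
from collections import defaultdict

NODE_COUNT = 4  # hypothetical number of compute nodes


def node_for(key, node_count=NODE_COUNT):
    """Map a distribution key to a node index with a stable hash.

    CRC32 is an illustrative choice only; it is not claimed to be
    the hash function that PDW uses.
    """
    return zlib.crc32(str(key).encode("utf-8")) % node_count


def distribute(rows, key_column, node_count=NODE_COUNT):
    """Split a table horizontally: each row goes to exactly one node."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for(row[key_column], node_count)].append(row)
    return nodes


def distributed_sum(nodes, value_column):
    """Emulate the coordinating node: gather per-node partial sums, then combine."""
    partials = [sum(row[value_column] for row in shard) for shard in nodes.values()]
    return sum(partials)


# Hypothetical sample data.
sales = [
    {"customer_id": 1, "amount": 10},
    {"customer_id": 2, "amount": 25},
    {"customer_id": 1, "amount": 5},
    {"customer_id": 3, "amount": 40},
    {"customer_id": 2, "amount": 20},
]

shards = distribute(sales, "customer_id")
total = distributed_sum(shards, "amount")
```

Rows with the same key always land on the same node, so the choice of the distribution column decides how evenly the data is spread. This is one reason why proper database design still matters so much.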

Since it is not easy to get access to a Parallel Data Warehouse solution in real operation, I am still curious how much faster it is than its predecessor technology, based on the federated distributed database architecture, on the same amount of hardware. To begin with, though, we need to go through the article Getting Started with Parallel Data Warehouse by Rich Johnson, which gives a precise description of the internals of Parallel Data Warehouse and could be very interesting for DBAs who want to work with PDW.


Source: Aleksey Fomchenko (https://sqlconsulting.wordpress.com)
