Implementing a Microsoft SQL Server Parallel Data Warehouse Using the Kimball Approach
Writer: Warren Thornthwaite
Technical Reviewers: Jesse Fountain, Barbara Kess, Stuart Ozer
Published: June 2011
Applies to: SQL Server 2008 R2
This white paper explores how the Kimball approach to architecting and building a data warehouse/businessintelligence (DW/BI) system works with Microsoft’s Parallel Data Warehouse, and how you would incorporate this new product as the cornerstone of your DW/BI system. For readers who are not familiar with the Kimball approach, we begin with a brief overview of the approach and its key principles. We then explore the Parallel Data Warehouse (PDW) system architecture and discuss its alignment with the Kimballapproach. In the last section, we identify key best practices and pitfalls to avoid when building or migrating a large data warehouse to a Microsoft SQL Server PDW system. This includes a look at how Parallel Data Warehouse can work with other Microsoft offerings such as SQL Server 2008 R2, the Microsoft Fast Track Reference Architecture for Data Warehouse, and the new Business Data Warehouse toprovide a complete solution for enterprise-class data warehousing.
This document is provided “as-is”. Information and views expressed in this document, including URL and other Internet Web site references, may change without notice. You bear the risk of using it.
Some examples depicted herein are provided for illustration only and are fictitious. No real association orconnection is intended or should be inferred.
This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You may copy and use this document for your internal, reference purposes.
© 2011 Microsoft. All rights reserved.
About the Author
Warren Thornthwaite started his data warehousing/business intelligence career in 1984 at Metaphor ComputerSystems where he worked for eight years implementing major DW/BI systems and managing the consulting organization. After Metaphor, Warren became the Program Director of Stanford University’s enterprise data warehouse (EDW) development project. He left Stanford to co-found InfoDynamics LLC, a data warehouse consulting firm in 1994. In 1997, Warren joined WebTV to help build a world class,multi-terabyte customer focused data warehouse. In 2003 he returned to consulting as a member of the Kimball Group with Ralph Kimball.
Warren holds a BA in Communication Studies from the University of Michigan and an MBA in Decision Sciences from the University of Pennsylvania’s Wharton School. He is co-author for The Data Warehouse Lifecycle Toolkit, 2nd Edition (Wiley, 2008) and The Microsoft DataWarehouse Toolkit, 2nd Edition (Wiley, 2011). He teaches Kimball University courses around the world, and is a regular presenter at industry conferences.
The Kimball Group is the source for dimensional DW/BI consulting and education, consistent with our best selling Toolkit book series, Design Tips, and articles. Please visit www.kimballgroup.com for more information.
Introduction 5Section 1: The Kimball Approach 6
Follow a Proven Methodology: Lifecycle Steps and Tracks 6
The Data Warehouse Bus Matrix 8
Data Model Design 9
What Is a Dimensional Model? 9
Performance and Maintenance 11
Dimensions and Facts 11
Tracking Attribute Variations over Time 14
Performance Depends on the Platform 14
Section 2: Massively Parallel Processingand the Parallel Data Warehouse 15
The Strengths and Limits of Single-Server Systems 15
The Massively Parallel Processing Alternative 16
The SQL Server PDW Architecture 16
A Scalable Appliance 18
SQL Server PDW Data Management 18
The Kimball Approach on SQL Server PDW 18
Section 3: Enterprise Data Warehouse Architecture Options 20
Requirements for an Enterprise Data Warehouse...