Wednesday, 15 August 2012

Power Pivot with Excel 2010 - Features and Getting Started

PowerPivot for Excel is an add-on for Excel 2010. PowerPivot allows users to conduct powerful business intellegence (BI) in an environment that is familier. It's a free download from Microsoft and allows users to work with extremely large data sets. Before PowerPivot, this kind of analysis was limited to enterprise BI tools such as SAS and Business Objects.

PowerPivot uses an in-memory engine called VertiPaq. This new SSAS engine takes advantage of the increased RAM available in most personal computers today. Most IT shops are challenged with the resources needed to build out an enterprise BI environment. PowerPivot moves some of this work closer to the business user. While there are many features in PowerPivot for Excel.

Work With Very Large Data Sets

If you open Microsoft Excel and move to the very bottom of a worksheet, you will see that the maximum number of rows is 1,048,576. This represents about a million rows of data. With PowerPivot for Excel, there is no limit on the number of rows of data. While this is a true statement, the actual limitation is based on the version of Microsoft Excel you are running and whether you are going to publish your spreadsheet to SharePoint 2010.

If you are running the 64-bit version of Excel, PowerPivot can reportedly handle about 2 GB of data. You must have enough RAM to make this work. If you plan to publish your PowerPivot based Excel spreadsheet to SharePoint 2010, the maximize file size is also 2 GB. The bottom line is PowerPivot for Excel can handle millions of records. If you hit the maximum, you will receive a memory error.

Combine Data From Different Sources

This has to be one of the most important features in PowerPivot for Excel. Excel has always been able to handle different data sources such as SQL Server, XML, Microsoft Access and even web based data. The problem comes when you need to created relationships between different data sources. There are 3rd party products available to help with this, and you can use Excel functions like VLOOKUP to "join" data. But these methods are impractical for large data sets. PowerPivot for Excel is built to accomplish this task.
Within PowerPivot, you can import data from virtually any data source. I have found that one of the most useful data sources is a SharePoint List. I have used PowerPivot for Excel to combine data from SQL Server and a list from SharePoint. You will need SharePoint 2010 to make this work along with the ADO.Net runtime installed on the SharePoint environment.
When you connect PowerPivot to a SharePoint list, you are actually connecting to a Data Feed. To create a Data Feed from a SharePoint list, open the list and click on the List ribbon. Then click on Export as Data Feed and save it. The feed is available as a URL in PowerPivot for Excel.

There is one challenge that will be resolved by Microsoft soon. PowerPivot for Excel only supports inner joins. If you need to do an outer join, you will need to wait until an update is released from Microsoft or tweak your data so that an inner join can be used.

Create Visually Apealing Analytical Models

PowerPivot for Excel allows you to output a variety of visual data to your Excel worksheet. You can return data in a PivotTable, PivotChart, Chart and Table (horizontal and vertical), Two Charts (horizontal and vertical), Four Charts and a Flattened PivotTable. The power comes when you create a worksheet that includes multiple outputs. This provides a dashboard view of the data that makes analysis really easy. Even your executives should be able to interact with your worksheet if you build it correctly. Slicers, which shipped with Excel 2010, makes it simple to visually filter data.

Use DAX to Create Calculated Fields for Slicing and Dicing Data

DAX stands for Data Analysis Expressions and is the formula language used in PowerPivot tables. There are similarities between Excel formulas and DAX. The DAX formula language is used primarily in creating calculated columns. Check out the DAX Reference in TechNet for a complete reference.

In a regular Pivot Table in Excel that included a properly formated date field, you can use grouping to include the ability to filter or group by year, quarter, month and day. In PowerPivot, you need to create these as calculated columns to accomplish the same thing. Add a column for each way you need to filter or group data in your Pivot Table. Many of the date functions in DAX are the same as Excel formulas which makes this a snap.
For example, use =YEAR([date column]) in a new calculated column to add the year to your data set in PowerPivot. You can then use this new YEAR field as a slicer or group by in your Pivot Table.

Publish Dashboards to SharePoint 2010

If your company is like mine, dashboard are still the work of your IT team. PowerPivot when combined with SharePoint 2010 puts the power of dashboards into the hands of your users. One of the prerequisites of publishing PowerPivot driven charts and tables to SharePoint 2010 is the implementation of PowerPivot for SharePoint on your SharePoint 2010 farm

Installing PowerPivot

The PowerPivot plugin for Excel can be downloaded from the official PowerPivot site:

PowerPivot can be installed on both x86 and x64 systems including Windows XP sp3, Windows Vista sp1, and Windows 7. If you aren’t installing it on Windows 7, you will need to install .Net Framework 3.5 SP1.
A minimum of 1G of memory is required to run PowerPivot, but 2G is recommended. Depending on the solution, more memory for PowerPivot can increase the performance of the transformations due to its in-memory processing capabilities.

Also on the PowerPivot site is a link to a trial for Microsoft Office 2010 Professional Plus if you don’t already have it installed. If you customize the installation of the Office Suite, you need to choose at least Excel and the Office Shared Tools.
Once Microsoft Office is installed, you can download the PowerPivot install from the download page on the PowerPivot site. Locate and click on the PowerPivot install file. The add-in will actually install the next time you run Excel, at which time you will be asked to approve the installation of the PowerPivot add-in.

Creating Your First PowerPivot

Now that the PowerPivot add-in is installed, you can see that you have a new PowerPivot tab across the top of Excel. Clicking on that tab displays the PowerPivot ribbon. To select the data to use for your PowerPivot, you will need to click on the PowerPivot Window button (highlighted on the left below).

On the PowerPivot window, select Get External Data from Database -> From SQL Server. Notice that you can also get data from many other sources including data feeds, text files, and the web. In this case, however, we are going to use the AdventureWorks2008DW sample database.

In the Table Import Wizard, specify localhost for the server and AdventureWorks2008DW as the Database name. If you need to install AdventureWorks2008DW, you can find it here.

Hit Next and make sure to select “Select from a list of tables and views to choose the data to import” from the next screen.

From the table list, select the FactResellerSales table and rename it to ResellerSales in the Friendly Name column.


While you are there, click the Select Related Tables button. This selects the 7 tables in the database that have a relationship with the FactResellerSales table. Click Preview & Filter just to have a look; we won’t change anything here, though you could use it to filter out rows and columns from the model.

Click Cancel to get out of Preview & Filter and then hit Finish on the Table Import Wizard screen. This brings up the following window which displays the progress on each table import.

A Details link displays in the message column on each import line if there is information or errors regarding the import. Notice there is a detail link on final step, data preparation. Clicking that displays a popup window with information about the data preparation. Most of it is successful, but it does inform you of a few errors. For instance, self-joins are not supported in the case of employee. Fortunately, none of these affect the pivot we are creating.

Closing the Table Import Wizard reveals the model from which we will work. First, let’s tweak a few things. You see each imported table is represented on its own tab in the model. Let’s select the ResellerSales tab.

Scroll all the way to the right and click on “Add Column” in the header. For the expression, subtract TotalProductCost from SalesAmount. You can either type in the expression as it appears below, or build it as you click the columns.

Now we need to give our new computed column a better name than ComputedColumn1. Right-click on the column header and select “Rename Column” from the drop down menu. Change the name to “Profit”.

To get started on the Pivot Table, we need to switch back to the workbook itself. You can either just click on the spreadsheet in the background or you can use the toolbar button that looks like the Excel icon (next to Formatting) to switch back.
Once back in the workbook, select the PivotTable button and choose Chart and Table (Horizontal) from the menu. This allows you to create both a chart and table for your data contained on one table. The chart will be on the left and data will be on the right.

When asked for a decision on placing it in a new or existing workbook, choose the existing workbook.

The Field List panel on your right controls the contents of the pivot. Let’s begin by dragging the fields we are interested in from the list into their respective areas. First drag the amount fields for TotalProcuctCost, SalesAmount, and Profit from the ResellerSales table down to the Values list. Drag the SalesTerritoryRegion from DimSalesTerritory to the Row Labels list. Similarly, drag the ResellerName column from the DimReseller table down to the Row Labels list and drop it in below the SalesTerritoryRegion column.
Now that the report is filled in, we’ll clean it up a little. Change the header “Row Labels” to “Region\Reseller”. Change the headers over the three amount fields to remove “Sum of” from the name. Your pivot report and Field List panel should resemble the following screen shot.

It’s starting to look good, but the chart is way too busy. Maybe we should just look at one region at a time. So let’s add a horizontal slicer. From the Field List, choose SalesTerritoryRegion again and drag it to the Slicers Horizontal list. Now you see a button for each region displayed at the top of the report. Click on France. You’ll see the report and chart filter down to display just the region of France and its resellers.

Let’s further narrow down the chart by doing a manual filter using the filter button on the chart for ResellerName. From the list, first click Select All to unselect all the resellers, then pick just Accessories Network, Ace Bicycle Supply, and Atypical Bike Company. This reduces the Resellers displayed on the chart to just these three. Note that if we would have selected resellers not from France (our existing Region filter), the chart would be empty.

The resulting chart is much easier to read having just the three resellers of interest graphed.

We are not quite done. It’s bothering me that our amounts have more than 2 decimal places, so let’s take care of that real quick. Select all the data for the three amount columns and when the formatting menu appears, select the $ icon as shown below.

And now, the final product. Congratulations on completing your first PowerPivot report.

No comments:

Post a Comment