Recently I was at a client talking about Power BI in Excel. They had some questions about where to load data from their source system. After all, you can just load data using the Data tab and create a PivotTable based on that. Why use Power Pivot? The best reason I could give is that data loaded into Power Pivot is handled by the xVelocity engine. The xVelocity engine loads data into memory and compresses it, which increases the amount of data that you can store within Excel.
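To give a feel for why a column-oriented, in-memory engine can store data in less space, here is a toy sketch of two encodings commonly used by columnar stores: dictionary encoding and run-length encoding. It is only an illustration under my own assumptions, not the xVelocity engine's actual implementation, and the function names are made up for this example.

```python
from itertools import groupby


def dictionary_encode(column):
    """Replace each value with a small integer id pointing into a list of distinct values."""
    dictionary = []      # distinct values, in order of first appearance
    ids_by_value = {}    # value -> position in `dictionary`
    encoded = []
    for value in column:
        if value not in ids_by_value:
            ids_by_value[value] = len(dictionary)
            dictionary.append(value)
        encoded.append(ids_by_value[value])
    return dictionary, encoded


def run_length_encode(ids):
    """Collapse consecutive repeats into (id, run_length) pairs."""
    return [(key, sum(1 for _ in group)) for key, group in groupby(ids)]


def decode(dictionary, runs):
    """Rebuild the original column from the dictionary and the runs."""
    return [dictionary[i] for i, length in runs for _ in range(length)]


# Hypothetical column with lots of repetition, as is typical in long tables.
column = ["Monday"] * 4 + ["Tuesday"] * 3 + ["Monday"] * 2
dictionary, ids = dictionary_encode(column)
runs = run_length_encode(ids)

print(dictionary)  # ['Monday', 'Tuesday']
print(runs)        # [(0, 4), (1, 3), (0, 2)]
assert decode(dictionary, runs) == column
```

Nine strings shrink to two dictionary entries and three runs. The more a column repeats itself, the better this kind of encoding works.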
Demonstrating How the xVelocity Engine Works in Power Pivot
Since about 1890 Missouri has been known as the “Show Me State”. Apparently no one really knows why, but I’ve heard the expression “I am from Missouri, so show me”. I’m assuming that everyone reading this is from Missouri, so to speak, which means I need to show you how the xVelocity engine works. To do that, I am using a sample data set from the UCI Machine Learning Repository, which is a great place for getting machine learning samples. The data set I selected for this test is the Online News Popularity set, which can be found here. If you don’t feel like downloading it, you can do the same thing with any other large text file.
The Online News Popularity file has 65 columns and 39,644 rows, and the csv file is 16,518 KB. If I open that file, save it in the Excel file format, and do nothing else, the file size is 18,484 KB, so saving the csv as an Excel workbook adds about 2 MB. This isn’t surprising, as Excel wraps the text in its own workbook structure when it saves, which you can see if you ever open an Excel file in Notepad.
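If you would rather not squint at an Excel file in Notepad, the sketch below lists what is inside a saved workbook. It assumes the workbook was saved as .xlsx, which is a ZIP package of XML parts, so Python's standard zipfile module can read it. The function name is my own, and the script expects the path to the workbook as its first argument.

```python
import sys
import zipfile


def list_parts(workbook_path):
    """Return (part name, uncompressed bytes, compressed bytes) for each part in the package."""
    with zipfile.ZipFile(workbook_path) as archive:
        return [
            (info.filename, info.file_size, info.compress_size)
            for info in archive.infolist()
        ]


# Only runs when a workbook path is supplied on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    for name, raw_bytes, packed_bytes in list_parts(sys.argv[1]):
        print(f"{name:40} {raw_bytes:>12,} bytes raw {packed_bytes:>12,} bytes stored")
```

Run against a workbook, this prints one line per part, which makes it easy to see which pieces of the file take up the space.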
The xVelocity File Size Test
So far we have not tested the xVelocity engine. To test it, open Excel, click the Power Pivot tab, and select Manage, which opens the Power Pivot window. Within Power Pivot, click the From Other Sources icon, select Text File, then click the Next button. Browse to the Online News Popularity.csv location, wait until the data loads, then click the Finish button. The data will then be loaded into Power Pivot. Save everything and exit Excel.
The Excel file with the data loaded into Power Pivot is 11,386 KB, which is about 38% smaller than the 18,484 KB workbook saved without Power Pivot and about 31% smaller than the original csv. In playing around with various files, I have noticed that you get the biggest reduction if you have few columns and many rows.
Now, I know that the xVelocity engine does more than just shrink files, but showing someone reduced memory usage just doesn’t have the same impact as showing a smaller file. This simple experiment shows the improvements made to Power Pivot for Excel, and I hope you find it useful as well.
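For anyone who wants to check the arithmetic, the size of the reduction depends on which file you treat as the baseline. This short sketch uses only the three file sizes quoted in this post; the function and constant names are my own.

```python
def percent_reduction(before_kb, after_kb):
    """Percentage by which the file shrank relative to the starting size."""
    return (before_kb - after_kb) / before_kb * 100


# File sizes in KB, as measured in this post.
CSV_KB = 16_518
PLAIN_WORKBOOK_KB = 18_484
POWER_PIVOT_WORKBOOK_KB = 11_386

vs_workbook = percent_reduction(PLAIN_WORKBOOK_KB, POWER_PIVOT_WORKBOOK_KB)
vs_csv = percent_reduction(CSV_KB, POWER_PIVOT_WORKBOOK_KB)

print(f"vs plain workbook: {vs_workbook:.1f}%")  # 38.4%
print(f"vs original csv:   {vs_csv:.1f}%")       # 31.1%
```

Swap in the sizes from your own files to see how your data compares.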
Data aficionado et SQL Raconteur