Returning The Contents Of Any Table In Any Excel Workbook In Sharepoint With Power Pivot And Power Query
Reposted from Chris Webb's blog with the author's permission.
Yet another Power Query post this week (with more to come) - today I'm going to show you how you can use Power Pivot and Power Query to build a dynamic solution that allows a user to browse through all the Excel workbooks stored in a Sharepoint document library and see the contents of any table in any workbook they choose. Why would you want to do this? Well, I'm not really sure, but I visited a customer a few months ago where an analyst (a data steward?!) had created a repository of Excel workbooks containing various datasets used for BI in Sharepoint. Maybe it could be useful to use Power Query to browse these datasets. and even if it isn't, it's still a learning experience for me
The first step is to build a Power Query query that returns a list of all of the Excel workbooks in a Sharepoint document library. There are a couple of ways of doing this; you could use the SharePoint.Contents() function, but I opted to use the OData REST api that Sharepoint exposes because it was faster and easier to work with. The following Power Query query is all that's needed to get a list of files in a library called MetadataTest:
let
Source = OData.Feed("https://zzzz.sharepoint.com/_vti_bin/listdata.svc/MetadataTest"),
RemovedOtherColumns = Table.SelectColumns(Source,{"Id", "ContentType", "Path", "Name"})
in
RemovedOtherColumns
These all happen to be Excel workbooks, which makes things easier for me.
The next thing to do is to get a list of all the tables in any of these Excel workbooks. Again, you can use OData for this. Here's an example query that requests the $metadata of the OData endpoint that an Excel workbook called FruitVegPeople.xlsx exposes, interprets the response as an XML document, and finds the list of tables in the workbook from that XML document (this last step accounts for 90% of the code below):
let
Source = Xml.Document(
OData.Feed(
"https://zzzz.sharepoint.com/
_vti_bin/ExcelRest.aspx/MetadataTest/FruitVegPeople.xlsx/OData/$metadata"
)),
Value = Source{0}[Value],
Value1 = Value{0}[Value],
Value2 = Value1{0}[Value],
Value3 = Value2{3}[Value],
#"Expand Value" = Table.ExpandTableColumn(Value3, "Value"
, {"Name", "Namespace", "Value", "Attributes"}
, {"Value.Name", "Value.Namespace", "Value.Value", "Value.Attributes"}),
#"Expand Attributes" = Table.ExpandTableColumn(#"Expand Value", "Attributes"
, {"Name", "Namespace", "Value"}
, {"Attributes.Name", "Attributes.Namespace", "Attributes.Value"}),
FilteredRows = Table.SelectRows(#"Expand Attributes", each ([Attributes.Name] = "Name")),
RemovedOtherColumns = Table.SelectColumns(FilteredRows,{"Attributes.Value"}),
RenamedColumns = Table.RenameColumns(RemovedOtherColumns,{{"Attributes.Value", "TableName"}})
in
RenamedColumns
In this case, there are three tables called Fruit, Vegetable and Person returned:
At this point you can find a list of files in a library and find a list of tables in an Excel workbook; what you really need is a single list of all the tables in all of the files. This can be accomplished by turning the second query above into a function, and calling that function for each file returned by the first query. Here's the function:
let
Source = (SharepointRootURL, ExcelSourceDoc) =>
let
Source =
Xml.Document(
OData.Feed(SharepointRootURL & "/_vti_bin/ExcelRest.aspx" & ExcelSourceDoc & "/OData/$metadata"
)
),
Value = Source{0}[Value],
Value1 = Value{0}[Value],
Value2 = Value1{0}[Value],
Value3 = Value2{3}[Value],
#"Expand Value" = Table.ExpandTableColumn(Value3, "Value", {"Name", "Namespace"
, "Value", "Attributes"}, {"Value.Name", "Value.Namespace", "Value.Value", "Value.Attributes"}),
#"Expand Attributes" = Table.ExpandTableColumn(#"Expand Value", "Attributes"
, {"Name", "Namespace", "Value"}, {"Attributes.Name", "Attributes.Namespace", "Attributes.Value"}),
FilteredRows = Table.SelectRows(#"Expand Attributes", each ([Attributes.Name] = "Name")),
RemovedOtherColumns = Table.SelectColumns(FilteredRows,{"Attributes.Value"}),
RenamedColumns = Table.RenameColumns(RemovedOtherColumns,{{"Attributes.Value", "TableName"}})
in
RenamedColumns
in
Source
And here it is used:
let
SharepointRootURL = "https://zzzz.sharepoint.com",
Source = OData.Feed(SharepointRootURL & "/_vti_bin/listdata.svc/MetadataTest"),
RemovedOtherColumns = Table.SelectColumns(Source,{"Id", "ContentType", "Path", "Name"}),
InsertedCustom = Table.AddColumn(RemovedOtherColumns, "Custom"
, each GetTablesFunction(SharepointRootURL, [Path] & "/" & [Name] )),
#"Expand Custom" = Table.ExpandTableColumn(InsertedCustom, "Custom", {"TableName"}, {"TableName"}),
InsertedCustom1 = Table.AddColumn(#"Expand Custom", "TableODataURL", each SharepointRootURL
& "/_vti_bin/ExcelRest.aspx" & [Path] & "/" & [Name] & "/OData/" & [TableName]),
RemovedColumns = Table.RemoveColumns(InsertedCustom1,{"ContentType", "Id"})
in
RemovedColumns
This gives you the list of workbooks and tables you need, and this table itself can now be loaded into the Excel Data Model. You can now build a report using this data something like this:
The slicers allow the user to select a folder, a workbook and a table in a workbook. In the top right hand corner is a PivotTable displaying a measure with the following definition:
TableODataURLMeasure:=
IF(HASONEVALUE(GetDocumentsTables[TableODataURL]), VALUES(GetDocumentsTables[TableODataURL]))
This returns the URL containing the OData feed for the contents of the selected table. The PivotTable is a bit of a red herring, in fact - I only created it so the slicer highlighting would work properly, a trick I use quite frequently when I'm working with Excel cube functions (the PivotTable itself can be hidden from the end user). In the bottom right hand corner, in the cell selected in the screenshot above, is a CubeValue() function that returns the same measure value and I've put that cell into an Excel table - which means that we can now use this measure value as an input to yet another Power Query query.
Two last bits of code: first, a Power Query function that will return the contents of any OData feed passed to it.
let
Source = (ODataTableURL) =>
let
Source = OData.Feed(ODataTableURL),
RemovedColumns = Table.RemoveColumns(Source,{"excelUpdated", "excelRowID"})
in
RemovedColumns
in
Source
Finally, a query that takes this function and calls it for the URL selected by the user above:
let
Source = GetTableContentsFunction(Excel.CurrentWorkbook(){[Name="TableODataURL"]}[Content]{0}[URL])
in
Source
And there you have it - a way of selecting any table in any Excel workbook in a Sharepoint library and seeing its contents. You can download the sample workbook here, although of course you'll have to modify all the URLs to make it work on.
PS Is it just me, or does "Power Query query" seem like a bit of a mouthful? Would it be ok just to talk about queries in future? That seems a bit misleading. maybe I should talk about "a Power Query"? That doesn't sound right either. Hmmm.
Chris has been working with Microsoft BI tools since he started using beta 3 of OLAP Services back in the late 90s. Since then he has worked with Analysis Services in a number of roles (including three years spent with Microsoft Consulting Services) and he is now an independent consultant specialising in complex MDX, Analysis Services cube design and Analysis Services query performance problems. His company website can be found at http://www.crossjoin.co.uk and his blog can be found at http://cwebbbi.wordpress.com/ . |
Tags: excel, power query