My tweets from SQL PASS Summit 2009 - SSAS, PowerPivot, DAX and more
I just came back from SQL PASS Summit 2009. About a week before that I started to use Twitter and I tried to “tweet” during the conference as much as I can. My Twitter ID is: http://twitter.com/VidasM. I decided that some of my posts could be interesting to people who do not use Twitter, so I cleaned them up a bit and posted them here.
- Avoid “All in one” cubes
- Keep cube data size small – 300-400GB cube, that is 5-6TB source relational data.
- SQL CAT call cube large when its size is 2-3TB!
- After bunch of process updates, if you can do sometimes process full. This might reduce “fragmentation” of data
- Big dimension-anything over 10 – 20mln members, but depends on number of attributes, relationships
- On Windows 2003 consider <PreaAllocate> setting in ini file, to reduce memory fragmentation. Could be big performance increase.Win2008-no need to use it.
- During SSAS processing there are a lots of small random I/Os , RAID-5 is OK.
- on source DB for SSAS use compression, eliminate locking.
- money type is about 10-20% faster than decimal.
- if optimize SSAS, keep in mind SQL will optimize too and create many parallel queries. Maybe need MAXDOP=1 for processing, if all optimization is done for SSAS (many partitions, data source connections, etc)
- consider 20% aggregations, should be up to 100 ideally
Here are my tweets from the session ”Top customer support issues in Analysis Services 2005/2008 and how to resolve them”, by John Sirmon (Microsoft CSS)
- use native provided for data source, don’t use managed provider
- when SQL&SSAS same box, use shared memory data source: lpc:MachineName.!!!!!! Also change packet size to 32757.
- #SSAS or instead of lpc: could also use tcp:<machine name>
- #SSAS test packet size, depends on hardware. Sometimes smaller value is better.
- there is Performance Analyzer Logs, not available for SSAS yet, but good.
- when SSAS uses multiple data sources, then other data is accessed through OpenRowset – that is not efficient.
- John Sirmon said: best book every done on SSAS – Expert cube development with SSAS
Here are my tweets from the session ”SQLCAT: A Preview of PowerPivot Best Practices”. Presenters are Denny Lee and Dave Wickert:
- #PowerPivot spreadsheet is stored in the SQL database when deployed to SharePoint
- #powerPivot uses AS engine – in memory column based store w/VertiPaq. All calculations are local after import.
- Excel has it’s own local in process SSAS engine. New#powerPivot functions – Data Access eXpressions (DAX)
- #powerPivot compression 10:1 (approx) and depends…
- More info about #PowerPivot:VertiPaq does compression about 10:1, then SSAS does on top compression 1.5:1 or 2:1.
- MOLAP stores multiple records in the grain of fact table, so some records could be rolled up
- Vertipaq has a row for each source data, so different from MOLAP.
- There is on disk structure for #PowerPivot that looks similar to SSAS. c:\local user\… Additional info: Denny Lee just blogged about this with more details.
- Sub-folder customdata has file with backup of SSAS database for #PowerPivot.
- Showing demo of silverlight gallery of #PowerPivot reports in SharePoint
- #PowerPivot - when open workbook in Excel services, no interaction with SSAS first
- when click on slicer, excel services /#PowerPivot service talks to AS service engine. So on demand loading of SSAS DB.
- first initialization could take time – depends on traffic, etc. Same published #PowerPivot can serve many people.
- Need to optimize SharePoint for #PowerPivot setup.
- Capacity planning-need memory for any in use database and buffer 10-20% for auto detection(?).
- #PowerPivot keeps cache of detached databases. in the SSAS backup folder
- SSAS servers will be 64-128GB and more for #PowerPivot. Will work with 8GB, but most likely you will want more.
- #PowerPivot capacity planning depends - #users total/concurrent, # of files, max file size, data refresh options
- in SharePoint content db workbooks are stored as blobs . Could use Remote Blob store in FileStream. Overhead 2-3%
- but better handle higher concurrency scenarios (cont..)
- different topology options – one sharepoint server, multiple servers with different services on them, etc…
- Currently SharePoint requires Kerberos security all the way
- in SharePoint 2010 – new “claims tokens”, so no need for Kerberos for #PowerPivot
- Kerberos still might be required, but there is reduction in cases when you need it for #PowerPivot and SharePoint 2010.
- #PowerPivot - before installing YOU MUST READ MANUAL! That is important.
- upgrade SharePOInt 2007 to 2010 is very complex for#PowerPivot. Cannot upgrade CTP2->CTP3. Not clear CTP3->RTM.
- SharePoint is optimized for download, so there are concerns for upload #PowerPivot (large files)
- power pivot max size 2GB (SharePoint limit). But you can create >2GB #PowerPivot files, just not upload to Sharepoint. Additional info: Actual max publish size could be 1.8GB. Although it is limitation, it will affect very few users, as majority of them will not have such large files, so there is no need to worry about this
- in sharepoint LargeChunkFileSize parameter is useless – don’t touch it for #PowerPivot.
- to upload #PowerPivot to SharePoint you need very good network connection! Maybe first copy to SharePoint server – faster.
- troubleshooting – error messages very en-cryptic for#PowerPivot.
- use ULS logs (Bing or google for more info) from SharePoint to troubleshoot#PowerPivot.
- from ULS logs use correlationID to track down log data for one event. Log files very large, filter by time too. #powerpivot
there is no separate #PowerPivot log. Error could be in#PowerPivot, Excel services, etc., so ULS one option. - You can use SQL Profiler to troubleshoot#PowerPivot SSAS instance! Good news!
- so to troubleshoot #PowerPivot you need to know SharePoint and SSAS and use tools! Consultant will be still busy…
- session almost done, time for lunch. See you latter tonight for #PowerPivot DAX session!
- create a view with fewer records and create#PowerPivot, deploy to Sharepoint, then update view to include all records. Faster.
- #PowerPivot tip: Rename *.xlsx file to *.zip and inside you will see data file that you can rename to *.abf and restore to SSAS server that runs in PowerPivot integrated mode
- this was the best session so far at - Best Practice on#PowerPivot by @dennylee and Dave Wickert
My tweets from the session “DAX in #PowerPivot for Excel 2010? by Howie Dickerman.
- Data Analysis Expressions = DAX.
- DAX lets user do multidimensional analysis without user knowing that this is multidimensional analysis.
- sample DAX: =[Qty]*[Price] – syntax just like Excel.
- DAX uses Excel functions, but no notion of addressing indv cells or ranges – instead columns in the data
- DAX is not replacement for MDX
- DAX provides functions that implement relations database concepts: Filter tables, aggregates, follow relationships
- DAX one to many function sample: =SUMX(RELATEDTABLE([Sales], Sales[Amount])
- DAX has functions to assist with dynamic aggregations of measures:
- DAX dynamic sample: =VALUES(Time[year]) & “.” & VALUES(Product[ProductID])
- DAX sample: =IF(VALUES(Time[Year])=2008,”Baseline”,”normal”)
- DAX calc expression in 2 places – calc columns (full materialize) and measures (eval for each cell dynamically)
- More than 80 Excel functions in DAX
- CTP3 has new FORMAT function to allow to convert any number to string #DAX
- DAX sample: [salesAmt]/[SalesAmt](All(Product)) – use measure as function.
- DAX = Sales[SalesAmt]/CALCULATE(Sales[SalesAmt], ALL(Product)) more complex syntax
- CTP3 DAX has 35 Time Intelligence Functions – require date column in the data. Column type Date
- DAX this version does not have custom time periods and weeks. Works just with Yr, Qtr,Mth,Day
- DAX function samples: FirstDate, LastDate, FirstNonBlank, StartOfMonth, StartOfQtr, EndOfYear
- DAX: DateAdd, DatesBetween, DatesInperiod – 26 function that return a table of dates
- DAX: PreviousDay, Nextmonth, DatesMTD, DatesYTD, TotalMTD, OpeningBalanceQuarter, ClosingBalanceYear, etc <-more functions
- Year over year growth: =Sales[SalesAmtt]-Sales[SalesAmth)(DateAdd(Time[Date],-1,Year].All(Time))
- AllTime – need to add now in CTP3, not in RTM(will be under cover). So that first year has data.
- DAX sample: QTD Sales =TotalQTD(Sales[SalesAmt],Time[Date], All(Time))
- DAX yearAgo:=Sales[SalesAmt)(ParallelPeriod(Time[Date],-12, Month).All(Time))
- I missed single quotes in same functions prev. If table name contains spaces, need to put single quote around names.
- DAX autocomplete in CTP3 add single quotes, but they are optional.
Additional tweets from the conference:
- Microsoft just announced BI resource center on Technet: http://technet.microsoft.com/en-ca/bi/default.aspx
- Microsoft SQL Server 2008R2 & PowerPivot CTP3 will be released in November
- Microsoft will release SQL Server 2008R2 in the first half of 2010
- Just noticed-Mosha released new version of MDX Studio few weeks ago: http://www.ssas-info.com/forum/viewtopic.php?f=3&t=624
Tags: performance, dax, sharepoint, excel