The End Game is Not Just Canned Reports

Canned reports/dashboards aka custom reports or out of box reports are inevitable part of any BI ecosystem. It’s a good point of reference for conversations with business users especially at the start of a project when requirements are often a bit hazy. In many instances, business users see report layouts as an effective and convenient medium of communication with BI project teams. This however should not be considered as THE business requirement. Instead of dwelling on canned reports, the focus should be on building a well-designed DW/BI system that empowers users to perform analysis on a self-service basis. The quality and the range of analysis that a self-service BI platform provides far outweighs the benefits of just rolling out a set of canned reports. It empowers business users to do their own analysis while the BI team can work towards expanding and refining the analytical platform instead of operating as a “reporting team”. Continue Reading

SSIS Package to Process SSAS Cube

The execution plan of SSAS processing engine triggers a sequence of processing tasks based on the object’s current processed state and its dependencies with other objects. This is what happens when the SSAS cube is processed from SQL Server Management Studio (SSMS). It is possible to replicate this execution plan close enough in an SSIS package so that the cube is always processed and refreshed with latest data.

Continue Reading

SSRS: Why does SSAS Member UniqueName Return Nothing?

If your SSRS report is using SSAS MDX datasets, the member uniquename is particularly useful to pass parameter values to drill-through MDX reports or for sorting based on key values. If Fields!Attribute.UniqueName is returning nothing or blank or null value in your report, then most likely you are missing the DIMENSION PROPERTIES clause in your MDX dataset query
Continue Reading

Benefits of Creating SSAS Cube from Database Views

While SSAS allows you to build a cube straight of the fact and dimension tables, however there are benefits in creating the cube from database views that sits on top of the underlying fact and dimension tables. Here’s why. Continue reading

Dimensional Modeling: Banding

It’s quite common to analyse a measure by categorizing it into different buckets or bands. For example to list the customers who are due for payment in the next week or fortnight, to decide which income group to target for a particular product campaign, which age group of students are most likely to buy a smartphone and so on. These bands are usually not supplied by source systems but derived from the underlying measure and they exist only for analytic purposes. While this banding could be achieved in reports or dashboards, there is significant value to model them as dimensions especially to promote self-service BI.

In this example, fact_admission represents the fact data collected from hospital admissions. AGE_YRS is the client’s age at the time of admission. DIM_AGE is banding dimension for age group with the granularity being AGE in yrs.

Screen Capture 1- Age Group Bands

  • DIM_AGE is type 1 SCD that has all ages say from 0 to 130 yrs.
  • AGE_SK is the surrogate key. It’s not an identity rather it takes same value of AGE except for Unknown SKs, in which case AGE is NULL. This means the ETL does not have to look-up this dimension table.
  • DIM_AGE has an out of bound surrogate key with value 999 to cater for age greater than 130 yrs.

The three age group bands are defined as follows

AGE_GROUP_1

Infant AGE Below 1 yr
Toddler AGE between 1-3 yrs
Pre-schooler AGE between 3-5 yrs
Young Kids AGE between 5-7 yrs
Big Kids AGE between 7-12 yrs
Teens AGE between 12-18 yrs
Young Adult AGE between 18-39 yrs
Middle Age AGE between 40-59 yrs
Seniors AGE 60 and over

AGE_GROUP_2

Unlicensed Drivers AGE Below 17 yrs
Young Drivers AGE between 17-25 yrs
Silver License AGE between 25-30 yrs
Gold License AGE between 30-75 yrs
Older Drivers AGE over 75 yrs

AGE_GROUP_3

Pre-schooler AGE below 5 yrs
School AGE between 5-18 yrs
Uni AGE between 18-25 yrs
Salaried AGE between 25-60 yrs
Pensioner AGE over 60yrs
  • When building the dimension, it is recommended to build dimension using a view on DIM_AGE. This allows to include columns that could act as key columns to the age bands and more importantly to sort the different captions in the required order.
  • The materialized AGE_SK column on fact table is optional. The fact table can be linked to DIM_AGE using a view on fact table because of the fact the surrogate key is directly derived from age measure.
  • A new age group can be easily created just by adding a new column to the dimension table without the need to re-key fact tables.

Incorporating banding as dimension simplifies reporting, self-service BI platform and also provides a design that is highly scalable. Banding can be extended to any measure like income, duration, count , miles and percentile and so on.

Benny Austin

Dimensional Modelling: Excluding Unused Dimension Attribute Members

In a self-service BI platform, every member of a dimension attribute is available for browsing irrespective of whether the member has an associated fact or not. Though the unused members disappear when browsed alongside facts, they are still undesirable when used as filters. This is especially true with role playing dimensions which cover a broad range of data to cover every scenario. No one wants a date filter starting at 1900-01-01 and going all the way to 9999-12-31.

There are different options to exclude the unused dimension attribute members

Option 1: Use a View on Dimension Table

It’s always a good practise to build dimension using view on dimension table because of the agility it offers. A simple view like this would filter out any member that’s never been used in any fact table.

CREATE VIEW vdimDate
AS
SELECT dateAttributes
FROM dimDate
WHERE EXISTS (SELECT 1 FROM factA WHERE dateSK=dimDate.dateSK)
   OR EXISTS (SELECT 1 FROM factB WHERE dateSK=dimDate.dateSK)
   OR EXISTS (SELECT 1 FROM factC WHERE dateSK=dimDate.dateSK)
GO

Option 2: Use a flag on Dimension Table

Option 1 works like a charm on small dimension tables. With large data volumes, the SSAS processing time on these dimensions tend to increase because the query gets executed for each attribute. This can be overcome by modelling the dimension to include a materialized flag to indicate whether the dimension member is ever referenced by any fact. This flag is set to true by the ETL before the SSAS cube is processed using a simple update statement

UPDATE dimDate
SET DATE_IN_USE_FLAG=1
WHERE EXISTS (SELECT 1 FROM factA WHERE factA.dateSK = dateSK)
   OR EXISTS (SELECT 1 FROM factB WHERE factB.dateSK = dateSK)
   OR EXISTS (SELECT 1 FROM factC WHERE factC.dateSK = dateSK)

The update statement would execute much faster when compared to SSAS processing time especially if a view is created on the dimension table to include only the dimension members in use.

CREATE VIEW vdimDate
AS
SELECT dateAttributes
FROM dimDate
WHERE DATE_IN_USE_FLAG=1
GO

Option 3: Customize frequently used Role Playing Dimensions

Role playing dimensions need special attention. Sometimes in addition to option 1 or 2 you will also have to convert a frequently used role playing dimension to non-role playing dimension. This will exclude any unused member from a specific fact table (instead of any fact table). The key point to remember while taking this hybrid approach is regardless of whether the dimension is role playing or not, as long as it is derived from same source it must be a Conformed Dimension. This means the name and number of the attributes, key column, name column and value column, attribute relationships, sort order and hierarchies must be the same between the role playing and non-playing dimensions derived from same data source. The only difference is the number of members in a non-role playing dimension is fewer than the role playing dimension

Option 4: Work some MDX Magic in Presentation Layer

Self-service BI platforms are inevitably augmented by few out-of-box SSRS reports and dashboards. By using MDX expressions and data sets you could customizes the filter to display only the data relevant in that context. Here are some MDX examples for SSRS and PerformancePoint

By combining the above options you could ensure that your self-service BI platform is agile, practical and user friendly.

Benny Austin

SSRS: Parameter Expression for MDX Datasets Having Role Playing Dimension

In SSRS report, when a report parameter is role playing dimension then the parameter value passed to each dataset has to be customized depending on the context in which the dimension is being used. In an earlier post, I explained how this is done in PerformancePoint dashboard. In this post an equivalent solution for reporting services is provided.

Continue Reading

Big Data: Create and Load Hive Tables on Windows Azure HDInsight

Update 18-June-2014

Please refer to Working with Hive in HDInsight which the updated version of this post. The orginal post was written when HDInsight was a separate entity and in preview mode. Since then HDInsight has been completely integrated into Microsoft Azure cloud services. The concepts explained in this post still holds, however some of the instructions and screen captures have changed significantly. So I would recommend you to refer to the updated version of this post
Continue Reading

PerformancePoint: Using Connection Formula to Pass Filter Values to Report Having Role Playing Dimension

Date dimension is inevitably used in most dashboards. Due to its role playing nature, a specific date dimension used as slicer in one report is different from the date dimension used for another report on the same dashboard page. For example sales reports are based on order date whereas delivery reports are based on delivery dates. On a dashboard that displays both sales and delivery reports, it would not be good for navigation purposes to have both order date and deliver date filters on the same page. The choice of which date dimension to use as filter among the available dates is a tactical decision. Having chosen a particular date dimension, the challenge is to provide filter value to those reports that use a different date dimension. This is where PerformancePoint filter Connection Formula comes into play as explained in the example below.

Continue Reading

SharePoint: MDX Expressions for PerformancePoint Filters

Dashboard filters is one of the key components of BI Platform that drive reports and KPI’s in various dashboards. MDX expressions to a great extent enables to customize PerformancePoint filters and enhance the end user experience.

First – a quick look at member selection option. Member selection is the easiest way to build dashboard filters which does not require MDX expressions (Screen capture 1). Member selection option allows to select specific members, although this is not usually recommended as the filter can become quite static. A better method is to select Children, Leaves or Descendants with additional option to include or exclude default member. For hierarchies, member selection option also allows selection members from a particular level of hierarchy.

Screen Capture 1 – Member Selection

While member selection option may be sufficient for some business scenarios, using MDX Expressions (Screen Capture 2) enables to customize dashboards without hardcoding the member selection.

Screen Capture 2 – MDX Expression Option for PerformancePoint Filters

Some common and often repeated dashboarding scenarios where MDX expressions for PerformancePoint filters are useful is discussed below

Display only members that have data

It’s not particularly intuitive for business users to select an item from drop down only to find that there is no data for the dashboard to display.

The following example uses FILTER function to select only Product Categories that have a sale.

FILTER([Product].[Category].Children,[Measures].[Internet Order Count] > 0)

Screen Capture 3 – MDX Expression for PerformancePoint Filter

Display only last N time periods

Often business is only keen on trend analysis for the last few years. All time periods from beginning of time need not be displayed in dashboard.

The following MDX expression displays the last 4 fiscal years from current fiscal year. When the fiscal year changes so does the filter data. Note that for this MDX expression to work, the date dimension must have current attributes defined.

(EXISTS([Date].[Fiscal Year].Children,[Date].[Current Fiscal Year].&[Y]).Item(0).LAG(-4)
: EXISTS([Date].[Fiscal Year].Children,[Date].[Current Fiscal Year].&[Y]).Item(0))

Display time periods in reverse chronology – latest dates first

Another common requirement for business users is to see the latest dates first.

Using the ORDER function in the previous MDX expression, the latest fiscal years can be displayed as the first item in the drop down.

ORDER((EXISTS([Date].[Fiscal Year].Children,[Date].[Current Fiscal Year].&[Y]).Item(0).LAG(-4)
: EXISTS([Date].[Fiscal Year].Children,[Date].[Current Fiscal Year].&[Y]).Item(0))
, [Date].[Fiscal Year].CurrentMember.Properties('key'),DESC)

Hide unknown members

It a good to handle unknown dimension members in datamarts which in turn helps to address data quality issues. However bad data need not be displayed in dashboard filters.

The following MDX expression gets all the Sales Reason members except the unknown members.

([Sales Reason].[Sales Reason].Children
-[Sales Reason].[Sales Reason].&[UNAVAILABLE]
-[Sales Reason].[Sales Reason].&[NOT APPLICABLE]
-[Sales Reason].[Sales Reason].&[UNKNOWN])

The above examples demonstrate the ease with which seriously rocking PerformancePoint filters can be created using simple MDX expressions that create business friendly dashboards.

Benny Austin