SSIS Package to Process SSAS Cube

When an SSAS cube is processed from SQL Server Management Studio (SSMS), the processing engine builds an execution plan that triggers a sequence of processing tasks based on each object’s current processed state and its dependencies on other objects. This execution plan can be replicated closely enough in an SSIS package so that the cube is always processed and refreshed with the latest data.

The package shown in Screen Capture 1 does just that. Using the SSIS Analysis Services Processing Task, the dimensions are processed first, followed by the partitions, measure groups and cube. Finally, the indexes of all objects are rebuilt.

Screen Capture 1 – SSIS Package to Process SSAS Cube

Process Dimensions

All dimensions related to the cube are first processed using the Process Default option. If the cube has been deployed for the first time or re-deployed, it might be in an unprocessed state depending on the structural changes. Process Default brings the dimensions to a bare minimum processed state. On the other hand, if a dimension is already in a processed state and only requires a data refresh, this task has no impact on the processing of the cube.
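For illustration, here is a minimal AMO sketch of what this step amounts to, along the lines of the Script Task approach mentioned later in this post. The server name "localhost" and database name "MyOlapDb" are placeholders, not values from the package.

```csharp
using Microsoft.AnalysisServices;

// Minimal sketch: bring every dimension in the database to a
// processed state. "localhost" and "MyOlapDb" are placeholder names.
Server server = new Server();
server.Connect("Data Source=localhost");
Database db = server.Databases.FindByName("MyOlapDb");

foreach (Dimension dim in db.Dimensions)
{
    // Process Default only does work if the dimension is unprocessed;
    // an already-processed dimension is left as-is.
    dim.Process(ProcessType.ProcessDefault);
}
```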

The next step is to refresh the dimension data using the Process Update option. It is important to choose Process Update instead of Process Full for dimensions: Process Update refreshes the dimension data without taking other cubes that depend on the shared dimensions to an unprocessed state. Process Update works only if the dimension is already in a processed state, which is why Process Default should precede Process Update. Dimensions that have rigid attribute relationships must use the Process Full option instead, and that is a design consideration.
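Continuing the sketch above (reusing the same `server` and `db` objects), the data-refresh step would look roughly like this:

```csharp
// Process Update picks up inserts, updates and deletes in the dimension
// tables without invalidating other cubes that share these dimensions.
foreach (Dimension dim in db.Dimensions)
{
    dim.Process(ProcessType.ProcessUpdate);
}
```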

Process Facts

Similar to the dimensions, the partitions, measure groups and cube are first processed using the Process Default option to bring the facts to a processed state. As with dimensions, if the cube is already in a processed state and only requires a data refresh, this task has no impact on the processing of the cube.
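In AMO terms, again reusing `db` from the earlier sketch, this step is roughly the following; the cube name "MyCube" is a placeholder:

```csharp
// Process Default on the cube cascades to its measure groups and
// partitions, doing work only for objects that are unprocessed.
Cube cube = db.Cubes.FindByName("MyCube"); // placeholder name
cube.Process(ProcessType.ProcessDefault);
```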

The fact data is refreshed in the cube using the Process Data option. Process Data works only if the object is in a processed state, which is why Process Default should precede Process Data. Unlike Process Update for dimensions, Process Data drops and rebuilds entire partitions from scratch. If the partitions are date-based and grow over time, only the affected partitions need to be processed. In that case, replace the Analysis Services Processing Task with AMO code in a Script Task to process the partitions, as sketched below. You will also need some logic to determine which partitions to process, as described in the last section of this post from Vidas Matelis. For cubes with static partitions, like those based on geography or lines of business, this method works fine.
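Here is a hedged sketch of that Script Task alternative, reusing `db` from the earlier connection sketch. The partition-selection rule is purely illustrative; in practice it would come from your ETL metadata:

```csharp
// Refresh only the partitions affected by the current ETL run.
Cube cube = db.Cubes.FindByName("MyCube"); // placeholder name
foreach (MeasureGroup mg in cube.MeasureGroups)
{
    foreach (Partition p in mg.Partitions)
    {
        // Hypothetical rule: reload only the current year's partition.
        if (p.Name.EndsWith(System.DateTime.Now.Year.ToString()))
        {
            p.Process(ProcessType.ProcessData);
        }
    }
}
```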

Rebuild Indexes

The last step is to rebuild the indexes of all dimensions, partitions, measure groups and the cube using the Process Index option. This improves query performance and, as a prerequisite, requires the facts and dimensions to be in a processed state.
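A minimal AMO equivalent of this final step, reusing `db` and `cube` from the sketches above:

```csharp
// Rebuild indexes and aggregations once facts and dimensions are
// processed. Processing the cube cascades to measure groups and partitions.
foreach (Dimension dim in db.Dimensions)
{
    dim.Process(ProcessType.ProcessIndexes);
}
cube.Process(ProcessType.ProcessIndexes);
```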

Set the Processing Order to Parallel in the Analysis Services Processing Tasks to reduce processing time (Screen Capture 2).

Screen Capture 2 – Processing Order – Parallel
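If you go down the AMO route instead, the equivalent of the Parallel setting is to capture the individual process commands and send them to the server as one parallel batch, for example:

```csharp
// Queue the Process commands instead of executing them one by one.
server.CaptureXml = true;
foreach (Dimension dim in db.Dimensions)
{
    dim.Process(ProcessType.ProcessUpdate);
}
server.CaptureXml = false;

// Run the captured batch in a single transaction, in parallel.
server.ExecuteCaptureLog(true, true);
```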

The above processing sequence is a robust technique to bring any cube from any state to a processed state, refreshed with the latest data, without bringing down any other cubes that depend on the shared dimensions. It is important to note that Process Update causes the physical files of the dimension attributes to grow. Older versions of SQL Server have a 4 GB limit on the file size, beyond which processing throws exceptions. It is good practice to occasionally process the SSAS database using the Process Full option to shrink the physical file size.
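That occasional maintenance step is a one-liner in AMO, reusing `db` from the earlier sketches:

```csharp
// A full process rebuilds every object in the SSAS database from scratch
// and shrinks the grown attribute store files as a side effect.
db.Process(ProcessType.ProcessFull);
```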

It’s extremely useful to handle the OnWarning event in the package’s event handler. SSAS logs useful information about processing problems as warnings rather than exceptions. For example, processing errors due to missing attribute key records (“The attribute key cannot be found”) are logged as warnings.
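As a rough sketch, a Script Task placed inside the package’s OnWarning event handler could capture that text. The System::SourceName and System::WarningDescription system variables are my assumption of the relevant event-handler variables; verify them against your SSIS version and add them to the task’s ReadOnlyVariables:

```csharp
public void Main()
{
    // Read the warning details raised by the Analysis Services
    // Processing Task (variable names assumed, see note above).
    string source = Dts.Variables["System::SourceName"].Value.ToString();
    string warning = Dts.Variables["System::WarningDescription"].Value.ToString();

    // Surface the SSAS processing warning, e.g. missing attribute keys.
    Dts.Log("SSAS warning from " + source + ": " + warning, 0, new byte[0]);

    Dts.TaskResult = (int)ScriptResults.Success;
}
```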

Benny Austin

Related Posts

SSIS: Dynamic Processing of SSAS Partitions



23 thoughts on “SSIS Package to Process SSAS Cube”

  1. Thank you for this post! It helped me understand the SSAS cube processing sequence better, and it looks to be a good option for an issue I’ve run into on my project. However, it’s now 2020, and in Visual Studio 2019’s SSIS/SSAS components I don’t see a Process Index option available for any object in the cube. Does this mean indexes are automatically being processed now?

  2. You mention occasionally doing a Process Full to shrink the physical file size; how often do you suggest this? Weekly, monthly?

  3. Hi Benny,

    thanks so much for this useful post.
    I am trying to understand what really happens, and I also read the book “Expert Cube Development with SSAS…..” by Chris Webb, Alberto Ferrari and Marco Russo, where they propose this strategy for dimension processing:
    “We can run a Process Update on any dimensions that have changed
    their contents, and then run a Process Default to rebuild any invalidated
    aggregations.” But you said that “Process Default should precede Process Update”, which is actually the opposite order, and that is my first question:
    1.- Are both options correct?
    The second question is regarding the facts:
    2.- If I process all partitions, why should I then process all measure groups and, at the end, the cube? It seems to me like processing the same thing 3 times.

    As a final comment, I tested the package in my environment and it took about an hour to complete, whereas a full database process usually takes only 30 minutes.

    I think one of the reasons could be that I only have one cube and I have to adjust your setup to meet my current database structure.

    Sorry if my reply is too long.

    Thanks so much again for sharing your knowledge with us and for your priceless time.

    Paul

    1. Hi Paul, thanks for your comments. This is my rationale:
      1) Process Update will work by itself only if the dimension is already in a processed state. If, for example, you introduce a new attribute relationship to an existing dim, this will take the dim to an unprocessed state. In that instance, if you just do a Process Update, you will get an exception. That’s where Process Default comes in. The first time after the dim is deployed/re-deployed, the dim may be processed twice in this approach. However, once the dim is processed, Process Default will not add an overhead. It just ensures any subsequent cube deployment does not fail the package.
      2) Processing the measure groups and the cube is required if your cube has distinct count measure groups.

      Most of the processing time is spent on the Process Update of dims and the Process Data of measures. If you have only one cube, or a cube that shares no dimensions with other cubes, Process Full is a better option. If you have many partitions and only a few of them get updated in an ETL run, you could selectively process only those partitions: https://bennyaustin.wordpress.com/2014/02/20/dynamic-processing/

      Hope that helps

  4. Great post, but I’m interested in bypassing SSAS processing completely to process my cube. Instead I would like to use the SSIS Dim and Partition tasks. Curious if your approach above would apply. I’m guessing Process Index would, but could I replace the SSAS dim and fact processing tasks w/ the SSIS ones? Any thoughts? Do you know if I would need to process some part of the cube in SSAS for my approach to work?

    thanks

    1. Rupal, the technique presented in this post uses a pull method to process the cube. The Dimension Processing Destination and Partition Processing Destination use a push method to process SSAS objects and have to be handled differently.

    1. SSIS logging, and the OnProgress event in particular, is probably the closest way to track processing progress similar to SSDT.

    1. Matthias, if the processing mode is set to single transaction (the default), the old data in the cube is available for browsing while the cube is still processing. When the cube processing transaction completes, the commit requires an exclusive lock on the cube files. During this window, MDX queries will wait till the commit is complete and may appear to hang, but will eventually return data.
