Rebuild Sitecore Analytics Index without re-building reporting database

Since Sitecore 8 came out, the Indexing Manager no longer has an option to rebuild the analytics index, and that was done for a reason. At the moment, Sitecore says that you can rebuild the analytics index by rebuilding the reporting database, as described here. There are also discussions about it in the community, and a blog post by Adam Conn suggests rebuilding the analytics index by putting all existing visits from the collection database into the aggregation queue.
However, this is still not something you want to do. As described here, this code will result in an attempt to re-aggregate all interaction data to the Reporting database. That will lead to incorrect reporting numbers and aggregation exceptions, and it will still put approximately the same load on your Sitecore installation as a full Reporting DB rebuild.

The problem

On one of my projects, I have a very large number of contacts and interactions. An integration with a third-party CRM system syncs customer data into xDB as contacts. The Sitecore analytics index is updated when a contact's visit ends, so contacts that are synced but have never visited the site have no data in the index. In my case, this means that List Manager does not return all contacts in segmented lists, which our customer uses heavily for email marketing. I tried rebuilding the reporting database and left it to run overnight. It ran for around 9 hours and did not process even half of the data, so this is not an option for us. I really needed a tool that can re-index contacts after each data sync from the third-party CRM.
And here is what I implemented: Helpfulcore.AnalyticsIndexBuilder, a module that installs a new Sitecore admin page, which can be found at /sitecore/admin/analyticsindexbuilder.aspx.

The module rebuilds sitecore_analytics_index using data from the collection database, without rebuilding the reporting database. It also includes methods to clean the index if necessary.

Consider using this module if:

  • You want to rebuild the Sitecore analytics index without rebuilding the reporting database;
  • You have extended or changed any of the [indexable_type].loadFields pipelines (see the list below) and want the index updated with those changes for all existing contacts or interactions;
  • You import contacts from a third-party source and need the index rebuilt so that List Manager segmentation works. The module's API lets you do this programmatically (described below);
  • Your analytics index is corrupted, or your indexed data has been lost for some reason.

Installation

To install Helpfulcore.AnalyticsIndexBuilder, run the following command in the NuGet Package Manager Console on your Sitecore website project:

Install-Package Helpfulcore.AnalyticsIndexBuilder

Alternatively, you can find the module on the Sitecore Marketplace under the name Helpfulcore.AnalyticsIndexBuilder (coming soon).

Compatibility

  • Built and tested on Sitecore CMS 8.2 rev 160729 (initial release) with the SOLR content search provider.
  • Expected to work on any Sitecore CMS 8.x and later, with both the SOLR and Lucene content search providers.

Functionality

The module uses the native Sitecore content search API to update the index, so both the SOLR and Lucene content search providers should be supported. It also uses the native pipelines for building indexed records:

  • <contacttagindexable.loadfields>
  • <contactaddressindexable.loadfields>
  • <contactindexable.loadfields>
  • <visitindexable.loadfields>
  • <visitpageindexable.loadfields>
  • <visitpageeventindexable.loadfields>

The new admin page /sitecore/admin/analyticsindexbuilder.aspx shows an overview of the analytics index content, with a count of each indexable type currently present in the index.

It also provides actions to:

  • Delete indexables of specific indexable type
  • Rebuild index for specific indexable type
  • Reset whole analytics index
  • Rebuild whole analytics index

There are also options for rebuilding indexables only for selected contacts. This is called the “filterable” function, and it is available on the page as green buttons:

  • Rebuild all filtered indexables
  • Rebuild filtered indexables of this type

By default, the filter for this action selects only known contacts (contacts with a non-empty identifier). Filtering can be extended or replaced using an include configuration file (see the config file below).

The new admin page /sitecore/admin/analyticsindexbuilder.aspx displays a real-time log in the Execution log field. It also logs all actions to a separate log file:

$(dataFolder)/logs/Helpfulcore.AnalyticsIndexBuilder.log.${date:format=yyyyMMdd}.txt

There is a brief functionality legend at the bottom of the admin page /sitecore/admin/analyticsindexbuilder.aspx.

The module is fully configuration-driven. It installs a config file at /App_Config/Include/Helpfulcore/Helpfulcore.AnalyticsIndexBuilder.config with the following content:

<?xml version="1.0" encoding="utf-8" ?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <helpfulcore>
      <analytics.index.builder>
        <analyticsIndexBuilder type="Helpfulcore.AnalyticsIndexBuilder.AnalyticsIndexBuilder, Helpfulcore.AnalyticsIndexBuilder">
          <param desc="analyticsSearchService" ref="helpfulcore/analytics.index.builder/analyticsSearchService" />
          <param desc="collectionDataProvider" ref="helpfulcore/analytics.index.builder/collectionDataProvider" />
          <param desc="logger" ref="helpfulcore/analytics.index.builder/logging/loggingService" />
          <param desc="batchSize">1000</param>
          <param desc="concurrentThreads">4</param>
        </analyticsIndexBuilder>
        <analyticsSearchService type="Helpfulcore.AnalyticsIndexBuilder.ContentSearch.AnalyticsSearchService, Helpfulcore.AnalyticsIndexBuilder">
          <param desc="logger" ref="helpfulcore/analytics.index.builder/logging/loggingService" />
        </analyticsSearchService>
        <collectionDataProvider type="Helpfulcore.AnalyticsIndexBuilder.MongoDb.MongoCollectionDataProvider, Helpfulcore.AnalyticsIndexBuilder">
          <param desc="analyticsConnectionString">analytics</param>
          <param desc="logger" ref="helpfulcore/analytics.index.builder/logging/loggingService" />
          <param desc="contactFactory" ref="model/entities/contact/factory" />
          <Filters hint="list">
            <filter type="Helpfulcore.AnalyticsIndexBuilder.Data.KnownContactsFilter, Helpfulcore.AnalyticsIndexBuilder" />
          </Filters>
        </collectionDataProvider>
        <logging>
          <loggingService type="Helpfulcore.Logging.LoggingService, Helpfulcore.Logging" singleInstance="true">
            <param desc="provider1" ref="helpfulcore/analytics.index.builder/logging/providers/nlogDebugFileProvider"/>
          </loggingService>
          <providers>
            <nlogDebugFileProvider type="Helpfulcore.Logging.NLog.NLogLoggingProvider, Helpfulcore.Logging.NLog" filePath="$(dataFolder)/logs/Helpfulcore.AnalyticsIndexBuilder.log.${date:format=yyyyMMdd}.txt" singleInstance="true">
              <param desc="filePath">$(filePath)</param>
              <LogLevel>Debug</LogLevel>
            </nlogDebugFileProvider>
          </providers>
        </logging>
      </analytics.index.builder>
    </helpfulcore>
  </sitecore>
</configuration>

All methods are performance-optimized and use multiple threads for rebuilding indexables. All public methods are XML-documented, so you can see what each particular method does. The analytics index is updated in batches; the batch size is a parameter in the configuration file and is set to 1000 by default.
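
To illustrate the general idea of batched, multi-threaded processing, here is a minimal sketch. It is not the module's actual implementation: the names BatchingSketch, InBatches and ProcessInBatches are hypothetical, and the batchSize and concurrentThreads arguments only mirror the parameters of the same names in the config file above.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical illustration only; the module's real code may differ.
public static class BatchingSketch
{
	// Splits a sequence into consecutive lists of at most batchSize items.
	public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int batchSize)
	{
		if (source == null) throw new ArgumentNullException("source");
		if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");

		var batch = new List<T>(batchSize);
		foreach (var item in source)
		{
			batch.Add(item);
			if (batch.Count == batchSize)
			{
				yield return batch;
				batch = new List<T>(batchSize);
			}
		}

		// Emit the last, possibly smaller, batch.
		if (batch.Count > 0)
		{
			yield return batch;
		}
	}

	// Runs processBatch for every batch, using at most concurrentThreads parallel workers.
	public static void ProcessInBatches<T>(
		IEnumerable<T> source,
		int batchSize,
		int concurrentThreads,
		Action<List<T>> processBatch)
	{
		var options = new ParallelOptions { MaxDegreeOfParallelism = concurrentThreads };
		Parallel.ForEach(InBatches(source, batchSize), options, processBatch);
	}
}
```

With the default settings, 2,500 items would be split into batches of 1000, 1000 and 500, processed by up to 4 workers at a time. A smaller batch size means more, lighter index updates; a larger one means fewer, heavier ones.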

Module’s API

You can also use the API provided by the module in your own code. The primary object is the AnalyticsIndexBuilder class.
Here is how you can get an instance of it:

using Sitecore.Configuration;
using Helpfulcore.AnalyticsIndexBuilder;
...
var analyticsIndexBuilder = (IAnalyticsIndexBuilder)Factory.CreateObject("helpfulcore/analytics.index.builder/analyticsIndexBuilder", true);

Here is the interface which it provides:

public interface IAnalyticsIndexBuilder
{
	bool IsBusy { get; }
	void RebuildAllIndexables(bool applyFilters);
	void RebuildContactIndexableTypes(bool applyFilters);
	void RebuildContactIndexableTypes(IEnumerable<Guid> contactIds);
	void RebuildVisitIndexableTypes(bool applyFilters);
	void RebuildVisitIndexableTypes(IEnumerable<Guid> contactIds);
	void RebuildContactIndexables(bool applyFilters);
	void RebuildContactIndexables(IEnumerable<Guid> contactIds);
	void RebuildAddressIndexables(bool applyFilters);
	void RebuildAddressIndexables(IEnumerable<Guid> contactIds);
	void RebuildContactTagIndexables(bool applyFilters);
	void RebuildContactTagIndexables(IEnumerable<Guid> contactIds);
	void RebuildVisitIndexables(bool applyFilters);
	void RebuildVisitIndexables(IEnumerable<Guid> contactIds);
	void RebuildVisitPageIndexables(bool applyFilters);
	void RebuildVisitPageIndexables(IEnumerable<Guid> contactIds);
	void RebuildVisitPageEventIndexables(bool applyFilters);
	void RebuildVisitPageEventIndexables(IEnumerable<Guid> contactIds);
}

For example, if you have a list of contact IDs that you need to re-index, use the following code:

using System;
using System.Collections.Generic;
using Sitecore.Configuration;
using Helpfulcore.AnalyticsIndexBuilder;

public class UpdateAnalyticsIndexForContacts
{
	private readonly IAnalyticsIndexBuilder analyticsIndexBuilder;

	public UpdateAnalyticsIndexForContacts()
	{
		// Resolve the builder from the module's configuration node; 'true' asserts that the object is created.
		this.analyticsIndexBuilder = (IAnalyticsIndexBuilder)Factory.CreateObject(
			"helpfulcore/analytics.index.builder/analyticsIndexBuilder",
			true);
	}

	public void UpdateIndexForContacts(IEnumerable<Guid> contactIds)
	{
		// This re-indexes the 'contact', 'contactTag' and 'address' indexable types for the specified contacts.
		this.analyticsIndexBuilder.RebuildContactIndexableTypes(contactIds);
	}
}
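
For the CRM sync scenario described earlier, you may want a small guard around the rebuild call. The following is only a sketch under stated assumptions: PostSyncIndexUpdater and TryUpdateIndex are hypothetical names, not part of the module, and I assume here that IsBusy returns true while another rebuild is running. The helper takes two delegates instead of the module's interface, so the sketch compiles without a reference to the module assembly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Helpfulcore.AnalyticsIndexBuilder.
public class PostSyncIndexUpdater
{
	private readonly Func<bool> isBusy;
	private readonly Action<IEnumerable<Guid>> rebuildForContacts;

	public PostSyncIndexUpdater(Func<bool> isBusy, Action<IEnumerable<Guid>> rebuildForContacts)
	{
		if (isBusy == null) throw new ArgumentNullException("isBusy");
		if (rebuildForContacts == null) throw new ArgumentNullException("rebuildForContacts");

		this.isBusy = isBusy;
		this.rebuildForContacts = rebuildForContacts;
	}

	// Returns true when a rebuild was requested, false when it was skipped.
	public bool TryUpdateIndex(IEnumerable<Guid> syncedContactIds)
	{
		if (syncedContactIds == null) throw new ArgumentNullException("syncedContactIds");

		// Drop empty and duplicate IDs so each contact is re-indexed once.
		var ids = syncedContactIds.Where(id => id != Guid.Empty).Distinct().ToList();
		if (ids.Count == 0)
		{
			return false;
		}

		// Assumption: the builder reports IsBusy while another rebuild is in progress.
		if (this.isBusy())
		{
			return false;
		}

		this.rebuildForContacts(ids);
		return true;
	}
}
```

With a builder instance obtained through Factory.CreateObject as shown earlier, the wiring could look like new PostSyncIndexUpdater(() => analyticsIndexBuilder.IsBusy, ids => analyticsIndexBuilder.RebuildContactIndexableTypes(ids)). When TryUpdateIndex returns false because the builder is busy, the caller can retry after the next sync.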

As always, the source code is available on my GitHub.

Give it a try 😉
