Are you sure you want to quit the chat?
Data migration projects can often seem straightforward at first glance. However, as many professionals know, migrating data from legacy or custom-built systems to modern enterprise platforms like Microsoft Dynamics CRM involves a host of challenges. These include dealing with differing data structures, transforming records to meet new schema requirements, and overcoming key mapping issues.
Recently, I undertook a data migration project that involved moving customer data from a custom-built application into Microsoft Dynamics CRM. This migration wasn’t limited to just one data source—we also had to integrate customer data stored in various Excel sheets into CRM. To accomplish this, we relied heavily on Azure Data Factory (ADF) pipelines and SQL stored procedures for data transformation and deduplication. Along the way, we faced and overcame a number of challenges, particularly around key mismatches and ensuring data consistency.
In this post, I will walk through the key steps we followed during the migration process, the architectural decisions we made, and how we tackled the challenges presented by mismatched foreign keys and duplicate customer records.
The foundation of any successful migration lies in having a clear and well-defined architecture. Our project involved moving customer data from two key sources:
To ensure a smooth migration into CRM, we adopted a multi-tiered architecture that involved:
Data extraction was the first step in the ETL (Extract, Transform, Load) process. This involved pulling data from both the custom application and the Excel files into the SQL staging database
The custom application’s database held a wealth of customer data, but it had a schema that didn’t align with Microsoft Dynamics CRM’s structure. We began by creating linked services in Azure Data Factory to establish secure connections to the custom application’s database. From there, we designed ADF pipelines to extract this data and load it into our SQL staging database.
In addition to the custom application, we had to deal with customer data scattered across various Excel files. These files were stored in Azure Blob Storage to ensure secure, high-availability access. We utilized ADF’s Copy Data Activity to read the Excel files and move them into the SQL staging database. The steps involved:
Each Excel file had its schema, so mapping columns correctly to the SQL staging tables was a critical part of this process.
With the data from both sources in the SQL staging database, the next step was to transform it into a format that Microsoft Dynamics CRM could accept. CRM has a well-defined schema, which meant we had to carefully reshape the data to ensure compatibility.
We created a dedicated schema called trn (for “transformation”) within the SQL staging database. This schema served as a workspace where data from both the custom application and Excel files could be transformed according to CRM’s structure.
Data transformation was handled by a series of SQL stored procedures. These stored procedures were designed to:
By using stored procedures, we automated much of the transformation process, ensuring that every record was properly formatted for insertion into CRM.
One of the most significant challenges we faced during this migration was handling the foreign key relationships between the custom application and CRM. The custom application uses integer primary keys (PKs) to identify records, while CRM uses GUIDs (Globally Unique Identifiers).
The primary key in the custom application was a simple integer. However, CRM requires GUIDs for its primary key structure, meaning that the relationships between entities (e.g., between a customer and their orders) would be lost if we didn’t properly map the integer keys to GUIDs
To solve this, we introduced a reference table in the staging database. This table acted as a bridge between the custom application’s integer keys and CRM’s GUIDs. The reference table stored a mapping between the old primary keys (integers) from the custom application and the new GUIDs generated by CRM.
The reference table was populated during the data transformation process. Specifically:
In the final migration step, we used XML code within ADF pipelines to perform dynamic lookups against the reference table. This allowed us to retrieve the correct GUIDs from CRM and insert them into the transformed records. The XML code was crucial for matching the reference key to the primary key and lifting the GUID from the original CRM table.
This approach ensured that the foreign key relationships between entities were preserved during migration, despite the mismatch in key formats.
Another critical component of the migration process was deduplication. Given the multiple data sources (the custom application and Excel files), there were bound to be duplicate customer records. Duplicate records could lead to confusion and errors in CRM, so we had to ensure that only unique customer records were retained.
This approach ensured that the foreign key relationships between entities were preserved during migration, despite the mismatch in key formats.
We built deduplication logic directly into the SQL-stored procedures used for data transformation. This logic involved:
Once duplicates were identified, we applied survivorship rules to consolidate them. In cases where the duplicate records contained conflicting information (e.g., one record had a phone number while the other did not), we used rules to retain the most complete or recent data. This ensured that the resulting customer profile in CRM was accurate and up to date.
As part of the deduplication process, we also updated our reference table to reflect old-to-new key mappings. This allowed us to ensure that the correct GUIDs were assigned to the deduplicated records, maintaining the integrity of the data relationships in CRM.
Step 5: Lifting and Shifting the Data to CRM
With the data transformed, cleaned, and deduplicated, we were ready to load it into Microsoft Dynamics CRM. This final step involved using ADF pipelines to lift the transformed data from the SQL staging database and shift it into CRM’s production environment.
We designed a series of ADF pipelines to map the transformed data to CRM’s entities. The pipelines were responsible for:
One of the key concerns during this stage was minimizing downtime and ensuring that the data migration did not disrupt CRM operations. To achieve this, we scheduled the data migration to occur during non-peak hours, to avoid overwhelming the CRM system with a large volume of data at once.
Conclusion:
Migrating customer data from a custom-built application and external Excel files into Microsoft Dynamics CRM was a challenging but rewarding project. By leveraging Azure Data Factory, SQL stored procedures, and Azure Blob Storage, we were able to successfully extract, transform, and load the data into CRM while overcoming significant challenges around foreign key mismatches and duplicate records.
The use of a reference table for key mapping and deduplication logic in the transformation process ensured that the migrated data was accurate, complete, and fully aligned with CRM’s structure.
As part of the deduplication process, we also updated our reference table to reflect old-to-new key mappings. This allowed us to ensure that the correct GUIDs were assigned to the deduplicated records, maintaining the integrity of the data relationships in CRM.
At ITKnocks, we are more than an IT consulting company; we’re your strategic partner in business evolution. With a global footprint and a passion for technology, we craft innovative solutions, ensuring your success. Join us on a journey of excellence, where collaboration meets cutting-edge IT expertise.