What Makes Data Migration so Complicated
Imagine you just bought a new home. You searched for the perfect house, persevered as you applied for a mortgage and boxed up your old abode. Moving day finally arrives, but as you move your furniture from one home to the other, you discover that the process is not as smooth as you thought. Pieces don’t work for the new space, things get lost and you have to break down clunky items you thought would fit.
The same can be said for data migration. Data migration is the process of transferring your data and content from your old site to your new one. It is a part of almost every project, but unfortunately there is no magic button that seamlessly transfers your content into its new home. Just like that oversized sectional sofa, your text copy, files, links and images will have to be broken down and restructured to fit the new space. It is easy to overlook the need for data migration at the start of a website redesign project, or to underestimate its complexities, so we’re covering the bumps, bruises and hiccups that can occur during data migration to better arm you with the tools you need for a smooth process.
Your content is specific to your current site, not the new one
Content is not just body text; content includes links, images and embedded files. Your new site will have an updated CMS, new taxonomies and different naming structures. If linked data on your current site directs users to a URL that will no longer exist, or is no longer classified by the same name after your new site goes live, that piece of data will fail when migrated, and users will be greeted by the dreaded “404” error page.
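As a rough illustration of the link problem, here is a minimal Python sketch that rewrites internal links using an old-to-new path mapping and lists any internal paths that have no new home yet. The mapping, the paths and the function name `rewrite_links` are all hypothetical, and the regular expression only handles simple double-quoted `href` and `src` attributes; a real migration would likely need a proper HTML parser.

```python
import re

# Hypothetical mapping from old-site paths to new-site paths.
URL_MAP = {
    "/news/2014/annual-report.html": "/reports/annual-report-2014/",
    "/files/brochure.pdf": "/wp-content/uploads/brochure.pdf",
}

# Matches simple double-quoted href/src attributes only (an assumption).
LINK_RE = re.compile(r'(href|src)="([^"]+)"')


def rewrite_links(html, url_map):
    """Return (new_html, unmapped): rewritten markup plus internal paths missing from url_map."""
    unmapped = []

    def swap(match):
        attr, url = match.group(1), match.group(2)
        # Only site-relative paths ("/...") are treated as internal here.
        if not url.startswith("/") or url.startswith("//"):
            return match.group(0)
        if url in url_map:
            return f'{attr}="{url_map[url]}"'
        # No new home known: keep the link as-is and flag it for review.
        unmapped.append(url)
        return match.group(0)

    return LINK_RE.sub(swap, html), unmapped
```

Anything returned in `unmapped` is a link that would likely end in a 404 after launch unless someone decides where it should point.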
Garbage data is a big issue
Every site is guilty of having garbage data. Multiple users, messy work, inconsistent naming conventions, duplicate data, data saved on private drives and copy/paste issues are all considered “garbage data”. In these instances, it is fundamentally impossible to migrate all current data accurately.
Approach is not a science
Dave, WDG’s data migration guru, says, “Migration is 80% approach and 20% pragmatic.” Completing data migration from your old site to a new one does not always follow a fixed sequence. It takes a great deal of testing and trial and error, and experience is key, so there is no fool-proof step-by-step guide that will offer non-experts a seamless migration.
At WDG, our implementation team is made up of experts on data migration, and we have carried out complex and massive migrations for many of our clients. Read our case study on how we migrated four different sites into one for the American Enterprise Institute (AEI). For our client, Space News, we migrated over ten years of content onto their new site. And for Washington DC’s top news site, WTOP, WDG migrated over a hundred thousand articles.
As we said, there is no fool-proof step-by-step guide. However, here is a loose outline of the steps WDG experts typically use for data migration:
1. Analyze the content structure of the new website. Is it WordPress? Drupal? Are there custom layouts?
2. Assess the scope and detail of the data being migrated. This means looking at the kind of data being migrated, the complexity of the content, and the amount.
Complexity to be considered:
• Do articles have multiple kinds of categorization?
• Does content contain links to other internal content?
• Does content contain embedded videos, images or sound files?
The amount:
• When conceptualizing data migration, one would typically think that migrating 10 files takes less time than migrating 100 files, and 100 takes less time than 100,000, and so forth. However, the number of files is not the only factor that affects complexity. Also, when data migration is automated, scripts take away the pain of handling thousands of records by hand.
3. Obtain a copy of the data. This could be a .csv export or a spreadsheet.
4. Run scripts to gauge the rate of fallout. Programmatically extract local and foreign paths. Skim for link verity (aka finding radically formatted content). This step is practically necessary for larger migrations. In this process, all the data is programmatically analyzed, generating a report that shows things such as:
- list of domains used in links
- list of extensions used in links
- malformed tags
5. Examine the results of the scan and script run to spot outliers and determine if resolution is possible.
This can tell us things like:
- how often content references/links to itself (if any)
- how often images (externally referenced and internally referenced) are used (if any)
- if content contains links to content stored locally (ex. file://c/documents/picnic.jpg). This is usually the result of mistaken content entry. Though rare, it happens often enough to keep an eye out for.
6. For those items where a resolution is possible, a script is written to replace the records in question from the old database with new content, edited specifically for the updated CMS.
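The scan in steps 4 and 5 can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not WDG’s actual tooling: the function name `scan`, the report keys and the `(record_id, body)` record shape are hypothetical, and it uses only the standard library. It tallies the domains and extensions used in links, counts internal links and images, and flags links to locally stored files. It does not detect malformed tags, because Python’s `HTMLParser` is deliberately lenient.

```python
import os
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects (tag, url) pairs from href and src attributes."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append((tag, value))


def scan(records, site_domain):
    """Build a fallout report from (record_id, html_body) pairs."""
    report = {
        "domains": Counter(),      # domains used in links
        "extensions": Counter(),   # file extensions used in links
        "internal_links": 0,       # links pointing back at the site itself
        "images": 0,               # <img> references, internal or external
        "local_file_links": [],    # file:// links, usually entry mistakes
    }
    for record_id, body in records:
        collector = LinkCollector()
        collector.feed(body)
        for tag, url in collector.links:
            parsed = urlparse(url)
            if tag == "img":
                report["images"] += 1
            if parsed.scheme == "file":
                report["local_file_links"].append((record_id, url))
                continue
            domain = parsed.netloc.lower()
            if domain:
                report["domains"][domain] += 1
            ext = os.path.splitext(parsed.path)[1].lower()
            if ext:
                report["extensions"][ext] += 1
            # Relative URLs and URLs on the site's own domain count as internal.
            if parsed.scheme in ("", "http", "https") and domain in ("", site_domain):
                report["internal_links"] += 1
    return report
```

In practice the records would come from the export obtained in step 3, and the report would be reviewed by hand to decide which outliers can be resolved by script.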
Worried about losing your content during a website redesign? Let us help! Contact us and let us tell you more about the data migration process.