Data Migration Between CMS Repositories?
StyleChief asks: "My employer has decided to begin migrating all of the company's documentation-oriented objects and files to a new content management system. The new system seems to have good functionality, robustness, and better usability than our current systems. However, the task of migrating all of the data from 2 or 3 other repositories to the new system seems to be a daunting chore. Automating the process as much as possible is of course my first goal. There are APIs that one can use to do this, but the details quickly become eye-opening. Questions of objects versus files, handling their attributes, authorizations, file type identification, shadowing, build integration, versioning, etc., are only some of the plethora of issues at hand. Moving perhaps hundreds of thousands of objects from one proprietary repository to another while preserving everything related to each object is the name of the game. I would like to know how others from Slashdot have dealt with similar scenarios. I am particularly interested in the 'lessons learned,' and the problems that you didn't see coming beforehand."
easy (Score:1, Redundant)
Re:easy (Score:2)
The swiftest way. (Score:1, Informative)
The swiftest way to migrate data between repositories.
Drop the firewall.
1st hand experience (Score:3, Interesting)
As far as CMS software goes, Documentum is one of the better ones that I have had to deal with (compared to Interwoven, Vignette, Stellent), and the API really is simple to use once you dig through the layer of Java / OO design cruft that some developer types like to throw in the way of getting things done.
A few questions need to be asked and answered before you can ever migrate content from other systems. First comes a survey of what kind of content you are trying to pull. Is it structured and tagged or categorized well, like XML in DocBook format? Or is it old 1996 HTML crap that years of users and FrontPage or Dreamweaver have thrown out to the corporate intranet? If it is the latter, I suggest looking at a nice emerging company called Nahava that handles this well. (I am not affiliated, but I think their tech is well done after working with them a bit.)
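For what it's worth, a first pass at that survey can be scripted. This is only a rough sketch, assuming each document's text has already been exported from the old repository; `survey` and `survey_all` are names made up for illustration, and the "generator" meta tag check is just one heuristic for spotting FrontPage or Dreamweaver output.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Heuristic (assumption): HTML editors usually stamp a "generator" meta tag.
GENERATOR_RE = re.compile(
    r'<meta[^>]+name=["\']?generator["\']?[^>]*content=["\']([^"\']+)',
    re.IGNORECASE,
)


def survey(text):
    """Classify one exported document as structured XML, legacy HTML, or unknown."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        root = None  # not well-formed XML, typical of old hand-edited HTML

    if root is not None:
        # Strip a namespace prefix such as "{http://docbook.org/ns/docbook}".
        local_name = root.tag.rsplit("}", 1)[-1].lower()
        if local_name != "html":
            return "structured-xml"

    match = GENERATOR_RE.search(text)
    if match:
        return "legacy-html:" + match.group(1)
    if re.search(r"<html\b", text, re.IGNORECASE):
        return "legacy-html"
    return "unknown"


def survey_all(texts):
    """Count how many documents fall into each bucket."""
    return Counter(survey(text) for text in texts)
```

Running `survey_all` over a sample export gives a quick count per bucket, which is usually enough to tell whether you are facing a clean-up project or a straight copy.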
After this is done, you have to decide what you want the new CMS to store. Are you going to fit all the old stuff to some fancy new taxonomy that a big brain strategerian has come up with, or is it a straight-over migration, with 3 root folders, one for each of the old systems? Is it possible to do both by putting some of the Documentum features to use?
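To make the "do both" option concrete, here is a rough sketch that keeps every object under a root folder named after its old system and also tags it with categories from the new taxonomy. It uses no Documentum API at all; `target_path`, `categories_for`, `plan_object`, the `/migrated` root, and the keyword rules are all hypothetical, and the substring matching is deliberately naive.

```python
import posixpath


def target_path(system, old_path, root="/migrated"):
    """Straight-over placement: one root folder per old system."""
    # Normalise Windows-style separators and collapse any ".." segments
    # so nothing can land outside the per-system root.
    cleaned = posixpath.normpath("/" + old_path.replace("\\", "/")).lstrip("/")
    return posixpath.join(root, system, cleaned)


def categories_for(old_path, rules):
    """Taxonomy placement: naive keyword match of the old path against rules.

    `rules` maps a lowercase keyword to a category in the new taxonomy.
    """
    lowered = old_path.lower()
    return sorted({category for keyword, category in rules.items() if keyword in lowered})


def plan_object(system, old_path, rules):
    """Plan where one object goes and which categories get attached to it."""
    return {
        "path": target_path(system, old_path),
        "categories": categories_for(old_path, rules),
    }
```

The folder path preserves where things used to live, so users can still find them, while the categories can be stored as attributes on the object and refined later without moving anything.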
Anyway, there are a million things to answer in this process, good luck!
A few thoughts (Score:5, Informative)
Interestingly, none of these "migration articles" on web sites that are explicitly devoted to CMS matters (e.g., CMSwatch.com, cmsReview.com) seem to characterize this problem as relating to Extraction, Transformation, and Loading (ETL) [eweek.com], raising the possibility that their authors are ignorant of the many ETL tools that are available. In the open source world, these tools include Octopus [enhydra.org] and Jetstream [sourceforge.net]. Of course, Perl programmers do not call this process "ETL," but, rather, simply "data munging. [barnesandnoble.com]"
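For readers who have not run into ETL before, the shape of it is simple enough to sketch. The following is a toy illustration, not how Octopus or Jetstream work: the source rows are plain dicts, the target is an in-memory dict, and every name in it (`SourceObject`, `attribute_map`, the `legacy_` prefix) is an assumption made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SourceObject:
    """Hypothetical record as pulled from an old repository."""
    object_id: str
    path: str
    attributes: dict = field(default_factory=dict)
    versions: list = field(default_factory=list)  # oldest first


def extract(source_rows):
    """Extract: wrap raw rows (plain dicts here) in SourceObject records."""
    for row in source_rows:
        yield SourceObject(
            object_id=row["id"],
            path=row["path"],
            attributes=dict(row.get("attrs", {})),
            versions=list(row.get("versions", [])),
        )


def transform(obj, attribute_map):
    """Transform: rename attributes to the target schema.

    Attributes with no mapping are kept under a 'legacy_' prefix
    so that nothing is silently dropped.
    """
    renamed = {}
    for name, value in obj.attributes.items():
        renamed[attribute_map.get(name, "legacy_" + name)] = value
    return {
        "source_id": obj.object_id,
        "path": obj.path,
        "attributes": renamed,
        "versions": obj.versions,
    }


def load(records, target):
    """Load: write records into the target and return ids that collided."""
    rejected = []
    for record in records:
        if record["path"] in target:
            rejected.append(record["source_id"])
            continue
        target[record["path"]] = record
    return rejected


def migrate(source_rows, attribute_map, target):
    """Run extract, transform, and load as one streaming pipeline."""
    transformed = (transform(obj, attribute_map) for obj in extract(source_rows))
    return load(transformed, target)
```

The list of rejected ids that comes back from `migrate` is the useful part in practice: with several source repositories feeding one target, path collisions are among the problems you tend not to see coming.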
A prior Slashdot story on "Transferring data 'tween databases" (posted 14 April 2003) might interest you. I cannot post a link to it, however, because Slashdot's search engine is currently down.
Finally, EMC just bought Documentum, the CMS that you are considering. EMC is primarily a storage company, and I cannot help but wonder how CMS fits into their storage strategy.
Re:A few thoughts (Score:3, Informative)
Re:A few thoughts (Score:1)
Re:A few thoughts (Score:2)
The quick answer is that content has to be kept somewhere, i.e. storage, and EMC is always interested in things that help sell storage.
The long answer is that so-called fixed content is a growing slice of the storage pie. EMC has a nifty way to store fixed content (see this article [com.com]), but that's only the bottom layer of the stack
Data Junction (Score:1, Informative)
Not going to help you but... (Score:2)
It takes someone as big as a government to demand "no lock in please, we're british" t
well, don't use Microsoft's CMS product (Score:2, Interesting)
With that said, the 2002 product integrates nicely with .NET and is actually pretty slick.