Gmail and Google Apps mail migration

I’ve been a Google Apps user for a long time, and I think it’s a perfect solution for a smaller company like mine. In fact, it’s a perfect solution for bigger companies as well. Right now, I’m working with a client that wants to consolidate 10 individual Google Apps domains into a single account that handles all the domains. About half of the 15 users have accounts in all of the domains; the other half have accounts in 2-3 of them. Even if this isn’t the kind of work I typically do, the assignment brings some welcome new challenges.

Migration tool requirements

Anyway. One of the client’s requirements is, naturally, that all existing email is migrated into the new Google Apps account. Easy peasy, right? Well, it turns out that migrating email to and from Gmail (and therefore Google Apps) is quite a challenge for a number of reasons. Of course, there is plenty of advice on the Internet, but none of the proposed solutions covered all my needs, which are:

  1. The tool must be able to handle XOAUTH on the source.
  2. The tool must be able to handle XOAUTH on the target.
  3. The tool must be able to handle Gmail’s rather special model of treating folders like labels.
  4. The tool must be scriptable.
  5. The tool must be able to handle delta changes: running the script a second time should not create duplicates on the target email account.

I’ve looked at plenty of alternatives, but the only tools I found that could potentially do the job were too expensive. The cloud-based tools that exist typically charge per user account. That would have been fine if there were a 1-1 mapping between users and accounts, but in my scenario each user has on average 7 accounts, and I had given my client a fixed price for the entire job. So, as much as I would have loved to try them, I can’t afford to lose money on a job.

The two most problematic requirements were handling XOAUTH on both ends and handling the Gmail folder/label magic. The two official tools from Google, the migration API and the Google migration tool for Exchange, both failed. The API only gives you write access, so it’s not possible to get email OUT of a Gmail account using it. The Google Exchange migration tool assumes that the source server is something other than Gmail and requires you to know the username and password for all source accounts.

A solution… almost

Enter imapsync. Imapsync used to be a free open source tool and is now under a commercial license. But for a mere EUR 50, I bought access to the source code (in Perl). Imapsync is able to handle XOAUTH on both source and destination, it’s scriptable, and it’s able to use the Message-ID header to keep a kind of state. Running imapsync twice with the same parameters will not duplicate the emails on the target server.
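To make the "kind of state" idea concrete, here is a minimal sketch in Perl. It is not imapsync's code, and the helper name `missing_on_target` and the sample Message-IDs are made up for illustration. It only shows the general principle: if you know which Message-IDs already exist on the target, a second run copies only the ones that are missing.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of imapsync: return the source
# Message-IDs that are not yet present on the target.
sub missing_on_target {
    my ( $source_ids, $target_ids ) = @_;
    my %seen = map { $_ => 1 } @$target_ids;
    return grep { !$seen{$_} } @$source_ids;
}

# Made-up sample data: one of three messages was already migrated.
my @source = ( '<a@example.com>', '<b@example.com>', '<c@example.com>' );
my @target = ( '<a@example.com>' );

my @todo = missing_on_target( \@source, \@target );
print "would copy: @todo\n";
```

Run against an already complete target, the list of messages to copy is empty, which is why a second run creates no duplicates.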

The one problem I had with imapsync was the folder vs. label management. The problem most people know of is that Gmail doesn’t really use folders; it uses labels. The two are similar in a lot of cases, but there are differences. What I learned is that there’s another issue, regarding subfolders or nested labels. An example: if an IMAP client first creates the folder “fuu” and then the folder “fuu/bar”, Gmail shows a root level label “fuu” with a nested label “bar” under it. If the client instead creates “fuu/bar” directly, without creating “fuu” first, Gmail shows a single label named “fuu/bar”.

See the difference? In the last example, you might have thought that Gmail would create a root level label “fuu” and then a nested label “bar” under it. But nope, Gmail will happily create a label containing the actual IMAP label separator character. Bummer. So the end result is that if you transfer email with imapsync out of the box, you will get a flat structure of really long label names. That flat list can grow quite long if you’re actively using nested labels. And you don’t want that.

I pondered a number of solutions to this problem. I actually got to the point where I tried to migrate the source account to a local IMAP account on my own machine, manipulate the Maildir directly on disk to insert dummy emails in strategic places, and then migrate to the target account. It worked, but it also introduced a whole new set of moving parts.

The final solution (thanks Dennis)

It took a long Sunday walk with the dog before I realized that the proper solution would be to work with the imapsync source to fix folder creation. As I described above, the cause of the folder/label problem is that Gmail treats things differently depending on the order of folder creation. So, after the initial shock of seeing 5000 lines of Perl code (I don’t consider Perl to be part of my standard toolbox), I got to work and built a patch. With the patch in place, folder creation now works as I’d expected it to in the first place. The one downside to this solution is that it can’t tell the difference between a label on the source Gmail account whose name actually contains a / (forward slash) and a genuinely nested label.

The other thing missing from this patch is a switch to tell imapsync whether you want the different folder creation behavior or not. I guess that’s needed before I submit it back to the maintainer.

Anyway, this patch assumes that you have imapsync 1.542, even if it’s likely to work with other versions as well. If you have another version of imapsync and want to work with Gmail migrations, consider upgrading anyway, since only 1.542 supports XOAUTH. On line 2312, replace the existing create_folder function with this modified version:

sub create_folder {
    my( $imap2, $h2_fold, $h1_fold ) = @_ ;
    my( @parts, $parent ) ;

    print "Creating folder [$h2_fold] on host2\n" ;
    if ( ( 'INBOX' eq uc( $h2_fold ) )
     and ( $imap2->exists( $h2_fold ) ) ) {
        print "Folder [$h2_fold] already exists\n" ;
        return( 1 ) ;
    }

    # Make sure the parent folder exists before creating the child, so that
    # Gmail nests the label instead of creating one flat label.
    # $h2_sep, $dry and $nb_errors are defined elsewhere in imapsync.
    # \Q..\E quotes the separator; split() would otherwise treat it as a regex.
    @parts = split( /\Q$h2_sep\E/, $h2_fold ) ;
    pop( @parts ) ;
    $parent = join( $h2_sep, @parts ) ;
    $parent =~ s/^\s+|\s+$//g ;
    if ( ( $parent ne "" ) and ! $imap2->exists( $parent ) ) {
        create_folder( $imap2, $parent, $h1_fold ) ;
    }

    if ( ! $dry ) {
        if ( ! $imap2->create( $h2_fold ) ) {
            print( "Couldn't create folder [$h2_fold] from [$h1_fold]: ",
            $imap2->LastError(  ), "\n" ) ;
            $nb_errors++ ;
            # success if folder exists ("already exists" error)
            return( 1 ) if $imap2->exists( $h2_fold ) ;
            # failure since create failed
            return( 0 ) ;
        } else {
            # create succeeded
            return( 1 ) ;
        }
    } else {
        # dry mode: nothing is created, so later IMAP calls on this folder
        # will fail; report failure
        return( 0 ) ;
    }
}
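
To see the creation order that the patched function relies on, here is a small standalone sketch. It doesn’t talk to an IMAP server, and the helper name `folder_chain` is made up for this illustration; it is not part of imapsync. Given a folder name and a separator, it lists the folder and all of its ancestors, parents first, which is the order the folders need to be created in for Gmail to nest the labels.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of imapsync: return the given folder
# and all of its ancestors, parents first.
sub folder_chain {
    my ( $folder, $sep ) = @_;
    # \Q...\E quotes the separator so that e.g. "." is not a regex wildcard.
    my @parts = split( /\Q$sep\E/, $folder );
    my @chain;
    for my $i ( 0 .. $#parts ) {
        push @chain, join( $sep, @parts[ 0 .. $i ] );
    }
    return @chain;
}

# Prints fuu, fuu/bar and fuu/bar/baz, one per line.
print "$_\n" for folder_chain( 'fuu/bar/baz', '/' );
```

The patch reaches the same order through recursion: before creating a folder it first creates the missing parent, which in turn creates its own parent, and so on.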

2 comments

  1. It looks as though Mr. Gilles has back-ported the oauth code, as my copy of 1.286 functions properly with GMail (aside from the functionality that your patch offers)
