Account mapping is conceptually simple: you transfer files and sharing from a set of users on one system to a corresponding set of users on a new system. Because of the scope of such a migration, system-to-system differences, and the conflicts that can arise at this scale, it can produce unexpected results if not managed closely. You can minimize these by following the principles below.
Convert all invited users over to full members. When you, as the admin for a new cloud-based corporate account, create accounts for your employees, they typically receive email invitations and must confirm them before they become full members with full privileges. As difficult as it can be to convert the holdouts, having most or all of your users converted will make your migration easier, less confusing, and less stressful.
Understand the limitations imposed by source and target in your migration, and plan accordingly. Your target platform may have limits for file size, number of files per drive or folder, number and structure for ACLs, and so on. The source may not be able to transfer some items, such as proprietary file links, to the target. CFP has extensive resources to help you address many of the issues that can arise.
Keep users informed of the migration. Before the migration:
- If there is any data that users do not want sent to the target, have them move or delete it. You can also flag it in your user mapping spreadsheet to exclude it from the migration.
- If differences in the target permissions (such as waterfall permissions) will affect data access, make users aware of the potential ramifications to confidential data.
- Remove any permissions from files and folders that will prevent data from being transferred over.
- Train users on how to use the target platform. Paths to access the data may look different, or files may be in a completely different location or format. Advising users on how to use the target platform can reduce help desk workload and confusion.
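The skip flag mentioned above can be honored by filtering the mapping spreadsheet before the job runs. The sketch below illustrates the idea; the column names and addresses are made up for illustration, not CFP's actual spreadsheet schema, so check the spreadsheet CFP generates for the real headers.

```python
import csv
import io

# Hypothetical mapping spreadsheet with a "skip" column. All names
# and headers here are illustrative, not CFP's real schema.
SPREADSHEET = """\
source_user,target_user,skip
alice@old.example.com,alice@new.example.com,no
bob@old.example.com,bob@new.example.com,yes
carol@old.example.com,carol@new.example.com,no
"""

def users_to_migrate(sheet_text):
    """Return mapping rows that are not flagged to be skipped."""
    rows = csv.DictReader(io.StringIO(sheet_text))
    return [r for r in rows if r["skip"].strip().lower() != "yes"]

for row in users_to_migrate(SPREADSHEET):
    print(row["source_user"], "->", row["target_user"])
```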
During the migration:
- Users may continue to add folders and files to the source. They may also update existing files on the source.
- Do not move or rename files and folders.
- Do not access the target at all. If that is not possible, users must at minimum avoid accessing data in the folder(s) marked for migration, or the shared folders being transferred over.
- Migrations frequently take several days, and users should be aware of this.
After the migration:
- Inform users that you will be running a final sync, during which they will not be able to access data on the source or target for a brief interval; this is best done over a weekend or otherwise outside business hours. Do not apply lockdown procedures that would prevent CFP from accessing the data itself.
- Run the job again to do a sync. If you have selected full accounts (e.g., Dropbox or Box), all newly created folders will be transferred over.
- Stop using the source and switch all users over to the target.
Fundamentally, understand what you can and cannot replicate from source to target in your migration, and then plan accordingly. CFP provides both knowledgebase content and software features to mitigate some of these limitations. If you are migrating off Windows, for example, embedded links in Excel sheets to other files on the network may break, and reduced sharing of files may not be supported on the target ecosystem. CFP provides a report of the embedded links so you can plan accordingly, and will identify reduced file sharing where it is not supported, along with automated ways to resolve these conflicts quickly. CFP simulations will also identify some unsupported features, such as paths that are too long or files that are too big.
None of these measures are exhaustive, however. Talk to your cloud service provider representative and review their documentation so you have a plan to resolve any resulting conflicts before you begin migrating any data.
While many migrations can be run with co-admin credentials, having full admin credentials simplifies the migration, since co-admins may not be able to access all accounts or view all data.
Ensure that all of the users you will need for your target account are created and invitations confirmed before you generate the account mapping spreadsheet in CFP. Talk to CFP support so you understand any permissions issues with users who are still in invited status.
Ensure that all groups are created on the target. You do not need to add members to the groups; you only need to create them. The one exception is Team Folders on Dropbox: if you are migrating data to Dropbox Team Folders, you must assign a group at the top level of each Team Folder, and the Dropbox admin must be a member of that group to have write access to the Team Folders.
When you set up your job, make sure your data goes to a Target Destination Folder for the target node that is clearly labelled as data from the source, such as “Migrated Data.” This identifies and isolates the migrated data so that if there is any question about it later, it’s easier to manage.
Run the actual account mapping job as soon as possible after generating the spreadsheet, and run the final sync as soon as possible after the bulk migration. This minimizes the impact of changes to your data, users, and permissions. For example, you may have folders set up in your permissions tab and tagged for transfer; if you wait a month to do the actual transfer, one or more of those folders may have been deleted, and you will have to remove each one from both the job and the spreadsheet or it will generate a validation error.
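The staleness problem described above is essentially a set difference: anything tagged for transfer when the spreadsheet was generated but no longer present on the source must be removed before the job runs. A toy illustration, with hypothetical folder names:

```python
# Folders tagged for transfer at spreadsheet-generation time, versus
# what actually exists on the source a month later. Names are made up.
tagged_for_transfer = {"/Finance", "/Engineering", "/Legacy"}
currently_on_source = {"/Finance", "/Engineering"}

# Anything tagged but no longer present must be removed from both the
# job and the spreadsheet, or it will cause a validation error.
stale = tagged_for_transfer - currently_on_source
print(sorted(stale))
```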
Managing a migration of any size is a complex process. Here are some pointers that will help you to be successful:
It is extremely useful to keep a document that lists each job in the migration and tracks its progress through the phases: map generation, simulation, bulk data transfer, sync, and cutover/shares application. Posting it in a shared location lets other team members see where each part of the migration stands.
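In practice this tracking document is usually a shared spreadsheet, but the idea can be sketched as a small table of jobs and phases. The job names below are made up; the phase names follow the list above.

```python
# Phases each job passes through, in order, per the list above.
PHASES = ["map generation", "simulation", "bulk data transfer",
          "sync", "cutover/shares application"]

# Hypothetical jobs and their current phases.
jobs = {"finance-job-1": "sync", "engineering-job-1": "simulation"}

def advance(job):
    """Move a job to its next phase, if it has one."""
    i = PHASES.index(jobs[job])
    if i < len(PHASES) - 1:
        jobs[job] = PHASES[i + 1]

advance("engineering-job-1")
print(jobs)  # engineering-job-1 is now in "bulk data transfer"
```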
Outlining Migration Strategy
Do you want to run your migration, and cutover, in waves? Or as one single cutover? Large migrations are best done in waves, but that may not always be practical. The approach that works best for most customers is this:
- Generate an analysis of the full data set. This will tell you how much data you have, identify users who hold large quantities of data (usually this is a handful of users in the company), and display the sharing that is done on the source.
- Set up jobs, ideally no more than 1M files each. You can use the original map job as a parent to these jobs, group your jobs and create one parent map for each group, or generate separate maps for each job. Tagging jobs by group is helpful for organization if you have a lot of jobs.
Bear in mind that if you have to split a single account into multiple pieces, you should run two jobs at most that are accessing the same source account. Running more than two jobs for the same source account can cause unacceptable levels of rate limiting and retries that ultimately degrade performance. If split jobs are going to cause multiple jobs to access a single account, consider sharing out the data to dummy accounts, and running the splits from those dummy accounts. The same holds true for target accounts.
- Run simulations and then bulk transfer of data to the target. Once these complete, run nightly resyncs to keep the target accounts up to date. No sharing of data needs to be applied at this time.
- About a month before cutover, run new mappings and assess how long they will take. Any jobs that take longer than a day to generate the map should be split up. The goal is to have a configuration where you can start the map generations on a Friday night, have them complete by Saturday, and then apply the shares (and transfer any remaining data) over Saturday and Sunday.
Analyze the reports and the shares they show. Are there shares you want to eliminate? Are there incompatibilities to address between how sharing is done on the source and on the target?
- On cutover weekend, place source in lockdown on Friday afternoon. Run the map generations with permissions. Edit the spreadsheets and upload. Run the transfer jobs to apply those shares.
- Monday is a good catch-up day for anything that did not go as expected over the weekend. For that reason, Tuesday is a better cutover day than Monday.
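The job-sizing guidance above (jobs of no more than about 1M files each) can be sketched as a greedy first-fit packing of users into jobs. The user names and file counts below are made up, and this sketch deliberately does not model the two-jobs-per-source-account cap, which only applies when a single account is split across jobs.

```python
# Illustrative ceiling matching the ~1M-files-per-job guidance above.
MAX_FILES = 1_000_000

def split_into_jobs(user_file_counts, max_files=MAX_FILES):
    """Greedy first-fit packing of users into jobs by file count."""
    planned = []
    # Place the largest holders first so they anchor their own jobs.
    for user, count in sorted(user_file_counts.items(),
                              key=lambda kv: -kv[1]):
        for job in planned:
            if job["files"] + count <= max_files:
                job["users"].append(user)
                job["files"] += count
                break
        else:
            planned.append({"users": [user], "files": count})
    return planned

# Hypothetical per-user file counts from the full-data-set analysis.
counts = {"alice": 700_000, "bob": 400_000,
          "carol": 250_000, "dave": 90_000}
planned_jobs = split_into_jobs(counts)
for job in planned_jobs:
    print(job["files"], job["users"])
```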
Rate limiting is a mechanism that most service providers invoke when a single user on the source or target of a job exceeds a predetermined number of requests per day or per hour. It prevents a single user from monopolizing system resources, and it is also necessary to comply with some security strategies. The practical result for you is that if multiple jobs are run concurrently, they may get throttled and slow down drastically.
Fortunately, most account mapping jobs can work around rate limiting: even though data is transferred via a single admin on the source and a second single admin on the target, it moves between multiple mapped users, and the admin on each side of the job acts 'on behalf of' each particular user. The requests are therefore tallied per user rather than collectively against the admin. As long as the user sets for concurrent jobs are completely separate, with no overlap, rate limiting is unlikely; for these migrations, multiple jobs may be configured with the same source and target OAuth users. Some service providers do institute a maximum number of requests per day for a single admin user, but these limits are typically generous enough that most migrations will not encounter them.
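The 'on behalf of' behavior can be modeled as a tally keyed by the impersonated user rather than by the admin. The limit of 3 requests below is an arbitrary toy number, not any provider's real limit.

```python
from collections import Counter

# Toy model of per-user rate limiting with 'on behalf of' requests:
# the provider counts each request against the impersonated user, not
# the single admin issuing it.
PER_USER_LIMIT = 3
tally = Counter()

def request(admin, on_behalf_of):
    """Return True if allowed, False if this user would be throttled."""
    if tally[on_behalf_of] >= PER_USER_LIMIT:
        return False  # a 429-style throttle, scoped to this user only
    tally[on_behalf_of] += 1
    return True

# Two concurrent jobs share one admin but have disjoint user sets, so
# no single user's tally grows fast enough to trip the limit.
job_a_ok = all(request("admin", u) for u in ["u1", "u2", "u3"])
job_b_ok = all(request("admin", u) for u in ["u4", "u5", "u6"])
print(job_a_ok, job_b_ok)
```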
If you are not running an account mapping job and need to run concurrent jobs from a single account, one workaround is to share out the contents of the source account with another user, and then create a system with that second user's credentials. The second user can then transfer the shared folders that belong to the first user. You still have the issue that you will probably need to transfer the data to two separate accounts on the target to avoid rate limiting there. You can either move the data within your new service provider after the migration completes so it is all back in a single account, or manually share the data from the second account back to the first one. Consult your service provider about plans to move data within the target account, especially if it is a large quantity of data.
Rate limiting is separate from the maximum files per second of egress and ingress. Rate limiting occurs when a single user exceeds the provider's limit for requests. Maximum files per second for a given service provider is just what it sounds like: the maximum number of files that can transfer from, or to, that provider. These figures are just that, maximums; actual rates may decrease for large files, for syncs where a lot of data already exists in the specified target destination directory, and for any number of other reasons.
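A files-per-second ceiling translates directly into a back-of-envelope transfer-time estimate. The 20 files/sec figure below is illustrative, not any provider's real maximum, and actual rates can fall well below the ceiling for the reasons above.

```python
# Rough lower bound on transfer time from a files-per-second ceiling.
# 20 files/sec is a made-up example rate, not a real provider limit.
def estimated_hours(total_files, max_files_per_sec=20):
    return total_files / max_files_per_sec / 3600

print(round(estimated_hours(2_000_000), 1))  # about 27.8 hours
```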