Feedback on a Large-Scale Big Data Migration

A few years ago, at Société Générale, I had the opportunity to lead a strategic migration program: moving over 100 applications and datalabs from a legacy Big Data cluster to a newer, more automated, and more secure platform… without ever interrupting production. My scope covered the central departments: compliance, operational and credit risk, liquidity, finance, group repositories, and HR.

Today, I am sharing the key lessons from this adventure, in the hope that they will help you with your own migration projects, especially migrations to the cloud.

Challenge #1: Cracking the Data Problem

The first major challenge was to have a clear vision of how to handle data availability, both for internal teams and for client projects. We initially considered targeted approaches, such as:

But these two approaches immediately seemed far too complicated and therefore risky given our operational constraints. Thus, we preferred the following approach:

To do this, we launched a product early on to make our technical synchronization foundation more flexible, more scalable, and above all easy for the migration teams to use at scale, without requiring low-level interventions from the infrastructure teams.

Challenge #2: An Agile, Well-Tooled Migration

From the beginning of the program, we adopted some principles of modern software development:

More concretely, here is how these two dimensions were implemented.

For the first, the migration team worked in Scrum with one-week sprints, a formalized backlog that was reviewed and kept under control at each Sprint planning, and rapid feedback loops thanks to weekly demos and retrospectives. In addition to this agile setup, we held weekly operational committees with the projects and steering committees with management: these forums allowed us to maintain a constant and close link with the many stakeholders.

For the second, we developed the following products, with the BUILD running in parallel with the RUN of the migrations:

Migration Pattern

5-Phase Migration Roadmap

On the program management side, we organized our migration roadmap into 5 phases, defined according to the typology (and constraints) of the projects, as described below:

Feedback from Each Phase

Each phase of our roadmap yielded the following lessons:

Data Governance & Cybersecurity

Let’s conclude this article with two last important aspects: data governance and cybersecurity.

During the program, management mandated that we take advantage of this migration to:

Finally, as one of the challenges of the new cluster was to provide a much higher level of security, we worked hand in hand with the cybersecurity teams. Even though the European DORA regulation was not yet on the agenda at the time, cybersecurity requirements were becoming central, and we had to ensure that all our deliverables (processes and tools) fully complied with them.

Checklist: The Keys to a Successful Migration

To finish, here is a checklist of points to verify if you are managing a migration program:

Conclusion: To Migrate is to Transform

I hope this journey into the heart of a Big Data cluster migration has shed light on some issues that are easy to underestimate. And if you are working on cloud migrations, some of the aspects covered here may well help you.

I cannot end this article without warmly thanking all my colleagues who worked with me on this migration, both in the migration team itself, and in the infrastructure teams, project teams, and management teams. They are so numerous that I cannot name them all. But the success of this migration, which kept us intensely busy for almost 2 years, was possible thanks to them 🙏

Feel free to share your Big Data migration experiences in the comments.

And long live the Lucid Cluster!

Salvatore Russo
