Migration

Running a large Hadoop or Cloudera CDP migration?

For the business overview (drivers, cost models, case studies, ROI), see the Ilum Hadoop & Cloudera Migration Platform page. For the implementation reference once a migration is in flight, see the migration toolkit documentation, which covers discovery, phased execution, data validation, and rollback.

The rest of this page describes a high-level, self-directed migration path and covers product-version migration notes for Ilum itself, including the MongoDB to PostgreSQL metadata-store transition and the Konvert library pilot for data-conversion workloads.

Migration Support

Moving from Apache Hadoop to an environment managed by Ilum can seem daunting, but you are not alone in this process. Migrating data and applications, configuring a new environment, and verifying that everything works as expected is a complex task.

The Ilum team is ready to provide full support. If you need help configuring Ilum, migrating your Spark clusters, or with any other part of the transition, do not hesitate to contact us. We can provide a Helm chart to simplify deploying Ilum and guide you through the steps required to migrate your existing Hadoop cluster to the new environment.

We are committed to making the migration as smooth as possible. Whether you have technical questions, need advice on best practices, or run into problems during the migration, we are here to help.

Contact us at [email protected] at any time for help with your migration to Ilum. Our dedicated support team is ready to help you move toward efficient, manageable Apache Spark cluster management with Ilum.

Migration Notes

Migration from version 5.*.* to version 6.0.0

Release 6.0.0 introduces a new security implementation that requires particular attention during migration. Existing user accounts must be recreated if the default administrator account was modified.

Follow the steps below to migrate to version 6.0.0. The example command creates two accounts: one for an administrator and one for a regular user.

helm upgrade \
    --set ilum-core.security.internal.users[0].username=admin \
    --set ilum-core.security.internal.users[0].password=adminPassword \
    --set ilum-core.security.internal.users[0].roles[0]=ADMIN \
    --set ilum-core.security.internal.users[1].username=user \
    --set ilum-core.security.internal.users[1].password=userPassword \
    --set ilum-core.security.internal.users[1].roles[0]=USER \
    --reuse-values ilum ilum/ilum

For all supported authentication methods and their parameters, see the README.md file in the ilum-core chart.
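The same two accounts can also be supplied through a values file instead of --set flags, which keeps passwords out of shell history. The following is a minimal sketch, not an official procedure: the file name ilum-users-values.yaml and the RUN switch are assumptions made for illustration, and the YAML keys simply mirror the --set paths in the command above.

```shell
#!/bin/sh
set -eu

# Hypothetical file name; keep it out of version control because it holds passwords.
VALUES_FILE="ilum-users-values.yaml"

# Same keys as the --set flags, written as nested YAML.
cat > "$VALUES_FILE" <<'EOF'
ilum-core:
  security:
    internal:
      users:
        - username: admin
          password: adminPassword
          roles:
            - ADMIN
        - username: user
          password: userPassword
          roles:
            - USER
EOF

# Apply only when explicitly requested, so the script is safe to dry-run.
if [ "${RUN:-0}" = "1" ]; then
  helm upgrade -f "$VALUES_FILE" --reuse-values ilum ilum/ilum
else
  echo "Wrote $VALUES_FILE; re-run with RUN=1 to apply it."
fi
```

Running the script without RUN=1 only writes the file, so its contents can be reviewed before the release is touched.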

Migration from version 6.0.* to version 6.1.0

Release 6.1.0 introduces a new implementation of Ilum's Spark storage that requires particular attention during migration. The existing bucket configuration must be reformatted to match the new schema.

Previously, the S3 bucket that Ilum uses to store Spark resources was configured with the ilum-core.kubernetes.s3.bucket Helm parameter. As of release 6.1.0, it is replaced by two new parameters:

ilum-core.kubernetes.s3.sparkBucket - plays the same role as the previous parameter
ilum-core.kubernetes.s3.dataBucket - configures the bucket used to store Ilum tables
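As an illustration of the new schema, the sketch below sets both parameters in one upgrade. The bucket names and the run helper are assumptions introduced for this example; only the two parameter names and the release and chart names come from this page. By default the script prints the command rather than executing it.

```shell
#!/bin/sh
set -eu

# Hypothetical bucket names. SPARK_BUCKET would normally reuse the value
# previously set in ilum-core.kubernetes.s3.bucket.
SPARK_BUCKET="ilum-files"
DATA_BUCKET="ilum-tables"

# Print the command by default; set RUN=1 to execute it against the cluster.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

run helm upgrade \
  --set "ilum-core.kubernetes.s3.sparkBucket=${SPARK_BUCKET}" \
  --set "ilum-core.kubernetes.s3.dataBucket=${DATA_BUCKET}" \
  --reuse-values ilum ilum/ilum
```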

Metadata store: MongoDB to PostgreSQL

Recent Ilum releases promote PostgreSQL to the primary metadata store for ilum-core. Access is reactive (R2DBC) with jOOQ-generated SQL DSL. MongoDB remains supported for legacy deployments and continues to receive bug fixes, but new deployments should default to PostgreSQL.

Why the change

Consistency with the rest of the stack: Marquez, Hive Metastore, Airflow, Superset, MLflow, Hydra, Gitea, n8n, and Kestra already share PostgreSQL. Consolidating ilum-core removes one stateful system from the deployment surface.
Schema-first metadata: jOOQ codegen produces type-safe SQL, replacing the schemaless reads against MongoDB collections that grew brittle as the metadata model expanded.
Operational tooling: Standard Postgres backup, replication, and observability tooling applies to Ilum metadata without bespoke MongoDB pipelines.

Default configuration

PostgreSQL is enabled out of the box in the umbrella Helm chart (postgresql.enabled: true). MongoDB is also enabled by default for backwards compatibility. Operators can disable MongoDB once they have migrated:

mongodb:
  enabled: false

Migrating an existing MongoDB-backed deployment

A migration tooling chain ships with Ilum (script set M001 through M009) that reads metadata from MongoDB and writes it to PostgreSQL in the new schema. The migration is run once during the upgrade window:

Stop incoming traffic to ilum-core (drain or scale to zero).
Verify a PostgreSQL deployment is reachable from the ilum-core namespace and the ilum database exists.
Run the migration job through the umbrella chart's migration runner. Each step (M001 through M009) runs sequentially; failures roll back to the prior checkpoint.
Update the ilum-core configuration to point its primary store at PostgreSQL.
Restart ilum-core. Verify clusters, jobs, schedules, and saved queries are present in the UI.
After a soak period, scale MongoDB down or disable it via mongodb.enabled: false.
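The cluster-side parts of these steps can be sketched as follows. This is an assumption-heavy outline rather than the shipped procedure: the namespace, the workload name and kind (a Deployment named ilum-core), the PostgreSQL service name, the postgres:16 image, and the run helper are all hypothetical. The migration runner invocation and the configuration keys for switching the primary store are deliberately left out because they are not described on this page.

```shell
#!/bin/sh
set -eu

# All names below are assumptions for illustration; adjust to your cluster.
NAMESPACE="ilum"            # namespace of the Ilum release
DEPLOYMENT="ilum-core"      # assumed to be a Deployment
PG_HOST="ilum-postgresql"   # PostgreSQL service name
PG_PORT="5432"

# Print each command by default; set RUN=1 to execute it.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Step 1: stop incoming traffic by scaling ilum-core to zero.
run kubectl scale deployment "$DEPLOYMENT" --replicas=0 -n "$NAMESPACE"

# Step 2: check that PostgreSQL accepts connections from inside the namespace.
run kubectl run pg-check --rm -i --restart=Never --image=postgres:16 \
  -n "$NAMESPACE" -- pg_isready -h "$PG_HOST" -p "$PG_PORT"

# Steps 3 and 4: run the M001 through M009 migration job and point the
# primary store at PostgreSQL. Take the exact invocation and keys from the
# chart documentation for your release.

# Step 5: bring ilum-core back and wait for the rollout to finish.
run kubectl scale deployment "$DEPLOYMENT" --replicas=1 -n "$NAMESPACE"
run kubectl rollout status deployment "$DEPLOYMENT" -n "$NAMESPACE"

# Step 6 is intentionally not scripted: disable MongoDB only after the soak period.
```

Keeping the script in print mode first makes it easy to review each command against the actual workload names before anything is scaled down.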

For deployment-specific migration assistance, contact [email protected].

Both backends in parallel

ilum-core supports running with either backend during the transition. The mongo.uri and PostgreSQL connection settings remain configurable independently, allowing operators to validate the PostgreSQL backend on a non-production cluster before promoting it.

Konvert library (pilot)

Ilum includes a pilot integration with Konvert, a data-conversion library. Konvert is intended to streamline conversion of data and code between source formats and Ilum-native targets during migration projects (for example, transforming legacy ETL definitions into Ilum job specifications).

The pilot is opt-in and not yet covered by a stable public API. Teams interested in evaluating Konvert for a migration project should contact [email protected] for the current scope and an enablement walkthrough.

For large estate migrations from Hadoop or Cloudera CDP, the recommended starting point remains the Bifrost migration toolkit, which covers discovery, phased execution, data validation, and rollback.
