
Notebooks in Ilum

Overview

Ilum supports two enterprise-ready notebook environments: Jupyter (JupyterLab/JupyterHub) and Zeppelin.

Both environments enable users to create interactive, executable documents that combine code, results, rich text, and dynamic visualizations—making them essential tools for data science, analytics, and engineering workflows. Ilum ensures these environments are tightly integrated with cluster resources, Spark, data storage, and version control.


Supported Notebook Environments

JupyterLab

  • JupyterLab is a modern, flexible web-based IDE for notebooks and data applications.
  • It runs in single-user mode: perfect for experimentation, prototyping, and personal data projects.
  • In Ilum, JupyterLab is provided as the core user interface within each JupyterHub user's workspace.

JupyterHub

  • JupyterHub is the enterprise, multi-user orchestrator for JupyterLab environments.
  • It manages authentication (LDAP/SSO), user isolation, spawning, and centralized resource management on Kubernetes.
  • Each authenticated user receives a private, persistent JupyterLab workspace with built-in Spark and Git integration.
  • JupyterHub is optional in Ilum and can be enabled via Helm.

Zeppelin

  • Zeppelin is a multi-language notebook environment that emphasizes Spark analytics, visualizations, and dashboards.
  • It supports a wide array of interpreters and provides flexible visualization out of the box.
  • Zeppelin is optional in Ilum and can be enabled via Helm.

Key Differences and Typical Use Cases

Feature / Aspect | JupyterLab (Standalone) | JupyterHub (Multi-user) | Zeppelin
User Model | Single user | Multi-user (centralized) | Single user
Authentication | None / local only | LDAP / SSO via Ilum | None / local only
Resource Management | Local server | Centralized via Kubernetes | Local server
Workspace | Local user environment | Per-user isolated workspace | Local user environment
Spark Integration | Sparkmagic plugin | Sparkmagic plugin | Livy Interpreter
Version Control | Optional | Built-in with Gitea (per user repo) | Optional / not integrated
Collaboration | Git (share via repo), export | Git (share via repo), export | Share notebooks, export
Language Support | Python, R, Bash, Scala, SQL | Python, R, Bash, Scala, SQL | Python, Scala, SQL, Bash, others (interpreters)
Visualization | Jupyter widgets, matplotlib, etc. | Jupyter widgets, matplotlib, etc. | Built-in visualizations, dashboards
Recommended for | Prototyping, local analysis | Team workflows, reproducible research, secure enterprise analytics | Prototyping, local analysis

Environment Selection Guide

Use Case | JupyterLab | JupyterHub | Zeppelin
Personal prototyping/experiments
Multi-user, secure, enterprise deployments
Centralized resource & user management
Integrated Git version control: ✓ (per user)
Ad-hoc exploration and dashboards
Advanced Python/R data science workflows
Spark jobs from notebooks (via Livy)
Collaboration via Git: (manual sharing)

How Notebook Environments Work in Ilum

  • JupyterHub provides a central portal and login for users. After LDAP/SSO authentication, each user gets a personal JupyterLab environment on the cluster, with isolated storage and a pre-configured Spark integration. All code, notebooks, and output are private by default, but can be shared via Git (Gitea).
  • JupyterLab is the UI each user interacts with—write code, run cells, visualize data, and manage files, all from the browser.
  • Zeppelin can be enabled as an alternative, supporting multi-language analytics and fast, interactive dashboards. Zeppelin leverages Livy interpreters for Spark, and supports SQL and many other interpreters.

For a technical breakdown of the architecture and flow, see:


Ilum Integration with Notebooks via the Ilum Livy Proxy

To communicate with Spark, notebooks require specific plugins.

In Jupyter, this is achieved through magic commands: special syntax expressions such as %%magic or %magic that change the behavior of a code cell. For example, %%spark enables Spark magic, which lets the cell run Spark code through the Ilum Code Service.

Zeppelin, by contrast, has a different architecture. It uses interpreters to process the code in each block, with each interpreter designed for a specific language or service. For Spark, Zeppelin uses a Spark interpreter.
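The cell-magic mechanism described above for Jupyter can be illustrated with a toy dispatcher. This is a minimal sketch in plain Python, not the actual Sparkmagic or IPython implementation: `CELL_MAGICS`, `cell_magic`, `split_cell`, and `run_cell` are hypothetical names introduced here, and the "spark" handler only reports what it would send instead of contacting a cluster.

```python
from typing import Callable, Dict, Tuple

# Registry mapping a magic name (e.g. "spark") to the function that handles the cell body.
CELL_MAGICS: Dict[str, Callable[[str], str]] = {}


def cell_magic(name: str):
    """Register a handler for cells whose first line is %%<name>."""
    def decorator(func: Callable[[str], str]) -> Callable[[str], str]:
        CELL_MAGICS[name] = func
        return func
    return decorator


@cell_magic("spark")
def run_on_spark(body: str) -> str:
    # A real plugin would forward the body to a remote Spark session;
    # this toy handler only reports what it would send.
    return f"[remote spark] {body.strip()}"


def split_cell(cell: str) -> Tuple[str, str]:
    """Return (magic_name, body); magic_name is "" for a plain cell."""
    first_line, _, rest = cell.partition("\n")
    parts = first_line[2:].split()
    if first_line.startswith("%%") and parts:
        # Anything after the magic name on the first line is treated as options and ignored here.
        return parts[0], rest
    return "", cell


def run_cell(cell: str) -> str:
    """Dispatch a cell to its magic handler, or treat it as ordinary local code."""
    name, body = split_cell(cell)
    if not name:
        return f"[local python] {body.strip()}"
    if name not in CELL_MAGICS:
        raise ValueError(f"unknown cell magic: %%{name}")
    return CELL_MAGICS[name](body)
```

Calling `run_cell("%%spark\ndf.count()")` routes the body to the registered "spark" handler, while a cell without a `%%` prefix falls through to the local path. The point of the sketch is only that the first line of a cell selects where the rest of the cell is executed.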


But how does Ilum connect Jupyter's Spark magic and Zeppelin's interpreters to manage jobs and organize them into meaningful groups? It does so through a Livy server with a proxy layered over it.

Many services, including Jupyter with its Spark magic and Zeppelin with its Livy interpreter, use Livy to communicate with Spark. Livy is a server that provides a REST API for interacting with Spark.

Ilum provides its own implementation of the Livy API, called ilum-livy-proxy, which ties Spark sessions to Ilum Services. For example, if you create a Livy session in Jupyter, you will see a corresponding code service in your Ilum workloads.
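To make the REST interaction concrete, here is a minimal sketch of the two Livy API calls a notebook plugin typically issues: creating an interactive session and submitting a statement to it. The requests are only built, not sent. The base URL is an assumption (a hypothetical in-cluster address using Livy's default port 8998), and the helper names are introduced here for illustration; check your deployment for the actual ilum-livy-proxy address.

```python
import json
from urllib.request import Request

# Assumption: hypothetical address of the Livy-compatible endpoint in your cluster.
LIVY_URL = "http://ilum-livy-proxy:8998"


def json_post(url: str, payload: dict) -> Request:
    """Build (but do not send) a JSON POST request."""
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_session_request(kind: str = "pyspark") -> Request:
    # Livy REST API: POST /sessions starts a new interactive session.
    return json_post(f"{LIVY_URL}/sessions", {"kind": kind})


def run_statement_request(session_id: int, code: str) -> Request:
    # Livy REST API: POST /sessions/{id}/statements submits code to that session.
    return json_post(
        f"{LIVY_URL}/sessions/{session_id}/statements",
        {"code": code},
    )
```

Passing either request to `urllib.request.urlopen` would send it. In the Livy API the session response carries an `id` field, which is the value to pass as `session_id` for later statements. With Ilum, creating such a session is what should surface as a code service in the workloads view, as described above.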


For detailed Spark workflows and notebook-specific Spark usage, see Guide for Jupyter Notebooks.


Deployment Overview

  • JupyterLab:
    • JupyterLab is enabled and preconfigured by default.
    • Access from Modules > JupyterLab in the Ilum UI.
    • Runs as a single instance.
    • Version control (Gitea) and Spark integration are ready to use.
  • JupyterHub:
    • JupyterHub (multi-user) is not enabled by default (use this guide to deploy).
    • Access from Modules > JupyterHub in the Ilum UI.
    • Each user receives a private JupyterLab instance.
    • Version control (Gitea) and Spark integration are ready to use.
  • Zeppelin:
    • Zeppelin is not enabled by default (use this guide to deploy).
    • Access from Modules > Zeppelin in the Ilum UI.
    • Pre-integrated with Livy Proxy and Spark.

Notebook Features in Ilum

All supported notebook environments provide:

  • Executable, incremental cells (Python, Scala, SQL, Bash, etc.)
  • Data visualization (charts, tables, widgets, dashboards)
  • Integration with Spark clusters
  • Access to Ilum storages and services
  • Documentation via Markdown/HTML
  • Data lineage and session management
  • Collaboration and sharing options

Next Steps