Deploying JupyterHub

Official documentation: https://jupyterhub.readthedocs.io/en/latest/

What is JupyterHub?

JupyterHub is the best way to serve Jupyter notebook for multiple users. Because JupyterHub manages a separate Jupyter environment for each user, it can be used in a class of students, a corporate data science group, or a scientific research group. It is a multi-user Hub that spawns, manages, and proxies multiple instances of the single-user Jupyter notebook server.

JupyterHub is made up of four subsystems:

  • Hub (tornado process) that is the heart of JupyterHub
  • configurable http proxy (node-http-proxy) that receives the requests from the client’s browser
  • multiple single-user Jupyter notebook servers (Python/IPython/tornado) that are monitored by Spawners
  • an authentication class that manages how users can access the system

JupyterHub performs the following functions:

  • The Hub launches a proxy
  • The proxy forwards all requests to the Hub by default
  • The Hub handles user login and spawns single-user servers on demand
  • The Hub configures the proxy to forward URL prefixes to the single-user notebook servers
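
The routing behaviour in the list above can be pictured as a longest-prefix lookup. The sketch below is a toy model in plain Python, not JupyterHub or configurable-http-proxy code; the class name `ToyProxy`, the `/user/alice/` prefix, and the target addresses are all made up for illustration.

```python
class ToyProxy:
    """Toy longest-prefix router that mimics the idea of the proxy's routing table."""

    def __init__(self, hub_target):
        # Default route: every request goes to the Hub.
        self.routes = {"/": hub_target}

    def add_route(self, prefix, target):
        # The Hub registers a URL prefix once a single-user server is running.
        self.routes[prefix] = target

    def resolve(self, path):
        # Pick the longest registered prefix that matches the request path.
        matches = [p for p in self.routes if path.startswith(p)]
        return self.routes[max(matches, key=len)]


proxy = ToyProxy("http://hub:8081")
print(proxy.resolve("/user/alice/tree"))   # still handled by the Hub
proxy.add_route("/user/alice/", "http://jupyter-alice:8888")
print(proxy.resolve("/user/alice/tree"))   # now forwarded to alice's server
```

Requests for any other path, including other users who have no running server yet, keep falling through to the default Hub route.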

Installing JupyterHub

JupyterHub Docker image repository: https://hub.docker.com/r/jupyterhub/jupyterhub

Things to consider before deploying:

  • Deployment system (bare metal, Docker)
  • Authentication (PAM, OAuth, etc.)
  • Spawner of single-user notebook servers (Docker, Batch, etc.)
  • Services (nbgrader, etc.)
  • JupyterHub database (default SQLite; or a traditional RDBMS such as PostgreSQL, MySQL, or other databases supported by SQLAlchemy)


Directories and local files

It is recommended to put all of the files used by JupyterHub into standard UNIX filesystem locations.

  • /srv/jupyterhub for all security and runtime files
  • /etc/jupyterhub for all configuration files
  • /var/log for log files
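
As a rough illustration of that layout, the sketch below builds the values one might assign in jupyterhub_config.py to keep runtime files under /srv/jupyterhub. `cookie_secret_file` and `db_url` are JupyterHub options; the helper name `hub_file_settings` and the two file names are assumptions made for this example.

```python
import posixpath


def hub_file_settings(runtime_dir="/srv/jupyterhub"):
    """Return JupyterHub settings that keep runtime files under runtime_dir."""
    return {
        # File holding the secret used to sign login cookies.
        "cookie_secret_file": posixpath.join(runtime_dir, "jupyterhub_cookie_secret"),
        # SQLAlchemy URL; an absolute SQLite path ends up with four slashes.
        "db_url": "sqlite:///" + posixpath.join(runtime_dir, "jupyterhub.sqlite"),
    }


settings = hub_file_settings()
for name, value in settings.items():
    print(f"c.JupyterHub.{name} = {value!r}")
```

In a real jupyterhub_config.py the same values would simply be assigned to `c.JupyterHub.cookie_secret_file` and `c.JupyterHub.db_url`.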


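Deployment

I seriously doubt that the /etc/jupyterhub recommendation applies to the Docker image: jupyterhub_config.py has no effect when placed in that directory, but works when placed in /srv/jupyterhub. A likely explanation is that JupyterHub loads jupyterhub_config.py from its current working directory unless a path is passed with -f, and the image appears to start the Hub from /srv/jupyterhub.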

# Create the Docker network for JupyterHub
docker network create --driver bridge jupyterhub_network
# Create the host directory that will be mounted as a volume
mkdir -pv /data/jupyterhub
chown -R root /data/jupyterhub
chmod -R 777 /data/jupyterhub
# Start JupyterHub with the default configuration
docker run -d --name jupyterhub -p8000:8000 --network jupyterhub_network -v /var/run/docker.sock:/var/run/docker.sock -v /data/jupyterhub:/srv/jupyterhub jupyterhub/jupyterhub:latest
# Open a shell in the container and install the authenticator (a Python package, so pip rather than npm)
docker exec -it jupyterhub bash
pip install --no-cache-dir oauthenticator

Configuring authentication: GitLabOAuthenticator

JupyterHub OAuthenticator documentation:

https://oauthenticator.readthedocs.io/en/latest/tutorials/install.html

Run vi /data/jupyterhub/jupyterhub_config.py and add the following:

import os
from oauthenticator.gitlab import GitLabOAuthenticator

c.JupyterHub.authenticator_class = GitLabOAuthenticator
os.environ['OAUTH_CALLBACK_URL'] = 'http://ip:8000/hub/oauth_callback'
os.environ['GITLAB_CLIENT_ID'] = '***'
os.environ['GITLAB_CLIENT_SECRET'] = '***'
os.environ['GITLAB_URL']='https://xxx.github.cn'
os.environ['GITLAB_HOST']='https://xxx.github.cn'

# The config section name must match the class name exactly (GitLab, not Gitlab)
c.GitLabOAuthenticator.client_id = os.environ['GITLAB_CLIENT_ID']
c.GitLabOAuthenticator.client_secret = os.environ['GITLAB_CLIENT_SECRET']
c.GitLabOAuthenticator.gitlab_url = os.environ['GITLAB_URL']
c.GitLabOAuthenticator.oauth_callback_url = os.environ['OAUTH_CALLBACK_URL']
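
Rather than writing the secrets into os.environ inside the config file, one option is to pass them into the container (for example with docker run -e) and have the config only read them. The sketch below checks which of the variables used above are missing; the helper name `missing_env` and the sample values are hypothetical.

```python
import os

REQUIRED_VARS = [
    "OAUTH_CALLBACK_URL",
    "GITLAB_CLIENT_ID",
    "GITLAB_CLIENT_SECRET",
    "GITLAB_URL",
]


def missing_env(names, environ=None):
    """Return the names that are unset or empty in environ (default: os.environ)."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


# Example with a fake environment so the sketch runs anywhere.
fake_env = {"GITLAB_CLIENT_ID": "abc", "GITLAB_URL": "https://gitlab.example.com"}
print(missing_env(REQUIRED_VARS, fake_env))
```

Calling `missing_env(REQUIRED_VARS)` at the top of jupyterhub_config.py and raising an error when the list is non-empty would make a forgotten variable fail at startup instead of at first login.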

# View the jupyterhub container logs
docker logs jupyterhub

[I 2023-02-03 08:03:01.030 JupyterHub roles:477] Adding role server to token: <APIToken('a6e1...', user='xxx', client_id='jupyterhub')>
[I 2023-02-03 08:03:01.035 JupyterHub provider:607] Creating oauth client jupyterhub-user-xxx
[E 2023-02-03 08:03:01.046 JupyterHub user:762] Unhandled error starting chenyuzhe's server: "getpwnam(): name not found: 'xxx'"
[E 2023-02-03 08:03:01.059 JupyterHub pages:311] Error starting server xxx: "getpwnam(): name not found: 'xxx'"
Traceback (most recent call last):
None: None

[W 2023-02-03 08:03:01.059 JupyterHub web:1787] 500 GET /hub/spawn/xxx (): Unhandled error starting server xxx
...
[E 2023-02-03 08:03:01.061 JupyterHub log:189] 500 GET /hub/spawn/xxx (xxx@) 36.46m

Well, it reports an error, but the authorization itself has succeeded. The next step is to configure the Spawner.
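
The `getpwnam(): name not found` message in the log is what Python's standard `pwd` module raises for an unknown system user. Assuming the Hub was still using its default local-process spawner at this point (no spawner had been configured yet), the GitLab user simply has no account inside the container. A minimal reproduction, with a made-up user name:

```python
import pwd


def system_user_exists(username):
    """Return True if username has an entry in the local user database."""
    try:
        pwd.getpwnam(username)
        return True
    except KeyError as err:
        # The KeyError text is the same kind of message shown in the Hub log.
        print(f"lookup failed: {err}")
        return False


# 'no-such-user-xxx' is a made-up name that should not exist on a normal system.
print(system_user_exists("no-such-user-xxx"))
```

Switching to a container-based spawner sidesteps this lookup, since the user's server no longer runs as a local system account of the Hub container.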

Configuring the Spawner: DockerSpawner

See the official DockerSpawner configuration docs: https://jupyterhub-dockerspawner.readthedocs.io/en/latest/spawner-types.html#dockerspawner

Run vi /data/jupyterhub/jupyterhub_config.py and add the following:

c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'

Enter the jupyterhub container, run pip install dockerspawner, then restart the container.

Check the error log with docker logs jupyterhub:

[I 2023-02-03 08:27:26.224 JupyterHub roles:477] Adding role server to token: <APIToken('a69c...', user='xxx', client_id='jupyterhub')>
[I 2023-02-03 08:27:26.232 JupyterHub provider:607] Creating oauth client jupyterhub-user-xxx
[I 2023-02-03 08:27:26.256 JupyterHub dockerspawner:1218] pulling image jupyterhub/singleuser:2.0
[I 2023-02-03 08:27:27.218 JupyterHub log:189] 302 GET /hub/spawn/xxx -> /hub/spawn-pending/xxx (chenyuzhe@::ffff:10.100.228.17) 1019.64ms
[I 2023-02-03 08:27:27.241 JupyterHub pages:400] xxx is pending spawn
[I 2023-02-03 08:27:27.264 JupyterHub log:189] 200 GET /hub/spawn-pending/xxx (xxx@::ffff:10.100.228.17) 25.94ms
[W 2023-02-03 08:27:36.218 JupyterHub base:1044] User xxx is slow to start (timeout=10)
[W 2023-02-03 08:28:26.246 JupyterHub user:754] xxx's server failed to start in 60 seconds, giving up.

Common causes of this timeout, and debugging tips:

1. Everything is working, but it took too long.
To fix: increase `Spawner.start_timeout` configuration
to a number of seconds that is enough for spawners to finish starting.
2. The server didn't finish starting,
or it crashed due to a configuration issue.
Check the single-user server's logs for hints at what needs fixing.

[I 2023-02-03 08:29:21.222 JupyterHub dockerspawner:988] Container 'jupyter-xxx' is gone
[W 2023-02-03 08:29:21.222 JupyterHub dockerspawner:963] Container not found: jupyter-xxx
[E 2023-02-03 08:29:21.249 JupyterHub gen:623] Exception in Future <Task finished name='Task-10' coro=<BaseHandler.spawn_single_user.<locals>.finish_user_spawn() done, defined at /usr/local/lib/python3.8/dist-packages/jupyterhub/handlers/base.py:935> exception=TimeoutError('Timeout')> after timeout
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 618, in error_callback
future.result()
File "/usr/local/lib/python3.8/dist-packages/jupyterhub/handlers/base.py", line 942, in finish_user_spawn
await spawn_future
File "/usr/local/lib/python3.8/dist-packages/jupyterhub/user.py", line 780, in spawn
raise e
File "/usr/local/lib/python3.8/dist-packages/jupyterhub/user.py", line 679, in spawn
url = await gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
tornado.util.TimeoutError: Timeout

[I 2023-02-03 08:29:21.250 JupyterHub dockerspawner:988] Container 'jupyter-xxx' is gone
[I 2023-02-03 08:29:21.254 JupyterHub log:189] 200 GET /hub/api/users/xxx/server/progress (xxx@::ffff:10.100.228.17) 113851.95ms
[I 2023-02-03 08:29:22.759 JupyterHub dockerspawner:1272] Created container jupyter-xxx (id: 69653f3) from image jupyterhub/singleuser:2.0
[I 2023-02-03 08:29:22.759 JupyterHub dockerspawner:1296] Starting container jupyter-xxx (id: 69653f3)
[root@chenyuzhe jupyterhub]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
69653f3de27d jupyterhub/singleuser:2.0 "tini -g -- start-no…" About a minute ago Up About a minute 127.0.0.1:49153->8888/tcp jupyter-xxx

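A few key findings:

  • dockerspawner pulled the jupyterhub/singleuser:2.0 image on its own
  • it created the jupyter-xxx container from jupyterhub/singleuser:2.0, but the spawn failed and the server never came up
  • the message tells us to check the jupyter-xxx logs, so let's take a look 🤔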

[root@xxx jupyterhub]# docker logs jupyter-xxx
WARNING: Jupyter Notebook deprecation notice https://github.com/jupyter/docker-stacks#jupyter-notebook-deprecation-notice.
Entered start.sh with args: jupyter notebook
Executing the command: jupyter notebook
[I 08:29:23.629 NotebookApp] Writing notebook server cookie secret to /home/jovyan/.local/share/jupyter/runtime/notebook_cookie_secret
[W 2023-02-03 08:29:24.404 LabApp] 'ip' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-02-03 08:29:24.404 LabApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-02-03 08:29:24.404 LabApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-02-03 08:29:24.405 LabApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[I 2023-02-03 08:29:24.414 LabApp] JupyterLab extension loaded from /opt/conda/lib/python3.9/site-packages/jupyterlab
[I 2023-02-03 08:29:24.414 LabApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 08:29:24.420 NotebookApp] Serving notebooks from local directory: /home/jovyan
[I 08:29:24.420 NotebookApp] Jupyter Notebook 6.4.6 is running at:
[I 08:29:24.420 NotebookApp] http://69653f3de27d:8888/?token=f37ef323da43282d3de862f85c480af68cc213ef496aec48
[I 08:29:24.420 NotebookApp] or http://127.0.0.1:8888/?token=f37ef323da43282d3de862f85c480af68cc213ef496aec48
[I 08:29:24.420 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 08:29:24.425 NotebookApp]

To access the notebook, open this file in a browser:
file:///home/jovyan/.local/share/jupyter/runtime/nbserver-7-open.html
Or copy and paste one of these URLs:
http://69653f3de27d:8888/?token=f37ef323da43282d3de862f85c480af68cc213ef496aec48
or http://127.0.0.1:8888/?token=f37ef323da43282d3de862f85c480af68cc213ef496aec48

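Hmm, nothing looks wrong; there are no error messages. Starting the container manually with docker start jupyter-xxx works, surprisingly. Back on the page, I refreshed and clicked the blue button, and it still timed out.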
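
One way to narrow down a timeout like this is to test whether the Hub can open a TCP connection to the user container at all. The sketch below is a generic reachability check using only the standard library; the function name `can_connect` is hypothetical, and the demo talks to a throwaway local listener rather than to a real Jupyter server.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Demo against a throwaway local listener so the sketch runs anywhere.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
listener.listen(1)
open_port = listener.getsockname()[1]
reachable_while_open = can_connect("127.0.0.1", open_port)
listener.close()
reachable_after_close = can_connect("127.0.0.1", open_port)
print(reachable_while_open, reachable_after_close)
```

Run from inside the jupyterhub container, a call such as `can_connect("jupyter-xxx", 8888)` would show whether the container name resolves and the port is reachable. The docker ps output above shows the port published only on the host's 127.0.0.1, which a Hub running in its own container would generally not be able to reach.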

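OK, I think I roughly understand: the timeout is probably because jupyterhub and jupyter-xxx cannot reach each other over the network. Let's try putting them on the same network. After some searching, here is the reference: https://github.com/jupyterhub/dockerspawner/blob/main/examples/simple/jupyterhub_config.py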

Run vi /data/jupyterhub/jupyterhub_config.py and add the following:

# we need the hub to listen on all ips when it is in a container
c.JupyterHub.hub_ip = '0.0.0.0'
# the hostname/ip that should be used to connect to the hub
# this is usually the hub container's name
c.JupyterHub.hub_connect_ip = 'jupyterhub'

c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'

network_name = 'jupyterhub_network'
c.DockerSpawner.network_name = network_name

After restarting jupyterhub and refreshing the page, I got a 500 again... Did I guess wrong? Let's look at the jupyterhub logs once more:

# Key log lines
[I 2023-02-03 08:53:09.679 JupyterHub dockerspawner:1296] Starting container jupyter-xxx (id: 69653f3)
[E 2023-02-03 08:53:09.895 JupyterHub user:762] Unhandled error starting xxx's server: Unknown docker network 'jupyterhub_network'. Did you create it with `docker network create <name>`?
[I 2023-02-03 08:53:09.897 JupyterHub dockerspawner:1390] Stopping container jupyter-xxx (id: 69653f3)
[E 2023-02-03 08:53:10.000 JupyterHub pages:311] Error starting server xxx: Unknown docker network 'jupyterhub_network'. Did you create it with `docker network create <name>`?
Traceback (most recent call last):
None: None

[W 2023-02-03 08:53:10.001 JupyterHub web:1787] 500 GET /hub/spawn/xxx (::ffff:10.100.228.17): Unhandled error starting server xxx

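It says I did not create the docker network jupyterhub_network. But I clearly did; run docker network ls if you don't believe me.

Hmm... another guess: the existing jupyter-xxx container must have been created with the old configuration, so it was never attached to jupyterhub_network. I deleted the existing jupyter-xxx container, went back to the page, refreshed, clicked start, and it worked. Confetti ✿✿ヽ(°▽°)ノ✿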
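
A stale container like this can probably be avoided by letting DockerSpawner delete user containers when their servers stop, which is what its `remove` option is for. The sketch below collects the spawner settings used in this post plus that option as a plain dict; the helper names `docker_spawner_settings` and `apply_to_config` are hypothetical, and `c` stands for the config object that JupyterHub provides inside jupyterhub_config.py.

```python
def docker_spawner_settings(network_name="jupyterhub_network"):
    """Collect DockerSpawner settings as a plain dict."""
    return {
        # Attach user containers to the same Docker network as the Hub.
        "network_name": network_name,
        # Delete the user container when its server stops, so the next
        # spawn creates a fresh container from the current configuration.
        "remove": True,
    }


def apply_to_config(c, settings):
    """Assign each setting on c.DockerSpawner (c is JupyterHub's config object)."""
    for name, value in settings.items():
        setattr(c.DockerSpawner, name, value)


spawner_settings = docker_spawner_settings()
print(spawner_settings)
```

In the config file this amounts to `c.DockerSpawner.remove = True`. Note that removing containers also discards anything a user saved inside them unless a volume is mounted for the notebook directory.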

For the impatient: interim summary
  1. Run docker network create --driver bridge jupyterhub_network to create the network used by DockerSpawner
  2. Create the volume directory

mkdir -pv /data/jupyterhub
chown -R root /data/jupyterhub
chmod -R 777 /data/jupyterhub

  3. Create jupyterhub_config.py in /data/jupyterhub

# we need the hub to listen on all ips when it is in a container
c.JupyterHub.hub_ip = '0.0.0.0'
# the hostname/ip that should be used to connect to the hub
# this is usually the hub container's name
c.JupyterHub.hub_connect_ip = 'jupyterhub'

c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'

network_name = 'jupyterhub_network'
c.DockerSpawner.network_name = network_name

import os
from oauthenticator.gitlab import GitLabOAuthenticator

c.JupyterHub.authenticator_class = GitLabOAuthenticator
os.environ['OAUTH_CALLBACK_URL'] = 'http://IP:8000/hub/oauth_callback'
os.environ['GITLAB_CLIENT_ID'] = '***'
os.environ['GITLAB_CLIENT_SECRET'] = '***'
os.environ['GITLAB_URL']='https://xxx.github.cn'
os.environ['GITLAB_HOST']='https://xxx.github.cn'

# The config section name must match the class name exactly (GitLab, not Gitlab)
c.GitLabOAuthenticator.client_id = os.environ['GITLAB_CLIENT_ID']
c.GitLabOAuthenticator.client_secret = os.environ['GITLAB_CLIENT_SECRET']
c.GitLabOAuthenticator.gitlab_url = os.environ['GITLAB_URL']
c.GitLabOAuthenticator.oauth_callback_url = os.environ['OAUTH_CALLBACK_URL']

  4. Build your own JupyterHub image

Create a Dockerfile at /opt/jupyterhub/dockerfile:

FROM jupyterhub/jupyterhub:latest

RUN pip install oauthenticator
RUN pip install dockerspawner

Then build the image from that directory:

docker build -t custom/jupyterhub-oauth .

  5. Start JupyterHub

docker run -d --name jupyterhub -p8000:8000 --network jupyterhub_network -v /var/run/docker.sock:/var/run/docker.sock -v /data/jupyterhub:/srv/jupyterhub custom/jupyterhub-oauth:latest

That completes the JupyterHub + GitLabOAuthenticator + DockerSpawner setup.
