containerd Deep Dive: Following the Image Pull Process Through Hash IDs

xnh888 2024-10-03 04:44:30


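Pulling an image with the ctr command line makes the layered download process easy to follow. In the nginx example below, containerd first fetches the image index, then the manifest for the target platform, and finally downloads the config and the individual layers. For background on the image format, see the previous article on the OCI specification.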

# ctr i pull docker.io/library/nginx:latest
docker.io/library/nginx:latest:                                                   resolved       |++++++++++++++++++++++++++++++++++++++|
index-sha256:8f335768880da6baf72b70c701002b45f4932acae8d574dedfddaf967fc3ac90:    done           |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:3f13b4376446cf92b0cb9a5c46ba75d57c41f627c4edb8b635fa47386ea29e20: done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:1f41b2f2bf94740d411c54b48be7f5e9dfbe14f29d1a5cf64f39150d75f39740:    done           |++++++++++++++++++++++++++++++++++++++|
config-sha256:08b152afcfae220e9709f00767054b824361c742ea03a9fe936271ba520a0a4b:   done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:33847f680f63fb1b343a9fc782e267b5abdbdb50d65d4b9bd2a136291d67cf75:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:dbb907d5159dcb993c532a46d2edaff7a72670d093d518e38e6aaf8115103f73:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:8a268f30c42a7c778c9c9497d043dfac6143281918cb9337f20335d4f11e1937:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:b10cf527a02df3ba9f85346ee04f59f920d9ec341a7ca688339c8ae1f8ea978c:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:c90b090c213b9e42d7982715b827803437fcbf4337f4672fead618c60cd36b84:    done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 14.2s                                                                    total:  51.3 M (3.6 MiB/s)
unpacking linux/amd64 sha256:8f335768880da6baf72b70c701002b45f4932acae8d574dedfddaf967fc3ac90...
done

Once everything has been downloaded, ctr content ls shows the details of each blob in the local content store.

# ctr  content ls
DIGEST									SIZE	AGE		LABELS
sha256:08b152afcfae220e9709f00767054b824361c742ea03a9fe936271ba520a0a4b	7.73kB	20 seconds	containerd.io/gc.ref.snapshot.overlayfs=sha256:97386f823dd75e356afac10af0def601f2cd86908e3f163fb59780a057198e1b,containerd.io/distribution.source.docker.io=library/nginx
sha256:1f41b2f2bf94740d411c54b48be7f5e9dfbe14f29d1a5cf64f39150d75f39740	1.394kB	20 seconds	containerd.io/uncompressed=sha256:e3135447ca3e69c6975aee1621c406e3865e0e143c807bbdcf05abefa56054a2,containerd.io/distribution.source.docker.io=library/nginx
sha256:33847f680f63fb1b343a9fc782e267b5abdbdb50d65d4b9bd2a136291d67cf75	27.15MB	15 seconds	containerd.io/uncompressed=sha256:814bff7343242acfd20a2c841e041dd57c50f0cf844d4abd2329f78b992197f4,containerd.io/distribution.source.docker.io=library/nginx
sha256:3f13b4376446cf92b0cb9a5c46ba75d57c41f627c4edb8b635fa47386ea29e20	1.57kB	22 seconds	containerd.io/gc.ref.content.config=sha256:08b152afcfae220e9709f00767054b824361c742ea03a9fe936271ba520a0a4b,containerd.io/distribution.source.docker.io=library/nginx,containerd.io/gc.ref.content.l.5=sha256:1f41b2f2bf94740d411c54b48be7f5e9dfbe14f29d1a5cf64f39150d75f39740,containerd.io/gc.ref.content.l.4=sha256:c90b090c213b9e42d7982715b827803437fcbf4337f4672fead618c60cd36b84,containerd.io/gc.ref.content.l.3=sha256:b10cf527a02df3ba9f85346ee04f59f920d9ec341a7ca688339c8ae1f8ea978c,containerd.io/gc.ref.content.l.2=sha256:8a268f30c42a7c778c9c9497d043dfac6143281918cb9337f20335d4f11e1937,containerd.io/gc.ref.content.l.1=sha256:dbb907d5159dcb993c532a46d2edaff7a72670d093d518e38e6aaf8115103f73,containerd.io/gc.ref.content.l.0=sha256:33847f680f63fb1b343a9fc782e267b5abdbdb50d65d4b9bd2a136291d67cf75
sha256:8a268f30c42a7c778c9c9497d043dfac6143281918cb9337f20335d4f11e1937	601B	20 seconds	containerd.io/uncompressed=sha256:59b01b87c9e7f668b740d23eb872c5964636c33aef795f1186f08b172197bc35,containerd.io/distribution.source.docker.io=library/nginx
sha256:8f335768880da6baf72b70c701002b45f4932acae8d574dedfddaf967fc3ac90	1.862kB	22 seconds	containerd.io/gc.ref.content.m.6=sha256:145b55314cbde37bcc6f097452a898d81f19d5813e1e97d0b0de2de1d0b65b77,containerd.io/distribution.source.docker.io=library/nginx,containerd.io/gc.ref.content.m.0=sha256:3f13b4376446cf92b0cb9a5c46ba75d57c41f627c4edb8b635fa47386ea29e20,containerd.io/gc.ref.content.m.5=sha256:00d01a24e8ad476bf2adf5a067a51c09e580aa343fffc4f950b5671c198954a3,containerd.io/gc.ref.content.m.3=sha256:7ef3ca6ca846a10787f98fd2722d6e4054a17b37981a3ca273207a792731aebe,containerd.io/gc.ref.content.m.2=sha256:2c2c20f9dfe8d183e8f19f74f58bad3531d8d04b1bfe944c732a4c4136cf3939,containerd.io/gc.ref.content.m.7=sha256:24ee0098aff2d5eba66dbbb56e9e94c2d4745325ba2135facf0e44a4c4c14622,containerd.io/gc.ref.content.m.4=sha256:01b94f94c143b4cb1cbc2e359ac8c0111ac17360c556c34dfa8424c518b6148a,containerd.io/gc.ref.content.m.1=sha256:5e785e4dfee8e95cafa5d9f1aa164bcdb2c55d3530c8c8798ab20869ea056ea6
sha256:b10cf527a02df3ba9f85346ee04f59f920d9ec341a7ca688339c8ae1f8ea978c	893B	20 seconds	containerd.io/distribution.source.docker.io=library/nginx,containerd.io/uncompressed=sha256:988d9a3509bbb7ea8037d4eba3a5e0ada5dc165144c8ff0df89c0048d1ac6132
sha256:c90b090c213b9e42d7982715b827803437fcbf4337f4672fead618c60cd36b84	665B	20 seconds	containerd.io/distribution.source.docker.io=library/nginx,containerd.io/uncompressed=sha256:b857347059916922b353147882544f17bb96e64c639081c0677bf386c446be4f
sha256:dbb907d5159dcb993c532a46d2edaff7a72670d093d518e38e6aaf8115103f73	26.6MB	13 seconds	containerd.io/distribution.source.docker.io=library/nginx,containerd.io/uncompressed=sha256:7c0b223167b96d7deaacf1e1d2d35892166645b09b17bcc8675a4d882ef84893

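You can also inspect the local data for each blob directly under "/var/lib/containerd/io.containerd.content.v1.content/blobs/sha256". The three outputs (the pull log, ctr content ls, and the directory listing) all line up.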

# ll
-r--r--r-- 1 root root     7730 7月  24 08:11 08b152afcfae220e9709f00767054b824361c742ea03a9fe936271ba520a0a4b
-r--r--r-- 1 root root     1394 7月  24 08:11 1f41b2f2bf94740d411c54b48be7f5e9dfbe14f29d1a5cf64f39150d75f39740
-r--r--r-- 1 root root 27145795 7月  24 08:11 33847f680f63fb1b343a9fc782e267b5abdbdb50d65d4b9bd2a136291d67cf75
-r--r--r-- 1 root root     1570 7月  24 08:11 3f13b4376446cf92b0cb9a5c46ba75d57c41f627c4edb8b635fa47386ea29e20
-r--r--r-- 1 root root      601 7月  24 08:11 8a268f30c42a7c778c9c9497d043dfac6143281918cb9337f20335d4f11e1937
-r--r--r-- 1 root root     1862 7月  24 08:11 8f335768880da6baf72b70c701002b45f4932acae8d574dedfddaf967fc3ac90
-r--r--r-- 1 root root      893 7月  24 08:11 b10cf527a02df3ba9f85346ee04f59f920d9ec341a7ca688339c8ae1f8ea978c
-r--r--r-- 1 root root      665 7月  24 08:11 c90b090c213b9e42d7982715b827803437fcbf4337f4672fead618c60cd36b84
-r--r--r-- 1 root root 26596573 7月  24 08:11 dbb907d5159dcb993c532a46d2edaff7a72670d093d518e38e6aaf8115103f73

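The IDs of all nine blobs (one index, one manifest, one config, and six layers) match exactly. Each ID is the sha256sum of the file's content, for example: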

# cat dbb907d5159dcb993c532a46d2edaff7a72670d093d518e38e6aaf8115103f73 | sha256sum
dbb907d5159dcb993c532a46d2edaff7a72670d093d518e38e6aaf8115103f73
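The same check can be sketched in a few lines of Python. This is only an illustration, not containerd's own verification code: the helper names `sha256_of_file` and `blob_is_intact` are made up for this example, and the demo uses a temporary file rather than a real blob so that it runs anywhere.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, reading it in chunks to keep memory flat."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def blob_is_intact(path):
    """Hypothetical helper: a blob is intact if its file name equals its content hash."""
    return sha256_of_file(path) == Path(path).name


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data = b"stand-in for layer bytes"
        # Name the file after the hash of its content, as the content store does.
        blob = Path(tmp) / hashlib.sha256(data).hexdigest()
        blob.write_bytes(data)
        print(blob_is_intact(blob))    # True
        blob.write_bytes(data + b"!")  # simulate tampering
        print(blob_is_intact(blob))    # False
```

Pointing `blob_is_intact` at a file in the blobs/sha256 directory above (which usually requires root) performs the same comparison as the cat | sha256sum command.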

Naming blobs by their content hash is mainly a safeguard against tampering. Next, let's look at the content of the manifest.

# cat 3f13b4376446cf92b0cb9a5c46ba75d57c41f627c4edb8b635fa47386ea29e20
{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
   "config": {
      "mediaType": "application/vnd.docker.container.image.v1+json",
      "size": 7730,
      "digest": "sha256:08b152afcfae220e9709f00767054b824361c742ea03a9fe936271ba520a0a4b"
   },
   "layers": [
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 27145795,
         "digest": "sha256:33847f680f63fb1b343a9fc782e267b5abdbdb50d65d4b9bd2a136291d67cf75"
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 26596573,
         "digest": "sha256:dbb907d5159dcb993c532a46d2edaff7a72670d093d518e38e6aaf8115103f73"
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 601,
         "digest": "sha256:8a268f30c42a7c778c9c9497d043dfac6143281918cb9337f20335d4f11e1937"
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 893,
         "digest": "sha256:b10cf527a02df3ba9f85346ee04f59f920d9ec341a7ca688339c8ae1f8ea978c"
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 665,
         "digest": "sha256:c90b090c213b9e42d7982715b827803437fcbf4337f4672fead618c60cd36b84"
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 1394,
         "digest": "sha256:1f41b2f2bf94740d411c54b48be7f5e9dfbe14f29d1a5cf64f39150d75f39740"
      }
   ]
}
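To make the link between the manifest and the blob directory concrete, here is a small sketch that walks a manifest and lists the blobs it references. It embeds an abbreviated copy of the manifest above (the config plus the first and last layer only). The function names `referenced_blobs` and `blob_filename` are assumptions made for this example, not part of any containerd API.

```python
import json

# Abbreviated copy of the manifest above: config, first layer, last layer.
MANIFEST = """
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": 7730,
    "digest": "sha256:08b152afcfae220e9709f00767054b824361c742ea03a9fe936271ba520a0a4b"
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 27145795,
      "digest": "sha256:33847f680f63fb1b343a9fc782e267b5abdbdb50d65d4b9bd2a136291d67cf75"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 1394,
      "digest": "sha256:1f41b2f2bf94740d411c54b48be7f5e9dfbe14f29d1a5cf64f39150d75f39740"
    }
  ]
}
"""


def referenced_blobs(manifest):
    """Return (role, digest, size) for the config and every layer, in manifest order."""
    blobs = [("config", manifest["config"]["digest"], manifest["config"]["size"])]
    for i, layer in enumerate(manifest["layers"]):
        blobs.append((f"layer[{i}]", layer["digest"], layer["size"]))
    return blobs


def blob_filename(digest):
    """Map "sha256:<hex>" to the file name used under blobs/sha256/."""
    algorithm, _, hex_value = digest.partition(":")
    if algorithm != "sha256" or len(hex_value) != 64:
        raise ValueError(f"unexpected digest: {digest!r}")
    return hex_value


if __name__ == "__main__":
    for role, digest, size in referenced_blobs(json.loads(MANIFEST)):
        print(f"{role:9} {size:>9} {blob_filename(digest)}")
```

Each printed size and file name can be compared by eye with the ll listing earlier in the article.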

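The layer digests in the manifest match the IDs listed earlier, because containerd downloads the image layers by parsing exactly this manifest. We also need to look at the diff_ids in the rootfs section of the image config: a diff_id is the hash ID of a layer after decompression.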

"rootfs": {
    "type": "layers",
    "diff_ids": [
      "sha256:814bff7343242acfd20a2c841e041dd57c50f0cf844d4abd2329f78b992197f4",
      "sha256:7c0b223167b96d7deaacf1e1d2d35892166645b09b17bcc8675a4d882ef84893",
      "sha256:59b01b87c9e7f668b740d23eb872c5964636c33aef795f1186f08b172197bc35",
      "sha256:988d9a3509bbb7ea8037d4eba3a5e0ada5dc165144c8ff0df89c0048d1ac6132",
      "sha256:b857347059916922b353147882544f17bb96e64c639081c0677bf386c446be4f",
      "sha256:e3135447ca3e69c6975aee1621c406e3865e0e143c807bbdcf05abefa56054a2"
    ]
  }

If we decompress any two of the layers and compute their sha256, the results match the corresponding diff_ids above exactly.

# cat dbb907d5159dcb993c532a46d2edaff7a72670d093d518e38e6aaf8115103f73|gunzip |sha256sum
7c0b223167b96d7deaacf1e1d2d35892166645b09b17bcc8675a4d882ef84893
# cat c90b090c213b9e42d7982715b827803437fcbf4337f4672fead618c60cd36b84|gunzip |sha256sum
b857347059916922b353147882544f17bb96e64c639081c0677bf386c446be4f
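The gunzip | sha256sum pipeline can also be written as a short Python sketch. The function name `diff_id` is chosen for this example, and the demo feeds it an in-memory gzip stream as a stand-in for a real layer blob, so no files from the content store are needed.

```python
import gzip
import hashlib
import io


def diff_id(compressed_stream, chunk_size=1 << 20):
    """Return "sha256:<hex>" computed over the decompressed bytes of a gzip stream."""
    h = hashlib.sha256()
    with gzip.GzipFile(fileobj=compressed_stream) as gz:
        for chunk in iter(lambda: gz.read(chunk_size), b""):
            h.update(chunk)
    return "sha256:" + h.hexdigest()


if __name__ == "__main__":
    # Stand-in data: a real layer would be a gzip-compressed tarball.
    raw = b"pretend this is an uncompressed layer tarball"
    blob = gzip.compress(raw)

    # The diff_id is the hash of the uncompressed bytes.
    print(diff_id(io.BytesIO(blob)) == "sha256:" + hashlib.sha256(raw).hexdigest())  # True

    # The blob digest (its file name) is the hash of the compressed bytes, so it differs.
    print(diff_id(io.BytesIO(blob)) == "sha256:" + hashlib.sha256(blob).hexdigest())  # False
```

For a real layer, open the blob file in binary mode and pass the file object to `diff_id`.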

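Once the download completes, the layers have to be organized into snapshots: every data layer is unpacked into a snapshot, and each snapshot is keyed by a chainid, computed with the formula below. It is important to distinguish diffid from chainid. A diffid ensures that the data of a single layer hasn't been tampered with after decompression, while a chainid shows that a whole stack of layers is unmodified, similar to a hash tree (Merkle tree).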

ChainID(A|B|C) = Digest(ChainID(A|B) + " " + DiffID(C))

Listing the snapshots shows each one. The second column is the parent of the ID in the first column, and the bottom layer has an empty parent ID.

sha256:1e294000374b3a304c2bfcfe51460aa599237149ed42e3423ac2c3f155f9b4a5 sha256:c0d318592b21711dc370e180acd66ad5d42f173d5b58ed315d08b9b09babb84a Committed
sha256:316cd969204ae854302bc55c610698829c9f23fa6fcd4e0f69afa6f29fedfd68 sha256:dcec23d16cb7cdbd725dc0024f38b39fd326066fc59784df92b40fc05ba3728f Committed
sha256:814bff7343242acfd20a2c841e041dd57c50f0cf844d4abd2329f78b992197f4                                                                         Committed
sha256:97386f823dd75e356afac10af0def601f2cd86908e3f163fb59780a057198e1b sha256:316cd969204ae854302bc55c610698829c9f23fa6fcd4e0f69afa6f29fedfd68 Committed
sha256:c0d318592b21711dc370e180acd66ad5d42f173d5b58ed315d08b9b09babb84a sha256:814bff7343242acfd20a2c841e041dd57c50f0cf844d4abd2329f78b992197f4 Committed
sha256:dcec23d16cb7cdbd725dc0024f38b39fd326066fc59784df92b40fc05ba3728f sha256:1e294000374b3a304c2bfcfe51460aa599237149ed42e3423ac2c3f155f9b4a5 Committed

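The bottom layer's hash ID is 814bff7343242acfd20a2c841e041dd57c50f0cf844d4abd2329f78b992197f4, which is identical to the first diffid above. The subsequent IDs have to be computed with the formula. For example, the second layer's hash ID is c0d318592b21711dc370e180acd66ad5d42f173d5b58ed315d08b9b09babb84a, obtained as follows: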

# echo -n "sha256:814bff7343242acfd20a2c841e041dd57c50f0cf844d4abd2329f78b992197f4 sha256:7c0b223167b96d7deaacf1e1d2d35892166645b09b17bcc8675a4d882ef84893" | sha256sum
c0d318592b21711dc370e180acd66ad5d42f173d5b58ed315d08b9b09babb84a
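Applying the formula to every layer in turn gives the full chain. The sketch below does this for the six diff_ids from the rootfs section shown earlier. The function name `chain_ids` is an assumption for this example, and the code simply follows the ChainID formula as written rather than reproducing containerd's implementation.

```python
import hashlib

# diff_ids copied from the rootfs section above, bottom layer first.
DIFF_IDS = [
    "sha256:814bff7343242acfd20a2c841e041dd57c50f0cf844d4abd2329f78b992197f4",
    "sha256:7c0b223167b96d7deaacf1e1d2d35892166645b09b17bcc8675a4d882ef84893",
    "sha256:59b01b87c9e7f668b740d23eb872c5964636c33aef795f1186f08b172197bc35",
    "sha256:988d9a3509bbb7ea8037d4eba3a5e0ada5dc165144c8ff0df89c0048d1ac6132",
    "sha256:b857347059916922b353147882544f17bb96e64c639081c0677bf386c446be4f",
    "sha256:e3135447ca3e69c6975aee1621c406e3865e0e143c807bbdcf05abefa56054a2",
]


def chain_ids(diff_ids):
    """Return the chain ID of every layer stack, bottom layer first.

    The first chain ID is the first diff_id itself; each later one is
    Digest(previous chain ID + " " + current diff_id).
    """
    chain = []
    for current in diff_ids:
        if not chain:
            chain.append(current)
        else:
            payload = f"{chain[-1]} {current}".encode()
            chain.append("sha256:" + hashlib.sha256(payload).hexdigest())
    return chain


if __name__ == "__main__":
    for depth, chain_id in enumerate(chain_ids(DIFF_IDS)):
        print(depth, chain_id)
```

The second entry hashes exactly the string passed to echo -n above, and the remaining entries can be compared with the first column of the snapshot listing.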

As long as these snapshot IDs stay consistent, the data in every layer is guaranteed to be unmodified.

Finally, there is one more ID: the ID of the whole image (imageid). The image ID is simply the hash ID of config.json, that is, the "config-sha256:08b152afcfae2" entry above.

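After unpacking, nothing is actually mounted; the union mount is performed only when a container is started. As shown below, the six layers are used as lowerdir: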

overlay on /run/containerd/io.containerd.runtime.v2.task/default/mynginx/rootfs type overlay (rw,relatime,lowerdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/6/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/5/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/4/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/3/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/2/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/1/fs,upperdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/7/fs,workdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/7/work)

An upperdir is then created on top of them as the read-write layer.
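The structure of that option string can be sketched as follows. This is an illustration only, not containerd's snapshotter code: the function name `overlay_mount_options` is made up, and treating snapshots 1 to 6 as the layers from bottom to top is an assumption read off the mount output above. The one detail worth noting is that overlayfs expects the lowerdir entries ordered with the top-most layer first.

```python
def overlay_mount_options(lower_dirs_bottom_up, upper_dir, work_dir):
    """Build the lowerdir/upperdir/workdir option string for an overlay mount.

    lower_dirs_bottom_up lists the read-only layers from bottom to top;
    overlayfs wants them in the opposite order (top-most first).
    """
    if not lower_dirs_bottom_up:
        raise ValueError("at least one lower directory is required")
    lower = ":".join(reversed(lower_dirs_bottom_up))
    return f"lowerdir={lower},upperdir={upper_dir},workdir={work_dir}"


if __name__ == "__main__":
    root = "/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots"
    # Assumption: snapshots 1..6 are the image layers, bottom first; 7 is the writable one.
    layers = [f"{root}/{n}/fs" for n in range(1, 7)]
    print(overlay_mount_options(layers, f"{root}/7/fs", f"{root}/7/work"))
```

The printed string has the same lowerdir, upperdir, and workdir parts as the mount entry above, without the rw,relatime flags.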
