

0x00 Docker Troubleshooting Checklist

CentOS 7:

# systemd unit files (startup parameters)
/etc/systemd/system/docker.service
/usr/lib/systemd/system/docker.service

# Docker data and metadata directory
/var/lib/docker

# Docker daemon startup options
/etc/sysconfig/docker
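
A quick way to walk this checklist is to test each path in turn. The sketch below is only an illustration: the helper name report_path is made up, and the path list simply mirrors the one above, so adjust it for non-default installs.

```shell
#!/bin/sh
# report_path is a hypothetical helper: it prints whether a path exists.
report_path() {
    if [ -e "$1" ]; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

# Paths taken from the checklist above; adjust for custom installs.
for p in /etc/systemd/system/docker.service \
         /usr/lib/systemd/system/docker.service \
         /var/lib/docker \
         /etc/sysconfig/docker; do
    report_path "$p"
done
```

On a systemd host, `systemctl cat docker` additionally prints the unit file and any drop-ins that systemd actually loaded.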


0x01 Docker Error Fixes

Error 1:
Error message:

docker:Error running DeviceCreate (createSnapDevice) dm_task_run failed

Solution: https://stackoverflow.com/questions/30719896/docker-dm-task-run-failed-error

# The metadata path may differ depending on the installation
service docker stop
thin_check /var/lib/docker/devicemapper/devicemapper/metadata
thin_check --clear-needs-check-flag /var/lib/docker/devicemapper/devicemapper/metadata
service docker start
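
Since these commands touch the thin-pool metadata directly, it is probably worth keeping a copy of the file first. The following is a minimal sketch of the same steps with a backup added. DOCKER_ROOT and both function names are assumptions for illustration, not part of Docker.

```shell
#!/bin/sh
# Sketch only: same steps as above, plus a backup of the metadata file.
# DOCKER_ROOT is an assumption; set it to your Docker data directory.
DOCKER_ROOT="${DOCKER_ROOT:-/var/lib/docker}"

# Build the loopback metadata file path for a given Docker root.
metadata_file() {
    printf '%s\n' "$1/devicemapper/devicemapper/metadata"
}

repair_thin_metadata() {
    meta="$(metadata_file "$DOCKER_ROOT")"
    if [ ! -f "$meta" ]; then
        echo "metadata file not found: $meta" >&2
        return 1
    fi
    service docker stop || return 1
    # Keep a copy so the original metadata can be restored.
    cp -a "$meta" "$meta.bak" || return 1
    # Report problems; a non-zero exit means thin_check found errors.
    thin_check "$meta" || echo "thin_check reported errors in $meta" >&2
    thin_check --clear-needs-check-flag "$meta"
    service docker start
}

# Run as root when ready:
# repair_thin_metadata
```

For a custom data directory, something like `DOCKER_ROOT=/export/docker` before calling the function should point it at the right file.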


Error 2:
Error message:

docker start e7e
Error response from daemon: devmapper: Error mounting '/dev/mapper/docker-253:4-11534337-ee772425c4996ca581e5c234806adf41aede9424a83ce1402596105a9f66434d' on '/export/docker/devicemapper/mnt/ee772425c4996ca581e5c234806adf41aede9424a83ce1402596105a9f66434d': invalid argument

Cause: the container was created while SELinux was enabled. /etc/selinux/config was later changed to set SELinux to disabled, and after the host rebooted SELinux was off. Containers created while SELinux was enabled then fail to start with this error.
Fix:

There are two options:
1. Re-enable SELinux and reboot the host.
2. Edit the container's configuration. In my case the file is /var/lib/docker/containers/e7ef71494940ba293be4b3f74198bf34835c35537810053b051d9a6c33adbd32/config.v2.json. Change "MountLabel": "system_u:object_r:svirt_sandbox_file_t:s0:c12,c257", "ProcessLabel": "system_u:system_r:svirt_lxc_net_t:s0:c12,c257" to "MountLabel": "", "ProcessLabel": "", then restart the docker daemon and the container starts again.
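
Option 2 can be scripted instead of edited by hand. The sketch below is one possible way to do it with sed; the function name clear_selinux_labels is made up for illustration, and it assumes both labels are plain JSON strings as in the example above.

```shell
#!/bin/sh
# clear_selinux_labels is a hypothetical filter: it reads config.v2.json
# on stdin and writes it to stdout with both SELinux labels emptied.
clear_selinux_labels() {
    sed -E \
        -e 's/"MountLabel"[[:space:]]*:[[:space:]]*"[^"]*"/"MountLabel":""/' \
        -e 's/"ProcessLabel"[[:space:]]*:[[:space:]]*"[^"]*"/"ProcessLabel":""/'
}

# Example (keep a backup, then rewrite the file from it):
#   cfg=/var/lib/docker/containers/<container-id>/config.v2.json
#   cp -a "$cfg" "$cfg.bak"
#   clear_selinux_labels < "$cfg.bak" > "$cfg"
```

It is probably safest to stop the docker daemon before rewriting the file and start it again afterwards, so the daemon does not write its own copy back. `getenforce` shows the current SELinux mode if you want to confirm the cause first.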


Error 3:
Error message:

/usr/bin/docker-current: Error response from daemon: devmapper: Thin Pool has 155398 free data blocks which is less than minimum required 163840 free data blocks. Create more free space in thin pool or use dm.min_free_space option to change behavior

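Solution: free space in the thin pool by removing exited containers, dangling volumes, and dangling images: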

sudo docker rm $(sudo docker ps -q -f status=exited)
sudo docker volume rm $(sudo docker volume ls -qf dangling=true)
sudo docker rmi $(sudo docker images --filter "dangling=true" -q --no-trunc)
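
One rough edge of the commands above: when a list is empty, `docker rm` and friends receive no arguments and exit with a usage error. The sketch below wraps the same three cleanups so that empty lists are skipped. The names remove_all and docker_cleanup are made up for illustration.

```shell
#!/bin/sh
# remove_all is a hypothetical helper: it reads IDs on stdin and passes
# them to the command given as arguments, skipping the call when empty.
remove_all() {
    ids="$(cat)"
    if [ -z "$ids" ]; then
        echo "nothing to remove"
        return 0
    fi
    # $ids is left unquoted on purpose so each ID becomes one argument.
    "$@" $ids
}

docker_cleanup() {
    docker ps -q -f status=exited | remove_all docker rm
    docker volume ls -qf dangling=true | remove_all docker volume rm
    docker images -q --no-trunc -f dangling=true | remove_all docker rmi
}

# Run when ready (as root or with sudo):
# docker_cleanup
```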


Error 4:
Environment: CentOS 7.3.1611, Docker version 1.12.6-16.el7.centos.x86_64, API 1.24
Error message:

# Docker fails to start
docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/docker.service.d
└─flannel.conf
Process: 5226 ExecStart=/usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --exec-opt native.cgroupdriver=systemd --userland-proxy-path=/usr/libexec/docker/docker-proxy-current $OPTIONS $DOCKER_STORAGE_OPTIONS $DOCKER_NETWORK_OPTIONS $ADD_REGISTRY $BLOCK_REGISTRY $INSECURE_REGISTRY (code=exited, status=1/FAILURE)
Main PID: 5226 (code=exited, status=1/FAILURE)
# Key error lines
dockerd-current[5226]: time="..." level=info msg="libcontainerd: new containerd process, pid: 5238"
dockerd-current[5226]: time="..." level=warning msg="devmapper: Usage of loopback devices is strongly discouraged for production use. Please use `--storage-opt dm.thinpooldev` or use `man docker` to refer to dm.thinpooldev section."
node-198 dockerd-current[5226]: time="2020-01-18T17:00:27.872191345+08:00" level=error msg="[graphdriver] prior storage driver \"devicemapper\" failed: devmapper: Base Device UUID and Filesystem verification failed: devmapper: Current Base Device UUID:59df6192-df22-4d88-9e90-02755e7e3242 does not match with stored UUID:24907e3f-5114-4948-91ea-c1a4e92854ef. Possibly using a different thin pool than last invocation"
node-198 dockerd-current[5226]: time="2020-01-18T17:00:27.872410561+08:00" level=fatal msg="Error starting daemon: error initializing graphdriver: devmapper: Base Device UUID and Filesystem verification failed: devmapper: Current Base Device UUID:59df6192-df22-4d88-9e90-02755e7e3242 does not match with stored UUID:24907e3f-5114-4948-91ea-c1a4e92854ef. Possibly using a different thin pool than last invocation"

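Cause: the disk that holds Docker's metadata is a mounted volume. During one shutdown the storage went down uncleanly. After that was resolved the machine remounted the remote NFS disk, and the disk's UUID changed with the remount, so Docker could no longer attach to its devicemapper storage pool through the loopback device.
Solution: look up the actual UUID of loop0 and update the UUID in deviceset-metadata.

# List the disk UUIDs on the system
$ ls /dev/disk/by-uuid
$ blkid
# 59df6192-df22-4d88-9e90-02755e7e3242

# Default path
/var/lib/docker/devicemapper/metadata/deviceset-metadata
# Custom path
/disk/docker/devicemapper/metadata/deviceset-metadata
# Contents to set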
{"next_device_id":1,"BaseDeviceUUID":"59df6192-df22-4d88-9e90-02755e7e3242","BaseDeviceFilesystem":"xfs"}
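
The edit can also be done with a small filter rather than by hand. This is a minimal sketch, assuming the file is single-line JSON like the example above; set_base_uuid is a made-up name, and the UUID to pass in is the "Current Base Device UUID" from the daemon log.

```shell
#!/bin/sh
# set_base_uuid is a hypothetical filter: it reads deviceset-metadata on
# stdin and writes it to stdout with BaseDeviceUUID replaced by $1.
set_base_uuid() {
    # Refuse anything that does not look like a 36-character UUID.
    printf '%s' "$1" | grep -Eq '^[0-9a-fA-F-]{36}$' || return 1
    sed -E 's/"BaseDeviceUUID"[[:space:]]*:[[:space:]]*"[^"]*"/"BaseDeviceUUID":"'"$1"'"/'
}

# Example (stop docker first, keep a backup):
#   meta=/var/lib/docker/devicemapper/metadata/deviceset-metadata
#   cp -a "$meta" "$meta.bak"
#   set_base_uuid 59df6192-df22-4d88-9e90-02755e7e3242 < "$meta.bak" > "$meta"
```

Keeping the .bak copy means the stored UUID can be restored if the daemon still refuses to start.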

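Notes:

  • Docker currently supports the storage drivers aufs, Device mapper, btrfs, overlayfs, and zfs, all of which use copy-on-write (CoW). aufs is not supported by default on CentOS.
  • Note: the Devicemapper storage driver defaults to loopback mode, whose performance and stability are poor.
  • Note: this default is loop-lvm mode, which builds the storage pool from sparse files. direct-lvm mode is recommended for production.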
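
To see which mode a host is in, the `docker info` output can be checked. As far as I know, devicemapper in loop-lvm mode lists a `Data loop file` entry there. The sketch below relies on that assumption, and uses_loopback is a made-up helper name.

```shell
#!/bin/sh
# uses_loopback is a hypothetical helper: it reads `docker info` output
# on stdin and succeeds if devicemapper is backed by loop files.
uses_loopback() {
    grep -q 'Data loop file:'
}

# Example:
#   docker info 2>/dev/null | grep 'Storage Driver'
#   docker info 2>/dev/null | uses_loopback && echo "loop-lvm mode"
```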


Error 5:

# Shown by docker info or at daemon startup
WARNING: Usage of loopback devices is strongly discouraged for production use

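Cause: running Docker on loopback devices is strongly discouraged.
Solution:

# Option 1: add DOCKER_STORAGE_OPTIONS to the Docker startup options (not recommended, it only suppresses the warning)
DOCKER_STORAGE_OPTIONS="--storage-opt dm.no_warn_on_loop_devices=true"

# Option 2: when starting the docker daemon, point the device mapper metadata store and Docker's image data store at dedicated block devices; LVM volumes or separate disk partitions both work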
--storage-opt dm.datadev=/dev/xxxx --storage-opt dm.metadatadev=/dev/xxx
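
For option 2, the flags can go into DOCKER_STORAGE_OPTIONS just like option 1. The sketch below builds that line and checks that a path really is a block device. Both helper names and the device names in the example are made up, and which sysconfig file holds the variable depends on how Docker was packaged.

```shell
#!/bin/sh
# storage_options_line is a hypothetical helper: it prints the
# DOCKER_STORAGE_OPTIONS line for a data device ($1) and a metadata
# device ($2).
storage_options_line() {
    printf 'DOCKER_STORAGE_OPTIONS="--storage-opt dm.datadev=%s --storage-opt dm.metadatadev=%s"\n' "$1" "$2"
}

# is_block_device succeeds only if the path is a block device.
is_block_device() {
    [ -b "$1" ]
}

# Example with made-up device names:
#   is_block_device /dev/sdb1 && is_block_device /dev/sdb2 &&
#       storage_options_line /dev/sdb1 /dev/sdb2
```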
