Detailed MLflow Installation and Deployment

1. Install Docker
# Install the required tools
sudo yum install -y yum-utils
# Add the yum repository configuration
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Refresh the yum cache
sudo yum makecache fast
# Install Docker
yum install -y docker-ce docker-ce-cli containerd.io
# Check the installation
docker info
# Create the daemon configuration, using registry mirrors in mainland China to speed up pulls
# (create the directory first in case it does not exist yet)
mkdir -p /etc/docker
cat <<EOF > /etc/docker/daemon.json
{
  "registry-mirrors": [
    "https://docker.mirrors.ustc.edu.cn",
    "http://hub-mirror.c.163.com"
  ],
  "max-concurrent-downloads": 10,
  "log-driver": "json-file",
  "log-level": "warn",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  },
  "data-root": "/var/lib/docker"
}
EOF
# Start the service
systemctl start docker
# Enable start on boot
systemctl enable docker
# Check the service status
systemctl status docker

2. Install MinIO with Docker
# Pull the image
docker pull minio/minio
# Run the container. The web console listens on 9000 and the S3 API on 9090;
# both ports are published, so change them if either is already in use
docker run -d -p 9000:9000 -p 9090:9090 --name minio \
  -e "MINIO_ACCESS_KEY=minio" \
  -e "MINIO_SECRET_KEY=minio123" \
  -v /opt/minio/data:/data \
  -v /opt/minio/config:/root/.minio \
  minio/minio server /data \
  --console-address ":9000" --address ":9090"

3. Access the MinIO web console

  • Address: <IP of the installation node>:9000
  • Username: minio
  • Password: minio123
  • Create a bucket: click Create Bucket, enter the name mlflow, and create it
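
Before moving on, it can help to confirm that the MinIO S3 API is actually answering. The following is a minimal sketch, not part of the original setup: the function names `minio_health_url` and `wait_for_minio` are hypothetical helpers, and it assumes `curl` is installed and that the API port (the `--address` value, 9090 in step 2) is reachable from the host. It relies on MinIO's liveness endpoint `/minio/health/live`.

```shell
# Build the liveness-probe URL for a MinIO server.
# Arguments: host, S3 API port (the --address value, not the console port).
minio_health_url() {
  local host="$1" port="$2"
  printf 'http://%s:%s/minio/health/live\n' "$host" "$port"
}

# Poll the liveness endpoint until it answers or the attempts run out.
# Arguments: host, port, max attempts (default 10), delay in seconds (default 2).
wait_for_minio() {
  local url attempts delay i
  url="$(minio_health_url "$1" "$2")"
  attempts="${3:-10}"
  delay="${4:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; --max-time bounds each attempt
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      echo "MinIO is live at $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "MinIO did not respond at $url" >&2
  return 1
}

# Example (assumed host; 9090 is the API port from step 2):
# wait_for_minio 127.0.0.1 9090
```

If the check keeps failing while the console on port 9000 loads fine, the API port is probably not published from the container.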
4. Install Anaconda3
# Download the installer
wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/Anaconda3-2021.11-Linux-x86_64.sh
# Run the installer; press Enter through the prompts and answer yes
bash Anaconda3-2021.11-Linux-x86_64.sh
# Add conda to the PATH
vim /etc/profile
# Append the following line at the bottom of the file, adjusting it to your actual Anaconda install path
export PATH=/root/anaconda3/bin:$PATH
# Apply the change
source /etc/profile
# Switch to the Tsinghua mirrors; otherwise creating an environment can be slow or fail depending on the network
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/msys2/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/
conda config --set show_channel_urls yes

5. Create and activate the conda environment
# Create the conda environment with Python 3.8; this takes a while, so be patient
conda create -n mlflow-1.11.0 python=3.8
# If the following message appears, wait while conda automatically retries with the next repodata source:
# Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
# Open a new terminal, then activate the conda environment
conda activate mlflow-1.11.0

6. Install the required packages
# Install, in order, the Python packages the MLflow tracking server needs
pip install mlflow==1.11.0
pip install mysqlclient==1.4.6
pip install boto3

7. Start the MLflow tracking server
# Export the MinIO URL, access key ID, and secret key; the MLflow tracking server needs them to upload model files
export AWS_ACCESS_KEY_ID=minio
export AWS_SECRET_ACCESS_KEY=minio123
# The endpoint must be MinIO's S3 API port (the --address value from step 2, which has to be
# published from the container), not the web console port
export MLFLOW_S3_ENDPOINT_URL=http://localhost:9090
# In MySQL, create the mlflow database
create database if not exists `mlflow`;
# Start the MLflow server, adjusting the MySQL details to your setup
mlflow server \
  --backend-store-uri mysql://<mysql-username>:'<mysql-password>'@localhost/mlflow \
  --host 0.0.0.0 -p 5002 \
  --default-artifact-root s3://mlflow

8. Problems that may occur at startup
# Problem 1:
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
# Cause: a protobuf version mismatch. Fix: in the current conda environment, uninstall protobuf and reinstall a pinned version
pip uninstall protobuf
pip install protobuf==3.19.0

# Problem 2:
ImportError: libmysqlclient.so.20: cannot open shared object file: No such file or directory
# Cause: /usr/lib64/ does not contain libmysqlclient.so.20. Fix: find where libmysqlclient.so.20 lives on the system, then create a symlink to it at /usr/lib64/libmysqlclient.so.20
[root@node1 ~]# find / -name "libmysqlclient.so.20"
/usr/local/mysql/lib/libmysqlclient.so.20
[root@node1 ~]# ln -s /usr/local/mysql/lib/libmysqlclient.so.20 /usr/lib64/libmysqlclient.so.20

# Problem 3:
sqlalchemy.exc.OperationalError: (MySQLdb._exceptions.OperationalError) (2002, "Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2)")
# Cause: mysql.sock cannot be found under /tmp. Fix: find the directory that holds mysql.sock, then create a symlink to that file at /tmp/mysql.sock
[root@node1 ~]# find / -name "mysql.sock"
/var/lib/mysql/mysql.sock
[root@node1 ~]# ln -s /var/lib/mysql/mysql.sock /tmp/mysql.sock
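
The symlink fixes for the libmysqlclient.so.20 and mysql.sock problems can be wrapped in a small guard so that an existing file is never overwritten by accident. This is only a sketch under assumptions: `link_if_missing` is a hypothetical helper name, and the paths in the usage comments are the ones found on the example node above, which may differ on your system.

```shell
# Create a symlink at "$2" pointing to "$1", but only when it is safe to do so.
# Returns 0 if the desired link exists afterwards, 1 if the source is missing
# or the destination is already occupied by something else.
link_if_missing() {
  local src="$1" dest="$2"
  if [ ! -e "$src" ]; then
    echo "source not found: $src" >&2
    return 1
  fi
  # Nothing to do when the link is already in place
  if [ -L "$dest" ] && [ "$(readlink "$dest")" = "$src" ]; then
    echo "already linked: $dest -> $src"
    return 0
  fi
  # Refuse to replace any other file or link at the destination
  if [ -e "$dest" ] || [ -L "$dest" ]; then
    echo "destination exists, not overwriting: $dest" >&2
    return 1
  fi
  ln -s "$src" "$dest" && echo "linked: $dest -> $src"
}

# Example with the paths from the notes above (run as root):
# link_if_missing /usr/local/mysql/lib/libmysqlclient.so.20 /usr/lib64/libmysqlclient.so.20
# link_if_missing /var/lib/mysql/mysql.sock /tmp/mysql.sock
```

Running it a second time is harmless: it reports that the link is already in place instead of failing the way a bare `ln -s` would.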
