---
gitea: none
include_toc: true
---
# DolphinScheduler Cluster Deployment


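### Environment Preparation

#### Server Setup

Prepare three machines for installing DolphinScheduler. Deploying on the CDH nodes is recommended (not required), since the machines can be shared.

1. Map hostnames to IP addresses on every machine.
2. Install ZooKeeper (DolphinScheduler 3.0 and later requires ZooKeeper 3.8).
3. Set up the Spark environment.
4. Install a database, either MySQL or PostgreSQL.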
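
The hostname-to-IP mapping from item 1 can be sanity-checked before going further. The following is a minimal sketch, assuming the mappings are kept in `/etc/hosts`. The function names `has_host_entry` and `report_missing_hosts` are invented for this example, and the `cdh-node-*` hostnames are the ones used in the appendix of this article.

```shell
# has_host_entry HOSTS_FILE HOSTNAME
# Succeeds when HOSTS_FILE contains a non-comment line that lists HOSTNAME
# after the IP address. (Hypothetical helper, not part of DolphinScheduler.)
has_host_entry() {
  awk -v name="$2" '
    /^[[:space:]]*#/ { next }
    { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
    END { exit (found ? 0 : 1) }
  ' "$1"
}

# report_missing_hosts HOSTS_FILE HOSTNAME...
# Prints one line for every hostname that has no mapping in HOSTS_FILE.
report_missing_hosts() {
  file="$1"
  shift
  for host in "$@"; do
    has_host_entry "$file" "$host" || echo "missing mapping: $host"
  done
}
```

Running `report_missing_hosts /etc/hosts cdh-node-1 cdh-node-2 cdh-node-3` on each machine prints nothing when all three names are mapped.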


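#### Passwordless SSH Between Nodes

1. Create the user `dolphinscheduler` and switch to it.
2. Generate a key pair by running `ssh-keygen` and pressing Enter at every prompt (run as `dolphinscheduler` on all three machines).
3. Distribute the public key (run as `dolphinscheduler` on all three machines): `ssh-copy-id dolphinscheduler@hostname1`, `ssh-copy-id dolphinscheduler@hostname2`.
4. Test passwordless login: `ssh hostname1`, `ssh hostname2`.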
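
The key generation, distribution, and test steps above can be wrapped in one script and run on each machine as the `dolphinscheduler` user. This is only a sketch under assumptions: the node names in `NODES` are placeholders (including a third name, `hostname3`, that the steps above do not list), the RSA key path is the OpenSSH default, and the `run` wrapper and `DRY_RUN` variable are inventions of this example that let you preview the commands first.

```shell
# Placeholder hostnames; replace them with the real node names.
NODES="hostname1 hostname2 hostname3"

# Print the command instead of executing it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

setup_passwordless_ssh() {
  # Generate a key pair with an empty passphrase only if none exists yet.
  if [ ! -f "$HOME/.ssh/id_rsa" ]; then
    run ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
  fi
  # Copy the public key to every node (asks for the password once per node).
  for node in $NODES; do
    run ssh-copy-id "dolphinscheduler@$node"
  done
  # BatchMode makes ssh fail instead of prompting, so a password prompt
  # shows up as an error here.
  for node in $NODES; do
    run ssh -o BatchMode=yes "$node" hostname
  done
}
```

Call `DRY_RUN=1 setup_passwordless_ssh` to review the commands, then `setup_passwordless_ssh` to execute them.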


### Installation

On all three machines, create the `dolphinscheduler` directory under `/opt/` and set its owner to the `dolphinscheduler` user.

1. Log in to node 1 as the `dolphinscheduler` user, change to the home directory (`~`), and download the binary package from `https://dlcdn.apache.org/dolphinscheduler/3.2.1/apache-dolphinscheduler-3.2.1-bin.tar.gz`.

2. Extract the archive

   `tar -zxvf apache-dolphinscheduler-3.2.1-bin.tar.gz`

   The extracted `apache-dolphinscheduler-3.2.1-bin` directory is the source directory and must not be deleted.

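3. Configure the installation

   Go to `apache-dolphinscheduler-3.2.1-bin/bin/env`.

   Edit `dolphinscheduler_env.sh` to configure the database, the time zone, ZooKeeper, and the paths of the individual components.

   Edit `install_env.sh` to assign services to nodes and to set the installation path and the ZooKeeper root node.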
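
   The comments in `install_env.sh` (see the appendix) require `masters`, `workers`, `alertServer`, and `apiServers` to be subsets of `ips`. A small pre-flight check can catch a typo before `install.sh` runs. The sketch below is an assumption-laden helper: `is_subset` is a made-up name and is not shipped with DolphinScheduler.

   ```shell
   # is_subset CANDIDATES ALL
   # Both arguments are comma-separated host lists. A ":group" suffix, as used
   # in the workers setting, is ignored. Succeeds when every candidate host
   # also appears in ALL; otherwise prints the first offending host.
   is_subset() {
     for entry in $(echo "$1" | tr ',' ' '); do
       host="${entry%%:*}"
       case ",$2," in
         *",$host,"*) ;;
         *) echo "not in ips: $host"; return 1 ;;
       esac
     done
     return 0
   }

   # Possible usage from the source directory:
   #   . bin/env/install_env.sh
   #   is_subset "$masters" "$ips" && is_subset "$workers" "$ips"
   ```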

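4. Configure the global file system

   Edit the `common.properties` file of every module. Find `resource.storage.type=HDFS`, choose the appropriate storage type, and fill in the corresponding settings below it.

   If HDFS is secured with Kerberos, also set `hadoop.security.authentication.startup.state` (shown as `false` below) to `true` and configure the Kerberos user name and keytab in the options that follow it.

   The full set of related options: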

   ```bash
   # if resource.storage.type=HDFS, the user must have the permission to create directories under the HDFS root path
   resource.hdfs.root.user=hdfs
   # if resource.storage.type=S3, the value like: s3a://dolphinscheduler; if resource.storage.type=HDFS and namenode HA is enabled, you need to copy core-site.xml and hdfs-site.xml to conf dir
   resource.hdfs.fs.defaultFS=hdfs://cdh-node-2:8020

   # whether to startup kerberos
   hadoop.security.authentication.startup.state=false

   # java.security.krb5.conf path
   java.security.krb5.conf.path=/opt/krb5.conf

   # login user from keytab username
   login.user.keytab.username=hdfs-mycluster@ESZ.COM

   # login user from keytab path
   login.user.keytab.path=/opt/hdfs.headless.keytab

   # kerberos expire time, the unit is hour
   kerberos.expire.time=2
   ```

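5. Data sources

   To add a data source, first copy its driver into the libs directory of api-server, worker-server, and master-server.

   - To add a Kerberos-authenticated Hive data source, the krb5 conf file, the Kerberos user, and the keytab must be configured in the "Configure the global file system" step (these act as the defaults).

   - Replace the hive-* JARs in the api-server, worker-server, and master-server services with the dependencies shipped with CDH/CDP.

     CDH/CDP path: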

     ```
     /opt/cloudera/parcels/CDH/lib/hive/lib
     ```

   - In the custom parameters field on the data source page, add:

     ```
     {"principal":"hive/bigdata57.cua.internal@CUA-KDCSERVER.COM"}
     ```

   - In `kerberos.username`, enter the matching user name, for example `hive@CUA-KDCSERVER.COM`, together with its keytab.

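6. Configure data quality checks

   Edit the `common.properties` file of every module and make sure the name set in `data-quality.jar.name=` matches the JAR in that module's libs directory. Then copy the drivers for the data source types you use, plus the JDBC driver of the database that stores the DolphinScheduler metadata, into the libs directories of api-server and worker-server.

   1. The data quality JAR does not bundle any drivers, so they must be supplied (they can be placed on HDFS, the distributed file system).
   2. Checking a Kerberos-authenticated Hive source requires supplying them as well.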
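
   A mismatch between `data-quality.jar.name` and the actual file name is easy to miss. The sketch below compares the two for one module. It assumes each module keeps `common.properties` under `conf/` and its JARs under `libs/`; the function name `check_data_quality_jar` is invented for this example.

   ```shell
   # check_data_quality_jar MODULE_DIR
   # Reads data-quality.jar.name from MODULE_DIR/conf/common.properties and
   # verifies that a file with that name exists in MODULE_DIR/libs.
   # (Hypothetical helper; the conf/ and libs/ layout is an assumption.)
   check_data_quality_jar() {
     props="$1/conf/common.properties"
     jar=$(sed -n 's/^data-quality\.jar\.name=//p' "$props" | tail -n 1)
     if [ -z "$jar" ]; then
       echo "data-quality.jar.name is not set in $props"
       return 1
     fi
     if [ ! -f "$1/libs/$jar" ]; then
       echo "missing $1/libs/$jar"
       return 1
     fi
     echo "ok: $1/libs/$jar"
   }

   # Possible usage, with the installPath from the appendix:
   #   for m in api-server worker-server; do
   #     check_data_quality_jar "/opt/dolphinscheduler/$m"
   #   done
   ```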

7. Configure YARN

   Update the YARN ResourceManager host name and port:

   ```bash
   # resourcemanager port, the default value is 8088 if not specified
   resource.manager.httpaddress.port=8088
   # if resourcemanager HA is enabled, please set the HA IPs; if resourcemanager is single, keep this value empty
   yarn.resourcemanager.ha.rm.ids=cdh-node-2
   # if resourcemanager HA is enabled or not use resourcemanager, please keep the default value; If resourcemanager is single, you only need to replace ds1 to actual resourcemanager hostname
   yarn.application.status.address=http://cdh-node-2:%s/ws/v1/cluster/apps/%s
   # job history status url when application number threshold is reached(default 10000, maybe it was set to 1000)
   yarn.job.history.status.address=http://cdh-node-2:19888/ws/v1/history/mapreduce/jobs/%s

   ```

8. Run the installation

   Run `bin/install.sh`. When the installation finishes, open `hostName:12345/dolphinscheduler/ui/login`, where `hostName` is the host running api-server.

   Log in with `admin` / `dolphinscheduler123`.
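
After `install.sh` finishes, a quick reachability check of the login page saves a trip to the browser. This is a minimal sketch, assuming the UI is served over plain `http` and that `curl` is installed; `login_url` and `check_login_page` are hypothetical helper names.

```shell
# login_url HOST [PORT]
# Builds the login page URL; 12345 is the api-server port mentioned above.
# The http scheme is an assumption of this sketch.
login_url() {
  echo "http://$1:${2:-12345}/dolphinscheduler/ui/login"
}

# check_login_page HOST
# Succeeds when the login page answers with an HTTP success status.
# Requires curl; -f turns HTTP error responses into a non-zero exit status.
check_login_page() {
  curl -fsS -o /dev/null --max-time 10 "$(login_url "$1")"
}
```

For the appendix configuration, where `apiServers="cdh-node-1"`, the call would be `check_login_page cdh-node-1 && echo "UI is up"`.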


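### Changing the Configuration

Configuration changes are still made in the source directory: edit the relevant files there, then run `bin/install.sh` again to redeploy.

To remove a dependency or file, delete it from the installation directory first and then from the source directory.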


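### Common Problems

1. A newer DolphinScheduler version paired with an older ZooKeeper

   Replace the ZooKeeper, Curator, and related dependencies in every module with the older versions.

2. Cannot connect to the Hive database

   Replace the Hive-related dependencies in the libs directory with the same-named JARs from CDH.

3. Data quality tasks fail in a multi-tenant setup

   Create a home directory for the tenant under `/user/` in the global file system and make the tenant its owner.

4. A Kerberos data source shows no tables, or the connection test fails

   Check the keytab file, the principal setting, and the user name.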
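
For problem 3, when the global file system is HDFS, the tenant home directory can be created with the standard `hdfs dfs` commands. The sketch below only prints the commands so they can be reviewed first. Running them as the `hdfs` superuser through `sudo` is an assumption that fits a cluster without Kerberos; `tenant_home_commands` and the tenant name `etl_user` are made up for this example.

```shell
# tenant_home_commands TENANT
# Prints the HDFS commands that create /user/TENANT and hand it to the tenant.
# Nothing is executed here. Using `sudo -u hdfs` is an assumption about the
# cluster setup; adjust it if the superuser differs or Kerberos is enabled.
tenant_home_commands() {
  tenant="$1"
  echo "sudo -u hdfs hdfs dfs -mkdir -p /user/$tenant"
  echo "sudo -u hdfs hdfs dfs -chown $tenant /user/$tenant"
}
```

After reviewing the output of `tenant_home_commands etl_user`, pipe it to `sh` to apply it.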


### Appendix

1. `dolphinscheduler_env.sh` for version 3.1.9

   ```bash
   #
   # Licensed to the Apache Software Foundation (ASF) under one or more
   # contributor license agreements.  See the NOTICE file distributed with
   # this work for additional information regarding copyright ownership.
   # The ASF licenses this file to You under the Apache License, Version 2.0
   # (the "License"); you may not use this file except in compliance with
   # the License.  You may obtain a copy of the License at
   #
   #     http://www.apache.org/licenses/LICENSE-2.0
   #
   # Unless required by applicable law or agreed to in writing, software
   # distributed under the License is distributed on an "AS IS" BASIS,
   # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   # See the License for the specific language governing permissions and
   # limitations under the License.
   #


   # Never put sensitive config such as database password here in your production environment,
   # this file will be sourced everytime a new task is executed.

   # applicationId auto collection related configuration, the following configurations are unnecessary if setting appId.collect=log
   #export HADOOP_CLASSPATH=`hadoop classpath`:${DOLPHINSCHEDULER_HOME}/tools/libs/*
   #export SPARK_DIST_CLASSPATH=$HADOOP_CLASSPATH:$SPARK_DIST_CLASS_PATH
   #export HADOOP_CLIENT_OPTS="-javaagent:${DOLPHINSCHEDULER_HOME}/tools/libs/aspectjweaver-1.9.7.jar":$HADOOP_CLIENT_OPTS
   #export SPARK_SUBMIT_OPTS="-javaagent:${DOLPHINSCHEDULER_HOME}/tools/libs/aspectjweaver-1.9.7.jar":$SPARK_SUBMIT_OPTS
   #export FLINK_ENV_JAVA_OPTS="-javaagent:${DOLPHINSCHEDULER_HOME}/tools/libs/aspectjweaver-1.9.7.jar":$FLINK_ENV_JAVA_OPTS


   export JAVA_HOME=${JAVA_HOME:-/opt/java/jdk1.8.0_181/}

   export DATABASE=${DATABASE:-mysql}
   export SPRING_PROFILES_ACTIVE=${DATABASE}
   export SPRING_DATASOURCE_URL="jdbc:mysql://cdh-node-1/dolphinscheduler"
   export SPRING_DATASOURCE_USERNAME=dolphinscheduler
   # Single quotes keep the shell from interpreting special characters in the password
   export SPRING_DATASOURCE_PASSWORD='^Ws#nV4HvrXus*cpyv'

   # DolphinScheduler server related configuration
   export SPRING_CACHE_TYPE=${SPRING_CACHE_TYPE:-none}
   export SPRING_JACKSON_TIME_ZONE=${SPRING_JACKSON_TIME_ZONE:-GMT+8}
   export MASTER_FETCH_COMMAND_NUM=${MASTER_FETCH_COMMAND_NUM:-10}

   # Registry center configuration, determines the type and link of the registry center
   export REGISTRY_TYPE=${REGISTRY_TYPE:-zookeeper}
   export REGISTRY_ZOOKEEPER_CONNECT_STRING=${REGISTRY_ZOOKEEPER_CONNECT_STRING:-cdh-node-2:2181}

   # Tasks related configurations, need to change the configuration if you use the related tasks.
   export HADOOP_HOME=${HADOOP_HOME:-/opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop}
   export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/etc/hadoop/conf}
   export SPARK_HOME=${SPARK_HOME:-/opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/spark}
   export PYTHON_LAUNCHER=${PYTHON_LAUNCHER:-/opt/soft/python}
   export HIVE_HOME=${HIVE_HOME:-/opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hive}
   export FLINK_HOME=${FLINK_HOME:-/opt/soft/flink}
   export DATAX_LAUNCHER=${DATAX_LAUNCHER:-/opt/soft/datax/bin/python3}

   export PATH=$HADOOP_HOME/bin:$SPARK_HOME/bin:$PYTHON_LAUNCHER:$JAVA_HOME/bin:$HIVE_HOME/bin:$FLINK_HOME/bin:$DATAX_LAUNCHER:$PATH


   ```

2. `install_env.sh` for version 3.1.9

   ```bash
   #
   # Licensed to the Apache Software Foundation (ASF) under one or more
   # contributor license agreements.  See the NOTICE file distributed with
   # this work for additional information regarding copyright ownership.
   # The ASF licenses this file to You under the Apache License, Version 2.0
   # (the "License"); you may not use this file except in compliance with
   # the License.  You may obtain a copy of the License at
   #
   #     http://www.apache.org/licenses/LICENSE-2.0
   #
   # Unless required by applicable law or agreed to in writing, software
   # distributed under the License is distributed on an "AS IS" BASIS,
   # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   # See the License for the specific language governing permissions and
   # limitations under the License.
   #

   # ---------------------------------------------------------
   # INSTALL MACHINE
   # ---------------------------------------------------------
   # A comma separated list of machine hostname or IP would be installed DolphinScheduler,
   # including master, worker, api, alert. If you want to deploy in pseudo-distributed
   # mode, just write a pseudo-distributed hostname
   # Example for hostnames: ips="ds1,ds2,ds3,ds4,ds5", Example for IPs: ips="192.168.8.1,192.168.8.2,192.168.8.3,192.168.8.4,192.168.8.5"
   ips="cdh-node-1,cdh-node-2,cdh-node-3"

   # Port of SSH protocol, default value is 22. For now we only support same port in all `ips` machine
   # modify it if you use different ssh port
   sshPort=${sshPort:-"22"}

   # A comma separated list of machine hostname or IP would be installed Master server, it
   # must be a subset of configuration `ips`.
   # Example for hostnames: masters="ds1,ds2", Example for IPs: masters="192.168.8.1,192.168.8.2"
   masters="cdh-node-3"

   # A comma separated list of machine <hostname>:<workerGroup> or <IP>:<workerGroup>.All hostname or IP must be a
   # subset of configuration `ips`, And workerGroup have default value as `default`, but we recommend you declare behind the hosts
   # Example for hostnames: workers="ds1:default,ds2:default,ds3:default", Example for IPs: workers="192.168.8.1:default,192.168.8.2:default,192.168.8.3:default"
   workers="cdh-node-1:default,cdh-node-2:default,cdh-node-3:default"

   # A comma separated list of machine hostname or IP would be installed Alert server, it
   # must be a subset of configuration `ips`.
   # Example for hostname: alertServer="ds3", Example for IP: alertServer="192.168.8.3"
   alertServer="cdh-node-1"

   # A comma separated list of machine hostname or IP would be installed API server, it
   # must be a subset of configuration `ips`.
   # Example for hostname: apiServers="ds1", Example for IP: apiServers="192.168.8.1"
   apiServers="cdh-node-1"

   # The directory to install DolphinScheduler for all machine we config above. It will automatically be created by `install.sh` script if not exists.
   # Do not set this configuration same as the current path (pwd). Do not add quotes to it if you using related path.
   installPath="/opt/dolphinscheduler"

   # The user to deploy DolphinScheduler for all machine we config above. For now user must create by yourself before running `install.sh`
   # script. The user needs to have sudo privileges and permissions to operate hdfs. If hdfs is enabled than the root directory needs
   # to be created by this user
   deployUser="dolphinscheduler"

   # The root of zookeeper, for now DolphinScheduler default registry server is zookeeper.
   # It will delete ${zkRoot} in the zookeeper when you run install.sh, so please keep it same as registry.zookeeper.namespace in yml files.
   # Similarly, if you want to modify the value, please modify registry.zookeeper.namespace in yml files as well.
   zkRoot=${zkRoot:-"/dolphinscheduler"}
   ```

3. `common.properties` of the worker-server service for version 3.1.9

   ```bash
   #
   # Licensed to the Apache Software Foundation (ASF) under one or more
   # contributor license agreements.  See the NOTICE file distributed with
   # this work for additional information regarding copyright ownership.
   # The ASF licenses this file to You under the Apache License, Version 2.0
   # (the "License"); you may not use this file except in compliance with
   # the License.  You may obtain a copy of the License at
   #
   #     http://www.apache.org/licenses/LICENSE-2.0
   #
   # Unless required by applicable law or agreed to in writing, software
   # distributed under the License is distributed on an "AS IS" BASIS,
   # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   # See the License for the specific language governing permissions and
   # limitations under the License.
   #

   # user data local directory path, please make sure the directory exists and have read write permissions
   data.basedir.path=/tmp/dolphinscheduler

   # resource view suffixs
   #resource.view.suffixs=txt,log,sh,bat,conf,cfg,py,java,sql,xml,hql,properties,json,yml,yaml,ini,js

   # resource storage type: HDFS, S3, OSS, NONE
   resource.storage.type=HDFS
   # resource store on HDFS/S3 path, resource file will store to this base path, self configuration, please make sure the directory exists on hdfs and have read write permissions. "/dolphinscheduler" is recommended
   resource.storage.upload.base.path=/dolphinscheduler

   # The AWS access key. if resource.storage.type=S3 or use EMR-Task, This configuration is required
   resource.aws.access.key.id=minioadmin
   # The AWS secret access key. if resource.storage.type=S3 or use EMR-Task, This configuration is required
   resource.aws.secret.access.key=minioadmin
   # The AWS Region to use. if resource.storage.type=S3 or use EMR-Task, This configuration is required
   resource.aws.region=cn-north-1
   # The name of the bucket. You need to create them by yourself. Otherwise, the system cannot start. All buckets in Amazon S3 share a single namespace; ensure the bucket is given a unique name.
   resource.aws.s3.bucket.name=dolphinscheduler
   # You need to set this parameter when private cloud s3. If S3 uses public cloud, you only need to set resource.aws.region or set to the endpoint of a public cloud such as S3.cn-north-1.amazonaws.com.cn
   resource.aws.s3.endpoint=http://localhost:9000

   # alibaba cloud access key id, required if you set resource.storage.type=OSS
   resource.alibaba.cloud.access.key.id=<your-access-key-id>
   # alibaba cloud access key secret, required if you set resource.storage.type=OSS
   resource.alibaba.cloud.access.key.secret=<your-access-key-secret>
   # alibaba cloud region, required if you set resource.storage.type=OSS
   resource.alibaba.cloud.region=cn-hangzhou
   # oss bucket name, required if you set resource.storage.type=OSS
   resource.alibaba.cloud.oss.bucket.name=dolphinscheduler
   # oss bucket endpoint, required if you set resource.storage.type=OSS
   resource.alibaba.cloud.oss.endpoint=https://oss-cn-hangzhou.aliyuncs.com

   # if resource.storage.type=HDFS, the user must have the permission to create directories under the HDFS root path
   resource.hdfs.root.user=hdfs
   # if resource.storage.type=S3, the value like: s3a://dolphinscheduler; if resource.storage.type=HDFS and namenode HA is enabled, you need to copy core-site.xml and hdfs-site.xml to conf dir
   resource.hdfs.fs.defaultFS=hdfs://cdh-node-2:8020

   # whether to startup kerberos
   hadoop.security.authentication.startup.state=false

   # java.security.krb5.conf path
   java.security.krb5.conf.path=/opt/krb5.conf

   # login user from keytab username
   login.user.keytab.username=hdfs-mycluster@ESZ.COM

   # login user from keytab path
   login.user.keytab.path=/opt/hdfs.headless.keytab

   # kerberos expire time, the unit is hour
   kerberos.expire.time=2


   # resourcemanager port, the default value is 8088 if not specified
   resource.manager.httpaddress.port=8088
   # if resourcemanager HA is enabled, please set the HA IPs; if resourcemanager is single, keep this value empty
   yarn.resourcemanager.ha.rm.ids=cdh-node-2
   # if resourcemanager HA is enabled or not use resourcemanager, please keep the default value; If resourcemanager is single, you only need to replace ds1 to actual resourcemanager hostname
   yarn.application.status.address=http://cdh-node-2:%s/ws/v1/cluster/apps/%s
   # job history status url when application number threshold is reached(default 10000, maybe it was set to 1000)
   yarn.job.history.status.address=http://cdh-node-2:19888/ws/v1/history/mapreduce/jobs/%s

   # datasource encryption enable
   datasource.encryption.enable=false

   # datasource encryption salt
   datasource.encryption.salt=!@#$%^&*

   # data quality option
   data-quality.jar.name=dolphinscheduler-data-quality-3.1.9.jar

   #data-quality.error.output.path=/tmp/data-quality-error-data

   # Network IP gets priority, default inner outer

   # Whether hive SQL is executed in the same session
   support.hive.oneSession=false

   # use sudo or not, if set true, executing user is tenant user and deploy user needs sudo permissions; if set false, executing user is the deploy user and doesn't need sudo permissions
   sudo.enable=true
   setTaskDirToTenant.enable=false

   # network interface preferred like eth0, default: empty
   #dolphin.scheduler.network.interface.preferred=

   # network IP gets priority, default: inner outer
   #dolphin.scheduler.network.priority.strategy=default

   # system env path
   #dolphinscheduler.env.path=dolphinscheduler_env.sh

   # development state
   development.state=false

   # rpc port
   alert.rpc.port=50052

   # set path of conda.sh
   conda.path=/opt/anaconda3/etc/profile.d/conda.sh

   # Task resource limit state
   task.resource.limit.state=false

   # mlflow task plugin preset repository
   ml.mlflow.preset_repository=https://github.com/apache/dolphinscheduler-mlflow
   # mlflow task plugin preset repository version
   ml.mlflow.preset_repository_version="main"
   ```