# DolphinScheduler Cluster Deployment

[TOC]

### Environment preparation

#### Server setup

Prepare three machines for DolphinScheduler. Deploying on the CDH nodes is preferable but not required; the machines can be shared with CDH. The environment needs:

1. Hostname-to-IP mappings
2. A ZooKeeper installation (DolphinScheduler 3.0 and later requires ZooKeeper 3.8)
3. A Spark environment
4. A MySQL or PostgreSQL database

#### Passwordless login between nodes

1. Create the user `dolphinscheduler` and switch to it.
2. Generate a key pair (run as `dolphinscheduler` on all three machines): `ssh-keygen`, pressing Enter at every prompt.
3. Distribute the key (run as `dolphinscheduler` on all three machines): `ssh-copy-id dolphinscheduler@hostname1`, `ssh-copy-id dolphinscheduler@hostname2`
4. Test the passwordless login: `ssh hostname1`, `ssh hostname2`

### Installation

On all three machines, create a `dolphinscheduler` directory under `/opt/` and set its owner to the `dolphinscheduler` user.

1. Log in to node 1 as the `dolphinscheduler` user, change to the `~` directory, and download the binary package `https://dlcdn.apache.org/dolphinscheduler/3.2.1/apache-dolphinscheduler-3.2.1-bin.tar.gz`

2. Extract the archive with `tar -zxvf apache-dolphinscheduler-3.2.1-bin.tar.gz`. The resulting `apache-dolphinscheduler-3.2.1-bin` directory is the source directory and must not be deleted.

3. Configure the installation. Go to `apache-dolphinscheduler-3.2.1-bin/bin/env`. Edit `dolphinscheduler_env.sh` to set the database connection, the time zone, ZooKeeper, and the paths of the individual components. Edit `install_env.sh` to assign nodes and to set the installation information, the installation path, and the ZooKeeper node.

4. Configure the global file system. Edit the `common.properties` file of every module. Find `resource.storage.type=HDFS`, choose the appropriate storage type, and fill in the matching settings below it. If HDFS is secured with Kerberos, also enable `hadoop.security.authentication.startup.state` (it is `false` in the listing below) and fill in the Kerberos user name and keytab settings that follow it. The full set of options:

```bash
# if resource.storage.type=HDFS, the user must have the permission to create directories under the HDFS root path
resource.hdfs.root.user=hdfs
# if resource.storage.type=S3, the value like: s3a://dolphinscheduler; if resource.storage.type=HDFS and namenode HA is enabled, you need to copy core-site.xml and hdfs-site.xml to conf dir
resource.hdfs.fs.defaultFS=hdfs://cdh-node-2:8020

# whether to startup kerberos
hadoop.security.authentication.startup.state=false

# java.security.krb5.conf path
java.security.krb5.conf.path=/opt/krb5.conf

# login user from keytab username
login.user.keytab.username=hdfs-mycluster@ESZ.COM

# login user from keytab path
login.user.keytab.path=/opt/hdfs.headless.keytab

# kerberos expire time, the unit is hour
kerberos.expire.time=2
```

5. Configure data quality checks. In the `common.properties` of every module, make sure the name set in `data-quality.jar.name=` matches the name of the jar in that module's `libs` directory. Then copy the JDBC drivers for the data source types you use, plus the JDBC driver for the database that stores the DolphinScheduler metadata, into the `libs` directories of api-server and worker-server.

6. Configure YARN. Change the YARN hostname and port:

```bash
# resourcemanager port, the default value is 8088 if not specified
resource.manager.httpaddress.port=8088
# if resourcemanager HA is enabled, please set the HA IPs; if resourcemanager is single, keep this value empty
yarn.resourcemanager.ha.rm.ids=cdh-node-2
# if resourcemanager HA is enabled or not use resourcemanager, please keep the default value; If resourcemanager is single, you only need to replace ds1 to actual resourcemanager hostname
yarn.application.status.address=http://cdh-node-2:%s/ws/v1/cluster/apps/%s
# job history status url when application number threshold is reached(default 10000, maybe it was set to 1000)
yarn.job.history.status.address=http://cdh-node-2:19888/ws/v1/history/mapreduce/jobs/%s
```

7. Run the installation. Execute `bin/install.sh`. When it finishes, open `http://API_SERVER_HOST:12345/dolphinscheduler/ui/login`, where `API_SERVER_HOST` is the hostname of the api-server, and log in with `admin` / `dolphinscheduler123`.

### Changing the configuration

Configuration changes are still made in the source directory: edit the corresponding files there, then run `bin/install.sh` again. To remove a dependency or file, delete it from the installation directory first and then from the source directory.

### Common problems

1. A newer DolphinScheduler with an older ZooKeeper: replace the ZooKeeper, Curator, and related dependencies of every module with the lower versions.
2. Cannot connect to the Hive database: replace the Hive-related dependencies in the `libs` directory with the dependencies of the same name from CDH.
3. Data quality tasks fail under multi-tenancy: create the tenant's home directory under `/user/` in the global file system and set its owner to the tenant.

### Appendix

1. `dolphinscheduler_env.sh` for version 3.1.9

```bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Never put sensitive config such as database password here in your production environment,
# this file will be sourced everytime a new task is executed.

# applicationId auto collection related configuration, the following configurations are unnecessary if setting appId.collect=log
#export HADOOP_CLASSPATH=`hadoop classpath`:${DOLPHINSCHEDULER_HOME}/tools/libs/*
#export SPARK_DIST_CLASSPATH=$HADOOP_CLASSPATH:$SPARK_DIST_CLASS_PATH
#export HADOOP_CLIENT_OPTS="-javaagent:${DOLPHINSCHEDULER_HOME}/tools/libs/aspectjweaver-1.9.7.jar":$HADOOP_CLIENT_OPTS
#export SPARK_SUBMIT_OPTS="-javaagent:${DOLPHINSCHEDULER_HOME}/tools/libs/aspectjweaver-1.9.7.jar":$SPARK_SUBMIT_OPTS
#export FLINK_ENV_JAVA_OPTS="-javaagent:${DOLPHINSCHEDULER_HOME}/tools/libs/aspectjweaver-1.9.7.jar":$FLINK_ENV_JAVA_OPTS

export JAVA_HOME=${JAVA_HOME:-/opt/java/jdk1.8.0_181/}

export DATABASE=${DATABASE:-mysql}
export SPRING_PROFILES_ACTIVE=${DATABASE}
export SPRING_DATASOURCE_URL="jdbc:mysql://cdh-node-1/dolphinscheduler"
export SPRING_DATASOURCE_USERNAME=dolphinscheduler
export SPRING_DATASOURCE_PASSWORD='^Ws#nV4HvrXus*cpyv'

# DolphinScheduler server related configuration
export SPRING_CACHE_TYPE=${SPRING_CACHE_TYPE:-none}
export SPRING_JACKSON_TIME_ZONE=${SPRING_JACKSON_TIME_ZONE:-GMT+8}
export MASTER_FETCH_COMMAND_NUM=${MASTER_FETCH_COMMAND_NUM:-10}

# Registry center configuration, determines the type and link of the registry center
export REGISTRY_TYPE=${REGISTRY_TYPE:-zookeeper}
export REGISTRY_ZOOKEEPER_CONNECT_STRING=${REGISTRY_ZOOKEEPER_CONNECT_STRING:-cdh-node-2:2181}

# Tasks related configurations, need to change the configuration if you use the related tasks.
export HADOOP_HOME=${HADOOP_HOME:-/opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop}
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/etc/hadoop/conf}
export SPARK_HOME=${SPARK_HOME:-/opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/spark}
export PYTHON_LAUNCHER=${PYTHON_LAUNCHER:-/opt/soft/python}
export HIVE_HOME=${HIVE_HOME:-/opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hive}
export FLINK_HOME=${FLINK_HOME:-/opt/soft/flink}
export DATAX_LAUNCHER=${DATAX_LAUNCHER:-/opt/soft/datax/bin/python3}

export PATH=$HADOOP_HOME/bin:$SPARK_HOME/bin:$PYTHON_LAUNCHER:$JAVA_HOME/bin:$HIVE_HOME/bin:$FLINK_HOME/bin:$DATAX_LAUNCHER:$PATH
```

2. `install_env.sh` for version 3.1.9

```bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# ---------------------------------------------------------
# INSTALL MACHINE
# ---------------------------------------------------------
# A comma separated list of machine hostname or IP would be installed DolphinScheduler,
# including master, worker, api, alert.
# If you want to deploy in pseudo-distributed
# mode, just write a pseudo-distributed hostname
# Example for hostnames: ips="ds1,ds2,ds3,ds4,ds5", Example for IPs: ips="192.168.8.1,192.168.8.2,192.168.8.3,192.168.8.4,192.168.8.5"
ips="cdh-node-1,cdh-node-2,cdh-node-3"

# Port of SSH protocol, default value is 22. For now we only support same port in all `ips` machine
# modify it if you use different ssh port
sshPort=${sshPort:-"22"}

# A comma separated list of machine hostname or IP would be installed Master server, it
# must be a subset of configuration `ips`.
# Example for hostnames: masters="ds1,ds2", Example for IPs: masters="192.168.8.1,192.168.8.2"
masters="cdh-node-3"

# A comma separated list of machine <hostname>:<workerGroup> or <IP>:<workerGroup>.All hostname or IP must be a
# subset of configuration `ips`, And workerGroup have default value as `default`, but we recommend you declare behind the hosts
# Example for hostnames: workers="ds1:default,ds2:default,ds3:default", Example for IPs: workers="192.168.8.1:default,192.168.8.2:default,192.168.8.3:default"
workers="cdh-node-1:default,cdh-node-2:default,cdh-node-3:default"

# A comma separated list of machine hostname or IP would be installed Alert server, it
# must be a subset of configuration `ips`.
# Example for hostname: alertServer="ds3", Example for IP: alertServer="192.168.8.3"
alertServer="cdh-node-1"

# A comma separated list of machine hostname or IP would be installed API server, it
# must be a subset of configuration `ips`.
# Example for hostname: apiServers="ds1", Example for IP: apiServers="192.168.8.1"
apiServers="cdh-node-1"

# The directory to install DolphinScheduler for all machine we config above. It will automatically be created by `install.sh` script if not exists.
# Do not set this configuration same as the current path (pwd). Do not add quotes to it if you using related path.
installPath="/opt/dolphinscheduler"

# The user to deploy DolphinScheduler for all machine we config above.
# For now user must create by yourself before running `install.sh`
# script. The user needs to have sudo privileges and permissions to operate hdfs. If hdfs is enabled than the root directory needs
# to be created by this user
deployUser="dolphinscheduler"

# The root of zookeeper, for now DolphinScheduler default registry server is zookeeper.
# It will delete ${zkRoot} in the zookeeper when you run install.sh, so please keep it same as registry.zookeeper.namespace in yml files.
# Similarly, if you want to modify the value, please modify registry.zookeeper.namespace in yml files as well.
zkRoot=${zkRoot:-"/dolphinscheduler"}
```

3. `common.properties` of the worker-server service for version 3.1.9

```bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# user data local directory path, please make sure the directory exists and have read write permissions
data.basedir.path=/tmp/dolphinscheduler

# resource view suffixs
#resource.view.suffixs=txt,log,sh,bat,conf,cfg,py,java,sql,xml,hql,properties,json,yml,yaml,ini,js

# resource storage type: HDFS, S3, OSS, NONE
resource.storage.type=HDFS
# resource store on HDFS/S3 path, resource file will store to this base path, self configuration, please make sure the directory exists on hdfs and have read write permissions.
# "/dolphinscheduler" is recommended
resource.storage.upload.base.path=/dolphinscheduler

# The AWS access key. if resource.storage.type=S3 or use EMR-Task, This configuration is required
resource.aws.access.key.id=minioadmin
# The AWS secret access key. if resource.storage.type=S3 or use EMR-Task, This configuration is required
resource.aws.secret.access.key=minioadmin
# The AWS Region to use. if resource.storage.type=S3 or use EMR-Task, This configuration is required
resource.aws.region=cn-north-1
# The name of the bucket. You need to create them by yourself. Otherwise, the system cannot start. All buckets in Amazon S3 share a single namespace; ensure the bucket is given a unique name.
resource.aws.s3.bucket.name=dolphinscheduler
# You need to set this parameter when private cloud s3. If S3 uses public cloud, you only need to set resource.aws.region or set to the endpoint of a public cloud such as S3.cn-north-1.amazonaws.com.cn
resource.aws.s3.endpoint=http://localhost:9000

# alibaba cloud access key id, required if you set resource.storage.type=OSS
resource.alibaba.cloud.access.key.id=
# alibaba cloud access key secret, required if you set resource.storage.type=OSS
resource.alibaba.cloud.access.key.secret=
# alibaba cloud region, required if you set resource.storage.type=OSS
resource.alibaba.cloud.region=cn-hangzhou
# oss bucket name, required if you set resource.storage.type=OSS
resource.alibaba.cloud.oss.bucket.name=dolphinscheduler
# oss bucket endpoint, required if you set resource.storage.type=OSS
resource.alibaba.cloud.oss.endpoint=https://oss-cn-hangzhou.aliyuncs.com

# if resource.storage.type=HDFS, the user must have the permission to create directories under the HDFS root path
resource.hdfs.root.user=hdfs
# if resource.storage.type=S3, the value like: s3a://dolphinscheduler; if resource.storage.type=HDFS and namenode HA is enabled, you need to copy core-site.xml and hdfs-site.xml to conf dir
resource.hdfs.fs.defaultFS=hdfs://cdh-node-2:8020

# whether to
# startup kerberos
hadoop.security.authentication.startup.state=false

# java.security.krb5.conf path
java.security.krb5.conf.path=/opt/krb5.conf

# login user from keytab username
login.user.keytab.username=hdfs-mycluster@ESZ.COM

# login user from keytab path
login.user.keytab.path=/opt/hdfs.headless.keytab

# kerberos expire time, the unit is hour
kerberos.expire.time=2

# resourcemanager port, the default value is 8088 if not specified
resource.manager.httpaddress.port=8088
# if resourcemanager HA is enabled, please set the HA IPs; if resourcemanager is single, keep this value empty
yarn.resourcemanager.ha.rm.ids=cdh-node-2
# if resourcemanager HA is enabled or not use resourcemanager, please keep the default value; If resourcemanager is single, you only need to replace ds1 to actual resourcemanager hostname
yarn.application.status.address=http://cdh-node-2:%s/ws/v1/cluster/apps/%s
# job history status url when application number threshold is reached(default 10000, maybe it was set to 1000)
yarn.job.history.status.address=http://cdh-node-2:19888/ws/v1/history/mapreduce/jobs/%s

# datasource encryption enable
datasource.encryption.enable=false

# datasource encryption salt
datasource.encryption.salt=!@#$%^&*

# data quality option
data-quality.jar.name=dolphinscheduler-data-quality-3.1.9.jar

#data-quality.error.output.path=/tmp/data-quality-error-data

# Network IP gets priority, default inner outer

# Whether hive SQL is executed in the same session
support.hive.oneSession=false

# use sudo or not, if set true, executing user is tenant user and deploy user needs sudo permissions; if set false, executing user is the deploy user and doesn't need sudo permissions
sudo.enable=true
setTaskDirToTenant.enable=false

# network interface preferred like eth0, default: empty
#dolphin.scheduler.network.interface.preferred=

# network IP gets priority, default: inner outer
#dolphin.scheduler.network.priority.strategy=default

# system env path
#dolphinscheduler.env.path=dolphinscheduler_env.sh

# development state
development.state=false

# rpc port
alert.rpc.port=50052

# set path of conda.sh
conda.path=/opt/anaconda3/etc/profile.d/conda.sh

# Task resource limit state
task.resource.limit.state=false

# mlflow task plugin preset repository
ml.mlflow.preset_repository=https://github.com/apache/dolphinscheduler-mlflow
# mlflow task plugin preset repository version
ml.mlflow.preset_repository_version="main"
```
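
Step 5 requires the name in `data-quality.jar.name` to match a jar in the module's `libs` directory. That check can be sketched as a small shell helper. This is a minimal sketch and not part of DolphinScheduler: the function names `get_prop` and `check_dq_jar` are hypothetical. It also assumes that each module directory contains `conf/common.properties` and `libs/`, and that the property is written as a plain `key=value` line.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not shipped with DolphinScheduler.

# get_prop FILE KEY
# Print the value of KEY from a Java-style .properties FILE.
# Only plain "key=value" lines are handled; the last match wins.
get_prop() {
  awk -v key="$2" '
    index($0, key "=") == 1 { value = substr($0, length(key) + 2) }
    END { print value }
  ' "$1"
}

# check_dq_jar MODULE_DIR
# Succeed if MODULE_DIR/libs contains the jar named by data-quality.jar.name
# in MODULE_DIR/conf/common.properties (this directory layout is an assumption).
check_dq_jar() {
  local module_dir="$1" jar
  jar="$(get_prop "$module_dir/conf/common.properties" data-quality.jar.name)"
  if [ -n "$jar" ] && [ -f "$module_dir/libs/$jar" ]; then
    echo "OK: $module_dir/libs/$jar"
  else
    echo "MISSING: $module_dir/libs/${jar:-(data-quality.jar.name not set)}"
    return 1
  fi
}

# Example call, run from the directory that holds the source directory:
#   check_dq_jar apache-dolphinscheduler-3.2.1-bin/worker-server
```

Running it against each module directory in the source directory before `bin/install.sh` would catch a mismatched jar name early; the helper only reads files and changes nothing.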