Format conversion

parent b77fd1ccd1 · commit 38ae95ae97
**Spark SQL:**

1. [DataFrames and Datasets](https://github.com/heibaiying/BigData-Notes/blob/master/notes/SparkSQL_Dataset和DataFrame简介.md)
2. [Basic use of the Structured API](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_Structured_API的基本使用.md)
3. External data sources
4. [Common Spark SQL aggregate functions](https://github.com/heibaiying/BigData-Notes/blob/master/notes/SparkSQL常用聚合函数.md)
5. Join operations
# Aggregations

<nav>
<a href="#一简单聚合">1. Simple Aggregations</a><br/>
<a href="#11-数据准备">1.1 Preparing the Data</a><br/>
<a href="#12-count">1.2 count</a><br/>
<a href="#13-countDistinct">1.3 countDistinct</a><br/>
<a href="#14-approx_count_distinct">1.4 approx_count_distinct</a><br/>
<a href="#15-first--last">1.5 first & last</a><br/>
<a href="#16-min--max">1.6 min & max</a><br/>
<a href="#17-sum--sumDistinct">1.7 sum & sumDistinct</a><br/>
<a href="#18-avg">1.8 avg</a><br/>
<a href="#19-数学函数">1.9 Mathematical Functions</a><br/>
<a href="#110-聚合数据到集合">1.10 Aggregating into Collections</a><br/>
<a href="#二分组聚合">2. Grouped Aggregations</a><br/>
<a href="#21-简单分组">2.1 Simple Grouping</a><br/>
<a href="#22-分组聚合">2.2 Grouped Aggregation</a><br/>
<a href="#三自定义聚合函数">3. User-Defined Aggregate Functions</a><br/>
<a href="#31-有类型的自定义函数">3.1 Typed UDAFs</a><br/>
<a href="#32-无类型的自定义聚合函数">3.2 Untyped UDAFs</a><br/>
</nav>

## 1. Simple Aggregations

### 1.1 Preparing the Data
A user-defined aggregate function requires implementing quite a few methods, so the diagram below illustrates the execution flow and the role of each method:

<div align="center"> <img src="https://github.com/heibaiying/BigData-Notes/blob/master/pictures/spark-sql-自定义函数.png"/> </div>
|
@ -1,5 +1,14 @@
|
||||
# Structured API基本使用
|
||||
|
||||
<nav>
|
||||
<a href="#一创建DataFrames">一、创建DataFrames</a><br/>
|
||||
<a href="#二DataFrames基本操作">二、DataFrames基本操作</a><br/>
|
||||
<a href="#三创建Datasets">三、创建Datasets</a><br/>
|
||||
<a href="#四DataFrames与Datasets互相转换">四、DataFrames与Datasets互相转换</a><br/>
|
||||
<a href="#五RDDs转换为DataFramesDatasets">五、RDDs转换为DataFrames\Datasets</a><br/>
|
||||
</nav>
|
||||
|
||||
|
||||
## 一、创建DataFrames
|
||||
|
||||
Spark中所有功能的入口点是`SparkSession`,可以使用`SparkSession.builder()`创建。创建后应用程序就可以从现有RDD,Hive表或Spark数据源创建DataFrame。如下所示:
|
||||
You can test this in `spark-shell`. Note that when `spark-shell` starts, it automatically creates a `SparkSession` named `spark`, which can be referenced directly on the command line:

<div align="center"> <img src="https://github.com/heibaiying/BigData-Notes/blob/master/pictures/spark-sql-shell.png"/> </div>
## 2. Basic DataFrame Operations

```scala
df.printSchema()
```

<div align="center"> <img src="https://github.com/heibaiying/BigData-Notes/blob/master/pictures/spark-scheme.png"/> </div>

### 2.2 Basic Queries with the DataFrame API
# Building a Highly Available Kafka Cluster with Zookeeper

<nav>
<a href="#一Zookeeper集群搭建">1. Setting Up the Zookeeper Cluster</a><br/>
<a href="#11-下载--解压">1.1 Download & Extract</a><br/>
<a href="#12-修改配置">1.2 Edit the Configuration</a><br/>
<a href="#13-标识节点">1.3 Assign Node IDs</a><br/>
<a href="#14-启动集群">1.4 Start the Cluster</a><br/>
<a href="#15-集群验证">1.5 Verify the Cluster</a><br/>
<a href="#二Kafka集群搭建">2. Setting Up the Kafka Cluster</a><br/>
<a href="#21-下载解压">2.1 Download & Extract</a><br/>
<a href="#22-拷贝配置文件">2.2 Copy the Configuration File</a><br/>
<a href="#23-修改配置">2.3 Edit the Configuration</a><br/>
<a href="#24-启动集群">2.4 Start the Cluster</a><br/>
<a href="#25-创建测试主题">2.5 Create a Test Topic</a><br/>
</nav>
## 1. Setting Up the Zookeeper Cluster

To keep the cluster highly available, a Zookeeper ensemble should have an odd number of nodes, and at least three of them, so here we set up a three-node cluster.

### 1.1 Download & Extract

Download the desired Zookeeper version, `3.4.14` in this case, from the official archive: https://archive.apache.org/dist/zookeeper/

```shell
# Download
wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz
# Extract
tar -zxvf zookeeper-3.4.14.tar.gz
```
### 1.2 Edit the Configuration

Make three copies of the extracted Zookeeper package. In each copy's `conf` directory, copy the sample configuration `zoo_sample.cfg` to `zoo.cfg` and edit it. The three resulting configuration files are as follows:

zookeeper01:

```shell
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper-cluster/data/01
dataLogDir=/usr/local/zookeeper-cluster/log/01
clientPort=2181

# In server.1, the "1" is the server ID. It can be any valid number identifying
# this node, and it must also be written to the myid file under dataDir.
# Each entry specifies the peer-communication port and the leader-election port.
server.1=127.0.0.1:2287:3387
server.2=127.0.0.1:2288:3388
server.3=127.0.0.1:2289:3389
```

> When the nodes run on separate machines, every node can use the same communication and election ports; just change the IP addresses to each node's host IP.
zookeeper02, which differs from zookeeper01 only in `dataDir`, `dataLogDir`, and `clientPort`:

```shell
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper-cluster/data/02
dataLogDir=/usr/local/zookeeper-cluster/log/02
clientPort=2182

server.1=127.0.0.1:2287:3387
server.2=127.0.0.1:2288:3388
server.3=127.0.0.1:2289:3389
```
zookeeper03, which again differs from zookeeper01 and 02 only in `dataDir`, `dataLogDir`, and `clientPort`:

```shell
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper-cluster/data/03
dataLogDir=/usr/local/zookeeper-cluster/log/03
clientPort=2183

server.1=127.0.0.1:2287:3387
server.2=127.0.0.1:2288:3388
server.3=127.0.0.1:2289:3389
```
> Configuration parameters:
>
> - **tickTime**: the basic time unit used for all timing calculations. For example, a session timeout is N * tickTime;
> - **initLimit**: cluster-only; how long followers may take to connect and sync to the leader on startup, expressed as a multiple of tickTime;
> - **syncLimit**: cluster-only; how long the leader waits for a follower to answer a message (the heartbeat mechanism), also in ticks;
> - **dataDir**: where snapshot data is stored;
> - **dataLogDir**: where transaction logs are stored;
> - **clientPort**: the port clients connect to; defaults to 2181.
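As a quick sanity check of the tick arithmetic above, the effective timeouts implied by the configuration shown can be computed directly (a minimal sketch; the values mirror the zoo.cfg listings above):

```shell
# Effective Zookeeper timeouts, derived from the zoo.cfg values above.
tickTime=2000   # ms per tick
initLimit=10    # ticks allowed for initial follower sync
syncLimit=5     # ticks allowed for a follower to answer the leader
echo "init timeout: $(( tickTime * initLimit )) ms"
echo "sync timeout: $(( tickTime * syncLimit )) ms"
```

With these defaults a follower gets 20 seconds to sync on startup and 10 seconds to answer a heartbeat before it is dropped.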
### 1.3 Assign Node IDs

In each node's data directory, create a `myid` file containing that node's ID. Zookeeper uses the `myid` file to identify cluster members, and uses the communication and election ports configured above to talk to the other nodes and elect a leader.

Create the data directories:

```shell
# dataDir for node 01
mkdir -vp /usr/local/zookeeper-cluster/data/01
# dataDir for node 02
mkdir -vp /usr/local/zookeeper-cluster/data/02
# dataDir for node 03
mkdir -vp /usr/local/zookeeper-cluster/data/03
```
Write each node's ID into its `myid` file:

```shell
# server 1
echo "1" > /usr/local/zookeeper-cluster/data/01/myid
# server 2
echo "2" > /usr/local/zookeeper-cluster/data/02/myid
# server 3
echo "3" > /usr/local/zookeeper-cluster/data/03/myid
```
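The directory and `myid` steps can also be collapsed into a single loop; a sketch, using `/tmp/zookeeper-cluster/data` as a stand-in for the real dataDir root:

```shell
# Create all three data directories and their myid files in one pass.
# BASE is a stand-in path; point it at your real dataDir root.
BASE=/tmp/zookeeper-cluster/data
for i in 1 2 3; do
  mkdir -p "$BASE/0$i"
  echo "$i" > "$BASE/0$i/myid"
done
cat "$BASE"/0*/myid   # prints 1, 2, 3
```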
### 1.4 Start the Cluster

Start each of the three nodes:

```shell
# Start node 1
/usr/app/zookeeper-cluster/zookeeper01/bin/zkServer.sh start
# Start node 2
/usr/app/zookeeper-cluster/zookeeper02/bin/zkServer.sh start
# Start node 3
/usr/app/zookeeper-cluster/zookeeper03/bin/zkServer.sh start
```
### 1.5 Verify the Cluster

Use `jps` to check that the processes are running, and `zkServer.sh status` to check each node's role. In the screenshot below, all three processes started successfully, with two nodes acting as followers and one as the leader.

<div align="center"> <img src="https://github.com/heibaiying/BigData-Notes/blob/master/pictures/zookeeper-cluster.png"/> </div>
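A quick way to tally the roles is to grep the `Mode:` line out of each node's status output; a sketch over sample text (the output below is illustrative, not captured from a live cluster):

```shell
# Count leaders and followers from zkServer.sh status output.
# status_output is sample text standing in for the real command output.
status_output='Mode: follower
Mode: follower
Mode: leader'
leaders=$(printf '%s\n' "$status_output" | grep -c 'Mode: leader')
followers=$(printf '%s\n' "$status_output" | grep -c 'Mode: follower')
echo "$leaders leader, $followers followers"   # 1 leader, 2 followers
```

A healthy three-node ensemble should always report exactly one leader.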
## 2. Setting Up the Kafka Cluster

### 2.1 Download & Extract

The official Kafka download page is http://kafka.apache.org/downloads ; this example uses version `2.2.0`:

```shell
# Download
wget https://www-eu.apache.org/dist/kafka/2.2.0/kafka_2.12-2.2.0.tgz
# Extract
tar -xzf kafka_2.12-2.2.0.tgz
```
> A note on Kafka package naming: in `kafka_2.12-2.2.0.tgz`, the leading 2.12 is the Scala version (Kafka is written in Scala), and the trailing 2.2.0 is the Kafka version.
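The naming rule above can be checked mechanically with plain shell parameter expansion; a small sketch:

```shell
# Split a Kafka tarball name into its Scala and Kafka versions.
name=kafka_2.12-2.2.0.tgz
base=${name%.tgz}       # kafka_2.12-2.2.0
vers=${base#kafka_}     # 2.12-2.2.0
scala_ver=${vers%%-*}   # 2.12
kafka_ver=${vers#*-}    # 2.2.0
echo "Scala $scala_ver, Kafka $kafka_ver"   # Scala 2.12, Kafka 2.2.0
```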
### 2.2 Copy the Configuration File

In the `config` directory under the extracted package, make three copies of the configuration file:

```shell
cp server.properties server-1.properties
cp server.properties server-2.properties
cp server.properties server-3.properties
```
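The same copies can be produced in a loop; a self-contained sketch where `/tmp/kafka-config-demo` and the one-line properties file stand in for the real config directory and file:

```shell
# Make three numbered copies of server.properties, as in the step above.
demo=/tmp/kafka-config-demo
mkdir -p "$demo" && cd "$demo"
printf 'broker.id=0\n' > server.properties   # stand-in for the real file
for i in 1 2 3; do
  cp server.properties "server-$i.properties"
done
ls server-*.properties
```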
### 2.3 Edit the Configuration

Edit the relevant settings in each of the three files as follows:

server-1.properties:

```properties
# The id of the broker; must be unique within the cluster
broker.id=0
# Listener address
listeners=PLAINTEXT://hadoop001:9092
# Log file location
log.dirs=/usr/local/kafka-logs/00
# Zookeeper connection string
zookeeper.connect=hadoop001:2181,hadoop001:2182,hadoop001:2183
```
server-2.properties:

```properties
broker.id=1
listeners=PLAINTEXT://hadoop001:9093
log.dirs=/usr/local/kafka-logs/01
zookeeper.connect=hadoop001:2181,hadoop001:2182,hadoop001:2183
```
server-3.properties:

```properties
broker.id=2
listeners=PLAINTEXT://hadoop001:9094
log.dirs=/usr/local/kafka-logs/02
zookeeper.connect=hadoop001:2181,hadoop001:2182,hadoop001:2183
```
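Since the three broker files differ only in `broker.id`, the listener port, and `log.dirs`, they can also be generated from a template; a sketch writing into a scratch directory (`/tmp/kafka-conf-gen` is a stand-in, and the host name `hadoop001` mirrors the listings above):

```shell
# Generate server-1..3.properties, varying only id, port, and log dir.
out=/tmp/kafka-conf-gen
mkdir -p "$out"
for i in 1 2 3; do
  id=$(( i - 1 ))
  port=$(( 9091 + i ))          # 9092, 9093, 9094
  cat > "$out/server-$i.properties" <<EOF
broker.id=$id
listeners=PLAINTEXT://hadoop001:$port
log.dirs=/usr/local/kafka-logs/0$id
zookeeper.connect=hadoop001:2181,hadoop001:2182,hadoop001:2183
EOF
done
grep -h broker.id "$out"/server-*.properties
```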
### 2.4 Start the Cluster

Start the three Kafka nodes, each with its own configuration file. Afterwards, `jps` should show three zookeeper processes and three kafka processes:

```shell
bin/kafka-server-start.sh config/server-1.properties
bin/kafka-server-start.sh config/server-2.properties
bin/kafka-server-start.sh config/server-3.properties
```
### 2.5 Create a Test Topic

Create a test topic:

```shell
bin/kafka-topics.sh --create --bootstrap-server hadoop001:9092 --replication-factor 3 --partitions 1 --topic my-replicated-topic
```

Then inspect the topic you just created:

```shell
bin/kafka-topics.sh --describe --bootstrap-server hadoop001:9092 --topic my-replicated-topic
```

<div align="center"> <img src="https://github.com/heibaiying/BigData-Notes/blob/master/pictures/kafka-cluster-shell.png"/> </div>
You can also start a console producer and consumer to check end-to-end connectivity:

```shell
# Start a producer
bin/kafka-console-producer.sh --broker-list hadoop001:9093 --topic my-replicated-topic
```

```shell
# Start a consumer
bin/kafka-console-consumer.sh --bootstrap-server hadoop001:9094 --from-beginning --topic my-replicated-topic
```