spark standalone environment setup

This commit is contained in:
parent a5b8c61ce3
commit b05b437421

@ -1,20 +1,20 @@
# Installing the JDK on Linux

>**System environment**: CentOS 7.6
>
>**JDK version**: jdk 1.8.0_201

## Installation steps

### 1. Download the JDK archive

Download the required JDK version from the [official site](https://www.oracle.com/technetwork/java/javase/downloads/index.html) and upload it to the corresponding location on the server. (Here we download [jdk1.8](https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html) and upload it to the server's /usr/java/ directory.)

### 2. Extract the archive

```shell
[root@ java]# tar -zxvf jdk-8u201-linux-x64.tar.gz
```
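The diff truncates here, before the environment-variable step. A typical follow-up, sketched under the assumption that the archive was extracted to /usr/java/jdk1.8.0_201 (the path used later in spark-env.sh), is to export JAVA_HOME and put its bin/ on PATH:

```shell
# Sketch: point JAVA_HOME at the extracted JDK and prepend its bin/ to PATH
# (usually appended to /etc/profile and activated with `source /etc/profile`)
export JAVA_HOME=/usr/java/jdk1.8.0_201
export PATH=${JAVA_HOME}/bin:$PATH
# java now resolves from the JDK directory
echo "${JAVA_HOME}/bin/java"
```

After sourcing the profile, `java -version` should report the installed 1.8.0_201 build.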

121 notes/installation/Spark单机版本环境搭建.md (new file)
@ -0,0 +1,121 @@
# Spark Standalone Environment Setup

>**System environment**: CentOS 7.6
>
>**Spark version**: spark-2.2.3-bin-hadoop2.6

### 1. Download the Spark package

Official download page: http://spark.apache.org/downloads.html

Spark is usually deployed alongside Hadoop, so when downloading, choose the Spark version together with its matching Hadoop build:

<div align="center"> <img width="600px" src="https://github.com/heibaiying/BigData-Notes/blob/master/pictures/spark-download.png"/> </div>

### 2. Extract the package

```shell
# tar -zxvf spark-2.2.3-bin-hadoop2.6.tgz
```

### 3. Configure environment variables

```shell
# vim /etc/profile
```

Add the environment variables:

```shell
export SPARK_HOME=/usr/app/spark-2.2.3-bin-hadoop2.6
export PATH=${SPARK_HOME}/bin:$PATH
```

Make the configuration take effect:

```shell
# source /etc/profile
```
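As a quick sanity check of what the two export lines do: the Spark bin directory is prepended to PATH, so its commands take precedence in name lookups. A sketch reproducing the fragment with the paths above:

```shell
# Reproduce the profile fragment and confirm the PATH ordering
SPARK_HOME=/usr/app/spark-2.2.3-bin-hadoop2.6
PATH=${SPARK_HOME}/bin:$PATH
# The first PATH entry is now Spark's bin directory
echo "$PATH" | cut -d: -f1
```

This prints `/usr/app/spark-2.2.3-bin-hadoop2.6/bin`; afterwards `spark-shell` and the other Spark commands resolve from there.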

### 4. Start Spark in standalone mode

Go to the `${SPARK_HOME}/conf/` directory, copy the configuration template, and edit it:

```shell
# cp spark-env.sh.template spark-env.sh
```

Add the following settings to `spark-env.sh`:

```shell
# Master host address
SPARK_MASTER_HOST=hadoop001
# Maximum number of concurrent tasks per Worker
SPARK_WORKER_CORES=2
# Maximum memory per Worker
SPARK_WORKER_MEMORY=1g
# Number of Worker instances per machine
SPARK_WORKER_INSTANCES=1
# JDK installation path
JAVA_HOME=/usr/java/jdk1.8.0_201
```
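`spark-env.sh` is plain shell sourced by the start scripts, so the settings above can be sanity-checked by sourcing a copy and echoing the values back (a sketch; the /tmp path is illustrative):

```shell
# Write the fragment to a scratch file and source it to confirm the values
cat > /tmp/spark-env-check.sh <<'EOF'
SPARK_MASTER_HOST=hadoop001
SPARK_WORKER_CORES=2
SPARK_WORKER_MEMORY=1g
SPARK_WORKER_INSTANCES=1
JAVA_HOME=/usr/java/jdk1.8.0_201
EOF
. /tmp/spark-env-check.sh
echo "${SPARK_MASTER_HOST}:${SPARK_WORKER_CORES}:${SPARK_WORKER_MEMORY}"
```

This prints `hadoop001:2:1g`, confirming a typo-free fragment before the start scripts consume it.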

Go to the `${SPARK_HOME}/sbin/` directory and start the services:

```shell
# ./start-all.sh
```

### 5. Verify the startup

Visit port 8080 to open Spark's web UI:

<div align="center"> <img width="600px" src="https://github.com/heibaiying/BigData-Notes/blob/master/pictures/spark-web-ui.png"/> </div>
## Appendix: a simple word-count example to get a feel for Spark

#### 1. Prepare a sample file wc.txt with the following content:

```txt
hadoop,spark,hadoop
spark,flink,flink,spark
hadoop,hadoop
```

#### 2. Start spark-shell, pointing it at the Spark master:

```shell
# spark-shell --master spark://hadoop001:7077
```

#### 3. Run the following commands in the Scala REPL:

```scala
val file = spark.sparkContext.textFile("file:///usr/app/wc.txt")
val wordCounts = file.flatMap(line => line.split(",")).map(word => (word, 1)).reduceByKey(_ + _)
wordCounts.collect
```
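The counts Spark should produce for this wc.txt can be cross-checked locally with plain shell, independent of the cluster (a sketch of the same split-then-count logic):

```shell
# Recreate wc.txt's content and count comma-separated words with awk
printf 'hadoop,spark,hadoop\nspark,flink,flink,spark\nhadoop,hadoop\n' |
awk -F',' '{ for (i = 1; i <= NF; i++) count[$i]++ }
           END { for (w in count) print w, count[w] }' | sort
```

This prints `flink 2`, `hadoop 4`, `spark 3`, the same pairs `reduceByKey(_ + _)` yields (collect returns them in no guaranteed order).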

The execution looks like this:

<div align="center"> <img src="https://github.com/heibaiying/BigData-Notes/blob/master/pictures/spark-shell.png"/> </div>

You can track the job's progress in the spark-shell web UI on port 4040:

<div align="center"> <img width="600px" src="https://github.com/heibaiying/BigData-Notes/blob/master/pictures/spark-shell-web-ui.png"/> </div>
@ -2,29 +2,23 @@

>**System environment**: CentOS 7.6
>
>**JDK version**: jdk 1.8.0_201
>
>**Hadoop version**: hadoop-2.6.0-cdh5.15.2

<nav>
<a href="#一安装JDK">1. Install the JDK</a><br/>
<a href="#二配置-SSH-免密登录">2. Configure passwordless SSH login</a><br/>
<a href="#21-配置ip地址和主机名映射在配置文件末尾添加ip地址和主机名映射">2.1 Map IP addresses to hostnames by appending entries to the end of the configuration file</a><br/>
<a href="#22--执行下面命令行一路回车生成公匙和私匙">2.2 Run the command below, pressing Enter through the prompts, to generate the public/private key pair</a><br/>
<a href="#33-进入`~ssh`目录下查看生成的公匙和私匙并将公匙写入到授权文件">2.3 Go to the `~/.ssh` directory, inspect the generated keys, and append the public key to the authorized-keys file</a><br/>
<a href="#三HadoopHDFS环境搭建">3. Hadoop (HDFS) environment setup</a><br/>
<a href="#31-下载CDH-版本的Hadoop">3.1 Download the CDH release of Hadoop</a><br/>
<a href="#32-解压软件压缩包">3.2 Extract the archive</a><br/>
<a href="#33-修改Hadoop相关配置文件">3.3 Edit the Hadoop configuration files</a><br/>
<a href="#34-关闭防火墙">3.4 Disable the firewall</a><br/>
<a href="#35-启动HDFS">3.5 Start HDFS</a><br/>
<a href="#36-验证是否启动成功">3.6 Verify the startup</a><br/>
<a href="#四HadoopYARN环境搭建">4. Hadoop (YARN) environment setup</a><br/>
<a href="#41-修改Hadoop配置文件指明mapreduce运行在YARN上">4.1 Edit the Hadoop configuration to run MapReduce on YARN</a><br/>
<a href="#42-在sbin目录下启动YARN">4.2 Start YARN from the sbin directory</a><br/>
<a href="#43-验证是否启动成功">4.3 Verify the startup</a><br/>
</nav>

## 1. Install the JDK

Hadoop runs on Java, so install the JDK first; see [Installing the JDK on Linux](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/JDK%E5%AE%89%E8%A3%85.md) for the steps.

38 notes/installation/虚拟机静态IP配置.md (new file)
@ -0,0 +1,38 @@
# Static IP Configuration for Virtual Machines

> VM environment: CentOS 7.6

### 1. Check the current network interface name

On this machine the interface is `enp0s3`:

<div align="center"> <img src="https://github.com/heibaiying/BigData-Notes/blob/master/pictures/en0s3.png"/> </div>

### 2. Edit the network configuration file

```shell
# vim /etc/sysconfig/network-scripts/ifcfg-enp0s3
```

Add the following settings, specifying the static IP and DNS:

```shell
BOOTPROTO=static
IPADDR=192.168.200.226
NETMASK=255.255.255.0
GATEWAY=192.168.200.254
DNS1=114.114.114.114
```

The complete configuration after the change:

<div align="center"> <img src="https://github.com/heibaiying/BigData-Notes/blob/master/pictures/ifconfig.png"/> </div>

### 3. Restart the network service

```shell
# systemctl restart network
```
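The fragment above is the standard KEY=VALUE ifcfg format, so before restarting the network it can be sanity-checked by grepping the values back out (a sketch using a scratch copy; the /tmp path is illustrative):

```shell
# Write the fragment to a scratch file and read back the static address
cat > /tmp/ifcfg-check <<'EOF'
BOOTPROTO=static
IPADDR=192.168.200.226
NETMASK=255.255.255.0
GATEWAY=192.168.200.254
DNS1=114.114.114.114
EOF
grep '^IPADDR=' /tmp/ifcfg-check | cut -d= -f2
```

This prints `192.168.200.226`; after `systemctl restart network`, `ip addr show enp0s3` should report the same address.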
@ -1,9 +1,26 @@

## Big Data Environment Setup Guide

## 1. JDK

1. [Installing the JDK on Linux](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/JDK安装.md)

## 2. Hadoop

1. [Hadoop standalone environment setup](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Hadoop单机版本环境搭建.md)

## 3. Spark

1. [Spark standalone environment setup](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Spark单机版本环境搭建.md)

## Network configuration

+ [Static IP configuration for virtual machines](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/虚拟机静态IP配置.md)
New binary files (not shown):

+ pictures/en0s3.png (26 KiB)
+ pictures/ifconfig.png (21 KiB)
+ pictures/spark-download.png (20 KiB)
+ pictures/spark-shell-web-ui.png (49 KiB)
+ pictures/spark-shell.png (22 KiB)
+ pictures/spark-web-ui.png (73 KiB)