Spark single-node environment setup
		| @@ -1,20 +1,20 @@ | |||||||
| # Installing the JDK on Linux | # Installing the JDK on Linux | ||||||
|  |  | ||||||
| **System environment**: CentOS 7.6 |  | ||||||
|  |  | ||||||
| **JDK version**: jdk 1.8.0_201 |  | ||||||
|  |  | ||||||
|  |  | ||||||
|  | >**System environment**: CentOS 7.6 | ||||||
|  | > | ||||||
|  | >**JDK version**: jdk 1.8.0_201 | ||||||
|  |  | ||||||
|  |  | ||||||
| ## Installation steps: |  | ||||||
|  |  | ||||||
| ### 1. Download the JDK package | ### 1. Download the JDK package | ||||||
|  |  | ||||||
| Download the required JDK version from the [official site](https://www.oracle.com/technetwork/java/javase/downloads/index.html) and upload it to the target location on the server. (Here we download [jdk1.8](https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html) and upload it to the /usr/java/ directory on the server.) | Download the required JDK version from the [official site](https://www.oracle.com/technetwork/java/javase/downloads/index.html) and upload it to the target location on the server (here we download [jdk1.8](https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html) and upload it to the /usr/java/ directory on the server) | ||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
| ### 2. Extract the jdk-8u201-linux-x64.tar.gz package | ### 2. Extract the package | ||||||
|  |  | ||||||
| ```shell | ```shell | ||||||
| [root@ java]# tar -zxvf jdk-8u201-linux-x64.tar.gz | [root@ java]# tar -zxvf jdk-8u201-linux-x64.tar.gz | ||||||
|   | |||||||
							
								
								
									
notes/installation/Spark单机版本环境搭建.md (Normal file, 121 lines added)
									
								
							| @@ -0,0 +1,121 @@ | |||||||
|  | # Spark Single-Node Environment Setup | ||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
|  | >**System environment**: CentOS 7.6 | ||||||
|  | > | ||||||
|  | >**Spark version**: spark-2.2.3-bin-hadoop2.6 | ||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
|  | ### 1. Download the Spark package | ||||||
|  |  | ||||||
|  | Official download page: http://spark.apache.org/downloads.html | ||||||
|  |  | ||||||
|  | Because Spark is often used together with Hadoop, choose the Spark version and the matching Hadoop version before downloading. | ||||||
|  |  | ||||||
|  | <div align="center"> <img width="600px" src="https://github.com/heibaiying/BigData-Notes/blob/master/pictures/spark-download.png"/> </div> | ||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
|  | ### 2. Extract the package | ||||||
|  |  | ||||||
|  | ```shell | ||||||
|  | # tar -zxvf spark-2.2.3-bin-hadoop2.6.tgz | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
|  | ### 3. Configure environment variables | ||||||
|  |  | ||||||
|  | ```shell | ||||||
|  | # vim /etc/profile | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | Add the environment variables: | ||||||
|  |  | ||||||
|  | ```shell | ||||||
|  | export SPARK_HOME=/usr/app/spark-2.2.3-bin-hadoop2.6 | ||||||
|  | export PATH=${SPARK_HOME}/bin:$PATH | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | Apply the new environment variables to the current shell: | ||||||
|  |  | ||||||
|  | ```shell | ||||||
|  | # source /etc/profile | ||||||
|  | ``` | ||||||
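A quick sanity check after sourcing the profile can catch a mistyped path early. The following is only a minimal sketch: the function name `check_spark_env` is hypothetical and not part of Spark, and it assumes nothing beyond the layout described above, namely that `SPARK_HOME` points to a directory containing an executable `bin/spark-submit`.

```shell
# Hypothetical helper (not part of Spark): checks that SPARK_HOME is usable.
check_spark_env() {
  # Fail if SPARK_HOME is unset or empty.
  if [ -z "${SPARK_HOME:-}" ]; then
    echo "SPARK_HOME is not set" >&2
    return 1
  fi
  # Fail if the launcher script is missing or not executable.
  if [ ! -x "${SPARK_HOME}/bin/spark-submit" ]; then
    echo "spark-submit not found under ${SPARK_HOME}/bin" >&2
    return 1
  fi
  echo "SPARK_HOME looks valid: ${SPARK_HOME}"
}
```

A possible use is `check_spark_env && spark-submit --version`, which should print the Spark version banner if the `PATH` entry is also correct.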
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
|  | ### 4. Start Spark in Standalone mode | ||||||
|  |  | ||||||
|  | Go to the `${SPARK_HOME}/conf/` directory, copy the configuration template, and edit it: | ||||||
|  |  | ||||||
|  | ```shell | ||||||
|  | # cp spark-env.sh.template spark-env.sh | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | Add the following settings to `spark-env.sh`: | ||||||
|  |  | ||||||
|  | ```shell | ||||||
|  | # Host of the master node | ||||||
|  | SPARK_MASTER_HOST=hadoop001 | ||||||
|  | # Number of CPU cores the Worker may use, which caps its concurrent tasks | ||||||
|  | SPARK_WORKER_CORES=2 | ||||||
|  | # Maximum amount of memory the Worker may use | ||||||
|  | SPARK_WORKER_MEMORY=1g | ||||||
|  | # Number of Worker instances started on each machine | ||||||
|  | SPARK_WORKER_INSTANCES=1 | ||||||
|  | # JDK installation path | ||||||
|  | JAVA_HOME=/usr/java/jdk1.8.0_201 | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | Go to the `${SPARK_HOME}/sbin/` directory and start the services: | ||||||
|  |  | ||||||
|  | ```shell | ||||||
|  | # ./start-all.sh | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
|  | ### 5. Verify that startup succeeded | ||||||
|  |  | ||||||
|  | Open port 8080 on the master host in a browser to view the Spark Web UI | ||||||
|  |  | ||||||
|  | <div align="center"> <img width="600px" src="https://github.com/heibaiying/BigData-Notes/blob/master/pictures/spark-web-ui.png"/> </div> | ||||||
|  |  | ||||||
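Besides the Web UI, the running JVMs can be inspected from the command line. The sketch below assumes the JDK's `jps` tool is on the `PATH` and that the standalone daemons appear in its output under the names `Master` and `Worker`; the function name `has_master_and_worker` is hypothetical.

```shell
# Hypothetical helper: reads `jps` output on stdin and succeeds only if
# both a Master and a Worker JVM are listed (second column is the class name).
has_master_and_worker() {
  awk '$2 == "Master" { m = 1 }
       $2 == "Worker" { w = 1 }
       END { exit !(m && w) }'
}
```

It could be used as `jps | has_master_and_worker && echo "Spark standalone daemons are running"`. If only one of the two daemons is listed, the logs under `${SPARK_HOME}/logs/` are the usual place to look next.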
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
|  | ## Appendix: a simple word count example to get a feel for Spark | ||||||
|  |  | ||||||
|  | #### 1. Prepare a sample file wc.txt for the word count, with the following content: | ||||||
|  |  | ||||||
|  | ```txt | ||||||
|  | hadoop,spark,hadoop | ||||||
|  | spark,flink,flink,spark | ||||||
|  | hadoop,hadoop | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | #### 2. Start spark-shell, specifying the address of the Spark master node | ||||||
|  |  | ||||||
|  | ```shell | ||||||
|  | # spark-shell --master spark://hadoop001:7077 | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | #### 3. Run the following commands in the Scala interactive shell | ||||||
|  |  | ||||||
|  | ```scala | ||||||
|  | val file = spark.sparkContext.textFile("file:///usr/app/wc.txt") | ||||||
|  | val wordCounts = file.flatMap(line => line.split(",")).map(word => (word, 1)).reduceByKey(_ + _) | ||||||
|  | wordCounts.collect | ||||||
|  | ``` | ||||||
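To cross-check the result that Spark returns, the same count can be reproduced with standard Unix tools. This is only an illustrative sketch: the function name `count_words` is hypothetical, and it assumes the comma-separated format of `wc.txt` shown above.

```shell
# Hypothetical helper: prints "word count" pairs for a comma-separated file,
# sorted by word. Rough shell analogue of flatMap + map + reduceByKey.
count_words() {
  tr ',' '\n' < "$1" |          # one word per line (flatMap)
    sort |                      # group identical words together
    uniq -c |                   # count each group (reduceByKey)
    awk 'NF == 2 { print $2, $1 }'   # skip blank entries, print "word count"
}
```

For the sample file this prints `flink 2`, `hadoop 4`, and `spark 3`, which should match the pairs returned by `wordCounts.collect` (Spark does not guarantee their order).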
|  |  | ||||||
|  | The execution looks like this: | ||||||
|  |  | ||||||
|  | <div align="center"> <img  src="https://github.com/heibaiying/BigData-Notes/blob/master/pictures/spark-shell.png"/> </div> | ||||||
|  |  | ||||||
|  | The job's execution can be inspected through the spark-shell Web UI, which listens on port 4040 | ||||||
|  |  | ||||||
|  | <div align="center"> <img width="600px" src="https://github.com/heibaiying/BigData-Notes/blob/master/pictures/spark-shell-web-ui.png"/> </div> | ||||||
| @@ -2,29 +2,23 @@ | |||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
|  | >**System environment**: CentOS 7.6 | ||||||
|  | > | ||||||
|  | >**JDK version**: jdk 1.8.0_201 | ||||||
|  | > | ||||||
|  | >**Hadoop version**: hadoop-2.6.0-cdh5.15.2 | ||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
| <nav> | <nav> | ||||||
| <a href="#一安装JDK">I. Install the JDK</a><br/> | <a href="#一安装JDK">I. Install the JDK</a><br/> | ||||||
| <a href="#二配置-SSH-免密登录">II. Configure passwordless SSH login</a><br/> | <a href="#二配置-SSH-免密登录">II. Configure passwordless SSH login</a><br/> | ||||||
|         <a href="#21-配置ip地址和主机名映射在配置文件末尾添加ip地址和主机名映射">2.1 Configure the IP address and hostname mapping by appending it to the end of the configuration file</a><br/> |  | ||||||
|         <a href="#22--执行下面命令行一路回车生成公匙和私匙"> 2.2  Run the command below, pressing Enter at every prompt, to generate the public and private keys</a><br/> |  | ||||||
|         <a href="#33-进入`~ssh`目录下查看生成的公匙和私匙并将公匙写入到授权文件">3.3 Go to the `~/.ssh` directory, check the generated public and private keys, and write the public key to the authorization file</a><br/> |  | ||||||
| <a href="#三HadoopHDFS环境搭建">III. Hadoop (HDFS) environment setup</a><br/> | <a href="#三HadoopHDFS环境搭建">III. Hadoop (HDFS) environment setup</a><br/> | ||||||
|         <a href="#31-下载CDH-版本的Hadoop">3.1 Download the CDH version of Hadoop</a><br/> |  | ||||||
|         <a href="#32-解压软件压缩包">3.2 Extract the archive</a><br/> |  | ||||||
|         <a href="#33-修改Hadoop相关配置文件">3.3 Edit the Hadoop configuration files</a><br/> |  | ||||||
|         <a href="#34-关闭防火墙">3.4 Disable the firewall</a><br/> |  | ||||||
|         <a href="#35-启动HDFS">3.5 Start HDFS</a><br/> |  | ||||||
|         <a href="#36-验证是否启动成功">3.6 Verify that startup succeeded</a><br/> |  | ||||||
| <a href="#四HadoopYARN环境搭建">IV. Hadoop (YARN) environment setup</a><br/> | <a href="#四HadoopYARN环境搭建">IV. Hadoop (YARN) environment setup</a><br/> | ||||||
|         <a href="#41-修改Hadoop配置文件指明mapreduce运行在YARN上">4.1 Edit the Hadoop configuration so that MapReduce runs on YARN</a><br/> |  | ||||||
|         <a href="#42-在sbin目录下启动YARN">4.2 Start YARN from the sbin directory</a><br/> |  | ||||||
|         <a href="#43-验证是否启动成功">4.3 Verify that startup succeeded</a><br/> |  | ||||||
| </nav> | </nav> | ||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
| ## I. Install the JDK | ## I. Install the JDK | ||||||
|  |  | ||||||
| Hadoop runs on Java, so the JDK must be installed first. For the steps, see [Installing the JDK on Linux](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/JDK%E5%AE%89%E8%A3%85.md) | Hadoop runs on Java, so the JDK must be installed first. For the steps, see [Installing the JDK on Linux](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/JDK%E5%AE%89%E8%A3%85.md) | ||||||
|   | |||||||
							
								
								
									
notes/installation/虚拟机静态IP配置.md (Normal file, 38 lines added)
									
								
							| @@ -0,0 +1,38 @@ | |||||||
|  | # Static IP Configuration for a Virtual Machine | ||||||
|  |  | ||||||
|  | >  VM environment: CentOS 7.6 | ||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
|  | ### 1. Check the name of the current network interface | ||||||
|  |  | ||||||
|  | 	The network interface on this machine is named `enp0s3` | ||||||
|  |  | ||||||
|  | <div align="center"> <img src="https://github.com/heibaiying/BigData-Notes/blob/master/pictures/en0s3.png"/> </div> | ||||||
|  |  | ||||||
|  | ### 2. Edit the network configuration file | ||||||
|  |  | ||||||
|  | ```shell | ||||||
|  | # vim /etc/sysconfig/network-scripts/ifcfg-enp0s3 | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | Add the following network settings to specify the static IP and DNS: | ||||||
|  |  | ||||||
|  | ```shell | ||||||
|  | BOOTPROTO=static | ||||||
|  | IPADDR=192.168.200.226 | ||||||
|  | NETMASK=255.255.255.0 | ||||||
|  | GATEWAY=192.168.200.254 | ||||||
|  | DNS1=114.114.114.114 | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | The complete configuration after the change: | ||||||
|  |  | ||||||
|  | <div align="center"> <img src="https://github.com/heibaiying/BigData-Notes/blob/master/pictures/ifconfig.png"/> </div> | ||||||
|  |  | ||||||
|  | ### 3. Restart the network service | ||||||
|  |  | ||||||
|  | ```shell | ||||||
|  | #  systemctl restart network | ||||||
|  | ``` | ||||||
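Before or after restarting the service, it can help to confirm that the configuration file really contains every key listed above. The following is a minimal sketch under that assumption; the function name `check_static_ifcfg` is hypothetical, and the check only looks at the presence of the keys, not at whether the addresses are valid for the network.

```shell
# Hypothetical helper: checks that an ifcfg file declares a static address
# with the keys used in this note (IPADDR, NETMASK, GATEWAY, DNS1).
check_static_ifcfg() {
  file="$1"
  for key in IPADDR NETMASK GATEWAY DNS1; do
    if ! grep -q "^${key}=" "$file"; then
      echo "missing ${key} in ${file}" >&2
      return 1
    fi
  done
  # Accept both BOOTPROTO=static and BOOTPROTO="static".
  if ! grep -Eq '^BOOTPROTO="?static"?$' "$file"; then
    echo "BOOTPROTO is not static in ${file}" >&2
    return 1
  fi
}
```

One way to use it is `check_static_ifcfg /etc/sysconfig/network-scripts/ifcfg-enp0s3 && ip addr show enp0s3`, which prints the interface's current address once the file passes the check.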
|  |  | ||||||
| @@ -1,9 +1,26 @@ | |||||||
| ## Big Data Environment Setup Guide | ## Big Data Environment Setup Guide | ||||||
|  |  | ||||||
| ### I. JDK |  | ||||||
|  |  | ||||||
| 1. [Installing the JDK on linux](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/JDK安装.md) |  | ||||||
|  |  | ||||||
| ### II. Hadoop | ## I. JDK | ||||||
|  |  | ||||||
|  | 1. [Installing the JDK on Linux](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/JDK安装.md) | ||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
|  | ## II. Hadoop | ||||||
|  |  | ||||||
|  | 1. [Hadoop single-node environment setup](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Hadoop单机版本环境搭建.md) | ||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
|  | ## III. Spark | ||||||
|  |  | ||||||
|  | 1. [Spark single-node environment setup](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Spark单机版本环境搭建.md) | ||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
|  | ## Network configuration | ||||||
|  |  | ||||||
|  | + [Static IP configuration for a virtual machine](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/虚拟机静态IP配置.md) | ||||||
|  |  | ||||||
| 1. [hadoop single-node environment setup](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/hadoop单机版本环境搭建.md) |  | ||||||
							
								
								
									
										
pictures/en0s3.png (Normal file, binary file not shown, size: 26 KiB)
							
								
								
									
										
pictures/ifconfig.png (Normal file, binary file not shown, size: 21 KiB)
							
								
								
									
										
pictures/spark-download.png (Normal file, binary file not shown, size: 20 KiB)
							
								
								
									
										
pictures/spark-shell-web-ui.png (Normal file, binary file not shown, size: 49 KiB)
							
								
								
									
										
pictures/spark-shell.png (Normal file, binary file not shown, size: 22 KiB)
							
								
								
									
										
pictures/spark-web-ui.png (Normal file, binary file not shown, size: 73 KiB)