From 822bd0f5fb76d545c4c36add4fce8722c8e30b94 Mon Sep 17 00:00:00 2001 From: luoxiang <2806718453@qq.com> Date: Mon, 27 May 2019 22:40:46 +0800 Subject: [PATCH] =?UTF-8?q?=E6=A0=BC=E5=BC=8F=E8=B0=83=E6=95=B4?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 79 +-- ...用.md => Azkaban_Flow_1.0_的使用.md} | 0 ...用.md => Azkaban_Flow_2.0_的使用.md} | 0 .../{Hbase Java API.md => Hbase_Java_API.md} | 0 notes/{Hbase Shell.md => Hbase_Shell.md} | 0 ...nix.md => Hbase的SQL中间层_Phoenix.md} | 504 +++++++++--------- ...nux中大数据常用软件安装指南.md | 29 +- ...署.md => Azkaban_3.x_编译及部署.md} | 0 8 files changed, 303 insertions(+), 309 deletions(-) rename notes/{Azkaban Flow 1.0 的使用.md => Azkaban_Flow_1.0_的使用.md} (100%) rename notes/{Azkaban Flow 2.0 的使用.md => Azkaban_Flow_2.0_的使用.md} (100%) rename notes/{Hbase Java API.md => Hbase_Java_API.md} (100%) rename notes/{Hbase Shell.md => Hbase_Shell.md} (100%) rename notes/{Hbase的SQL层——Phoenix.md => Hbase的SQL中间层_Phoenix.md} (97%) rename notes/installation/{Azkaban 3.x 编译及部署.md => Azkaban_3.x_编译及部署.md} (100%) diff --git a/README.md b/README.md index 2b4edc2..25a7230 100644 --- a/README.md +++ b/README.md @@ -52,14 +52,14 @@ 1. [分布式文件存储系统——HDFS](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hadoop-HDFS.md) 2. [分布式计算框架——MapReduce](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hadoop-MapReduce.md) 3. [集群资源管理器——YARN](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hadoop-YARN.md) -4. [Hadoop单机伪集群环境搭建](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/hadoop%E5%8D%95%E6%9C%BA%E7%89%88%E6%9C%AC%E7%8E%AF%E5%A2%83%E6%90%AD%E5%BB%BA.md) +4. [Hadoop单机伪集群环境搭建](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/hadoop单机版本环境搭建.md) 5. [HDFS常用Shell命令](https://github.com/heibaiying/BigData-Notes/blob/master/notes/HDFS常用Shell命令.md) 6. [HDFS Java API的使用](https://github.com/heibaiying/BigData-Notes/blob/master/notes/HDFS-Java-API.md) ## 二、Hive 1. [数据仓库Hive简介](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive.md) -2. [Linux环境下Hive的安装部署](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Linux%E7%8E%AF%E5%A2%83%E4%B8%8BHive%E7%9A%84%E5%AE%89%E8%A3%85%E9%83%A8%E7%BD%B2.md) +2. [Linux环境下Hive的安装部署](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Linux环境下Hive的安装部署.md) 4. [Hive CLI和Beeline命令行的基本使用](https://github.com/heibaiying/BigData-Notes/blob/master/notes/HiveCLI和Beeline命令行的基本使用.md) 5. [Hive 核心概念讲解](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive核心概念讲解.md) 6. [Hive 常用DDL操作](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hive常用DDL操作.md) @@ -94,53 +94,35 @@ 3. [Spark Streaming 整合 Flume](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_Streaming整合Flume.md) 4. [Spark Streaming 整合 Kafka](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spark_Streaming整合Kafka.md) -## 四、Flink - -TODO - -## 五、Storm +## 四、Storm 1. [Storm和流处理简介](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm和流处理简介.md) 2. [Storm核心概念详解](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm核心概念详解.md) -3. [Storm单机版本环境搭建](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Storm%E5%8D%95%E6%9C%BA%E7%89%88%E6%9C%AC%E7%8E%AF%E5%A2%83%E6%90%AD%E5%BB%BA.md) +3. [Storm单机版本环境搭建](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Storm单机版本环境搭建.md) 4. 
[Storm编程模型详解](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm编程模型详解.md) 5. [Storm项目三种打包方式对比分析](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm三种打包方式对比分析.md) 6. [Storm集成Redis详解](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm集成Redis详解.md) 7. [Storm集成HDFS/HBase](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm集成HBase和HDFS.md) 8. [Storm集成Kafka](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Storm集成Kakfa.md) -## 六、Flume +## 五、Flink -1. [Flume简介及基本使用](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Flume简介及基本使用.md) -2. [Linux环境下Flume的安装部署](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Linux%E4%B8%8BFlume%E7%9A%84%E5%AE%89%E8%A3%85.md) -3. [Flume整合Kafka](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Flume整合Kafka.md) +TODO -## 七、Sqoop - -1. [Sqoop简介与安装](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Sqoop简介与安装.md) - -2. [Sqoop的基本使用](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Sqoop基本使用.md) - -## 八、Azkaban - -1. [Azkaban简介](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Azkaban简介.md) -2. [Azkaban3.x 编译及部署](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Azkaban%203.x%20%E7%BC%96%E8%AF%91%E5%8F%8A%E9%83%A8%E7%BD%B2.md) -3. [Azkaban Flow 1.0 的使用](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Azkaban%20Flow%201.0%20%E7%9A%84%E4%BD%BF%E7%94%A8.md) -4. [Azkaban Flow 2.0 的使用](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Azkaban%20Flow%202.0%20%E7%9A%84%E4%BD%BF%E7%94%A8.md) - -## 九、HBase +## 六、HBase 1. [Hbase 简介](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hbase简介.md) -2. [HBase系统架构及数据结构](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hbase%E7%B3%BB%E7%BB%9F%E6%9E%B6%E6%9E%84%E5%8F%8A%E6%95%B0%E6%8D%AE%E7%BB%93%E6%9E%84.md) -3. [HBase基本环境搭建(Standalone /pseudo-distributed mode)](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Hbase%E5%9F%BA%E6%9C%AC%E7%8E%AF%E5%A2%83%E6%90%AD%E5%BB%BA.md) -4. [HBase常用Shell命令](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hbase%20Shell.md) -5. [HBase Java API](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hbase%20Java%20API.md) +2. [HBase系统架构及数据结构](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hbase系统架构及数据结构.md) +3. [HBase基本环境搭建(Standalone /pseudo-distributed mode)](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Hbase基本环境搭建.md) +4. [HBase常用Shell命令](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hbase_Shell.md) +5. [HBase Java API](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hbase_Java_API.md) 6. [Hbase 过滤器详解](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hbase过滤器详解.md) 7. [HBase 协处理器详解](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hbase协处理器详解.md) -8. [HBase 容灾与备份](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hbase%E5%AE%B9%E7%81%BE%E4%B8%8E%E5%A4%87%E4%BB%BD.md) -9. [HBase的SQL中间层——Phoenix](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hbase%E7%9A%84SQL%E5%B1%82%E2%80%94%E2%80%94Phoenix.md) -10. [Spring/Spring Boot 整合 Mybatis + Phoenix](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spring%2BMybtais%2BPhoenix%E6%95%B4%E5%90%88.md) -## 十、Kafka +8. [HBase 容灾与备份](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hbase容灾与备份.md) +9. 
[HBase的SQL中间层——Phoenix](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Hbase的SQL中间层_Phoenix.md) +10. [Spring/Spring Boot 整合 Mybatis + Phoenix](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Spring+Mybtais+Phoenix整合.md) + +## 七、Kafka 1. [Kafka 核心概念介绍](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Kafka核心概念介绍.md) 2. [基于Zookeeper搭建Kafka高可用集群](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/基于Zookeeper搭建Kafka高可用集群.md) @@ -149,7 +131,7 @@ TODO 5. Kafka 副本机制以及选举原理剖析 6. Kafka的数据可靠性 -## 十一、Zookeeper +## 八、Zookeeper 1. [Zookeeper 简介及核心概念](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Zookeeper简介及核心概念.md) 2. [Zookeeper单机环境和集群环境搭建](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Zookeeper单机环境和集群环境搭建.md) @@ -157,6 +139,25 @@ TODO 4. [Zookeeper Java 客户端——Apache Curator](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Zookeeper_Java客户端Curator.md) 5. [Zookeeper ACL权限控制](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Zookeeper_ACL权限控制.md) +## 九、Flume + +1. [Flume简介及基本使用](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Flume简介及基本使用.md) +2. [Linux环境下Flume的安装部署](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Linux下Flume的安装.md) +3. [Flume整合Kafka](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Flume整合Kafka.md) + +## 十、Sqoop + +1. [Sqoop简介与安装](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Sqoop简介与安装.md) + +2. [Sqoop的基本使用](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Sqoop基本使用.md) + +## 十一、Azkaban + +1. [Azkaban简介](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Azkaban简介.md) +2. [Azkaban3.x 编译及部署](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Azkaban_3.x_编译及部署.md) +3. [Azkaban Flow 1.0 的使用](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Azkaban_Flow_1.0_的使用.md) +4. [Azkaban Flow 2.0 的使用](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Azkaban_Flow_2.0_的使用.md) + ## 十二、Scala 1. [Scala简介及开发环境配置](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Scala简介及开发环境配置.md) @@ -174,8 +175,10 @@ TODO 13. [隐式转换和隐式参数](https://github.com/heibaiying/BigData-Notes/blob/master/notes/Scala隐式转换和隐式参数.md) - - ## 十三、公共内容 -1. [大数据应用常用打包方式](https://github.com/heibaiying/BigData-Notes/blob/master/notes/大数据应用常用打包方式.md) \ No newline at end of file +1. [大数据应用常用打包方式](https://github.com/heibaiying/BigData-Notes/blob/master/notes/大数据应用常用打包方式.md) + +
+ +## 后记:bookmark_tabs: \ No newline at end of file diff --git a/notes/Azkaban Flow 1.0 的使用.md b/notes/Azkaban_Flow_1.0_的使用.md similarity index 100% rename from notes/Azkaban Flow 1.0 的使用.md rename to notes/Azkaban_Flow_1.0_的使用.md diff --git a/notes/Azkaban Flow 2.0 的使用.md b/notes/Azkaban_Flow_2.0_的使用.md similarity index 100% rename from notes/Azkaban Flow 2.0 的使用.md rename to notes/Azkaban_Flow_2.0_的使用.md diff --git a/notes/Hbase Java API.md b/notes/Hbase_Java_API.md similarity index 100% rename from notes/Hbase Java API.md rename to notes/Hbase_Java_API.md diff --git a/notes/Hbase Shell.md b/notes/Hbase_Shell.md similarity index 100% rename from notes/Hbase Shell.md rename to notes/Hbase_Shell.md diff --git a/notes/Hbase的SQL层——Phoenix.md b/notes/Hbase的SQL中间层_Phoenix.md similarity index 97% rename from notes/Hbase的SQL层——Phoenix.md rename to notes/Hbase的SQL中间层_Phoenix.md index 5afb005..83ca85a 100644 --- a/notes/Hbase的SQL层——Phoenix.md +++ b/notes/Hbase的SQL中间层_Phoenix.md @@ -1,252 +1,252 @@ -# Hbase的SQL中间层——Phoenix - - - -## 一、Phoenix简介 - -Phoenix是HBase的开源SQL层。使得您可以使用标准JDBC API而不是常规HBase客户端API来操作Hbases上的数据。 - -Phoenix完全使用Java编写,作为HBase内嵌的JDBC驱动。Phoenix查询引擎会将SQL查询转换为一个或多个HBase scan,并编排并行执行以生成标准的JDBC结果集,同时Phoenix还拥有二级索引等Hbase不具备的特性,这使得Phoenix具有极好的性能表现。 - -
- - - -## 二、Phoenix安装 - -> 我们可以按照官方安装说明进行安装,官方说明如下: -> -> - download and expand our installation tar -> - copy the phoenix server jar that is compatible with your HBase installation into the lib directory of every region server -> - restart the region servers -> - add the phoenix client jar to the classpath of your HBase client -> - download and setup SQuirrel as your SQL client so you can issue adhoc SQL against your HBase cluster - -### 2.1 下载并解压 - -官方下载地址: http://phoenix.apache.org/download.html - -官方针对Apache版本和CDH版本的HBase均提供了安装包,按需下载即可。这里我们下载的版本为`4.14.0-cdh5.14.2` - -```shell -# 下载 -wget http://mirror.bit.edu.cn/apache/phoenix/apache-phoenix-4.14.0-cdh5.14.2/bin/apache-phoenix-4.14.0-cdh5.14.2-bin.tar.gz -# 解压 -tar tar apache-phoenix-4.14.0-cdh5.14.2-bin.tar.gz -``` - -### 2.2 拷贝Jar包 - -按照官方文档的说明,需要将phoenix server jar 添加到所有 Region Servers上 Hbase 安装目录的 lib目录下。 - -这里由于我搭建的是Hbase伪集群,所以只需要拷贝到当前机器的HBase的lib目录下。如果是真实集群,则使用scp命令分发到所有Region Servers机器上。 - -```shell -cp /usr/app/apache-phoenix-4.14.0-cdh5.14.2-bin/phoenix-4.14.0-cdh5.14.2-server.jar /usr/app/hbase-1.2.0-cdh5.15.2/lib -``` - -### 2.3 重启 Region Servers - -```shell -# 停止Hbase -stop-hbase.sh -# 启动Hbase -start-hbase.sh -``` - -### 2.4 启动Phoenix - -在Phoenix解压目录下的`bin`目录下执行如下命令,需要指定Zookeeper的地址: - -+ 如果HBase采用Standalone模式或者伪集群模式搭建,则采用内置的 Zookeeper,默认端口为2181; -+ 如果是HBase是集群模式并采用自己搭建的Zookeeper集群,则按照自己的实际情况指定端口 - -```shell -# ./sqlline.py hadoop001:2181 -``` - -### 2.5 启动结果 - -启动后则进入了Phoenix交互式SQL命令行,可以使用`!table`或`!tables`查看当前所有表的信息 - -
- - - -## 三、Phoenix 简单使用 - -### 3.1 创建表 - -```sql -CREATE TABLE IF NOT EXISTS us_population ( - state CHAR(2) NOT NULL, - city VARCHAR NOT NULL, - population BIGINT - CONSTRAINT my_pk PRIMARY KEY (state, city)); -``` - -
- -新建的表会按照特定的规则转换为Hbase上的表,关于表的信息,可以通过Hbase Web UI 进行查看: - -
- -### 3.2 插入数据 - -Phoenix 中插入数据采用的是`UPSERT`而不是`INSERT`,因为Phoenix并没有更新操作,插入相同主键的数据就视为更新,所以`UPSERT`就相当于`UPDATE`+`INSERT` - -```shell -UPSERT INTO us_population VALUES('NY','New York',8143197); -UPSERT INTO us_population VALUES('CA','Los Angeles',3844829); -UPSERT INTO us_population VALUES('IL','Chicago',2842518); -UPSERT INTO us_population VALUES('TX','Houston',2016582); -UPSERT INTO us_population VALUES('PA','Philadelphia',1463281); -UPSERT INTO us_population VALUES('AZ','Phoenix',1461575); -UPSERT INTO us_population VALUES('TX','San Antonio',1256509); -UPSERT INTO us_population VALUES('CA','San Diego',1255540); -UPSERT INTO us_population VALUES('TX','Dallas',1213825); -UPSERT INTO us_population VALUES('CA','San Jose',912332); -``` - -### 3.3 修改数据 - -```sql --- 插入主键相同的数据就视为更新 -UPSERT INTO us_population VALUES('NY','New York',999999); -``` - -
- -### 3.4 删除数据 - -```sql -DELETE FROM us_population WHERE city='Dallas'; -``` - -
- -### 3.5 查询数据 - -```sql -SELECT state as "州",count(city) as "市",sum(population) as "热度" -FROM us_population -GROUP BY state -ORDER BY sum(population) DESC; -``` - -

-
-
-### 3.6 退出命令
-
-```sql
-!quit
-```
-
-
-
-### 3.7 扩展
-
-从上面的简单操作中我们可以看出,Phoenix 查询语句与我们正常使用的SQL是基本相同的,关于Phoenix 支持的语句、数据类型、函数、序列(和Oracle中序列类似)因为涵盖内容很广,可以参考其官方文档,官方上有详尽的配图说明的:
-
-+ 语法(Grammar):https://phoenix.apache.org/language/index.html
-
-+ 函数(Functions):http://phoenix.apache.org/language/functions.html
-
-+ 数据类型(Datatypes):http://phoenix.apache.org/language/datatypes.html
-
-+ 序列(Sequences):http://phoenix.apache.org/sequences.html
-
-+ 联结查询(Joins):http://phoenix.apache.org/joins.html
-
-
-
-## 四、Phoenix Java API
-
-因为Phoenix遵循JDBC规范,并提供了对应的数据库驱动PhoenixDriver,这使采用Java对其进行操作的时候,就如同对其他关系型数据库(例如 MySQL)操作一样。
-
-因为在实际的开发中我们通常都是采用第三方框架,比如mybatis,Hibernate,Spring Data 等,很少使用原生Java API操作关系型数据库,所以这里只给出一个简单的查询作为示例,并在下一篇文章中给出Spring boot + mybatis + Phoenix 的整合用例。
-
-### 4.1 引入Phoenix core JAR包
-
-如果是maven项目,直接在maven中央仓库找到对应的版本,导入依赖即可
-
-```xml
-<dependency>
-    <groupId>org.apache.phoenix</groupId>
-    <artifactId>phoenix-core</artifactId>
-    <version>4.14.0-cdh5.14.2</version>
-</dependency>
-```
-
-如果是普通项目,则可以从Phoenix 解压目录下找到对应的JAR包,然后手动引入
-
- -### 4.2 简单的Java API实例 - -```java -import java.sql.Connection; -import java.sql.DriverManager; -import java.sql.PreparedStatement; -import java.sql.ResultSet; - - -public class PhoenixJavaApi { - - public static void main(String[] args) throws Exception { - - // 加载数据库驱动 - Class.forName("org.apache.phoenix.jdbc.PhoenixDriver"); - - /* - * 指定数据库地址,格式为 jdbc:phoenix:Zookeeper地址 - * 如果HBase采用Standalone模式或者伪集群模式搭建,则HBase默认使用内置的Zookeeper,默认端口为2181 - */ - Connection connection = DriverManager.getConnection("jdbc:phoenix:192.168.200.226:2181"); - - PreparedStatement statement = connection.prepareStatement("SELECT * FROM us_population"); - - ResultSet resultSet = statement.executeQuery(); - - while (resultSet.next()) { - System.out.println(resultSet.getString("city") + " " - + resultSet.getInt("population")); - } - - statement.close(); - connection.close(); - } -} -``` - -结果如下: - -

-
-
-# 参考资料
-
-1. http://phoenix.apache.org/
+# Hbase的SQL中间层——Phoenix
+
+
+
+## 一、Phoenix简介
+
+Phoenix是HBase的开源SQL中间层,它使得您可以使用标准的JDBC API而不是常规的HBase客户端API来操作HBase上的数据。
+
+Phoenix完全使用Java编写,作为HBase内嵌的JDBC驱动。Phoenix查询引擎会将SQL查询转换为一个或多个HBase scan,并编排并行执行以生成标准的JDBC结果集;同时Phoenix还拥有二级索引等HBase不具备的特性,这使得Phoenix具有极好的性能表现。
+

+
+
+## 二、Phoenix安装
+
+> 我们可以按照官方安装说明进行安装,官方说明如下:
+>
+> - download and expand our installation tar
+> - copy the phoenix server jar that is compatible with your HBase installation into the lib directory of every region server
+> - restart the region servers
+> - add the phoenix client jar to the classpath of your HBase client
+> - download and setup SQuirrel as your SQL client so you can issue adhoc SQL against your HBase cluster
+
+### 2.1 下载并解压
+
+官方下载地址: http://phoenix.apache.org/download.html
+
+官方针对Apache版本和CDH版本的HBase均提供了安装包,按需下载即可。这里我们下载的版本为`4.14.0-cdh5.14.2`
+
+```shell
+# 下载
+wget http://mirror.bit.edu.cn/apache/phoenix/apache-phoenix-4.14.0-cdh5.14.2/bin/apache-phoenix-4.14.0-cdh5.14.2-bin.tar.gz
+# 解压
+tar -zxvf apache-phoenix-4.14.0-cdh5.14.2-bin.tar.gz
+```
+
+### 2.2 拷贝Jar包
+
+按照官方文档的说明,需要将phoenix server jar 添加到所有 Region Servers上 Hbase 安装目录的 lib目录下。
+
+这里由于我搭建的是Hbase伪集群,所以只需要拷贝到当前机器的HBase的lib目录下。如果是真实集群,则使用scp命令分发到所有Region Servers机器上。
+
+```shell
+cp /usr/app/apache-phoenix-4.14.0-cdh5.14.2-bin/phoenix-4.14.0-cdh5.14.2-server.jar /usr/app/hbase-1.2.0-cdh5.15.2/lib
+```
+
+### 2.3 重启 Region Servers
+
+```shell
+# 停止Hbase
+stop-hbase.sh
+# 启动Hbase
+start-hbase.sh
+```
+
+### 2.4 启动Phoenix
+
+在Phoenix解压目录下的`bin`目录下执行如下命令,需要指定Zookeeper的地址:
+
++ 如果HBase采用Standalone模式或者伪集群模式搭建,则采用内置的 Zookeeper,默认端口为2181;
++ 如果HBase是集群模式并采用自己搭建的Zookeeper集群,则按照自己的实际情况指定端口。
+
+```shell
+# ./sqlline.py hadoop001:2181
+```
+
+### 2.5 启动结果
+
+启动后则进入了Phoenix交互式SQL命令行,可以使用`!table`或`!tables`查看当前所有表的信息。
+
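+下面给出一个简单示意(假设连接的是上文的 hadoop001:2181):连接成功后会进入类似 `0: jdbc:phoenix:hadoop001:2181>` 的交互提示符;首次启动时 Phoenix 会自动创建 SYSTEM.CATALOG 等系统表,执行如下元命令即可列出它们:
+
+```sql
+!tables
+```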
+ + + +## 三、Phoenix 简单使用 + +### 3.1 创建表 + +```sql +CREATE TABLE IF NOT EXISTS us_population ( + state CHAR(2) NOT NULL, + city VARCHAR NOT NULL, + population BIGINT + CONSTRAINT my_pk PRIMARY KEY (state, city)); +``` + +

+
+新建的表会按照特定的规则转换为Hbase上的表,表的相关信息可以通过Hbase Web UI 进行查看。
+
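+Phoenix 会将未加引号的表名统一转换为大写,因此该表在 Hbase 中对应的表名为 `US_POPULATION`。除 Web UI 外,也可以查询 Phoenix 的系统目录表 SYSTEM.CATALOG 来确认建表结果,下面是一个简单示意(仅供参考):
+
+```sql
+-- 在系统目录表中查看刚刚创建的表
+SELECT DISTINCT table_schem, table_name
+FROM SYSTEM.CATALOG
+WHERE table_name = 'US_POPULATION';
+```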

+
+### 3.2 插入数据
+
+Phoenix 中插入数据采用的是`UPSERT`而不是`INSERT`:因为Phoenix并没有单独的更新操作,插入相同主键的数据就视为更新,所以`UPSERT`就相当于`UPDATE`+`INSERT`。
+
+```sql
+UPSERT INTO us_population VALUES('NY','New York',8143197);
+UPSERT INTO us_population VALUES('CA','Los Angeles',3844829);
+UPSERT INTO us_population VALUES('IL','Chicago',2842518);
+UPSERT INTO us_population VALUES('TX','Houston',2016582);
+UPSERT INTO us_population VALUES('PA','Philadelphia',1463281);
+UPSERT INTO us_population VALUES('AZ','Phoenix',1461575);
+UPSERT INTO us_population VALUES('TX','San Antonio',1256509);
+UPSERT INTO us_population VALUES('CA','San Diego',1255540);
+UPSERT INTO us_population VALUES('TX','Dallas',1213825);
+UPSERT INTO us_population VALUES('CA','San Jose',912332);
+```
+
+### 3.3 修改数据
+
+```sql
+-- 插入主键相同的数据就视为更新
+UPSERT INTO us_population VALUES('NY','New York',999999);
+```
+
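+可以用如下查询验证更新是否生效(假设上面的语句均已执行,返回的 population 应为 999999):
+
+```sql
+SELECT population FROM us_population WHERE state = 'NY' AND city = 'New York';
+```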
+ +### 3.4 删除数据 + +```sql +DELETE FROM us_population WHERE city='Dallas'; +``` + +
+ +### 3.5 查询数据 + +```sql +SELECT state as "州",count(city) as "市",sum(population) as "热度" +FROM us_population +GROUP BY state +ORDER BY sum(population) DESC; +``` + +
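+若按上文顺序执行了全部语句(包含 3.3 的更新和 3.4 的删除),根据前面插入的数据推算,查询结果大致如下(示意):
+
+| 州   | 市   | 热度    |
+| ---- | ---- | ------- |
+| CA   | 3    | 6012701 |
+| TX   | 2    | 3273091 |
+| IL   | 1    | 2842518 |
+| PA   | 1    | 1463281 |
+| AZ   | 1    | 1461575 |
+| NY   | 1    | 999999  |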

+
+
+### 3.6 退出命令
+
+```sql
+!quit
+```
+
+
+
+### 3.7 扩展
+
+从上面的简单操作中我们可以看出,Phoenix 查询语句与我们正常使用的SQL基本相同。关于Phoenix 支持的语句、数据类型、函数、序列(和Oracle中的序列类似)等,因为涵盖内容很广,这里不再一一展开,可以参考其官方文档,官方文档上有详尽的配图说明:
+
++ 语法(Grammar):https://phoenix.apache.org/language/index.html
+
++ 函数(Functions):http://phoenix.apache.org/language/functions.html
+
++ 数据类型(Datatypes):http://phoenix.apache.org/language/datatypes.html
+
++ 序列(Sequences):http://phoenix.apache.org/sequences.html
+
++ 联结查询(Joins):http://phoenix.apache.org/joins.html
+
+
+
+## 四、Phoenix Java API
+
+因为Phoenix遵循JDBC规范,并提供了对应的数据库驱动PhoenixDriver,这使得采用Java对其进行操作时,就如同操作其他关系型数据库(例如 MySQL)一样。
+
+因为在实际的开发中我们通常都是采用第三方框架,比如MyBatis、Hibernate、Spring Data 等,很少使用原生Java API操作关系型数据库,所以这里只给出一个简单的查询作为示例,并在下一篇文章中给出Spring Boot + MyBatis + Phoenix 的整合用例。
+
+### 4.1 引入Phoenix core JAR包
+
+如果是Maven项目,直接在Maven中央仓库找到对应的版本,导入依赖即可:
+
+```xml
+<dependency>
+    <groupId>org.apache.phoenix</groupId>
+    <artifactId>phoenix-core</artifactId>
+    <version>4.14.0-cdh5.14.2</version>
+</dependency>
+```
+
+如果是普通项目,则可以从Phoenix 解压目录下找到对应的JAR包,然后手动引入:
+
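+这里需要引入的通常是解压目录下的 phoenix client JAR(本版本下应为 `phoenix-4.14.0-cdh5.14.2-client.jar`,具体文件名请以实际解压结果为准)。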
+ +### 4.2 简单的Java API实例 + +```java +import java.sql.Connection; +import java.sql.DriverManager; +import java.sql.PreparedStatement; +import java.sql.ResultSet; + + +public class PhoenixJavaApi { + + public static void main(String[] args) throws Exception { + + // 加载数据库驱动 + Class.forName("org.apache.phoenix.jdbc.PhoenixDriver"); + + /* + * 指定数据库地址,格式为 jdbc:phoenix:Zookeeper地址 + * 如果HBase采用Standalone模式或者伪集群模式搭建,则HBase默认使用内置的Zookeeper,默认端口为2181 + */ + Connection connection = DriverManager.getConnection("jdbc:phoenix:192.168.200.226:2181"); + + PreparedStatement statement = connection.prepareStatement("SELECT * FROM us_population"); + + ResultSet resultSet = statement.executeQuery(); + + while (resultSet.next()) { + System.out.println(resultSet.getString("city") + " " + + resultSet.getInt("population")); + } + + statement.close(); + connection.close(); + } +} +``` + +结果如下: + +
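+(以下为按第三节示例数据推算的示意输出,实际结果以运行环境中的数据为准;`SELECT *` 不带 ORDER BY 时按主键 (state, city) 的升序返回各行)
+
+```
+Phoenix 1461575
+Los Angeles 3844829
+San Diego 1255540
+San Jose 912332
+Chicago 2842518
+New York 999999
+Philadelphia 1463281
+Houston 2016582
+San Antonio 1256509
+```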
+ + + +# 参考资料 + +1. http://phoenix.apache.org/ diff --git a/notes/Linux中大数据常用软件安装指南.md b/notes/Linux中大数据常用软件安装指南.md index c9718c4..c50af6c 100644 --- a/notes/Linux中大数据常用软件安装指南.md +++ b/notes/Linux中大数据常用软件安装指南.md @@ -5,56 +5,47 @@ 1. [Linux环境下JDK安装](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Linux下JDK安装.md) 2. [Linux环境下Python安装](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Linux下Python安装.md) - - ### 二、Hadoop -1. [Hadoop单机版本环境搭建](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/hadoop%E5%8D%95%E6%9C%BA%E7%89%88%E6%9C%AC%E7%8E%AF%E5%A2%83%E6%90%AD%E5%BB%BA.md) - +1. [Hadoop单机版本环境搭建](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/hadoop单机版本环境搭建.md) +2. Hadoop集群环境搭建 +3. 基于Zookeeper搭建Hadoop的HA集群 ### 三、Spark 1. [Spark单机版本环境搭建](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Spark单机版本环境搭建.md) - +2. Spark集群环境搭建 ### 四、Storm 1. [Storm单机版本环境搭建](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Storm单机版本环境搭建.md) - +2. Storm集群环境搭建 ### 五、Hbase -1. [Hbase基本环境搭建](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Hbase%E5%9F%BA%E6%9C%AC%E7%8E%AF%E5%A2%83%E6%90%AD%E5%BB%BA.md) - +1. [Hbase基本环境搭建](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Hbase基本环境搭建.md) +2. Hbase集群环境搭建 ### 六、Flume -1. [Linux环境下Flume的安装部署](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Linux%E4%B8%8BFlume%E7%9A%84%E5%AE%89%E8%A3%85.md) - - +1. [Linux环境下Flume的安装部署](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Linux下Flume的安装.md) ### 七、Azkaban -1. [Azkaban3.x编译及部署](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Azkaban%203.x%20%E7%BC%96%E8%AF%91%E5%8F%8A%E9%83%A8%E7%BD%B2.md) - - +1. [Azkaban3.x编译及部署](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Azkaban_3.x_编译及部署.md) ### 八、Hive -1. [Linux环境下Hive的安装部署](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Linux%E7%8E%AF%E5%A2%83%E4%B8%8BHive%E7%9A%84%E5%AE%89%E8%A3%85%E9%83%A8%E7%BD%B2.md) - - +1. [Linux环境下Hive的安装部署](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Linux环境下Hive的安装部署.md) ### 九、Zookeeper 1. [Zookeeper单机环境和集群环境搭建](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/Zookeeper单机环境和集群环境搭建.md) - - ### 十、Kafka 1. [基于Zookeeper搭建Kafka高可用集群](https://github.com/heibaiying/BigData-Notes/blob/master/notes/installation/基于Zookeeper搭建Kafka高可用集群.md) \ No newline at end of file diff --git a/notes/installation/Azkaban 3.x 编译及部署.md b/notes/installation/Azkaban_3.x_编译及部署.md similarity index 100% rename from notes/installation/Azkaban 3.x 编译及部署.md rename to notes/installation/Azkaban_3.x_编译及部署.md