Using Hortonworks' Open-Source Framework SHC (Part 1)

2. Compile the source package
(1) Unpack the source package and edit the pom file in the project root
Change each component's version number to match the versions you are running. Here is my modified pom file:
```
<groupId>com.hortonworks</groupId>
<artifactId>shc</artifactId>
<version>spark-2.3.0-hbase-1.2.6</version>
<packaging>pom</packaging>
<properties>
    <spark.version>2.3.0</spark.version>
    <hbase.version>1.2.6</hbase.version>
    <phoenix.version>4.14.0-HBase-1.2</phoenix.version>
    <java.version>1.8</java.version>
</properties>
```
Notes:
• Only the parts I changed are shown above; everything left unchanged is omitted.
• Since I changed the version, the poms of the submodules must be updated to the same version.
Below are the modified pom files of the two submodules, core and examples; only the version was changed:
```
<!-- core/pom.xml -->
<parent>
    <groupId>com.hortonworks</groupId>
    <artifactId>shc</artifactId>
    <version>spark-2.3.0-hbase-1.2.6</version>
    <relativePath>../pom.xml</relativePath>
</parent>
<artifactId>shc-core</artifactId>
<version>spark-2.3.0-hbase-1.2.6</version>
<packaging>jar</packaging>
<name>HBase Spark Connector Project Core</name>
```
```
<!-- examples/pom.xml -->
<parent>
    <groupId>com.hortonworks</groupId>
    <artifactId>shc</artifactId>
    <version>spark-2.3.0-hbase-1.2.6</version>
    <relativePath>../pom.xml</relativePath>
</parent>
<artifactId>shc-examples</artifactId>
<version>spark-2.3.0-hbase-1.2.6</version>
<packaging>jar</packaging>
<name>HBase Spark Connector Project Examples</name>
```
(2) Compile the source
Run the following mvn command in the root directory of the source package:
```
mvn install -DskipTests
```
Once the build succeeds, the project's jars are installed in your local maven repository.
3. Create a maven project for testing SHC
(1) Create a new maven project and add the freshly built shc-core as a dependency in its pom
Note that only the shc-core dependency is needed:
```
<dependency>
    <groupId>com.hortonworks</groupId>
    <artifactId>shc-core</artifactId>
    <version>spark-2.3.0-hbase-1.2.6</version>
</dependency>
```
(2) Add the spark dependencies and resolve the dependency conflicts
The spark dependencies below all exclude hadoop-client, because its version conflicts with the hadoop-client pulled in by shc-core:
```
<!-- spark dependencies, each excluding the conflicting hadoop-client -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.3.0</version>
    <exclusions>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.3.0</version>
    <exclusions>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.11</artifactId>
    <version>2.3.0</version>
    <exclusions>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
    <version>2.3.0</version>
</dependency>

<!-- add the hadoop-client dependency back explicitly -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.7.6</version>
</dependency>
```
hadoop-client 2.7.6 is chosen here because Hadoop 2.7 is compatible with every HBase release, according to the Hadoop/HBase version compatibility matrix in the official HBase documentation:
[Figure: Hadoop/HBase version compatibility matrix from the HBase official site]
(3) Add the hbase dependencies and resolve the dependency conflicts
Here it is enough to exclude the conflicting transitive dependencies:
```
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>1.2.6</version>
    <exclusions>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-auth</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-core</artifactId>
        </exclusion>
        <exclusion>
            <groupId>io.netty</groupId>
            <artifactId>netty-all</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-server</artifactId>
    <version>1.2.6</version>
    <exclusions>
        <exclusion>
            <groupId>io.netty</groupId>
            <artifactId>netty-all</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```
(4) Put hdfs-site.xml, core-site.xml and hbase-site.xml into the project's resources directory
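To verify that these files are actually picked up from the classpath, a quick sanity check (a sketch of my own, not part of the original steps) is to build an HBase configuration and print one of the cluster properties; HBaseConfiguration.create() loads hbase-site.xml from the classpath:
```
import org.apache.hadoop.hbase.HBaseConfiguration

object CheckClasspathConf {
  def main(args: Array[String]): Unit = {
    // HBaseConfiguration.create() layers hbase-default.xml and the
    // hbase-site.xml found on the classpath (our resources directory).
    val conf = HBaseConfiguration.create()
    // Should print your cluster's quorum, not the "localhost" default.
    println(conf.get("hbase.zookeeper.quorum"))
  }
}
```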
(5) Other
• I created this maven project in IDEA and added the scala-sdk to it; I won't go over that here.
• I adjusted the project layout, adding main and test source folders for scala.
• I configured the maven plugins, adding the scala compiler plugin; the resulting build section is:
```
<build>
    <plugins>

        <plugin>
            <groupId>net.alchim31.maven</groupId>
            <artifactId>scala-maven-plugin</artifactId>
            <version>3.4.2</version>
            <executions>
                <execution>
                    <id>scala-compile-first</id>
                    <phase>process-resources</phase>
                    <goals>
                        <goal>add-source</goal>
                        <goal>compile</goal>
                    </goals>
                </execution>
            </executions>
            <configuration>
                <scalaVersion>2.11.8</scalaVersion>
                <recompileMode>incremental</recompileMode>
                <useZincServer>true</useZincServer>
                <args>
                    <arg>-unchecked</arg>
                    <arg>-deprecation</arg>
                    <arg>-feature</arg>
                </args>
                <javacArgs>
                    <javacArg>-source</javacArg>
                    <javacArg>1.8</javacArg>
                    <javacArg>-target</javacArg>
                    <javacArg>1.8</javacArg>
                </javacArgs>
            </configuration>
        </plugin>

        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.8.0</version>
            <executions>
                <execution>
                    <phase>compile</phase>
                    <goals>
                        <goal>compile</goal>
                    </goals>
                </execution>
            </executions>
            <configuration>
                <skip>true</skip>
                <source>1.8</source>
                <target>1.8</target>
                <encoding>UTF-8</encoding>
            </configuration>
        </plugin>

        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-resources-plugin</artifactId>
            <version>3.1.0</version>
            <configuration>
                <encoding>UTF-8</encoding>
            </configuration>
        </plugin>

        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>3.1.1</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                    <configuration>
                        <!-- set the jar's main class -->
                        <!--
                        <transformers>
                            <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                <mainClass>com.bonc.rdpe.spark.hbase.SparkToHBase</mainClass>
                            </transformer>
                        </transformers>
                        -->
                        <filters>
                            <filter>
                                <artifact>*:*</artifact>
                                <excludes>
                                    <exclude>META-INF/*.SF</exclude>
                                    <exclude>META-INF/*.DSA</exclude>
                                    <exclude>META-INF/*.RSA</exclude>
                                </excludes>
                            </filter>
                        </filters>
                    </configuration>
                </execution>
            </executions>
        </plugin>

    </plugins>
</build>
```
With this, all the preparation needed before testing SHC is done!
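As a preview of the test itself, here is a minimal write-then-read sketch against shc-core, following the usage documented in the SHC README. The table name, column family and catalog below are placeholders of my own rather than anything fixed by the setup above:
```
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog

object ShcSmokeTest {

  // Catalog mapping a hypothetical HBase table "shc_test" (column family cf1)
  // to a two-column DataFrame; adjust names and types to your own table.
  val catalog: String =
    """{
      |  "table": {"namespace": "default", "name": "shc_test"},
      |  "rowkey": "key",
      |  "columns": {
      |    "col0": {"cf": "rowkey", "col": "key", "type": "string"},
      |    "col1": {"cf": "cf1", "col": "value", "type": "string"}
      |  }
      |}""".stripMargin

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("shc-smoke-test").getOrCreate()
    import spark.implicits._

    // Write two rows; newTable -> "5" lets SHC create the table
    // with 5 regions if it does not exist yet.
    Seq(("r1", "v1"), ("r2", "v2")).toDF("col0", "col1")
      .write
      .options(Map(
        HBaseTableCatalog.tableCatalog -> catalog,
        HBaseTableCatalog.newTable -> "5"))
      .format("org.apache.spark.sql.execution.datasources.hbase")
      .save()

    // Read the same rows back through the same catalog.
    spark.read
      .options(Map(HBaseTableCatalog.tableCatalog -> catalog))
      .format("org.apache.spark.sql.execution.datasources.hbase")
      .load()
      .show()

    spark.stop()
  }
}
```
If this job writes and reads the two rows successfully, the dependency setup above works end to end.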
