通过thrift使用c++访问HBase
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
通过Thrift使用C++访问HBase完整文档一Linux系统下Thrift安装
1.1 安装libevent
./configure --prefix=/usr/local/libevent
make
make install
1.2 安装boost
./bootstrap.sh
./bjam "-sTOOLS=gcc" "--without-python" install
1.3 安装Thrift
chmod +x configure
./configure --with-python=no
make
make install
二生成Hbase的client代码
执行命令:
thrift --gen cpp Hbase.thrift
生成gen-cpp文件夹。
三准备hbase
步骤:
1 首先确认hbase正常工作:
查看thriftserver端口状态,如:
hbase.regionserver.thrift.port=9090
lsof –i:9090
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
java 5846 root 167u IPv6 5505603 TCP *:websm (LISTEN)
如果thriftserver服务未启动,则bin/hbase-daemon.sh start thrift启动。
2 编写测试程序,并编译。
如测例的Makefile文件如下:
THRIFT_DIR = /usr/local/include/thrift
LIB_DIR = /usr/local/lib
GEN_SRC = ./gen-cpp/Hbase.cpp \
./gen-cpp/Hbase_types.cpp \
./gen-cpp/Hbase_constants.cpp
default: DemoClient
DemoClient: DemoClient.cpp
g++ -DHA VE_NETINET_IN_H -o DemoClient -I${THRIFT_DIR} -I/usr/include -I./gen-cpp -L${LIB_DIR} -lthrift DemoClient.cpp ${GEN_SRC}
clean:
rm -rf DemoClient
3 如果遇到无法找到libthrift.so.0库的问题,则需要执行
export LD_LIBRARY_PA TH=/usr/local/lib:$LD_LIBRARY_PATH
4 执行程序
./DemoClient localhost 9090
注意:
9090是ThriftServer的默认监听端口。
四Windows系统下Thrift安装
4.1 编译boost库
1、在网站下载boost_1_36_0文件包。
/。
2、由于boost是采用其自己的bjam工具通过命令行进行编译的,所以:如果在Windows下开启console窗口(单击“开始”按钮,单击“运行”,敲入“cmd”),必须将Visual Studio中C++目录下的环境vcvarsall.bat配置脚本运行一遍,以设置好VC的编译器环境变量。
如果从vs2005的工具菜单进入命令提示窗口(单击“开始”按钮,指向“所有程序”,指向“Microsoft Visual Studio 2005”,指向“Visual Studio 工具”,然后单击“Visual Studio 2005 命令提示”),则不需要运行Visual Studio中C++目录下的环境vcvarsall.bat配置脚本。
3、解压缩到d:\boost_1_36_0\目录下。
4、编译bjam。
从vs2005的工具菜单进入命令提示窗口(单击“开始”按钮,指向“所有程序”,指向“Microsoft Visual Studio 2005”,指向“Visual Studio 工具”,然后单击“Visual Studio 2005 命令提示”),cd到d:\boost_1_36_0\tools\jam\src下执行build.bat,会在d:\boost_1_36_0\tools\jam\src\bin.ntx86\下生成bjam.exe,將bjam.exe复制到d:\boost_1_36_0\下。
5、设定编译环境。
修改user-config.jam (d:\boost_1_36_0\tools\build\v2\user-config.jam) 的MSVC configuration
# MSVC configuration
# Configure msvc (default version, searched in standard location
# and PATH).
# using msvc ;
using msvc : 8.0 : : <compileflags>/wd4819 <compileflags>/D_CRT_SECURE_NO_DEPRECATE
<compileflags>/D_SCL_SECURE_NO_DEPRECATE
<compileflags>/D_SECURE_SCL=0 ;
6、编译boost
將目录cd到d:\boost_1_36_0\下执行
(1) 编译不带ICU支持的boost库
此种情况下的boost库编译起来比较的简单,在准备好的console窗口中输入:bjam --without-python --toolset=msvc-8.0 --build-type=complete --prefix="d:\boost_1_36_0" stage
就可以了,如果要安装的话则输入:
bjam --without-python --toolset=msvc-8.0 --build-type=complete
--prefix="d:\boost_1_36_0" install
(2) 编译具有ICU支持的boost库
首先我们必须编译ICU库才能够编译boost库,在准备好的console窗口中输入:bjam -sICU_PATH=d:\ICU --without-python --toolset=msvc-8.0 --build-type=complete --prefix="d:\boost_1_36_0" stage
就可以了,如果要安装的话则输入:
bjam -sICU_PATH=d:\ICU --without-python --toolset=msvc-8.0 --build-type=complete --prefix="d:\boost_1_36_0" install
通过上面的方法可以很正常完成boost各种需要版本的关系。
参数说明:
--without-python 表示不使用python
--toolset : 所使用compiler,Visual Studio 2005 为msvc-8.0
--build-type:编译类型,complete表示生成所有的版本(debug,release等)--prefix:指定编译后library的的目录
这一步要花比较长的时间(大约几十分钟,视机器配置而定)
7、设定vs2005环境。
Tools -> Options -> Projects and Solutions -> VC++ Directories
在Library files加上d:\ boost_1_36_0\lib
在Include files加上d:\boost_1_36_0\include\boost-1_36\
注意:以上的各个目录只是作为例子说明,下载下来的文件包很小,但是编译后整个文件包就非常大(超过4G)。
在编译时要留足空间。
测试如下
// BoostTest.cpp : 定义控制台应用程序的入口点。
#include "stdafx.h"
#include <boost/lexical_cast.hpp>
#include <iostream>
using namespace std;
int _tmain(int argc, _TCHAR* argv[])
{
using boost::lexical_cast;
int a=lexical_cast<int>("123");
cout<<"number: "<<a <<endl;
return 1;
}
测例能够正常运行,说明安装成功。
4.2 编译thrift库
需要安装好VS2010的编译器。
下载好Thrift-0.8.0版源码并解压到相关目录。
4.2.1首先需要编译libthrift.lib库文件
步骤如下:
1 使用vs2010打开…\thrift-0.8.0\lib\cpp目录下libthrift.vcxproj项目。
2 给项目添加boost 附加包含目录和附加库目录。
如:
附加库目录:…\boost_1_36_0\lib
附加包含目录:…\boost_1_36_0\include\boost-1_36
3 生成解决方案,会产生libthrift.lib库文件。
注意:
如果编译时报int32_t未知类型之类的错误:
则修改项目C/C++的预处理器定义属性,添加
HA VE_CONFIG_H
BOOST_HAS_STDINT_H
4.2.2然后编译thrift库文件(本步骤只生成静态lib库文件) 4.2.2.1下载Thrift-0.8.0.exe生成Hbase的thrift客户端API Thrift-0.8.0.exe –gen cpp Hbase.thrift
生成gen-cpp目录,如下:
Hbase_server.skeleton.cpp文件之后未用到。
4.2.2.2 使用vs2010新建win32控制台应用程序
点击下一步,选择DLL(D),点击完成。
把第一步中生成的.h和.cpp文件拷贝到本工程的源文件夹里。
注意:需要添加#include "stdafx.h"到拷贝的文件。
4.2.2.3 添加附加包含目录和附加库目录
如:
附加库目录:…\boost_1_36_0\lib
附加包含目录:…\boost_1_36_0\include\boost-1_36和…\thrift-0.8.0\lib\cpp\src
4.2.2.4 生成解决方案,会产生thriftdll.lib库文件(名字视所建工程名而定)
注意:
如果编译时报int32_t未知类型之类的错误:
则修改项目C/C++的预处理器定义属性,添加
HA VE_CONFIG_H
BOOST_HAS_STDINT_H
4.3 编译hbaseclient测试
此时,已生成libthrift.lib和thriftdll.lib库文件。
如果要使用vs2010开发客户端应用程序,需要修改应用项目的配置属性。
如:
附加包含目录:…\boost_1_36_0\include\boost-1_36和…\thrift-0.8.0\lib\cpp\src 和…\gen-cpp
附加库目录:…\boost_1_36_0\lib 和…\boost_1_36_0\boost\bin\vc9\lib
链接器->输入->附加依赖项增加:
libboost_thread-vc90-mt-gd-1_36.lib
libthrift.lib
thriftdll.lib
设置完成后,可以执行测例查看是否成功。
测例可使用第五部分例子。
五例子
c++例子
DemoClient.cpp文件
(注意:该测例可以在Linux系统上运行,在windows系统上,如果报:
错误,则需要在文件头增加宏定义:#define_WIN32_WINNT 0x0500) #include <stdint.h>
#include <config.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/time.h>
#include <poll.h>
#include <iostream>
#include <boost/lexical_cast.hpp>
#include <protocol/TBinaryProtocol.h>
#include <protocol/TCompactProtocol.h>
#include <transport/TSocket.h>
#include <transport/TTransportUtils.h>
#include "Hbase.h"
using namespace apache::thrift;
using namespace apache::thrift::protocol;
using namespace apache::thrift::transport;
using namespace apache::hadoop::hbase::thrift;
namespace {
typedef std::vector<std::string> StrVec;
typedef std::map<std::string,std::string> StrMap;
typedef std::vector<ColumnDescriptor> ColVec;
typedef std::map<std::string,ColumnDescriptor> ColMap; typedef std::vector<TCell> CellVec;
typedef std::map<std::string,TCell> CellMap;
static void
printRow(const std::vector<TRowResult> &rowResult)
{
for (size_t i = 0; i < rowResult.size(); i++) {
std::cout << "row: " << rowResult[i].row << ", cols: ";
for (CellMap::const_iterator it = rowResult[i].columns.begin();
it != rowResult[i].columns.end(); ++it) {
std::cout << it->first << " => " << it->second.value << "; ";
}
std::cout << std::endl;
}
}
static void
printVersions(const std::string &row, const CellVec &versions)
{
std::cout << "row: " << row << ", values: ";
for (CellVec::const_iterator it = versions.begin(); it != versions.end(); ++it) {
std::cout << (*it).value << "; ";
}
std::cout << std::endl;
}
}
int
main(int argc, char** argv)
{
if (argc < 3) {
std::cerr << "Invalid arguments!\n" << "Usage: DemoClient host port" << std::endl;
return -1;
}
boost::shared_ptr<TTransport> socket(new TSocket(argv[1], boost::lexical_cast<int>(argv[2])));
boost::shared_ptr<TTransport> transport(new TBufferedTransport(socket));
boost::shared_ptr<TProtocol> protocol(new TBinaryProtocol(transport));
HbaseClient client(protocol);
try {
transport->open();
std::string t("demo_table");
//
// Scan all tables, look for the demo table and delete it.
//
std::cout << "scanning tables..." << std::endl;
StrVec tables;
client.getTableNames(tables);
for (StrVec::const_iterator it = tables.begin(); it != tables.end(); ++it) { std::cout << " found: " << *it << std::endl;
if (t == *it) {
if (client.isTableEnabled(*it)) {
std::cout << " disabling table: " << *it << std::endl;
client.disableTable(*it);
}
std::cout << " deleting table: " << *it << std::endl;
client.deleteTable(*it);
}
}
//
// Create the demo table with two column families, entry: and unused: //
ColVec columns;
columns.push_back(ColumnDescriptor());
columns.back().name = "entry:";
columns.back().maxVersions = 10;
columns.push_back(ColumnDescriptor());
columns.back().name = "unused:";
std::cout << "creating table: " << t << std::endl;
try {
client.createTable(t, columns);
} catch (const AlreadyExists &ae) {
std::cerr << "W ARN: " << ae.message << std::endl;
}
ColMap columnMap;
client.getColumnDescriptors(columnMap, t);
std::cout << "column families in " << t << ": " << std::endl;
for (ColMap::const_iterator it = columnMap.begin(); it != columnMap.end(); ++it) {
std::cout << " column: " << it-> << ", maxVer: " << it->second.maxVersions << std::endl;
}
//
// Test UTF-8 handling
//
std::string invalid("foo-\xfc\xa1\xa1\xa1\xa1\xa1");
std::string valid("foo-\xE7\x94\x9F\xE3\x83\x93\xE3\x83\xBC\xE3\x83\xAB");
// non-utf8 is fine for data
const std::map<Text, Text> attributes;
std::vector<Mutation> mutations;
mutations.push_back(Mutation());
mutations.back().column = "entry:foo";
mutations.back().value = invalid;
client.mutateRow(t, "foo", mutations, attributes);
// try empty strings
mutations.clear();
mutations.push_back(Mutation());
mutations.back().column = "entry:";
mutations.back().value = "";
client.mutateRow(t, "", mutations, attributes);
// this row name is valid utf8
mutations.clear();
mutations.push_back(Mutation());
mutations.back().column = "entry:foo";
mutations.back().value = valid;
client.mutateRow(t, valid, mutations, attributes);
// non-utf8 is now allowed in row names because HBase stores values as binary mutations.clear();
mutations.push_back(Mutation());
mutations.back().column = "entry:foo";
mutations.back().value = invalid;
client.mutateRow(t, invalid, mutations, attributes);
// Run a scanner on the rows we just created
StrVec columnNames;
columnNames.push_back("entry:");
std::cout << "Starting scanner..." << std::endl;
int scanner = client.scannerOpen(t, "", columnNames, attributes);
try {
while (true) {
std::vector<TRowResult> value;
client.scannerGet(value, scanner);
if (value.size() == 0)
break;
printRow(value);
}
} catch (const IOError &ioe) {
std::cerr << "FATAL: Scanner raised IOError" << std::endl; }
client.scannerClose(scanner);
std::cout << "Scanner finished" << std::endl;
//
// Run some operations on a bunch of rows.
//
for (int i = 100; i >= 0; --i) {
// format row keys as "00000" to "00100"
char buf[32];
sprintf(buf, "%05d", i);
std::string row(buf);
std::vector<TRowResult> rowResult;
mutations.clear();
mutations.push_back(Mutation());
mutations.back().column = "unused:";
mutations.back().value = "DELETE_ME";
client.mutateRow(t, row, mutations, attributes);
client.getRow(rowResult, t, row, attributes);
printRow(rowResult);
client.deleteAllRow(t, row, attributes);
mutations.clear();
mutations.push_back(Mutation());
mutations.back().column = "entry:num";
mutations.back().value = "0";
mutations.push_back(Mutation());
mutations.back().column = "entry:foo";
mutations.back().value = "FOO";
client.mutateRow(t, row, mutations, attributes);
client.getRow(rowResult, t, row, attributes);
printRow(rowResult);
// sleep to force later timestamp
poll(0, 0, 50);
mutations.clear();
mutations.push_back(Mutation());
mutations.back().column = "entry:foo";
mutations.back().isDelete = true;
mutations.push_back(Mutation());
mutations.back().column = "entry:num";
mutations.back().value = "-1";
client.mutateRow(t, row, mutations, attributes);
client.getRow(rowResult, t, row, attributes);
printRow(rowResult);
mutations.clear();
mutations.push_back(Mutation());
mutations.back().column = "entry:num";
mutations.back().value = boost::lexical_cast<std::string>(i); mutations.push_back(Mutation());
mutations.back().column = "entry:sqr";
mutations.back().value = boost::lexical_cast<std::string>(i*i); client.mutateRow(t, row, mutations, attributes);
client.getRow(rowResult, t, row, attributes);
printRow(rowResult);
mutations.clear();
mutations.push_back(Mutation());
mutations.back().column = "entry:num";
mutations.back().value = "-999";
mutations.push_back(Mutation());
mutations.back().column = "entry:sqr";
mutations.back().isDelete = true;
client.mutateRowTs(t, row, mutations, 1, attributes); // shouldn't override latest client.getRow(rowResult, t, row, attributes);
printRow(rowResult);
CellVec versions;
client.getVer(versions, t, row, "entry:num", 10, attributes);
printVersions(row, versions);
assert(versions.size());
std::cout << std::endl;
try {
std::vector<TCell> value;
client.get(value, t, row, "entry:foo", attributes);
if (value.size()) {
std::cerr << "FATAL: shouldn't get here!" << std::endl;
return -1;
}
} catch (const IOError &ioe) {
// blank
}
}
// scan all rows/columns
columnNames.clear();
client.getColumnDescriptors(columnMap, t);
std::cout << "The number of columns: " << columnMap.size() << std::endl;
for (ColMap::const_iterator it = columnMap.begin(); it != columnMap.end(); ++it) { std::cout << " column with name: " + it-> << std::endl; columnNames.push_back(it->);
}
std::cout << std::endl;
std::cout << "Starting scanner..." << std::endl;
scanner = client.scannerOpenWithStop(t, "00020", "00040", columnNames, attributes); try {
while (true) {
std::vector<TRowResult> value;
client.scannerGet(value, scanner);
if (value.size() == 0)
break;
printRow(value);
}
} catch (const IOError &ioe) {
std::cerr << "FATAL: Scanner raised IOError" << std::endl;
}
client.scannerClose(scanner);
std::cout << "Scanner finished" << std::endl;
transport->close();
} catch (const TException &tx) {
std::cerr << "ERROR: " << tx.what() << std::endl; }
}。