Apache HBase™ is the Hadoop database, a distributed, scalable, big data store.
Use Apache HBase™ when you need random, realtime read/write access to your Big Data. This project’s goal is the hosting of very large tables – billions of rows X millions of columns – atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google’s Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS. – From Apache HBase
Deploying a single-node standalone instance
JDK version requirements
{% img /images/hbase-jdk-version-requirements.jpg %}
# JDK 8 is recommended
sudo yum install -y java-1.8.0-openjdk-devel.x86_64
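After installing, it can help to confirm which Java version is actually on the PATH. The following is a minimal sketch rather than part of the original walkthrough: the function names java_major and installed_java_version are hypothetical helpers, and the parsing assumes the usual `java -version` output with a quoted version string.

```shell
#!/usr/bin/env bash
# Print the major Java version for a version string such as "1.8.0_232" or "11.0.5".
# (java_major is a hypothetical helper name, not part of the JDK or HBase.)
java_major() {
  local v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;      # legacy scheme: 1.8.0_232 -> 8.0_232
  esac
  echo "${v%%[._-]*}"        # keep only the leading number
}

# Extract the quoted version string from `java -version`, which prints to stderr.
installed_java_version() {
  java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

if command -v java >/dev/null 2>&1; then
  echo "Java major version: $(java_major "$(installed_java_version)")"
else
  echo "java not found on PATH"
fi
```

Compare the reported major version with the requirements table above before continuing.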
Download HBase
# Download
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hbase/stable/hbase-1.4.11-bin.tar.gz
# Unpack
tar xzvf hbase-1.4.11-bin.tar.gz
# Delete the archive
rm -f hbase-1.4.11-bin.tar.gz
# Rename
mv hbase-1.4.11 hbase
# Change into the hbase directory
cd hbase
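The steps above delete the archive right after unpacking it. If you would rather check the download against a published digest first, something like the sketch below could be run before the rm step. It assumes GNU coreutils (sha512sum) is available; verify_sha512 is a hypothetical helper, and the expected digest has to be copied by hand from the Apache download site.

```shell
# Compare the SHA-512 digest of a file with an expected hex digest.
# Returns 0 when they match, 1 otherwise. (Hypothetical helper, not an HBase tool.)
verify_sha512() {
  local file="$1" expected="$2" actual
  actual="$(sha512sum "$file" | awk '{print $1}')"
  # Normalise the expected value: drop whitespace, lower-case the hex digits.
  expected="$(printf '%s' "$expected" | tr -d '[:space:]' | tr 'A-F' 'a-f')"
  [ "$actual" = "$expected" ]
}

# Usage (replace the placeholder with the digest published for the release):
# verify_sha512 hbase-1.4.11-bin.tar.gz "<published sha512 digest>" && echo "checksum OK"
```

The normalisation step is there because published digests are sometimes printed in upper case or broken into groups separated by spaces.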
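Configure environment variables
# HBase
# Set JAVA_HOME
vi conf/hbase-env.sh
# Set environment variables here.
# The java implementation to use. Java 1.7+ required.
export JAVA_HOME=/usr/lib/jvm/java
# Edit /etc/profile (root privileges are needed)
sudo vi /etc/profile
# Add the following line; the launcher scripts such as start-hbase.sh live in the bin directory
export PATH=$PATH:/home/vagrant/hbase/bin
# Apply the change
source /etc/profile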
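The HBase launcher scripts (start-hbase.sh, stop-hbase.sh and hbase) sit in the bin directory of the installation, so a quick way to validate the PATH change is to ask the shell to resolve them. This is a small sketch; check_on_path is a hypothetical helper name.

```shell
# Report whether a command can be resolved through the current PATH.
check_on_path() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 -> $(command -v "$1")"
  else
    echo "$1 not found on PATH" >&2
    return 1
  fi
}

check_on_path start-hbase.sh || echo "re-check the PATH line in /etc/profile"
check_on_path hbase || true
```

If start-hbase.sh is not found, the PATH entry most likely points at the installation root instead of its bin directory.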
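Configure HBase
conf/hbase-site.xml is the main HBase configuration file. In it you specify the directories on the local filesystem where HBase and ZooKeeper write their data, and you acknowledge some risks: setting hbase.unsafe.stream.capability.enforce to false turns off the check that the filesystem supports hflush/hsync, which a local filesystem does not. By default the data directories are created under /tmp. Many machines clear /tmp on reboot, so the data should be stored somewhere else. The configuration below stores HBase data in /opt/hbase and ZooKeeper data in /opt/zookeeper. HBase creates /opt/hbase itself; if you create the directory by hand, HBase will attempt a migration, which is not what you want.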
# Edit conf/hbase-site.xml
vi conf/hbase-site.xml
# Add the following inside the <configuration></configuration> element
<property>
<name>hbase.rootdir</name>
<value>file:///opt/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/opt/zookeeper</value>
</property>
<property>
<name>hbase.unsafe.stream.capability.enforce</name>
<value>false</value>
</property>
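For reference, the three properties above end up inside a single configuration element. The sketch below writes the resulting file to a scratch location so it can be reviewed before being copied over conf/hbase-site.xml; write_hbase_site and the scratch path are illustrative assumptions, while the property names and values are the ones from this post.

```shell
# Write a minimal standalone hbase-site.xml to the given path.
# (write_hbase_site is a hypothetical helper; the values mirror the snippet above.)
write_hbase_site() {
  local target="$1"
  cat > "$target" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///opt/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/opt/zookeeper</value>
  </property>
  <property>
    <name>hbase.unsafe.stream.capability.enforce</name>
    <value>false</value>
  </property>
</configuration>
EOF
}

# Written to a scratch file here; copy it to conf/hbase-site.xml once reviewed.
write_hbase_site "${TMPDIR:-/tmp}/hbase-site.xml"
```

Note that the user running HBase needs permission to create /opt/hbase and /opt/zookeeper.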
First steps with HBase
Start HBase
start-hbase.sh
Connect to HBase with the hbase shell command
hbase shell
hbase(main):001:0>
Create a table
hbase(main):001:0> create 'test', 'cf'
0 row(s) in 0.4170 seconds
=> Hbase::Table - test
Use the list command to confirm that the table exists
hbase(main):002:0> list 'test'
TABLE
test
1 row(s) in 0.0180 seconds
=> ["test"]
Use describe to view the table details, including the default configuration
hbase(main):003:0> describe 'test'
Table test is ENABLED
test
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
1 row(s)
Took 0.9998 seconds
Use put to insert data into the table
hbase(main):003:0> put 'test', 'row1', 'cf:a', 'value1'
0 row(s) in 0.0850 seconds
hbase(main):004:0> put 'test', 'row2', 'cf:b', 'value2'
0 row(s) in 0.0110 seconds
hbase(main):005:0> put 'test', 'row3', 'cf:c', 'value3'
0 row(s) in 0.0100 seconds
Use the scan command to scan all data in the table
hbase(main):006:0> scan 'test'
ROW COLUMN+CELL
row1 column=cf:a, timestamp=1421762485768, value=value1
row2 column=cf:b, timestamp=1421762491785, value=value2
row3 column=cf:c, timestamp=1421762496210, value=value3
3 row(s) in 0.0230 seconds
Use the get command to fetch a single row
hbase(main):007:0> get 'test', 'row1'
COLUMN CELL
cf:a timestamp=1421762485768, value=value1
1 row(s) in 0.0350 seconds
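Use the disable/enable commands to disable or enable a table
Before you delete a table or change its settings, among other cases, you first need to disable it with disable. You can re-enable a disabled table with enable.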
hbase(main):008:0> disable 'test'
0 row(s) in 1.1820 seconds
hbase(main):009:0> enable 'test'
0 row(s) in 0.1770 seconds
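Use the drop command to delete the table
A table must be disabled before it can be dropped, so disable it again if you ran enable above.
hbase(main):010:0> disable 'test'
hbase(main):011:0> drop 'test'
0 row(s) in 0.1370 seconds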
Exit the HBase Shell
hbase(main):012:0> quit
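The same walkthrough can also be replayed without typing at the prompt, since the HBase shell accepts commands on standard input when started with -n (non-interactive mode). The sketch below assumes HBase is running and that no table named test holds data you care about, because the script drops it at the end. The script file name is an arbitrary choice.

```shell
# Save the shell commands from the walkthrough to a script file.
script="${TMPDIR:-/tmp}/hbase-quickstart.txt"
cat > "$script" <<'EOF'
create 'test', 'cf'
put 'test', 'row1', 'cf:a', 'value1'
scan 'test'
get 'test', 'row1'
disable 'test'
drop 'test'
EOF

# Run the script only if the hbase launcher is available; -n selects non-interactive mode.
if command -v hbase >/dev/null 2>&1; then
  hbase shell -n < "$script"
else
  echo "hbase not on PATH; commands saved to $script"
fi
```

In non-interactive mode the shell exits when the input ends, so no quit command is needed in the script.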
Stop HBase
stop-hbase.sh
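Use the jps command to confirm that the HMaster and HRegionServer processes have stopped (in standalone mode all daemons run inside a single JVM, listed as HMaster)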
jps
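Reading the jps listing by eye works, but the check can also be scripted. The following sketch filters jps-style output for the HBase daemon class names; hbase_daemons_running is a hypothetical helper, and it assumes the default jps format of one "pid name" pair per line.

```shell
# Read `jps`-style output on stdin and succeed if an HBase daemon is still listed.
hbase_daemons_running() {
  grep -Eq '(^|[[:space:]])(HMaster|HRegionServer|HQuorumPeer)$'
}

if command -v jps >/dev/null 2>&1; then
  if jps | hbase_daemons_running; then
    echo "HBase processes are still running"
  else
    echo "no HBase processes found"
  fi
fi
```

stop-hbase.sh can take a little while to finish, so a listing taken immediately after it returns may still show the process.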