In the last lesson we took a quick look at Shell scripting. In this lesson we take a quick look at HBase, starting with its overview, shown in the figure below.
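Next, let's go through a few HBase concepts. The first is the Row Key, shown in the figure below. As the name suggests, the Row Key is the primary key of a row. HBase stores rows sorted by Row Key and indexes them, so given a Row Key we can quickly locate the matching row. We can also locate data by a Row Key range: a query specifies a range, returns multiple rows at once, and uses the same index to find them quickly. A full table scan can also retrieve the data we want, but it is comparatively slow.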
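To make the difference between the three access paths concrete, here is a minimal Python sketch. It is not HBase code: the table contents and the function names get, scan_range, and full_scan are made up for illustration, and a sorted list with binary search stands in for HBase's sorted, indexed storage. The range is start-inclusive and stop-exclusive, as in an HBase scan.

```python
from bisect import bisect_left

# Toy stand-in for a table: rows kept sorted by row key (hypothetical data).
ROWS = [
    ("user001", {"name": "Ann"}),
    ("user002", {"name": "Bob"}),
    ("user005", {"name": "Eve"}),
    ("user009", {"name": "Zed"}),
]
KEYS = [key for key, _ in ROWS]


def get(row_key):
    """Point lookup: binary search on the sorted keys."""
    i = bisect_left(KEYS, row_key)
    if i < len(KEYS) and KEYS[i] == row_key:
        return ROWS[i][1]
    return None


def scan_range(start_key, stop_key):
    """Range scan over [start_key, stop_key): jump to the start, read until stop."""
    lo = bisect_left(KEYS, start_key)
    hi = bisect_left(KEYS, stop_key)
    return ROWS[lo:hi]


def full_scan(predicate):
    """Full table scan: every row is visited, so cost grows with table size."""
    return [(key, row) for key, row in ROWS if predicate(row)]


print(get("user002"))
print(scan_range("user002", "user009"))
print(full_scan(lambda row: row["name"].startswith("E")))
```

The first two functions touch only the rows they return (plus a few comparisons to find them), while full_scan has to look at every row even when only one matches.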
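The second concept is the column family, shown in the figure below. Column families must be declared when the table is created, and a table can have one or more of them. A column family contains zero or more columns, and columns can be added dynamically. To add a column family after the table has been created, first disable the table, then alter it to add the column family, and finally enable it again.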
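The rules above can be sketched with a small in-memory model. This is only an illustration in Python: ToyTable and its methods are hypothetical names, not the HBase client API, and the model ignores everything except the two rules in question (families are fixed in the schema, columns are free-form).

```python
class ToyTable:
    """Hypothetical in-memory model; not the HBase client API."""

    def __init__(self, name, families):
        if not families:
            raise ValueError("a table needs at least one column family")
        self.name = name
        self.enabled = True
        # family -> row key -> column qualifier -> value
        self.families = {family: {} for family in families}

    def put(self, row_key, column, value):
        family, qualifier = column.split(":", 1)
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        # The qualifier does not need to be declared in advance.
        self.families[family].setdefault(row_key, {})[qualifier] = value

    def disable(self):
        self.enabled = False

    def enable(self):
        self.enabled = True

    def add_family(self, family):
        if self.enabled:
            raise RuntimeError("disable the table before altering it")
        self.families.setdefault(family, {})


table = ToyTable("people", ["info"])
table.put("row1", "info:name", "Ann")
table.put("row1", "info:age", "30")   # a new column, no schema change needed

table.disable()                        # step 1: disable the table
table.add_family("data")               # step 2: alter it
table.enable()                         # step 3: enable it again
table.put("row1", "data:score", "99")
print(sorted(table.families))
```

Writing to a column in an undeclared family fails, while writing to a brand-new column inside an existing family just works; that asymmetry is the point of the sketch.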
The third concept is the timestamp. Timestamps index the versions of a value, so with a timestamp we can quickly find the version of the data we want.
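As a rough picture of what "versions indexed by timestamp" means, here is a small Python sketch of a single cell that keeps several timestamped values. VersionedCell and the as_of parameter are names invented for this example, and the lookup rule (newest value at or before a given time) is a simplification rather than a description of HBase's exact read semantics.

```python
from bisect import bisect_right


class VersionedCell:
    """Hypothetical model of one cell holding several timestamped versions."""

    def __init__(self):
        self._timestamps = []   # kept sorted in ascending order
        self._values = []       # values aligned with _timestamps

    def put(self, timestamp, value):
        i = bisect_right(self._timestamps, timestamp)
        if i > 0 and self._timestamps[i - 1] == timestamp:
            self._values[i - 1] = value       # same timestamp: overwrite
        else:
            self._timestamps.insert(i, timestamp)
            self._values.insert(i, value)

    def get(self, as_of=None):
        """Return the newest value, or the newest one at or before as_of."""
        if not self._timestamps:
            return None
        if as_of is None:
            return self._values[-1]
        i = bisect_right(self._timestamps, as_of)
        return self._values[i - 1] if i else None


cell = VersionedCell()
cell.put(1000, "v1")
cell.put(3000, "v3")
cell.put(2000, "v2")          # arrives late, still lands in timestamp order
print(cell.get())             # newest version
print(cell.get(as_of=2500))   # version that was current at time 2500
```

Because the timestamps stay sorted, finding a version is a binary search rather than a walk through every stored value.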
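That covers three HBase concepts in brief. One very nice thing about learning HBase is that the project provides official documentation in Chinese. Where is it?
First open the HBase website at http://hbase.apache.org/; the location of the Chinese reference guide is shown in the figure below.
Clicking through to the Chinese guide brings up the page shown in the figure below (I captured only a small part). It is very detailed, and you can use it to study HBase.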
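That is enough HBase theory for now. Let's get a more concrete feel for it by actually using it. We are currently on Hadoop-2.2.0, so how do HBase and Hadoop versions match up? See the figure below. That document is fairly old, but it shows that HBase has supported Hadoop 2.x since version 0.96. HBase is now at 0.98.23, which of course also supports Hadoop 2.x.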
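Next we download from the HBase website the HBase package that matches our Hadoop version, as shown in the figure below.
Once the package is downloaded, upload it to one of our virtual machines (any one will do, since for now we are only learning single-node HBase). Here I uploaded it to the root user's home directory on my itcast03 virtual machine.
Then we extract the package into the /itcast directory with the command: tar -zxvf hbase-0.98.23-hadoop2-bin.tar.gz -C /itcast/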
[root@itcast03 ~]# ls
anaconda-ks.cfg Documents hbase-0.98.23-hadoop2-bin.tar.gz install.log.syslog logs mysql-connector-java-5.1.40-bin.jar Public Templates time wc.txt
Desktop Downloads install.log jdk-7u80-linux-x64.gz Music Pictures sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz test.sh Videos
[root@itcast03 ~]# tar -zxvf hbase-0.98.23-hadoop2-bin.tar.gz -C /itcast/
After extracting, go to the /itcast directory, find the directory we just unpacked, and locate its configuration directory, as shown below.
[root@itcast03 ~]# cd /itcast/
[root@itcast03 itcast]# ls
hadoop-2.2.0 hbase-0.98.23-hadoop2 sqoop-1.4.6 sqoop-1.4.6.bin__hadoop-2.0.4-alpha
[root@itcast03 itcast]# cd hbase-0.98.23-hadoop2/
[root@itcast03 hbase-0.98.23-hadoop2]# ls
bin CHANGES.txt conf docs hbase-webapps LEGAL lib LICENSE.txt NOTICE.txt README.txt
[root@itcast03 hbase-0.98.23-hadoop2]# cd conf
[root@itcast03 conf]# ls
hadoop-metrics2-hbase.properties hbase-env.cmd hbase-env.sh hbase-policy.xml hbase-site.xml log4j.properties regionservers
[root@itcast03 conf]#
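To get HBase running we need to make small changes to two files, hbase-env.sh and hbase-site.xml. Start with hbase-env.sh, shown below. The export JAVA_HOME line in hbase-env.sh originally points to a JDK 1.6 path and is commented out. Uncomment it and change the JDK path to the version we are using, then save and exit.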
[root@itcast03 conf]# vim hbase-env.sh
#
#/**
# * Licensed to the Apache Software Foundation (ASF) under one
# * or more contributor license agreements. See the NOTICE file
# * distributed with this work for additional information
# * regarding copyright ownership. The ASF licenses this file
# * to you under the Apache License, Version 2.0 (the
# * "License"); you may not use this file except in compliance
# * with the License. You may obtain a copy of the License at
# *
# * http://www.apache.org/licenses/LICENSE-2.0
# *
# * Unless required by applicable law or agreed to in writing, software
# * distributed under the License is distributed on an "AS IS" BASIS,
# * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# * See the License for the specific language governing permissions and
# * limitations under the License.
# */
# Set environment variables here.
# This script sets variables multiple times over the course of starting an hbase process,
# so try to keep things idempotent unless you want to take an even deeper look
# into the startup scripts (bin/hbase, etc.)
# The java implementation to use. Java 1.6 required.
export JAVA_HOME=/usr/java/jdk1.7.0_80
# Extra Java CLASSPATH elements. Optional.
Next we configure hbase-site.xml. For now we set the file system to the local file system, as shown below. Note: the indentation between <configuration> and <property>, and between <property> and its child elements, uses Tab characters.
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
	<property>
		<name>hbase.rootdir</name>
		<value>file:///root/hbase</value>
	</property>
</configuration>
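The property HBase reads for its data location is named hbase.rootdir. As far as I know, a misspelled property name in this file is not reported as an error; the setting is simply ignored and a default location is used. A quick sanity check can catch that. The sketch below is a Python illustration, not part of HBase: read_properties and check_rootdir are hypothetical helper names, and the embedded SAMPLE string stands in for the real conf/hbase-site.xml, whose value here is assumed to be file:///root/hbase.

```python
import xml.etree.ElementTree as ET

# Sample contents for illustration; in practice read conf/hbase-site.xml instead.
SAMPLE = """<?xml version="1.0"?>
<configuration>
	<property>
		<name>hbase.rootdir</name>
		<value>file:///root/hbase</value>
	</property>
</configuration>
"""


def read_properties(xml_text):
    """Return {name: value} for every <property> under <configuration>."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_rootdir(props):
    """Raise if hbase.rootdir is missing, listing the names that were found."""
    if "hbase.rootdir" not in props:
        raise KeyError(f"hbase.rootdir not set; found: {sorted(props)}")
    return props["hbase.rootdir"]


print(check_rootdir(read_properties(SAMPLE)))
```

To check the real file, replace SAMPLE with the text of conf/hbase-site.xml, for example by reading it with open(path).read().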
With both files configured, our single-node HBase is set up, so let's start it. To start HBase, go to HBase's bin directory and run ./start-hbase.sh, as shown below.
[root@itcast03 hbase-0.98.23-hadoop2]# cd bin
[root@itcast03 bin]# ls
get-active-master.rb hbase-cleanup.sh hbase-config.cmd hbase-daemons.sh local-master-backup.sh region_mover.rb replication start-hbase.cmd stop-hbase.sh zookeepers.sh
graceful_stop.sh hbase.cmd hbase-config.sh hbase-jruby local-regionservers.sh regionservers.sh rolling-restart.sh start-hbase.sh test
hbase hbase-common.sh hbase-daemon.sh hirb.rb master-backup.sh region_status.rb shutdown_regionserver.rb stop-hbase.cmd thread-pool.rb
[root@itcast03 bin]# ./start-hbase.sh
starting master, logging to /itcast/hbase-0.98.23-hadoop2/bin/../logs/hbase-root-master-itcast03.out
[root@itcast03 bin]#
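After starting HBase, check which processes are running. There is a new process named HMaster; this is the HBase process. In standalone mode the master, the region server, and ZooKeeper all run inside this single JVM, which is why only one process appears.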
[root@itcast03 bin]# jps
5537 HMaster
5623 Jps
[root@itcast03 bin]#
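With HBase running, let's open a client to check whether the installation works, much as we would after installing MySQL. The command is hbase, which accepts many subcommands. Running ./hbase with no arguments prints what can follow it, as shown below. To check HBase we use shell.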
[root@itcast03 bin]# ./hbase
Usage: hbase [<options>] <command> [<args>]
Options:
--config DIR Configuration direction to use. Default: ./conf
--hosts HOSTS Override the list in 'regionservers' file
--auth-as-server Authenticate to ZooKeeper using servers configuration
Commands:
Some commands take arguments. Pass no args or -h for usage.
shell Run the HBase shell
hbck Run the hbase 'fsck' tool
hlog Write-ahead-log analyzer
snapshot Create a new snapshot of a table
snapshotinfo Tool for dumping snapshot information
hfile Store file analyzer
zkcli Run the ZooKeeper shell
upgrade Upgrade hbase
master Run an HBase HMaster node
regionserver Run an HBase HRegionServer node
zookeeper Run a Zookeeper server
rest Run an HBase REST server
thrift Run the HBase Thrift server
thrift2 Run the HBase Thrift2 server
clean Run the HBase clean up script
classpath Dump hbase CLASSPATH
mapredcp Dump CLASSPATH entries required by mapreduce
pe Run PerformanceEvaluation
ltt Run LoadTestTool
canary Run the Canary tool
version Print the version
CLASSNAME Run the class named CLASSNAME
[root@itcast03 bin]#
Now let's run ./hbase shell.
[root@itcast03 bin]# ./hbase shell
2016-10-30 11:51:13,095 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.98.23-hadoop2, r44c724b56dc1431209f561cb997fce805f9f45f9, Wed Oct 5 01:05:05 UTC 2016
hbase(main):001:0>
Run the help command, as shown below. It responds, which shows that the single-node HBase is working.
hbase(main):001:0> help
HBase Shell, version 0.98.23-hadoop2, r44c724b56dc1431209f561cb997fce805f9f45f9, Wed Oct 5 01:05:05 UTC 2016
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.
COMMAND GROUPS:
Group name: general
Commands: processlist, status, table_help, version, whoami
Group name: ddl
Commands: alter, alter_async, alter_status, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, show_filters
Group name: namespace
Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables
Group name: dml
Commands: append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, truncate, truncate_preserve
Group name: tools
Commands: assign, balance_switch, balancer, balancer_enabled, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, close_region, compact, compact_rs, flush, hlog_roll, major_compact, merge_region, move, split, trace, unassign, zk_dump
Group name: replication
Commands: add_peer, disable_peer, disable_table_replication, enable_peer, enable_table_replication, get_peer_config, list_peer_configs, list_peers, list_replicated_tables, remove_peer, set_peer_tableCFs, show_peer_tableCFs, update_peer_config
Group name: snapshots
Commands: clone_snapshot, delete_all_snapshot, delete_snapshot, delete_table_snapshots, list_snapshots, list_table_snapshots, restore_snapshot, snapshot
Group name: security
Commands: grant, list_security_capabilities, revoke, user_permission
Group name: visibility labels
Commands: add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility
SHELL USAGE:
Quote all names in HBase Shell such as table and column names. Commas delimit
command parameters. Type <RETURN> after entering a command to run it.
Dictionaries of configuration used in the creation and alteration of tables are
Ruby Hashes. They look like this:
{'key1' => 'value1', 'key2' => 'value2', ...}
and are opened and closed with curley-braces. Key/values are delimited by the
'=>' character combination. Usually keys are predefined constants such as
NAME, VERSIONS, COMPRESSION, etc. Constants do not need to be quoted. Type
'Object.constants' to see a (messy) list of all constants in the environment.
If you are using binary keys or values and need to enter them in the shell, use
double-quote'd hexadecimal representation. For example:
hbase> get 't1', "key\x03\x3f\xcd"
hbase> get 't1', "key\003\023\011"
hbase> put 't1', "test\xef\xff", 'f1:', "\x01\x33\x40"
The HBase shell is the (J)Ruby IRB with the above HBase-specific commands added.
For more on the HBase Shell, see http://hbase.apache.org/book.html
hbase(main):002:0>
That's all for this section.