Saturday, 17 September 2016

Installing Apache HBase on Windows using Cygwin64


Having installed Hadoop on Windows using Cygwin in our previous blog (Installing Apache Hadoop on Windows 10 using Cygwin64), we now install HBase on Windows using Cygwin.



Tools Used :


  • Apache HBase 1.2.3
  • Cygwin64
  • Java 1.8
  • Hadoop 2.7.1



Download the hbase-1.2.3-bin.tar.gz binary from here and place it under c:/cygwin/root/usr/local.

Start a Cygwin terminal as administrator and issue the commands below to extract the hbase-1.2.3-bin.tar.gz content.

$ cd /usr/local
$ tar xvf hbase-1.2.3-bin.tar.gz
Create a logs folder, i.e. C:\cygwin\root\usr\local\hbase-1.2.3\logs.


HBase reads its default configuration from ./conf/hbase-default.xml; overrides go into ./conf/hbase-site.xml. Some properties do not resolve to existing directories because the JVM runs on Windows. This is the major issue to keep in mind when working with Cygwin: within the shell all paths are *nix-alike, hence relative to the root /, but every parameter that is consumed by the Windows processes themselves needs to be a Windows setting, hence C:\-alike. Change the following properties in the configuration file, adjusting paths where necessary to conform with your own installation:
  • hbase.rootdir must read e.g. file:///C:/cygwin/root/tmp/hbase/data, or hdfs://127.0.0.1:9000/hbase when using the Hadoop file system (HDFS).
  • hbase.tmp.dir must read C:/cygwin/root/tmp/hbase/tmp
  • hbase.zookeeper.quorum must read 127.0.0.1 because for some reason localhost doesn't seem to resolve properly on Cygwin.
Make sure the configured hbase.rootdir and hbase.tmp.dir directories exist and have the proper rights set up, e.g. by issuing a chmod 777 on them.
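For example, from a Cygwin terminal (a quick sketch assuming Cygwin's root is C:\cygwin\root as in this guide, so /tmp maps to C:/cygwin/root/tmp):

$ mkdir -p /tmp/hbase/data /tmp/hbase/tmp
$ chmod 777 /tmp/hbase/data /tmp/hbase/tmp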

Go to c:/cygwin/root/usr/local/hbase-1.2.3/conf and add the following to the hbase-site.xml file.

<configuration>
<property>
 <name>hbase.rootdir</name> 
 <!--<value>file:///C:/cygwin/root/tmp/hbase/data</value> -->
 <value>hdfs://127.0.0.1:9000/hbase</value>
</property>
<property>
 <name>hbase.zookeeper.quorum</name> 
 <value>127.0.0.1</value> 
</property>
<property>
 <name>hbase.tmp.dir</name> 
 <value>C:/cygwin/root/tmp/hbase/tmp</value>
</property>
</configuration>

Add the following to the hbase-env.sh file.

export JAVA_HOME=/usr/local/java/
export HBASE_CLASSPATH=/cygwin/root/usr/local/hbase-1.2.3/lib/
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"
export HBASE_IDENT_STRING=$HOSTNAME

Start a Cygwin terminal, if you haven't already.
Please make sure Hadoop is started before issuing the HBase start command. Type jps to
check whether the Hadoop daemon processes are running.
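If Hadoop is up, jps should list the daemons started in the previous blog, roughly like this (the process ids are only illustrative):

$ jps
4632 NameNode
5120 DataNode
6004 ResourceManager
6388 NodeManager
7212 Jps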










Create the hbase directory in HDFS.
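If you prefer the command line over Eclipse, the directory can be created with the HDFS shell, assuming the NameNode from the previous article is listening on hdfs://127.0.0.1:9000 (a sketch):

$ cd /usr/local/hadoop-2.7.1/bin
$ hdfs dfs -mkdir /hbase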





Refer to the Hadoop-eclipse-plugin installation blog to create the folder in HDFS using Eclipse, if you haven't already.









Change directory to the HBase installation using cd /usr/local/hbase-1.2.3.
Start HBase using the command sh start-hbase.sh.
When prompted to accept the SSH fingerprint, answer yes.
When prompted, provide your password, possibly multiple times.
When the command completes, the HBase server should have started.
However, to be absolutely certain, check the logs in the ./logs directory for any exceptions.
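Put together, a minimal sketch of the start sequence looks as follows (in a standard HBase 1.2.3 layout the start script lives under bin/):

$ cd /usr/local/hbase-1.2.3
$ sh bin/start-hbase.sh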













ZooKeeper startup logs
2016-09-18 12:59:10,944 INFO  [main] server.ZooKeeperServer: Server environment:java.library.path=C:\java\jdk1.8.0_101\bin;C:\WINDOWS\Sun\Java\bin;C:\WINDOWS\system32;C:\WINDOWS;C:\cygwin\root\bin;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0;%JAVA_HOME%\bin;%CYGWIN_HOME%\bin;%HADOOP_BIN_PATH%;%HADOOP_HOME%\bin;%MAVEN_HOME%\bin;.
2016-09-18 12:59:10,944 INFO  [main] server.ZooKeeperServer: Server environment:java.io.tmpdir=C:\Users\Naveen\
2016-09-18 12:59:10,944 INFO  [main] server.ZooKeeperServer: Server environment:java.compiler=<NA>
2016-09-18 12:59:10,944 INFO  [main] server.ZooKeeperServer: Server environment:os.name=Windows 10
2016-09-18 12:59:10,944 INFO  [main] server.ZooKeeperServer: Server environment:os.arch=amd64
2016-09-18 12:59:10,944 INFO  [main] server.ZooKeeperServer: Server environment:os.version=10.0
2016-09-18 12:59:10,944 INFO  [main] server.ZooKeeperServer: Server environment:user.name=Naveen
2016-09-18 12:59:10,944 INFO  [main] server.ZooKeeperServer: Server environment:user.home=C:\Users\Naveen
2016-09-18 12:59:10,944 INFO  [main] server.ZooKeeperServer: Server environment:user.dir=C:\cygwin\root\usr\local\hbase-1.2.3
2016-09-18 12:59:10,957 INFO  [main] server.ZooKeeperServer: tickTime set to 3000
2016-09-18 12:59:10,957 INFO  [main] server.ZooKeeperServer: minSessionTimeout set to -1
2016-09-18 12:59:10,957 INFO  [main] server.ZooKeeperServer: maxSessionTimeout set to 90000
2016-09-18 12:59:11,316 INFO  [main] server.NIOServerCnxnFactory: binding to port 0.0.0.0/0.0.0.0:2181

HBase startup logs
Sun, Sep 18, 2016 12:59:06 PM Starting master on Naveen
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
open files                      (-n) 256
pipe size            (512 bytes, -p) 8
stack size              (kbytes, -s) 2032
cpu time               (seconds, -t) unlimited
max user processes              (-u) 256
virtual memory          (kbytes, -v) unlimited
2016-09-18 12:59:08,128 INFO  [main] util.VersionInfo: HBase 1.2.3
2016-09-18 12:59:08,129 INFO  [main] util.VersionInfo: Source code repository git://kalashnikov.att.net/Users/stack/checkouts/hbase.git.commit revision=bd63744624a26dc3350137b564fe746df7a721a4
.
.
.
.
.
2016-09-18 12:59:25,144 INFO  [regionserver/Naveen/192.168.56.1:0.logRoller] wal.FSHLog: Rolled WAL /hbase/WALs/naveen,59600,1474174753236/naveen%2C59600%2C1474174753236.default.1474174764214 with entries=2, filesize=303 B; new WAL /hbase/WALs/naveen,59600,1474174753236/naveen%2C59600%2C1474174753236.default.1474174764743
2016-09-18 12:59:25,242 INFO  [Naveen:59566.activeMasterManager] master.HMaster: Master has completed initialization
2016-09-18 12:59:25,244 INFO  [Naveen:59566.activeMasterManager] quotas.MasterQuotaManager: Quota support disabled
2016-09-18 12:59:25,245 INFO  [Naveen:59566.activeMasterManager] zookeeper.ZooKeeperWatcher: not a secure deployment, proceeding
2016-09-18 12:59:27,174 INFO  [HBase-Metrics2-1] impl.MetricsSystemImpl: Stopping HBase metrics system...
2016-09-18 12:59:27,183 INFO  [HBase-Metrics2-1] impl.MetricsSystemImpl: HBase metrics system stopped.
2016-09-18 12:59:27,688 INFO  [HBase-Metrics2-1] impl.MetricsConfig: loaded properties from hadoop-metrics2-hbase.properties
2016-09-18 12:59:27,693 INFO  [HBase-Metrics2-1] impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2016-09-18 12:59:27,693 INFO  [HBase-Metrics2-1] impl.MetricsSystemImpl: HBase metrics system started
2016-09-18 12:59:46,246 INFO  [WALProcedureStoreSyncThread] wal.WALProcedureStore: Remove log: hdfs://127.0.0.1:9000/hbase/MasterProcWALs/state-00000000000000000001.log
2016-09-18 12:59:46,246 INFO  [WALProcedureStoreSyncThread] wal.WALProcedureStore: Removed logs: [hdfs://127.0.0.1:9000/hbase/MasterProcWALs/state-00000000000000000002.log]

Type jps to check if the HMaster daemon process is running.
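In this local setup the region server and the embedded ZooKeeper run inside the master's JVM, so alongside the Hadoop daemons you should mainly see an HMaster entry, roughly like this (process id illustrative):

$ jps
6812 HMaster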











Next, we start the HBase shell using the command sh hbase shell.
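As with start-hbase.sh, the hbase script sits under bin/ in a standard layout, so a sketch of the invocation is:

$ cd /usr/local/hbase-1.2.3
$ sh bin/hbase shell
hbase(main):001:0>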












After starting HBase, the HDFS file system should show the directory structure below.
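The same structure can also be listed from the Hadoop bin directory; for HBase 1.2 it typically contains entries such as data, WALs and MasterProcWALs (the latter is visible in the startup logs above):

$ hdfs dfs -ls /hbase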


















Now, let's play with some HBase commands.




Using a long column family name, such as columnfamily1, is a horrible idea in production. Every cell (i.e. every value) in HBase is stored fully qualified. This basically means that long column family names will balloon the amount of disk space required to store your data. In summary, keep your column family names as small as possible.

To start, I’m going to create a new table named cars. My column family is vi, which is an abbreviation of vehicle information.

The schema below is for illustration purposes only and should not be used to create a production schema. In production, you should create a Row ID that helps to uniquely identify the row, and that is likely to be used in your queries. Therefore, one possibility would be to shift the Make, Model and Year left and use these items in the Row ID.

create 'cars', 'vi'
Let’s insert 3 column qualifiers (make, model, year) and the associated values into the first row (row1).


put 'cars', 'row1', 'vi:make', 'bmw'
put 'cars', 'row1', 'vi:model', '5 series'
put 'cars', 'row1', 'vi:year', '2012'

Now let’s add a second row.

put 'cars', 'row2', 'vi:make', 'mercedes'
put 'cars', 'row2', 'vi:model', 'e class'
put 'cars', 'row2', 'vi:year', '2012'
List the tables using the command below.

list


Scan a Table (i.e. Query a Table)

We’ll start with a basic scan that returns all columns in the cars table.

scan 'cars'
You should see output similar to:













Reading the output above you’ll notice that the Row ID is listed under ROW. The COLUMN+CELL field shows the column family after column=, then the column qualifier, a timestamp that is automatically created by HBase, and the value.

Importantly, each row in our results shows an individual row id + column family + column qualifier combination. Therefore, you’ll notice that multiple columns in a row are displayed in multiple rows in our results.
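For reference, each result line in the scan output has roughly this shape (timestamps are illustrative):

ROW                    COLUMN+CELL
 row1                  column=vi:make, timestamp=1474174800000, value=bmw
 row1                  column=vi:model, timestamp=1474174800001, value=5 series
 row1                  column=vi:year, timestamp=1474174800002, value=2012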

The next scan we’ll run will limit our results to the make column qualifier.

scan 'cars', {COLUMNS => ['vi:make']}
You should see output similar to:
















If you have a particularly large result set, you can limit the number of rows returned with the LIMIT option. In this example I arbitrarily limit the results to 1 row to demonstrate how LIMIT works.




scan 'cars', {COLUMNS => ['vi:make'], LIMIT => 1}
You should see output similar to:










Get One Row
The get command allows you to get one row of data at a time. You can optionally limit the number of columns returned. We’ll start by getting all columns in row1.


get 'cars', 'row1'
You should see output similar to:










When looking at the output above, you should notice how the results under COLUMN show the fully qualified column family:column qualifier, such as vi:make.

To get one specific column, include the COLUMN option.

get 'cars', 'row1', {COLUMN => 'vi:model'}
You should see output similar to:










You can also get two or more columns by passing an array of columns.

get 'cars', 'row1', {COLUMN => ['vi:model', 'vi:year']}
You should see output similar to:










Delete a Cell (Value)

delete 'cars', 'row2', 'vi:year'
Let’s check that our delete worked.

get 'cars', 'row2'
You should see output that shows 2 columns.












Disable and Delete a Table

disable 'cars'
drop 'cars'
You should see empty table list.











View HBase Command Help

help












Exit the HBase Shell

exit








To stop the HBase server, issue the sh stop-hbase.sh command and wait for it to complete. Killing the process might corrupt your data on disk.

$ sh stop-hbase.sh


Installing Apache Hadoop on Windows 10 using Cygwin64

This article describes how to set up and configure a single-node Hadoop installation on Windows 10 using Cygwin.

Tools and technologies used

  • Java 1.8
  • Hadoop 2.7.1
  • Cygwin64

Configure Java

Download Java from here and extract to c:\java

Add JAVA_HOME to the user variables and %JAVA_HOME%\bin to the Path variable.
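One way to set JAVA_HOME is from a Windows Command Prompt with setx (a sketch assuming the jdk1.8.0_101 build used later in this guide; the Path entry is easiest to add through the Environment Variables dialog):

setx JAVA_HOME "C:\java\jdk1.8.0_101"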



Cygwin installation:

Download and install Cygwin from the official site.

Create folders as below:
C:\cygwin\root // Root folder
C:\cygwin\setup // Local Package folder
Place the Cygwin64 setup file inside the Local Package folder (C:\cygwin\setup).
Run the setup with the "Install from Internet" option without selecting any additional packages.
Configure environment variables for CYGWIN_HOME (C:\cygwin\root) and PATH (%CYGWIN_HOME%\bin).


Re-run the setup with default options and select the packages below during installation:

 a. zlib















b. OpenSSH
















 c. tcp_wrappers
















d. diffutils

















LINKS & FILE PERMISSIONS:


After the Cygwin installation, open a Cygwin terminal as administrator and run the following command to create a symbolic link for Java.

$ ln -s /cygdrive/c/java/jdk1.8.0_101 /usr/local/java
Create passwd and group files inside /etc folder:

C:\cygwin\root\etc\passwd
C:\cygwin\root\etc\group
Set file permissions:

 $ chmod +r /etc/passwd
 $ chmod u+w /etc/passwd
 $ chmod +r /etc/group
 $ chmod u+w /etc/group
 $ chown :Users /var
 $ chmod 757 /var
 $ chmod ug-s /var
 $ chmod +t /var
Edit the /etc/hosts.allow file using your favorite editor and make sure the following two lines are in there before the PARANOID line:

ALL : localhost 127.0.0.1/32 : allow
ALL : [::1]/128 : allow

SSH CONFIGURATION:

Configure SSH using Cygwin64 terminal [Run As Administrator]:

Run the script
$ ssh-host-config
  • If this script asks to overwrite an existing /etc/ssh_config, answer yes.
  • If this script asks to overwrite an existing /etc/sshd_config, answer yes.
  • If this script asks to use privilege separation, answer yes.
  • If this script asks to install sshd as a service, answer yes. Make sure you started your shell as Administrator!
  • If this script asks for the CYGWIN value, just <enter> as the default is ntsec.
  • If this script asks to create the sshd account, answer yes.
  • If this script asks to use a different user name as service account, answer no as the default will suffice.
  • If this script asks to create the cyg_server account, answer yes. Enter a password for the account.
  • Start the SSH service using net start sshd or cygrunsrv --start sshd, as shown below. Notice that cygrunsrv is the utility that makes the process run as a Windows service. Confirm that you see a message stating that the CYGWIN sshd service was started successfully.
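For example, from the same elevated terminal:

$ net start sshd
The CYGWIN sshd service was started successfully.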

Harmonize Windows and Cygwin64
$ mkpasswd -cl > /etc/passwd
$ mkgroup --local > /etc/group

Test SSH using another Cygwin64 terminal [Local User]:
$ ssh cyg_server@localhost
cyg_server@localhost's password:
cyg_server@Naveen ~
$ logout

Configure Hadoop
Download version 2.7.1 from hadoop-2.7.1.tar.gz.

Place hadoop-2.7.1.tar.gz inside the C:\cygwin\root\usr\local folder and extract it using the following commands.

$ cd /usr/local
$ tar -xzf hadoop-2.7.1.tar.gz













The official Hadoop release from Apache does not include Windows binaries, and compiling from source can be tedious, so I have made this compiled distribution available. Download it from here and replace the hadoop-2.7.1/bin content.

Add HADOOP_HOME and HADOOP_BIN_PATH to the user variables.
Add %HADOOP_HOME%\bin and %HADOOP_BIN_PATH% to the Path variable.



It is necessary to modify some configuration files inside /hadoop-2.7.1/etc/hadoop. All such files follow an XML format, and the updates should concern the top-level configuration node. Specifically:

  •  yarn-site.xml:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
  •  core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
  •  mapred-site.xml (create mapred-site.xml from mapred-site.xml.template if it does not exist):
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
  •  hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/data/dfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/data/dfs/datanode</value>
  </property>
</configuration>
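Note that these file: URIs are resolved by the Windows JVM, so file:/data/dfs/namenode ends up as \data\dfs\namenode on the current drive rather than under Cygwin's / root. The format step below creates the NameNode directory, but both directories can also be created up front, e.g. (a sketch assuming everything lives on C: as in this guide):

$ mkdir -p /cygdrive/c/data/dfs/namenode /cygdrive/c/data/dfs/datanode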

Formatting the distributed file system

Open a Cygwin terminal as administrator, navigate to the bin path, and
type ‘hdfs namenode -format’ to format the Hadoop file system.

$ cd /usr/local/hadoop-2.7.1/bin
$ hdfs namenode -format


After execution you will see the logs below:
STARTUP_MSG:   java = 1.8.0_101
************************************************************/
16/09/09 21:39:41 INFO namenode.NameNode: createNameNode [-format]
Formatting using clusterid: CID-c7dc4583-c77b-4497-a0b6-a4506e538fbb
16/09/09 21:39:45 INFO namenode.FSNamesystem: No KeyProvider found.
16/09/09 21:39:45 INFO namenode.FSNamesystem: fsLock is fair:true
16/09/09 21:39:45 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
16/09/09 21:39:45 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
16/09/09 21:39:45 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
16/09/09 21:39:45 INFO blockmanagement.BlockManager: The block deletion will start around 2016 Sep 09 21:39:45
16/09/09 21:39:45 INFO util.GSet: Computing capacity for map BlocksMap
16/09/09 21:39:45 INFO util.GSet: VM type       = 64-bit
16/09/09 21:39:45 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB
16/09/09 21:39:45 INFO util.GSet: capacity      = 2^21 = 2097152 entries
16/09/09 21:39:45 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
16/09/09 21:39:45 INFO blockmanagement.BlockManager: defaultReplication         = 1
16/09/09 21:39:45 INFO blockmanagement.BlockManager: maxReplication             = 512
16/09/09 21:39:45 INFO blockmanagement.BlockManager: minReplication             = 1
16/09/09 21:39:45 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
16/09/09 21:39:45 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks  = false
16/09/09 21:39:45 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
16/09/09 21:39:45 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
16/09/09 21:39:45 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
16/09/09 21:39:45 INFO namenode.FSNamesystem: fsOwner             = Naveen (auth:SIMPLE)
16/09/09 21:39:45 INFO namenode.FSNamesystem: supergroup          = supergroup
16/09/09 21:39:45 INFO namenode.FSNamesystem: isPermissionEnabled = true
16/09/09 21:39:45 INFO namenode.FSNamesystem: HA Enabled: false
16/09/09 21:39:45 INFO namenode.FSNamesystem: Append Enabled: true
16/09/09 21:39:46 INFO util.GSet: Computing capacity for map INodeMap
16/09/09 21:39:46 INFO util.GSet: VM type       = 64-bit
16/09/09 21:39:46 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB
16/09/09 21:39:46 INFO util.GSet: capacity      = 2^20 = 1048576 entries
16/09/09 21:39:46 INFO namenode.FSDirectory: ACLs enabled? false
16/09/09 21:39:46 INFO namenode.FSDirectory: XAttrs enabled? true
16/09/09 21:39:46 INFO namenode.FSDirectory: Maximum size of an xattr: 16384
16/09/09 21:39:46 INFO namenode.NameNode: Caching file names occuring more than 10 times
16/09/09 21:39:46 INFO util.GSet: Computing capacity for map cachedBlocks
16/09/09 21:39:46 INFO util.GSet: VM type       = 64-bit
16/09/09 21:39:46 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB
16/09/09 21:39:46 INFO util.GSet: capacity      = 2^18 = 262144 entries
16/09/09 21:39:46 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
16/09/09 21:39:46 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
16/09/09 21:39:46 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000
16/09/09 21:39:46 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
16/09/09 21:39:46 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
16/09/09 21:39:46 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
16/09/09 21:39:46 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
16/09/09 21:39:46 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
16/09/09 21:39:46 INFO util.GSet: Computing capacity for map NameNodeRetryCache
16/09/09 21:39:46 INFO util.GSet: VM type       = 64-bit
16/09/09 21:39:46 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
16/09/09 21:39:46 INFO util.GSet: capacity      = 2^15 = 32768 entries
16/09/09 21:39:51 INFO namenode.FSImage: Allocated new BlockPoolId: BP-818803997-192.168.56.1-1473428391573
16/09/09 21:39:51 INFO common.Storage: Storage directory \data\dfs\namenode has been successfully formatted.
16/09/09 21:39:52 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
16/09/09 21:39:52 INFO util.ExitUtil: Exiting with status 0
16/09/09 21:39:52 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at Naveen/192.168.56.1



Navigate to the sbin directory.
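For example:

$ cd /usr/local/hadoop-2.7.1/sbin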











Type ‘sh hadoop-daemon.sh start namenode’ to start the name node.
$ sh hadoop-daemon.sh start namenode












NameNode startup logs
STARTUP_MSG:   java = 1.8.0_101
************************************************************/
16/09/09 22:22:05 INFO namenode.NameNode: createNameNode []
16/09/09 22:22:08 INFO impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
16/09/09 22:22:08 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
16/09/09 22:22:08 INFO impl.MetricsSystemImpl: NameNode metrics system started
16/09/09 22:22:08 INFO namenode.NameNode: fs.defaultFS is hdfs://127.0.0.1:9000
.
.
.
dbacd449;nsid=722029399;c=0), blocks: 3, hasStaleStorage: false, processing time: 43 msecs
16/09/09 22:22:34 INFO blockmanagement.BlockManager: Total number of blocks            = 3
16/09/09 22:22:34 INFO blockmanagement.BlockManager: Number of invalid blocks          = 0
16/09/09 22:22:34 INFO blockmanagement.BlockManager: Number of under-replicated blocks = 2
16/09/09 22:22:34 INFO blockmanagement.BlockManager: Number of  over-replicated blocks = 0
16/09/09 22:22:34 INFO blockmanagement.BlockManager: Number of blocks being written    = 0
16/09/09 22:22:34 INFO hdfs.StateChange: STATE* Replication Queue initialization scan for invalid, over- and under-replicated blocks completed in 63 msec
16/09/09 22:22:54 INFO hdfs.StateChange: STATE* Safe mode ON, in safe mode extension.
The reported blocks 3 has reached the threshold 0.9990 of total blocks 3. The number of live datanodes 1 has reached the minimum number 0. In safe mode extension. Safe mode will be turned off automatically in 9 seconds.
16/09/09 22:23:04 INFO hdfs.StateChange: STATE* Leaving safe mode after 45 secs
16/09/09 22:23:04 INFO hdfs.StateChange: STATE* Safe mode is OFF
16/09/09 22:23:04 INFO hdfs.StateChange: STATE* Network topology has 1 racks and 1 datanodes
16/09/09 22:23:04 INFO hdfs.StateChange: STATE* UnderReplicatedBlocks has 3 blocks

Type ‘sh hadoop-daemon.sh start datanode’ to start the data node.
$ sh hadoop-daemon.sh start datanode












DataNode startup logs
STARTUP_MSG:   java = 1.8.0_101
************************************************************/
16/09/09 22:22:13 INFO impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
16/09/09 22:22:13 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
16/09/09 22:22:13 INFO impl.MetricsSystemImpl: DataNode metrics system started
16/09/09 22:22:13 INFO datanode.BlockScanner: Initialized block scanner with targetBytesPerSec 1048576
.
.
.
16/09/09 22:22:34 INFO datanode.DataNode: Namenode Block pool BP-531935505-192.168.56.1-1472983681288 (Datanode Uuid 964b90aa-8843-4eea-9f50-182e5bab213a) service to /127.0.0.1:9000 trying to claim ACTIVE state with txid=556
16/09/09 22:22:34 INFO datanode.DataNode: Acknowledging ACTIVE Namenode Block pool BP-531935505-192.168.56.1-1472983681288 (Datanode Uuid 964b90aa-8843-4eea-9f50-182e5bab213a) service to /127.0.0.1:9000
16/09/09 22:22:34 INFO datanode.DataNode: Successfully sent block report 0x18e88ecf05bac,  containing 1 storage report(s), of which we sent 1. The reports had 3 total blocks and used 1 RPC(s). This took 15 msec to generate and 286 msecs for RPC and NN processing. Got back one command: FinalizeCommand/5.
16/09/09 22:22:34 INFO datanode.DataNode: Got finalize command for block pool BP-531935505-192.168.56.1-1472983681288

Type ‘sh yarn-daemon.sh start resourcemanager’ to start the resource manager.


$ sh yarn-daemon.sh start resourcemanager

Resource Manager startup logs
STARTUP_MSG:   java = 1.8.0_101
************************************************************/
16/09/09 22:22:12 INFO event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher
16/09/09 22:22:12 INFO event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher
.
.
.
16/09/09 22:22:40 INFO security.NMTokenSecretManagerInNM: Rolling master-key for container-tokens, got key with id -46717855
16/09/09 22:22:40 INFO nodemanager.NodeStatusUpdaterImpl: Registered with ResourceManager as Naveen:56407 with total resource of <memory:8192, vCores:8>
16/09/09 22:22:40 INFO nodemanager.NodeStatusUpdaterImpl: 
Notifying ContainerManager to unblock new container-requests

Type ‘sh yarn-daemon.sh start nodemanager’ to start the node manager.
$ sh yarn-daemon.sh start nodemanager







Node Manager startup logs
STARTUP_MSG:   java = 1.8.0_101
************************************************************/
16/09/09 22:22:12 INFO event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher
16/09/09 22:22:12 INFO event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher
16/09/09 22:22:12 INFO event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.LocalizationEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService
.
.
.
16/09/09 22:22:40 INFO security.NMTokenSecretManagerInNM: Rolling master-key for container-tokens, got key with id -46717855
16/09/09 22:22:40 INFO nodemanager.NodeStatusUpdaterImpl: Registered with ResourceManager as Naveen:56407 with total resource of <memory:8192, vCores:8>
16/09/09 22:22:40 INFO nodemanager.NodeStatusUpdaterImpl: Notifying ContainerManager to unblock new container-requests

Type ‘sh mr-jobhistory-daemon.sh start historyserver’ to start the job history server.
$ sh mr-jobhistory-daemon.sh start historyserver










Type jps to verify the daemon processes.
$ jps
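With all five daemons running, the list should include roughly the following entries (process ids will differ):

5216 NameNode
5840 DataNode
6324 ResourceManager
6712 NodeManager
7048 JobHistoryServer
7480 Jps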






Installation Verification 

  • namenode GUI
  • resourcemanager GUI



Resourcemanager GUI address - http://localhost:8088

Namenode GUI address - http://localhost:50070

In the next article (Installing Apache HBase on Windows using Cygwin64) we'll see how to install HBase on Windows using Cygwin and practice some basic HBase commands.