Smoke-Testing Cloudera CDH install

by Alex McLintock and Alan Duval of Alephant.co.uk, Sept 2017

 

The purpose of this page is to provide you with some basic tests to confirm that the Cloudera Distribution of Hadoop (CDH) is installed and functioning to some extent. It is NOT a full benchmark test suite.

As such, we presume that you have completed your own install, possibly by following our previous post, Doing the Install: Cloudera CDH5.12 on CentOS7.3

 

URL: http://www.alephant.co.uk/Installing_CDH_on_Hadoop-3-Install

 

Cloudera Manager Admin Console

To test the success of your install, start off by accessing the Cloudera Manager Admin Console.

To do this, type the following into your web browser:

http://servername:7180

Here, servername is the name of the server on which the Cloudera Manager Admin Console is installed. You will need to log in as the admin user created previously.

If you can't access the console at all then either it is not running, or you may have a network issue. Solving that is left as an exercise for the reader.
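Before opening a browser, it can help to narrow down which of those two cases you are in. The following is a minimal sketch, not part of Cloudera's tooling: it attempts a plain TCP connection to port 7180 and reports whether the name resolves, the port refuses connections (the console is probably not running), or the attempt times out (more likely a firewall or routing issue). The function name check_port and the placeholder host "servername" are our own.

```python
import socket


def check_port(host, port=7180, timeout=5.0):
    """Try a TCP connection and return a short diagnosis string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.gaierror:
        # The host name could not be resolved (DNS or /etc/hosts problem).
        return "name does not resolve"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "connection refused"
    except socket.timeout:
        # No answer at all: often a firewall or routing problem.
        return "timed out"
    except OSError as exc:
        return f"network error: {exc}"


if __name__ == "__main__":
    # "servername" is a placeholder; substitute your Cloudera Manager host.
    print(check_port("servername"))
```

A result of "open" only shows that something is listening on the port; you still need to load the page to confirm it is the Admin Console.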

 

Checking for "Good Health"

When the console pops up, you'll see a list of the services that are running and hopefully, to the left of these, a green circle with a tick next to each service, indicating that it is running as expected.

If other indicators appear instead, you may need to investigate and resolve the issues. These may be a solid orange circle, or a red circle with an exclamation mark inside.

To the right of the service name you may see orange or red circles, both with exclamation marks inside, and to the right of these there may be a spanner icon.

Note - sometimes the spanner icon will appear with no exclamation mark between it and the service name (see Hive in the image below).

Cloudera Manager Status Screen

The spanner icon indicates that there is a configuration issue to be resolved.

For the most part, if there is an error or configuration issue icon to the right, this will cause the icon to the left to reflect this. There are, however, exceptions:

  • In the case of Hive in the image above, there are two configuration issues, but they are minor, so Hive has the green tick of Good Health.
  • In the case of ZooKeeper in the image above, there are no errors or configuration issues, but the health of the service is of concern. This may mean that the service's warning threshold is set very conservatively, whilst the service itself is within normal parameters, or it may be the first in a cascade of issues - so it's best to find out which.
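
If you would rather script this check than read icons, the same information should be available from the Cloudera Manager REST API. The sketch below is an illustration under assumptions: it expects the list of services already fetched and parsed from the API's service listing (we believe this lives under /api/<version>/clusters/<cluster name>/services), where each entry carries a healthSummary field with values such as GOOD, CONCERNING and BAD. Our reading is that these correspond to the green tick, the solid orange circle and the red circle respectively. The function name and sample data are hypothetical.

```python
def services_needing_attention(services):
    """Return (name, healthSummary) pairs for services not reporting GOOD.

    `services` is a list of dicts, assumed to be the "items" array from the
    Cloudera Manager service listing.
    """
    flagged = []
    for service in services:
        summary = service.get("healthSummary", "NOT_AVAILABLE")
        if summary != "GOOD":
            flagged.append((service.get("name", "?"), summary))
    return flagged


if __name__ == "__main__":
    # Hypothetical sample mirroring the situation described above.
    sample = [
        {"name": "hive", "healthSummary": "GOOD"},
        {"name": "zookeeper", "healthSummary": "CONCERNING"},
        {"name": "hue", "healthSummary": "BAD"},
    ]
    for name, summary in services_needing_attention(sample):
        print(name, summary)
```

Note that configuration issues (the spanner icon) are separate from health, so a service such as Hive can report GOOD here while still having configuration warnings in the console.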

 

Clicking on the exclamation mark to the right of the service (with a number denoting the total number of issues) will bring up a 'Health Issues' dialog (see image below), with links to logs, a link to the relevant Cloudera Manager page for the faulty service, etc.

Cloudera Manager Status Health Issues

 

Clicking on the exclamation mark to the left of the service will take you straight to the Cloudera Manager page for the faulty service (below).

Hue Health Tests Screen

 

Likewise, clicking on the solid orange circle to the left of the service name will take you straight to the Cloudera Manager page for the service of concern (below).

Zookeeper Health Test Summary

 

Checking Heartbeats

In the Cloudera Manager Admin Console, select the Hosts tab, then select All Hosts.

The sixth column from the left is headed Last Heartbeat, and the value displayed should be less than 15 seconds.

Note - this test doesn't re-poll while you watch, so you will need to reload the page, or re-select Hosts - All Hosts if you have this page open and want to re-check your cluster's status after taking some action.
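
Because the page has to be reloaded by hand, a scripted version of the same 15-second check can be convenient. This is a rough sketch under assumptions: it takes host records as we understand the Cloudera Manager REST API returns them from its hosts listing, each with a hostname and a lastHeartbeat timestamp in UTC of the form 2017-09-01T12:00:00.000Z. The timestamp format, the function name and the sample values are our assumptions, so adjust them to what your server actually returns.

```python
from datetime import datetime, timezone

MAX_AGE_SECONDS = 15  # threshold taken from the Last Heartbeat check above


def stale_hosts(hosts, now=None, max_age=MAX_AGE_SECONDS):
    """Return (hostname, age_in_seconds) for hosts whose heartbeat is too old."""
    if now is None:
        now = datetime.now(timezone.utc)
    stale = []
    for host in hosts:
        # Assumed timestamp format: ISO 8601 with milliseconds and a Z suffix.
        beat = datetime.strptime(host["lastHeartbeat"], "%Y-%m-%dT%H:%M:%S.%fZ")
        beat = beat.replace(tzinfo=timezone.utc)
        age = (now - beat).total_seconds()
        if age >= max_age:
            stale.append((host["hostname"], age))
    return stale


if __name__ == "__main__":
    # Hypothetical host records for illustration only.
    sample = [
        {"hostname": "node1", "lastHeartbeat": "2017-09-01T12:00:15.000Z"},
        {"hostname": "node2", "lastHeartbeat": "2017-09-01T11:59:50.000Z"},
    ]
    fixed_now = datetime(2017, 9, 1, 12, 0, 20, tzinfo=timezone.utc)
    print(stale_hosts(sample, now=fixed_now))
```

Passing now explicitly keeps the function easy to test; in normal use you would omit it and let the current time be used.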

 

Running a MapReduce Job

Log in to a machine in the cluster. We went back to Hosts > All Hosts, identified the machines with the most roles (machines 1 and 2), and ran the following command on both.

sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 100
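
The pi example finishes by printing a line beginning "Estimated value of Pi is". If you capture the command's output, a small check like the one below can turn it into a pass/fail result. This is a sketch only: the function name, the tolerance and the sample output text are our own choices, and with only 10 maps of 100 samples each the estimate is coarse, so the tolerance is deliberately loose.

```python
import math
import re

# Matches the final line printed by the pi example job.
PI_LINE = re.compile(r"Estimated value of Pi is\s+(\d+\.\d+)")


def pi_job_looks_sane(job_output, tolerance=0.1):
    """Return True if the output contains a Pi estimate close to math.pi."""
    match = PI_LINE.search(job_output)
    if match is None:
        # No estimate printed: the job most likely failed before finishing.
        return False
    estimate = float(match.group(1))
    return abs(estimate - math.pi) <= tolerance


if __name__ == "__main__":
    # Illustrative output only; your timings and digits will differ.
    sample_output = (
        "Job Finished in 25.3 seconds\n"
        "Estimated value of Pi is 3.14800000000000000000\n"
    )
    print(pi_job_looks_sane(sample_output))
```

A failed check here tells you only that the job did not complete normally; the YARN application page remains the place to find out why.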

 

We viewed the job by selecting Clusters > YARN Applications - the relevant output looked like this:

 

Note - Cloudera's instructions suggest selecting Clusters > ClusterName > YARN Applications. This is a little confusing, as you don't select 'ClusterName' (which is to say, the imaginatively titled 'Cluster 1' in the image below). The YARN Applications link is listed under Cluster 1 as an available selection (in the right-hand column of the image below). If you do click on Cluster 1, you go to the cluster summary, and there is no YARN Applications link there.

Clusters - YARN Applications
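
The same job status can also be read without the console, from the YARN ResourceManager's REST interface. The sketch below assumes you have already fetched and parsed the JSON from the ResourceManager's applications listing (we believe this is /ws/v1/cluster/apps, usually on port 8088), which nests application records under "apps" then "app". It also assumes the pi example registers under the job name QuasiMonteCarlo. Both are assumptions to verify on your cluster; the function name and sample data are hypothetical.

```python
def latest_app_status(apps_response, name="QuasiMonteCarlo"):
    """Return (state, finalStatus) of the newest application with this name.

    Returns None if no application with that name is listed.
    """
    # The "apps" value can be null when no applications have run yet.
    apps = (apps_response.get("apps") or {}).get("app") or []
    matching = [app for app in apps if app.get("name") == name]
    if not matching:
        return None
    newest = max(matching, key=lambda app: app.get("startedTime", 0))
    return newest.get("state"), newest.get("finalStatus")


if __name__ == "__main__":
    # Hypothetical response for illustration only.
    sample = {
        "apps": {
            "app": [
                {"name": "QuasiMonteCarlo", "startedTime": 100,
                 "state": "FINISHED", "finalStatus": "FAILED"},
                {"name": "QuasiMonteCarlo", "startedTime": 200,
                 "state": "FINISHED", "finalStatus": "SUCCEEDED"},
            ]
        }
    }
    print(latest_app_status(sample))
```

For a smoke test, a state of FINISHED with a final status of SUCCEEDED on the most recent run is the result to look for.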

 

 

Cloudera Manager Health Tests

The basic smoke tests are a bit like holding up a mirror to someone's mouth to see if they are still breathing. If you want to investigate each individual service in more detail, you can run the Cloudera Manager Health Tests for each service. The documentation for those is on the Cloudera website.

 

URL: https://www.cloudera.com/documentation/enterprise/latest/topics/cm_ht.html

 

Here are the links for the individual health tests listed in the above URL:
