Cloudera Professional Services
Services Engagement Prerequisites


Revision History
Version | Author              | Description                                                              | Date
--------|---------------------|--------------------------------------------------------------------------|-----------
0.1     | Ben Spivey          | Initial version                                                          | 2013-09-06
0.2     | Jean-Marc Spaggiari | Update network section; overall document review                         | 2013-09-10
0.3     | Prateek Rungta      | Minor corrections and updates                                            | 2014-03-05
0.4     | Kaufman Ng          | Expanded repo and JDK sections                                           | 2014-05-12
0.5     | Dave Beech          | Minor corrections                                                        | 2014-05-13
0.6     | Kaufman Ng          | Expanded JDK section for multiple versions of CM/CDH                     | 2014-05-13
0.7     | Ian Buss            | Minor corrections and additions for SLES and Ubuntu                      | 2014-11-14
1.0     | James Kinley        | Updated to 2015 template; updated for C5.3                               | 2015-01-06
1.1     | Ian Buss            | Reverted to Calibri and added page numbers                               | 2015-01-09
1.2     | Jan Kunigk          | Add read example and direct flag for disk I/O test                       | 2015-02-19
1.3     | Ryan Fishel         | Moved references to bottom of document                                   | 2015-02-27
1.4     | James Kinley        | Add database and checklist sections, plus minor corrections and updates  | 2015-03-30
1.5     | Prateek Rungta      | Updated database sections, and other minor tweaks                        | 2015-04-21


Contents

Introduction
Logistics & Administrative
Cloudera Reference Architectures
Cluster Hardware
Disks
Operating System
Operating System Version
Swappiness
Date Synchronization & NTP
Firewalls
Kernel Security Modules
Secure Shell
Disable non-required services
User Limits (ulimits)
Transparent Huge Pages
Java
Network
Network Architecture Diagrams
DNS
/etc/hosts
IPv6
Static IP
Hostnames
Network Interface Cards (NICs)
Databases
Repository Access
Cloudera Manager
CDH
Security
Prerequisites Checklist
References

Introduction

This document describes the important prerequisites that should be completed prior to a Cloudera Professional Services engagement.

Logistics & Administrative

Prior to Cloudera personnel arriving onsite, the customer must ensure that any required onboarding is complete or planned to be completed at the start of the engagement. This includes, but is not limited to:
  • Site access (physical security and badging);
  • Accounts and network access;
  • IT equipment; and
  • Anything else required for Cloudera personnel to perform the work outlined in the Statement of Work.

Cloudera Reference Architectures


  • Customer should inform Cloudera if one of the certified reference architectures will be followed. For example:
    • HP Reference Architecture for Cloudera Enterprise 5 [1];
    • Dell Cloudera Solution Reference Architecture [2];
    • Cisco UCS CPAv2 for Big Data with Cloudera [3].

Cluster Hardware

Disks

  • The disks that will be used by CDH on the worker hosts (i.e. used by the DataNode and NodeManager) should be configured as JBOD;
    • RAID should not be used for these disks;
    • RAID can be used for the Operating System (OS) disks if required;
  • The disks that will be used by CDH on the master hosts (i.e. used by the NameNode, Zookeeper, and JournalNode) can be configured as JBOD or with RAID;
    • Note: these master processes are sensitive to I/O latency and therefore it is recommended that their data directories are located on separate disks to avoid contention;

$ df -h
$ lsblk

  • Ensure that disks are formatted with EXT3, EXT4, or XFS and mounted with the noatime option. For example:

$ cat /etc/fstab

/dev/sda1 /data/1 xfs defaults,noatime 1 2
/dev/sdb1 /data/2 xfs defaults,noatime 1 2
/dev/sdc1 /data/3 xfs defaults,noatime 1 2
/dev/sdd1 /data/4 xfs defaults,noatime 1 2
/dev/sde1 /data/5 xfs defaults,noatime 1 2
/dev/sdf1 /data/6 xfs defaults,noatime 1 2
...
/dev/sdx1 /data/x xfs defaults,noatime 1 2

$ fdisk -l

  • Note: the use of noatime implies nodiratime;
  • Note: the use of EXT3 on RHEL 6.x is not recommended as performance issues have been reported;
  • Ensure that the Logical Volume Manager (LVM) is not configured:

$ df -h
$ lsblk
$ lvdisplay

Additionally, look for /dev/mapper or /dev/XX (where XX is not sd).

  • Ensure that the BIOS is configured correctly. For example, if you have SATA drives make sure that IDE emulation is not enabled;
  • Verify that the controller firmware is up to date and check for potential disk errors:

$ dmesg | egrep -i 'sense error'
$ dmesg | egrep -i 'ata bus error'

  • Ensure that there is sufficient space under the root, /opt, /usr, and /var filesystems [4];
  • Test disk I/O speed;
    • You should expect to get more than 70MB/sec on regular 7200 RPM disks. Anything below that could be an indication of a problem;
    • The following commands are an example of how to test read and write speeds (where /data/01 is the mount point of /dev/sda1 and /data/01/ddtest is a temporary test file):

$ hdparm -t /dev/sda1
$ dd bs=1M count=1024 if=/dev/zero of=/data/01/ddtest oflag=direct conv=fdatasync
$ dd bs=1M count=1024 if=/data/01/ddtest of=/dev/null iflag=direct
$ rm /data/01/ddtest

  • Ensure that the disks have no bad sectors:

$ badblocks -v /dev/sda1
$ badblocks -v /dev/sdb1
...
$ badblocks -v /dev/sdx1


Operating System

Operating System Version


  • Ensure that a supported Operating System is in use [5]:
$ cat /proc/version (to check the kernel version)

$ cat /etc/redhat-release (RHEL)
$ cat /etc/SuSE-release (SLES)
$ cat /etc/issue (Ubuntu)

Swappiness


  • Set vm.swappiness=0 for kernel versions earlier than 2.6.32-303;
  • For later kernel versions set it to 1:

echo "vm.swappiness = 1" >> /etc/sysctl.conf
echo 1 > /proc/sys/vm/swappiness
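
  • To verify the setting after the change (the value should report 1, or 0 on older kernels):

$ sysctl vm.swappiness
$ cat /proc/sys/vm/swappiness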

Date Synchronization & NTP


  • All cluster hosts should be configured to use Network Time Protocol (NTP) to synchronize the clocks;
  • The following is an example of how to check for clock skew:

$ date
$ date -s "<known good date output>" (to set date)
$ grep server /etc/ntp.conf (to view ntp servers)
$ service ntpd start (to start ntp on RHEL)
$ service ntp start (to start ntp on SLES and Ubuntu)
$ ntpq -p
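
  • To spot clock skew across the whole cluster in one pass, a simple loop over the hosts can be used (a sketch only; hosts.txt is a hypothetical file listing the cluster hostnames, and passwordless SSH access is assumed):

$ for h in $(cat hosts.txt); do echo -n "$h: "; ssh $h date; done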


Firewalls


  • Cloudera recommends disabling all firewall software on and between the cluster hosts;
  • If customer policy requires that firewalls must be enabled then please alert Cloudera prior to the start of the engagement:

(RHEL/CentOS 7)
systemctl disable firewalld
systemctl stop firewalld
systemctl status firewalld

(RHEL/CentOS 6)
/sbin/chkconfig --list iptables
/sbin/chkconfig --list ip6tables

/sbin/chkconfig iptables off
/sbin/chkconfig ip6tables off

(SLES)
/sbin/chkconfig --list SuSEfirewall2_init
/sbin/chkconfig --list SuSEfirewall2_setup

/sbin/chkconfig SuSEfirewall2_init off
/sbin/chkconfig SuSEfirewall2_setup off

(Ubuntu)
sudo ufw status
sudo ufw disable

Kernel Security Modules


  • For RHEL and CentOS, Cloudera requires that SELinux be disabled;
  • If customer policy requires that SELinux be enabled then please alert Cloudera prior to the start of the engagement;
  • The following command tests whether SELinux is enabled; it prints "disabled" if SELinux is disabled, and prints nothing (exiting with status 0) if it is enabled:

selinuxenabled || echo "disabled"

  • To disable SELinux, edit /etc/sysconfig/selinux and make sure that the following setting is present: SELINUX=disabled;
  • Note: a reboot is required in order for the SELinux changes to take effect;

  • SLES and Ubuntu use AppArmor as their kernel security mechanism. Ensure that the configured profiles are non-restrictive and will not interfere with CDH. For example, the default configuration on an Ubuntu server is:

$ sudo service apparmor status
apparmor module is loaded.
5 profiles are loaded.
5 profiles are in enforce mode.
/sbin/dhclient
/usr/lib/NetworkManager/nm-dhcp-client.action
/usr/lib/connman/scripts/dhclient-script
/usr/sbin/ntpd
/usr/sbin/tcpdump
0 profiles are in complain mode.
2 processes have profiles defined.
2 processes are in enforce mode.
/sbin/dhclient (755)
/usr/sbin/ntpd (1489)
0 processes are in complain mode.
0 processes are unconfined but have a profile defined.

Secure Shell


  • Ensure that the secure shell daemon (sshd) is running:

(RHEL/CentOS/SLES)
$ service sshd status
$ service sshd start (to start)

(Ubuntu)
$ sudo service ssh status
$ sudo service ssh start (to start)

Disable non-required services


  • Cloudera recommends disabling any non-required services. For example:

(RHEL/CentOS/SLES)
$ service cups stop && chkconfig cups off
$ service postfix stop && chkconfig postfix off

(Ubuntu)
sudo service cups stop && sudo update-rc.d cups remove
sudo service postfix stop && sudo update-rc.d postfix remove
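
  • To review which services are currently enabled before deciding what to disable (illustrative commands; the exact service list varies by installation and should be reviewed with the system administrator):

(RHEL/CentOS/SLES)
$ chkconfig --list | grep ':on'

(Ubuntu)
$ sudo service --status-all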

User Limits (ulimits)


  • Ensure that the nofile and nproc ulimits for the mapred, hdfs, and hbase users are increased to at least 32k;
  • Note: Cloudera Manager sets the ulimits automatically, so you do not need to do the following when using Cloudera Manager to provision your cluster:

$ echo hdfs - nofile 32768 >> /etc/security/limits.conf
$ echo mapred - nofile 32768 >> /etc/security/limits.conf
$ echo hbase - nofile 32768 >> /etc/security/limits.conf

$ echo hdfs - nproc 32768 >> /etc/security/limits.conf
$ echo mapred - nproc 32768 >> /etc/security/limits.conf
$ echo hbase - nproc 32768 >> /etc/security/limits.conf
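
  • The limits actually in effect can be checked for the current shell, or for a running process (for example a DataNode, once services are started) by inspecting its limits file in /proc; <pid> below is a placeholder:

$ ulimit -n
$ ulimit -u
$ cat /proc/<pid>/limits | grep -E 'open files|processes'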

Transparent Huge Pages


(RHEL/CentOS 6 only)
  • Disable Redhat Transparent Huge Pages (THP) by running the following command and adding it to /etc/rc.local:

echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
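
  • To confirm that THP defragmentation has been disabled, inspect the same file; the currently active value is shown in square brackets and should read never:

$ cat /sys/kernel/mm/redhat_transparent_hugepage/defrag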

Java


  • Cloudera Manager and CDH are supported with Oracle JDK;
  • A supported version of Oracle JDK will be installed as part of the Cloudera Manager installation;
  • If Oracle JDK will be installed prior to Cloudera Manager then ensure it is a supported version [12]:

java -version
javac -version
update-java-alternatives --list (Ubuntu)
alternatives --display java (RHEL/CentOS)

  • Note: Java 1.6 update 18 is a known bad version.

Network

Network Architecture Diagrams


  • Please provide a network architecture diagram prior to the engagement;
  • If known, the diagram should include details of the network hardware specifications (e.g. switch model numbers), speed and cardinality of links (including uplinks), and link bonding details (if used).

DNS


  • Ensure that DNS servers have been configured:

$ cat /etc/resolv.conf
$ cat /etc/nsswitch.conf (should have a line like "hosts: files dns")
$ cat /etc/host.conf (should have a line like "order hosts, bind")

  • Note that if /etc/resolv.conf is pointing to an external address (e.g. 8.8.8.8) then local hostnames will have to be resolved using /etc/hosts;
  • Ensure that both forward and reverse DNS are functional. The following should be performed for each DNS server listed in /etc/resolv.conf:

$ dig @<dnsserver> <hostname>
$ host <ip address>
$ dig -x <ip address>
$ nslookup <host name>
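
  • As a concrete illustration (using the example hostname and address from the /etc/hosts section below; substitute real values), the forward and reverse lookups should agree:

$ dig +short hadoop1.example.com
10.10.10.1
$ dig +short -x 10.10.10.1
hadoop1.example.com.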

  • Ensure that 127.0.0.1 is set to localhost, not the hostname:

cat /etc/hosts
ping localhost

  • For larger clusters (more than 50 hosts) Cloudera recommends enabling DNS caching:

(RHEL/CentOS/SLES)
$ service nscd start && chkconfig nscd on

(Ubuntu)
$ sudo service nscd start && sudo update-rc.d nscd defaults

/etc/hosts


  • If the hosts file will be used instead of DNS (not recommended but sometimes necessary) ensure that /etc/hosts includes an entry for 127.0.0.1 (localhost) and the server's IP address, with the fully qualified domain name (FQDN) listed first, for example:

<ip address> <fqdn> <alias>
127.0.0.1 localhost
10.10.10.1 hadoop1.example.com hadoop1

IPv6


  • Ensure that IPv6 is disabled:

$ lsmod | grep ipv6

(to disable add the following to /etc/sysctl.conf)

#disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

(RHEL/CentOS)
  • Add the following to /etc/sysconfig/network:

NETWORKING_IPV6=no
IPV6INIT=no
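
  • A sketch of applying and verifying the change without a reboot (the sysctl entries above must already be in /etc/sysctl.conf; a value of 1 means IPv6 is disabled on that interface):

$ sysctl -p
$ cat /proc/sys/net/ipv6/conf/all/disable_ipv6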

Static IP


  • Ensure that the cluster hosts' IP addresses are static (or DHCP but statically assigned):

$ cat /etc/sysconfig/network-scripts/ifcfg-eth* (RHEL)
$ cat /etc/sysconfig/network/ifcfg-eth* (SLES)
$ cat /etc/network/interfaces (Ubuntu)

  • The above files should contain the following (an illustrative RHEL example follows this list):
    • BOOTPROTO=static (RHEL/SLES);
    • iface <iface> inet static (Ubuntu);
    • IPADDR or address should match the output of ifconfig;
  • Note: statically assigned DHCP addresses will have to be verified by a system administrator.
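
  • For example, a minimal RHEL/CentOS ifcfg-eth0 with a static address might look like the following (illustrative values only):

$ cat /etc/sysconfig/network-scripts/ifcfg-eth0

DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.10.10.1
NETMASK=255.255.255.0
GATEWAY=10.10.10.254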

Hostnames


  • Ensure the hostname is set to the fully qualified domain name (FQDN):

$ grep HOSTNAME /etc/sysconfig/network (RHEL)
$ cat /etc/HOSTNAME (SLES)
$ hostname --fqdn (Ubuntu)

(SLES)
  • The FQDN can be persisted across reboots by adding the following to /etc/init.d/boot.local:

/bin/hostname --file /etc/HOSTNAME


Network Interface Cards (NICs)


  • Verify that the NICs are configured for expected speed and duplex settings, and are effectively working at the configured speed:

$ ifconfig
$ ethtool <interface> | grep Speed
$ ethtool -S <interface> | grep collision
$ ethtool -S <interface> | grep drop
$ netperf

  • Ensure that the latest network drivers/firmware from the NIC vendor are installed;
  • Note: if interface bonding is in use, then use iperf to verify the expected speeds (see the sketch below).
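
  • A minimal iperf sketch for checking throughput between two cluster hosts (assuming the iperf package is installed on both; hostnames are illustrative):

(on hadoop1.example.com)
$ iperf -s

(on a second host)
$ iperf -c hadoop1.example.com -t 30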

Databases

  • Ensure that a supported database is in use [14];
  • An embedded PostgreSQL database may be used to meet the requirements of the services, but it is not recommended;
  • For production use, a supported external database should be used (an illustrative database-creation sketch follows the table below);
  • The following services need a database:

Database                  | Description                                                                                                                                | Est. Size
--------------------------|--------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------
Cloudera Manager          | Stores all the information about the configured services, role assignments, configuration history, commands, users, and running processes   | Small (<100MB)
Activity Monitor          | Contains information about past MapReduce activities. This database is only required if a MapReduce (MRv1) service is deployed              | Can grow large in large clusters
Reports Manager           | Tracks disk utilization by user, group, and directory. Tracks processing activities by user, pool, and HBase table                          | Medium
Hive Metastore Server     | Contains Hive and Impala metadata                                                                                                            | Small
Sentry Server             | Contains authorization metadata                                                                                                              | Small
Navigator Audit Server    | Contains auditing information                                                                                                                | Can grow large in large clusters
Navigator Metadata Server | Contains authorization, policies, and audit report metadata                                                                                  | Small
Hue                       | Contains user information, saved SQL scripts, and saved workflows. By default Hue uses an embedded SQLite database; a supported external database is recommended for production | Small
Oozie                     | Contains workflow metadata. By default Oozie uses an embedded Derby database; a supported external database is recommended for production   | Small
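
  • The exact database names and accounts are agreed during installation, but the general pattern for an external database is to pre-create one database and one dedicated user per service. A minimal sketch, assuming MySQL is the chosen RDBMS and using illustrative names and passwords only:

$ mysql -u root -p
mysql> CREATE DATABASE scm DEFAULT CHARACTER SET utf8;
mysql> GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY 'scm_password';
mysql> CREATE DATABASE metastore DEFAULT CHARACTER SET utf8;
mysql> GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY 'hive_password';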

Repository Access

Cloudera Manager


  • If the cluster will have access to the Internet then the Cloudera Manager repo file can be downloaded and copied to your package manager's repositories directory. For example:

(RHEL/CentOS)
$ sudo curl http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/cloudera-manager.repo -o /etc/yum.repos.d/cloudera-manager.repo

(SLES)
$ sudo zypper addrepo -f http://archive.cloudera.com/cm5/sles/11/x86_64/cm/cloudera-manager.repo
$ sudo zypper refresh

(Ubuntu)
$ sudo curl http://archive.cloudera.com/cm5/ubuntu/trusty/amd64/cm/cloudera.list -o /etc/apt/sources.list.d/cloudera.list
$ sudo apt-get update

  • For more information on establishing your Cloudera Manager repository strategy, please refer to the Cloudera installation documentation online [6];
  • If the cluster will not have access to the Internet then the Cloudera Manager package repository should be mirrored internally by following the Cloudera installation documentation online [7].
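
  • A minimal sketch of mirroring the Cloudera Manager yum repository on an internal web server (assumptions: a RHEL/CentOS staging host with Internet access, the yum-utils and createrepo packages installed, the cloudera-manager.repo file from above already in place, the repository id matching the id defined in that .repo file, and /var/www/html served by a local web server):

$ sudo reposync -r cloudera-manager -p /var/www/html/cloudera-repos
$ sudo createrepo /var/www/html/cloudera-repos/cloudera-manager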

CDH


  • The customer must decide how they would like to manage the CDH software installation, specifically whether to use packages or parcels as the CDH distribution format. Cloudera recommends using parcels. For more information please refer to the documentation online [9], [10];
  • If the cluster will have access to the Internet and Cloudera Manager will be used to install CDH then no additional steps are required as Cloudera Manager will manage the CDH parcels or packages;
  • If the cluster will not have access to the Internet then the CDH parcel or package repository should be mirrored internally by following the Cloudera installation documentation online [7], [8], [11].

Security


  • If the customer is planning to enable Hadoop Security (Kerberos authentication) then please refer to the Cloudera Hadoop Security Prerequisites (Authentication) document [13].



Prerequisites Checklist



Task                                                                                            | Completed Date | Completed By
------------------------------------------------------------------------------------------------|----------------|-------------
Logistics and Administration                                                                    |                |
Engagement logistics and administrative tasks completed                                         |                |
Cluster Architecture                                                                            |                |
Feedback provided on Cloudera reference architectures and selected architecture approach        |                |
Cluster Hardware                                                                                |                |
Worker host Hadoop disks have been mounted correctly, i.e. JBOD with 'noatime'                  |                |
Disks have been formatted with EXT3, EXT4, or XFS                                               |                |
Disks have been checked for errors and expected performance has been verified                   |                |
Operating System                                                                                |                |
A supported Operating System (OS) is in use                                                     |                |
The root, /opt, /usr, and /var filesystems have the required minimum free space                 |                |
vm.swappiness has been configured correctly                                                     |                |
NTP has been configured and is running on all cluster hosts                                     |                |
Firewall software (e.g. iptables) has been disabled on and between all cluster hosts            |                |
Kernel security modules (e.g. SELinux) have been disabled on all cluster hosts                  |                |
The secure shell daemon (sshd) is running on all cluster hosts                                  |                |
All non-required OS services have been disabled                                                 |                |
The required nofile and nproc ulimits have been configured                                      |                |
Transparent Huge Pages (THP) has been disabled on all cluster hosts (RHEL/CentOS 6.x only)      |                |
A supported version of Oracle JDK has been installed and configured                             |                |
Cluster Network                                                                                 |                |
Network architecture diagrams have been provided to Cloudera                                    |                |
DNS has been configured on all cluster hosts                                                    |                |
If /etc/hosts is being used (not recommended) then it contains entries mapping 127.0.0.1 to localhost, and the server's IP address to its FQDN |                |
IPv6 has been disabled on all cluster hosts                                                     |                |
All cluster hosts have a static IP address                                                      |                |
Hostnames have been set to the fully qualified domain name (FQDN) on all cluster hosts          |                |
Network Interface Cards (NICs) have been configured and expected performance has been verified  |                |
Latest NIC drivers/firmware have been installed                                                 |                |
Databases                                                                                       |                |
A supported RDBMS has been installed on a utility server for use by CM and CDH services         |                |
Package and Parcel Repositories                                                                 |                |
The cluster has access to the required CM and CDH repositories                                  |                |
Cluster Security                                                                                |                |
The Cloudera Hadoop Security Prerequisites documentation has been reviewed                      |                |


References


[1] HP Reference Architecture for Cloudera Enterprise 5, v4.0 (December 18, 2014), [online]
[2] Dell Cloudera Solution Reference Architecture, v5.1 (July 14, 2014), [online]
[3] Cisco UCS CPAv2 for Big Data with Cloudera, 2.0 (May 13, 2014), [online]
[4] Cloudera Manager 5 Resource Requirements, Cloudera 5.3.x, [online]
[5] CDH 5 Supported Operating Systems, Cloudera 5.3.x, [online]
[7] Creating and Using a Package Repository, Cloudera 5.3.x, [online]
[8] Creating and Using a Parcel Repository, Cloudera 5.3.x, [online]
[9] Managing Software Installation, Cloudera 5.3.x, [online]
[10] Parcels Distribution Format, Cloudera 5.3.x, [online]
[11] Creating a Local Yum Repository, Cloudera 5.3.x, [online]
[12] Supported JDK Versions, Cloudera 5.3.x, [online]
[13] Cloudera Hadoop Security Prerequisites (Authentication), v1.1 (January 06, 2015)
[14] CDH 5 Supported Databases, Cloudera 5.3.x, [online]

