Friday, March 25, 2016

Cassandra Cluster configuration

Prerequisites

Install Cassandra on each node
Get the IP address of each node
Decide which nodes will be the seed nodes. (Seed nodes are used by the other nodes to find each other and learn the topology of the ring. They bootstrap the gossip process for new nodes joining the cluster.)

In this post, I'm going to configure a 2-node cluster, but you can use the same method for any number of nodes.

I have installed Cassandra on 192.168.56.101 and 192.168.56.103. I'm going to use one seed node (generally it is better to have more than one), and it will be 192.168.56.103.

If you have a firewall running, you have to open certain ports to allow communication between nodes: port 7000 (inter-node communication) and port 9160 (Thrift clients). If you are also going to use OpsCenter (a tool for monitoring the cluster; the package is available if you download the Cassandra tar from the DataStax download page, as I did in my previous blog), you also need to open ports 7199, 8888, 61620, and 61621.

iptables -A INPUT -p tcp --dport 7000 -j ACCEPT

In the same way you have to open other ports as well.
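
A loop saves repeating the command for every port. Here is a minimal sketch, assuming a POSIX shell and the port list from above. print_rules is a helper name I made up for illustration; it only prints the commands so you can review them before applying anything.

```shell
# Ports mentioned above: 7000 and 9160 for Cassandra,
# 7199, 8888, 61620 and 61621 for OpsCenter.
PORTS="7000 9160 7199 8888 61620 61621"

# print_rules (hypothetical helper): prints one iptables command per port.
print_rules() {
  for port in $PORTS; do
    echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
  done
}

print_rules
```

Once the output looks right, it can be piped to a root shell, for example print_rules | sudo sh.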

If Cassandra is running, stop it first.

Clear the data folder by running the command below.
   
$ sudo rm -rf /var/lib/cassandra/*

Then set the properties in the cassandra.yaml file (available in the <installation dir>/conf folder).

Properties to set
num_tokens: recommended value: 256
-seeds: internal IP address of each seed node
listen_address: IP address of the node you are configuring
If not set, Cassandra asks the system for the local address, the one associated with its hostname. In some cases Cassandra doesn't produce the correct address and you must specify the listen_address.
rpc_address: 0.0.0.0 (listen for client connections on all interfaces)
endpoint_snitch: name of the snitch (RackInferringSnitch in the sample configuration)
auto_bootstrap: false (Add this setting only when initializing a fresh cluster with no data.)
These properties should be set in the cassandra.yaml file on every node. Apart from listen_address, all the configuration properties are the same on all nodes.

Sample configuration

As I have only 2 nodes, I have only one seed node, but it is better to have more than one.

On node 192.168.56.101
cluster_name: 'DemoCluster'
num_tokens: 256
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
         - seeds: "192.168.56.103"
listen_address: 192.168.56.101
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch

On node 192.168.56.103
cluster_name: 'DemoCluster'
num_tokens: 256
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
         - seeds: "192.168.56.103"
listen_address: 192.168.56.103
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch
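
Since only listen_address should differ between nodes, it helps to print the relevant lines on each node and compare them before starting anything. This is a rough sketch, assuming a POSIX shell with grep; show_cluster_settings is a name I made up for illustration, and the path in the example assumes the tarball layout used in this post.

```shell
# show_cluster_settings (hypothetical helper): prints the cluster-related
# lines of the cassandra.yaml file given as the first argument.
# Commented-out lines (starting with #) are not matched.
show_cluster_settings() {
  grep -E '^[[:space:]]*(-[[:space:]]*seeds|cluster_name|num_tokens|listen_address|rpc_address|endpoint_snitch):' "$1"
}

# Example, assuming you run it from the installation directory:
# show_cluster_settings conf/cassandra.yaml
```

Run it on every node; everything except the listen_address line should be identical.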

After this, start the seed nodes first and then the other nodes. In my case, I ran the command below on the seed node first and then on the other node.

cduser@slave:/home/slave/dsc-cassandra-2.0.7$ bin/cassandra -f

You now have a running multi-node Cassandra cluster. You can check the status of the nodes with the nodetool utility, for example by running bin/nodetool status from the installation directory.
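
In the output of nodetool status, each node line starts with a two-letter code, and UN means Up and Normal. Here is a small sketch for counting those lines, assuming a POSIX shell; count_up_nodes is a helper name I made up for illustration. For the two-node cluster in this post, the count should be 2 once both nodes have joined.

```shell
# count_up_nodes (hypothetical helper): reads `nodetool status` output on
# stdin and prints how many nodes are reported as UN (Up / Normal).
# `|| true` keeps the exit status at 0 when grep finds no match.
count_up_nodes() {
  grep -c '^UN' || true
}

# Example, assuming you run it from the installation directory:
# bin/nodetool status | count_up_nodes
```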
