Saturday, April 16, 2016

Hadoop Questions & Answers - PART2

1. What is rack awareness?
Ans: Rack awareness is the way in which the namenode decides how to place blocks based on the rack definitions. Hadoop tries to minimize network traffic by keeping communication between datanodes within the same rack where it can, and contacts remote racks only if it has to. The namenode is able to control this because it knows which rack each datanode belongs to.
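
A minimal sketch of the "prefer the same rack" idea, assuming the commonly described tree distance between nodes (0 for the same node, 2 for the same rack, 4 for a different rack). The rack and host names are hypothetical, and this is an illustration rather than Hadoop's actual placement code.

```python
def distance(a, b):
    """Tree distance between two nodes given as (rack, host) tuples."""
    if a == b:
        return 0          # same node
    if a[0] == b[0]:
        return 2          # same rack, different host
    return 4              # different rack

def closest_replica(reader, replicas):
    """Pick the replica that is cheapest for the reader to reach."""
    return min(replicas, key=lambda r: distance(reader, r))

# Hypothetical cluster layout, for illustration only.
reader = ("/rack1", "node-a")
replicas = [("/rack2", "node-x"), ("/rack1", "node-b"), ("/rack2", "node-y")]

print(closest_replica(reader, replicas))  # ('/rack1', 'node-b')
```

The replica on the reader's own rack wins because reaching it does not cross a rack boundary.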

2. Which file holds the Hadoop core configuration?
Ans: core-default.xml holds the default values; site-specific overrides go in core-site.xml.
3. Is there an HDFS command to see the available free space in HDFS?
Ans: hadoop dfsadmin -report (on newer releases, hdfs dfsadmin -report)
4. The requirement is to add a new data node to a running Hadoop cluster; how do I start services on just one data node?
Ans: You do not need to shut down or restart the entire cluster in this case. First, add the new node's DNS name to the conf/slaves file on the master node. Then log in to the new slave node and execute:
    $ cd path/to/hadoop
    $ bin/hadoop-daemon.sh start datanode
    $ bin/hadoop-daemon.sh start tasktracker
Then issue hadoop dfsadmin -refreshNodes and hadoop mradmin -refreshNodes so that the NameNode and JobTracker know about the node that has been added.

5. How do you gracefully stop a running job?
Ans: hadoop job -kill jobid

Does the name-node stay in safe mode till all under-replicated files are fully replicated?
Ans: No. During safe mode, replication of blocks is prohibited. The name-node waits until all or a majority of data-nodes report their blocks.
6. What happens if one Hadoop client renames a file or a directory containing this file while another client is still writing into it?
Ans: A file will appear in the name space as soon as it is created. If a writer is writing to a file and another client renames either the file itself or any of its path components, then the original writer will get an IOException either when it finishes writing to the current block or when it closes the file.
7. How to make a large cluster smaller by taking out some of the nodes?
Ans: Hadoop offers the decommission feature to retire a set of existing data-nodes. The nodes to be retired should be included in the exclude file, and the exclude file name should be specified as the configuration parameter dfs.hosts.exclude. The decommission process can be terminated at any time by editing the configuration or the exclude files and repeating the -refreshNodes command.
8. Can we search for files using wildcards?
Ans: Yes. For example, to list all the files which begin with the letter a, you could use the ls command with the * wildcard: hdfs dfs -ls a*
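
To see what a pattern like a* selects, here is a small sketch using Python's standard fnmatch module on a hypothetical list of file names. It only mimics the glob semantics; it does not talk to HDFS.

```python
from fnmatch import fnmatchcase

# Hypothetical directory listing, for illustration only.
names = ["access.log", "app.jar", "build.xml", "data.csv"]

# Keep only the names that match the glob pattern "a*".
matches = [name for name in names if fnmatchcase(name, "a*")]

print(matches)  # ['access.log', 'app.jar']
```

On a real command line it is usually safer to quote the pattern (for example 'a*') so that the local shell does not expand it against local files before HDFS sees it.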
9. What happens when two clients try to write into the same HDFS file?
Ans: HDFS supports exclusive writes only. When the first client contacts the name-node to open the file for writing, the name-node grants a lease to the client to create this file. When the second client tries to open the same file for writing, the name-node will see that the lease for the file is already granted to another client, and will reject the open request for the second client.
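
The lease behaviour described above can be sketched with a toy model. The LeaseManager class and its method names are hypothetical and exist only for this illustration; the real name-node also handles lease renewal and expiry, which are left out here.

```python
class LeaseManager:
    """Toy model of per-file write leases (not the HDFS implementation)."""

    def __init__(self):
        self._holders = {}  # path -> client currently holding the lease

    def open_for_write(self, path, client):
        holder = self._holders.get(path)
        if holder is not None and holder != client:
            # Someone else already holds the lease: reject the open request.
            raise PermissionError(f"{path} is already leased to {holder}")
        self._holders[path] = client

    def close(self, path, client):
        # Only the lease holder can release the lease.
        if self._holders.get(path) == client:
            del self._holders[path]

manager = LeaseManager()
manager.open_for_write("/data/log.txt", "client-1")

try:
    manager.open_for_write("/data/log.txt", "client-2")
    second_open_rejected = False
except PermissionError:
    second_open_rejected = True

print(second_open_rejected)  # True
```

Once client-1 closes the file, the lease is released and another client can open it for writing.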
10. What does "file could only be replicated to 0 nodes, instead of 1" mean?
Ans: The namenode does not have any available DataNodes.
11. What is a Combiner?
Ans: The Combiner is a ‘mini-reduce’ process which operates only on data generated by a mapper. The Combiner receives as input all data emitted by the Mapper instances on a given node. The output from the Combiner, instead of the output from the Mappers, is then sent to the Reducers.

Consider this scenario in a MapReduce system:
- HDFS block size is 64 MB
- Input format is FileInputFormat
- We have 3 files of size 64 KB, 65 MB and 127 MB
12. How many input splits will be made by Hadoop framework?
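Ans: Hadoop will make 5 splits, counting one split per block:
- 1 split for the 64 KB file
- 2 splits for the 65 MB file
- 2 splits for the 127 MB file
Note that FileInputFormat lets the last split of a file be up to 10% larger than the split size, so in practice the 65 MB file may be processed as a single split, giving 4 splits in total.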
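
The split arithmetic can be checked with a short sketch. It assumes the split size equals the 64 MB block size; the slop parameter models the 1.1 tolerance that FileInputFormat is understood to apply to the last split, and the function name is hypothetical.

```python
KB = 1024
MB = 1024 * KB

def count_splits(file_size, split_size=64 * MB, slop=1.0):
    """Count input splits for one file.

    slop=1.0 is plain block-by-block counting; slop=1.1 lets the final
    split be up to 10% larger than split_size (an assumption about the
    framework's behaviour, stated in the lead-in above).
    """
    splits = 0
    remaining = file_size
    while remaining / split_size > slop:
        splits += 1
        remaining -= split_size
    if remaining > 0:
        splits += 1  # whatever is left becomes the last split
    return splits

files = [64 * KB, 65 * MB, 127 * MB]

print(sum(count_splits(f) for f in files))            # 5
print(sum(count_splits(f, slop=1.1) for f in files))  # 4
```

The two results differ only for the 65 MB file, whose 1 MB tail is small enough to be folded into the first split when the tolerance applies.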
13. Suppose Hadoop spawned 100 tasks for a job and one of the tasks failed. What will Hadoop do?
Ans: It will restart the task on some other TaskTracker. Only if the task fails four times (the default setting, which can be changed) will it kill the job.
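
A rough sketch of this retry policy, assuming a limit of four attempts per task. The function and task names are hypothetical; a real cluster would also reschedule the attempt on a different node, which this model ignores.

```python
def run_with_retries(task, max_attempts=4):
    """Run task(attempt) until it succeeds or the attempt limit is reached.

    Returns (succeeded, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            task(attempt)
            return True, attempt
        except Exception:
            continue  # a failed attempt is simply retried
    return False, max_attempts

def flaky(attempt):
    # Simulated task that fails on its first two attempts.
    if attempt < 3:
        raise RuntimeError("simulated failure")

def always_fails(attempt):
    raise RuntimeError("simulated failure")

print(run_with_retries(flaky))         # (True, 3)
print(run_with_retries(always_fails))  # (False, 4)
```

In the second case the task has used up all of its attempts, which is the point at which the job as a whole would be failed.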
14. What are the problems with small files and HDFS?
Ans: HDFS is not good at handling a large number of small files, because every file, directory and block in HDFS is represented as an object in the namenode’s memory, each of which occupies approximately 150 bytes. So 10 million files, each using one block, would use about 3 gigabytes of memory. With a billion files, the memory requirement on the namenode cannot be met.
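
The 3 gigabyte figure follows from counting one file object plus one block object per file at roughly 150 bytes each. A small sketch of that arithmetic (directories are ignored, and the function name is hypothetical):

```python
BYTES_PER_OBJECT = 150  # approximate size of one namenode object

def namenode_bytes(num_files, blocks_per_file=1):
    """Estimate namenode memory for files and their blocks."""
    objects = num_files + num_files * blocks_per_file
    return objects * BYTES_PER_OBJECT

print(namenode_bytes(10_000_000) / 1e9)     # 3.0   (GB for 10 million files)
print(namenode_bytes(1_000_000_000) / 1e9)  # 300.0 (GB for a billion files)
```

At a billion single-block files the estimate reaches hundreds of gigabytes, which is why the namenode heap becomes the limiting factor.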
15. What is speculative execution in Hadoop?
Ans: If a node appears to be running slowly, the master node can redundantly execute another instance of the same task, and the output of whichever instance finishes first is taken. This process is called speculative execution.
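
A deliberately simplified model of the "first output wins" rule, assuming the backup attempt is launched some time after the original. All names and timings are hypothetical.

```python
def speculative_finish(original_secs, backup_secs, backup_launch_secs):
    """Return which attempt finishes first and at what time (in seconds)."""
    original_done = original_secs
    backup_done = backup_launch_secs + backup_secs
    if backup_done < original_done:
        return "backup", backup_done
    return "original", original_done

# Slow original (300 s); backup launched at 90 s and taking 60 s.
print(speculative_finish(300, 60, 90))  # ('backup', 150)

# Original finishes at 100 s, before the backup could complete.
print(speculative_finish(100, 60, 90))  # ('original', 100)
```

Whichever attempt loses the race is discarded, so the job never waits longer than it would have without the backup.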
16. Can Hadoop handle streaming data?
Ans: Yes. Technologies such as Apache Kafka, Apache Flume, and Apache Spark make large-scale streaming possible.
17. Why is Checkpointing Important in Hadoop?
Ans: As more and more files are added, the namenode creates large edit logs, which can substantially delay NameNode startup because the NameNode reapplies all the edits. Checkpointing is a process that takes an fsimage and an edit log and compacts them into a new fsimage. This way, instead of replaying a potentially unbounded edit log, the NameNode can load the final in-memory state directly from the fsimage. This is a far more efficient operation and reduces NameNode startup time.
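
The compaction step can be pictured with a toy namespace: the fsimage is a snapshot, the edit log is a list of operations, and a checkpoint folds the log into a new snapshot. The data structures and function names here are hypothetical simplifications, not the on-disk formats.

```python
def apply_edits(namespace, edits):
    """Replay edit-log entries on top of a namespace snapshot."""
    state = dict(namespace)
    for op, path in edits:
        if op == "create":
            state[path] = True
        elif op == "delete":
            state.pop(path, None)
    return state

def checkpoint(fsimage, edit_log):
    """Merge fsimage and edit log into a new fsimage and an empty log."""
    return apply_edits(fsimage, edit_log), []

fsimage = {"/a": True}
edit_log = [("create", "/b"), ("delete", "/a"), ("create", "/c")]

new_fsimage, new_log = checkpoint(fsimage, edit_log)

print(sorted(new_fsimage))  # ['/b', '/c']
print(new_log)              # []
```

After the checkpoint, startup only needs to load the new snapshot; there are no edits left to replay.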
18. What is Twitter Bootstrap?
Ans: Bootstrap is a sleek, intuitive, and powerful mobile first front-end framework for faster and easier web development. It uses HTML, CSS and JavaScript.
19. Why use Bootstrap?
Ans: Bootstrap is worth using for these reasons:
- Mobile first approach: Since Bootstrap 3, the framework consists of mobile first styles throughout the entire library instead of in separate files.
- Browser support: It is supported by all popular browsers.
- Easy to get started: With just the knowledge of HTML and CSS anyone can get started with Bootstrap. The official Bootstrap site also has good documentation.
- Responsive design: Bootstrap's responsive CSS adjusts to desktops, tablets and mobiles.
- It provides a clean and uniform solution for building an interface for developers.
- It contains beautiful and functional built-in components which are easy to customize.
- It also provides web-based customization.
- Best of all, it is open source.
20. What does the Bootstrap package include?
Ans: The Bootstrap package includes:
- Scaffolding: Bootstrap provides a basic structure with Grid System, link styles, background. This is covered in detail in the section Bootstrap Basic Structure.
- CSS: Bootstrap comes with global CSS settings, fundamental HTML elements styled and enhanced with extensible classes, and an advanced grid system. This is covered in detail in the section Bootstrap with CSS.
- Components: Bootstrap contains over a dozen reusable components built to provide iconography, dropdowns, navigation, alerts, popovers, and much more. This is covered in detail in the section Layout Components.
- JavaScript Plugins: Bootstrap contains over a dozen custom jQuery plugins. You can easily include them all, or one by one. This is covered in detail in the section Bootstrap Plugins.
- Customize: You can customize Bootstrap's components, LESS variables, and jQuery plugins to get your very own version.

21. What are the contextual classes for tables in Bootstrap?
Ans: The contextual classes allow you to change the background color of your table rows or individual cells.
Class       Description
.active     Applies the hover color to a particular row or cell
.success    Indicates a successful or positive action
.warning    Indicates a warning that might need attention
.danger     Indicates a dangerous or potentially negative action

22. What is the Bootstrap Grid System?
Ans: Bootstrap includes a responsive, mobile first fluid grid system that appropriately scales up to 12 columns as the device or viewport size increases. It includes predefined classes for easy layout options, as well as powerful mixins for generating more semantic layouts.
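
To make the 12-column rule concrete, here is a small sketch that builds the markup for one row from a list of column spans. The grid_row helper is hypothetical and only produces class names of the col-md-N form. In Bootstrap itself, columns beyond 12 wrap onto a new line; this helper simply rejects them to keep the example strict.

```python
def grid_row(spans, tier="md"):
    """Build the HTML for one grid row from a list of column spans."""
    if any(span < 1 or span > 12 for span in spans):
        raise ValueError("each span must be between 1 and 12")
    if sum(spans) > 12:
        raise ValueError("spans add up to more than 12 columns")
    columns = "\n".join(
        f'    <div class="col-{tier}-{span}"></div>' for span in spans
    )
    return f'<div class="row">\n{columns}\n</div>'

print(grid_row([4, 8]))
```

A call such as grid_row([4, 8]) yields a row with a one-third-width column followed by a two-thirds-width column at the chosen size tier.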
23. What are Bootstrap media queries?
Ans: Media queries in Bootstrap allow you to move, show and hide content based on viewport size.

Show a basic grid structure in Bootstrap.
Ans: Following is the basic structure of a Bootstrap grid:
    <div class="container">
       <div class="row">
          <div class="col-*-*"></div>
          <div class="col-*-*"></div>
       </div>
       <div class="row">...</div>
    </div>
    <div class="container">....
24. What are Offset columns?
Ans: Offsets are a useful feature for more specialized layouts. They can be used to push columns over for more spacing, for example. The .col-xs-* classes don't support offsets, but they are easily replicated by using an empty cell.
25. How can you order columns in Bootstrap?
Ans: You can easily change the order of built-in grid columns with the .col-md-push-* and .col-md-pull-* modifier classes, where * ranges from 1 to 11.
