Hadoop Tutorial provides an introduction to working with big data in Hadoop via the Hortonworks Sandbox, HCatalog, Pig and Hive. Learn how to handle big data.

What is an InputSplit in MapReduce?

An input split is the slice of data to be processed by a single Mapper. Its size generally equals the size of an HDFS block stored on a DataNode. Hadoop is an open-source software framework, written in Java, for distributed storage and distributed processing of very large data sets (Big Data) on computer clusters built from commodity hardware.

Consider the following scenario in a MapReduce system:

- HDFS block size is 64 MB
- Input format is FileInputFormat
- We have 3 files of size 64 KB, 65 MB and 127 MB

How many input splits will be made by the Hadoop framework?

Counting one split per HDFS block, Hadoop will make 5 splits as follows:

- 1 split for the 64 KB file
- 2 splits for the 65 MB file
- 2 splits for the 127 MB file
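
The arithmetic behind this answer can be sketched in a few lines of Python. This is only an illustration under the assumption of one split per HDFS block; the function name count_splits is made up for this example and is not part of Hadoop. A real FileInputFormat may also fold a small trailing remainder into the previous split, which this sketch deliberately ignores.

```python
KB = 1024
MB = 1024 * KB

def count_splits(file_size, block_size):
    """Hypothetical helper: splits for one file, assuming one split per block."""
    # Integer ceiling division: a partly filled last block still needs a split.
    return (file_size + block_size - 1) // block_size

block_size = 64 * MB
files = {"64 KB": 64 * KB, "65 MB": 65 * MB, "127 MB": 127 * MB}

splits = {name: count_splits(size, block_size) for name, size in files.items()}
total = sum(splits.values())
print(splits, total)
```

The 64 KB file fits in one block, while the 65 MB and 127 MB files each spill into a second block, which gives the total of 5.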


Is it possible to create multiple tables in Hive for the same data?


Initially it did not look possible to me, as I am a regular RDBMS user like most programmers, but then I tried it in the Hive context. I found that it is possible, because Hive creates a schema and applies it on top of an existing data file. One can have multiple schemas for one data file: each schema is saved in Hive's metastore, and the data is not parsed, read or serialized to disk in the given schema. The schema is used only when someone tries to retrieve the data. For example, if my file has 5 columns (Id, Name, Class, Section, Course), we can have multiple schemas by choosing any number of columns.
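
The idea of applying a schema only at read time can be imitated in plain Python. This is a rough sketch of the concept, not Hive itself: the sample rows, the table names student_full and student_basic, and the read_table helper are all invented for illustration.

```python
# Raw rows as they might sit in one comma-delimited data file (made-up sample data).
raw_lines = [
    "1,Asha,10,A,Math",
    "2,Ravi,11,B,Physics",
]

# Two hypothetical "tables" over the same file: each is only a list of column names,
# playing the role of a schema kept separately from the data.
schemas = {
    "student_full": ["Id", "Name", "Class", "Section", "Course"],
    "student_basic": ["Id", "Name"],
}

def read_table(table_name):
    """Apply the named schema to the raw lines at read time."""
    columns = schemas[table_name]
    rows = []
    for line in raw_lines:
        fields = line.split(",")
        # zip stops at the shorter sequence, so fields beyond the schema are ignored.
        rows.append(dict(zip(columns, fields)))
    return rows

full_rows = read_table("student_full")
basic_rows = read_table("student_basic")
print(basic_rows)
```

Both tables read the same raw_lines; nothing is copied or rewritten when a second schema is added, and only the interpretation at read time differs.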


What are 'maps' and 'reduces'?


'Maps' and 'Reduces' are the two phases of solving a query in MapReduce over data stored in HDFS.
'Map' is responsible for reading the data from the input location and, based on the input type, generating key-value pairs, that is, an intermediate output on the local machine.
'Reduce' is responsible for processing the intermediate output received from the mappers and generating the final output.
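
A small word count shows the two phases in miniature. This is a single-process Python sketch of the idea, not the Hadoop API: the function names map_phase, shuffle and reduce_phase and the sample lines are assumptions made for this example, and the grouping step in the middle stands in for what the framework does between the two phases.

```python
from collections import defaultdict

def map_phase(line):
    """Emit an intermediate (key, value) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group intermediate values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a final (key, total) pair."""
    return key, sum(values)

lines = ["big data big cluster", "data node"]

# Map: every input line produces its own intermediate pairs.
intermediate = [pair for line in lines for pair in map_phase(line)]

# Reduce: one call per distinct key over the grouped intermediate output.
result = dict(reduce_phase(key, values) for key, values in shuffle(intermediate).items())
print(result)
```

Here the map step only looks at one line at a time, while the reduce step sees every value for a given word, which mirrors the division of work described above.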
