Flume is a tool for ingesting near-real-time data. Learned how to configure a Flume agent and simulated live log data, ingesting it from a local directory into HDFS.
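As a rough illustration of the agent setup described above, the sketch below renders a minimal Flume properties file that wires a spooling-directory source through a memory channel to an HDFS sink. The agent name `a1`, the component names `src`, `ch`, and `snk`, the local directory, and the HDFS URL are all assumptions chosen for the example, not values from these notes.

```python
def flume_agent_config(agent, spool_dir, hdfs_path):
    """Render a Flume properties file: spooldir source -> memory channel -> HDFS sink."""
    lines = [
        # Name the components of this agent (names are hypothetical).
        f"{agent}.sources = src",
        f"{agent}.channels = ch",
        f"{agent}.sinks = snk",
        # Source: watch a local directory for newly dropped files.
        f"{agent}.sources.src.type = spooldir",
        f"{agent}.sources.src.spoolDir = {spool_dir}",
        f"{agent}.sources.src.channels = ch",
        # Channel: buffer events in memory between source and sink.
        f"{agent}.channels.ch.type = memory",
        # Sink: write events to HDFS as plain data rather than SequenceFiles.
        f"{agent}.sinks.snk.type = hdfs",
        f"{agent}.sinks.snk.hdfs.path = {hdfs_path}",
        f"{agent}.sinks.snk.hdfs.fileType = DataStream",
        f"{agent}.sinks.snk.channel = ch",
    ]
    return "\n".join(lines) + "\n"


# Assumed example values; adjust to the actual cluster.
config = flume_agent_config(
    "a1", "/tmp/flume/spool", "hdfs://localhost:8020/user/demo/logs"
)
print(config)
```

Saved to a file (say `logs.conf`, a hypothetical name), a config like this would typically be started with `flume-ng agent --conf conf --conf-file logs.conf --name a1`. Live log data can then be simulated by copying completed log files into the spool directory; the spooling-directory source expects files not to change once they are placed there.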
Apache Hive I
Hive is a tool that provides SQL-like (HiveQL) querying of data stored in HDFS and HBase. Defined tables in Hive to model and view data in HDFS.
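One way to picture defining a table over data already in HDFS is an external table whose LOCATION points at the directory holding the files. The sketch below only builds the HiveQL text; the table name `web_logs`, its columns, and the path are assumptions made up for illustration, and the helper does no escaping of its inputs.

```python
def create_external_table_sql(table, columns, location, delimiter=","):
    """Build a HiveQL CREATE EXTERNAL TABLE statement for delimited text files."""
    # One "name TYPE" entry per column, indented for readability.
    cols = ",\n".join(f"  {name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n{cols}\n)\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{location}';"
    )


# Hypothetical table layout over a hypothetical HDFS directory.
ddl = create_external_table_sql(
    "web_logs",
    [("ip", "STRING"), ("request_time", "STRING"), ("status", "INT")],
    "/user/demo/logs",
)
print(ddl)
```

Because the table is external, dropping it in Hive removes only the table metadata and leaves the files in HDFS untouched, which suits data that other tools also write or read.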
Apache Sqoop
Sqoop is a tool that is used to exchange data between a relational database and the Hadoop cluster using MapReduce jobs. Used Sqoop to import, export, and update data to and from a database.
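The import, export, and update operations mentioned above map onto the `sqoop import` and `sqoop export` commands. The sketch below assembles those command lines as argument lists; the JDBC URL, user name, table names, and HDFS paths are assumptions for illustration only, and nothing is actually executed.

```python
import shlex


def sqoop_import_cmd(jdbc_url, user, table, target_dir, mappers=1):
    """Arguments for copying a database table into an HDFS directory."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", user,
        "-P",  # prompt for the password instead of putting it on the command line
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),
    ]


def sqoop_export_cmd(jdbc_url, user, table, export_dir, update_key=None):
    """Arguments for pushing HDFS files back into a database table."""
    cmd = [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", user,
        "-P",
        "--table", table,
        "--export-dir", export_dir,
    ]
    if update_key:
        # Update rows matching the key column; insert rows that do not exist yet.
        cmd += ["--update-key", update_key, "--update-mode", "allowinsert"]
    return cmd


# Hypothetical connection details and paths.
url = "jdbc:mysql://localhost/shop"
import_cmd = sqoop_import_cmd(url, "demo", "orders", "/user/demo/orders")
export_cmd = sqoop_export_cmd(
    url, "demo", "order_totals", "/user/demo/totals", update_key="order_id"
)
print(shlex.join(import_cmd))
print(shlex.join(export_cmd))
```

Keeping the arguments as a list makes it straightforward to hand them to `subprocess.run` later without worrying about shell quoting.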