Relational Data Store

Big Data: Hadoop Ecosystem

Relational Data Store with Hadoop

Pig, Hive, Sqoop

Apache Pig

Writing mappers and reducers by hand takes a long time. Pig introduces Pig Latin, a scripting language that lets you use SQL-like syntax to define your map and reduce steps.

Highly extensible with user-defined functions (UDFs)

Pig Latin is a high-level data flow scripting language; it is procedural (each statement is one step in the pipeline) rather than declarative like SQL

Execution Tools & Modes: Grunt shell or CLI; Local or MapReduce mode; Interactive or Batch

LOAD -> Transform -> Result: LOAD, FILTER, JOIN, GROUP, FOREACH ... GENERATE (for example, the group key with SUM), ORDER, DUMP
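The LOAD -> Transform -> Result flow can be illustrated without a cluster. The following is a minimal Python sketch, not Pig Latin itself: each step is labeled with the Pig operator it stands in for, and the rows, field names, and the `run_flow` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical input rows: (user, category, amount).
# In Pig these would come from a LOAD statement.
rows = [
    ("alice", "books", 12.0),
    ("bob", "games", 30.0),
    ("alice", "games", 8.0),
    ("carol", "books", 5.0),
    ("bob", "books", 0.0),
]


def run_flow(records):
    # FILTER: keep only rows with a positive amount
    filtered = [r for r in records if r[2] > 0]

    # GROUP: bucket the amounts by user
    grouped = defaultdict(list)
    for user, _category, amount in filtered:
        grouped[user].append(amount)

    # FOREACH ... GENERATE group, SUM(...): one output row per group
    totals = [(user, sum(amounts)) for user, amounts in grouped.items()]

    # ORDER: sort by total, largest first
    return sorted(totals, key=lambda t: t[1], reverse=True)


result = run_flow(rows)

# DUMP: print the final relation
for row in result:
    print(row)
```

Each intermediate variable plays the role of a Pig relation: a named result that the next statement consumes.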

External Web Site / Technical Documents / Installation Steps:
  • Apache Pig
  • Technical Documents/Presentation
  • Installation Steps

Apache Hive

    Distributes SQL queries with Hadoop: translates SQL queries into MapReduce or Tez jobs on your cluster.
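To make the translation idea concrete, here is a rough Python sketch of how a query such as SELECT dept, COUNT(*) FROM employees GROUP BY dept could be expressed as map, shuffle, and reduce phases. It is an illustration of the concept only, not Hive's actual query planner; the table, rows, and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows for: SELECT dept, COUNT(*) FROM employees GROUP BY dept
employees = [
    {"name": "ann", "dept": "eng"},
    {"name": "raj", "dept": "ops"},
    {"name": "li", "dept": "eng"},
]


def map_phase(row):
    # The GROUP BY column becomes the key; emit a 1 for every row
    yield row["dept"], 1


def shuffle(pairs):
    # Collect all values that share a key, as happens between map and reduce
    buckets = defaultdict(list)
    for key, value in pairs:
        buckets[key].append(value)
    return buckets


def reduce_phase(key, values):
    # COUNT(*) becomes a sum over the emitted ones
    return key, sum(values)


def run_query(rows):
    pairs = [kv for row in rows for kv in map_phase(row)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = run_query(employees)
print(counts)
```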

    The Apache Hive data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage. A command line tool and JDBC driver are provided to connect users to Hive
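Projecting structure onto data already in storage is often called schema-on-read. The Python sketch below shows the idea under simple assumptions: the raw text, the column names, and the `read_with_schema` helper are all hypothetical, and mapping unparseable values to None is meant to mirror how such values read back as NULL.

```python
import csv
import io

# Raw file contents as they might already sit in storage (hypothetical data)
RAW = "1,alice,34.5\n2,bob,12.0\n3,carol,not_a_number\n"

# Schema declared after the data was written: column name plus a converter
SCHEMA = [("id", int), ("name", str), ("score", float)]


def read_with_schema(text, schema):
    """Apply the schema while reading; the stored text is never rewritten."""
    for fields in csv.reader(io.StringIO(text)):
        row = {}
        for (name, convert), raw_value in zip(schema, fields):
            try:
                row[name] = convert(raw_value)
            except ValueError:
                # A value that does not fit the declared type reads back as None
                row[name] = None
        yield row


table = list(read_with_schema(RAW, SCHEMA))
for record in table:
    print(record)
```

Because the schema lives apart from the data, a different schema can be applied to the same stored text without reloading it.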

    Pros: Uses familiar SQL syntax (HiveQL); Interactive; Scalable, works with "big data" on a cluster; Easy OLAP queries; Highly optimized; Highly extensible

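    Cons: High latency, not appropriate for OLTP; Stores data de-normalized; HiveQL is more limited in what it can express than Pig or Spark; No transactions and no record-level updates, inserts, or deletes in classic Hive (newer releases add ACID support for transactional tables)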

    External Web Site / Technical Documents / Installation Steps:
  • Apache Hive
  • Technical Documents/Presentation
  • Installation Steps

Apache Sqoop

    Handles big data: Sqoop kicks off MapReduce jobs to import or export data

    Sqoop import: loads data from a relational database such as MySQL or Oracle into HDFS

    Sqoop import into Hive: loads data from a relational database such as MySQL or Oracle directly into Hive

    Sqoop incremental imports: keep your relational database and Hadoop in sync using --incremental, --check-column, and --last-value
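The logic behind --check-column and --last-value can be sketched in a few lines of Python. This is a simulation of append-style incremental import, not Sqoop's implementation: an in-memory SQLite database stands in for the relational source, a Python list stands in for HDFS, and the `orders` table and `incremental_import` helper are hypothetical.

```python
import sqlite3


def incremental_import(conn, table, check_column, last_value):
    """Fetch only rows whose check column is greater than last_value."""
    # Identifiers are interpolated directly; acceptable here only because
    # they are fixed, trusted names in this sketch.
    query = (
        f"SELECT * FROM {table} "
        f"WHERE {check_column} > ? ORDER BY {check_column}"
    )
    cursor = conn.execute(query, (last_value,))
    columns = [d[0] for d in cursor.description]
    rows = cursor.fetchall()

    # Remember the highest value seen so the next run starts after it
    new_last = last_value
    if rows:
        new_last = rows[-1][columns.index(check_column)]
    return rows, new_last


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, "pen"), (2, "ink")])

imported = []  # stand-in for files already landed in HDFS

# First run: everything after id 0 is new
batch, last = incremental_import(conn, "orders", "id", 0)
imported.extend(batch)

# A new row arrives in the source database
conn.execute("INSERT INTO orders VALUES (3, 'pad')")

# Second run: only rows after the saved last value are fetched
batch2, last2 = incremental_import(conn, "orders", "id", last)
imported.extend(batch2)

print(imported, last2)
```

The saved last value is what keeps repeated runs from importing the same rows twice.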

    Sqoop export: writes data from Hive to a relational database such as MySQL or Oracle
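As a rough guide to how these import and export cases differ on the command line, the Python sketch below assembles Sqoop argument lists without running anything. The JDBC URL, table name, and directory paths are hypothetical placeholders, the helper functions are illustrative, and a real invocation would normally also need credentials and other site-specific options.

```python
import shlex


def sqoop_import(jdbc_url, table, target_dir=None, into_hive=False):
    """Build arguments for importing one table into HDFS or into Hive."""
    args = ["sqoop", "import", "--connect", jdbc_url, "--table", table]
    if into_hive:
        # Load the imported data straight into a Hive table
        args.append("--hive-import")
    elif target_dir:
        # Otherwise write files to the given HDFS directory
        args += ["--target-dir", target_dir]
    return args


def sqoop_export(jdbc_url, table, export_dir):
    """Build arguments for exporting files in HDFS to a database table."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--table", table,
        "--export-dir", export_dir,
    ]


URL = "jdbc:mysql://dbhost/shop"  # hypothetical database

to_hdfs = sqoop_import(URL, "orders", target_dir="/data/orders")
to_hive = sqoop_import(URL, "orders", into_hive=True)
from_hive = sqoop_export(URL, "orders", "/user/hive/warehouse/orders")

for command in (to_hdfs, to_hive, from_hive):
    print(shlex.join(command))
```

Building the command as a list rather than a single string avoids quoting problems if it is later passed to a process launcher.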

    External Web Site / Technical Documents / Installation Steps:
  • Apache Sqoop
  • Technical Documents/Presentation
  • Installation Steps