Apache Sqoop(TM) is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.
Sqoop supports incremental loads of a single table or a free-form SQL query, as well as saved jobs that can be run multiple times to import updates made to a database since the last import. Imports can also be used to populate tables in Hive or HBase. Exports can be used to put data from Hadoop into a relational database.
The name Sqoop comes from SQL + Hadoop. Sqoop became a top-level Apache project in March 2012.
1 Sqoop basics
To list the tables in a database, use the sqoop list-tables command:
$ sqoop list-tables --driver com.informix.jdbc.IfxDriver --connect "jdbc:informix-sqli://host:port/dbname:INFORMIXSERVER=server ...
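The incremental import, saved job, and export features mentioned in the introduction could be invoked roughly as sketched below. This is only an illustration under stated assumptions: the JDBC URL, user name, table names, check column, and HDFS paths are all hypothetical, and the small run helper is not part of Sqoop. It is added here so that the commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: the connection string, user, tables, column, and paths are hypothetical.
CONNECT="jdbc:mysql://dbhost:3306/sales"   # assumed JDBC URL
DB_USER="etl"                              # assumed database user

# Helper (not part of Sqoop): print each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Incremental import: append rows whose id is greater than the last imported value.
run sqoop import --connect "$CONNECT" --username "$DB_USER" -P \
    --table orders --target-dir /data/orders \
    --incremental append --check-column id --last-value 0

# Saved job: Sqoop records the last imported value and reuses it on each run.
run sqoop job --create orders_sync -- import --connect "$CONNECT" \
    --username "$DB_USER" --table orders --target-dir /data/orders \
    --incremental append --check-column id --last-value 0
run sqoop job --exec orders_sync

# Export: write files from HDFS into an existing database table.
run sqoop export --connect "$CONNECT" --username "$DB_USER" -P \
    --table orders_summary --export-dir /data/orders_summary
```

Running the script as is only prints the four commands, which makes it easy to review them before pointing them at a real database with DRY_RUN=0.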
You can read more about integrating Hadoop with relational databases at: