QnaList > Groups > Sqoop-User > Nov 2015
faq

Sqoop Oracle And MS Access

Hi,
I've been using sqoop for a while and have recently come across two issues:
   1. We have a use case where we import data as avro from an oracle
   database (so it's a standard use case).
   The problem is that there are really a lot of tables, and most of these
   tables contain fields with "strange" types such as "Timestamp(6) with
   Timezone".
   By default sqoop complains about these fields and breaks, unless one
   manually specifies how to convert these columns with "--map-column-java".
   (We tried this for a few tables, mapping unsupported types to String).
   As I mentioned the number of tables is so large that using this
   workaround is not a viable option.
   Is there a way to tell sqoop to convert unknown types to String when
   importing as avro? [I couldn't find anything on the documentation].
   2. Another use case we are facing is importing data from Microsoft
   Access Dbs. Is there a way to do this with sqoop? [I tried using a few jdbc
   drivers for Access, but with no success]
Thanks for the support!
Cheers,
GG

asked Nov 6 2015 at 00:32

Giovanni M.L.G. 's gravatar image



2 Replies for : Sqoop Oracle And MS Access
Giovanni,
Are you using Sqoop 1.4.6? Have you tried using the “—direct” option for Oracle in order to utilise the Data Connector for Oracle and Hadoop?
You can get more details in the documentation here: http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_data_connector_for_oracle_and_hadoop
This has much better support for Oracle data types and also better performance. By default it will convert timestamp with timezone to String as Sqoop normally stores dates as a number which loses the timezone information. There is a parameter documented here if you prefer to store the data as a number http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_java_sql_timestamp
In regards to Access I have never tried this with Sqoop but did you try the JDBC to ODBC bridge driver? From memory the easysoft driver lets you run a bridge on a Windows server that you can connect to from the Linux server so it may work.
David
From: Giovanni M.L.G. [mailto:[email protected]]
Sent: Friday, 6 November 2015 7:33 PM
To: [email protected]
Subject: Sqoop Oracle and MS Access
Hi,
I've been using sqoop for a while and have recently come across two issues:
  1.  We have a use case where we import data as avro from an oracle database (so it's a standard use case).
The problem is that there are really a lot of tables, and most of these tables contain fields with "strange" types such as "Timestamp(6) with Timezone".
By default sqoop complains about these fields and breaks, unless one manually specifies how to convert these columns with "--map-column-java". (We tried this for a few tables, mapping unsupported types to String).
As I mentioned the number of tables is so large that using this workaround is not a viable option.
Is there a way to tell sqoop to convert unknown types to String when importing as avro? [I couldn't find anything on the documentation].
  2.  Another use case we are facing is importing data from Microsoft Access Dbs. Is there a way to do this with sqoop? [I tried using a few jdbc drivers for Access, but with no success]
Thanks for the support!
Cheers,
GG

answered Nov 6 2015 at 15:32

David Robson 's gravatar image


Giovanni,
What is the issue you are facing with Ms Access through the JDBC drivers?  We are working on some contributions to make Sqoop work better with third party JDBC drivers.
Regarding the Oracle question, the DataDirect Oracle JDBC driver has a property to convert those types to string if you cannot find a way through Sqoop.  However, it is not a free solution.
http://media.datadirect.com/download/docs/jdbc/alljdbc/help.html?_ga=1.23504382.576670223.1301850762#page/userguide/rfi1359986091890.html#
FetchTSWTZasTimestamp
Purpose
Determines whether column values with the TIMESTAMP WITH TIME ZONE data type are returned as a JDBC CHAR or TIMESTAMP data type.
This property only applies to connections to Oracle9i and higher.
Valid Values
true | false
Behavior
If set to true, column values with the TIMESTAMP WITH TIME ZONE data type are returned as a JDBC TIMESTAMP data type.
If set to false, column values with the TIMESTAMP WITH TIME ZONE data type are returned as a JDBC VARCHAR data type.
Default
false
Data Type
boolean
See also
Sent from my iPad
Hi,
I've been using sqoop for a while and have recently come across two issues:
  1.  We have a use case where we import data as avro from an oracle database (so it's a standard use case).
The problem is that there are really a lot of tables, and most of these tables contain fields with "strange" types such as "Timestamp(6) with Timezone".
By default sqoop complains about these fields and breaks, unless one manually specifies how to convert these columns with "--map-column-java". (We tried this for a few tables, mapping unsupported types to String).
As I mentioned the number of tables is so large that using this workaround is not a viable option.
Is there a way to tell sqoop to convert unknown types to String when importing as avro? [I couldn't find anything on the documentation].
  2.  Another use case we are facing is importing data from Microsoft Access Dbs. Is there a way to do this with sqoop? [I tried using a few jdbc drivers for Access, but with no success]
Thanks for the support!
Cheers,
GG

answered Nov 6 2015 at 04:59

Sumit Sarkar 's gravatar image


Related discussions

Tagged

Group Sqoop-user

asked Nov 6 2015 at 00:32

active Nov 6 2015 at 04:59

posts:3

users:3

©2013 QnaList.com