Hi, I’m using Meltano to load CSV into oracle data...
# troubleshooting
t
Hi, I’m using Meltano to load CSV into oracle database using target-oracle. (this is my first time using Meltano) Is there any config to change the default column size from CHAR(255) to be higher? (some columns cannot be inserted)
h
Hi, I’m developing that target, the max length of 255 has to be a copy-paste error on my side. If you give me a day, I will have an update ready for you. I don’t (yet) use it in production myself, so any feedback you have would be very valuable.
t
Thank you for your response Henning. Yes, I can wait for the update 🙂
h
I have updated the target so that it writes varchar(2000) by default, let me know if that works for you.
t
Thanks Henning. This should work for me. I have encountered another issue while trying to load the data from the
tap-spreadsheets-anywhere
to insert into
target-oracle
. The tap tries to add columns
_smart_source_bucket
,
_smart_source_file
,
_smart_source_lineno
. This generates the sql query
Copy code
INSERT INTO sample_data_temp (id, name, _smart_source_bucket, _smart_source_file, _smart_source_lineno) 
VALUES (1, 'Test', 'filepath', 'SAMPLEDATA.xlsx', 3)
;
which will generate error
Copy code
SQL Error: ORA-00911: invalid character
00911. 00000 -  "invalid character"
*Cause:    The identifier name started with an ASCII character other than a
           letter or a number. After the first character of the identifier
           name, ASCII characters are allowed including "$", "#" and "_".
           Identifiers enclosed in double quotation marks may contain any
           character other than a double quotation. Alternate quotation
           marks (q'#...#') cannot use spaces, tabs, or carriage returns as
           delimiters. For all other contexts, consult the SQL Language
           Reference Manual.
*Action:   Check the Oracle identifier naming convention. If you are
           attempting to provide a password in the IDENTIFIED BY clause of
           a CREATE USER or ALTER USER statement, then it is recommended to
           always enclose the password in double quotation marks because
           characters other than the double quotation are then allowed.
because the column name starts with
_
is not wrapped in double quote. I tried to check the config of tap-spreadsheets-anywhere to turn off the column insertion but not found one. Could you help check this?
h
Damn, I thought I had taken care of that issue… I will dig into it. I’ll have something for you later today (Europe time).
I used the opportunity to clean up some more special-character issues, but that means that these changes in principle are breaking. Just so I don’t yolo-merge this, I have created a new branch I hope you can test. Just change the definition in your
meltano.yaml
:
Copy code
- name: target-oracle
    variant: radbrt
    pip_url: git+<https://github.com/radbrt/target-oracle.git@illegal_colnames>
This should move leading underscores to trailing underscores, and handle a few edge cases involving periods etc. For normal column names it should not matter at all.
t
I tested the
target-oracle
with your branch and it works fine. Now these columns can be inserted. Thanks. I have another issue regarding the column discovery. It seems that the
target-oracle
will create the column type of Number(38,0) for the integer columns and Number(22,16) for the float columns. The Number(22,16) means there are 16 decimals and only 6 digits for numbers so I cannot insert the data higher than 1 million into that column. Could you please check? (it will return
ORA-01438: value larger than specified precision allowed for this column
). Maybe make it (38,16)? One more question, is there a way to possible force all columns to be created as string column?
h
Ah, thanks for The feedback. The 38,0 thing was weird, possibly a leftover from snowflake etc. As for the decimals, I see my mistake. 22+16=38… I’ll merge the branch + fix the decimal thing this afternoon. I have said it before, but thank you very much for your patience and feedback here. It is great help in working out the bugs.
Regarding the decimal number definition, let me know if you have any opinion about a good “generic” definition. I don’t have any strong opinions, and intended to write 38,16 simply because it would leave 16 decimals.
t
Based on my experience the number in front of the decimals could go up to 13-17 digits (trillions) esp. with our currency. For the decimals, it could be up to 6 digits from my personal experience. So I would suggest (38, 10) as a starting point.
h
I merged the previous branch, and created a new one named
number_datatypes
with some changes to number types: • By default, NUMBER(38,10) is used • A new config
prefer_float_over_numeric
is
false
by default, but can be set to true so that it uses FLOAT datatypes for numerics. I strongly suspect numbers are converted to floats at some point during the ingest process anyways, so that using float in the DB is actually the most accurate representation of the data type at that point. Let me know how it works in reality.
t
The fix in branch
number_datatypes
seems to work. I managed to EL the first excel file into Oracle db. Let me test the remaining excel files and will get back to you.
@Henning Holgersen We have managed to EL most of our excel files. There is one problem remaining which is the size of CHAR columns (which is 2000 as per last fix). There is a column that has the original of 1000 Thai characters (which can take up to 3000 bytes max.). Is it possible to change the column type of string characters to NVARCHAR2(4000) as per Oracle documentation? (which should cover most used cases) >> https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/datatype-limits.html
h
I came to think about that after I pushed… give me an hour.
Branch is updated, should work if you
meltano install --clean
.
t
Yep, that should work. Thank you for your help. We finally managed to import all excels into the Oracle db. 👍
h
Excellent. I’ll merge the changes in. And create a new tag then.