hi everyone, how to configure target-oracle? Is th...
# getting-started
r
hi everyone, how to configure target-oracle? Is there any example to make it works from postgres to oracle?
h
Hi, I'm maintaining target-oracle, it is a good question so as a first step I created a branch that contains a demo config including a small tap to read from that should work (and a docker-compose file so you can test): https://github.com/radbrt/target-oracle/tree/demo I'm not at my normal computer right now and don't have the oracle drivers, so I can't actually verify that it works yet. Installing the oracle drivers is the biggest pain point. It uses the python library cx-Oracle for connecting. Also, fyi, I'm mostly on mac/linux, so there might be issues on Windows I'm not aware of.
Sorry, I somehow forgot to commit everything before I pushed, it is there now.
t
It looks like a lot of those config options aren’t listed on the Hub https://hub.meltano.com/loaders/target-oracle cc @pat_nadolny
r
thanks everyone for support. @Henning Holgersen you are right, driver for oracle is the key. I found the solution. I am using Linux, so I just installed the packages based on this for Oracle. Now, it is fine based on your target-oracle https://www.oracle.com/database/technologies/instant-client/linux-x86-64-downloads.html
@Henning Holgersen I noticed that from the Postgres table to the Oracle table, it overrides the table's properties (column types) in Oracle, which already existed/created. Is there any way to work around this? For example, from a Postgres table to another Postgres table, it won't override the table's properties or types if it has already been created or existed in the destination database.
h
@rida Right, I haven’t actually tested this explicitly, the most common use case I see is tables that are created and maintained by Meltano and the target in a somewhat yolo fashion. The short answer is that I can’t currently give any guarantees. The rambling answer is that I imagine that between postgres DBs, especially if you are using the meltanolabs tap+target variants, it is easier to create consistent mappings from postgres-type to singer-type to postgres-type. One solution would be to go through the mappings and improve them a little, but since “data contracts” are hot these days I’d be equally interested in a “don’t touch the schema” option that would fail rather than to create or alter columns. Any thoughts on these options? If you can give some examples of column type mappings (postgres-type => oracle-type) you think are odd, I’d be interested to see.
r
@Henning Holgersen In Oracle, I have to work with specific column types. It is important to keep it this way. That is the main reason I have to create a table before loading data from Postgres to Oracle, which means I only need data to load unlike another method to load everything including its table's property. Yes, you are correct about don't touch the schema because I also need to load this exact same table to other databases too. It will be a problem later on. This is why currently I am not touching the schema, and seeking your idea about an alternative solution for target-oracle. As you mentioned, tap-postgres target-postgres, it won't override the table's property according to its capability. For my case, please take a look below: existed table in Oracle: right side image or 2nd image Overrides by Postgres table: left side image or 1st image
h
@rida ah, so a column that is for example varchar(125) in postgres becomes varchar(4000) in oracle... I'm not totally surprised. If you are OK with maintaining the schema in Oracle manually, the most robust solution might be to add an option to not allow any changes to schema (which means runs will fail if there are incompatibilities). This is an option I have pondered before, as a general matter. Alternatively we could try to go through the type conversions so they are to be more true to the original, but I suspect there will still be some curious type conversions if we do that.
r
@Henning Holgersen Thanks for your response. Manually maintaining its types is the issue because every time it loads, it will replace all which we change. About schema, how to achieve that with Postgres to Oracle?
h
@rida I'm not entirely sure I understand what you mean, but it sounds like you really need the types to stay exactly consistent without maintaining the schema yourself? For the varchar examples that might be doable (I might have some time to work on that in a few days), but you will face similar issues with other column types that can't be solved easily. The singer type system simply isn't as detailed as the databases.
r
@Henning Holgersen Let me clarify it further. Actually, I want only the data from Postgres to Oracle because I already had the table in Oracle. Therefore, when it loads, it should load only the data and it should not drop our existing table and re-create the table in Oracle. Currently, it did the opposite of what I want. So, if it will process like I want, it will load the data, and it won't touch/change the existing table's property (types, etc.). I hope it clarifies the misunderstanding.
@Henning Holgersen The main problem is that I can't convert that data type for example varchar2(4000) to varchar2(15) for customer_id on Oracle. When loading from Postgres to Oracle, it overrides varchar2(15) to varchar2(4000). I tried schema and/or metadata following the doc: https://docs.meltano.com/concepts/plugins/#load-schema-extra but still no luck. any example or hint would be really helpful
h
@rida I will probably have some time to work on it this weekend. My first instinct is to add a "don't touch my schema" option - in your use case, this might work if you start by truncating the target table (instead of recreating it). Generally, a lot of targets end up creating varchar(many thousands) simply because it usually doesn't hurt and it reduces the risk of errors. But I will, of course, also look at following the schema more closely.
r
@Henning Holgersen awesome, looking forward to see the update
h
@rida I have a rudimentary prototype of the freeze-schema option, you can try this now:
Copy code
- name: target-oracle
    variant: radbrt
    pip_url: git+<https://github.com/radbrt/target-oracle.git@freeze-schema>
    config:
      freeze_schema: true
Basically what this branch and this config does is that it will still create tables for you, and it will still add columns if they don’t exist (and it will still create varchar(4000) columns), but if you create the columns beforehand with the datatype you want, it will not change them - it will simply try to insert the rows and if it fails, it fails. Let me know if it makes sense and works for you.
r
@Henning Holgersen thanks for the work. However, it makes no changes. I followed what you mentioned above and I even went further to re-create a new project to test this, but still not working for me. It is still the same issued case.
h
@rida hmmm… I must have misunderstood your workflow. What I thought would work for you: 1. Manually run a create table statement in your target database, so that you are guaranteed to have the column types you want 2. Load some data into it with target-oracle 3. Truncate the target table (with the
TRUNCATE TABLE ...
command) before you run a full reload 4. Run a full load After this, the columns types should stay the same. If you drop and recreate the table instead, it won’t work. One more question: what tap are you using? I tried the meltanolabs variant of tap-postgres, and as far as I could tell it didn’t send any information about column lengths, which means it isn’t possible to recreate the source length.
r
@Henning Holgersen I used tap-postgres target-oracle. About the created table, it was already created in Oracle. But i will try with what you stated above again.
@Henning Holgersen I tried like what you mentioned above, but it still overrides the columns' datatypes. I checked it seems it overrides only the string type. For example, it overrides customer_id varchar2(15) to become varchar2(4000). It used varchar2(4000) for all string type.
h
@rida hmmm… I tested this myself, did you remember to copy the yaml I wrote above? In your meltano.yml file you need to specify the branch name in your pip_url, and you need to specify freeze_schema: true in the config. After you double-check this, you need to reinstall it by running ‘meltano install —clean’. If this doesn’t work, I’d be happy to jump on a call with you to help debugging. If this works and makes sense for more than me, I will merge the branch.
r
@Henning Holgersen ‘meltano install —clean’ did the work. It is working fine now. Thanks a lot. But I wonder one more thing. How to deal with truncating/deleting all table data in Oracle before loading it?
h
Great to hear @rida. I don’t plan on adding a truncate feature (most targets don’t have anything like that), the few times I need to do a full load with delete I orchestrate it through Prefect (dagster and airflow works just as well), to truncate and then run meltano from within the same workflow.
u
FWIW theres an open issue for this truncate behavior in the SDK and I started work on a PR but we're still trying to figure out the safest way to implement it. Feel free to add your thoughts in that issue or PR if you have any
r
@pat_nadolny awesome, thank you for this. I am looking forward to seeing this feature.