# singer-tap-development
Hi everyone! I'm working on a tap for dBase files using the SDK. I'm building a PoC to make it filesystem-agnostic, so it can read files from the OS, S3, Google Drive, etc. That part is built on top of Will McGugan's PyFilesystem. I'm able to temporarily patch the
function so the
package opens files with pyfilesystem2's methods. You can see that here: https://github.com/edgarrmondragon/tap-dbf/blob/patch-fs-open/tap_dbf/tap.py#L91. The only issue is
can load records in two ways: read everything into memory during instantiation, or lazily iterate from the file. The first mode is not ideal for large files but works well with the patch since nothing else is trying to read files in the same context. The second one fails with `fs.errors.ResourceNotFound: resource '/etc/timezone' not found`. That is because I'm patching
when the first record is read but `pendulum.now()` in the SDK is also trying to read from the filesystem. This is not a bug in the SDK per se, but
would fix the issue and afaik not break anything since
to UTC anyway. So, do you think this merits an issue? An MR? Or that I shouldn't try this patching witchery 😅?
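For anyone who wants to see the failure mode in isolation, here is a minimal stdlib-only sketch of what I think is happening. Everything in it is a stand-in: `FAKE_FS`, `fake_open`, `read_eager`, `read_lazy` and `unrelated_local_read` are hypothetical names, the local `ResourceNotFound` class only imitates the PyFilesystem error, and reading `os.devnull` plays the role of any library that touches the local disk. It is not the actual tap or SDK code.

```python
import os
from unittest import mock

# Hypothetical in-memory "remote filesystem": path -> records.
FAKE_FS = {"/data/table.dbf": ["row1", "row2", "row3"]}


class ResourceNotFound(Exception):
    """Stand-in for the error a virtual filesystem raises on a missing path."""


def fake_open(path, *args, **kwargs):
    # Replacement for open() that only knows about FAKE_FS.
    try:
        return iter(FAKE_FS[path])
    except KeyError:
        raise ResourceNotFound(f"resource '{path}' not found") from None


def read_eager(path):
    # The patch is active only while all records are loaded, then restored.
    with mock.patch("builtins.open", fake_open):
        return list(open(path))


def read_lazy(path):
    # The patch stays active between yields, so whatever the consumer
    # runs while iterating also sees the patched open().
    with mock.patch("builtins.open", fake_open):
        for record in open(path):
            yield record


def unrelated_local_read():
    # Stand-in for any library that reads a local file.
    with open(os.devnull) as handle:
        handle.read()
    return True


eager = read_eager("/data/table.dbf")

records = read_lazy("/data/table.dbf")
first = next(records)  # enters the generator, so the patch is now active
try:
    unrelated_local_read()
    leaked = False
except ResourceNotFound:
    leaked = True  # the unrelated read hit the patched open()
records.close()  # leaves the with-block and restores the real open()

restored = unrelated_local_read()
```

The eager path never overlaps with other code, while the lazy path keeps the patch alive for as long as the generator is suspended, which matches the behaviour I described above.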