I'm trying to make logging less verbose for ... (Meltano, #singer-tap-development)


prratek_ramchandani

02/22/2022, 2:45 PM

I'm trying to make logging less verbose for this SDK-based Instagram tap. It uses parent-child streams in a few places, and one in particular has a large number of parent objects. To fetch insights for a media object you have to pass in the media ID, which makes the media stream the parent, and you could potentially have thousands of those.

• I can reduce metric logging by just setting metrics_log_level to DEBUG or NONE
• I can log state less frequently by changing state_partitioning_keys to track fewer state bookmarks in the first place
• What I'm left with is the SDK still logging "Beginning full table sync of ... with context ..." after each media object, which in our case leads to tens of thousands of lines of logs.

Is there anything we can do to not log that for child streams in particular?

edgar_ramirez_mondragon

02/22/2022, 4:38 PM

The SDK could certainly handle log granularity and control better. At the moment you can't control the output level for particular streams, but you can change it for the whole tap by adding these lines to your tap.py module:

import logging

logger = logging.getLogger("tap-example")  # change this to the name of your tap
logger.setLevel(logging.WARNING)
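If raising the level for the whole tap hides too much, a narrower option is a standard-library logging filter that drops only the per-context sync-start lines. The sketch below is an assumption-laden illustration, not an SDK feature: it assumes the SDK emits that message on the logger named after the tap, and it matches on the message text quoted above ("Beginning ... with context"), which may differ between SDK versions. The class names and the sample messages in demo() are hypothetical.

```python
import logging


class SuppressContextSyncStart(logging.Filter):
    """Drop 'Beginning ... sync of ... with context ...' records only."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        # Assumption: the per-context message starts with "Beginning " and
        # mentions " with context"; all other records pass through.
        return not (message.startswith("Beginning ") and " with context" in message)


class ListHandler(logging.Handler):
    """Helper for the demo: collects emitted messages in memory."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


def demo() -> list[str]:
    logger = logging.getLogger("tap-example")  # change this to the name of your tap
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    logger.addFilter(SuppressContextSyncStart())

    handler = ListHandler()
    logger.addHandler(handler)

    # Hypothetical sample messages, modeled on the line quoted in the question.
    logger.info("Beginning full table sync of 'media_insights' with context: {'media_id': '1'}...")
    logger.info("Beginning full table sync of 'media'...")
    logger.warning("Rate limit reached")
    return handler.messages
```

A filter attached to a logger only sees records logged directly on that logger, so this would not catch messages emitted through a differently named child logger.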

prratek_ramchandani

02/22/2022, 5:41 PM

okay, yeah this was my fallback option. thanks! is it possible to set log level for a single tap in the meltano project instead of in the tap? it sounds like my best option would be to set a MELTANO_CLI_LOG_LEVEL environment variable in that tap's execution environment

edgar_ramirez_mondragon

02/22/2022, 7:48 PM

is it possible to set log level for a single tap in the meltano project instead of in the tap?

not currently

it sounds like my best option would be to set a MELTANO_CLI_LOG_LEVEL environment variable in that tap's execution environment

so yeah, I think you could still do

plugins:
  tap-example:
    config:
      metrics_log_level: $MELTANO_CLI_LOG_LEVEL
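Since the project cannot set a per-tap log level directly, one possible workaround is for tap.py to read its own environment variable and apply it to the tap logger. This is only a sketch: TAP_EXAMPLE_LOG_LEVEL is a hypothetical variable name, not a Meltano or SDK setting, and the helper functions are made up for illustration.

```python
import logging
import os
from typing import Mapping

# Level names accepted from the environment, mapped to stdlib constants.
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def level_from_env(
    var_name: str,
    default: int = logging.INFO,
    environ: Mapping[str, str] = os.environ,
) -> int:
    """Return the log level named in `var_name`, or `default` if unset or unknown."""
    raw = environ.get(var_name, "")
    return LEVELS.get(raw.strip().upper(), default)


def configure_tap_logger(tap_name: str, var_name: str) -> logging.Logger:
    """Apply the environment-provided level to the logger named after the tap."""
    logger = logging.getLogger(tap_name)
    logger.setLevel(level_from_env(var_name))
    return logger


# Hypothetical usage in tap.py:
# configure_tap_logger("tap-example", "TAP_EXAMPLE_LOG_LEVEL")
```

The variable could then be set only in that tap's execution environment, leaving other plugins at their default level.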

prratek_ramchandani

02/22/2022, 8:00 PM

cool, thanks edgar!

aaronsteers

02/24/2022, 2:30 AM

Hi, @prratek_ramchandani - Just following up on this and want to confirm: I too have been blasted by log messages in a parent-child context. The "Beginning full table sync of ... with context ..." message that you quote makes tons of sense in non-parent-child streams, and mostly good sense when the number of child contexts is 50 or less. But in high-cardinality child use cases, I agree it's just pretty painful. In most cases, we probably do want to log which stream is syncing, which essentially is what we're declaring here in the logs.

edgar_ramirez_mondragon

02/24/2022, 3:16 AM

@aaronsteers I created https://gitlab.com/meltano/sdk/-/issues/335

prratek_ramchandani

02/24/2022, 3:43 AM

yeah that makes sense @aaronsteers. and @edgar_ramirez_mondragon i'll keep an eye on that issue - in my mind the two ways it's useful to be able to customize logging are by stream and by message type (metric, state, etc).
