matt_elgazar
05/25/2024, 4:54 AM
actions - set tickers: ["AYO.F"]. I believe my code handles empty data by yielding an empty record with the required properties.
When I run meltano invoke tap-yfinance the tap works properly and you can see actions ran fine (no errors and contains the required properties). However, when running meltano el tap-yfinance target-jsonl --select actions it fails with simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
By entering debug mode (I’m using pycharm) I was able to pinpoint the below:
breaks:
def get_records(self, context: dict | None) -> Iterable[dict]:
    logging.info(f"\n\n\n*** Running ticker {context['ticker']} *** \n\n\n")
    state = self.get_context_state(context)
    financial_tap = FinancialTap(schema=self.schema, ticker=context["ticker"], config=self.config, name=self.name)
    df = getattr(financial_tap, self.method_name)(ticker=context["ticker"])  # returns empty df
    yield {"timestamp": "2021-01-01 00:00:00"}
works fine:
def get_records(self, context: dict | None) -> Iterable[dict]:
    logging.info(f"\n\n\n*** Running ticker {context['ticker']} *** \n\n\n")
    state = self.get_context_state(context)
    financial_tap = FinancialTap(schema=self.schema, ticker=context["ticker"], config=self.config, name=self.name)
    # df = getattr(financial_tap, self.method_name)(ticker=context["ticker"])
    yield {"timestamp": "2021-01-01 00:00:00"}
In debug mode, df = getattr(<>) returns an empty df as expected, and stepping through the code it yields {"timestamp": "2021-01-01 00:00:00"} as expected. I’m really stuck now, because I’m not sure why a line whose result never gets used breaks the run, and only when calling meltano el.
tl;dr: meltano el breaks, but meltano invoke works. The error is simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0), but the data is valid JSON. meltano el breaks because of a line of code whose result doesn’t even get used.
Edgar Ramírez (Arch.dev)
05/27/2024, 5:39 PM
simplejson.scanner.JSONDecodeError makes me think it's an error coming from a call to simplejson.load, and that is not done anywhere in the SDK afaict. So, it's probably coming from target-jsonl or upstream from it in the singer-python library (maybe here?), that's why meltano invoke works but meltano el does not.
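To illustrate the point above: a consumer that parses every incoming line as JSON fails with this exact message as soon as it meets a line that does not start like a JSON value. The following is a minimal sketch using the standard-library json module rather than simplejson; the function name and the sample lines are made up for illustration and are not taken from target-jsonl or singer-python.

```python
import json


def find_invalid_json_lines(lines):
    """Return (line_number, text, error) for each non-empty line that is not valid JSON."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # blank lines are skipped in this sketch
        try:
            json.loads(text)
        except json.JSONDecodeError as exc:
            bad.append((number, text, str(exc)))
    return bad


# Made-up sample: two Singer-style messages with a stray plain-text line between them.
sample = [
    '{"type": "SCHEMA", "stream": "actions", "schema": {}, "key_properties": []}',
    "some library message printed to stdout",
    '{"type": "RECORD", "stream": "actions", "record": {"timestamp": "2021-01-01 00:00:00"}}',
]

for number, text, error in find_invalid_json_lines(sample):
    print(f"line {number}: {error}: {text!r}")
```

The same function accepts an open file object, so it could be pointed at the tap's stdout after saving it to a file.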
I'd inspect the output of meltano invoke line by line to see if there's something that's not valid JSON and is causing the target to crash.
Edgar Ramírez (Arch.dev)
05/28/2024, 3:42 PM
matt_elgazar
05/28/2024, 3:44 PM
yfinance library. My hypothesis is that they integrated some sort of logging change where, if the ticker returns an empty dataset, it calls logging.error or something similar somewhere, but I haven’t validated that. I’ll have to dig into it further, because when I downgrade to yfinance version 0.38 meltano works fine, but 0.40 breaks meltano el. I’m not sure why that is, since the command doesn’t necessarily return an error per se.
Edgar Ramírez (Arch.dev)
05/28/2024, 3:50 PM
meltano invoke tap-yfinance > inspect.me.jsonl and searching the file for a line that's not valid JSON.
matt_elgazar
05/28/2024, 3:51 PM
print to stdout!
matt_elgazar
05/28/2024, 3:52 PM
print is called?
Edgar Ramírez (Arch.dev)
05/28/2024, 4:47 PM
matt_elgazar
05/28/2024, 5:53 PM
import yfinance as yf
yf.Ticker('XYZKALSFJKSDLF1231293098F').actions
XYZKALSFJKSDLF1231293098F: No timezone found, symbol may be delisted
Series([], dtype: object)
^ Of course that’s an invalid ticker. Now if we look at a ticker that used to be valid:
yf.Ticker('AYO.F').actions
AYO.F: No price data found, symbol may be delisted (1d 1925-06-21 -> 2024-05-28)
Series([], dtype: object)
^ this output may be the issue, because it’s not valid JSON
matt_elgazar
05/28/2024, 5:54 PM
actions) you may be able to reproduce
tickers: ["AYO.F"]
Edgar Ramírez (Arch.dev)
05/28/2024, 7:29 PM
matt_elgazar
05/28/2024, 7:31 PM
price_history_wide so it doesn’t use pdr. It might be here though! https://github.com/ranaroussi/yfinance/blob/930b305327e2e3769b5d62115b3ab25bc58f28de/yfinance/utils.py#L58-L62
matt_elgazar
05/28/2024, 7:32 PM
actions tap calls FinancialStream (a base stream) which calls ActionsStream
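If the hypothesis in this thread holds, i.e. the library writes its "No price data found" message to stdout via print, one possible workaround is to point stdout at stderr only while the library call runs, so the message never reaches the target. This is a rough sketch under that assumption: noisy_fetch is a hypothetical stand-in for the yfinance call, and fetch_quietly is not part of tap-yfinance or the SDK. It would not help if the message is emitted by a logging handler that captured the original sys.stdout object earlier.

```python
import contextlib
import sys


def noisy_fetch(ticker: str) -> list:
    """Hypothetical stand-in for a library call that prints a message and returns no data."""
    print(f"{ticker}: No price data found, symbol may be delisted")
    return []


def fetch_quietly(ticker: str) -> list:
    """Call noisy_fetch with stdout temporarily redirected to stderr."""
    # Only the library call is wrapped. Wrapping a `yield` would also divert
    # the tap's own JSON messages, which must still go to stdout.
    with contextlib.redirect_stdout(sys.stderr):
        return noisy_fetch(ticker)


rows = fetch_quietly("AYO.F")  # the message goes to stderr; stdout stays clean
```

Keeping the redirect scoped to the single call is deliberate: the record emitted afterwards still has to be written to stdout for the target to read it.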