emcp
03/19/2022, 8:42 PM
emcp
03/19/2022, 8:45 PM
plugins:
  extractors:
  - name: tap-ibkr-tickers
    namespace: tap_ibkr
    executable: /home/tap-ibkr/tap-ibkr.sh
    config:
      host_tws_thrift: 127.0.0.1
      tws_thrift_port: 9090
    select:
    - ib_tickers.*
  - name: tap-ibkr-news
    namespace: tap_ibkr
    executable: /home/tap-ibkr/tap-ibkr.sh
    config:
      host_tws_thrift: 127.0.0.1
      tws_thrift_port: 9090
      target_host: 1.2.3.4
      target_username: myusername
      target_password: 1234
    select:
    - ib_news.*
As you can see, I have one tap and am trying to select its different streams per plugin entry.
streams.py
class IBTickersStream(IBKRStream):
    name = "ib_tickers"
    primary_keys = ["contract_id", "query_start_time"]
    replication_key = None
    partitions = get_all_tickers()
    schema = th.PropertiesList(
        th.Property("symbol", th.StringType),
        th.Property("sec_type", th.StringType),
        th.Property("primary_exchange", th.StringType),
        th.Property("exchange", th.StringType),
        th.Property("currency", th.StringType),
        th.Property("contract_id", th.IntegerType),
        th.Property("query_start_time", th.DateTimeType)
    ).to_dict()

class IBNewsStream(NewsStream):
    name = "ib_news"
    primary_keys = ["article_id", "provider_code"]
    replication_key = None
    schema = th.PropertiesList(
        th.Property("symbol", th.StringType),
        th.Property("contract_id", th.IntegerType),
        th.Property("provider_code", th.StringType),
        th.Property("article_id", th.StringType),
        th.Property("article_timestamp", th.DateTimeType),
        th.Property("headline", th.StringType),
        th.Property("extra_data", th.StringType),
    ).to_dict()
Any advice appreciated.
As of now, the behavior I see is this:
if I call a pipeline attempting to select ONLY tickers, it calls news first, then the tickers. I do not understand how to properly segregate the two without writing an entire second tap, which makes little sense to me, so I must be doing something wrong.
emcp
03/19/2022, 8:46 PM
class IBKRStream(Stream):
    def get_records(self, context: Optional[dict]) -> Iterable[dict]:
        # GETS THE TICKERS
        ...

class NewsStream(Stream):
    def get_records(self, context: Optional[dict]) -> Iterable[dict]:
        # GETS THE NEWS
        ...
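For readers following along, here is a rough sketch of how the two `select` patterns from the config could be read as glob patterns over `stream.property` names. This uses Python's `fnmatch` as a stand-in and is not Meltano's actual selection code; the `STREAM_PROPERTIES` table and the `selected_streams` helper are made up for illustration, with stream and property names copied from the streams.py snippet.

```python
from fnmatch import fnmatch

# Stream and property names copied from the streams.py snippet (abbreviated).
STREAM_PROPERTIES = {
    "ib_tickers": ["symbol", "sec_type", "contract_id", "query_start_time"],
    "ib_news": ["symbol", "article_id", "provider_code", "headline"],
}


def selected_streams(patterns):
    """Return the streams that have at least one property matching a pattern.

    Hypothetical helper: plain glob matching on "stream.property" strings.
    """
    selected = set()
    for stream, properties in STREAM_PROPERTIES.items():
        for prop in properties:
            if any(fnmatch(f"{stream}.{prop}", pattern) for pattern in patterns):
                selected.add(stream)
    return selected


tickers_only = selected_streams(["ib_tickers.*"])
news_only = selected_streams(["ib_news.*"])
print(tickers_only)
print(news_only)
```

Read this way, the two plugin entries pick disjoint streams, so the patterns themselves do not obviously explain news running in a tickers-only pipeline.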
visch
03/21/2022, 1:34 PM
visch
03/21/2022, 1:36 PM
get_all_tickers() and IBKRStream and NewsStream
emcp
03/22/2022, 7:09 AM
emcp
03/22/2022, 7:21 AM
get_all_tickers() simply returns partitions of combinations of the alphabet (a, b, c, ..., zzzz) up to length 4. These partitions are fed to, and meant only for, the Tickers stream. With news, we have to search for the last valid tickers already in the target (PostgreSQL).
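A minimal sketch of what a helper like that could look like, assuming each partition is a small dict. The real `get_all_tickers()` was not posted, so the `symbol_prefix` key and the `max_length` parameter are assumptions made for illustration.

```python
from itertools import product
from string import ascii_lowercase
from typing import Dict, List


def get_all_tickers(max_length: int = 4) -> List[Dict[str, str]]:
    """Build one partition per lowercase letter combination up to max_length."""
    partitions = []
    for length in range(1, max_length + 1):
        for letters in product(ascii_lowercase, repeat=length):
            # "symbol_prefix" is a hypothetical key name, not from the original tap.
            partitions.append({"symbol_prefix": "".join(letters)})
    return partitions


partitions = get_all_tickers()
print(len(partitions))  # 26 + 26**2 + 26**3 + 26**4 = 475254
```

Note that a class attribute such as `partitions = get_all_tickers()` is evaluated once, when the class body runs at import time, so all 475,254 partitions would be built even for runs that never touch the Tickers stream.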