julian_knight
11/15/2022, 7:25 PMjulian_knight
11/15/2022, 7:36 PMshopify_orders
into events where event_name = 'Order placed'
?Sven Balnojan
11/15/2022, 7:45 PMjulian_knight
11/15/2022, 8:01 PMjulian_knight
11/15/2022, 8:02 PMSven Balnojan
11/15/2022, 8:07 PMvisch
11/15/2022, 8:10 PMchristoph
11/15/2022, 8:26 PMvisch
11/15/2022, 8:27 PMjulian_knight
11/15/2022, 8:33 PMvisch
11/15/2022, 8:34 PMSven Balnojan
11/15/2022, 8:35 PMvisch
11/15/2022, 8:38 PMHenning Holgersen
11/17/2022, 8:51 PMjulian_knight
11/17/2022, 9:30 PM| (application_user_id, 5) | (email, <mailto:julian@example.com|julian@example.com>) |
| (application_user_id, 5) | (phone, +14045551234) |
| (shopify_customer_id, 12345) | (email, <mailto:julian@example.com|julian@example.com>) |
| (shopify_customer_id, 67890) | (email, <mailto:julian@example.com|julian@example.com>) |
| (mixpanel_distinct_id, ABCD) | (email, <mailto:julian.alt@example.com|julian.alt@example.com>) |
| (mixpanel_distinct_id, ABCD) | (application_user_id, 5) |
| (application_user_id, 6) | (email, <mailto:othercustomer@foo.com|othercustomer@foo.com>) |
2. Attach to these a generated global_customer_id
(ideally deterministically so that they are more stable), assigning the same id to all rows that have values in common
| (application_user_id, 5) | (email, <mailto:julian@example.com|julian@example.com>) | 1 |
| (application_user_id, 5) | (phone, +14045551234) | 1 |
| (shopify_customer_id, 12345) | (email, <mailto:julian@example.com|julian@example.com>) | 1 |
| (shopify_customer_id, 67890) | (email, <mailto:julian@example.com|julian@example.com>) | 1 |
| (mixpanel_distinct_id, ABCD) | (email, <mailto:julian.alt@example.com|julian.alt@example.com>) | 1 |
| (mixpanel_distinct_id, ABCD) | (application_user_id, 5) | 1 |
| (application_user_id, 6) | (email, <mailto:othercustomer@foo.com|othercustomer@foo.com>) | 2 |
3. Attach the global_customer_id
to any source tables desired by joining on their identifier
SELECT shopify_customers.*, global_customers.global_customer_id
FROM shopify_customers
LEFT JOIN global_customers on global_customers.source = 'shopify' and global_customers.value = shopify_customers.shopify_customer_id
Step 2 is a bit hand-wavy currently; I think the SQL there is more complicated than it sounds. But we’ll see how it goes