It's unfortunate that "ETL" stuck in mindshare, as afaik almost all use cases ar...

dragonwriter · on Feb 20, 2024

I think ETL is right from the perspective where E refers to “from the source of data” and L refers to “to the ultimate store of data”.

But the ETL functionality should itself live in a (sub)system that has its own logical datastore (which may or may not be physically separate from the destination store), and things should be ELT where the L is with respect to that store. So, it's E(LTE)L, in a sense.

appplication · on Feb 21, 2024

For those confused as to whether ETL or ELT is ultimately more appropriate for you… almost everyone is really just doing ETLTLTLT or ELTLTLTL anyways. The distinction is really moot.

ethbr1 · on Feb 21, 2024

Maybe my understanding is incorrect, but to expand on the distinction:

Assumptions -- We're talking about two separate systems (source and destination) with non-negligible transfer time (although perhaps "quick")

ETL -- Performing the transform before/during the load, such that fields in the destination are not guaranteed to have existed in the source (i.e. 2 db model)

ELT -- Performing a 1:1 copy of source into an intermediary table/db (albeit perhaps with filtering), then performing a transform on the intermediary table/db to generate the destination table/db (either realized or materialized at query time), with the intermediary table/database history retained (i.e. 3 table/db model)

In short, the distinction is that if regenerating or altering the destination is required, ETL relies on history being available in the upstream source.

ELT pulls control of that to the destination-owner, as they're retaining the raw data on their side.
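
A rough sketch of that difference, using in-memory SQLite as a stand-in for the destination. Everything here is made up for illustration: the `SOURCE_ROWS` data, the `orders` / `raw_orders` table names, and the `etl`, `elt_load`, and `elt_transform` helpers. With ETL the currency field is dropped before the load, so adding it later means going back to the source. With ELT the destination can be rebuilt from the retained raw copy.

```python
import sqlite3

# Hypothetical source rows: (id, amount_cents, currency)
SOURCE_ROWS = [(1, 1250, "usd"), (2, 990, "eur"), (3, 4000, "usd")]

# Two versions of the transform, expressed as SELECTs over the raw copy.
V1 = "SELECT id, amount_cents / 100.0 AS amount FROM raw_orders"
V2 = ("SELECT id, amount_cents / 100.0 AS amount, "
      "upper(currency) AS currency FROM raw_orders")


def etl(dest, rows):
    """ETL: transform in flight, load only the transformed shape."""
    dest.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    dest.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(row_id, cents / 100) for row_id, cents, _currency in rows],
    )


def elt_load(dest, rows):
    """ELT, step 1: land a 1:1 copy of the source rows and keep it."""
    dest.execute(
        "CREATE TABLE raw_orders "
        "(id INTEGER PRIMARY KEY, amount_cents INTEGER, currency TEXT)"
    )
    dest.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)


def elt_transform(dest, select_sql):
    """ELT, step 2: (re)build the destination table from the raw copy."""
    dest.execute("DROP TABLE IF EXISTS orders")
    dest.execute(f"CREATE TABLE orders AS {select_sql}")


def columns(conn, table):
    """Column names of a table, in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


etl_db = sqlite3.connect(":memory:")
etl(etl_db, SOURCE_ROWS)
print(columns(etl_db, "orders"))  # currency never reached the destination

elt_db = sqlite3.connect(":memory:")
elt_load(elt_db, SOURCE_ROWS)
elt_transform(elt_db, V1)
elt_transform(elt_db, V2)  # regenerate without touching the source again
print(columns(elt_db, "orders"))
```

The second `elt_transform` call is the whole point: it only reads `raw_orders`, so the destination owner can change the transform without depending on the upstream system still having the history.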