I think it depends on whether you use it for operations or for data analysis. Sp...

nojito · on July 2, 2022

1 s vs 1 ms is not a great comparison.

Polars excels when pandas operations take 30 seconds or a minute to complete. Bringing that time down to a second or to milliseconds is really amazing.

anigbrowl · on July 2, 2022

I love pandas and work with quite small datasets for EDA (10^3 to 10^6 most of the time), but even then I run into slowdowns. I don't really mind, as I'm pursuing my own research rather than satisfying an employer/client, and often figuring out why something is slow turns into a useful learning experience (the canonical rite of passage for new pandas users is using lambda functions with df.apply instead of looping).
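To make the df.apply point concrete, here is a rough sketch of the same calculation written three ways. The column names and the toy data are made up for illustration. All three give the same numbers; the difference is that the first two run Python code once per row, while the third is a single column operation, which is usually the much faster option.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (names and values are assumptions for illustration):
# 10^4 rows, within the small-EDA range mentioned above.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "price": rng.uniform(1.0, 100.0, size=10_000),
    "qty": rng.integers(1, 10, size=10_000),
})

# 1. Explicit Python loop over the rows.
loop_total = pd.Series(
    [row.price * row.qty for row in df.itertuples(index=False)],
    index=df.index,
)

# 2. df.apply with a lambda: tidier, but still one Python call per row.
apply_total = df.apply(lambda row: row["price"] * row["qty"], axis=1)

# 3. Vectorized column arithmetic: one operation over whole columns.
vector_total = df["price"] * df["qty"]

# All three approaches agree on the result.
print(np.allclose(loop_total, vector_total))
print(np.allclose(apply_total, vector_total))
```

Wrapping each variant in time.perf_counter() calls is an easy way to see how large the gap is on your own data.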

I've definitely procrastinated doing some analyses or turning prototypes into dashboards because of the potential for small slowdowns to turn into big slowdowns, so it's nice to have other options available. I'm very interested in Dask but have also been apprehensive about doing something stupid and incurring a huge bill by failing to think through my problem sufficiently.