1. Packages are zip files, and zip files keep their TOC (the central directory) at the end. So instead of downloading the entire zip, they fetch just the end of the file, read the TOC, and from that download only the metadata part.
I've written that code before for my own projects.
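That end-of-file trick is easy to demo. Here's a toy sketch in Python; the in-memory zip and `fetch_range` stand in for a real wheel on a server and HTTP Range requests (the package name and layout are made up for illustration):

```python
import io
import struct
import zipfile

# Build a fake "remote" wheel in memory (stand-in for a file on a server).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("pkg/__init__.py", "print('hi')")
    zf.writestr("pkg-1.0.dist-info/METADATA", "Name: pkg\nVersion: 1.0\n")
remote = buf.getvalue()

def fetch_range(start, end):
    # Stand-in for an HTTP GET with a "Range: bytes=start-end" header.
    return remote[start:end + 1]

# 1. Grab just the tail of the file; the End Of Central Directory (EOCD)
#    record lives there (signature PK\x05\x06).
tail = fetch_range(max(0, len(remote) - 65536), len(remote) - 1)
eocd_at = tail.rfind(b"PK\x05\x06")
# In the EOCD, central-directory size (u32) and offset (u32) sit at +12/+16.
cd_size, cd_offset = struct.unpack_from("<II", tail, eocd_at + 12)

# 2. Fetch only the central directory (the TOC) and list the member names.
cd = fetch_range(cd_offset, cd_offset + cd_size - 1)
names, pos = [], 0
while pos < len(cd) and cd[pos:pos + 4] == b"PK\x01\x02":
    name_len, extra_len, comment_len = struct.unpack_from("<HHH", cd, pos + 28)
    names.append(cd[pos + 46:pos + 46 + name_len].decode())
    pos += 46 + name_len + extra_len + comment_len

print(names)  # the TOC was read without fetching any file bodies
```

A real client would locate the EOCD the same way, then issue one more ranged GET for just the `.dist-info/METADATA` member it found in the TOC.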
2. They cache unpacked packages and then link the files into your environment.
This means no files are copied on the second install. Just links.
Both of those are huge time wins that would be possible in any language.
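A minimal sketch of the link-based install, assuming a filesystem that supports hard links (the cache layout and package names here are made up for illustration):

```python
import os
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())

# Unpack the package once into a shared cache.
cache = root / "cache" / "pkg-1.0"
cache.mkdir(parents=True)
(cache / "module.py").write_text("x = 1\n")

def install(env: Path):
    # "Install" by hard-linking every cached file into the environment:
    # no bytes are copied; both paths point at the same inode.
    for src in cache.rglob("*"):
        if not src.is_file():
            continue
        dst = env / "site-packages" / "pkg" / src.relative_to(cache)
        dst.parent.mkdir(parents=True, exist_ok=True)
        os.link(src, dst)

env1, env2 = root / "env1", root / "env2"
install(env1)
install(env2)

a = env1 / "site-packages" / "pkg" / "module.py"
b = env2 / "site-packages" / "pkg" / "module.py"
print(os.stat(a).st_ino == os.stat(b).st_ino)  # True: same inode, zero copies
```

Because every install shares inodes with the cache, a second install costs a few directory entries rather than a copy of every file.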
3. They store their metadata as a memory dump
So, on loading there is nothing to parse.
Admittedly this is hard (impossible?) in many languages. It's certainly not possible in Python
or JavaScript: you could load the binary data, but it wouldn't be useful without copying it into
native numbers/strings/ints/floats/doubles, etc.
I've done this in game engines to reduce load times in C/C++ and to save memory.
It'd be interesting to write some benchmarks for the first 2. The 3rd is a win but I suspect the first 2 are 95% of the speedup.
> Invoke is a Python library for managing shell-oriented subprocesses and organizing executable Python code into CLI-invokable tasks. It draws inspiration from various sources (make/rake, Fabric 1.x, etc) to arrive at a powerful & clean feature set.
I just tried ruff last night and ran into the match-case support issue. I'm following https://github.com/charliermarsh/ruff/issues/282 and looking forward to trying ruff again once that issue is closed.
I feel like it has taken quite a few tools some time to catch up, and even where they have, it wasn't straight out of the box. For example, I had to upgrade my linter to handle match cases, but that bump (we were running an older version) also introduced new rules that broke other code. Since I didn't want to take on that work right then, I rewrote the match as an old-style if statement and put a task in the backlog to upgrade the linter.
So I've actually seen little use of match cases so far.
To be honest, it's fairly recent and not exactly "basic"; I'm sure the ruff folks will end up covering it in the near future. I just thought it might bring some balance to mention that it doesn't yet go a hundred percent.
On a Mac, for ad-hoc OCR, I use the immensely useful CleanShot X https://cleanshot.com/ (which is well worth paying for).
Among many other things, it offers OCR of any region on the screen.
For larger-scale OCR processing of PDFs and other files, I love how s3-ocr https://simonwillison.net/2022/Jun/30/s3-ocr/ makes working with AWS Textract OCR more accessible (though, somehow, Textract refuses to fully OCR some larger PDFs I possess).