Hacker News | captnswing's comments

Extremely interesting presentation from Charlie Marsh about all the optimizations https://youtu.be/gSKTfG1GXYQ?si=CTc2EwQptMmKxBwG


Thanks. So from the video the biggest wins were

1. The way they get the metadata for a package.

Packages are zip files, and zip files have their TOC at the end. So, instead of downloading the entire zip, they fetch just the end of the file, read the TOC, and from that download just the metadata part.

I've written that code before for my own projects.
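For anyone curious what that looks like, here is a rough sketch of the idea, not uv's actual code. `read_range(start, end)` is a hypothetical callable that returns the bytes in that range; over HTTP it would be a GET with a Range header. ZIP64 archives and odd archive comments are ignored to keep it short.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"  # end-of-central-directory signature
EOCD_MIN = 22             # size of that record without a trailing comment
MAX_COMMENT = 0xFFFF      # the comment can be at most 65535 bytes


def list_entries_from_tail(size, read_range):
    """Return (name, local_header_offset, compressed_size) per member.

    Only the tail of the archive and its central directory (the TOC)
    are read; member data is never touched.
    """
    tail_start = max(0, size - (EOCD_MIN + MAX_COMMENT))
    tail = read_range(tail_start, size)
    pos = tail.rfind(EOCD_SIG)
    if pos < 0:
        raise ValueError("no end-of-central-directory record found")
    # Fields: sig, disk, cd_disk, entries_on_disk, entries, cd_size, cd_offset, comment_len
    fields = struct.unpack("<4sHHHHIIH", tail[pos:pos + EOCD_MIN])
    total, cd_size, cd_offset = fields[4], fields[5], fields[6]

    cd = read_range(cd_offset, cd_offset + cd_size)
    entries, off = [], 0
    for _ in range(total):
        (compressed_size,) = struct.unpack_from("<I", cd, off + 20)
        name_len, extra_len, comment_len = struct.unpack_from("<HHH", cd, off + 28)
        (local_offset,) = struct.unpack_from("<I", cd, off + 42)
        name = cd[off + 46:off + 46 + name_len].decode("utf-8", "replace")
        entries.append((name, local_offset, compressed_size))
        off += 46 + name_len + extra_len + comment_len  # 46 = fixed header size
    return entries


# Usage: an in-memory, wheel-like archive stands in for the remote file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("pkg/__init__.py", "x = 1\n")
    zf.writestr("pkg-1.0.dist-info/METADATA", "Name: pkg\nVersion: 1.0\n")
data = buf.getvalue()
entries = list_entries_from_tail(len(data), lambda start, end: data[start:end])
```

With the offset and compressed size of the METADATA entry in hand, one more ranged read fetches just that member.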

2. They cache the unzipped packages and then link them into your environment

This means no files are copied on the 2nd install. Just links.

Both of those are huge time wins that would be possible in any language.
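A minimal sketch of the link-instead-of-copy idea, assuming hard links and assuming the cache and the environment live on the same filesystem. The directory names and the `link_tree` helper are made up for illustration; a real installer may use other link types and needs fallbacks.

```python
import os
import tempfile
from pathlib import Path


def link_tree(cache_dir, env_dir):
    """Mirror cache_dir into env_dir with hard links instead of copies."""
    cache_dir, env_dir = Path(cache_dir), Path(env_dir)
    for src in cache_dir.rglob("*"):
        dst = env_dir / src.relative_to(cache_dir)
        if src.is_dir():
            dst.mkdir(parents=True, exist_ok=True)
        else:
            dst.parent.mkdir(parents=True, exist_ok=True)
            if not dst.exists():
                os.link(src, dst)  # new directory entry, same file data


# Usage: a fake unpacked package in a cache, linked into a fake environment.
cache = Path(tempfile.mkdtemp(prefix="cache-"))
env = Path(tempfile.mkdtemp(prefix="env-"))
(cache / "pkg").mkdir()
(cache / "pkg" / "mod.py").write_text("x = 1\n")
link_tree(cache, env)
```

After this, `env/pkg/mod.py` and `cache/pkg/mod.py` are the same file on disk, so the "install" costs one directory entry per file rather than a copy.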

3. They store their metadata as a memory dump

So, on loading there is nothing to parse.

Admittedly this is hard (impossible?) in many languages. Certainly not possible in Python and JavaScript. You could load binary data but it won't be useful without copying it into native numbers/strings/ints/floats/doubles etc...

I've done this in game engines to reduce load times in C/C++ and to save memory.
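To make the point concrete, here is a small Python sketch with a made-up fixed record layout (package id, major, minor). You can jump straight to a record in the dump without parsing anything before it, but every read still builds fresh Python ints, which is exactly the copying a native-language memory dump avoids.

```python
import struct

# Hypothetical layout: uint32 package id, uint16 major, uint16 minor.
RECORD = struct.Struct("<IHH")


def dump(records):
    """Pack records back to back, like a flat memory dump."""
    return b"".join(RECORD.pack(*record) for record in records)


def read_record(blob, index):
    """Read one record by offset; nothing before it is parsed."""
    return RECORD.unpack_from(blob, index * RECORD.size)


blob = dump([(1, 0, 4), (2, 1, 10), (3, 2, 0)])
view = memoryview(blob)       # no copy of the underlying bytes
third = read_record(view, 2)  # still creates new Python int objects
```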

It'd be interesting to write some benchmarks for the first 2. The 3rd is a win but I suspect the first 2 are 95% of the speedup.



> Invoke is a Python library for managing shell-oriented subprocesses and organizing executable Python code into CLI-invokable tasks. It draws inspiration from various sources (make/rake, Fabric 1.x, etc) to arrive at a powerful & clean feature set.

Thank you for linking to it.


this ^


ruff is the ticket. it replaced isort, flake8 for me, never looked back


it also doesn't know how to deal with simple constructs like match...case

I agree that ruff seems to be the way forward, it's (almost) at feature-parity, it is extremely fast, but I think it needs polishing


I just tried ruff last night and ran into the match-case support issue. I'm following https://github.com/charliermarsh/ruff/issues/282 and looking forward to trying ruff again once that issue is closed.


I was gonna ask if match case wasn't a really recent thing, but it seems to be from 3.10 released in October 2021.


I feel like it has taken quite a few tools some time to catch up. Or if they've caught up, it wasn't straight out of the box. Like, I had to upgrade my linter to handle match cases, but that bump (as we were running an older version) also introduced new rules that broke other code. Since I didn't want to take on that work right then, I rewrote it as an older-style if statement and put a task in the backlog to upgrade the linter.

So I've actually seen little use of match cases so far.


to be honest it is fairly recent and is not exactly "basic", I'm sure ruff people will end up covering it in the near future, I just thought it might bring some balance to mention that it doesn't yet go a hundred percent


IIRC this is Ruff's criteria for 1.0


yes it does. see https://github.com/charliermarsh/ruff#supported-rules for the rules it supports. "I001" being the code for isort

relevant section from my pyproject.toml

  [tool.ruff]
  line-length = 88
  # pyflakes, pycodestyle, isort
  select = ["F", "E", "W", "I001"]


But does it just lint, or also effectively sort the imports?



+100 on ruff.

replaced both flake8 and isort across all my projects


I miss the customer in this. I've given the same advice in this format

Customer > Company

Company > Team

Team > Self


On a Mac, for ad-hoc OCR, I use the immensely useful CleanShot X https://cleanshot.com/ (which is well worth paying for).

Among many other things, it offers OCR of any region on the screen.

for larger-scale OCR processing of pdfs and other files, I love how s3-ocr https://simonwillison.net/2022/Jun/30/s3-ocr/ makes working with AWS Textract OCR more accessible (though, somehow, Textract refuses to fully OCR larger pdfs I possess..)


On the latest macOS, OCR happens automatically in any screenshot, or any image you open in Preview.

Try Command+Shift+4, grab part of the screen, click the pop-up, and just select text.


Agree, except: friends, don't let friends use Selenium.

use Playwright!!


or Cypress


or Puppeteer


back in the day, I really liked https://malsup.com/jquery/taconite/, which had a similar approach (but using XML snippets sent from the server over Ajax)

