Hacker News | c0nstantine's comments

Working on trre, an extension of regex for text editing. I'm redesigning the underlying engine to operate on deterministic automata (transducers) for most expressions. Theoretically, it should outperform AWK in complex text-processing tasks.

https://github.com/c0stya/trre


I have a minimalistic one (7 lines in Python) using convolutions:

https://c0stya.github.io/articles/game_of_life.html

### The code ###

  import numpy as np
  from scipy.signal import convolve2d

  field = np.random.randint(0, 2, size=(100, 100)) # 100x100 field size
  kernel = np.ones((3, 3))
  
  for i in range(1000): # 1000 steps
      new_field = convolve2d(field, kernel, mode="same")
      field = (new_field == 3) + (new_field == 4) * field
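
A note on why `== 3` and `== 4` are enough: the all-ones kernel counts the centre cell too, so a sum of 3 means either a dead cell with 3 live neighbours or a live cell with 2, and a sum of 4 on a live cell means 3 live neighbours. Below is a dependency-free sketch of the same rule, checked on a blinker. It is not the code above: `box_sum` and `step` are hypothetical names, and cells outside the grid are assumed to count as 0, which is what `convolve2d` does by default with `mode="same"`.

```python
def box_sum(field, r, c):
    """Sum of the 3x3 window centred on (r, c); cells outside the grid count as 0."""
    rows, cols = len(field), len(field[0])
    return sum(
        field[i][j]
        for i in range(max(r - 1, 0), min(r + 2, rows))
        for j in range(max(c - 1, 0), min(c + 2, cols))
    )


def step(field):
    """One Game of Life generation, using a window sum that includes the centre cell."""
    new_field = []
    for r, row in enumerate(field):
        new_row = []
        for c, alive in enumerate(row):
            s = box_sum(field, r, c)
            # s == 3: dead cell with 3 neighbours, or live cell with 2 neighbours
            # s == 4 and alive: live cell with 3 neighbours
            new_row.append(int(s == 3 or (s == 4 and alive == 1)))
        new_field.append(new_row)
    return new_field


# A horizontal blinker should flip to vertical and back again.
blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
vertical = step(blinker)
for row in vertical:
    print(row)
```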


That is fabulous! As a DSP/ML guy, this is a contender for my favorite so far.


Thank you for your feedback. There are a number of deterministic methods for inferring a regex from samples (positive and negative). There are ML-based ones as well. But that is a different story.


Let me know if you need any help. Right now it is still raw, but I hope to polish it soon.


Hi. I missed your message initially. Helix is a great project. Let me know if/how I can help. trre is a bit raw, but I hope I can polish it within a month or two.


Oh, I'm not directly associated with it, and I don't use it, mostly because that specific feature is missing. And they're in Rust, so I'm not sure how well C bindings work. Still, really cool project.


I agree with the point that precedence is arbitrary. The current version looks like this:

1 Escaped characters

2 []

3 ()

4 * + ? {m,n}

5 :

6 . (implicit concatenation)

7 |

I have some reasons to put it that way. I want : to be somewhat 'atomic'. If you think about '*' or '+' they can be lower in the table as well. Anyway, I will try to put : lower in the next version and see how it goes.
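
To make the table concrete, here is a minimal sketch that encodes the listed levels and compares two operators. It assumes a lower number means tighter binding, and it is only an illustration of the list above, not trre's parser; `PRECEDENCE`, `binds_tighter`, and the label `concat` for implicit concatenation are hypothetical names.

```python
# Levels copied from the list above; assumption: lower number binds tighter.
PRECEDENCE = {
    "escape": 1,   # escaped characters
    "[]": 2,
    "()": 3,
    "*": 4, "+": 4, "?": 4, "{m,n}": 4,
    ":": 5,
    "concat": 6,   # implicit concatenation
    "|": 7,
}


def binds_tighter(op_a, op_b):
    """Return True if op_a groups its operands before op_b does."""
    return PRECEDENCE[op_a] < PRECEDENCE[op_b]


# With ':' above concatenation, 'ab:cd' would group as a (b:c) d.
print(binds_tighter(":", "concat"))
# Iteration still binds tighter than ':', so 'a:b*' would group as a:(b*).
print(binds_tighter("*", ":"))
```

Moving `:` lower in the table, as mentioned above, would amount to giving it a larger number than `concat` in this sketch.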


Thank you for doing my work! :)


Hi,

If I understand correctly, you want to change something inside the "..." block and change the quotes to single quotes (').

It can be done by this expression:

echo '"hello world" "hello again!"' | ./trre "\":'.+?:-\":'"

'-' '-'

So I replace the text inside the double quotes with the symbol - using the expression ".+?:-" and simultaneously change the surrounding quotes.

The question mark means non-greedy mode.
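
For comparison, the same transformation can be sketched with Python's standard `re` module. This is only an analogue for readers who know classic regex substitution, not a description of how trre evaluates the expression; the variable names are illustrative.

```python
import re

text = '"hello world" "hello again!"'

# ".+?" matches a double-quoted block non-greedily, so each quoted
# string is matched separately instead of one match spanning both.
result = re.sub(r'".+?"', "'-'", text)
print(result)  # '-' '-'
```

Without the `?`, the greedy `.+` would run from the first double quote to the last one and the whole input would collapse into a single `'-'`.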


Oh, I've learnt a lot. I initially wanted to complete the whole project in two weeks, and it took a few months. The hardest part was the DFT determinization algorithm design.

Thanks for your feedback!


Thank you! Still a lot of work to do. I really like the jq style.

