ads slot

Latest Posts:

Срочно нужен инженер схемотехник! Опыт проектирования схем в Altium Designer; ТК; Подробнее здесь

Lion Kimbro: 2-Dimensional Regexes

Lion Kimbro: 2-Dimensional Regexes:


One of the themes I work on is creating new mediums for programming.


But what about in the mean-time, where we only have plain text? Well, I think it's underutilized. We can out-Lisp Lisp by string interpretation.


I've written before tkinter-based code before for creating GUIs out of text representations.




X- Tagged Text Browser ---------------------------------------------X
| |
| tags: [.tags_input........] "- text_out -------------------" |
| | | |
| found: | | |
| [=entries_found===========] | | |
| [ ] | | |
| [ ] | | |
| [ ] | | |
| [ ] | | |
| [ ] | | |
| [ ] | | |
| [ ] | | |
| [ ] | | |
| [ ] | | |
| [ ] | | |
| [=========================] "------------------------------" |
| (save_edits) (quit) |
X-------------------------------------------------------------------X




(A GUI, represented in ASCII text.)


One of the fundamental mechanisms making this possible is code for reading (say) 2-Dimensional representations in characters of rectangles.


I think I have, 3 times, for different libraries, written 2-D rectangle recognition code.


Let me tell you, it's a real bore.


Here is an example of what some such code looks like:



# Extend right, from the top-left.
# It is possible that there is a label here.
while extend_right.scan_char(h_char, 1, 0) is not None:
pass
if extend_right.at_topright_corner():
label = None
else:
extend_right.scan_char(" ")
label = extend_right.read_label()
if label == "":
return None
extend_right.scan_char(" ")
while extend_right.scan_char(h_char, 1, 0) is not None:
pass
if not extend_right.at_topright_corner():
return None
r_corner_char = extend_right.char()
if r_corner_char is None:
return None
# Extend down.
left = self.PRN.cursor(self.PSX, self.PSY+1)
right = self.PRN.cursor(extend_right.PSX,
extend_right.PSY+1)
if ((left.char() == l_corner_char) and
(right.char() == r_corner_char) and
left.at_bottomleft_corner() and
right.at_bottomright_corner()):
l_v_char = None
r_v_char = None
else:
l_v_char = left.char()
r_v_char = right.char()
if l_v_char is None: return None
if r_v_char is None: return None
while ((left.scan_char(l_v_char, 0, 1) is not None) and
(right.scan_char(r_v_char, 0, 1) is not None)):
if (left.at_bottomleft_corner()
and right.at_bottomright_corner()):
break
if ((left.char() != l_corner_char)
or (right.char() != r_corner_char)):
return None



(This is just the beginning. It goes on and on, and then there are the support functions, and then there are the variants, ...)


It's boring.


It's the programming equivalent of shoveling dirt.


While I worked on the code, over and over and over again, and painstakingly debugged it, I kept thinking, "I need regexes... I need regexes... I need some kind of 2-dimensional regex..."


But every time I looked at how regex languages, I thought of two things + 1 conclusion:



  1. Boy, this is really complicated. It's like writing freaking compiler optimizations.

  2. This is really boring stuff.

  3. I don't have time for this. <- the conclusion


I'd think it over off and on across time. And then recently, I got an idea how to do it pretty easily.


I realized that there is a very small set of things that I am doing, and a fairly easy way to make the machinery for it.


So here's the language I came up with:



c type      flags description
= ========= ===== ====================================================================
> direction head right
< direction head left
^ direction head up
v direction head down
0 tar start a measurement
I measure 1st time: store measurement I, further times: require equivalence
J measure 1st time: store measurement J, further times: require equivalence
K measure 1st time: store measurement K, further times: require equivalence
S recording 1st: start recording, 2nd: stop and store, 3rd: start anew, ...
R require require the next character (literal)
r req-meta r! = req. nothing; r[A-Z] -- req. from set; r[#] -- req. from recorded
! move ----- go until requirements can't be met, then pause
. move 1E--- take a step if requirement is met, otherwise: ERR out
, move 1---- take a step if requirement is met, otherwise, pause
X accept all done! accept what was found



Then I wrote an "engine" that reads the instructions, takes a function for reading a character from a position, takes a start position, and the character sets fed in (for lower-case alphabet "r" requirements,) ...


...and it worked!


So the code for scanning a rectangle is like so:


R+>. 0R-!I R+v. 0R|!J R+<. 0R-!I R+^. 0R|!J X


Here's one of the rectangles that it scans:



+----------------------------------+
| |
| x--------------------x |
| | | |
| | | |
| x--------------------x |
| |
+----------------------------------+



I used that in one of the test cases.


Reading from the start, it says, "Require a +. Head to the right. Eat one character, or fail. Now start measuring. Require a dash, and go as far as you can. Save the distance traveled in register I. Now read a +, but head down this time. Start measuring again. Head down, gobbling up the pipes. Mark how far you went in register J. Gobble up a +, heading left. Start measuring again. Gobble all the dashes. Now, check your distance with register I. Is it good? Keep going, otherwise -- abort, this isn't the rectangle we're wanting. Now gobble the +, heading up. Start measuring again, and eat all the pipes headed up. Check against register J. Good? Ok, you're done! Accept."


If you wrap the whole expression in S's ("store this to a list of expressions,") you get in your response:


"'+----------------------------------+||||||+----------------------------------+||||||"


...which is a pretty dang cool way of verifying that your code works, if you ask me: a complete track record of everywhere that the code went.


It's been a while since I've blogged about programming, and I don't really have a storage system worked out for posting & sharing the code.


But, here's the header (and docstring):



@note_function("2d", "space 2d line lines recognition recognizer ascii " +
"character characters machine machines abstract")
def recognizer_2d(instructions, read_pos_fn, start_pos=rt_zero,
registers=None):
"""Abstract 2d recognizer.

read_pos_fn(rt_pos): -> return a character representing the position;
will be compared against the requirement

RC>. 0R-!I RCv.
C---------C
0R:!J : : 0R:!J
(checks): : (records)
C---------C
RC^. 0R-!I RC<.

Returns: dictionary on success;
None on failure

{"I": goal for I measurement (or None)
"J": goal for J measurement (or None)
"pos": position rectangle
"step": step rectangle,
"recorded_text": list of recorded strings}

Spirograph pg. 48
"""



Note that: because it takes a function to identify characters, this function can be used with any 2-dimensionally indexed schema.


For example, you could use it with graphics, by supplying a function that returns a "X" for a non-black pixel, and " " for a black pixel. (Or some related scheme of translating pixel colors into character codings.) Then the very same function (and codes) can be used to identify rectangles (and what have you other simple/rectilinear shapes) in graphical images.


This is the first time in a very long time I have posted about my coding adventures online.


In particular, I am writing for Planet Python, because Python is my favorite programming language in the world.


For those who don't know me, my interests are in what I call "Visionary Programming" and "Improvisational Programming."


That is, I like to write code to support a way of programming that allows you to get from a visionary idea, to an implementation, in as short a time as possible. (For example, I wrote the engine here in 1 unfocused lazy day that involved a lot (too much) of listening to Dubstep on Youtube... I probably put in 3-4 hours of real work on it.)


Much of this involves working on remaking the representation of software ideas. When you have a solid representation, (which can take some time to find,) you can then experiment and find alternatives very quickly, and flow around like water. It's a lot of fun, and really gets to the magic of programming: the "Wow!"


Please let me know if you had a good time reading this. Thank you!





Permalink

| Leave a comment »

Share on Google Plus

Автор Expert

Я простой житель планеты земля.

Комментарии

comments powered by HyperComments

Алексей Навальный

Мухамеду Али было 74 года

ali from Mab on Vimeo . Он говорил, что хотел бы дожить до ста лет. Мохаммед Али – один из самых известных боксеров в истории мировог...

Получайте обновления на Email

Проблемы на работе?