Mode Syntax Options - Positional Modes
Submitted on Friday, 23 September, 2005 - 14:48
I work with EDI a lot at my job, and I've been trying to write a mode for it, but I keep running up against the limits of jEdit's mode syntax. It just doesn't support what I need.
EDI is a positional syntax, with fields of fixed or semi-variable length, separated by a delimiter. The delimiter itself is variable: while there are supposed to be specific choices, in reality any punctuation character has probably been used. To complicate things further, EDI has the concept of 'segments', which are often separated by line breaks, but not always. (There again, any punctuation character should probably be considered valid, though in both cases it will stay constant within a file.)
So, ideally, I'd like to match the first few characters to a sequence, then use the next character as a delimiter (whatever character that is), then divide the rest of the file into fields by that character. A known number of fields from the start, I can find the segment delimiter, which I would like to treat as a line ending. (In the cases where it isn't a line break, there are usually no other line ends in the file.) Also, I'd like to be able to highlight certain fields as tokens, counting from line/segment ends.
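To make the idea concrete, here is a rough sketch in Python of the detection step described above. It is not anything jEdit provides. The function name, the "HDR" tag, the field count and the fixed width of the last header field are all assumptions made up for illustration; a real EDI dialect would supply its own values.

```python
def detect_delimiters(text, header_tag, header_fields, last_field_width):
    """Guess (field_delimiter, segment_delimiter) from the file header.

    All parameters are hypothetical: header_tag is the leading sequence,
    header_fields is how many fields follow it in the first segment, and
    last_field_width is the assumed fixed width of the final header field.
    """
    if not text.startswith(header_tag):
        raise ValueError("text does not start with the expected header tag")

    # Whatever character follows the tag is taken as the field delimiter.
    field_delim = text[len(header_tag)]

    # Walk past the delimiter that opens each header field.
    pos = -1
    for _ in range(header_fields):
        pos = text.find(field_delim, pos + 1)
        if pos == -1:
            raise ValueError("header has fewer fields than expected")

    # The segment delimiter sits right after the last, fixed-width field.
    seg_index = pos + 1 + last_field_width
    if seg_index >= len(text):
        raise ValueError("header ends before the segment delimiter")
    return field_delim, text[seg_index]


sample = "HDR|a|b|c~SEG|x|y~"
field_delim, seg_delim = detect_delimiters(sample, "HDR", 3, 1)
segments = [s.split(field_delim) for s in sample.split(seg_delim) if s]
print(field_delim, seg_delim, segments)
```

Once both characters are known, splitting on the segment delimiter gives the "lines" and splitting each of those on the field delimiter gives the fields, which is the behaviour I'd like a mode to be able to express.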
Problem is, there is no good way to do this in mode-syntax. I can highlight the delimiters, by entering a list of the common ones, but any attempt to highlight what is *within* them meets with problems. The most common problem is that the delimiters can only be part of *one* match. So, for instance, a 'span' rule will only match every other field.
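The "every other field" effect is easy to reproduce outside jEdit. The following is only an analogy using Python regular expressions, not jEdit's actual matching engine: a pattern that consumes both the opening and the closing delimiter skips alternate fields, while one that leaves the closing delimiter unconsumed (here via a lookahead) does not.

```python
import re

line = "A*B*C*D*E"

# Consuming both delimiters: the closing '*' of one match cannot also
# open the next match, so every other field is skipped.
consuming = re.findall(r"\*([^*]*)\*", line)

# Leaving the closing delimiter unconsumed lets it open the next match too.
non_consuming = re.findall(r"\*([^*]*)(?=\*)", line)

print(consuming)      # ['B', 'D']
print(non_consuming)  # ['B', 'C', 'D']
```

This is why an option to not 'eat' the delimiter matters so much for a format like this.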
Some of the 'interesting' fields of EDI have specific values, which I can then sequence-match, but many do not.
So, some specific suggestions:
- A delimiter-count match. Choose a delimiter, and match specific sequences divided by that delimiter (and, possibly, by end-of-lines). The delimiter itself is not part of the sequence matched, or at least can be given a separate token.
- An extension of the 'Terminate' rule, allowing termination at a specific character (or even a regex) instead of at a specific position. Again, with the option to 'eat' the matched character or not.
- A property/rule to allow/force wrapping on specified non-whitespace characters.
- The ability to _disallow_ wrapping on whitespace characters.
These may not be the best way to approach the problems, but they are rules that I can think of that would help. EDI itself does not have the concept of escape characters, but I can imagine that would be a useful addition to a delimiter rule.
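As a rough illustration of what I mean by a delimiter-count match, here is a small Python sketch. It is only a model of the desired behaviour, not a proposal for how jEdit should implement it; the function name and the token type names ("KEYWORD", "LITERAL", "FIELD", "DELIMITER") are placeholders I made up.

```python
def tokenize_segment(segment, delim, highlight):
    """Return (token_type, text) pairs for one segment.

    `highlight` maps a 0-based field index, counted from the start of the
    segment, to a token type. Fields not listed get the plain "FIELD" type,
    and each delimiter gets its own separate token.
    """
    tokens = []
    for index, field in enumerate(segment.split(delim)):
        if index > 0:
            tokens.append(("DELIMITER", delim))
        tokens.append((highlight.get(index, "FIELD"), field))
    return tokens


# Highlight the segment name and the third field, whatever they contain.
tokens = tokenize_segment("SEG|x|y", "|", {0: "KEYWORD", 2: "LITERAL"})
for token_type, text in tokens:
    print(token_type, repr(text))
```

The point is that a field is picked out by its position between delimiters rather than by its content, and the delimiter is never swallowed by the field on either side.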