
On Thu, 09 Aug 2012 12:21:13 +1000, James Harper <james.harper@bendigoit.com.au> wrote:
> The problem is that my sed script says to start at the "(" and then read up until a ")", but I really mean to say read up until a matching ")". Can I do this with sed or should I be using something else?

If you mean that you are using a regular expression, then the theoretical answer is that you can't match an arbitrarily nested set of parentheses. This is because a regular expression corresponds to a finite state machine, whereas parentheses can be nested arbitrarily deep (the state machine has no memory apart from its current state, so it cannot count how many parentheses are still open).

However, if you know the maximum nesting depth in advance, you can construct a regular expression to do the matching. For example, the following regex will match 1 or 2 levels of parentheses:

    ([^()]*\(([^()]*)[^()]*\)*)

This is written in sed's basic regex syntax, where a bare ( or ) is a literal character and \( ... \) is a group. It works by using an inner \(...\)* to match the (optional) inner parentheses. So for example, if you give it the string:

    ab(cd(ef)gh(ij)kl)mn

it will match:

    (cd(ef)gh(ij)kl)

Whereas if you give it the string:

    cd(ef)gh

it will match:

    (ef)

The regex gets rather unreadable and typo-prone if you have to handle more than a few levels of nesting. In that case you might be better off using a parsing language that is more powerful than regular expressions, or making use of some quirk in the input to find the last parenthesis (e.g. a trailing space or end of line).

Glenn

-- 
sks-keyservers.net 0xb1e82ec9228ac090
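
As a rough illustration of the two approaches described above, here is a minimal shell sketch. The function names match_two_levels and match_balanced are hypothetical, chosen only for this example. The first wraps the depth-limited regex in a sed substitution; it is anchored with ^[^(]* (an addition not in the regex above) so that only the group starting at the first "(" is considered. The second is one possible "something else": an awk loop that keeps a depth counter, which is exactly the memory a regular expression lacks. Both assume a POSIX sed and awk and one input line per call.

```shell
#!/bin/sh

# Hypothetical helper: print the first parenthesised group on a line,
# allowing at most 2 levels of nesting (the depth-limited regex).
# Prints nothing if the group at the first "(" is nested more deeply.
match_two_levels() {
    printf '%s\n' "$1" |
        sed -n 's/^[^(]*\(([^()]*\(([^()]*)[^()]*\)*)\).*$/\1/p'
}

# Hypothetical helper: print the first balanced group at any nesting
# depth by counting open parentheses instead of using a regex.
match_balanced() {
    printf '%s\n' "$1" | awk '{
        depth = 0
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            if (c == "(") {
                if (depth == 0) start = i    # remember the outermost "("
                depth++
            } else if (c == ")" && depth > 0) {
                depth--
                if (depth == 0) {            # the matching ")" was found
                    print substr($0, start, i - start + 1)
                    exit
                }
            }
        }
    }'
}

match_two_levels 'ab(cd(ef)gh(ij)kl)mn'   # prints (cd(ef)gh(ij)kl)
match_two_levels 'cd(ef)gh'               # prints (ef)
match_balanced   'a(b(c(d)))e'            # prints (b(c(d)))
```

The sed version prints nothing for a(b(c(d)))e, because three levels exceed what the pattern was built for, while the counter handles any depth at the cost of leaving sed.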