<h2 class="title" style="clear: both"><a xmlns="http://www.w3.org/1999/xhtml" id="id2826402"/>Conclusion: From Custom to Customary Law</h2></div></div><p xmlns="http://www.w3.org/1999/xhtml">We have examined the customs which regulate the ownership and control
I have a statement doing this:
Code:
# Mod <p> and <h> tags to preserve them
cat $FILE |
  sed 's/<p.*>/[p]/g' |
  sed 's/<\/p>/[\/p]/g' |
  sed 's/<h\([1-3]\)\(.\)*></[h\1]</g' |
  sed 's/<\/h\([1-3]\)\(.\)*></[\/h\1]</g' > $tmp1
It cats the file, changes the <> around the p and /p tags to [], and does the same for the h* and /h* tags. However, the statement produces this:
Quote:
[h2][p]We have examined the customs which regulate the ownership and control
The third sed expression is matching everything up to the last '>' (just before the [p] tag). How do I make it match only up to the FIRST '>' it comes to? I.e., I want this:
Quote:
[h2]<a xmlns="http://www.w3.org/1999/xhtml" id="id2826402"/>Conclusion: From Custom to Customary Law[/h2]</div></div>[p]We have examined the customs which regulate the ownership and control
In the end I split the statement into two lines when I cleaned up the script. I also played around with it and got it to do what I wanted:
Code:
# Mod <p> and <h> tags to preserve them
cat $FILE | sed 's/<p.*>/[p]/g' | sed 's/<\/p>/[\/p]/g' > $tmp1
cat $tmp1 | sed 's/<h\([1-3]\)\([^>]\)*>/[h\1]/g' | sed 's/<\/h\([1-3]\)\(.\)*>/[\/h\1]/g' > $tmp2
It was simple once I re-read my book on pattern matching: '.' matches any character, including '>', while [^>] matches anything except '>'. I had missed that the first few times I scanned through it.
Now, after a few more scripts, I've got a legal, up-to-date copy of the book "The Cathedral and the Bazaar" by Eric Raymond. It's a good read.
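For anyone landing on the sed question above, here is a minimal sketch of the difference between the greedy `.*` and the bounded `[^>]*`. The sample line is a shortened, made-up stand-in for the one in the post, and the variable names are only for illustration. It also applies `[^>]*` to the `<p ...>` expression, which the script above does not do, so treat that part as a suggestion rather than what the original script does.

```shell
# Sample line, shortened from the one quoted in the question (hypothetical attribute values).
line='<h2 class="title"><a id="x"/>Conclusion</h2></div></div><p xmlns="y">We have examined'

# Greedy: ".*" runs on to the LAST ">" on the line, swallowing everything in between.
greedy=$(printf '%s\n' "$line" | sed 's/<h\([1-3]\).*>/[h\1]/')

# Bounded: "[^>]*" cannot cross a ">", so each match stops at the FIRST ">" it meets.
bounded=$(printf '%s\n' "$line" |
  sed -e 's/<p[^>]*>/[p]/g' \
      -e 's/<\/p>/[\/p]/g' \
      -e 's/<h\([1-3]\)[^>]*>/[h\1]/g' \
      -e 's/<\/h\([1-3]\)>/[\/h\1]/g')

printf '%s\n' "$greedy" "$bounded"
```

On this sample the greedy version should leave only `[h2]We have examined`, while the bounded version keeps the `<a .../>`, the heading text and the `</div></div>` in place. Note that `<p[^>]*>` would also match other tags that start with `<p`, such as `<pre>`, so a stricter pattern may be needed on real input.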
It's been a bit of a long day and my brain's clearly missing something obvious. I have a file containing lines of input. I want to use grep to select those that end in a forward slash (ultimately I want to select everything but them, but that's a simple flag). What regexp do I need? I thought I'd tried everything obvious:
Code:
egrep "/$" < input
Edd
It worked fine in the end as it was, once the input was piped straight in from the previous stage. I swear there must be something installed that uses hamsters as line endings on these damn CSC machines!
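The post only hints that line endings were to blame, so the following is a guess at one common cause rather than a diagnosis: if the file had DOS-style CRLF endings, a carriage return would sit between the "/" and the end of the line, and `/$` would not match. A rough sketch with a made-up three-line input:

```shell
# Hypothetical sample: two lines end in "/", one of them with a CRLF line ending.
input=$(mktemp)
printf 'dir/\nfile.txt\nother/\r\n' > "$input"

# Anchored at end of line: the CRLF line is missed because "\r" follows the "/".
plain=$(grep -c '/$' "$input")

# Strip carriage returns first, then match.
stripped=$(tr -d '\r' < "$input" | grep -c '/$')

# The inverse selection the post is ultimately after: lines NOT ending in "/".
others=$(tr -d '\r' < "$input" | grep -v '/$')

rm -f "$input"
printf '%s\n' "$plain" "$stripped" "$others"
```

Running `od -c` on the first few lines of the file is a quick way to check whether stray `\r` characters are actually there before reaching for `tr`.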