x404.co.uk - View topic - SED/regex help

SED/regex help

forquare1

I haven't seen my friends in so long

Joined: Thu Apr 23, 2009 6:36 pm
Posts: 5158
Location: /dev/tty0

SED/regex help

Hi all,

I have a number of files with a line like this:


	Quote: <h2 class="title" style="clear: both"><a xmlns="http://www.w3.org/1999/xhtml" id="id2826402"/>Conclusion: From Custom to Customary Law</h2></div></div><p xmlns="http://www.w3.org/1999/xhtml">We have examined the customs which regulate the ownership and control

I have a statement doing this:


	Code:
	# Mod <p> and <h> tags to preserve them
	cat $FILE | sed 's/<p.*>/[p]/g' | sed 's/<\/p>/[\/p]/g' | sed 's/<h\([1-3]\)\(.\)*>/[h\1]/g' | sed 's/<\/h\([1-3]\)\(.\)*>/[\/h\1]/g' > $tmp1

This cats the file, changes the <> around the p and /p tags to [], and does the same for the h* and /h* tags. However, the statement produces this:


	Quote: [h2][p]We have examined the customs which regulate the ownership and control

The third sed expression matches everything up to the last '>' (just before the [p] tag). How do I make it match only up to the FIRST '>' it comes to? I.e. I want this:


	Quote: [h2]<a xmlns="http://www.w3.org/1999/xhtml" id="id2826402"/>Conclusion: From Custom to Customary Law[/h2]</div></div>[p]We have examined the customs which regulate the ownership and control
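The difference between matching to the last '>' and to the first one can be reproduced on a shortened line. This is only a sketch, assuming a POSIX sed; the sample string is a made-up, shortened stand-in for the line quoted above, and the variable names are illustrative.

```shell
# Shortened, hypothetical sample modelled on the quoted input line.
line='<h2 class="title"><a id="x"/>Conclusion</h2></div><p>We have'

# Greedy: .* runs on to the LAST '>' on the line.
greedy=$(printf '%s\n' "$line" | sed 's/<h\([1-3]\).*>/[h\1]/')

# Negated class: [^>]* cannot cross a '>', so the match ends at the FIRST one.
first=$(printf '%s\n' "$line" | sed 's/<h\([1-3]\)[^>]*>/[h\1]/')

printf '%s\n' "$greedy" "$first"
```

sed has no non-greedy quantifier in its basic regular expressions, so excluding the terminating character from the repeated class is the usual way to stop at the first occurrence.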

Thanks,
Ben

Fri Jan 22, 2010 2:10 pm

forquare1

I haven't seen my friends in so long

Joined: Thu Apr 23, 2009 6:36 pm
Posts: 5158
Location: /dev/tty0

Re: SED/regex help

Solved

In the end I split the statement into two lines when I cleaned up the script. I also played around with it and got it to do what I wanted:


	Code:
	# Mod <p> and <h> tags to preserve them
	cat $FILE | sed 's/<p.*>/[p]/g' | sed 's/<\/p>/[\/p]/g' > $tmp1
	cat $tmp1 | sed 's/<h\([1-3]\)\([^>]\)*>/[h\1]/g' | sed 's/<\/h\([1-3]\)\(.\)*>/[\/h\1]/g' > $tmp2

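It was simple once I re-read my book on pattern matching: using [^>] in place of . in the opening h tag expression stops the match from running past the first '>'. I had missed it the first few times I scanned through the book.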
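The closing-tag expression in the code above still appears to use `.`, so on a line like the one quoted in the first post it would take the trailing `</div></div>` along with `</h2>`. A possible tidy-up, offered only as a sketch: use the negated class for every opening tag and run all four substitutions in one sed call. `convert_tags` and the sample string are hypothetical names and data, not part of the original script.

```shell
# Hypothetical helper: rewrite <p ...>, </p>, <h1-3 ...> and </h1-3> as [..] tags.
convert_tags() {
    sed -e 's/<p[^>]*>/[p]/g' \
        -e 's/<\/p>/[\/p]/g' \
        -e 's/<h\([1-3]\)[^>]*>/[h\1]/g' \
        -e 's/<\/h\([1-3]\)>/[\/h\1]/g'
}

# Shortened stand-in for the line quoted in the first post.
sample='<h2 class="title" style="clear: both"><a id="id2826402"/>Conclusion</h2></div></div><p xmlns="http://www.w3.org/1999/xhtml">We have examined'

result=$(printf '%s\n' "$sample" | convert_tags)
printf '%s\n' "$result"
```

Reading from standard input keeps the helper usable in a pipeline or with a redirect, for example `convert_tags < "$FILE" > "$tmp1"`. Note that `<p[^>]*>` would also match other tags whose names start with p, such as `<pre>`, just as the original `<p.*>` does.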

Now, after a few more scripts, I've got a legal, up-to-date copy of the book "The Cathedral and the Bazaar" by Eric Raymond. It's a good read.

Sat Jan 23, 2010 5:36 pm

EddArmitage

I haven't seen my friends in so long

Joined: Thu Apr 23, 2009 9:40 pm
Posts: 5288
Location: ln -s /London ~

Re: SED/regex help

My turn:

It's been a bit of a long day and my brain's clearly missing something obvious. I have a file containing lines of input. I want to use grep to select those that end in a forward slash (ultimately I want to select everything but them, but that's a simple flag). What regexp do I need? I thought I'd tried everything obvious:


	Code: egrep "/$" < input

Edd
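For reference, the "simple flag" for selecting everything but the matching lines is `-v`. A small sketch with made-up sample lines (the file names are purely illustrative):

```shell
# Hypothetical sample input: two lines end in '/', two do not.
input=$(printf '%s\n' 'dir/' 'file.txt' 'other/' 'notes')

# Lines ending in a forward slash.
with_slash=$(printf '%s\n' "$input" | grep '/$')

# Everything else: -v inverts the match.
without_slash=$(printf '%s\n' "$input" | grep -v '/$')

printf '%s\n' "$without_slash"
```

Single quotes around the pattern keep the shell from ever interpreting the `$`, although `"/$"` happens to work too because a `$` directly before the closing quote is left alone.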

_________________


	timark_uk wrote: Gay sex is better than no sex


	timark_uk wrote: Edward Armitage is Awesome. Yes, that's right. Awesome with a A.

Tue Feb 02, 2010 4:27 pm

Nick

Spends far too much time on here

Joined: Thu Apr 23, 2009 11:36 pm
Posts: 3527
Location: Portsmouth

Re: SED/regex help

Argh my head has just imploded.

We had to write Sed in C last year. Absolute hell!!!!!!!

Tue Feb 02, 2010 8:34 pm

forquare1

I haven't seen my friends in so long

Joined: Thu Apr 23, 2009 6:36 pm
Posts: 5158
Location: /dev/tty0

Re: SED/regex help

EddArmitage wrote:


	Code: egrep "/$" < input

Edd

I'd do something like:


	Code: egrep \/$ < input

Wed Feb 03, 2010 11:51 am

EddArmitage

I haven't seen my friends in so long

Joined: Thu Apr 23, 2009 9:40 pm
Posts: 5288
Location: ln -s /London ~

Re: SED/regex help

forquare1 wrote:

EddArmitage wrote:


	Code: egrep "/$" < input

I'd do something like:


	Code: egrep \/$ < input

It worked fine in the end as was, when the input was piped straight in from the previous stage. I swear there must be something installed that uses hamsters as line endings on these damn CSC machines!
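The post does not say what the odd line endings actually were. One common cause of a `/$` pattern finding nothing is a file with Windows-style CRLF line endings, where each line really ends in a carriage return rather than a slash. The sketch below assumes that is the problem; the sample data is made up.

```shell
# Hypothetical input with CRLF line endings.
crlf=$(printf 'dir/\r\nfile.txt\r\n')

# The line ends in '\r', not '/', so the pattern finds nothing.
# grep -c exits non-zero when the count is 0, hence the '|| true'.
plain=$(printf '%s\n' "$crlf" | grep -c '/$' || true)

# Stripping carriage returns first lets the same pattern match.
fixed=$(printf '%s\n' "$crlf" | tr -d '\r' | grep -c '/$')

printf 'without tr: %s, with tr: %s\n' "$plain" "$fixed"
```

If a file is suspected of having such endings, `od -c` on its first few lines shows any `\r` characters explicitly.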

Wed Feb 03, 2010 11:57 am
