All popular CMS packages come with add-ons or plugins that automatically regenerate the sitemap for your website every time you add new content. I recently added a few documentation sites which are all plain HTML files, and I needed a tool to generate the sitemap files for these websites. After googling around a bit, I came across Google’s own sitemap generation tool. It’s a very good tool and I highly recommend it for sitemap generation for HTML websites.

I wrote a post about how to print just the last substring of a delimited string. If you instead want to delete just the last substring and keep the rest of the string intact, that’s a little trickier.

Here’s how you do it.

@~$ echo "abc:cde:efg" | awk 'BEGIN {FS=ORS=":"} {for(i=1;i<NF;i++) print $i}' | sed 's/$/\n/'
abc:cde:

FS is the field separator, while ORS is the output record separator. ORS by default is the newline character \n. What we are doing here is splitting the string into fields using awk, and then printing all but the last field (i < NF). Because ORS is set to :, every print ends with a colon instead of a newline. The sed statement just adds a newline at the end.

If you want to drop the trailing delimiter as well, you could use the command below.

@~$ echo "abc:cde:efg" | awk 'BEGIN {FS=ORS=":"} {for(i=1;i<NF;i++) print $i}' | sed 's/:$/\n/'
abc:cde

Here the sed statement replaces the trailing : with a newline. To skip the newline, you can change the sed statement to sed 's/:$//'.
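
The same result can also be had from awk alone, without the sed step. The following is only a sketch, not the method used above: drop_last_field is a hypothetical helper name, and it assumes the delimiter is a single : character. It builds the output string field by field and adds the delimiter only between fields, so there is no trailing : to clean up.

```shell
# Hypothetical helper: print a ':'-delimited string without its last field.
drop_last_field() {
  # $1 = input string. -F: sets FS to ":"; ORS keeps its default newline.
  printf '%s\n' "$1" | awk -F: '{
    out = ""
    # Join fields 1..NF-1, putting FS only between them.
    for (i = 1; i < NF; i++) out = out (i > 1 ? FS : "") $i
    print out
  }'
}

drop_last_field "abc:cde:efg"   # prints abc:cde
```

If the string is already in a shell variable s, the POSIX parameter expansion ${s%:*} is a shorter alternative, though it leaves a string that contains no : unchanged.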

Some time back I wrote this post showing how to split a string into substrings separated by multi-character delimiters. I didn’t realize then that there’s a much easier solution using awk. Using the same example as in the previous post, here’s the solution.

echo "abcd<>efgh<>ijkl<>mn op<>qr st<>uv wx<>yz" | awk 'BEGIN {FS="<>"} {for(i=1;i<=NF;i++)print $i}'

The delimiter here is "<>".

This will print out all the substrings. If you want an individual substring, you can use something like

echo "abcd<>efgh<>ijkl<>mn op<>qr st<>uv wx<>yz" | awk 'BEGIN {FS="<>"} {print $1}'
echo "abcd<>efgh<>ijkl<>mn op<>qr st<>uv wx<>yz" | awk 'BEGIN {FS="<>"} {print $2}'

That’s how easy it is.
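
If you pick out individual substrings often, the one-liners above can be wrapped in a small function. This is a minimal sketch: nth_field and its argument order are made up for illustration. One thing to keep in mind is that awk treats a field separator longer than one character as a regular expression, so a delimiter containing regex metacharacters (such as . or |) would need escaping. "<>" has none, so it works as is.

```shell
# Hypothetical helper: print the Nth substring of a delimited string.
nth_field() {
  # $1 = input string
  # $2 = delimiter (awk treats a multi-character FS as a regex)
  # $3 = 1-based index of the substring to print
  printf '%s\n' "$1" | awk -v FS="$2" -v n="$3" '{ print $n }'
}

nth_field "abcd<>efgh<>ijkl<>mn op<>qr st<>uv wx<>yz" "<>" 4   # prints mn op
```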