I wrote a post about how to print just the last substring of a delimited string. If you instead want to delete just the last substring and keep the rest of the string intact, that’s a little trickier.

Here’s how you do it.

@~$ echo "abc:cde:efg" | awk 'BEGIN {FS=ORS=":"} {for(i=1;i<NF;i++) print $i}' | sed 's/$/\n/'
abc:cde:

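FS is the field separator, while ORS is the output record separator. ORS by default is the newline character \n. What we are doing here is splitting the string using awk, and then printing all but the last field (i < NF). Because ORS is set to ":", each printed field is followed by a colon instead of a newline. The sed statement just adds a newline at the end. Note that \n in the replacement is understood by GNU sed; other sed implementations may treat it differently.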

If you want to skip the trailing delimiter, you could use the code below.

@~$ echo "abc:cde:efg" | awk 'BEGIN {FS=ORS=":"} {for(i=1;i<NF;i++) print $i}' | sed 's/:$/\n/'
abc:cde

Here the sed statement replaces the trailing : with a newline. To skip the newline, you can change the sed statement to sed 's/:$//'.
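If you would rather avoid the extra sed step, the same result can be had from awk alone by joining the fields yourself instead of relying on ORS. The snippet below is only a sketch of that idea; the function name drop_last_field and the variable names sep and out are made up for illustration.

```shell
#!/bin/bash
# drop_last_field STRING DELIM   (hypothetical helper, not from the original post)
# Prints STRING without its last DELIM-separated substring and without
# a trailing delimiter. DELIM is assumed to be a single character.
drop_last_field() {
  printf '%s\n' "$1" | awk -v sep="$2" '
    BEGIN { FS = sep }
    {
      out = ""
      # Join fields 1..NF-1, putting the delimiter only between fields
      for (i = 1; i < NF; i++) out = out (i > 1 ? sep : "") $i
      print out
    }'
}

drop_last_field "abc:cde:efg" ":"   # prints: abc:cde
```

Since ORS is left at its default here, awk ends the output with a normal newline and nothing needs to be patched up afterwards. A string with no delimiter at all produces an empty line, because its only substring is also its last one.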

Some time back I wrote this post showing how to split a string into substrings separated by multi-character delimiters. I didn’t realize then that there’s a much easier solution using awk. Using the same example as in the previous post, here’s the solution.

echo "abcd<>efgh<>ijkl<>mn op<>qr st<>uv wx<>yz" | awk 'BEGIN {FS="<>"} {for(i=1;i<=NF;i++)print $i}'

The delimiter here is "<>".

This will print out all the substrings. If you want an individual substring, you can use something like

echo "abcd<>efgh<>ijkl<>mn op<>qr st<>uv wx<>yz" | awk 'BEGIN {FS="<>"} {print $1}'
echo "abcd<>efgh<>ijkl<>mn op<>qr st<>uv wx<>yz" | awk 'BEGIN {FS="<>"} {print $2}'

That’s how easy it is.
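If the field number or the delimiter needs to come from the shell, both can be handed to awk with the -v option. The following is a rough sketch; nth_field, sep and n are names invented for this example. One thing to keep in mind is that awk treats an FS longer than one character as a regular expression, so a delimiter containing regex metacharacters (such as "||") has to be written in a form that matches literally, for instance with bracket expressions.

```shell
#!/bin/bash
# nth_field STRING DELIM_REGEX N   (hypothetical helper, not from the original post)
# Prints the Nth substring of STRING, split on DELIM_REGEX.
nth_field() {
  printf '%s\n' "$1" | awk -v sep="$2" -v n="$3" 'BEGIN { FS = sep } { print $n }'
}

# "<>" has no special meaning as a regex, so it can be used as-is
nth_field "abcd<>efgh<>ijkl<>mn op<>qr st<>uv wx<>yz" "<>" 4   # prints: mn op

# "||" would be read as regex alternation, so each | goes in a bracket expression
nth_field "a||b||c" "[|][|]" 2   # prints: b
```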

Bash variables aren’t expanded inside a single-quoted awk program, so you can’t refer to them directly the way you can in a lot of other commands. Instead, they need to be passed in using awk’s -v option. Here’s a code snippet which shows how.

#!/bin/bash
mainstring="abcdef"
substring="cde"
awk -v a="$mainstring" -v b="$substring" 'BEGIN  { print index(a,b) }'

The above awk command is for locating the position of the first letter of the substring cde within the main string abcdef. In this case the position is 3, which is the position of the character ‘c’ in the main string. The variables $mainstring and $substring are passed into awk by pre-declaring them as a and b respectively.
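The same pattern can be wrapped in a small function and used in a test, since index() returns 0 when the substring does not occur in the main string. This is just a sketch; find_pos is a made-up name for illustration.

```shell
#!/bin/bash
# find_pos MAINSTRING SUBSTRING   (hypothetical helper, not from the original post)
# Prints the 1-based position of SUBSTRING in MAINSTRING, or 0 if absent.
# Note: awk processes backslash escapes in -v values, so this sketch
# assumes the strings contain no backslashes.
find_pos() {
  awk -v a="$1" -v b="$2" 'BEGIN { print index(a, b) }'
}

mainstring="abcdef"
substring="cde"

find_pos "$mainstring" "$substring"   # prints: 3
find_pos "$mainstring" "xyz"          # prints: 0

# Use the result in a shell conditional
if [ "$(find_pos "$mainstring" "$substring")" -gt 0 ]; then
  echo "found"
else
  echo "not found"
fi
```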