Bash string manipulation methods| Linux

There are no direct parameter expansions to give either the first or last character of a string, but by using the wildcard (?), a string can be expanded to everything except its first or last character:

$ var=strip

$ allbutfirst=${var#?}

$ allbutlast=${var%?}

$ sa "$allbutfirst" "$allbutlast"

:trip:

:stri:

The values of allbutfirst and allbutlast can then be removed from the original variable to give the first or last character:

$ first=${var%"$allbutfirst"}

$ last=${var#"$allbutlast"}

$ sa "$first" "$last"

:s:

:p:
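The sa command used in these examples is not defined in this section. Judging from its output, it prints each argument on its own line between colons, which makes leading and trailing whitespace visible. A minimal stand-in, assuming that behavior, might look like this:

```shell
# Hypothetical stand-in for the sa helper used in the examples above:
# print each argument on its own line, wrapped in colons.
sa()
{
  printf ':%s:\n' "$@"
}

var=strip
sa "${var#?}" "${var%?}"   # all but the first, all but the last character
```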

 

The first character of a string can also be obtained with printf:

 

printf -v first "%c" "$var"
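If portability to other POSIX shells is not a concern, bash's substring expansion (the same ${var:OFFSET:LENGTH} form that the insert_string function later in this section relies on) offers a shorter route. This is a sketch for bash only:

```shell
var=strip
first=${var:0:1}   # offset 0, length 1
last=${var: -1}    # a negative offset counts from the end; the space before - is required
printf ':%s:\n' "$first" "$last"
```

The space in ${var: -1} matters: without it, bash would read the expansion as ${var:-1}, the "use a default value" form.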

 

To operate on each character of a string one at a time, use a while loop and a temporary variable that stores the value of var minus its first character. The temp variable is then used as the pattern in a ${var%PATTERN} expansion. Finally, $temp is assigned to var, and the loop continues until there are no characters left in var:

 

while [ -n "$var" ]

do

temp=${var#?}     ## everything but the first character

char=${var%"$temp"} ## remove everything but the first character

: do something with "$char"

var=$temp     ## assign truncated value to var

done
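As one illustration of "doing something" with each character, the following sketch counts how often a given character occurs in a string. The function name count_char and the result variable _COUNT are hypothetical names chosen for this example:

```shell
count_char() #@ USAGE: count_char STRING CHAR (hypothetical helper)
{
  local var=$1 temp char
  _COUNT=0
  while [ -n "$var" ]
  do
    temp=${var#?}          ## everything but the first character
    char=${var%"$temp"}    ## the first character
    [ "$char" = "$2" ] && _COUNT=$(( _COUNT + 1 ))
    var=$temp              ## assign truncated value to var
  done
}

count_char mississippi s
printf "%d\n" "$_COUNT"
```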

 

 

Reversal

 

You can use the same method to reverse the order of characters in a string. Each letter is tacked on to the front of a new variable (Listing 7-3).

Listing 7-3. revstr, Reverse the Order of a String; Store Result in _REVSTR

_revstr() #@ USAGE: revstr STRING

{

var=$1

_REVSTR=

while [ -n "$var" ]

do

temp=${var#?}

_REVSTR=${var%"$temp"}$_REVSTR ## prepend the first character to the result

var=$temp

done

}
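A reversed copy is handy for tests such as checking for a palindrome. The following self-contained sketch repeats the reversal loop inside a hypothetical is_palindrome function; each character is prepended to the result, which is what reverses the order:

```shell
is_palindrome() #@ USAGE: is_palindrome STRING (hypothetical helper)
{
  local var=$1 temp reversed=
  while [ -n "$var" ]
  do
    temp=${var#?}
    reversed=${var%"$temp"}$reversed   ## prepend the first character
    var=$temp
  done
  [ "$reversed" = "$1" ]               ## succeed if the string reads the same backward
}

is_palindrome level && echo "level is a palindrome"
is_palindrome strip || echo "strip is not a palindrome"
```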

 

Case Conversion

In the Bourne shell, case conversion was done with external commands such as tr, which translates characters in its first argument to the corresponding character in its second argument:

$ echo abcdefgh | tr ceh CEH # c => C, e => E, h => H

abCdEfgH

$ echo abcdefgh | tr ceh HEC # c => H, e => E, h => C

abHdEfgC

Ranges specified with a hyphen are expanded to include all intervening characters:

$ echo touchdown | tr 'a-z' 'A-Z'

TOUCHDOWN
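How a range such as a-z expands can depend on the locale. POSIX character classes are a more portable way to say "all lowercase letters" and "all uppercase letters"; a brief sketch:

```shell
echo touchdown | tr '[:lower:]' '[:upper:]'
echo TOUCHDOWN | tr '[:upper:]' '[:lower:]'
```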

In the POSIX shell, short strings can be converted efficiently using parameter expansion and a function containing a case statement as a lookup table. The function looks up the first character of its first argument and stores the uppercase equivalent in _UPR. If the first character is not a lowercase letter, it is unchanged (Listing 7-4).

 

Listing 7-4. to_upper, Convert First Character of $1 to Uppercase

 

to_upper()

case $1 in

a*) _UPR=A ;; b*) _UPR=B ;; c*) _UPR=C ;; d*) _UPR=D ;;

e*) _UPR=E ;; f*) _UPR=F ;; g*) _UPR=G ;; h*) _UPR=H ;;

i*) _UPR=I ;; j*) _UPR=J ;; k*) _UPR=K ;; l*) _UPR=L ;;

m*) _UPR=M ;; n*) _UPR=N ;; o*) _UPR=O ;; p*) _UPR=P ;;

q*) _UPR=Q ;; r*) _UPR=R ;; s*) _UPR=S ;; t*) _UPR=T ;;

u*) _UPR=U ;; v*) _UPR=V ;; w*) _UPR=W ;; x*) _UPR=X ;;

y*) _UPR=Y ;; z*) _UPR=Z ;;

*) _UPR=${1%"${1#?}"} ;; ## not a lowercase letter: keep the first character as is

esac

 

To capitalize a word (that is, just the first letter), call to_upper with the word as an argument, and append the rest of the word to $_UPR:

$ word=function

$ to_upper "$word"

$ printf "%c%s\n" "$_UPR" "${word#?}"

Function

 

To convert the entire word to uppercase, you can use the upword function shown in Listing 7-5.

Listing 7-5. upword, Convert Word to Uppercase

 

_upword() #@ USAGE: upword STRING

{

local word=$1

_UPWORD= ## start with an empty result on every call

while [ -n "$word" ] ## loop until nothing is left in $word

do

to_upper "$word"

_UPWORD=$_UPWORD$_UPR

word=${word#?} ## remove the first character from $word

done

}

upword()

{

_upword "$@"

printf "%s\n" "$_UPWORD"

}

You can use the same technique to convert uppercase to lowercase, but I’ll leave the coding of that as an exercise for you.

Comparing Contents Without Regard to Case

 

When getting user input, a programmer often wants to accept it in either uppercase or lowercase or even a mixture of the two. When the input is a single letter, as in asking for Y or N, the code is simple. There is a choice of using the or symbol (|):

 

read ok

case $ok in

y|Y) echo "Great!" ;;

n|N) echo Good-bye

exit 1

;;

*) echo Invalid entry ;;

esac

 

or a bracketed character list:

 

read ok

case $ok in

[yY]) echo "Great!" ;;

[nN]) echo Good-bye

exit 1

;;

*) echo Invalid entry ;;

esac

When the input is longer, the first method requires all possible combinations to be listed, for example:

jan | jaN | jAn | jAN | Jan | JaN | JAn | JAN) echo "Great!" ;;

 

The second method works but is ugly and hard to read, and the longer the string is, the harder and uglier it gets:

 

read monthname

case $monthname in ## convert $monthname to number

[Jj][Aa][Nn]*) month=1 ;;

[Ff][Ee][Bb]*) month=2 ;;

## ...put the rest of the year here

[Dd][Ee][Cc]*) month=12 ;;

[1-9]|1[0-2]) month=$monthname ;; ## accept number if entered

*) echo "Invalid month: $monthname" >&2 ;;

esac

 

A better solution is to convert the input to uppercase first and then compare it:

 

_upword "$monthname"

case $_UPWORD in ## convert $monthname to number

JAN*) month=1 ;;

FEB*) month=2 ;;

## ...put the rest of the year here

DEC*) month=12 ;;

[1-9]|1[0-2]) month=$monthname ;; ## accept number if entered

*) echo "Invalid month: $monthname" >&2 ;;

esac
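Bash offers another route that avoids converting the input at all: the nocasematch shell option, which makes case patterns match without regard to case. The sketch below is bash-specific and is not the method used in this section; the function name month_number and the variable _MONTH are hypothetical, and only three months are filled in:

```shell
month_number() #@ USAGE: month_number NAME-OR-NUMBER (hypothetical helper)
{
  local restore
  restore=$(shopt -p nocasematch)   ## remember the current setting
  shopt -s nocasematch              ## case patterns now ignore case
  _MONTH=
  case $1 in
    jan*) _MONTH=1 ;;
    feb*) _MONTH=2 ;;
    ## ...put the rest of the year here
    dec*) _MONTH=12 ;;
    [1-9]|1[0-2]) _MONTH=$1 ;;      ## accept number if entered
  esac
  $restore                          ## put the option back as it was
  [ -n "$_MONTH" ]                  ## fail if nothing matched
}

month_number JaNuary && echo "$_MONTH"
month_number DEC && echo "$_MONTH"
month_number Smarch || echo "Invalid month: Smarch" >&2
```

Saving and restoring the option keeps the change from leaking into the rest of the script, where case-sensitive matching may still be expected.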

In bash-4.0, you can replace the _upword function with case ${monthname^^} in, although I might keep it in a function to ease transition between versions of bash:

 

_upword()

{

_UPWORD=${1^^}

}
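The bash-4.0 case-modification expansions come in four basic forms. This short sketch, which requires bash 4.0 or later, shows each of them on a sample value:

```shell
word=function
echo "${word^}"     ## first character to uppercase
echo "${word^^}"    ## all characters to uppercase

shout=GOOD-BYE
echo "${shout,}"    ## first character to lowercase
echo "${shout,,}"   ## all characters to lowercase
```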

 

Check for Valid Variable Name

 

You and I know what constitutes a valid variable name, but do your users? If you ask a user to enter a variable name, as you might in a script that creates other scripts, you should check that what is entered is a valid name. The function to do that is a simple check for violation of the rules: a name must contain only letters, numbers, and underscores and must begin with a letter or an underscore (Listing 7-6).

 

Listing 7-6. validname, Check $1 for a Valid Variable or Function Name

 

validname() #@ USAGE: validname varname

case $1 in

## doesn't begin with letter or underscore, or

## contains something not letter, number, or underscore

[!a-zA-Z_]* | *[!a-zA-Z0-9_]* ) return 1;;

esac

 

The function is successful if the first argument is a valid variable name; otherwise, it fails.

 

$ for name in name1 2var first.name first_name last-name

> do

>    validname "$name" && echo " valid: $name" || echo "invalid: $name"

> done

 valid: name1

invalid: 2var

invalid: first.name

 valid: first_name

invalid: last-name
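In bash, the same rule can also be written as a single regular expression. The following is a sketch of a hypothetical variant, validname_re; it assumes a locale in which the ranges a-z and A-Z cover only the ASCII letters. Unlike the case-based function, which accepts an empty argument because an empty string matches neither pattern, this version rejects it:

```shell
validname_re() #@ USAGE: validname_re varname (hypothetical variant)
{
  ## a letter or underscore, then any number of letters, digits, or underscores
  local re='^[a-zA-Z_][a-zA-Z0-9_]*$'
  [[ $1 =~ $re ]]
}

for name in name1 2var first.name first_name last-name ""
do
  validname_re "$name" && echo "  valid: '$name'" || echo "invalid: '$name'"
done
```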

 

Insert One String into Another

 

To insert a string into another string, it is necessary to split the string into two parts: the part that will be to the left of the inserted string and the part to the right. Then the insertion string is sandwiched between them.

This function takes three arguments: the main string, the string to be inserted, and the position at which to insert it. If the position is omitted, it defaults to inserting after the first character. The work is done by the first function, _insert_string, which stores the result in a variable of the same name. Calling this function directly saves the cost of using command substitution. The insert_string function takes the same arguments, which it passes to _insert_string and then prints the result (Listing 7-7).

 

Listing 7-7. insert_string, Insert One String into Another at a Specified Location

 

_insert_string() #@ USAGE: _insert_string STRING INSERTION [POSITION]

{

local insert_string_dflt=2    ## default insert location

local string=$1    ## container string

local i_string=$2    ## string to be inserted

local i_pos=${3:-${insert_string_dflt:-2}} ## insert location

local left right    ## before and after strings

left=${string:0:$(( $i_pos - 1 ))}    ## string to left of insert

right=${string:$(( $i_pos - 1 ))}    ## string to right of insert

_insert_string=$left$i_string$right    ## build new string

}

 

insert_string()

{

_insert_string "$@" && printf "%s\n" "$_insert_string"

}

 

Examples

 

$ insert_string poplar u 4

popular

$ insert_string show ad 3

shadow

$ insert_string tail ops ## use default position

topsail
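To see where the split falls, it can help to run the two substring expansions on their own. This sketch walks through the poplar example outside the function; the variable names are chosen for illustration only:

```shell
string=poplar
pos=4                      ## insert before the 4th character
left=${string:0:pos-1}     ## the first pos-1 characters
right=${string:pos-1}      ## everything from character pos onward
printf "%s\n" "$left" "$right" "${left}u${right}"
```

Positions are counted from 1, while substring offsets are counted from 0, which is why 1 is subtracted in both expansions.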

 

Overlay

 

To overlay a string on top of another string, the technique is similar to inserting a string, the difference being that the right side of the string begins not immediately after the left side but at the length of the overlay further along (Listing 7-8).

Listing 7-8. overlay, Place One String Over the Top of Another

 

_overlay() #@ USAGE: _overlay STRING SUBSTRING START

{     #@ RESULT: in $_OVERLAY

local string=$1

local sub=$2

local start=$3

local left right

left=${string:0:start-1}    ## bash evaluates the offset and length as arithmetic

right=${string:start+${#sub}-1}

_OVERLAY=$left$sub$right

}

 

overlay() #@ USAGE: overlay STRING SUBSTRING START

{

_overlay "$@" && printf "%s\n" "$_OVERLAY"

}

Examples

$ {

> overlay pony b 1

> overlay pony u 2

> overlay pony s 3

> overlay pony d 4

> }

bony

puny

posy

pond
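With an overlay longer than one character, the arithmetic for the right-hand part is easier to follow. The sketch below applies the same two expansions outside the function to mask part of a made-up number; the values are chosen purely for illustration:

```shell
string=1234567890
sub=XXXX
start=4                            ## overlay begins at the 4th character
left=${string:0:start-1}           ## the first 3 characters
right=${string:start+${#sub}-1}    ## skip the 4 characters that are covered
printf "%s\n" "$left$sub$right"
```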

Trim Unwanted Characters

Variables often arrive with unwanted padding, usually spaces or leading zeroes. These can easily be removed with a loop and a case statement:

var="     John     "

while :    ## infinite loop

do

case $var in

' '*) var=${var#?} ;; ## if $var begins with a space remove it

*' ') var=${var%?} ;; ## if $var ends with a space remove it

*) break ;; ## no leading or trailing spaces left, so exit the loop

esac

done ## $var now contains "John"

The trim function (Listing 7-9) works without a loop: it uses pattern matching to isolate the run of unwanted characters at each end of the string and removes each run in a single expansion. Its first argument is the string to be trimmed. If there is a second argument, that is the character that will be trimmed from the string. If no character is supplied, it defaults to a space.

Listing 7-9. trim, Trim Unwanted Characters

_trim() #@ Trim spaces (or character in $2) from $1

{

local trim_string

_TRIM=$1

trim_string=${_TRIM##*[!${2:- }]}

_TRIM=${_TRIM%"$trim_string"}

trim_string=${_TRIM%%[!${2:- }]*}

_TRIM=${_TRIM#"$trim_string"}

}

 

trim() #@ Trim spaces (or character in $2) from $1 and print the result

{

_trim "$@" && printf "%s\n" "$_TRIM"

}

Examples

$ trim "     S p a c e d o u t     "

S p a c e d o u t

$ trim "0002367.45000" 0

2367.45
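The two pairs of expansions in _trim can be followed one step at a time. The sketch below hard-codes a space as the character to trim and uses the illustrative names rightspaces and leftspaces for the intermediate values:

```shell
var="   John   "

rightspaces=${var##*[! ]}    ## strip everything up to the last non-space: "   "
var=${var%"$rightspaces"}    ## remove those trailing spaces: "   John"

leftspaces=${var%%[! ]*}     ## strip from the first non-space to the end: "   "
var=${var#"$leftspaces"}     ## remove those leading spaces: "John"

printf ":%s:\n" "$var"
```

Because each end is handled by one pattern match rather than one loop iteration per character, the cost does not grow with the amount of padding in the way the loop version does.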

 
