botbotbot's blog

Text Processing in Shell

Most of the examples come from the problems in Linux Shell - Hackerrank Text Processing Commands

Sed

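Tip:
The sed that ships with OS X may not behave as in these examples, because they rely on GNU sed extensions such as \s. Try gsed instead.
gsed can be installed with brew (brew install gnu-sed); then call gsed in place of sed.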
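
A minimal sketch of the tip above, assuming gsed is on the PATH when GNU sed has been installed through brew. The variable name SED is only an illustration. The pattern uses -E and a literal space instead of -r and \s so that it should work with either sed:

```shell
# Prefer GNU sed under the name gsed when it exists,
# otherwise fall back to whatever sed is on PATH.
if command -v gsed >/dev/null 2>&1; then
  SED=gsed
else
  SED=sed
fi

# Mask the first three groups of four digits.
masked=$(printf '1234 5678 9101 1234\n' | "$SED" -E 's/([0-9]{4} ){3}/**** **** **** /')
echo "$masked"
```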

$ cat in
1234 5678 9101 1234  
2999 5178 9101 2234  
$ sed -re 's/([0-9]{4}\s){3}/**** **** **** /' in
**** **** **** 1234  
**** **** **** 2234  
$ cat in
1234 5678 9101 1234  
2999 5178 9101 2234  
$ sed -r 's/([0-9]{4}\s?)([0-9]{4}\s?)([0-9]{4}\s?)([0-9]{4})/\4 \3\2\1/' in
1234 9101 5678 1234  
2234 9101 5178 2999  
$ cat in
hello cat meow meow Cat Cat caT love cAt
$ sed -e 's/cat/{\0}/gi' in
hello {cat} meow meow {Cat} {Cat} {caT} love {cAt}
$ cat in
hello cat meow meow Cat Cat caT love cAt
$ sed -e 's/cat/dog/gi' in
hello dog meow meow dog dog dog love dog
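
The {\0} example relies on \0, which GNU sed accepts as a name for the whole match. A small sketch of the same idea with &, the POSIX spelling. The sample text is made up, and the match here is case-sensitive because the I flag is itself a GNU extension:

```shell
# & in the replacement stands for the entire matched text.
wrapped=$(printf 'hello cat meow cat\n' | sed 's/cat/{&}/g')
echo "$wrapped"
```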

Reference:

  1. Sed - An Introduction and Tutorial by Bruce Barnett
  2. Advanced Bash-Scripting Guide: Sed
  3. Sed Command in Unix and Linux Examples
  4. Unix Sed Tutorial: Advanced Sed Substitution Examples
  5. regular expression in sed for masking credit card
  6. sed, a stream editor
  7. The Basics of Using the Sed Stream Editor to Manipulate Text in Linux
  8. Intermediate Sed: Manipulating Streams of Text in a Linux Environment
  9. Unix - Regular Expressions with SED
  10. Unix Sed Tutorial: Append, Insert, Replace, and Count File Lines
  11. Add Character to the Beginning or to the End of Each Line using AWK and SED
  12. sed, a stream editor
  13. Sed & Awk Book
  14. Overview of Regular Expression Syntax

Awk

$ cat in
A 25 27 50  
B 35 75  
C 75 78  
D 99 88 76  
$ awk '{ if ($4 =="") print "Not all scores are available for",$1; }' in
Not all scores are available for B
Not all scores are available for C
$ cat in
B 35 37 75  
C 75 78 80  
$ awk '{print $1,":",($4 >= 50 && $2 >= 50 && $3 >= 50)?"Pass":"Fail"}' in
B : Fail
C : Pass
$ cat in
B 35 37 75  
C 75 78 80  
$ awk '{ score=($2+$3+$4)/3; \
if (score >= 80) grade = "A"; \
else if (score >= 60) grade = "B"; \
else if (score >= 50) grade = "C"; \
else grade = "FAIL"; \
print $0,":",grade }' in
B 35 37 75 : FAIL
C 75 78 80 : B
$ cat in
A 25 27 50  
B 35 37 75  
C 75 78 80  
D 99 88 76  
$ awk 'ORS=NR%2?";":"\n"' in
A 25 27 50;B 35 37 75  
C 75 78 80;D 99 88 76  
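
The ORS one-liner is terse: the assignment is used as a pattern, its value (";" or "\n") is a non-empty string and so counts as true, and awk then runs the default action, print. A sketch of the same logic written out as an explicit action, with shortened sample input that is only an illustration:

```shell
# Odd-numbered records end with ";", even-numbered ones with a newline,
# so every pair of input lines is joined into one output line.
joined=$(printf 'A 25\nB 35\nC 75\nD 99\n' |
  awk '{ ORS = (NR % 2) ? ";" : "\n"; print }')
echo "$joined"
```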

Reference:

  1. Awk in 20 Minutes
  2. Awk Introduction Tutorial – 7 Awk Print Examples
  3. 4 Awk If Statement Examples ( if, if else, if else if, :?)
  4. AWK Scripting: How to define awk variables
  5. 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR

Grep

$ grep -o text # print only the matched part of each line, not the whole line
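
A short sketch of what -o does when a line has several matches, using a made-up input line. Each match is printed on its own line:

```shell
# -o prints every match separately; -E enables extended regular expressions.
numbers=$(printf 'id=42 name=x id=7\n' | grep -oE '[0-9]+')
echo "$numbers"
```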

Reference:

  1. Return a regex match in a BASH script, instead of replacing it
  2. Grep and Regex
  3. Regular Expression for finding double characters in Bash
  4. 15 Practical Grep Command Examples In Linux / UNIX

Paste

$ paste file1 file2
$ paste -d, file1 file2
$ paste -s file1
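
A sketch of how the three forms differ, using two small throwaway files created in a temporary directory. The file contents and variable names are assumptions made for the illustration:

```shell
tmp=$(mktemp -d)
printf '1\n2\n3\n' > "$tmp/file1"
printf 'a\nb\nc\n' > "$tmp/file2"

# Default: corresponding lines joined with a tab.
side_by_side=$(paste "$tmp/file1" "$tmp/file2")
# -d, : use a comma as the delimiter instead of a tab.
comma_joined=$(paste -d, "$tmp/file1" "$tmp/file2")
# -s : serial mode, all lines of one file joined onto a single line.
serial=$(paste -s "$tmp/file1")

printf '%s\n' "$side_by_side" "$comma_joined" "$serial"
rm -r "$tmp"
```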

Reference:

  1. How to join columns of two files in unix system
  2. 10 examples of paste command usage in Linux
  3. paste command: setting (multiple) delimiters

Bash Substring

# ${var::-1} drops the last character (a negative length needs bash 4.2 or newer)
$ a=123
$ echo "${a::-1}"
12
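
On shells without negative-length support (bash older than 4.2, or plain POSIX sh), suffix removal gives the same result for this case. A minimal sketch, offered as an alternative rather than the method used above:

```shell
a=123
# ${a%?} removes the shortest suffix matching the pattern ?, i.e. one character.
trimmed=${a%?}
echo "$trimmed"
```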

Reference:

  1. Substring in bash