Linux Extract Email Addresses and Web URLs From A Long Document
March 30, 2024
Alright! This is a tiny post about two specific functions. You can either use them directly from the command line or embed them in another script to do the job they are made for. I have used them in both forms, so I thought I'd share them with you people. 🙂
The file I am pulling the data from is quite big and filled with a lot of text. It is referred to as README.md in the screenshots. I believe I used a similar file in the video too.
Extracting Email Addresses From The Document
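#!/usr/bin/env bash
# Print every email-like token found in the given file, one per line.
filename=$1
# grep -E replaces the deprecated egrep; the dot before the suffix is escaped.
grep -Eo '\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b' "$filename"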
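Hey, it is darn simple. The crux is that a match has to sit between two word boundaries (\b) and may only use specific characters and symbols: letters, digits, dots, and hyphens on either side of an @. The -o flag prints only the matched part, one address per line.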
Example:
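Here is a minimal sketch of the email extractor wrapped as a function, so it can be dropped into another script. The function name extract_emails and the sample text are made up for illustration; they are not from the original screenshots.

```bash
#!/usr/bin/env bash
# extract_emails is a hypothetical helper name, not from the original post.
extract_emails() {
  # -E: extended regex, -o: print only the matched part, one match per line.
  grep -Eo '\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b' "$1"
}

# Made-up sample text standing in for a long README.md.
sample=$(mktemp)
printf '%s\n' \
  'Contact alice@example.com or bob.smith@mail.example.org for help.' \
  'No address on this line.' > "$sample"

extract_emails "$sample"
rm -f "$sample"
```

On this sample it should print the two addresses, each on its own line. The character classes are deliberately loose, so treat the result as a list of email-like tokens rather than validated addresses.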
Extracting The Web URLs From The Document
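#!/usr/bin/env bash
filename=$1
# Bail out early when no filename was given.
if [[ -z $filename ]]; then
  echo "You need to provide the filename." >&2
  exit 1
fi
# Capture the text from "http" up to the next double quote and print only that.
sed -ne 's/.*\(http[^"]*\).*/\1/p' < "$filename"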
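Ah, this one is even easier: capture the URL with a regex group, then replay it with \1 to print it. Keep in mind that the leading .* is greedy, so only the last URL on each line is printed, and the capture runs up to the next double quote or the end of the line.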
Example:
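If a line can hold more than one URL, a grep -o variant may be a better fit than the sed capture, since it prints every match instead of one per line. The sketch below is an alternative, not the script from this post: the function name extract_urls, the stop characters in the pattern, and the sample text are all assumptions made for illustration.

```bash
#!/usr/bin/env bash
# extract_urls is a hypothetical helper name, not from the original post.
extract_urls() {
  if [ -z "$1" ]; then
    echo "You need to provide the filename." >&2
    return 1
  fi
  # Match http:// or https:// followed by anything that is not whitespace,
  # a double quote, an angle bracket, or a closing parenthesis.
  grep -Eo 'https?://[^[:space:]"<>)]+' "$1"
}

# Made-up sample text with Markdown and HTML style links.
sample=$(mktemp)
printf '%s\n' \
  'See [docs](https://example.com/docs) and http://example.org/a?b=1 today.' \
  '<a href="https://example.net/page">link</a>' > "$sample"

extract_urls "$sample"
rm -f "$sample"
```

On this sample it should print three URLs, one per line. Stopping at a closing parenthesis keeps Markdown links clean, but it will also cut off URLs that legitimately contain parentheses, so adjust the bracket expression to the kind of document you are scanning.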
Alternatively, you can take a peek at my YouTube video on the same topic.