Text Processing: Viewing and Manipulating Text
You'll process text constantly. Logs. Configs. Data.
Master these tools, and you can work with text efficiently. They're the ones you'll use every day.
Viewing Files
cat: Display Entire File
cat file.txt # Display file
cat file1.txt file2.txt # Display multiple (concatenated)
cat -n file.txt # Show line numbers
What it does: Shows file contents. All of it.
When to use: Small files. Quick viewing.
Real examples:
cat /etc/hostname # Show hostname
cat config1.txt config2.txt > combined.txt # Combine files
My take: Use cat for small files. For large files, use less.
less: View Large Files
less largefile.txt # View file
Navigation:
Space = Next page
b = Previous page
q = Quit
/pattern = Search
n = Next search result
G = Go to end
g = Go to beginning
My take: Use less for large files. It's better than cat for anything longer than a screen.
Real example:
less /var/log/syslog # View system log
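One variant worth knowing, assuming a reasonably recent GNU less: it can follow a growing file the way tail -f does, or open straight at the end.
less +F /var/log/syslog # Follow mode; Ctrl-C stops following, q quits
less +G largefile.txt # Open with the end of the file on screen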
head: First Lines
head file.txt # First 10 lines
head -n 20 file.txt # First 20 lines
head -c 100 file.txt # First 100 characters
What it does: Shows beginning of file.
When to use: Quick peek at file start.
Real example:
head -n 50 /var/log/syslog # First 50 log entries
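A small trick once you also know tail (covered next): piping head into tail slices out a specific range of lines.
head -n 20 file.txt | tail -n 5 # Lines 16-20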
tail: Last Lines
tail file.txt # Last 10 lines
tail -n 20 file.txt # Last 20 lines
tail -f file.txt # Follow (watch as it grows)
The -f flag: Follows the file. Shows new lines as they're added.
Real example:
tail -f /var/log/nginx/access.log # Watch access log in real-time
My take: tail -f is magic. Watch logs in real-time. You'll use it constantly.
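tail -f also pipes well into grep for filtered live-watching. A sketch, assuming GNU grep (--line-buffered keeps matches flowing instead of sitting in a buffer):
tail -f /var/log/syslog | grep --line-buffered -i error # Only show new lines containing "error"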
Searching Text
grep: Find Text
grep "pattern" file.txt
grep -i "pattern" file.txt # Case insensitive
grep -n "pattern" file.txt # Show line numbers
grep -r "pattern" directory/ # Recursive
My take: grep is what you'll use most. Find text in files. Essential.
Real example:
grep -i error /var/log/syslog # Find errors
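When a match alone isn't enough, GNU grep's context flags show the surrounding lines, which is often what you actually want in logs:
grep -B 2 -A 2 -i error /var/log/syslog # Show 2 lines before and after each match
grep -C 3 -i "failed" /var/log/syslog # Shorthand: 3 lines of context on both sides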
Processing Text
cut: Extract Columns
cut -d: -f1 /etc/passwd # Extract first field (usernames)
cut -d',' -f1,3 file.csv # Extract columns 1 and 3
The -d flag: Delimiter (the default is a tab). The -f flag: Fields to extract.
Real example:
cut -d: -f1,7 /etc/passwd # List usernames and their shells
My take: Useful for structured data. CSV files. Config files with delimiters.
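One pitfall: cut splits on every single occurrence of the delimiter, so runs of spaces produce empty fields. A minimal sketch of the problem and one workaround using tr (not covered above, but standard), which squeezes repeated characters:
echo "a   b" | cut -d' ' -f2 # Prints an empty field, not b
echo "a   b" | tr -s ' ' | cut -d' ' -f2 # Squeeze the spaces first; prints b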
sort: Sort Lines
sort file.txt # Sort alphabetically
sort -n file.txt # Sort numerically
sort -r file.txt # Reverse sort
sort -u file.txt # Unique (remove duplicates)
Real example:
sort /etc/passwd # Sort user list
ps aux | sort -k3 -rn # Sort by CPU usage
My take: Useful for organizing output. Sorting logs. Finding duplicates.
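The ps example above used -k to pick the sort field; on delimited files, pair -k with -t to set the field separator:
sort -t: -k3 -n /etc/passwd # Sort users by UID (field 3, numeric)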
uniq: Remove Duplicates
uniq file.txt # Remove consecutive duplicates
sort file.txt | uniq # Remove all duplicates
uniq -c file.txt # Count occurrences (of consecutive duplicates)
My take: Usually used with sort. sort | uniq removes all duplicates.
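The classic combination is sort | uniq -c | sort -rn: count every value, then rank by frequency. A sketch, assuming a hypothetical words.txt with one item per line:
sort words.txt | uniq -c | sort -rn | head # Top 10 most frequent lines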
wc: Count
wc file.txt # Lines, words, characters
wc -l file.txt # Just lines
wc -w file.txt # Just words
wc -c file.txt # Just characters
Real example:
wc -l /var/log/syslog # Count log lines
My take: Useful for statistics. How many lines? How many words?
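One detail worth knowing: given a filename, wc prints the name alongside the count; reading from stdin prints the bare number, which is handier in scripts.
wc -l /var/log/syslog # Prints the count and the filename
wc -l < /var/log/syslog # Prints the count alone (handy in scripts)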
Combining Commands (Pipes)
cat file.txt | grep "pattern" | sort | uniq
The pipe |: Sends output of one command to input of next.
Real examples:
ps aux | grep nginx # Find nginx processes
cat /var/log/syslog | grep error | tail -20 # Last 20 errors
cut -d: -f1 /etc/passwd | sort | uniq # Sorted unique usernames
My take: Pipes are powerful. Combine commands. Build pipelines. This is how you process data.
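A fuller pipeline built entirely from the tools above: top client IPs in an nginx access log (assuming the default log format, where the IP is the first space-separated field):
cut -d' ' -f1 /var/log/nginx/access.log | sort | uniq -c | sort -rn | head # Top 10 IPs by request count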
Common Patterns
Find Errors in Logs
grep -i error /var/log/syslog | tail -20
Count Unique Values
cut -d: -f1 /etc/passwd | sort | uniq | wc -l
Extract Information
grep "pattern" file.txt | cut -d' ' -f2
Common Mistakes (I've Made These)
- Using cat on large files: Don't. Use less.
- Forgetting -f in tail: tail -f follows. tail alone just shows the last lines.
- Not quoting patterns in grep: grep pattern file works, but grep "pattern" file is safer.
- Wrong delimiter in cut: Make sure -d matches your data format.
- Forgetting to sort before uniq: uniq only removes consecutive duplicates. Sort first (see the sketch after this list).
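To see that last one in action, here's a minimal sketch with printf supplying three lines:
printf 'a\nb\na\n' | uniq # Prints a, b, a (the duplicates aren't adjacent)
printf 'a\nb\na\n' | sort | uniq # Prints a, b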
What's Next?
Now that you can process text, you can work with logs and data. Combine these tools. Build pipelines. Process anything.
Or learn about Command History to work more efficiently.
Personal note: When I started, I'd open files in editors to search. Then I learned
grep. Now I grep everything. It's faster. It's more powerful. Text processing tools changed how I work.