Text Manipulation FAQ

This is the FAQ for the article entitled “Huge Text File, Need to Extract Specific Lines? Here’s How”. Working examples and more detailed information for each of these commands are available in that article. All of these commands work in the Linux terminal; some of them will also work in the Windows command line.

When using sed, add the “-i” switch and do not specify a destination file if you want the changes to be applied to the source file itself without producing a backup. For example:

sed -i 's/StringToReplace/ReplacementString/g' source.txt

This edits source.txt directly without creating a backup.
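
If a safety copy is wanted after all, a suffix can be attached directly to the -i switch. The following is a minimal sketch, assuming GNU sed; the file name demo.txt and its contents are made up for illustration:

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample file.
printf 'old value\nanother old value\n' > demo.txt

# -i.bak edits demo.txt in place and keeps the original as demo.txt.bak.
sed -i.bak 's/old/new/g' demo.txt

cat demo.txt       # the edited content
cat demo.txt.bak   # the untouched original
```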

How do I split a large file into small chunks?

split -l [number of lines] [filename]

Remember to remove the square brackets, i.e. [ and ], when substituting real values.
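
As a rough illustration, the sketch below splits a ten-line file into chunks of at most four lines. The file name big.txt and the output prefix chunk_ are assumptions chosen for the example; without a prefix, split names its output files xaa, xab and so on.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a 10-line sample file (hypothetical name).
seq 1 10 > big.txt

# Split into chunks of at most 4 lines each, named chunk_aa, chunk_ab, ...
split -l 4 big.txt chunk_

# List the chunks with their line counts.
wc -l chunk_*
```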

How do I shuffle the data in a text file?

shuf input.txt > output.txt

How do I sort the data in a text file?

sort input.txt > output.txt
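
One thing to watch for is that a plain sort compares lines as text, which gives surprising results for numbers. A small sketch with made-up data, assuming the numbers.txt file name and using the -n switch for numeric ordering:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: three numbers, one per line.
printf '10\n9\n100\n' > numbers.txt

# Text comparison (forced to the C locale here): "100" sorts before "9".
LC_ALL=C sort numbers.txt > sorted_text.txt

# -n compares by numeric value instead.
sort -n numbers.txt > sorted_numeric.txt

cat sorted_text.txt sorted_numeric.txt
```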

How do I delete specific characters or text from a file?

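grep -v "criteria" sourcefile.txt > destinationfile.txt

This removes every line that contains criteria. Without the -v switch, grep does the opposite and keeps only the matching lines. To remove just the matching text and keep the rest of each line, use sed 's/criteria//g' sourcefile.txt > destinationfile.txt instead.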
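
To make the difference between dropping whole lines and removing only the matching text concrete, here is a short sketch. The sample file and the word DEBUG are invented for the example:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample file.
printf 'keep this\ndrop DEBUG line\nkeep that\n' > sample.txt

# grep -v removes every line that contains the text.
grep -v "DEBUG" sample.txt > without_lines.txt

# sed removes only the text itself and keeps the rest of each line.
sed 's/DEBUG //g' sample.txt > without_text.txt

cat without_lines.txt
cat without_text.txt
```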

How do I replace specific characters or text in a file?

sed 's/StringToReplace/ReplacementString/g' source.txt > destination.txt

If either string contains forward slashes, use a different delimiter such as #, e.g.

sed 's#StringToReplace#ReplacementString#g' source.txt > destination.txt
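
A typical case for the alternative delimiter is a file path. The sketch below uses an invented path and replacement purely for illustration:

```shell
# The text being replaced contains slashes, so '#' is used as the delimiter.
result=$(printf '/usr/local/bin/tool\n' | sed 's#/usr/local#/opt#g')

echo "$result"   # /opt/bin/tool
```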

How do I delete the Nth character within every line of a file?

sed 's/^\(.\{#\}\).\(.*\)/\1\2/' sourcefile > outputfile

Replace # with the character position. Counting starts at 0 (zero), so the first character of each line is position 0.
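
As a worked example of the zero-based counting, the sketch below deletes the character at position 2 of a made-up input line. It keeps the first two characters, drops the next one, and keeps the remainder:

```shell
# Position 2 (counting from 0) of "abcdef" is the letter "c".
result=$(printf 'abcdef\n' | sed 's/^\(.\{2\}\).\(.*\)/\1\2/')

echo "$result"   # abdef
```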

How do I delete the first N characters of every line within a file?

sed 's/^.\{#\}//' source.txt > destination.txt

Replace # with the number of characters to be removed.
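
For instance, stripping a fixed-width prefix from a log line might look like the sketch below. The log line is invented; its date plus the following space make up 11 characters:

```shell
# Remove the first 11 characters ("2024-01-15 " including the space).
result=$(printf '2024-01-15 ERROR disk full\n' | sed 's/^.\{11\}//')

echo "$result"   # ERROR disk full
```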

How do I delete the last N characters of every line within a file?

sed 's/.\{#\}$//' source.txt > destination.txt

Replace # with the number of characters to be removed.
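
A quick sketch of trimming from the end of a line, using a made-up file name whose last four characters are the extension:

```shell
# Remove the last 4 characters (".txt") from the line.
result=$(printf 'report.txt\n' | sed 's/.\{4\}$//')

echo "$result"   # report
```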

How do I delete everything after a specific character in every line within a file?

sed 's/[character].*/[character]/g' source.txt > destination.txt

Replace [character] with the demarcation character or characters (do not include the square brackets “[]”).

The same command can replace everything after the character with different text: the second [character] is the replacement. Leave the replacement empty to delete the [character] as well.
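
The two variants can be compared side by side. The sketch below uses @ as the demarcation character on an invented e-mail address:

```shell
# Keep the "@" and drop everything after it.
kept=$(printf 'user@example.com\n' | sed 's/@.*/@/')

# Empty replacement: drop the "@" as well.
dropped=$(printf 'user@example.com\n' | sed 's/@.*//')

echo "$kept"      # user@
echo "$dropped"   # user
```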

How do I delete everything before a specific character in every line within a file?

sed 's/.*[character]/[character]/g' source.txt > destination.txt

Replace [character] with the demarcation character or characters (do not include the square brackets “[]”). If the character occurs more than once in a line, everything up to its last occurrence is matched.

The same command can replace everything before the character with different text: the second [character] is the replacement. Leave the replacement empty to delete the [character] as well.
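
When the demarcation character appears several times in a line, it matters which occurrence is used, because .* matches as much as it can. The sketch below uses an invented path with / as the character and # as the sed delimiter. The second command is an assumed variant, not part of the FAQ, that stops at the first occurrence instead:

```shell
# ".*/" runs up to the LAST slash, so only the file name remains.
after_last=$(printf 'home/docs/report.txt\n' | sed 's#.*/##')

# "^[^/]*/" stops at the FIRST slash instead.
after_first=$(printf 'home/docs/report.txt\n' | sed 's#^[^/]*/##')

echo "$after_last"    # report.txt
echo "$after_first"   # docs/report.txt
```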

How do I add characters to the END of every line within a file?

sed 's/$/text to add/g' source.txt > destination.txt

Replace “text to add” with the characters to be added to the end of each data line.

How do I add characters to the BEGINNING of every line within a file?

sed 's/^/text to add/g' source.txt > destination.txt

Replace “text to add” with the characters to be added to the beginning of each data line.
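
Both additions can be combined in one sed call by giving two expressions. A minimal sketch with invented data, where "item: " is added at the beginning and ";" at the end of each line:

```shell
# -e may be repeated; each expression is applied to every line in turn.
result=$(printf 'apple\nbanana\n' | sed -e 's/^/item: /' -e 's/$/;/')

echo "$result"
```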

How do I remove duplicate lines of data within a file?

uniq source.txt > destination.txt

The above command only compares adjacent lines, so the data must be sorted first for every duplicate to be removed. Alternatively, and better, use the awk command below, which does not require the data to be presorted:

awk '!x[$0]++' source.txt > destination.txt
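
The difference shows up clearly on unsorted data. The sketch below runs the three approaches on an invented four-line file in which the duplicates are never adjacent:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample file with non-adjacent duplicates.
printf 'pear\napple\npear\napple\n' > fruit.txt

# uniq alone changes nothing here: no two neighbouring lines are equal.
uniq fruit.txt > uniq_only.txt

# Sorting first lets uniq find the duplicates, but the original order is lost.
sort fruit.txt | uniq > sorted_uniq.txt

# awk keeps the first occurrence of each line in its original order.
awk '!x[$0]++' fruit.txt > awk_dedup.txt

cat uniq_only.txt sorted_uniq.txt awk_dedup.txt
```

The awk expression works by counting each line in the array x and printing a line only when its count was still zero.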

How do I extract lines within a file that contain specific data?

grep "specific data" source.txt > destination.txt

Replace specific data with the text that the lines to be extracted contain. The quotation marks are required whenever that text contains spaces or characters that are special to the shell, so it is safest to always include them.
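
A search text containing a space shows why the quoting matters. In the sketch below, the log file and its contents are invented for illustration:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample log.
printf 'disk full on sda\nall good\nwarning: disk full\n' > log.txt

# The quotes keep "disk full" together as one search pattern.
grep "disk full" log.txt > matches.txt

cat matches.txt
```

Without the quotes, the shell would pass disk as the pattern and full as the name of a file to search.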

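How do I merge n files into one file as separate columns?

paste -d 'delimiter' file1 file2 > newfile

Replace delimiter with the column separation character, e.g. a comma (,) or a space ( ). Note that paste treats each character given to -d as a separate delimiter and cycles through them between columns, so a set of characters (xxxxx) is not inserted as one multi-character separator.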
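
The sketch below merges three small invented files, first with a single delimiter and then with two delimiter characters to show how paste alternates between them:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input files, one column each.
printf '1\n2\n' > ids.txt
printf 'ann\nbob\n' > names.txt
printf 'uk\nfr\n' > countries.txt

# One delimiter character: used between every pair of columns.
paste -d ',' ids.txt names.txt countries.txt > merged.csv

# Two delimiter characters: "," after column 1, ";" after column 2.
paste -d ',;' ids.txt names.txt countries.txt > cycled.txt

cat merged.csv
cat cycled.txt
```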
