Picture this: you have a thousand posts on your blog and some malicious individual (not my first choice of words) comes along, sticks his digital hatchet into your database and hacks away at it until every post is filled with spam links to Viagra stores. Just as you think things can’t get any worse, they do: you’ve written 30 posts since your last database backup!
What do you do?
Is there a quick way to get rid of all that lovely free Viagra advertising you’re giving away?
Absolutely. With a plugin to edit your database and a few regular expressions, anyone can easily mass edit WordPress posts and pages.
Along Comes Search Regex
Search Regex is the search and replace plugin I use most. It allows a WordPress database to be scanned for specific words and phrases which can then be replaced or deleted. There is no need to physically browse a site’s database because everything can be done from its WordPress admin panel (a picture of that follows in a few paragraphs).
In technical lingo, those words and phrases – whether digits, letters or symbols – are called strings. I will occasionally refer to them as strings from this point on.
More than a simple plain text string finder that deals with, “I’m looking for ‘XYZ’ so find ‘XYZ’, damn it!” search pleas, Search Regex lets us use regular expressions for finding and editing strings.
Whereas a plain text search and replace simply replaces one specific text string with another specific text string, regular expressions make it possible for generalizations to be used in search criteria, for search expressions to specify a context for the search string and for us to do some pretty damn nifty stuff with any matches found. For example,
In the sentence “I love eating really hot curry for breakfast and I love eating sugary cream cakes at night”, a plain text search could replace “love” with “hate” to make the sentence read “I hate eating really hot curry for breakfast and I hate eating sugary cream cakes at night.”
Running a regular expression search on the same sentence, we could swap the positions of the strings “really hot curry” and “sugary cream cakes” to rewrite the sentence as “I love eating sugary cream cakes for breakfast and I love eating really hot curry at night.”
Also, using a regular expression, we could specify that the word “love” should only be replaced with “hate” when it is followed by the string “eating really hot curry” to make it read “I hate eating really hot curry for breakfast and I love eating sugary cream cakes at night.”
Those three examples are very basic but they show the potential of using regular expressions for mass editing WordPress posts and pages.
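For anyone who wants to try those three examples before touching a live database, here is a minimal sketch in Python. It uses Python's built-in re module rather than Search Regex itself, so the variable names are mine and the patterns are written without the plugin's delimiters.

```python
import re

sentence = ("I love eating really hot curry for breakfast "
            "and I love eating sugary cream cakes at night")

# 1. Plain text replacement: every "love" becomes "hate".
plain = sentence.replace("love", "hate")

# 2. Swap two phrases by capturing each one and reordering the groups.
swapped = re.sub(r"(really hot curry)(.*)(sugary cream cakes)",
                 r"\3\2\1", sentence)

# 3. Replace "love" only when a specific phrase follows it (a lookahead).
conditional = re.sub(r"love(?= eating really hot curry)", "hate", sentence)

print(plain)
print(swapped)
print(conditional)
```

The lookahead in the third pattern checks what follows “love” without including it in the match, which is why only the word itself gets replaced.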
Staying Safe
Search Regex has a few safety restrictions that limit the data that can be edited. The WordPress database tables that the plugin has access to are those that store post and page content, namely
- comment author, the author’s email address and the comment content
- the post URL, title, meta values, excerpt, content and sniplet
- post tags and tag slugs
Notice the absence of the word “page” in the above list. WordPress treats posts and pages equally so performing a search and replace on post content will also search and replace page content.
Don’t worry about getting things wrong the first time you use Search Regex. Any mistakes will be limited to the selected category, and only one of the three search options commits changes to the database in bulk.
The three search options are
- Search
- Replace
- Search and Replace
Pressing Search scans the database for the specified string. No replacements are made.
Pressing Replace scans for the specified string and shows a preview of the replacement in each line the search string is found within.
The picture a few paragraphs below here shows typical results from pressing the Replace button. Green lines show the original string, cream/orange tinted lines show the matched string as it would look were the change committed and any text highlighted in pink is where any changes will be made.
At the far right of each projected change there is a link entitled “Replace”. Click it to commit the change individually.
Pressing Search and Replace finds the search string and overwrites it with its replacement. Pressing this button commits changes to the database without an option to review changes so be certain changes do not have unintended consequences before you press it.
Always remember to back up your database before you use Search Regex to mass edit posts. It is best to back up your complete database. If your database is too large to back up without memory allocation errors occurring then you might want to back up only the tables that the plugin has access to. Those tables are wp_posts, wp_postmeta, wp_comments and wp_commentmeta.
The picture above shows the admin panel for Search Regex. When using the examples shown below – under Useful Search Regex Patterns – in Search Regex, place the Search Pattern in the Search Pattern field and the Replacement Pattern in the Replacement Pattern field. Tick the box next to “Regex” to tell the plugin that the search and replace patterns are regular expressions.
Search Regex has a few quirks that limit the metacharacters that can be used in replacement patterns:
- line break and tab metacharacters such as “\f”, “\r”, “\n” and “\t” are not currently supported in the replacement pattern. Use HTML “<br />” or “<p></p>” tags to introduce a line break through the replacement pattern. These metacharacters are supported in the search pattern.
- the whitespace metacharacter (\s) is not currently supported in the replacement pattern so use a regular space where you would usually use a whitespace metacharacter. It is supported in the search pattern.
That’s the end of this little Search Regex instruction guide (I wondered where I’d get “instruction guide” written into the post). The next part of this post demonstrates a few useful regular expressions that can be used with Search Regex to effect changes across WordPress posts in bulk. There was going to be a further explanation of regular expressions tacked onto the end of this post but the reference table and examples within it made it warrant being turned into a post of its own which can be viewed here.
The regex examples shown below are accompanied by the minimum of explanation. Their main purpose is to provide a quick reference of valid Search Regex expressions that may help you mass edit your WordPress posts.
Click here to learn more about regular expressions and to view a quick-reference table for the language used by regular expressions.
Useful Search Regex Patterns
I’ve chosen to use a hash (#) symbol for the delimiter. Remember to change it if your search pattern contains a hash sign or to use a backslash (\) in front of any hash symbol that occurs within your search pattern.
All these search expressions already include opening and closing delimiters.
Add text to the top of every post
Search Pattern
#^^#
Replacement Pattern
<p>YOUR TEXT</p>
Notes
HTML Paragraph tags are used to create a clear break between the inserted text and the original content. Remove the paragraph tags to concatenate the inserted text onto the beginning of every post’s opening paragraph – remember to include a space at the replacement string’s end.
You can easily use any other HTML element or a plugin shortcode in the replacement text. For example, you could insert an image with <p><img src="FILE LOCATION" title="" alt="" /></p>.
Add text to the bottom of every post
Search Pattern
#$$#
Replacement Pattern
<p>YOUR TEXT</p>
Notes
Again, the HTML Paragraph tags are used to create a clear break between the inserted text and the original content.
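To see where the inserted text lands, here is a rough sketch using Python's re module instead of the plugin. The sample post is made up, and I'm assuming the anchors behave the same way in both: without a multiline flag, ^ matches only at the very start of the content and $ only at the very end.

```python
import re

# A made-up post body with two lines and no trailing newline.
post = "First paragraph.\nSecond paragraph."

# ^ matches the empty position at the start of the whole text,
# so the replacement is inserted before the first character.
with_header = re.sub(r"^", "<p>YOUR TEXT</p>", post)

# $ matches the empty position at the end of the whole text,
# so the replacement is appended after the last character.
with_footer = re.sub(r"$", "<p>YOUR TEXT</p>", post)

print(with_header)
print(with_footer)
```

Neither pattern consumes any characters, so nothing in the original post is overwritten.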
Trim double space characters
It’s very easy to accidentally double tap the space bar while typing a WordPress post. Use this expression to reduce those double spaces to single ones.
Search Pattern
#([ ])+#
Replacement Pattern
\1
Notes
Be careful when applying this regex. It will trim any spaces you may have used to style the layout of your posts within <pre> and <code> tags.
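Here is the same pattern run through Python's re module as a quick sanity check. The sample text is invented, and the delimiters are dropped because Python doesn't use them.

```python
import re

text = "Two  spaces here,   three there, one everywhere else."

# ([ ])+ matches a run of one or more spaces and captures a single space;
# replacing the whole run with \1 leaves exactly one space behind.
collapsed = re.sub(r"([ ])+", r"\1", text)

print(collapsed)
```

Single spaces also match, but they are replaced with themselves, so only the longer runs actually change.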
Trim space characters from the beginning of a line
This removes any space characters that might precede the beginning of a line. There shouldn’t be any spaces before any lines within any WordPress post. WordPress automatically removes them, but just in case…
Search Pattern
#^[ ]+(.+)#
Replacement Pattern
\1
Trim space characters from the end of a line
Search Pattern
#[ ]+(?=$)#
Replacement Pattern
LEAVE BLANK
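Both trimming patterns can be tried out in Python before running them on real posts. This is only a sketch: the sample text is made up, and I pass re.MULTILINE on the assumption that you want ^ and $ to apply to every line rather than only to the start and end of the whole post.

```python
import re

text = "  indented line  \n   another line \nclean line"

# Strip leading spaces: keep only the captured remainder of each line.
no_leading = re.sub(r"^[ ]+(.+)", r"\1", text, flags=re.MULTILINE)

# Strip trailing spaces: match spaces that sit directly before a line end
# and replace them with nothing.
trimmed = re.sub(r"[ ]+(?=$)", "", no_leading, flags=re.MULTILINE)

print(repr(no_leading))
print(repr(trimmed))
```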
Delete accidental double word occurrences
This looks for words containing only the characters “a-z, A-Z, 0-9 and _”, checks for a space after them, then tests whether the following word is a duplicate of its predecessor. Duplicate words are replaced by a single instance.
Search Pattern
#(\b(\w+\s\b)\2+)#
Replacement Pattern
\2
Notes
Check each replacement individually. It will edit word duplications that were constructed for effect.
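A quick way to get a feel for what this pattern catches is to run it through Python's re module first. The sample sentences below are my own.

```python
import re

pattern = r"(\b(\w+\s\b)\2+)"

# Group 2 captures a word plus the whitespace after it; \2+ then demands
# one or more exact repeats. The whole run is replaced by one copy.
fixed = re.sub(pattern, r"\2", "the the cat sat on on the mat")
tripled = re.sub(pattern, r"\2", "a very very very good day")

print(fixed)
print(tripled)
```

Because the captured word includes its trailing whitespace, a duplicate at the very end of the text with nothing after it is not matched.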
Delete or replace a string only when it occurs after a specific marker
Search Pattern
#(?<=MARKER([ \t]+)?)(STRING TO EDIT)#
Replacement Pattern
Leave blank to delete the matched string; otherwise enter your replacement string or pattern.
Notes
Replace “MARKER” with the string that precedes the string that needs editing.
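The optional “([ \t]+)?” token may be removed. It is only present to cover any spaces or tabs between the marker and the string being edited.
If the marker is on a different line to the string being edited, suffix the marker with “(\r?\n)+”. For example #(?<=marker(\r?\n)+)(string to edit)#
Many regex engines, including the PCRE library that PHP uses, only accept lookbehinds of a fixed length. If the plugin rejects one of these patterns, remove the optional or repeated token and write the exact characters that separate the marker from the string into the marker instead.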
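The lookbehind idea is easy to test outside WordPress. Here is a minimal sketch in Python, with a made-up marker (“Price: ”) and string (“100”). Python's re module only allows fixed-width lookbehinds, so the space after the marker is written out literally rather than made optional.

```python
import re

text = "Price: 100 USD, deposit 100 USD"

# (?<=Price: ) requires the marker immediately before the match,
# but the marker itself is not part of what gets replaced.
result = re.sub(r"(?<=Price: )(100)", "120", text)

print(result)
```

Only the first “100” changes, because the second one is not preceded by the marker.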
Delete or replace a string only when it occurs before a marker
Search Pattern
#(STRING TO EDIT)(?=([ \t]+)?MARKER)#
Replacement Pattern
Leave blank to delete the matched string; otherwise enter your replacement string or pattern.
Notes
The optional “([ \t]+)?” token may be removed. It is only present to cover any spaces or tabs between the marker and the string being edited.
If the marker is on a different line to the string being edited, prefix the marker with “(\r?\n)+”. For example #(string to edit)(?=(\r?\n)+marker)#
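Here is the lookahead version as a small Python sketch. The marker (“[END]”) and the string being edited (“draft”) are invented for the example. Lookaheads can vary in length, so the optional space-or-tab token works as written.

```python
import re

text = "draft  [END] and another draft elsewhere"

# (?=([ \t]+)?\[END\]) requires the marker after the match, allowing
# optional spaces or tabs in between; the marker is left untouched.
result = re.sub(r"(draft)(?=([ \t]+)?\[END\])", "final", text)

print(result)
```

The square brackets in the marker are escaped because they are metacharacters; do the same for any special characters in your own markers.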
Delete or replace a string only when it occurs between two specific markers
Search Pattern
#(?<=FIRST MARKER)(STRING TO EDIT)(?=SECOND MARKER)#
Replacement Pattern
Leave blank to delete the matched string; otherwise enter your replacement string or pattern.
Notes
The notes for the previous two regular expressions apply to this one.
Delete all occurrences of any URL containing a common domain name
Search Pattern
#(<a.*domain.tld.*>)(.*)(</a>)#
Replacement Pattern
LEAVE BLANK
Notes
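This will remove the specified link regardless of any class, id, style, title, name or rel attributes etc. It removes from <a> to </a> inclusive of the <a></a> tags. Be aware that “.*” is greedy: if a single line contains several links, the match can stretch from the first <a to the last </a> on that line, so preview the results with Replace before committing. Write the dot in your domain as “\.” so that it only matches a literal dot.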
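Since deleting links is destructive, it's worth seeing how greedy matching behaves before running anything. The sketch below uses Python's re module with a made-up domain (spam-domain.tld) and made-up HTML. The tightened pattern is my own variation, not something the plugin provides: “[^>]*” stops the match from leaving the opening tag and “.*?” stops at the first closing tag.

```python
import re

html = ('<p>Read <a href="http://example.com/post">my post</a> or buy '
        '<a class="x" href="http://spam-domain.tld/pills" rel="nofollow">'
        'cheap pills</a> now.</p>')

# Greedy version: .* can start at the first <a on the line, so the
# legitimate link in front of the spam link is swallowed too.
greedy = re.sub(r"(<a.*spam-domain.tld.*>)(.*)(</a>)", "", html)

# Tightened version: stay inside one tag, stop at the first </a>.
cleaned = re.sub(r"<a\b[^>]*spam-domain\.tld[^>]*>.*?</a>", "", html)

print(greedy)
print(cleaned)
```

The tightened version keeps the example.com link and removes only the spam link, while the greedy version removes both.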
Delete a URL that occurs before a specified marker
Search Pattern
#(<a .*>)(.*)(</a>)(?=MARKER)#
Replacement Pattern
LEAVE BLANK
Notes
If the specified marker is on a newline then prefix it with “(\r?\n)+”. For example (<a .*>)(.*)(</a>)(?=(\r?\n)+MARKER)
Delete a URL that occurs after a specified marker
Search Pattern
#(?<=MARKER)(<a.*domain.tld.*>)(.*)(</a>)#
Replacement Pattern
LEAVE BLANK
Notes
If the specified marker is on a newline then suffix it with “(\r?\n)+”. For example (?<=MARKER(\r?\n)+)(<a.*domain.tld.*>)(.*)(</a>)
Delete a URL that occurs between two specified markers
Search Pattern
#(?<=MARKER ONE)(<a.*domain.tld.*>)(.*)(</a>)(?=MARKER TWO)#
Replacement Pattern
LEAVE BLANK
Notes
The notes for the previous two regular expressions apply to this one too.
Downloads and Resources
Search Regex can be downloaded from WordPress
The plugin’s home page is here
PowerGrep – A Regular Expressions Text Editor. Free download
Introduction to Regular Expressions with table of metacharacters and examples