Categories
geek opensource software tips wordpress

warmup your site or wordpress blog with a single command line statement

The Mother huge spider statue contrasting against the sky.

GNU Wget is a powerful tool when it comes to downloading files from the web or mirroring sites. Its command line features can be daunting and not very obvious. With some experimentation, reading the (f..) manual and some Googling you can get it to do some pretty neat tricks for you.
All of that is from the command line too, which is great if you want to schedule this kind of magic or use it in a script.

For example, you might want to warm up your site or WordPress blog so your homepage and all posts linked from it are present in your cache when a visitor arrives. I’m assuming you are using caching on your site, otherwise this is pretty pointless. For WordPress you can use a caching plugin like W3 Total Cache, for example.

With Wget, it goes like this:

wget.exe http://n3wjack.net --spider --no-directories --level=1 --recursive --accept-regex="n3wjack.net/20[1-9].*"

The command line parameters (in order) mean something like:

  1. Crawl n3wjack.net.
  2. Act like a spider: request the pages, but don’t keep the downloads.
  3. Don’t create directories for downloads.
  4. Crawl 1 level deep (so anything linked on the homepage is OK, but don’t go deeper).
  5. Do this recursively (this is what makes it follow the links at all, in this case 1 level deep).
  6. Follow only links that start with "/201..." to "/209..." (it’s a regular expression).
    This one is a trick to have it only follow links to blog-posts because my URL scheme begins with the year of the post (2015, 2016, …). It’s good until 2099, which should do the trick I guess. :)
    This way I’m also avoiding it loading all tag, category or page links.

If your site has a different URL scheme you’ll have to change the accept regex pattern to fit your scheme.
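Before letting Wget loose on a different URL scheme, it can help to try the pattern on a few URLs first. Here is a minimal sketch in Python that uses the year pattern described above ("/201..." to "/209..."). The sample URLs and the function name would_follow are made up for illustration, and Python’s regex engine is not the one Wget uses, so treat this as a sanity check for simple patterns only.

```python
import re

# Pattern accepting post URLs that start with a year from 2010 to 2099.
ACCEPT = re.compile(r"n3wjack.net/20[1-9].*")


def would_follow(url):
    """True if the pattern matches somewhere in the URL."""
    return ACCEPT.search(url) is not None


# Hypothetical URLs, for illustration only.
samples = [
    "http://n3wjack.net/2015/03/01/some-post/",
    "http://n3wjack.net/tag/wordpress/",
    "http://n3wjack.net/category/geek/",
]

for url in samples:
    print("follow" if would_follow(url) else "skip", url)
```

If your posts live under something like /blog/ instead of a year, swapping the pattern and re-running the check shows quickly whether tag and category links are still being skipped.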

You can download Wget from the GNU site. It’s Open Source and is available for Windows, Mac and various Unix systems.
For Chocolatey users, there is a wget package available to install it on your system.

Categories
blog geek mystuff n3wjack opensource programming software wordpress

a wordpress full site spell checker tool

A while ago I noticed that some of my older posts had some silly misspellings in them, so I was looking for a way to spell check all my posts in one shot. I couldn’t really find anything that was free, so I figured I’d try to write something myself to do this for me.

I knew about the free and open source Hunspell spell checker and that you can use it from the command line. So I thought that by combining it with the WordPress export XML file, which has the content of all your posts, it should be possible to spell check the whole lot.

The end result is a PowerShell script which reads the XML export file, runs it through Hunspell, parses the spelling errors found and finally bundles it all into a simple HTML report.
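To give an idea of what such a pipeline looks like, here is a rough sketch in Python. It is not the PowerShell script from this post, just an illustration of the same steps under a few assumptions: the function names extract_posts and misspelled_words are made up, the post bodies are assumed to sit in content:encoded elements of the export file, an en_US dictionary is assumed to be installed, and hunspell is assumed to be on the PATH. The HTML report step is left out.

```python
import html
import re
import subprocess
import xml.etree.ElementTree as ET

# Namespace of the RSS "content" module that holds each post body in the export.
CONTENT_NS = {"content": "http://purl.org/rss/1.0/modules/content/"}


def extract_posts(xml_text):
    """Return (title, plain_text) pairs for every <item> in an export file."""
    root = ET.fromstring(xml_text)
    posts = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        body = item.findtext("content:encoded", default="", namespaces=CONTENT_NS)
        text = re.sub(r"<[^>]+>", " ", body)  # crude HTML tag stripping
        text = html.unescape(text)            # turn &amp; and friends into characters
        posts.append((title, " ".join(text.split())))
    return posts


def misspelled_words(text, dictionary="en_US"):
    """Ask Hunspell to list the misspelled words in text (needs hunspell installed)."""
    result = subprocess.run(
        ["hunspell", "-d", dictionary, "-l"],
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return sorted(set(result.stdout.split()))
```

Feeding the text of each pair from extract_posts into misspelled_words gives a list of suspect words per post, which is the raw material for a report.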

It worked nicely for me, even though it’s pretty crude and simple. I only had to use this once, so I don’t see the point of fine-tuning it a lot further.

However, this could be handy for others who want to do the same thing, so I cleaned it up a bit, slapped a readme file on it and posted it on GitHub as the WordPress full site spell checker.
Check it out if you want to spell check your WordPress blog in a single run; maybe it will be good enough to get your job done. You’ll find more info on how to set it up and use it on the GitHub page.

That very basic report I was talking about.

Categories
blog tips wordpress

how to exclude yourself from WordPress analytics

I use a number of analytics tools to see how few hits I get a month, and one of the things that annoyed me is that my own visits, made while I’m writing posts or looking up older ones, also get counted. There’s a silly trick to avoid this, and it’s so easy it’s stupid I didn’t think of it before.

WordPress has these widgets in the Appearance menu which make it easy to put all sorts of components in your sidebar and footer. I also use the Text Widget to insert snippets of custom JavaScript code in my pages, things like analytics tracker code for example.

To exclude yourself from those stats all you need to do is make sure that code doesn’t get included when you are browsing your own site. Here’s how it works.

  1. Put your web analytics script code in a sidebar text widget. Leave the title empty if you don’t want anything to show up.
  2. Click the “Visibility” button at the bottom of the widget panel.
  3. In the options, choose “Hide” if: “User” is “Logged in”.
  4. Save.

You can set visibility options on WordPress sidebar widgets.

That’s all.
The cool thing is that this works with any analytics tool (or any other custom JavaScript code you want to exclude yourself from) without having to figure out whether the tool has any support for that itself.

Categories
geek hosting internet security wordpress

how to secure your wordpress blog

Carcassonne castle wall

WordPress is popular and as it goes with all kinds of popular software, it becomes a target for hackers trying to take over and use your site to send spam into the world, or just cause some other kind of mayhem.

To protect yourself from this kind of trouble, there are a few things you can do to prevent bad things from happening to your precious WordPress site.

  1. First, keep your WP software up-to-date. There are usually some security fixes in there and you do want to have those live on your public facing site. Hackers know what the vulnerabilities are in old WP versions and scan the internet automatically for unpatched sites. Don’t become an easy target by not having the latest version of WP installed. The latest version of WP (v3.7.1) is able to do security updates itself which is awesome. Be sure to check if your site supports this and activate it if it does.
  2. Keep your plugins up-to-date as well for the very same reason. Old plugins can offer a way in for hackers, and we don’t want that to happen.
  3. Delete (old) plugins you don’t use anymore, or replace them with newer ones. JetPack has a lot on board out of the box now so you can probably ditch a few old plugins. The fewer plugins you have, the fewer possible vulnerabilities your site has.
  4. Take regular backups. In case something goes wrong, you can at least restore a version you know isn’t compromised.
  5. Harden your WP site by configuring your .htaccess file if your site runs on an Apache web server. It’s explained nicely how to do that in the link. It can prevent hackers who do get access through a bad plugin from doing any more damage to the rest of your site.
  6. Use a long, hard to guess and preferably random password for your admin account. Using a different admin user is also a good idea. Brute force login attempts are made against the default “admin” user, so if that one has a long random password you’re pretty safe there. You can use something easier to remember for an alternative admin account if you want, but I recommend using something like KeePass to manage long & hard to guess passwords anyway.
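For the last tip, a password manager like KeePass can generate passwords for you. If you’d rather script it, here is a minimal sketch in Python using the standard secrets module. The function name random_password and the default length of 32 are arbitrary choices for illustration.

```python
import secrets
import string


def random_password(length=32):
    """Build a password from letters, digits and punctuation using a secure RNG."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(random_password())
```

The secrets module is meant for this kind of job, unlike the random module, which is not designed for security purposes.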

Here are some plugins that can help with these tips:

  • WordFence scans your site for possible vulnerabilities by checking your installed WP and plugin files against the ones from the official releases. It also helps with the first 2 tips by warning you by email if a plugin or WP itself needs an update. Quite handy.
  • All In One WordPress security & Firewall plugin scans your site settings for security vulnerabilities and helps you get rid of them. It also has a firewall built in.
  • WP security audit log won’t prevent anything, but it keeps track of logins, updates of plugins etc, so that if something weird happens, you can use it to figure out the “when” and “what”.
  • A backup plugin. There are plenty and you should pick one that fits your needs. I’ve used BackUpWordPress for a DB backup only, but it can also back up the files. It emails you with either the zipped backup or a link to download it if it’s too big to stuff in the email. Another good option is UpdraftPlus, which can back up your files & DB to remote storage like Google Drive or Dropbox, among others. Your hoster might also have a full backup feature, which is usually the best option anyway as it will back up more than just your WP site.
  • BruteProtect protects (as it says) against brute force login attempts, a problem a lot of WP blogs had to deal with lately. Next to that you should of course make sure you have a complex password for your admin account.
  • Bad Behavior is mainly a tool to combat spam, but since it scans for incoming malicious requests it can also block the occasional bot looking for vulnerable sites.

For a more extensive guide to securing your WordPress site, also check out this Bloggers Guide to WordPress Security. It’s long and full of great tips and guides covering a wide range of security practices like how to combat spam, CAPTCHAs and setting up HTTPS.

Categories
geek google hosting internet security wordpress

guess who got hacked

Night Work

Let me tell you about that time my site got hacked.

Once upon a time I received this email from Google. Now when Google emails you, you usually pay attention, even if it’s a bot. Those guys know their stuff.
The email told me that my site was possibly hacked because it was suddenly feeding spam when the Google bot was passing by.
The reason why I got this email is because I use the free web master tools from the G btw. That way they know my site has behaved nicely over the years, and when it suddenly started spewing spam, they knew something bad was up.

The scary part is that this only happened when Googlebot was munching my pages. Not when I or any other human passed by with a browser. So in other words, I didn’t have a clue.
Because it was quite the mystery, I checked my web folder and found a few suspicious files and folders in there. Suspicious, because I never put them there.

I found a folder named “coockies”, plus unknown common.php, session.php and coockies.txt files. My .htaccess file was also changed. All php files and the .htaccess had the same timestamp. I compared my complete WP installation with the original installation files to be sure no other files were modified, and none were.
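A comparison like that can be scripted. The sketch below is one possible way to do it in Python, not necessarily how it was done here: it hashes every file in two folders and lists the files that only exist in the live copy or that differ from the clean copy. The function names file_hashes and compare_trees are made up for this example.

```python
import hashlib
from pathlib import Path


def file_hashes(root):
    """Map each file path (relative to root) to the SHA-256 hash of its content."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in root.rglob("*")
        if path.is_file()
    }


def compare_trees(live_dir, clean_dir):
    """Return (added, modified) file lists of live_dir compared to clean_dir."""
    live = file_hashes(live_dir)
    clean = file_hashes(clean_dir)
    added = sorted(set(live) - set(clean))
    modified = sorted(p for p in live.keys() & clean.keys() if live[p] != clean[p])
    return added, modified
```

Calling compare_trees with a download of your site and an unpacked copy of the matching WP release returns the two lists. Expect your own uploads, config file, themes and plugins to show up as added, since those are not part of the official package, so the output still needs a human eye.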

The folder seemed to contain files with file names resembling URIs of my blog posts. The content was unreadable and appeared to be garbage. I’m guessing it was an encoded version of the spam my site was feeding Google.

At first I thought my WP blog was hacked, but the entry point was simply the modified .htaccess file. It contained a few new rewrite rules which checked the user agent of the incoming request, and if that matched any of the major crawlers, it would redirect to the new php files, which would feed the spammy content.
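Rules like that are easy to miss in a long .htaccess file. As a rough illustration, here is a small Python sketch that flags rewrite conditions testing the user agent for a crawler name. The crawler list is incomplete, the function name suspicious_conditions is made up, and the example rules are invented in the style described above; they are not the rules from the actual hack.

```python
import re

# Crawler names to look for; an illustrative, incomplete list.
CRAWLER_HINTS = ("googlebot", "bingbot", "slurp", "yandex", "baiduspider")


def suspicious_conditions(htaccess_text):
    """Return RewriteCond lines that test the user agent for a crawler name."""
    hits = []
    for line in htaccess_text.splitlines():
        stripped = line.strip()
        if not re.match(r"RewriteCond\s", stripped, re.IGNORECASE):
            continue
        lowered = stripped.lower()
        if "%{http_user_agent}" in lowered and any(h in lowered for h in CRAWLER_HINTS):
            hits.append(stripped)
    return hits


# Made-up rules for illustration, not the ones found on the site.
example = """\
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot) [NC]
RewriteRule ^(.*)$ common.php?u=$1 [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule . /index.php [L]
"""

for line in suspicious_conditions(example):
    print(line)
```

A hit is not proof of a hack, since some legitimate setups also look at the user agent, but a condition you never wrote yourself that singles out search engine crawlers deserves a closer look.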

Cleaning up turned out to be rather easy.
I deleted all the new files, restored my old .htaccess file (hurrah for backups) and changed my site passwords just to be sure.

The fishy thing about all this is that I’m still not sure how these files got on my system (hence the password changes). The timestamp on the files seemed to point to the moment I last ran a WP and plugin update on my site. Maybe it was pulled in with a compromised plugin, but there is no way to tell which one it could have been. Another option is a compromised FTP account, but that password was already random before I changed it so that seems unlikely. I still changed it to a random and longer one to be sure.

I also took some extra defensive measures to try to avoid this kind of hack in the future, but that’s for another post.

Photo by Thomas Heylen, cc-licensed.