Categories
geek internet mystuff n3wjack opensource programming software tips tools

an open source web crawler

As a web developer, it’s often handy to crawl one of your sites and see whether any links are dead, or return plain 500 errors because something is broken.

A classic tool for this is Xenu’s Link Sleuth. That tool is old though, no longer updated, and it’s a pure GUI tool. Since I couldn’t find what I wanted as a ready-to-use command line tool, I got down and wrote my own. It took a while, but recently it became functional enough to release as a v1 and open up to the world as an open source tool.

So with this I present *drumroll*, the Sitecrawler command line site crawler (yeah, I know, naming is hard).

What can it do?

  • Crawl a site by following all links in a page. It only crawls internal links and HTML content.
  • Crawl each link only once. No crawling loops please.
  • Export the crawled links to a CSV file, including the referrer links (handy for tracking 404s).
  • Limit crawl time to a fixed number of minutes for large sites.
  • Set the number of parallel jobs to use for crawling.
  • Add a delay to throttle requests on slow sites, or sites with rate limits.
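To give an idea of what "internal links only, each link once" boils down to, here is a minimal Python sketch. It is not the actual Sitecrawler code (that one is .NET), and the names `crawl`, `LinkParser` and the `fetch` callable are made up for illustration. It assumes `fetch` takes a URL and returns a status code plus the page HTML, so the HTTP part can be swapped for whatever you like.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Breadth-first crawl of internal links, requesting each URL once.

    `fetch` is a hypothetical callable: url -> (status_code, html_text).
    Returns a dict {url: (status_code, referrer_url)}.
    """
    host = urlparse(start_url).netloc
    results = {}
    seen = {start_url}  # URLs already queued, so loops cannot happen
    queue = deque([(start_url, None)])

    while queue:
        url, referrer = queue.popleft()
        status, html = fetch(url)
        results[url] = (status, referrer)
        if status != 200:
            continue  # nothing to parse on a 404 or 500

        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop the #fragment part.
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).netloc != host:
                continue  # external link (or mailto: and friends), skip
            if link not in seen:
                seen.add(link)
                queue.append((link, url))

    return results
```

Keeping the referrer next to each status code is what makes a 404 traceable afterwards: you know which page contained the dead link.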

It’s written in .NET 6, so it runs on Windows, Mac and Linux. Check it out on GitHub for more details and downloads. It’s proven useful for me already, so I hope it does the same for you.

Categories
geek mystuff n3wjack opensource software tools

inbox clean up tool update

A while ago, I wrote a command line tool to clean up any IMAP inbox by deleting the oldest emails once the inbox goes over a certain number of emails.
This is handy and has been doing its thing for a long time, but recently I wanted to extend it to also delete emails in a specific timeframe. Let’s say between 23h and 6h of each day.

Say hello to v2.0 of the IMAP Cleanup tool, which now has a slightly modified command line, where you specify whether you want to delete by count or use time as the criterion.

It looks like this if you want to use count (without the login credentials):

.\ImapCleanup.exe count --keep 500

or to use time:

.\ImapCleanup.exe time --from 23:00 --to 6:00

It’s pretty fast in doing its job, so you can run it a number of times sequentially for more specific cleanup jobs.
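One detail worth noting about a window like 23:00 to 6:00 is that it wraps past midnight, so a plain "from <= t <= to" check never matches. The Python sketch below shows one way to handle that. It is an illustration only, not how ImapCleanup does it internally, and the `in_window` name and the choice to include the start time but exclude the end time are my own assumptions.

```python
from datetime import time


def in_window(moment, start, end):
    """Return True if `moment` falls in the daily window [start, end).

    Handles windows that wrap past midnight, such as 23:00 to 06:00.
    Assumption: the start is inclusive and the end is exclusive.
    """
    if start <= end:
        # Ordinary same-day window, e.g. 09:00 to 17:00.
        return start <= moment < end
    # Wrapping window: late evening OR early morning.
    return moment >= start or moment < end


if __name__ == "__main__":
    night_start, night_end = time(23, 0), time(6, 0)
    for t in (time(23, 30), time(2, 0), time(12, 0)):
        print(t, in_window(t, night_start, night_end))
```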

You can find all the details on how to set it up and use it on the GitHub page.

Categories
breakcore dnb gabber hardcore jungle mp3 music mystuff n3wjack

darkstep 022 remix compilation

This was a fun one. A while ago I joined a remix compo on darkstep.org, and got a drum’n’bass/crossbreed kinda track called “Fist Fight” selected for it.

There are more tracks on it by other artists, ranging from jungle and drum’n’bass to speedcore, breakcore and whatever is in between.

The compo is still open, so you can send in your own remix if you like. Download the sample pack and get tracking!

Categories
breakcore gabber hardcore jungle mp3 music mystuff n3wjack

short dark shock 5, a darkstep, breakcore compilation

A while back I ran into a tweet from the Darkstep recordings label, asking for submissions for a compilation album. To apply you had to send in a 1-minute clip, and all the clips would be compiled and mixed into a single album.

Track style could be anything, as long as it’s hard & fast. Darkstep, breakcore, jungle, drum’n’bass, glitch, industrial techno, you name it.

I thought I’d give it a shot in Renoise to get to know the software a bit, and it turned out to be pretty much a hardcore techno track. Not the greatest thing I’ve come up with, but it was good enough.

If you’re in the mood for a very noisy and glitchy mix, this might be right for you.

Categories
geek mystuff n3wjack opensource software tools

automatically delete emails with IMAP Cleanup

Let’s say you have this IoT device like a motion detection camera. Which sends you emails. Emails you keep in a separate IMAP mailbox. Lots of emails. So you want to delete those emails in some automated fashion, because doing that daily is oh so boring (remember, lots of emails).

Wouldn’t it be great if there was some handy command line tool that would delete the oldest emails and keep only 1000 or 500 of them? Well yes, that way I could script that annoying job and run it daily.

I didn’t find anything that already did this. So I figured I’d be able to whip something up in an hour or so using PowerShell, or maybe a small .NET console application using an existing IMAP library.

Well, 3 different IMAP libraries and about 4 hours later I did have something rudimentary that finally did what it was supposed to do. Delete the oldest emails, and leave the most recent 1000 (or whatever number you want) behind. That took longer than expected, so to regain as much time spent on this as possible I threw the ImapCleanup tool on GitHub, including binary downloads. I hope someone else will find this useful as well.

Beware though. This tool deletes emails. Be careful which mailbox you point this at, and make sure you test it in advance on a dummy mailbox. Maybe your email server behaves differently than mine, and important emails get digitally shredded by mistake.
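If you want to sanity check the "keep the newest N" logic before letting anything loose on a real mailbox, the selection step on its own is easy to sketch. The Python below is not the ImapCleanup code, just an illustration: `select_for_deletion` is a made-up name, and it assumes you already have a list of (uid, received date) pairs from whatever IMAP library you use. It only decides what would be deleted and never touches a mailbox.

```python
from datetime import datetime


def select_for_deletion(messages, keep):
    """Pick the messages to delete so only the `keep` newest remain.

    `messages` is a list of (uid, received) tuples, where `received`
    is a datetime. Returns the uids to delete, newest of them first.
    """
    if keep < 0:
        raise ValueError("keep must be zero or more")
    # Sort newest first, then everything past the first `keep` goes.
    newest_first = sorted(messages, key=lambda m: m[1], reverse=True)
    return [uid for uid, _received in newest_first[keep:]]


if __name__ == "__main__":
    inbox = [
        (101, datetime(2023, 1, 1, 8, 0)),
        (102, datetime(2023, 1, 3, 8, 0)),
        (103, datetime(2023, 1, 2, 8, 0)),
    ]
    # Dry run: print what would go, delete nothing.
    print(select_for_deletion(inbox, keep=2))
```

Printing the result first and only then wiring it up to actual deletes is a cheap way to do the dummy mailbox test.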