As a web developer, it’s often handy to crawl one of your sites to see whether any links are broken, or return plain 500 errors because something went wrong.
A classic tool for this is Xenu’s Link Sleuth. That tool is old though: it’s no longer updated and it’s a pure GUI tool. Since I couldn’t find what I wanted as a ready-to-use command line tool, I sat down and wrote my own. It took a while, but recently it became functional enough to release as a v1 and open up to the world as an open source tool.
So hereby I present *drumroll*, the Sitecrawler command line based site crawler (yeah, I know, naming is hard).
What can it do?
Crawl a site by following all links in a page. It only crawls internal links and HTML content.
Crawls links only once. No crawling loops please.
Export the crawled links to a CSV file, including the referrer links (handy for tracking 404s).
Limit crawl time to a fixed number of minutes for large sites.
Set the number of parallel jobs to use for crawling.
Add a delay to throttle requests on slow sites, or sites with rate limits.
It’s written in .NET 6, so it runs on Windows, Mac and Linux. Check it out on GitHub for more details and downloads. It’s proven useful for me already, so I hope it does the same for you.
Sometimes you write this fancy batch script that does a bunch of stuff, and you want to have it print out some status information as it’s doing its thing.
Sometimes this fancy script is doing a lot and there’s going to be a lot of stuff printed, so it would be nice if you could overwrite the previous bit of text. Basically, you want a progress bar or progress indicator of some sort.
There are a few ways of doing this, and one involves manipulating the $Host.UI.RawUI.CursorPosition values. That needs “a lot” of code for something that you really don’t want to write a lot of code for. However, there are also the old-school typewriter control characters, like the carriage return, `r in PowerShell, which does pretty much the same thing.
So this bit of code, for example, prints everything out on a single line, even though it’s doing that a hundred times:
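# Count down from 100 to 0, overwriting the same line each time.
100..0 | ForEach-Object { Write-Host ("`r- Items to process: $_".PadRight(25)) -NoNewline; Start-Sleep -Milliseconds 20 }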
Note the “`r” at the beginning of the line. This will reset the cursor to the beginning of the current line, printing the text behind it over any text already present on that line. Do this in a loop, and you keep writing over the previously printed text.
This also explains the PadRight() statement, which makes sure that there are spaces added to the end of the line to erase any characters left over if the previous line was longer than the current one. This happens a number of times in this case, as we’re counting from 100 to 0. I know there are smarter ways to fix this, but this works just fine right here (KISS).
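One of those smarter ways, as a minimal sketch: remember how long the previous line was and pad the new line just enough to cover it. The Write-StatusLine helper and the $script:lastLength variable are names made up for this illustration, not anything built into PowerShell.

```powershell
# Hypothetical helper: remembers the length of the previous status line
# and pads the new one so it fully covers the old text.
$script:lastLength = 0
function Write-StatusLine([string]$Text) {
    $padded = $Text.PadRight($script:lastLength)
    $script:lastLength = $Text.Length
    Write-Host "`r$padded" -NoNewline
}

# Same countdown as before, without a hard-coded padding width.
100..0 | ForEach-Object { Write-StatusLine "- Items to process: $_" }
Write-Host ""  # Move to the next line when done.
```

The upside is that you no longer have to guess a width like 25; the downside is the extra bit of state to keep around.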
Here’s another example using the CR trick. An actual character based progress bar. Just copy-paste and run it in a shell to see the effect:
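A minimal sketch of such a character-based bar, assuming a simulated workload of 101 steps. The Get-ProgressBarText helper and its 30-character default width are illustration choices, not part of any built-in API.

```powershell
# Hypothetical helper: builds the text of a bar such as [#####-----]  50%.
function Get-ProgressBarText([int]$Percent, [int]$Width = 30) {
    # Number of filled cells, rounded down.
    $filled = [int][math]::Floor([double]($Width * $Percent) / 100)
    $bar = ('#' * $filled) + ('-' * ($Width - $filled))
    # Right-align the percentage so the line always has the same length.
    $label = "$Percent%".PadLeft(4)
    return "[$bar] $label"
}

# Simulated workload: redraw the bar on the same line using `r.
0..100 | ForEach-Object {
    Write-Host "`r$(Get-ProgressBarText $_)" -NoNewline
    Start-Sleep -Milliseconds 20
}
Write-Host ""  # End with a newline so the prompt starts on a fresh line.
```

Because the bar text always has the same length, no extra padding is needed to erase leftovers from the previous draw.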
The following example is a bit more complex. It displays a spinner for longer running operations using a set of characters.
# Animation object to keep state.
$global:animation = @{ Chars = "-\|/"; Ix = 0; Counter = 0 }
# Animate one step every 500 calls. Lower the number for a faster animation.
function Animate() {
    $a = $global:animation
    if (($a.Counter % 500) -eq 0) {
        Write-Host " $($a.Chars[$a.Ix])`r" -NoNewline
        $a.Ix = ($a.Ix + 1) % $a.Chars.Length
    }
    $a.Counter++
}
# Usage example. Call the animation in a loop.
$largeImages = ls *.jpg -r | where { $_.length -gt 100000; animate }
There’s also the official Write-Progress PowerShell cmdlet to show a progress bar on the screen. You might want to check that out too. I’m not a fan of it myself, because it tends to act strange when you scroll in your shell window, but for more complex status updates it can be really handy.
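For comparison, here is a minimal Write-Progress sketch. The workload of 50 items and the activity text are assumptions for the example; swap the sleep for your real work.

```powershell
# Simulated workload of 50 items; replace the sleep with real work.
$total = 50
for ($i = 1; $i -le $total; $i++) {
    $percent = [int](100 * $i / $total)
    Write-Progress -Activity "Processing items" -Status "$i of $total" -PercentComplete $percent
    Start-Sleep -Milliseconds 20
}
# Remove the progress bar once the work is finished.
Write-Progress -Activity "Processing items" -Completed
```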
I hope this helps to make your scripts a bit more informative (or fun) when running long jobs.
Are you using Git a lot from the command line? Isn’t it annoying that you have to open a browser and click your way to the GitHub, GitLab or Azure DevOps repo website to create a pull request or do something else that can’t be done in your shell? To solve that problem, I have a PowerShell function that opens the Git repository straight from your current folder in your shell, in your default browser. It checks the Git config for the origin URL, and opens it automatically. It’s super handy to quickly check the online repo, create pull requests etc.
Copy and paste the code below in a .ps1 script, and you’re set.
function Open-RepoInBrowser
{
    # Read the origin URL from the Git config of the current repository.
    $url = git config --get remote.origin.url
    if ($null -eq $url)
    {
        Write-Warning "No URL found. Make sure you are inside a Git repository that has an origin remote."
        return
    }
    if ($url -like "*git@*")
    {
        # Get the URL from an SSH cloned repo.
        $url = $url -replace 'git@', ''
        $url = "https://" + ($url -replace ':', '/')
    }
    Write-Host "Opening URL $url"
    Start-Process $url
}
# Execute the function
Open-RepoInBrowser
Now navigate into a Git repository folder, run your script, and see that website open up in whatever your default browser is. Note that PowerShell also runs on most Linux versions these days, so nothing is stopping you from using this easy shortcut.
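The SSH-to-HTTPS rewrite in the function above replaces every colon it finds. If you want something a little stricter, here is a minimal sketch that only touches remotes of the form git@host:path and leaves everything else alone. Convert-GitRemoteToHttps is a hypothetical helper name, and the user/repo URLs are placeholders.

```powershell
# Hypothetical helper: converts an scp-style SSH remote (git@host:path)
# to an HTTPS URL; any other value is returned unchanged.
function Convert-GitRemoteToHttps([string]$Remote) {
    if ($Remote -match '^git@([^:]+):(.+)$') {
        # $Matches[1] is the host, $Matches[2] is the repository path.
        return "https://$($Matches[1])/$($Matches[2])"
    }
    return $Remote
}

# Placeholder remotes for illustration.
Convert-GitRemoteToHttps 'git@github.com:user/repo.git'
Convert-GitRemoteToHttps 'https://github.com/user/repo.git'
```

Both calls end up with the same HTTPS URL, so you could feed the result straight into the code that opens the browser.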
Here’s a dumb problem I keep having on my work laptop. For some reason, Visual Studio 2022 shows me a notification that it can’t find the Cascadia fonts, and that a reboot will probably fix the problem.
That’s great and all, but I’m in the middle of something, have a ton of other apps open and really don’t feel like rebooting right now (do we ever?). But being stuck looking at code in an ugly-ass Courier font isn’t what a self-respecting developer feels like doing either, right?
Last time I ran into this, I figured I might as well find the font and see if I couldn’t just reinstall it. VS should pick it up again after a restart. Turns out I was right. No reboot needed, here’s how you reinstall the Cascadia fonts on your machine:
Open the path C:\Windows\Fonts
Look up the Cascadia fonts. There should be two: CascadiaCode.ttf and CascadiaMono.ttf.
For each font file, double click it. A window will open, previewing the font. In the top toolbar, click Install.
Now restart Visual Studio. You’ll see your code represented in a pretty font once again.
Wouldn’t you like to be greeted with some random ASCII art when you open up a new PowerShell command window? I thought so!
Here’s a project just for you. Download the ASCII Art Message of the Day project, link the script in your PowerShell profile and bam!, random ASCII art awesomeness every time you open a shell.
Follow the installation instructions from the readme file, and you are set. You can even customize what color you want to use. I know, it’s fantastic. The random ASCII art comes from asciiart.eu, so check it out if you want to have an idea of what you’ll be getting.