There are many ways to crawl a site; here is one I've found easy. I often need to work with sites offline for one reason or another, and this is how I do it at the command line on Ubuntu with wget:
```
$ wget https://www.andrewjstevens.com \
    --recursive \
    --no-clobber \
    --page-requisites \
    --html-extension \
    --convert-links \
    --domains andrewjstevens.com
```
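Unless you override the output path, wget writes the mirror into a directory named after the host. A quick, optional sanity check once the command finishes (the directory name here assumes that default):

```
# wget's default output directory is named after the host
$ ls www.andrewjstevens.com
$ du -sh www.andrewjstevens.com
```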
- --recursive: Recursively download the entire site.
- --no-clobber: Do not overwrite existing files. Great if you need to interrupt and later resume the download.
- --page-requisites: Grab everything a page needs to render: images, CSS, JS, etc.
- --html-extension: Save files with a .html extension (newer versions of wget call this option --adjust-extension).
- --convert-links: Rewrite the links in downloaded pages so the site works offline (see the sketch after this list for a quick check).
- --domains: Only follow links on the specified domain(s).
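Once the crawl finishes, one easy way to verify that --convert-links did its job is to serve the mirror locally and click around. A minimal sketch, assuming wget's default host-named output directory and the Python 3 that ships with Ubuntu:

```
# serve the mirrored site on http://localhost:8000
$ cd www.andrewjstevens.com
$ python3 -m http.server 8000
```

Opening index.html directly with a file:// URL also works, since the converted links are relative.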