Getting started

SiteCrawler documents are called sessions. You can make a new session each time you want to download a site, but you can also save your sessions for later reuse. It's even possible to pause a session while it's downloading, save it to a file, and continue downloading the remaining portions later.

There are many settings in SiteCrawler, but in many cases, you only need to use a few. First of all, fill in the starting address in the field above the tabs. This is the first page that SiteCrawler downloads. It then follows the links on this page, and downloads those, and so on. This continues until there are no more items left to download.
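If it helps to picture the process, the "follow the links until nothing is left" loop described above can be sketched in a few lines of Python. This is only an illustration, not SiteCrawler's actual code: the function name crawl_order, the links dictionary (a stand-in for the real web), and the breadth-first ordering are all assumptions made for the example.

```python
from collections import deque


def crawl_order(start, links):
    """Return the order in which items would be visited.

    `links` is a hypothetical stand-in for the web: it maps each URL
    to the list of URLs that page links to.
    """
    queue = deque([start])   # items still waiting to be downloaded
    seen = {start}           # items already queued, so nothing is fetched twice
    order = []
    while queue:             # stops when there are no more items left
        url = queue.popleft()
        order.append(url)    # this is where the item would be downloaded
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


# A tiny made-up site with a link back to the start page.
site = {
    "http://example.com/": ["http://example.com/a.html", "http://example.com/b.html"],
    "http://example.com/a.html": ["http://example.com/b.html", "http://example.com/"],
    "http://example.com/b.html": ["http://example.com/c.html"],
}
print(crawl_order("http://example.com/", site))
```

Each item is visited only once, even when several pages link to it, which is why the loop is guaranteed to end.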

You can start the session right after filling in the starting address, but you might want to set some more things first.

You can use the download folder menu (under the General tab) to specify where the downloaded items are placed locally.

The crawling scope is a general rule defining which items are to be downloaded. By default, this is set to Same Directory, which means that only resources (pages, images, etc.) that are located in the same server directory as the starting address, or in directories below it, are downloaded.
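As a rough illustration of the Same Directory rule, here is a small Python check. It is a sketch under stated assumptions, not SiteCrawler's real logic: the function name in_same_directory_scope is made up, and the example assumes that "same server" means the same host name and that the directory is everything in the path up to the last slash.

```python
from urllib.parse import urlsplit


def in_same_directory_scope(start_url, candidate_url):
    """Hypothetical check: is candidate_url in the starting address's
    directory, or in a directory below it, on the same host?"""
    start = urlsplit(start_url)
    cand = urlsplit(candidate_url)
    # Assumption: a different host name is always out of scope.
    if start.netloc != cand.netloc:
        return False
    # Directory of the starting address: the path up to and including
    # the last slash, e.g. "/docs/index.html" -> "/docs/".
    base_dir = start.path.rsplit("/", 1)[0] + "/"
    return (cand.path or "/").startswith(base_dir)


start = "http://example.com/docs/index.html"
print(in_same_directory_scope(start, "http://example.com/docs/img/logo.png"))
print(in_same_directory_scope(start, "http://example.com/about.html"))
```

With a starting address of http://example.com/docs/index.html, the image under /docs/img/ is in scope, while /about.html sits above the starting directory and is not.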

The Folder Structure options let you decide how downloaded files are arranged inside the download folder. The default is Use directories from URL, which means that the download folder is populated with a folder for each host name. Each of those is populated with the downloaded pages arranged in the same directory structure as on the web server.

The second option makes all files go into a single folder (without any sub-folders), which is placed inside the download folder.
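To make the two folder structure options described above more concrete, the following Python sketch maps an address to a local path in each mode. Everything here is hypothetical and only meant as an illustration: the function name local_path, the single folder's name ("files"), and the use of index.html for addresses that end in a slash are assumptions, not documented SiteCrawler behaviour.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def local_path(download_folder, url, use_directories=True, flat_folder="files"):
    """Hypothetical mapping from a URL to a path inside the download folder."""
    parts = urlsplit(url)
    path = parts.path
    if path == "" or path.endswith("/"):
        path += "index.html"  # assumed file name for directory addresses
    if use_directories:
        # One folder per host name, then the server's directory structure.
        return PurePosixPath(download_folder, parts.netloc, path.lstrip("/"))
    # Single folder: keep only the file name, with no sub-folders.
    return PurePosixPath(download_folder, flat_folder, PurePosixPath(path).name)


url = "http://example.com/docs/a.html"
print(local_path("Downloads", url))
print(local_path("Downloads", url, use_directories=False))
```

Note that this sketch does nothing about two files with the same name ending up in the single folder; a real implementation would need some way to handle that.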

When you have set up the session the way you want it, click the Start button in the toolbar to start downloading. While you're downloading, you can pause to change any of the settings. Changes take effect as soon as you resume. You can also save the session for later, and it'll pick up where you left off.

If you want more fine-grained settings, you can read more about rules and restrictions.