What should I use to crawl/download/archive an entire website? It's all simple static pages (no JavaScript), but has lots of links to download small binary files, which I also want to preserve. Any OS -- just want the best tools.

@cancel The few times I've done this in the past, I used wget --mirror with a few tweaked parameters for directory traversal and domain-spanning.
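For reference, a typical invocation along those lines might look like this (the URL is a placeholder, and the exact flags depend on the site):

```shell
# Recursively mirror a static site and keep the assets it links to.
# --mirror           shorthand for -r -N -l inf --no-remove-listing
# --page-requisites  also fetch images/CSS/JS needed to render each page
# --convert-links    rewrite links in saved pages for offline browsing
# --no-parent        don't ascend above the starting directory
# --adjust-extension save pages with an .html extension where appropriate
wget --mirror \
     --page-requisites \
     --convert-links \
     --no-parent \
     --adjust-extension \
     https://example.org/
```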

@blindcoder It only seems to download .html, images, css, etc.


@cancel It can only parse HTML, naturally, but it'll follow all hyperlinks it finds there regardless of the target's file type.


@blindcoder No, it's not downloading .zip files that are linked from .html files.
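Two common reasons wget skips linked binaries like this are a robots.txt rule disallowing the download directory, or the files being served from a different host than the pages. Hedged sketches of both workarounds (hostnames are placeholders):

```shell
# If robots.txt disallows the download directory, tell wget to ignore it:
wget --mirror -e robots=off https://example.org/

# If the .zip files live on a different host, allow spanning
# to that host explicitly instead of staying on the start domain:
wget --mirror --span-hosts \
     --domains=example.org,files.example.org \
     https://example.org/
```

A third thing worth checking is whether an `--accept`/`--reject` list is in effect, since those filter recursive downloads by extension.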

