What should I use to crawl/download/archive an entire website? It's all simple static pages (no JavaScript), but has lots of links to download small binary files, which I also want to preserve. Any OS -- just want the best tools.

@cancel The few times I've done this in the past, I used wget --mirror with a few parameters tweaked for directory traversal and domain spanning.
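(For reference, a sketch of that kind of invocation. The exact "tweaked parameters" aren't given in the thread, so the flags and the example.com host names below are my own assumptions, not @blindcoder's original command:

  # --mirror turns on recursion with unlimited depth and timestamping
  # --convert-links rewrites links so the local copy is browsable offline
  # --page-requisites also fetches the CSS, images, etc. each page needs
  # --no-parent keeps wget from climbing above the starting directory
  wget --mirror --convert-links --page-requisites --no-parent https://example.com/

  # If some linked files live on another host, let wget cross to it explicitly:
  wget --mirror --span-hosts --domains=example.com,files.example.com https://example.com/
)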

@blindcoder It only seems to download .html, images, css, etc.


@cancel Naturally it can only parse links out of HTML, but it will follow every hyperlink it finds there regardless of the linked file's data type.


@blindcoder No, it's not downloading .zip files that are linked from .html files.
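(One common cause of that, a guess on my part rather than anything confirmed in the thread: the site's robots.txt disallows those paths, and wget honours robots.txt by default during recursive downloads. If that's what's happening, something like the following works around it; check the site's terms before ignoring robots.txt:

  # -e robots=off tells wget to ignore robots.txt exclusions for this run
  wget --mirror --convert-links --page-requisites -e robots=off https://example.com/
)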
