Simple and extensible Java crawler.

Visitor

All Visitors run in a multithreaded environment, so they MUST be thread-safe. Not sure whether your visitor is thread-safe? Send us an email on the user list.
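As a rough illustration of what thread-safe means here, the sketch below keeps all shared state in atomic and concurrent containers. Note that SimpleVisitor, CountingVisitor, and the visit(String, String) signature are hypothetical stand-ins invented for this example; they are not the crawler's actual interfaces, which may look quite different.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a visitor contract (an assumption for this
// sketch); the real net.vidageek.crawler interfaces may declare other methods.
interface SimpleVisitor {
    void visit(String url, String content);
}

// All mutable state lives in concurrent/atomic containers, so several
// crawler threads can call visit() at the same time without explicit locks.
final class CountingVisitor implements SimpleVisitor {
    private final AtomicInteger pages = new AtomicInteger();
    private final Set<String> seenUrls = ConcurrentHashMap.newKeySet();

    @Override
    public void visit(String url, String content) {
        // Set.add() on a concurrent key set is atomic: it returns true for
        // exactly one caller per distinct URL, so each URL is counted once.
        if (seenUrls.add(url)) {
            pages.incrementAndGet();
        }
    }

    // Number of distinct URLs visited so far.
    public int pageCount() {
        return pages.get();
    }
}
```

A visitor that kept its count in a plain int field or its URLs in a HashSet would not be safe under the same conditions, because concurrent updates could be lost.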

ContentVisitor

The only code you'll usually need to write is an implementation of net.vidageek.crawler.ContentVisitor. This interface provides two methods:

PageVisitor

PageVisitor is a subinterface of ContentVisitor. Usually, you won't need to implement it yourself, since you can use an already implemented PageVisitor.

Did I mention that you can compose these PageVisitors?
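
One common way to compose visitors is to have one visitor wrap another and decide what to forward. The sketch below shows that general pattern only; UrlVisitor, PrefixFilterVisitor, and RecordingVisitor are hypothetical names made up for this example and are not classes shipped with the crawler.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical minimal visitor contract, for illustration only.
interface UrlVisitor {
    void visit(String url);
}

// Wrapping visitor: forwards only URLs that start with the given prefix.
// It holds no mutable state, so it is thread-safe if its delegate is.
final class PrefixFilterVisitor implements UrlVisitor {
    private final String prefix;
    private final UrlVisitor delegate;

    PrefixFilterVisitor(String prefix, UrlVisitor delegate) {
        this.prefix = prefix;
        this.delegate = delegate;
    }

    @Override
    public void visit(String url) {
        if (url.startsWith(prefix)) {
            delegate.visit(url);
        }
    }
}

// Innermost visitor: records every URL it receives in a synchronized list.
final class RecordingVisitor implements UrlVisitor {
    private final List<String> urls =
            Collections.synchronizedList(new ArrayList<>());

    @Override
    public void visit(String url) {
        urls.add(url);
    }

    // Returns a snapshot copy so callers never iterate the live list.
    public List<String> urls() {
        synchronized (urls) {
            return new ArrayList<>(urls);
        }
    }
}
```

With these hypothetical classes, new PrefixFilterVisitor("http://example.com/", new RecordingVisitor()) would record only URLs under that prefix, and further wrappers could be stacked the same way.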