Sunday, December 2, 2018

Apache Nutch is one of the most complete open source crawlers available for Java.

Highly extensible, highly scalable Web crawler

Nutch is a mature, production-ready Web crawler. Nutch 1.x offers fine-grained configuration and relies on Apache Hadoop™ data structures, which are well suited to batch processing.
