WebSuck crawls the web pages you specify, looking for links and data files. It follows the links and outputs the data files in the format of your choice.
xinabse is a search engine for small to medium-sized sites. It consists of an HTTP spider written in Perl and a templatable frontend in PHP. Keywords and sites are stored in a MySQL database.
YaCy is a distributed web search engine based on a peer-to-peer (P2P) network.
Yahoo BOSS (Build your Own Search Service) is a search API. This PHP 5.3 package can retrieve the results of Web, news, and image searches, and also cache them.