LinuxLinks.com
Newbies What Next ? News Forums Calendar

Search





News Sections
Home
General News (3972/0)
Reviews (626/0)
Press Releases (464/0)
Distributions (187/0)
Software (807/0)
Hardware (522/0)
Security (192/0)
Tutorials (337/0)
Off Topic (180/0)


User Functions
Username:

Password:

Don't have an account yet? Sign up as a New User


Events
There are no upcoming events



Build a Web spider on Linux   
Wednesday, November 15 2006 @ 02:00 PM EST
Contributed by: sde

Web spiders are software agents that traverse the Internet gathering, filtering, and potentially aggregating information for a user. Using common scripting languages and their collection of Web modules, you can easily develop Web spiders. This article shows you how to build spiders and scrapers for Linux® to crawl a Web site and gather information, stock data, in this case

A spider is a program that crawls the Internet in a specific way for a specific purpose. The purpose could be to gather information or to understand the structure and validity of a Web site. Spiders are the basis for modern search engines, such as Google and AltaVista. These spiders automatically retrieve data from the Web and pass it on to other applications that index the contents of the Web site for the best set of search terms.

Similar to a spider, but with more interesting legal questions, is the Web scraper. A scraper is a type of spider that targets specific content from the Web, such as the cost of products or services. One use of the scraper is for competitive pricing, to identify the price of a given product to tailor your price or advertise it accordingly. A scraper can also aggregate data from a number of Web sources and provide that information to a user.

Full article

  [ Views: 1358 ]  


Build a Web spider on Linux | 0 comments | Create New Account
The following comments are owned by whoever posted them. This site is not responsible for what they say.
No user comments.


What's Related
  • Full article
  • More by sde
  • More from Tutorials


  • Story Options
  • Mail Story to a Friend
  • Printable Story Format


  • We have written a range of guides highlighting excellent free books for popular programming languages. Check out the following guides: C, C++, C#, Java, JavaScript, CoffeeScript, HTML, Python, Ruby, Perl, Haskell, PHP, Lisp, R, Prolog, Scala, Scheme, and SQL.

    Built with GeekLog and phpBB
    Comments to the webmaster are welcome
    Copyright 2009 LinuxLinks.com - All rights reserved