DOI:10.1016/S0169-7552(98)00110-X - Corpus ID: 7587743
The Anatomy of a Large-Scale Hypertextual Web Search Engine
@article{Brin1998TheAO,
  title   = {The Anatomy of a Large-Scale Hypertextual Web Search Engine},
  author  = {Sergey Brin and Lawrence Page},
  journal = {Comput. Networks},
  year    = {1998},
  volume  = {30},
  pages   = {107-117},
  url     = {https://api.semanticscholar.org/CorpusID:7587743}
}
- Sergey Brin, Lawrence Page
- Published in Comput. Networks 1 April 1998
- Computer Science
- Comput. Networks
Topics
Web Search Engine, PageRank, Random Surfer, PageRank Algorithm, Large-scale Search Engine, Crawl, World Wide Web Worm, Anchor Text, Google Search Engine, Link Structures
16,460 Citations
A parallel view for search engines
- Graciela Verónica Gil Costa, Andrea Persico, A. M. Printista
- Computer Science
- 2005
This paper describes the cooperative work between the Crawler, the Indexer, and the Searcher, and notes that scalability is a concern both during index construction and during query processing.
Mining the Web's Link Structure
- Soumen Chakrabarti, B. Dom, J. Kleinberg
- Computer Science
- 1999
Clever is a search engine that analyzes hyperlinks to uncover two types of pages: authorities, which provide the best source of information on a given topic, and hubs, which provide collections of links to authorities.
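The hub/authority idea summarized above is the mutual-reinforcement iteration of the HITS algorithm, which Clever builds on. A minimal sketch on a hypothetical toy graph (the graph, iteration count, and normalization choice are illustrative assumptions, not details from the cited paper):

```python
# Toy sketch of the hub/authority iteration behind systems like Clever (HITS).
# The link graph below is a hypothetical example.

def hits(graph, iters=50):
    """graph: dict mapping page -> list of pages it links to."""
    pages = set(graph) | {q for targets in graph.values() for q in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iters):
        # A page's authority score sums the hub scores of pages linking to it.
        auth = {p: sum(hub[q] for q in graph if p in graph.get(q, [])) for p in pages}
        # A page's hub score sums the authority scores of pages it links to.
        hub = {p: sum(auth[q] for q in graph.get(p, [])) for p in pages}
        # Normalize to unit length so the scores do not diverge.
        for d in (auth, hub):
            norm = sum(v * v for v in d.values()) ** 0.5 or 1.0
            for p in d:
                d[p] /= norm
    return hub, auth

links = {"h1": ["a1", "a2"], "h2": ["a1", "a2", "a3"], "a1": [], "a2": [], "a3": []}
hub, auth = hits(links)
# h2 links to every authority, so it earns the highest hub score;
# a1 and a2 are endorsed by both hubs, so they outrank a3 as authorities.
```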
An Analytical Study of Intelligent Parallel Web Crawler
- R. Asaaithambi, V. P. Eswaramurthy
- Computer Science
- 2017
Describes standard crawler architecture and the modules of a proposed system used to access information from the WWW.
A Review of Web Search Engine Applications: InfoSpider, Waco, WebComb and ContentUsageAnts Models
- Taroub Issa
- Computer Science
- 2012
The way in which search engines work is presented, the basic tasks of a search engine are described, and an overview is given of models that have been used to improve them, such as Waco, InfoSpider, ContentUsageAnts, and WebComb.
Crawling the Web: Discovery and Maintenance of Large-Scale Web Data
- Amit Chawla
- Computer Science
- 2013
The basics of crawlers and the commonly used techniques for crawling the web are discussed, along with pseudocode for basic crawling algorithms, their implementation in C with simplified flowcharts, and a tabular comparison study.
Web Search
- Computer Science
- 2008
The purpose of the chapter is to describe the techniques at the core of today's search engines (such as Google, Yahoo!, Microsoft Live Search, or Exalead), that is, mostly keyword search over collections of text documents.
Link-Based Web Analysis: PageRank and HITS Algorithms
Algorithms to improve the performance of vertical search engine spiders were investigated: a breadth-first graph-traversal algorithm with no heuristics to refine the search process, a best-first traversal algorithm guided by a hyperlink-analysis heuristic, and a spreading-activation algorithm that models the Web as a neural network.
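The best-first strategy described above replaces the breadth-first FIFO frontier with a priority queue ordered by a link heuristic. A minimal sketch, where `fetch` and `score` are hypothetical stand-ins for a real page fetcher and a hyperlink-analysis heuristic (e.g. in-link counts or anchor-text matches):

```python
# Sketch of best-first crawling: always expand the highest-scoring frontier URL.
import heapq

def crawl_best_first(seed_urls, fetch, score, limit=100):
    """fetch(url) -> list of out-link URLs; score(url) -> heuristic priority."""
    frontier = [(-score(u), u) for u in seed_urls]  # max-heap via negated scores
    heapq.heapify(frontier)
    seen = set(seed_urls)
    visited = []
    while frontier and len(visited) < limit:
        _, url = heapq.heappop(frontier)
        visited.append(url)
        for out in fetch(url):          # enqueue unseen out-links by priority
            if out not in seen:
                seen.add(out)
                heapq.heappush(frontier, (-score(out), out))
    return visited

# Toy link graph and heuristic (both hypothetical): "b" scores higher than "a",
# so best-first expands it first, unlike a breadth-first crawl from "s".
web = {"s": ["a", "b"], "a": ["c"], "b": [], "c": []}
order = crawl_best_first(["s"], lambda u: web.get(u, []), lambda u: {"b": 2}.get(u, 1))
```

Swapping the heap for a `collections.deque` (append/popleft) recovers plain breadth-first traversal, which is the no-heuristic baseline the study compares against.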
Mining the Web's Link Structure
- Soumen Chakrabarti, Tomkins
- Computer Science
The creation of a hyperlink by the author of a Web page represents an implicit endorsement of the page being pointed to; by mining the collective judgment contained in the set of such endorsements, the Clever system can gain a richer understanding of the relevance and quality of the Web's contents.
The past, present and future of web information retrieval
- Monika Henzinger
- Computer Science, IS&T/SPIE Electronic Imaging
- 2003
An exciting new form of search for the future is query-free search: While a user performs her daily tasks, searches are automatically performed to supply her with information that is relevant to her activity.
Web Crawler Architecture
- Marc Najork
- Computer Science, Encyclopedia of Database Systems
- 2009
In order to crawl a substantial fraction of the “surface web” in a reasonable amount of time, web crawlers must download thousands of pages per second, and are typically distributed over tens or hundreds of computers.
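One standard way to distribute a crawl over tens or hundreds of machines is to partition URLs by host, so each host's politeness state (rate limits, robots.txt) lives on exactly one machine. A minimal sketch of that assignment; the machine count and URLs are hypothetical, and the cited entry does not prescribe this particular scheme:

```python
# Partition crawl work across machines by hashing the URL's host name.
from urllib.parse import urlsplit
import zlib

def assign_machine(url, n_machines):
    """Map a URL to a crawler machine; all URLs of one host map to one machine."""
    host = urlsplit(url).netloc
    # crc32 is stable across runs, unlike Python's salted built-in hash().
    return zlib.crc32(host.encode()) % n_machines

urls = ["http://example.com/a", "http://example.com/b", "http://example.org/x"]
# Both example.com URLs land on the same machine, keeping per-host
# politeness bookkeeping local; example.org may land elsewhere.
```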
28 References
Lycos: design choices in an Internet search service
- M. I. Mauldin
- Computer Science
- 1997
The history and precursors of the Lycos system for collecting, storing, and retrieving information about pages on the Web are outlined and some of the design choices made in building this Web indexer are discussed.
ParaSite: Mining Structural Information on the Web
- Ellen Spertus
- Computer Science, Comput. Networks
- 1997
The Quest for Correct Information on the Web: Hyper Search Engines
- M. Marchiori
- Computer Science, Comput. Networks
- 1997
Search Engines for the World Wide Web: A Comparative Study and Evaluation Methodology
- Hao-Hua Chu, M. Rosenthal
- Computer Science
- 1996
The authors of this study found that Alta Vista outperformed Excite and Lycos in both search facilities and retrieval performance although Lycos had the largest coverage of Web resources among the three Web search engines examined.
GENVL and WWWW: Tools for taming the Web
- O. McBryan
- Computer Science, WWW Spring 1994
- 1994
HyPursuit: a hierarchical network search engine that exploits content-link hypertext clustering
- Ron Weiss, B. Vélez, David Gifford
- Computer Science
- 1996
Experience with HyPursuit suggests that abstraction functions based on hypertext clustering can be used to construct meaningful and scalable cluster hierarchies, and preliminary results on clustering over both document contents and hyperlink structures are encouraging.
Queries and computation on the web
- S. Abiteboul, V. Vianu
- Computer Science
- 1997
Surprisingly, stratified and well-founded semantics for negation turn out to have basic shortcomings in this context, while inflationary semantics emerges as an appealing alternative.
The Effectiveness of GlOSS for the Text Database Discovery Problem
- L. Gravano, H. Garcia-Molina, A. Tomasic
- Computer Science, SIGMOD Conference
- 1994
A practical solution based on estimating the result size of a query against a database is presented, and GlOSS (the Glossary of Servers Server) is evaluated on a trace of real user queries.