To: MIKE REDDERT who wrote (15154)
From: Still Rolling
Date: 9/7/1999 10:18:00 AM
Sorry if this has already been posted:

> Excite Enlarging Index, Partnered With LookSmart
>
> In August, Excite began the first phase of an ambitious plan to enlarge
> its search index to 250 million web pages and improve the relevancy of
> its search results. The search engine also debuted new LookSmart-powered
> directory listings.
>
> Under its new indexing system, which has been in the works for the past
> year and a half, Excite plans to visit 500 million or more pages across
> the web on a regular basis. It will then retain only those pages that it
> determines are most popular, or which offer the best quality
> information, or which seem to satisfy the queries its users make.
>
> This "visit many, keep some" approach is how Excite hopes to expand its
> index coverage without simultaneously overwhelming users with irrelevant
> or off-topic documents.
>
> "We don't think just adding more content will do the job for us," said
> Kris Carpenter, Excite's Director of Search Products. "We view that as
> our number one challenge, understanding what's out there and producing
> that top quality content in the first two pages of results."
>
> Excite is using a number of "off-the-page" criteria to determine both
> which pages to retain in its index and how to rank those pages in
> response to queries. By off-the-page, I mean factors that are not tied
> to what's on the page itself.
>
> For instance, search engines have traditionally ranked pages by criteria
> such as where and how often search terms appear in them. Since these
> factors happen "on-the-page," webmasters could make changes to their
> pages to try to increase rankings.
>
> In contrast, off-the-page criteria are those not directly in a
> webmaster's control. A good example is link popularity. It is very
> difficult for a webmaster to try to outwit a good system that uses link
> popularity as a ranking criterion.
> That's because such a system leverages
> information from across the web, which a single webmaster cannot
> control.
>
> Excite has long made use of link popularity, and that criterion is now
> being given heavier weight in its new system. Some have also noticed
> that Excite has been measuring clickthrough from its results. Carpenter
> said Excite has experimented with using this data to influence
> rankings, but that it is not currently being used as part of its
> relevancy system.
>
> Excite is also using another set of off-the-page information that I
> can't disclose publicly. I can say that it is unique among the major
> search engines in using this type of information, and that it would
> seemingly offer yet another way of getting the best information to the
> top of search results lists. Of course, the proof will be whether
> relevancy actually does improve in the long term.
>
> Each of these off-the-page criteria is weighted differently, but term
> frequency and location still come into play. In general, the mixture
> should work to reward sites with good content or that at least somehow
> distinguish themselves online.
>
> One big plus to the expanded Excite index will be that good pages should
> no longer suddenly disappear from the service for no apparent reason.
> This problem has plagued Excite over the past year. It would constantly
> drop pages out of its index to make room for new finds. As a result,
> webmasters with good representation in Excite might suddenly find all
> their pages gone. Similarly, this had an adverse impact on searchers,
> because pages that were satisfying their queries one week might no
> longer be present the next. With the new system, pages that are deemed
> popular or high quality in some way should be retained.
>
> So when does all this happen?
> Excite says it is currently at about 113
> million web pages indexed, and that it will increase the volume of
> pages indexed at an average rate of over a million pages per day. It
> is also introducing a new system meant to revisit pages based on how
> often they change, in order to keep the entire index as fresh as
> possible.
>
> In addition to crawling the web, Excite has also maintained a
> human-compiled directory of web sites. As at Yahoo, this is where sites
> have been reviewed by editors and organized into categories. A new deal
> struck in August means that this web directory will now be produced by
> LookSmart. In fact, LookSmart's information has already been integrated
> into Excite.
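
For anyone wondering what the "visit many, keep some" approach in the quoted article might look like in practice, here is a minimal Python sketch. Excite has not published its method, so everything here is an assumption made for illustration: the names `CrawledPage`, `retention_score` and `keep_some`, the two signals (inbound link count and a 0-to-1 quality estimate), and the weights are all hypothetical.

```python
import heapq
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CrawledPage:
    url: str
    inbound_links: int  # hypothetical link-popularity signal
    quality: float      # hypothetical quality estimate in [0, 1]


def retention_score(page, link_weight=0.7, quality_weight=0.3):
    """Blend the two signals into one score (weights are illustrative only)."""
    # log1p dampens very large link counts so a single signal does not dominate.
    return link_weight * math.log1p(page.inbound_links) + quality_weight * page.quality


def keep_some(visited, capacity):
    """Visit many, keep some: retain the `capacity` highest-scoring pages."""
    return heapq.nlargest(capacity, visited, key=retention_score)


visited = [
    CrawledPage("http://example.com/a", inbound_links=100, quality=0.5),
    CrawledPage("http://example.com/b", inbound_links=0, quality=0.9),
    CrawledPage("http://example.com/c", inbound_links=10, quality=0.2),
]
kept = keep_some(visited, capacity=2)
print([page.url for page in kept])
```

The point of the sketch is only that the crawler can visit far more pages than the index holds, with the retained set chosen by signals a single webmaster cannot easily manipulate. A real system would presumably add other signals, such as how well a page satisfies user queries.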