Google's Fresh Crawl
by Phil Craven
IMPORTANT
As of the Dance/Update that began in mid-May 2003, Google has changed the way that it does its Main and Fresh crawls, and its process of incorporating new pages into the index. Until the new processes are better understood, the information below should be ignored.
Google's Fresh Crawl explained
Google does two types of crawl: the main crawl and the fresh crawl. The main crawl is done once a month; the fresh crawl is done more-or-less daily, but only some pages are crawled. Google is still experimenting with which sites and pages to crawl and how deep to crawl. Neither type of crawl puts any new pages into Google's main index. That only happens at the next update, at the conclusion of the next Google Dance. Fresh crawls can be distinguished from main crawls by the IP addresses used by Googlebot. Fresh crawl: 64.68.82...; Main crawl: 216.239.46...
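As a rough illustration of telling the two crawls apart in a server log, here is a minimal Python sketch. It assumes the two address ranges quoted above can be treated as plain string prefixes, and that each log line starts with the visitor's IP address and mentions "Googlebot" somewhere in it. The function names and the sample log lines are hypothetical, and the address ranges are only those observed at the time of writing.

```python
# Address prefixes as quoted in the article; treated here as plain string prefixes.
FRESH_PREFIX = "64.68.82."
MAIN_PREFIX = "216.239.46."


def classify_googlebot_ip(ip):
    """Return 'fresh', 'main' or 'unknown' for a visitor IP address."""
    if ip.startswith(FRESH_PREFIX):
        return "fresh"
    if ip.startswith(MAIN_PREFIX):
        return "main"
    return "unknown"


def count_crawls(log_lines):
    """Count Googlebot visits by crawl type.

    Assumes the client IP is the first space-separated field of each line
    (as in the common log format) and skips lines that don't mention Googlebot.
    """
    counts = {"fresh": 0, "main": 0, "unknown": 0}
    for line in log_lines:
        if "Googlebot" not in line:
            continue
        ip = line.split(" ", 1)[0]
        counts[classify_googlebot_ip(ip)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    sample = [
        '64.68.82.10 - - "GET /index.html" "Googlebot/2.1"',
        '216.239.46.20 - - "GET /about.html" "Googlebot/2.1"',
        '10.0.0.5 - - "GET /index.html" "Mozilla/4.0"',
    ]
    print(count_crawls(sample))
```

Run over a month's log, a count like this would show how often each type of crawl visited the site.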
The fresh crawl recrawls pages that are already in the index, picking up new pages along the way. Fresh-crawled new pages are evaluated in some way and inserted into the search results straight away, which means that new pages can be found by surfers almost immediately, even though they are not yet in Google's main index. A new page can be added to a site today and traffic could start arriving on it within hours.
Also, updated pages that are already in Google's main index are re-evaluated in some way and inserted into the search results in places that reflect the changes. For example, the day after the link to this site's SEO Copywriting page was placed on the index page, the index page showed up at #3 for the search term "seo copywriting". The index page was well established in Google's main index, but the SEO copywriting part of it was new, and was given the "fresh" treatment. Very soon after that, the SEO copywriting page itself was 'fresh' ranked at #1.
This is good news for surfers and webmasters, although some websites can suffer for a while due to fresh-crawled new pages pushing them down the rankings.
In practice, many new fresh-crawled pages enjoy a flurry of traffic while they are not in the main index. When they have been included in the main index, they take their place in the rankings according to their evaluated merit, and the traffic tends to be reduced unless the page actually merits its 'fresh' ranking, of course.
At the time of writing, the fresh crawl is still new, but my theory of the experience of a new page is this:
Sometime during a month, the new page is found by Google and fresh-crawled. It is evaluated in some way and placed in a 'fresh' index. From there it is inserted into the rankings, according to its 'fresh' evaluation.
The page is involved in the next end-of-month dance but, because it hasn't yet been main-crawled, it isn't included in the actual update and isn't placed in the main index. It continues to be a 'fresh' page.
Then the main crawl gets underway. If the page still exists, it is crawled and will be included in the following update, when it will enter the main index. During this period, it may keep the 'fresh' ranking that it achieved, provided that other new pages don't come along to push it down. It is only after the page enters the main index that its true ranking is seen.
Because of the page's revised evaluation when entering the main index, traffic to it is likely to drop. That's assuming that the page didn't really merit its 'fresh' ranking.
It should be noted that Google is continually updating the rankings and 'fresh' rankings are very volatile in that they come, go and change during a page's 'fresh' period.
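The sequence described above can be sketched as a small state machine. This is only a model of my theory, not of anything Google has confirmed. The stage names, the event names ("fresh_crawl", "main_crawl", "update") and the rule that any other event leaves the page where it is are all simplifying assumptions made for illustration.

```python
from enum import Enum


class Stage(Enum):
    NOT_FOUND = "not yet found by Google"
    FRESH = "fresh-crawled and ranked from the 'fresh' index"
    FRESH_AFTER_UPDATE = "been through an update, but still only 'fresh'"
    MAIN_CRAWLED = "main-crawled, waiting for the following update"
    MAIN_INDEX = "in the main index with its true ranking"


# (current stage, event) -> next stage, following the steps in the theory.
TRANSITIONS = {
    (Stage.NOT_FOUND, "fresh_crawl"): Stage.FRESH,
    (Stage.FRESH, "update"): Stage.FRESH_AFTER_UPDATE,
    (Stage.FRESH_AFTER_UPDATE, "main_crawl"): Stage.MAIN_CRAWLED,
    (Stage.MAIN_CRAWLED, "update"): Stage.MAIN_INDEX,
}


def advance(stage, event):
    """Return the next stage; events the theory doesn't cover change nothing."""
    return TRANSITIONS.get((stage, event), stage)


def replay(events):
    """Run a new page through a sequence of events, starting from NOT_FOUND."""
    stage = Stage.NOT_FOUND
    for event in events:
        stage = advance(stage, event)
    return stage


if __name__ == "__main__":
    history = ["fresh_crawl", "fresh_crawl", "update", "main_crawl", "update"]
    print(replay(history).value)
```

The point the model makes is that a new page needs two updates, with a main crawl between them, before it reaches the main index; repeated fresh crawls along the way don't move it forward.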
As I said, the fresh crawl is still quite new and not yet fully understood. The experience of a new page from fresh crawl to main index is what I believe I have observed, but my conclusions could easily be wrong. The reason I believe that new pages don't enter the main index until the dance and update after they have been main-crawled, even though they have usually been involved in one dance, is that Google still shows no links to them until after the update following their first main crawl. This is my theory of a new page's experience but, like any theory, it may need to be revised in the light of new observations.
Addendum
As of the New Year 2003 update, Google is applying Toolbar PR0 (zero PageRank) values to some new pages. PR0 normally indicates that a page has been penalized, but these PR0s are not penalties. From my observations, it appears that the values apply to pages that have been fresh-crawled and have gone through an update following the fresh-crawl. Such pages don't get into the main index until after they have been main-crawled and gone through the update after that. It appears that, between the two updates, Google applies PR0 to the pages.
The reason for it may be to do with how Google inserts 'fresh' pages into the rankings or it may be for some other reason entirely. Also, it may be that different PR values are applied to different pages, but it is brand new and, as yet, I have seen only PR0 values applied.

