Friday, October 30, 2009
Google uses numerous sources to find new webpages, from links we find on the web to
submitted URLs. We aim to
discover new pages quickly so that users can find new content in Google search results soon after
it goes live. We recently launched a feature that uses RSS and Atom feeds for the discovery of
new webpages.
RSS/Atom feeds have been very popular in recent years as a mechanism for content publication.
They allow readers to check for new content from publishers. Using feeds for discovery allows us
to get these new pages into our index more quickly than traditional crawling methods. We may use
many sources to access updates from feeds, including Reader, notification services, and
direct crawls of the feeds themselves. Going forward, we might also explore mechanisms such as
PubSubHubbub
to identify updated items.
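To make the discovery idea concrete, here is a minimal sketch (not Google's actual pipeline) of how a crawler might pull new page URLs out of an Atom feed. The feed contents and example.com URLs are made up for illustration:

```python
import xml.etree.ElementTree as ET

# A minimal Atom feed; the site and entry URLs are hypothetical.
FEED = """<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Blog</title>
  <updated>2009-10-30T00:00:00Z</updated>
  <entry>
    <title>New post</title>
    <link href="https://www.example.com/posts/new-post"/>
    <updated>2009-10-30T00:00:00Z</updated>
  </entry>
</feed>"""

ATOM = "{http://www.w3.org/2005/Atom}"

# Collect the URL and timestamp of each entry; a crawler could compare
# these against entries it has already seen to spot brand-new pages.
root = ET.fromstring(FEED)
for entry in root.iter(ATOM + "entry"):
    href = entry.find(ATOM + "link").attrib["href"]
    updated = entry.find(ATOM + "updated").text
    print(updated, href)
```

Because the feed lists each new URL directly, there is no need to recrawl the whole site looking for fresh links.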
In order for us to use your RSS/Atom feeds for discovery, it's important that crawling these files
is not disallowed by your robots.txt.
To find out if Googlebot can crawl your feeds and find your pages as fast as possible, test your
feed URLs with the
robots.txt tester in Google Webmaster Tools.
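If you'd like to check locally first, Python's standard-library robot parser evaluates rules the same way. A small sketch, with hypothetical robots.txt contents and feed URL:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; substitute your site's actual file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Check whether Googlebot is allowed to fetch the feed URL.
feed_url = "https://www.example.com/feeds/posts/default"
print(parser.can_fetch("Googlebot", feed_url))  # True: /feeds/ is not disallowed
```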
Written by Raymond Lo, Guhan Viswanathan, and Dave Weissman, Crawl and Indexing Team
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Missing the information I need","missingTheInformationINeed","thumb-down"],["Too complicated / too many steps","tooComplicatedTooManySteps","thumb-down"],["Out of date","outOfDate","thumb-down"],["Samples / code issue","samplesCodeIssue","thumb-down"],["Other","otherDown","thumb-down"]],[],[[["\u003cp\u003eGoogle is now utilizing RSS and Atom feeds to discover and index new webpages more quickly.\u003c/p\u003e\n"],["\u003cp\u003eThis approach enables Google to add new content to search results faster than traditional crawling methods.\u003c/p\u003e\n"],["\u003cp\u003eWebsite owners should ensure their robots.txt file allows Googlebot to crawl their RSS/Atom feeds for optimal indexing.\u003c/p\u003e\n"],["\u003cp\u003eGoogle may use various sources to access feed updates, including Reader, notification services, or direct crawls.\u003c/p\u003e\n"]]],["Google uses various sources, including submitted URLs, to find new webpages. A new feature utilizes RSS and Atom feeds to expedite the discovery and indexing of fresh content. Feeds are accessed through methods like direct crawls or notification services. Ensuring that feed crawling isn't blocked by `robots.txt` is crucial for fast indexing. Webmasters can test feed URLs via the `robots.txt` tester tool to confirm accessibility. PubSubHubbub is a future technology they may also explore to identify updated items.\n"],null,["| It's been a while since we published this blog post. Some of the information may be outdated (for example, some images may be missing, and some links may not work anymore).\n\nFriday, October 30, 2009\n\n\nGoogle uses numerous sources to find new webpages, from links we find on the web to\n[submitted URLs](https://www.google.com/addurl/). We aim to\ndiscover new pages quickly so that users can find new content in Google search results soon after\nthey go live. We recently launched a feature that uses RSS and Atom feeds for the discovery of\nnew webpages.\n\n\nRSS/Atom feeds have been very popular in recent years as a mechanism for content publication.\nThey allow readers to check for new content from publishers. Using feeds for discovery allows us\nto get these new pages into our index more quickly than traditional crawling methods. We may use\nmany potential sources to access updates from feeds including Reader, notification services, or\ndirect crawls of feeds. Going forward, we might also explore mechanisms such as\n[PubSubHubbub](https://code.google.com/p/pubsubhubbub/)\nto identify updated items.\n\n\nIn order for us to use your RSS/Atom feeds for discovery, it's important that crawling these files\nis not disallowed by your [robots.txt](/search/docs/crawling-indexing/robots/intro).\nTo find out if Googlebot can crawl your feeds and find your pages as fast as possible, test your\nfeed URLs with the\n[robots.txt tester in Google Webmaster Tools](https://support.google.com/webmasters/answer/6062598).\n\n\nWritten by Raymond Lo, Guhan Viswanathan, and Dave Weissman, Crawl and Indexing Team"]]