Operating within search engines and SEO can be tricky. With so many methods and theories circulating in industry chatter, it can be a challenge to remember the guidelines set forth by Google, Bing and Yahoo.
Here is a rundown, word for word (taken directly from their respective pages that outline the rules), of the guidelines for each.
We strongly encourage you to pay very close attention to the “Quality Guidelines,” which outline some of the illicit practices that may lead to a site being removed entirely from the Google index or otherwise impacted by an algorithmic or manual spam action. If a site has been affected by a spam action, it may no longer show up in results on Google.com or on any of Google’s partner sites.
Submit a Sitemap using Google Webmaster Tools. Google uses your Sitemap to learn about the structure of your site and to increase our coverage of your webpages.
Make sure all the sites that should know about your pages are aware your site is online.
Design and content guidelines
Make a site with a clear hierarchy and text links. Every page should be reachable from at least one static text link.
Offer a site map to your users with links that point to the important parts of your site. If the site map has an extremely large number of links, you may want to break the site map into multiple pages.
Keep the links on a given page to a reasonable number.
Create a useful, information-rich site, and write pages that clearly and accurately describe your content.
Think about the words users would type to find your pages, and make sure that your site actually includes those words within it.
Try to use text instead of images to display important names, content, or links. The Google crawler doesn’t recognize text contained in images. If you must use images for textual content, consider using the “ALT” attribute to include a few words of descriptive text.
Make sure that your <title> elements and ALT attributes are descriptive and accurate.
Check for broken links and correct HTML.
If you decide to use dynamic pages (i.e., the URL contains a “?” character), be aware that not every search engine spider crawls dynamic pages as well as static pages. It helps to keep the parameters short and the number of them few.
Allow search bots to crawl your sites without session IDs or arguments that track their path through the site. These techniques are useful for tracking individual user behavior, but the access pattern of bots is entirely different. Using these techniques may result in incomplete indexing of your site, as bots may not be able to eliminate URLs that look different but actually point to the same page.
Make sure your web server supports the If-Modified-Since HTTP header. This feature allows your web server to tell Google whether your content has changed since we last crawled your site. Supporting this feature saves you bandwidth and overhead.
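As a sketch of how a server honors this header (the function and values here are illustrative, not any particular framework's API), the decision boils down to comparing the crawler's cached timestamp against the resource's last-modified time:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

def conditional_status(if_modified_since: Optional[str],
                       last_modified: datetime) -> int:
    """Return 304 (Not Modified) when the crawler's cached copy is still
    current, otherwise 200 so the full body is sent again."""
    if not if_modified_since:
        return 200
    try:
        cached_at = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: fall back to a full response
    return 304 if last_modified <= cached_at else 200

# Content last changed Jan 1; the crawler cached it in June.
last_modified = datetime(2015, 1, 1, tzinfo=timezone.utc)
print(conditional_status("Mon, 01 Jun 2015 00:00:00 GMT", last_modified))  # 304
print(conditional_status(None, last_modified))                             # 200
```

Answering 304 lets the crawler skip re-downloading an unchanged page, which is where the bandwidth saving comes from.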
Make use of the robots.txt file on your web server. This file tells crawlers which directories can or cannot be crawled. Make sure it’s current for your site so that you don’t accidentally block the Googlebot crawler. Visit http://code.google.com/web/controlcrawlindex/docs/faq.html to learn how to instruct robots when they visit your site. You can test your robots.txt file to make sure you’re using it correctly with the robots.txt analysis tool available in Google Webmaster Tools.
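A minimal robots.txt might look like the following sketch (the paths and sitemap URL are hypothetical examples, not recommendations for any particular site):

```
# Allow all crawlers, but keep them out of low-value auto-generated pages
User-agent: *
Disallow: /search/
Disallow: /tmp/

# Point crawlers at the sitemap
Sitemap: http://www.example.com/sitemap.xml
```

Note that `Disallow` rules apply to every compliant crawler here because of the `*` wildcard; a misplaced rule would block Googlebot just as readily as any other bot.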
Make reasonable efforts to ensure that advertisements do not affect search engine rankings. For example, Google’s AdSense ads and DoubleClick links are blocked from being crawled by a robots.txt file.
If your company buys a content management system, make sure that the system creates pages and links that search engines can crawl.
Use robots.txt to prevent crawling of search results pages or other auto-generated pages that don’t add much value for users coming from search engines.
Monitor your site’s performance and optimize load times. Google’s goal is to provide users with the most relevant results and a great user experience. Fast sites increase user satisfaction and improve the overall quality of the web (especially for those users with slow Internet connections), and we hope that as webmasters improve their sites, the overall speed of the web will improve. Google strongly recommends that all webmasters regularly monitor site performance using Page Speed, YSlow, WebPagetest, or other tools. For more information, tools, and resources, see Let’s Make The Web Faster. In addition, the Site Performance tool in Webmaster Tools shows the speed of your website as experienced by users around the world.
These quality guidelines cover the most common forms of deceptive or manipulative behavior, but Google may respond negatively to other misleading practices not listed here. It’s not safe to assume that just because a specific deceptive technique isn’t included on this page, Google approves of it. Webmasters who spend their energies upholding the spirit of the basic principles will provide a much better user experience and subsequently enjoy better ranking than those who spend their time looking for loopholes they can exploit.
If you believe that another site is abusing Google’s quality guidelines, please let us know by filing a spam report. Google prefers developing scalable and automated solutions to problems, so we attempt to minimize hand-to-hand spam fighting. While we may not take manual action in response to every report, spam reports are prioritized based on user impact, and in some cases may lead to complete removal of a spammy site from Google’s search results. Not all manual actions result in removal, however. Even in cases where we take action on a reported site, the effects of these actions may not be obvious.
Quality guidelines – basic principles
Make pages primarily for users, not for search engines.
Don’t deceive your users.
Avoid tricks intended to improve search engine rankings. A good rule of thumb is whether you’d feel comfortable explaining what you’ve done to a website that competes with you, or to a Google employee. Another useful test is to ask, “Does this help my users? Would I do this if search engines didn’t exist?”
Think about what makes your website unique, valuable, or engaging. Make your website stand out from others in your field.
These guidelines cover a broad range of topics and are intended to help your content be found and indexed within Bing. These guidelines will not cover every instance, nor provide prescriptive actions specific to every website. For more information, you should read our self-help documents and follow the Bing Webmaster Blog. Inside your Bing Webmaster Tools account, you will find SEO Reports and our SEO Analyzer tool for on-demand scanning of individual pages. Both resources will offer basic guidance around what SEO work can be applied to a given website. Beyond these features and sources, you may wish to consider seeking advice from a third-party expert.
Content is what Bing seeks. By providing clear, deep, easy-to-find content on your website, you make it more likely that we will index and show your content in search results. Websites that are thin on content, show mostly ads, or otherwise quickly redirect visitors away to other sites tend not to rank well. Your content should be easy to navigate to, rich enough to engage visitors and provide the information they seek, and as fresh as possible. In many cases, content produced today will still be relevant years from now. In some cases, however, content produced today will go out of date quickly.
Links help Bing find new content and establish a vote of confidence between websites. The site linking to your content is essentially telling us they trust your content. Bing wants to see links grow organically, and abuses of this, such as buying links or participating in link schemes (link farms, etc.), lead to the value of such links being deprecated. Excessive link manipulation can lead to your site being delisted.
Social media plays a role in today’s effort to rank well in search results. The most obvious part it plays is via influence. If you are influential socially, this leads to your followers sharing your information widely, which in turn results in Bing seeing these positive signals. These positive signals can have an impact on how you rank organically in the long run.
Being indexed is the first step to developing traffic from Bing. The main pathways to being indexed are:
Links to your content help Bing find it, which can lead us to index your content
Use of features within Bing Webmaster Tools such as Submit URL and Sitemap Upload are also ways to ensure we are aware of your content
Managing how Bingbot crawls your content can be done using the Crawl Control feature inside Bing Webmaster Tools. This feature allows you to control when, and at what pace, Bingbot crawls your website. Webmasters are encouraged to allow Bingbot to crawl quickly and deeply to ensure we find and index as much content as possible.
Page Load Time (PLT)
This element has a direct impact on the satisfaction a user has when they visit your website. Slow load times can lead to a visitor simply leaving your website, seeking their information elsewhere. If they came from our search results, that may appear to us to be an unsatisfactory result that we showed. Faster is better, but take care to balance absolute page load speed with a positive, useful user experience.
The robots.txt file is a touch point for Bingbot to understand how to interact with your website and its content. You can tell Bingbot where to go and where not to go, and by doing so guide its efforts to crawl your content. The best practice is to have this file placed at the root of your domain (www.yourwebsite.com/robots.txt) and maintain it to ensure it remains accurate.
This file is very powerful and has the capacity to block Bingbot from crawling your content. Should you block Bingbot, we will not crawl your content and your site or content from your site may not appear in our search results.
The sitemap.xml file often resides at the root of your host, say, www.yourdomain.com/sitemap.xml, and contains a list of all of the URLs from your website. Large sites may wish to create an index file containing links to multiple sitemap.xml documents, each containing URLs from the website. Care should be taken to keep these files as clean as possible, so remove old URLs if you take that content off your website.
Most websites have their sitemap files crawled daily to locate any fresh content. It’s important to keep your sitemap files clean and current to help us find your latest content.
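For illustration, a small sitemap.xml might contain entries like these (the URLs and dates are hypothetical):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.yourdomain.com/</loc>
    <lastmod>2015-06-01</lastmod>
  </url>
  <url>
    <loc>http://www.yourdomain.com/about</loc>
    <lastmod>2015-05-15</lastmod>
  </url>
</urlset>
```

Keeping `<lastmod>` accurate gives crawlers a cheap signal about which URLs are worth revisiting.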
If you move content on your website from one location to another, using a redirect makes sense. It can help preserve value the search engine has assigned to the older URL, helps ensure any bookmarks people have remain useful and keeps visitors to your website engaged with your content. Bing prefers you use a 301 permanent redirect when moving content, should the move be permanent. If the move is temporary, then a 302 temporary redirect will work fine. Do not use the rel=canonical tag in place of a proper redirect.
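As a sketch of the choice between the two redirect types (the URL mappings below are hypothetical; a real site would configure this in its web server or framework rather than application code):

```python
# Hypothetical examples of moved content on one site.
PERMANENT_MOVES = {"/old-page": "/new-page"}       # content moved for good
TEMPORARY_MOVES = {"/sale": "/sale-spring-2015"}   # content moved briefly

def redirect_for(path):
    """Return an (HTTP status, Location) pair for a moved path,
    or None when the path has not moved."""
    if path in PERMANENT_MOVES:
        return 301, PERMANENT_MOVES[path]   # permanent: engines transfer value
    if path in TEMPORARY_MOVES:
        return 302, TEMPORARY_MOVES[path]   # temporary: engines keep the old URL
    return None

print(redirect_for("/old-page"))  # (301, '/new-page')
```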
The rel=canonical element helps us determine which version of a URL is the original, when multiple versions of a URL return the same content. This can happen when, for example, you append a tracking notation to a URL. Two discrete URLs then exist, yet both have identical content. By implementing a rel=canonical, you can tell us which is the original, giving us a hint as to where we should place our trust. Do not use this element in place of a proper redirect when moving content.
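For example, both of these hypothetical URLs return the same content, so the tracked variant can point back at the original from its head section:

```html
<!-- Served at both http://www.example.com/widgets
     and http://www.example.com/widgets?src=newsletter -->
<link rel="canonical" href="http://www.example.com/widgets" />
```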
Search engine optimization is a valid practice that seeks to improve a website, making content easier to find and more relevant. Taken to extremes, some practices can be abused. In the vast majority of instances, this work renders a website more appealing to Bing, though performing SEO-related work is no guarantee of rankings or of receiving traffic from Bing. The main areas of focus when optimizing a website include:
<title> tags – keep these clear and relevant
<meta description> tags – keep these clear and relevant, though use the added space to expand on the <title> tag in a meaningful way
alt attributes – use this attribute on <img> tags to describe the image, so that we can understand the content of the image
<h1> tag – helps users understand the content of a page more clearly when properly used
Internal links – helps create a view of how content inside your website is related. Also helps users navigate easily to related content.
Links to external sources – be careful who you link to as it’s a signal you trust them. The number of links pointing from your page to external locations should be reasonable.
Social sharing – enabling social sharing encourages visitors to share your content with their networks
XML Sitemaps – make sure you have these set up and that you keep them fresh and current
Navigational structure – keep it clean, simple and easy to crawl
Graceful degradation – enable a clean down-level experience so crawlers can see your content
URL structure – avoid using session IDs, &, # and other characters when possible
Robots.txt – often placed at the root of the domain; be careful, as it’s powerful; reference sitemap.xml (or your sitemap-index file) in this document
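Put together, the on-page elements above might look like this (the page content is a hypothetical example):

```html
<head>
  <title>Blue Widgets – Acme Widget Co.</title>
  <meta name="description"
        content="Hand-made blue widgets, with sizing guides and care tips." />
</head>
<body>
  <h1>Blue Widgets</h1>
  <img src="blue-widget.jpg" alt="A hand-made blue widget on a workbench" />
  <a href="/widgets/care">How to care for your widget</a>
</body>
```

The meta description expands on the title rather than repeating it, the single h1 matches the page topic, and the alt text describes what the image actually shows.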
Verify that Bingbot is not disallowed or throttled in robots.txt
Define high crawl rate hours in the Bing Webmaster Tools via the Crawl Control feature.
Verify that Bingbot is not blocked accidentally at the server level by doing a “Fetch as Bingbot”
Webmasters are encouraged to use the Ignore URL Parameters (found under Configure My Site) tool inside Bing Webmaster Tools to help Bingbot understand which URLs are to be indexed and which URLs from a site may be ignored
Links – cross link liberally inside your site between relevant, related content; link to external sites as well
URL structure and keyword usage – keep it clean and keyword rich when possible
Clean URLs – no extraneous parameters (sessions, tracking, etc.)
HTML & XML sitemaps – enable both so users and crawlers can each find what they need – one does not replace the other
Content hierarchy – structure your content to keep valuable content close to the home page
Global navigation – springs from hierarchy planning + style of nav (breadcrumb, link lists, etc.) – helps ensure users can find all your content
Titles – unique, relevant, 65 characters or so long
H1, H2 and other H* tag usage to show content structure on page
Only one <H1> tag per page
ALT attribute usage – helps crawlers understand what is in an image
Keyword usage within the content/text – use the keyword/phrase you are targeting a few times; use variations as well
Anchor text – using targeted keywords as the linked text (anchor text) to support other internal pages
Build based on keyword research – shows you what users are actually looking for
Keep content out of rich media and images – don’t use images to house your text content either
Create enough content to fully meet the visitor’s expectations. There are no hard and fast rules on the number of words per page, but providing more relevant content is usually safe.
Produce new content frequently – crawlers respond to you posting fresh content by visiting more frequently
Make it unique – don’t reuse content from other sources – critical – content must be unique in its final form on your page
Content management – using 301s to reclaim value from retiring content/pages – a 301 redirect can pass some value from the old URL to the new URL
rel=canonical – to help engines understand which page should be indexed and have value attributed to it
404 error page management can help cleanse old pages from search engine indexes; a 404 page should return a 404 code, not a 200 OK code
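As a minimal sketch of that rule (the page inventory is hypothetical; a real site would check its datastore or filesystem), a retired page must answer with a real 404 status rather than a “soft 404” error page served as 200 OK:

```python
# Hypothetical inventory of the pages this site still serves.
LIVE_PAGES = {"/", "/about", "/contact"}

def status_for(path):
    """Return 200 for a live page and 404 for anything else.
    Returning 200 alongside an error page ("soft 404") would stop
    search engines from dropping the retired URL from their index."""
    return 200 if path in LIVE_PAGES else 404

print(status_for("/about"))         # 200
print(status_for("/retired-page"))  # 404
```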
Plan for incoming & outgoing link generation – create a plan around how to build links internally and externally
Internal & external link management – execute by building internal links between related content; consider social media to help build external links, or simply ask websites for them; paying for links is risky
Content selection – planning where to link to – be thoughtful and link to only directly related/relevant items of content internally and externally
Link promotion via social spaces – these can drive direct traffic to you, and help users discover content to link to for you
Managing anchor text properly – carefully plan which actual words will be linked – use targeted keywords wherever possible
Cloaking
This is when you show one version of a webpage to a search crawler like Bingbot, and another to normal visitors.
Link schemes – link farms, three way linking, etc.
Such schemes are intended to inflate the number of links pointed at a website. While they may succeed in increasing the number, they fail to bring quality links to the site, netting no positive gains.
Social media schemes
Like farms are similar to link farms in that they seek to artificially exploit a network effect to game the algorithm. The reality is these are easy to see in action, and their value is deprecated. Auto follows encourage follower growth on social sites such as Twitter. They work by automatically following anyone who follows you. Over time this creates a scenario where the number of people you follow is more or less the same as the number of people following you. This does not indicate you have a strong influence. Following relatively few people while having a high follower count would tend to indicate a stronger influential voice.
Meta refresh redirects
These redirects reside in the code of a website and are programmed for a preset time interval. They automatically redirect a visitor when the time expires, redirecting them to other content. Rather than using meta refresh redirects, we suggest you use a normal 301 redirect.
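For reference, a meta refresh redirect looks like the hypothetical snippet below; the advice above is to replace it with a server-side 301:

```html
<!-- Avoid: client-side timed redirect after 5 seconds -->
<meta http-equiv="refresh" content="5; url=http://www.example.com/new-page" />
```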
Duplicating content across multiple URLs can lead to Bing losing trust in some of those URLs over time. This issue should be managed by fixing the root cause of the problem. The rel=canonical element can also be used, but should be seen as a secondary solution to that of fixing the core problem. If excessive parameterization is causing duplicate content issues, we encourage you to use the Ignore URL Parameters tool.