News and Tutorials from Votre Codeur | SEO | Website Creation | Software Development

URL removal explained, Part I: URLs & directories

Webmaster level: All

There's a lot of content on the Internet these days. At some point, something may turn up online that you would rather not have out there—anything from an inflammatory blog post you regret publishing, to confidential data that accidentally got exposed. In most cases, deleting or restricting access to this content will cause it to naturally drop out of search results after a while. However, if you urgently need to remove unwanted content that has gotten indexed by Google and you can't wait for it to naturally disappear, you can use our URL removal tool to expedite the removal of content from our search results as long as it meets certain criteria (which we'll discuss below).

We've got a series of blog posts lined up for you explaining how to successfully remove various types of content, and common mistakes to avoid. In this first post, I'm going to cover a few basic scenarios: removing a single URL, removing an entire directory or site, and reincluding removed content. I also strongly recommend our previous post on managing what information is available about you online.

Removing a single URL

In general, in order for your removal requests to be successful, the owner of the URL(s) in question—whether that's you, or someone else—must have indicated that it's okay to remove that content. For an individual URL, this can be indicated in any of three ways: by disallowing the URL in the site's robots.txt file, by adding a noindex meta tag to the page, or by having the page return a 404 or 410 status code. Before submitting a removal request, you can check whether the URL is correctly blocked:
  • robots.txt: You can check whether the URL is correctly disallowed using either the Fetch as Googlebot or Test robots.txt features in Webmaster Tools.
  • noindex meta tag: You can use Fetch as Googlebot to make sure the meta tag appears somewhere between the <head> and </head> tags. If you want to check a page you can't verify in Webmaster Tools, you can open the URL in a browser, go to View > Page source, and make sure you see the meta tag between the <head> and </head> tags.
  • 404 / 410 status code: You can use Fetch as Googlebot, or tools like Live HTTP Headers or web-sniffer.net, to verify whether the URL is actually returning the correct code. Sometimes "deleted" pages may say "404" or "Not found" on the page, but actually return a 200 status code in the HTTP response headers; so it's good to use a proper header-checking tool to double-check (see the sketch after this list).
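
For reference, the noindex robots meta tag is a single line inside the <head> section of the page:

   <meta name="robots" content="noindex">

And if you'd rather verify a status code from a script than from a browser extension, here is a minimal sketch in Python (the URL is a placeholder; point it at the page you're checking):

   import urllib.request
   import urllib.error

   url = "http://www.example.com/deleted-page.html"  # placeholder URL
   try:
       response = urllib.request.urlopen(url)
       # A "deleted" page that still returns 200 here is a soft 404
       # and won't qualify for removal.
       print(response.getcode())
   except urllib.error.HTTPError as e:
       print(e.code)  # 404 or 410 means the page qualifies
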
If unwanted content has been removed from a page but the page hasn't been blocked in any of the above ways, you will not be able to completely remove that URL from our search results. This is most common when you don't own the site that's hosting that content. We cover what to do in this situation in Part III of our removals series.

If a URL meets one of the above criteria, you can remove it by going to http://www.google.com/webmasters/tools/removals, entering the URL that you want to remove, and selecting the "Webmaster has already blocked the page" option. Note that you should enter the URL where the content was hosted, not the URL of the Google search where it's appearing. For example, enter
   http://www.example.com/embarrassing-stuff.html
not
   http://www.google.com/search?q=embarrassing+stuff

This article has more details about making sure you're entering the proper URL. Remember that if you don't tell us the exact URL that's troubling you, we won't be able to remove the content you had in mind.

Removing an entire directory or site

In order for a directory or site-wide removal to be successful, the directory or site must be disallowed in the site's robots.txt file. For example, in order to remove the http://www.example.com/secret/ directory, your robots.txt file would need to include:
   User-agent: *
   Disallow: /secret/

It isn't enough for the root of the directory to return a 404 status code, because it's possible for a directory to return a 404 but still serve out files underneath it. Using robots.txt to block a directory (or an entire site) ensures that all the URLs under that directory (or site) are blocked as well. You can test whether a directory has been blocked correctly using either the Fetch as Googlebot or Test robots.txt features in Webmaster Tools.
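
If you'd like to test the same thing from a script, Python's standard library ships a robots.txt parser; here's a small sketch, assuming the robots.txt shown above is already live on your site:

   import urllib.robotparser

   rp = urllib.robotparser.RobotFileParser()
   rp.set_url("http://www.example.com/robots.txt")
   rp.read()

   # Everything under /secret/ should now be disallowed for all user-agents.
   print(rp.can_fetch("*", "http://www.example.com/secret/"))           # False
   print(rp.can_fetch("*", "http://www.example.com/secret/file.html"))  # False
   print(rp.can_fetch("*", "http://www.example.com/public.html"))       # True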

Only verified owners of a site can request removal of an entire site or directory in Webmaster Tools. To request removal of a directory or site, click on the site in question, then go to Site configuration > Crawler access > Remove URL. If you enter the root of your site as the URL you want to remove, you'll be asked to confirm that you want to remove the entire site. If you enter a subdirectory, select the "Remove directory" option from the drop-down menu.

Reincluding content

You can cancel removal requests for any site you own at any time, including requests submitted by other people. In order to do so, you must be a verified owner of the site in Webmaster Tools. Once you've verified ownership, you can go to Site configuration > Crawler access > Remove URL > Removed URLs (or > Made by others) and click "Cancel" next to any requests you wish to cancel.

Still have questions? Stay tuned for the rest of our series on removing content from Google's search results. If you can't wait, much has already been written about URL removals, and troubleshooting individual cases, in our Help Forum. If you still have questions after reading others' experiences, feel free to ask. Note that, in most cases, it's hard to give relevant advice about a particular removal without knowing the site or URL in question. We recommend sharing your URL by using a URL shortening service so that the URL you're concerned about doesn't get indexed as part of your post; some shortening services will even let you disable the shortcut later on, once your question has been resolved.

Edit: Read the rest of this series:
Part II: Removing & updating cached content
Part III: Removing content you don't own
Part IV: Tracking requests, what not to remove

Companion post: Managing what information is available about you online


BlogHer 2007: Building your audience

Last week, I spoke at BlogHer Business about search engine optimization issues. I presented with Elise Bauer, who talked about the power of community in blogging. She made great points about the linking patterns of blogs. Link out to sites that would be relevant and useful for your readers. Comment on blogs that you like to continue the conversation and provide a link back to your blog. Write useful content that other bloggers will want to link to. Blogging connects readers and writers and creates real communities where valuable content can be exchanged. I talked more generally about search and a few things you might consider when developing your site and blog.

Why is search important for a business?
With search, your potential customers are telling you exactly what they are looking for. Search can be a powerful tool to help you deliver content that is relevant and useful and meets your customers' needs. For instance, do keyword research to find out the most common types of searches that are relevant to your brand. Does your audience most often search for "houses for sale" or "real estate"? Check your referrer logs to see what searches are bringing visitors to your site (the Query stats page of webmaster tools lists the most common searches in which your site appears in the results). Does your site include valuable content for those searches? A blog is a great way to add this content. You can write unique, targeted articles that provide exactly what the searcher wanted.

How do search engines index sites?
The first step in the indexing process is discovery. A search engine has to know the pages exist. Search engines generally learn about pages from following links, and this process works great. If you have new pages, ensure relevant sites link to them, and provide links to them from within your site. For instance, if you have a blog for your business, you could provide a link from your main site to the latest blog post. You can also let search engines know about the pages of your site by submitting a Sitemap file. Google, Yahoo!, and Microsoft all support the Sitemaps protocol and if you have a blog, it couldn't be easier! Simply submit your blog's RSS feed. Each time you update your blog and your RSS feed is updated, the search engines can extract the URL of the latest post. This ensures search engines know about the updates right away.
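
To make this concrete, here is a minimal, hypothetical RSS 2.0 feed; the <link> element of each <item> is the URL of the new post that a search engine can extract:

   <?xml version="1.0" encoding="UTF-8"?>
   <rss version="2.0">
     <channel>
       <title>Example Business Blog</title>
       <link>http://www.example.com/blog/</link>
       <description>News and articles from Example Business</description>
       <item>
         <title>Our latest post</title>
         <link>http://www.example.com/blog/latest-post.html</link>
       </item>
     </channel>
   </rss>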

Once a search engine knows about the pages, it has to be able to access those pages. You can use the crawl errors reports in webmaster tools to see if we're having any trouble crawling your site. These reports show you exactly what pages we couldn't crawl, when we tried to crawl them, and what the error was.

Once we access the pages, we extract the content. You want to make sure that what your page is about is represented by text. What does the page look like with JavaScript, Flash, and images turned off in the browser? Use ALT text and descriptive filenames for images. For instance, if your company name is in a graphic, the ALT text should be the company name rather than "logo". Put text in HTML rather than in Flash or images. This not only helps search engines index your content, but also makes your site more accessible to visitors with mobile browsers, screen readers, or older browsers.
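
As a hypothetical example, compare:

   <!-- Descriptive filename and ALT text: -->
   <img src="acme-widgets-logo.png" alt="Acme Widgets">

   <!-- Tells search engines and screen readers almost nothing: -->
   <img src="image1.png" alt="logo">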

What is your site about?
Does each page have unique title and meta description tags that describe the content? Are the words that visitors search for represented in your content? Do a search of your pages for the queries you expect searchers to do most often and make sure that those words do indeed appear in your site. Which of the following tells visitors and search engines what your site is about?

Option 1
If you're plagued by the cliffs of insanity or the pits of despair, sign up for one of our online classes! Learn the meaning of the word inconceivable. Find out the secret to true love overcoming death. Become skilled in hiding your identity with only a mask. And once you graduate, you'll get a peanut. We mean it.

Option 2
See our class schedule here. We provide extensive instruction and valuable gifts upon graduation.
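
In markup, a unique title and meta description for a page like Option 1 might look something like this (a hypothetical example):

   <head>
     <title>Online classes in fencing, disguise, and true love - Example Academy</title>
     <meta name="description" content="Sign up for online classes: learn the meaning of inconceivable, overcome the pits of despair, and earn a peanut on graduation.">
   </head>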

When you link to other pages in your site, ensure that the anchor text (the text used for the link) is descriptive of those pages. For instance, you might link to your products page with the text "Inigo Montoya's sword collection" or "Buttercup's dresses" rather than "products page" or the ever-popular "click here".
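
In HTML, that's the difference between (hypothetical URLs):

   <a href="/products/swords.html">Inigo Montoya's sword collection</a>

and

   <a href="/products/swords.html">click here</a>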

Why are links important?
Links are important for a number of reasons. They are a key way to drive traffic to your site: visitors of other sites can learn about your site through links to it, and you can use links to other sites to provide valuable information to your visitors. And just as links let visitors know about your site, they also let search engines know about it. The anchor text describes what your site is about, and the number of relevant links to your pages is an indicator of how popular and useful those pages are. (You can find a list of the links to your site and the most common anchor text used in those links in webmaster tools.)

A blog is a great way to build links, because it enables you to create new content on a regular basis. The more useful content you have, the greater the chances someone else will find that content valuable to their readers and link to it. Several people at the BlogHer session asked about linking out to other sites. Won't this cause your readers to abandon your site? Won't this cause you to "leak out" your PageRank? No, and no. Readers will appreciate that you are letting them know about resources they might be interested in and will remember you as a valuable source of information (and keep coming back for more!). And PageRank isn't a set of scales, where incoming links are weighted against outgoing ones and cancel each other out. Links are content, just as your words are. You want your site to be as useful to your readers as possible, and providing relevant links is a way, just as writing content is, to do that.

The key is compelling content
Google's main goal is to provide the most useful and relevant search results possible. That's the key thing to keep in mind as you look at optimizing your site. How can you make your site the most useful and relevant result for the queries you care about? This won't just help you in the search results, which, after all, are just a means to an end. What you are really interested in is keeping your visitors happy and coming back. And creating compelling and useful content is the best way to do that.

Using RSS/Atom feeds to discover new URLs

Webmaster Level: Intermediate

Google uses numerous sources to find new webpages, from links we find on the web to submitted URLs. We aim to discover new pages quickly so that users can find new content in Google search results soon after they go live. We recently launched a feature that uses RSS and Atom feeds for the discovery of new webpages.

RSS/Atom feeds have been very popular in recent years as a mechanism for content publication. They allow readers to check for new content from publishers. Using feeds for discovery allows us to get these new pages into our index more quickly than traditional crawling methods. We may use many potential sources to access updates from feeds, including Reader, notification services, and direct crawls of feeds. Going forward, we might also explore mechanisms such as PubSubHubbub to identify updated items.

In order for us to use your RSS/Atom feeds for discovery, it's important that crawling these files is not disallowed by your robots.txt. To find out if Googlebot can crawl your feeds and find your pages as fast as possible, test your feed URLs with the robots.txt tester in Google Webmaster Tools.
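
For example, a robots.txt rule like the following (hypothetical) would keep Googlebot away from any feed published under /feed/, so new posts couldn't be discovered from it:

   User-agent: *
   Disallow: /feed/

If a rule like that matches your feed URL, narrowing or removing the Disallow line restores feed-based discovery.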


Traffic drops and site architecture issues

Webmaster Level: Intermediate

We hear lots of questions about site architecture issues and traffic drops, so it was a pleasure to talk about them in greater detail at SMX London, and I'd like to highlight some key concepts from my presentation here. First off, let's gain a better understanding of drops in traffic, and then we'll take a look at site design and architecture issues.

Understanding drops in traffic

As you know, fluctuations in search results happen all the time; the web is constantly evolving and so is our index. Improvements in our ability to understand our users' interests and queries also often lead to differences in how our algorithms select and rank pages. We realize, however, that such changes might be confusing and sometimes foster misconceptions, so we'd like to address a couple of these myths head-on.

Myth number 1: Duplicate content causes drops in traffic!
Webmasters often wonder if the duplicates on their site can have a negative effect on their site's traffic. As mentioned in our guidelines, unless this duplication is intended to manipulate Google and/or users, the duplication is not a violation of our Webmaster Guidelines. The second part of my presentation illustrates in greater detail how to deal with duplicate content using canonicalization.

Myth number 2: Affiliate programs cause drops in traffic!
Original and compelling content is crucial for a good user experience. If your website participates in affiliate programs, it's essential to consider whether the same content is available in many other places on the web. Affiliate sites with little or no original and compelling content are not likely to rank well in Google search results, but including affiliate links within the context of original and compelling content isn't in itself the sort of thing that leads to traffic drops.

Having reviewed a few of the most common concerns, I'd like to highlight two important sections of the presentation. The first illustrates how malicious attacks -- such as an injection of hidden text and links -- might cause your site to be removed from Google's search results. On a happier note, it also covers how you can use the Google cache and Webmaster Tools to identify this issue. On a related note, if we've found a violation of the Webmaster Guidelines such as the use of hidden text or the presence of malware on your site, you will typically find a note regarding this in your Webmaster Tools Message center.
You may also find your site's traffic decreased if your users are being redirected to another site -- for example, due to a hacker-applied server- or page-level redirection triggered by referrals from search engines. A similar scenario -- but with different results -- is the case in which a hacker has instituted a redirection for crawlers only. While this will cause no immediate drop in traffic, since users and their visits are not affected, it might lead to a decrease in pages indexed over time.
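
One rough way to check for this kind of selective redirection is to request the same page with different user-agents and referrers and compare where you end up. A sketch in Python (the URL is a placeholder, and a real crawler is harder to imitate, so treat this only as a first-pass check):

   import urllib.request

   url = "http://www.example.com/"  # placeholder
   variants = [
       {"User-Agent": "Mozilla/5.0"},  # a regular visitor
       {"User-Agent": "Googlebot/2.1 (+http://www.google.com/bot.html)"},  # a crawler
       {"User-Agent": "Mozilla/5.0",
        "Referer": "http://www.google.com/search?q=test"},  # arriving from a search
   ]
   for headers in variants:
       request = urllib.request.Request(url, headers=headers)
       response = urllib.request.urlopen(request)
       # If the final URL differs between variants, something is redirecting
       # selectively -- a common symptom of a hacked site.
       print(headers, "->", response.geturl())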

Site design and architecture issues
Now that we've seen how malicious changes might affect your site and its traffic, let's examine some design and architecture issues. Specifically, you want to ensure that your site can be both crawled and indexed effectively, which is a prerequisite for being shown in our search results. What should you consider?

  • First off, check that requests for your robots.txt file return the correct status code and not a server error.
  • Keep in mind some best practices when moving to a new site and the new "Change of address" feature recently added to Webmaster Tools.
  • Review the settings of the robots.txt file to make sure no pages -- particularly those rewritten and/or dynamic -- are blocked inappropriately.
  • Finally, make good use of the rel="canonical" attribute to reduce the indexing of duplicate content on your domain; a markup example follows this list. The example in the presentation shows how using this attribute helps Google understand that a duplicate can be clustered with the canonical and that the original, or canonical, page should be indexed.
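
For reference, the rel="canonical" attribute is a single <link> element in the <head> of each duplicate page, pointing at the preferred URL (hypothetical URLs):

   <head>
     <link rel="canonical" href="http://www.example.com/product.html">
   </head>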


In conclusion, remember that fluctuations in search results are normal, but there are steps you can take to avoid malicious attacks or design and architecture issues that might cause your site to disappear or fluctuate unpredictably in search results. Start by learning more about attacks by hackers and spammers, make sure everything is running properly at the crawling and indexing level by double-checking the HTML suggestions in Webmaster Tools, and finally, test your robots.txt file in case you are accidentally blocking Googlebot. And don't forget about those "robots.txt unreachable" errors!
