
This is a topic from the Google Webmaster Central blog:
Duplicate content. There's just something about it. We keep writing about it, and people keep asking about it. In particular, I still hear a lot of webmasters worrying about whether they may have a "duplicate content penalty."
Let's put this to bed once and for all, folks: There's no such thing as a "duplicate content penalty." At least, not in the way most people mean when they say that.
There are some penalties that are related to the idea of having the same content as another site—for example, if you're scraping content from other sites and republishing it, or if you republish content without adding any additional value. These tactics are clearly outlined (and discouraged) in our Webmaster Guidelines:
  • Don't create multiple pages, subdomains, or domains with substantially duplicate content.
  • Avoid... "cookie cutter" approaches such as affiliate programs with little or no original content.
  • If your site participates in an affiliate program, make sure that your site adds value. Provide unique and relevant content that gives users a reason to visit your site first.
(Note that while scraping content from others is discouraged, having others scrape you is a different story; check out this post if you're worried about being scraped.)
But most site owners whom I hear worrying about duplicate content aren't talking about scraping or domain farms; they're talking about things like having multiple URLs on the same domain that point to the same content. Like www.example.com/skates.asp?color=black&brand=riedell and www.example.com/skates.asp?brand=riedell&color=black. Having this type of duplicate content on your site can potentially affect your site's performance, but it doesn't cause penalties. From our article on duplicate content:
Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. If your site suffers from duplicate content issues, and you don't follow the advice listed above, we do a good job of choosing a version of the content to show in our search results.
This type of non-malicious duplication is fairly common, especially since many CMSs don't handle this well by default. So when people say that having this type of duplicate content can affect your site, it's not because you're likely to be penalized; it's simply due to the way that web sites and search engines work.
Most search engines strive for a certain level of variety; they want to show you ten different results on a search results page, not ten different URLs that all have the same content. To this end, Google tries to filter out duplicate documents so that users experience less redundancy. You can find details in this blog post, which states:
  1. When we detect duplicate content, such as through variations caused by URL parameters, we group the duplicate URLs into one cluster.
  2. We select what we think is the "best" URL to represent the cluster in search results.
  3. We then consolidate properties of the URLs in the cluster, such as link popularity, to the representative URL.
Here's how this could affect you as a webmaster:
  • In step 2, Google's idea of what the "best" URL is might not be the same as your idea. If you want to have control over whether www.example.com/skates.asp?color=black&brand=riedell or www.example.com/skates.asp?brand=riedell&color=black gets shown in our search results, you may want to take action to mitigate your duplication. One way of letting us know which URL you prefer is by including the preferred URL in your Sitemap.
  • In step 3, if we aren't able to detect all the duplicates of a particular page, we won't be able to consolidate all of their properties. This may dilute the strength of that content's ranking signals by splitting them across multiple URLs.
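The three steps above can be sketched in Python. Everything here is illustrative, not Google's actual algorithm: the clustering key (sorted query parameters), the "best URL" heuristic (most inbound links), and the link counts are all assumptions.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode, urlunsplit
from collections import defaultdict

def normalize(url):
    """Clustering key: the same page with its query parameters
    sorted into a canonical order."""
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query)))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))

def cluster(urls_with_links):
    """urls_with_links: dict of URL -> inbound-link count.
    Step 1: group duplicate URLs into clusters.
    Step 2: pick a representative ("best") URL per cluster.
    Step 3: consolidate link counts onto the representative."""
    clusters = defaultdict(dict)
    for url, links in urls_with_links.items():
        clusters[normalize(url)][url] = links
    results = {}
    for members in clusters.values():
        representative = max(members, key=members.get)  # most links wins
        results[representative] = sum(members.values())
    return results

urls = {
    "http://www.example.com/skates.asp?color=black&brand=riedell": 12,
    "http://www.example.com/skates.asp?brand=riedell&color=black": 3,
}
print(cluster(urls))
# {'http://www.example.com/skates.asp?color=black&brand=riedell': 15}
```

Note how the two parameter-order variants collapse into one cluster, and the representative URL ends up credited with all 15 links instead of a 12/3 split.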
In most cases Google does a good job of handling this type of duplication. However, you may also want to consider content that's being duplicated across domains. In particular, deciding to build a site whose purpose inherently involves content duplication is something you should think twice about if your business model is going to rely on search traffic, unless you can add a lot of additional value for users. For example, we sometimes hear from Amazon.com affiliates who are having a hard time ranking for content that originates solely from Amazon. Is this because Google wants to stop them from trying to sell Everyone Poops? No; it's because how the heck are they going to outrank Amazon if they're providing the exact same listing? Amazon has a lot of online business authority (most likely more than a typical Amazon affiliate site does), and the average Google search user probably wants the original information on Amazon, unless the affiliate site has added a significant amount of additional value.
Lastly, consider the effect that duplication can have on your site's bandwidth. Duplicated content can lead to inefficient crawling: when Googlebot discovers ten URLs on your site, it has to crawl each of those URLs before it knows whether they contain the same content (and thus before we can group them as described above). The more time and resources that Googlebot spends crawling duplicate content across multiple URLs, the less time it has to get to the rest of your content.
In summary: Having duplicate content can affect your site in a variety of ways; but unless you've been duplicating deliberately, it's unlikely that one of those ways will be a penalty. This means that:
  • You typically don't need to submit a reconsideration request when you're cleaning up innocently duplicated content.
  • If you're a webmaster of beginner-to-intermediate savviness, you probably don't need to put too much energy into worrying about duplicate content, since most search engines have ways of handling it.
  • You can help your fellow webmasters by not perpetuating the myth of duplicate content penalties! The remedies for duplicate content are entirely within your control. Here are some good places to start.
This topic was published in 2013.
If you missed a session you really wanted to see at Google I/O, you'll be happy to know that over 70 of the sessions, as well as the presentation slides, have now been posted.

With 70+ videos at around 60 minutes apiece, that's a lot of lunch breaks one can spend catching up. Being an organizer of the conference, I don't get to sit through many talks myself, so I plan to do just that. Lunch in October, anyone?
Webmaster level: Intermediate to Advanced

Users are taught to protect themselves from malicious programs by installing sophisticated antivirus software, but they also entrust their private information to websites like yours, and it’s important to protect that data. It’s also very important to protect your own data; if you run an online store, you don’t want to be robbed.

Over the years companies and webmasters have learned—often the hard way—that web application security is not a joke; we’ve seen user passwords leaked due to SQL injection attacks, cookies stolen with XSS, and websites taken over by hackers due to negligent input validation.

Today we’ll show you some examples of how a web application can be exploited so you can learn from them; for this we’ll use Gruyere, an intentionally vulnerable application we use for security training internally, too. Do not probe others’ websites for vulnerabilities without permission as it may be perceived as hacking; but you’re welcome—nay, encouraged—to run tests on Gruyere.


Client state manipulation - What will happen if I alter the URL?

Let’s say you have an image hosting site and you’re using a PHP script to display the images users have uploaded:

http://www.example.com/showimage.php?imgloc=/garyillyes/kitten.jpg

So what will the application do if I alter the URL to something like this and userpasswords.txt is an actual file?

http://www.example.com/showimage.php?imgloc=/../../userpasswords.txt

Will I get the content of userpasswords.txt?
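Quite possibly, yes, unless the script validates the path. A minimal sketch of the server-side check such a script should make, in Python (the directory name and function are assumptions; the idea is the same in PHP or any other language): resolve the requested path and refuse anything that escapes the image directory.

```python
import os

# Assumed upload directory; realpath so symlinks don't confuse the check
IMAGE_ROOT = os.path.realpath("/var/www/images")

def safe_image_path(imgloc):
    """Resolve the requested path and reject anything that escapes
    the image directory (e.g. via ../ sequences)."""
    # lstrip so os.path.join doesn't treat "/kitten.jpg" as absolute
    candidate = os.path.realpath(os.path.join(IMAGE_ROOT, imgloc.lstrip("/")))
    if not candidate.startswith(IMAGE_ROOT + os.sep):
        raise ValueError("path escapes image directory")
    return candidate

print(safe_image_path("/garyillyes/kitten.jpg"))   # resolved path, allowed
try:
    safe_image_path("/../../userpasswords.txt")    # traversal attempt
except ValueError as e:
    print("blocked:", e)
```

The key point is comparing the *resolved* path against the allowed root; checking the raw string for `..` is easy to bypass with encodings and mixed separators.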

Another example of client state manipulation is when form fields are not validated. For instance, let’s say you have this form:



It seems that the username of the submitter is stored in a hidden input field. Well, that’s great! Does that mean that if I change the value of that field to another username, I can submit the form as that user? It may very well happen; the user input is apparently not authenticated with, for example, a token which can be verified on the server.
Imagine the situation if that form were part of your shopping cart and I modified the price of a $1000 item to $1, and then placed the order.

Protecting your application against this kind of attack is not easy; take a look at the third part of Gruyere to learn a few tips about how to defend your app.
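One common defense, sketched here in Python, is to sign server-controlled values with an HMAC so tampering is detectable; the field names and signing scheme below are illustrative assumptions, not what Gruyere prescribes.

```python
import hmac
import hashlib

SECRET_KEY = b"server-side secret"  # assumed; never sent to the client

def sign(value):
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()

def hidden_fields(username, price):
    """Render hidden form fields together with a signature the
    client cannot forge without the server's key."""
    payload = f"{username}|{price}"
    return {"username": username, "price": price, "sig": sign(payload)}

def verify(form):
    """On submit, recompute the signature over the submitted values."""
    payload = f"{form['username']}|{form['price']}"
    return hmac.compare_digest(form["sig"], sign(payload))

form = hidden_fields("garyillyes", "1000")
assert verify(form)         # untouched form passes

form["price"] = "1"         # attacker edits the hidden field...
assert not verify(form)     # ...and the signature no longer matches
```

Better still, don't put values like the price in the form at all; look them up server-side from the item ID, and use signed tokens only for state you genuinely must round-trip through the client.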

Cross-site scripting (XSS) - User input can’t be trusted



A simple, harmless URL:
http://google-gruyere.appspot.com/611788451095/%3Cscript%3Ealert('0wn3d')%3C/script%3E
But is it truly harmless? If I decode the percent-encoded characters, I get:
<script>alert('0wn3d')</script>

Gruyere, just like many sites with custom error pages, is designed to include the path component in the HTML page. This can introduce security bugs, like XSS, as it introduces user input directly into the rendered HTML page of the web application. You might say, “It’s just an alert box, so what?” The thing is, if I can inject an alert box, I can most likely inject something else, too, and maybe steal your cookies which I could use to sign in to your site as you.

Another example is when the stored user input isn’t sanitized. Let’s say I write a comment on your blog; the comment is simple:
<a href="javascript:alert('0wn3d')">Click here to see a kitten</a>

If other users click on my innocent link, I have their cookies:



You can learn how to find XSS vulnerabilities in your own web app and how to fix them in the second part of Gruyere; or, if you’re an advanced developer, take a look at the automatic escaping features in template systems we blogged about on our Online Security blog.
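The core fix for the stored-XSS comment above is escaping user input at render time. A minimal Python sketch (the wrapper markup is an assumption):

```python
import html

def render_comment(comment):
    """Escape user input before inserting it into the page, so any
    markup in the comment is displayed as text rather than executed."""
    return '<p class="comment">' + html.escape(comment, quote=True) + "</p>"

evil = "<a href=\"javascript:alert('0wn3d')\">Click here to see a kitten</a>"
print(render_comment(evil))
# The <a> tag arrives inert: &lt;a href=&quot;javascript:alert(&#x27;0wn3d&#x27;)&quot;&gt;...
```

Escaping must match the context (HTML body, attribute, URL, JavaScript), which is exactly why auto-escaping template systems are worth using: they pick the right escaper per context so you can't forget one.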

Cross-site request forgery (XSRF) - Should I trust requests from evil.com?

Oops, a broken picture. It can’t be dangerous (it’s broken, after all), which means the URL of the image returns a 404 or is simply malformed. But is that true in all cases?

No, it’s not! You can specify any URL as an image source, regardless of its content type. It can be an HTML page, a JavaScript file, or some other potentially malicious resource. In this case the image source was a simple page’s URL:



That page will only work if I’m logged in and I have some cookies set. Since I was actually logged in to the application, when the browser tried to fetch the image by accessing the image source URL, it also deleted my first snippet. This doesn’t sound particularly dangerous, but if I’m a bit familiar with the app, I could also invoke a URL which deletes a user’s profile or lets admins grant permissions for other users.

To protect your app against XSRF you should not allow state-changing actions to be called via GET; the POST method was invented for this kind of state-changing request. This change alone may have mitigated the above attack, but usually it’s not enough: you also need to include an unpredictable value in all state-changing requests to prevent XSRF. Please head to Gruyere if you want to learn more about XSRF.
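The POST-plus-token defense can be sketched as follows; the session handling, handler names, and response strings are simplified assumptions for illustration.

```python
import hmac
import secrets

sessions = {}  # session id -> XSRF token (server-side state, assumed)

def issue_token(session_id):
    """Give each session an unpredictable token; the server embeds it
    in every form it renders, so evil.com cannot know it."""
    token = secrets.token_hex(16)
    sessions[session_id] = token
    return token

def handle_delete_snippet(session_id, method, token):
    """State-changing handler: require POST and a matching token."""
    if method != "POST":
        return "405 state changes must use POST"
    if not hmac.compare_digest(sessions.get(session_id, ""), token):
        return "403 bad or missing XSRF token"
    return "200 snippet deleted"

t = issue_token("session-abc")
print(handle_delete_snippet("session-abc", "GET", t))        # rejected: GET
print(handle_delete_snippet("session-abc", "POST", "guess")) # rejected: forged
print(handle_delete_snippet("session-abc", "POST", t))       # accepted
```

The broken-image attack above fails both checks: an `<img>` tag issues a GET, and even a forged POST from another site would not carry the right token, since the attacker's page cannot read it.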

Cross-site script inclusion (XSSI) - All your script are belong to us

Many sites today can dynamically update a page's content via asynchronous JavaScript requests that return JSON data. Sometimes, JSON can contain sensitive data, and if the correct precautions are not in place, it may be possible for an attacker to steal this sensitive information.

Let’s imagine the following scenario: I have created a standard HTML page and send you the link; since you trust me, you visit the link I sent you. The page contains only a few lines:
<script>function _feed(s) {alert("Your private snippet is: " + s['private_snippet']);}</script><script src="http://google-gruyere.appspot.com/611788451095/feed.gtl"></script>


Since you’re signed in to Gruyere and you have a private snippet, you’ll see an alert box on my page informing you about the contents of your snippet. As always, if I managed to fire up an alert box, I can do whatever else I want; in this case it was a simple snippet, but it could have been your biggest secret, too.

It’s not too hard to defend your app against XSSI, but it still requires careful thinking. You can use tokens as explained in the XSRF section, set your script to answer only POST requests, or simply prefix the JSON response so that it is not directly executable as a script.
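The prefix defense can be sketched like this; `)]}'` is a commonly used prefix choice, not one mandated by Gruyere, and the function names are assumptions.

```python
import json

# A prefix that is a JavaScript syntax error, so a cross-site
# <script src=...> include fails before the data is evaluated.
PREFIX = ")]}'\n"

def serve_feed(data):
    """Server side: prefix the JSON so it is not valid JavaScript."""
    return PREFIX + json.dumps(data)

def parse_feed(body):
    """Legitimate same-origin client: it fetches the feed with
    XMLHttpRequest, so it can strip the prefix before parsing.
    A cross-site <script> include has no such opportunity."""
    if not body.startswith(PREFIX):
        raise ValueError("unexpected response")
    return json.loads(body[len(PREFIX):])

body = serve_feed({"private_snippet": "All your base"})
print(parse_feed(body))   # {'private_snippet': 'All your base'}
```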

SQL Injection - Still think user input is safe?

What will happen if I try to sign in to your app with a username like
JohnDoe'; DROP TABLE members;--

While this specific example won’t expose user data, it can cause great headaches because it has the potential to completely remove the SQL table where your app stores information about members.

Generally, you can protect your app from SQL injection with proactive thinking and input validation. First, are you sure the SQL user needs permission to execute "DROP TABLE members"? Wouldn’t it be enough to grant only SELECT rights? By setting the SQL user’s permissions carefully, you can avoid painful experiences and a lot of trouble. You might also want to configure error reporting in such a way that the database and its tables’ names aren’t exposed in the case of a failed query.
Second, as we learned in the XSS case, never trust user input: what looks like a login form to you looks like a potential doorway to an attacker. Always sanitize and quote-safe the input that will be stored in a database, and whenever possible use prepared (parameterized) statements, which are available in most database programming interfaces.
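Here is what a parameterized query looks like in Python with the standard-library sqlite3 module (the table layout is an assumption; other database drivers offer the same placeholder mechanism):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (username TEXT, password_hash TEXT)")
conn.execute("INSERT INTO members VALUES ('JohnDoe', 'x')")

def find_member(conn, username):
    """The ? placeholder sends the value to the driver separately
    from the SQL text, so it is never parsed as SQL."""
    return conn.execute(
        "SELECT username FROM members WHERE username = ?", (username,)
    ).fetchall()

hostile = "JohnDoe'; DROP TABLE members;--"
print(find_member(conn, hostile))    # [] -- treated as a literal, no match
print(find_member(conn, "JohnDoe"))  # [('JohnDoe',)]
# The table survived the injection attempt:
print(conn.execute("SELECT COUNT(*) FROM members").fetchone())  # (1,)
```

Contrast this with building the query via string concatenation, where the hostile input would terminate the string literal and smuggle in the DROP TABLE statement.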

Knowing how web applications can be exploited is the first step in understanding how to defend them. In light of this, we encourage you to take the Gruyere course, take other web security courses from the Google Code University and check out skipfish if you're looking for an automated web application security testing tool. If you have more questions please post them in our Webmaster Help Forum.
