

By Felipe Hoffa, Cloud Platform team

Google BigQuery is designed to make it easy to analyze large amounts of data quickly. Today we announced several updates that give BigQuery the ability to handle arbitrarily large result sets, use window functions for advanced analytics, and cache query results. You are also getting new UI features, larger interactive quotas, and a new convenient tiered pricing scheme. In this post we'll dig further into the technical details of these new features.

Large results

BigQuery is able to process terabytes of data, but until today BigQuery could only output up to 128 MB of compressed data per query. Many of you asked for more and from now on BigQuery will be able to output results as large as the largest tables our customers have ever had.

To get this benefit, you should enable the new "--allow_large_results" flag when issuing a query job, and specify a destination table. All results will be saved to the new specified table (or appended, if the table exists). In the updated web UI these options can be found under the new "Enable Options" menu.
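For API users, the same request could be sketched roughly as follows. This is a minimal sketch, not an official client: the helper name buildLargeResultsJob and the project, dataset, and table names are hypothetical, and the field names (allowLargeResults, destinationTable, writeDisposition) are assumed to match the query job configuration in the API docs, so please verify them there.

```javascript
// Hypothetical helper: builds the body of a query job that writes
// large results to a destination table. Field names are assumptions
// based on the query job configuration; check them against the API docs.
function buildLargeResultsJob(sql, projectId, datasetId, tableId) {
  return {
    configuration: {
      query: {
        query: sql,
        allowLargeResults: true, // API counterpart of --allow_large_results
        destinationTable: { projectId, datasetId, tableId },
        writeDisposition: 'WRITE_APPEND', // append if the table already exists
      },
    },
  };
}

// Example with made-up project, dataset, and table names.
const largeResultsJob = buildLargeResultsJob(
  'SELECT room, data FROM [io_sensor_data.moscone_io13]',
  'my-project',
  'my_dataset',
  'big_results'
);
console.log(JSON.stringify(largeResultsJob, null, 2));
```

The object would then be sent as the body of a job insert request; authentication and the HTTP call are left out of this sketch.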

With this feature, you can run big transformations on your tables, plus get big subsets of data to further analyze from the new table.

Analytic functions

BigQuery's power is in the ability to interactively run aggregate queries over terabytes of data, but sometimes counts and averages are not enough. That's why BigQuery also lets you calculate quantiles, variance and standard deviation, as well as other advanced functions.

To make BigQuery even more powerful, today we are adding support for window functions (also known as "analytical functions") for ranking, percentiles, and relative row navigation. These new functions give you different ways to rank results, explore distributions and percentiles, and traverse results without the need for a self join.

To introduce these functions with an advanced example, let's use the dataset we collected from the Data Sensing Lab at Google I/O. With the percentile_cont() function it's easy to get the median temperature over each room:


SELECT percentile_cont(0.5) OVER (PARTITION BY room ORDER BY data) AS median, room
FROM [io_sensor_data.moscone_io13]
WHERE sensortype='temperature'

In this example, every original data row is returned with the median temperature of its room, so the same median is repeated once per row. To make the result easier to read, it's a good idea to group all results by room with an outer query:


SELECT MAX(median) AS median, room FROM (
  SELECT percentile_cont(0.5) OVER (PARTITION BY room ORDER BY data) AS median, room
  FROM [io_sensor_data.moscone_io13]
  WHERE sensortype='temperature'
)
GROUP BY room

We can add an additional outer query, to rank the rooms according to which one had the coldest median temperature. We'll use one of the new ranking window functions, dense_rank():


SELECT DENSE_RANK() OVER (ORDER BY median) rank, median, room FROM (
  SELECT MAX(median) AS median, room FROM (
    SELECT percentile_cont(0.5) OVER (PARTITION BY room ORDER BY data) AS median, room
    FROM [io_sensor_data.moscone_io13]
    WHERE sensortype='temperature'
  )
  GROUP BY room
)

We've updated the documentation with descriptions and examples for each of the new window functions. Note that they require the OVER() clause, with an optional PARTITION BY and sometimes required ORDER BY arguments. ORDER BY tells the window function what criteria to use to rank items, while PARTITION BY allows you to define multiple groups to be analyzed independently of each other.
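To make the semantics of the queries above more concrete, here is a small sketch in plain JavaScript of what a continuous median per partition and a dense rank compute. It is only an illustration under stated assumptions: the readings are made up, the function names are hypothetical, and this is not how BigQuery evaluates these functions internally.

```javascript
// Continuous percentile with linear interpolation between neighbours.
// Assumes a non-empty array of numbers and 0 <= p <= 1.
function percentileCont(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const pos = p * (sorted.length - 1);
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Dense rank in ascending order: equal values share a rank,
// and the ranks have no gaps.
function denseRank(values) {
  const distinct = [...new Set(values)].sort((a, b) => a - b);
  const rankOf = new Map(distinct.map((v, i) => [v, i + 1]));
  return values.map((v) => rankOf.get(v));
}

// "PARTITION BY room": collect the readings of each room, then take the median.
function medianByRoom(rows) {
  const byRoom = new Map();
  for (const { room, data } of rows) {
    if (!byRoom.has(room)) byRoom.set(room, []);
    byRoom.get(room).push(data);
  }
  return [...byRoom].map(([room, values]) => ({
    room,
    median: percentileCont(values, 0.5),
  }));
}

// Made-up sample readings, not taken from the real dataset.
const sampleReadings = [
  { room: 'A', data: 20 }, { room: 'A', data: 22 },
  { room: 'A', data: 21 }, { room: 'A', data: 23 },
  { room: 'B', data: 19 }, { room: 'B', data: 18 }, { room: 'B', data: 20 },
  { room: 'C', data: 21 }, { room: 'C', data: 22 },
];

const roomMedians = medianByRoom(sampleReadings);
const roomRanks = denseRank(roomMedians.map((m) => m.median));
roomMedians.forEach((m, i) => console.log(roomRanks[i], m.median, m.room));
```

With this sample, rooms A and C tie on their median and therefore share a rank, which is the behavior that distinguishes a dense rank from a plain row number.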

The window functions don't work with the big GROUP EACH BY and JOIN EACH operators, but they do work with the traditional GROUP BY and JOIN. As a reminder, we announced GROUP EACH BY and JOIN EACH last March, to allow large join and group operations.

Query caching

BigQuery now remembers the results of queries you've previously run, saving you the time and cost of recomputing them. To maintain privacy, queries are cached on a per-user basis. Cached results are only returned if the tables haven't changed since the last query and the query does not depend on non-deterministic parameters (such as the current time). Reading cached results is free, but each query still counts against the maximum queries-per-day quota. Query results are kept cached for 24 hours, on a best-effort basis. You can disable query caching by turning off the new --use_cache flag in bq, or "useQueryCache" in the API. This feature is also accessible through the new query options in the BigQuery Web UI.

BigQuery Web UI: Query validator, cost estimator, and abandonment

The BigQuery UI gets even better: You'll get instant information while writing a query if its syntax is valid. If the syntax is not valid, you'll know where the error is. If the syntax is valid, the UI will inform you how much the query would cost to run. This feature is also available with the bq tool and API, using the --dry_run flag.
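For API users, the caching and dry-run options could be set on a query job roughly like this. It is a minimal sketch: buildQueryJob is a hypothetical helper, and the placement of the fields ("useQueryCache" inside the query configuration, "dryRun" next to it) is an assumption to verify against the API docs.

```javascript
// Hypothetical helper: builds a query job body with optional
// cache and dry-run settings. Field placement is an assumption.
function buildQueryJob(sql, options = {}) {
  const { useQueryCache = true, dryRun = false } = options;
  return {
    configuration: {
      dryRun, // validate and estimate only; the query is not executed
      query: {
        query: sql,
        useQueryCache, // set to false to force a fresh computation
      },
    },
  };
}

// Estimate a query without running it and without using cached results.
const estimateJob = buildQueryJob(
  "SELECT room FROM [io_sensor_data.moscone_io13] WHERE sensortype='temperature'",
  { useQueryCache: false, dryRun: true }
);
console.log(JSON.stringify(estimateJob, null, 2));
```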

An additional improvement: previously, when running a query in the UI, you had to wait for it to complete before starting another one. Now you have the option to abandon a running query and start working on the next iteration without waiting for the abandoned one.

Pricing updates

Starting in July, BigQuery pricing becomes more affordable for everyone: Data storage costs are going from $0.12/GB/month to $0.08/GB/month. And if you are a high-volume user, you'll soon be able to opt-in for tiered query pricing, for even better value.

Bigger quota

To support larger workloads we're doubling interactive query quotas for all users, from 200 GB + 1 concurrent query, to 400 GB of concurrent queries + 2 additional queries of unlimited size.

These updates make BigQuery a faster, smarter, and even more affordable solution for ad hoc analysis of extremely large datasets. We expect they'll help to scale your projects, and we hope you'll share your use cases with us on Google+.

The BigQuery UI features a collection of public datasets for you to use when trying out these new features. To get started, visit our sign-up page and Quick Start guide. You should take a look at our API docs, and ask questions about BigQuery development on Stack Overflow. Finally, don't forget to give us feedback and join the discussion on our Cloud Platform Developers Google+ page.

Felipe Hoffa has recently joined the Cloud Platform team. He'd love to see the world's data accessible for everyone in BigQuery.

Posted by Ashleigh Rentz, Editor Emerita


seo Google BigQuery new features: bigger, faster, smarter 2013

Hello everyone, this is a post from the Google Webmaster Central blog:

Webmaster level: All

Google’s Webmaster Team is responsible for most of Google’s informational websites like Google’s Jobs site or Privacy Centers. Maintaining tens of thousands of pages and constantly releasing new Google sites requires more than just passion for the job: it requires quality management.

In this post we won’t talk about all the different tests that can be run to analyze a website; instead we’ll just talk about HTML and CSS validation, and tracking quality over time.

Why does validation matter? There are different perspectives on validation—at Google there are different approaches and priorities too—but the Webmaster Team considers validation a baseline quality attribute. It doesn’t guarantee accessibility, performance, or maintainability, but it reduces the number of possible issues that could arise and in many cases indicates appropriate use of technology.

While paying a lot of attention to validation, we’ve developed a system to use it as a quality metric to measure how we’re doing on our own pages. Here’s what we do: we give each of our pages a score from 0-10 points, where 0 is worst (pages with 10 or more HTML and CSS validation errors) and 10 is best (0 validation errors). We started doing this more than two years ago, first by taking samples, now monitoring all our pages.
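The scoring rule can be sketched in a few lines of JavaScript. Only the two endpoints come from the description above (0 errors scores 10, 10 or more errors scores 0); the linear mapping in between and the function names are assumptions made for illustration.

```javascript
// Score a page from its combined HTML and CSS validation error count.
// Assumption: one point is lost per error, bottoming out at 0.
function validationScore(errorCount) {
  if (!Number.isInteger(errorCount) || errorCount < 0) {
    throw new RangeError('errorCount must be a non-negative integer');
  }
  return Math.max(0, 10 - errorCount);
}

// Average score over a set of pages, for tracking quality over time.
function averageScore(errorCounts) {
  if (errorCounts.length === 0) return NaN;
  const total = errorCounts.reduce((sum, n) => sum + validationScore(n), 0);
  return total / errorCounts.length;
}

// Made-up error counts for four pages.
console.log(averageScore([0, 3, 12, 1]));
```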

Since the beginning we’ve been documenting the validation scores we were calculating so that we could actually see how we’re doing on average and where we’re headed: is our output improving, or is it getting worse?

Here’s what our data say:

Validation score development 2009-2011.

On average there are about three validation issues per page produced by the Webmaster Team (as we combine HTML and CSS validation in the scoring process, information about the origin gets lost), down from about four issues per page two years ago.

This information is valuable for us as it tells us how close we are to our goal of always shipping perfectly valid code, and it also tells us whether we’re on track or not. As you can see, with the exception of the 2nd quarter of 2009 and the 1st quarter of 2010, we are generally observing a positive trend.

What has to be kept in mind are issues with the integrity of the data, i.e. the sample size as well as “false positives” in the validators. We’re working with the W3C in several ways, including reporting and helping to fix issues in the validators; however, as software can never be perfect, sometimes pages get dinged for non-issues: see for example the border-radius issue that has recently been fixed. We know that this is negatively affecting the validation scores we’re determining, but we have no data yet to indicate how much.

Although we track more than just validation for quality control purposes, validation plays an important role in measuring the health of Google’s informational websites.

How do you use validation in your development process?

Posted by Jens O. Meiert, Google Webmaster Team


from web contents: Validation: measuring and tracking code quality 2013


Webmaster level: Intermediate

So you’re going global, and you need your website to follow. Should be a simple case of getting the text translated and you’re good to go, right? Probably not. The Google Webmaster Team frequently builds sites that are localized into over 40 languages, so here are some things that we take into account when launching our pages in both other languages and regions.

(Even if you think you might be immune to these issues because you only offer content in English, it could be that non-English language visitors are using tools like Google Translate to view your content in their language. This traffic should show up in your analytics dashboard, so you can get an idea of how many visitors are not viewing your site in the way it’s intended.)

More languages != more HTML templates

We can’t recommend this enough: reuse the same template for all language versions, and always try to keep the HTML of your template simple.

Keeping the HTML code the same for all languages has its advantages when it comes to maintenance. Hacking around with the HTML code for each language to fix bugs doesn’t scale–keep your page code as clean as possible and deal with any styling issues in the CSS. To name just one benefit of clean code: most translation tools will parse out the translatable content strings from the HTML document and that job is made much easier when the HTML is well-structured and valid.

How long is a piece of string?

If your design relies on text playing nicely with fixed-size elements, then translating your text might wreak havoc. For example, your left-hand side navigation text is likely to translate into much longer strings of text in several languages–check out the difference in string lengths between some English and Dutch language navigation for the same content. Be prepared for navigation titles that might wrap onto more than one line by figuring out your line height to accommodate this (also worth considering when you create your navigation text in English in the first place).

Variable word lengths cause particular issues in form labels and controls. If your form layout displays labels on the left and fields on the right, for example, longer text strings can flow over into two lines, whereas shorter text strings do not seem associated with their form input fields–both scenarios ruin the design and impede the readability of the form. Also consider the extra styling you’ll need for right-to-left (RTL) layouts (more on that later). For these reasons we design forms with labels above fields, for easy readability and styling that will translate well across languages.

Screenshots of Chinese and German versions of web forms


Also avoid fixed-height columns–if you’re attempting to neaten up your layout with box backgrounds that match in height, chances are when your text is translated, the text will overrun areas that were only tall enough to contain your English content. Think about whether the UI elements you’re planning to use in your design will work when there is more or less text–for instance, horizontal vs. vertical tabs.

On the flip side

Source editing for bidirectional HTML can be problematic because many editors have not been built to support the Unicode bidirectional algorithm (more research on the problems and solutions). In short, the way your markup is displayed might get garbled:

<p>ابةتث <img src="foo.jpg" alt=" جحخد"< ذرزسش!</p>

Our own day-to-day usage has shown the following editors to currently provide decent solutions for bidirectional editing: particularly Coda, and also Dreamweaver, IntelliJ IDEA and JEditX.

When designing for RTL languages you can build most of the support you need into the core CSS and use the directional attribute of the html element (for backwards compatibility) in combination with a class on the body element. As always, keeping all styles in one core stylesheet makes for better maintainability.

Some key styling issues to watch out for: any elements floated right will need to be floated left and vice versa; extra padding or margin widths applied to one side of an element will need to be overridden and switched; and any text-align values should be reversed.

We generally use the following approach, including using a class on the body tag rather than a html[dir=rtl] CSS selector because this is compatible with older browsers:

Elements:

<body class="rtl">
<h1><a href="http://www.blogger.com/"><img alt="Google" src="http://www.google.com/images/logos/google_logo.png" /></a> Heading</h1>

Left-to-right (default) styling:

h1 {
  height: 55px;
  line-height: 2.05;
  margin: 0 0 25px;
  overflow: hidden;
}
h1 img {
  float: left;
  margin: 0 43px 0 0;
  position: relative;
}

Right-to-left styling:

body.rtl {
  direction: rtl;
}
body.rtl h1 img {
  float: right;
  margin: 0 0 0 43px;
}

(See this in action in English and Arabic.)
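If the template is shared across all languages, the direction attribute and the body class can be derived from the language code. The following is a minimal sketch: the function name, the list of right-to-left language codes, and the returned shape are all assumptions for illustration, and the list is not exhaustive.

```javascript
// Assumed, non-exhaustive list of right-to-left language codes
// ("iw" is the legacy code for Hebrew).
const RTL_LANGUAGES = new Set(['ar', 'fa', 'he', 'iw', 'ur']);

// Returns the values a template could put on <html dir="..."> and
// <body class="..."> for a language code such as "ar" or "en-GB".
function directionFor(languageCode) {
  const primary = languageCode.toLowerCase().split(/[-_]/)[0];
  const isRtl = RTL_LANGUAGES.has(primary);
  return {
    dir: isRtl ? 'rtl' : 'ltr',
    bodyClass: isRtl ? 'rtl' : '',
  };
}

console.log(directionFor('ar'));
console.log(directionFor('en-GB'));
```

Keeping this decision in one place means the HTML template stays identical for every language, and only the stylesheet rules under body.rtl differ.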

One final note on this subject: most of the time your content destined for right-to-left language pages will be bidirectional rather than purely RTL, because some strings will probably need to retain their LTR direction–for example, company names in Latin script or telephone numbers. The way to make sure the browser handles this correctly in a primarily RTL document is to wrap the embedded text strings with an inline element using an attribute to set direction, like this:

<h2>‫עוד ב- <span dir="ltr">Google</span>‬</h2>

In cases where you don’t have an HTML container to hook the dir attribute into, such as title elements or JavaScript-generated source code for message prompts, you can use this equivalent to set direction, where &#x202B; and &#x202C; are the Unicode control characters that open and close a right-to-left embedding:

<title>&#x202B;‫הפוך את Google לדף הבית שלך‬&#x202C;</title>

Example usage in JavaScript code:

var ffError = '\u202B' +'כדי להגדיר את Google כדף הבית שלך ב\x2DFirefox, לחץ על הקישור \x22הפוך את Google לדף הבית שלי\x22, וגרור אותו אל סמל ה\x22בית\x22 בדפדפן שלך.'+ '\u202C';

(For more detail, see the W3C’s articles on creating HTML for Arabic, Hebrew and other right-to-left scripts and authoring right-to-left scripts.)
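When many message strings need this treatment, the wrapping can be moved into a small helper. This is a sketch only; the names RLE, PDF, and embedRtl are hypothetical.

```javascript
// U+202B RIGHT-TO-LEFT EMBEDDING opens the embedded run,
// U+202C POP DIRECTIONAL FORMATTING closes it.
const RLE = '\u202B';
const PDF = '\u202C';

// Wraps a message so that it is laid out right-to-left even when
// there is no element to carry a dir attribute.
function embedRtl(text) {
  return RLE + text + PDF;
}

const wrappedPrompt = embedRtl('Google');
console.log(wrappedPrompt.length); // the text plus two control characters
```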

It’s all Greek to me…

If you’ve never worked with non-Latin character sets before (Cyrillic, Greek, and a myriad of Asian and Indic), you might find that both your editor and browser do not display content as intended.

Check that your editor and browser encodings are set to UTF-8 (recommended) and consider adding a meta element that declares the character encoding, and the lang attribute of the html element, to your HTML template so browsers know what to expect when rendering your page–this has the added benefit of ensuring that all Unicode characters are displayed correctly, so using HTML entities such as &eacute; (é) will not be necessary, saving valuable bytes! Check the W3C’s tutorial on character encoding if you’re having trouble–it contains in-depth explanations of the issues.

A word on naming

Lastly, a practical tip on naming conventions when creating several language versions. Using a standard such as the ISO 639-1 language codes for naming helps when you start to deal with several language versions of the same document.

Using a conventional standard will help users understand your site’s structure as well as making it more maintainable for all webmasters who might develop the site, and using the language codes for other site assets (logo images, PDF documents) is handy to be able to quickly identify files.
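As an illustration, asset names could be generated from the language code. The naming pattern below (base name, underscore, two-letter code, extension) is a hypothetical convention, not one prescribed by this post, and the check only verifies the shape of the code, not that it is a registered ISO 639-1 code.

```javascript
// Builds a file name such as "logo_nl.png" from a two-letter language code.
// The pattern itself is an assumption made for this example.
function localizedFileName(baseName, languageCode, extension) {
  if (!/^[a-z]{2}$/.test(languageCode)) {
    throw new RangeError('expected a two-letter lowercase language code');
  }
  return `${baseName}_${languageCode}.${extension}`;
}

console.log(localizedFileName('logo', 'nl', 'png'));
console.log(localizedFileName('terms', 'de', 'pdf'));
```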

See previous Webmaster Central posts for advice about URL structures and other issues surrounding working with multi-regional websites and working with multilingual websites.

That’s a summary of the main challenges we wrestle with on a daily basis; but we can vouch for the fact that putting in the planning and work up front towards well-structured HTML and robust CSS pays dividends during localization!

Posted by Kathryn Cullen, Google Webmaster Team