er
Tampilkan postingan dengan label crawling and indexing. Tampilkan semua postingan
Tampilkan postingan dengan label crawling and indexing. Tampilkan semua postingan

Kamis, 17 Oktober 2013

SES Chicago - Using Images

We all had a great time at SES Chicago last week, answering questions and getting feedback.

One of the sessions I participated in was Images and Search Engines, and the panelists had great information about using images on your site, as well as on optimizing for Google Image search.

Ensuring visitors and search engines know what your content is about
Images on a site are great -- but search engines can't read them, and not all visitors can. Make sure your site is accessible and can be understood by visitors viewing your site with images turned off in their browsers, on mobile devices, and with screen readers. If you do that, search engines won't have any trouble. Some things that you can do to ensure this:

  • Don't put the bulk of your text in images. It may sound simple, but the best thing you can do is to put your text into well, text. Reserve images for graphical elements. If all of the text on your page is in an image, it becomes inaccessible.
  • Take advantage of alt tags for all of your images. Make sure the alt text is descriptive and unique. For instance, alt text such as "picture1" or "logo" doesn't provide much information about the image. "Charting the path of stock x" and "Company Y" give more details.
  • Don't overload your alt text. Be descriptive, but don't stuff it with extra keywords.
  • It's important to use alt text for any image on your pages, but if your company name, navigation, or other major elements of your pages are in images, alt text becomes especially important. Consider moving vital details to text to ensure all visitors can view them.
  • Look at the image-to-text ratio on your page. How much text do you have? One way of looking at this is to look at your site with images turned off in your browser. What content can you see? Is the intent of your site obvious? Do the pages convey your message effectively?

Taking advantage of Image search
The panelists pointed out that shoppers often use Image search to see the things they want to buy. If you have a retail site, make sure that you have images of your products (and that they can be easily identified with alt text, headings, and textual descriptions). Searchers can then find your images and get to your site.

One thing that can help your images be returned for results in Google Image search is opting in to enhanced image search in webmaster tools. This enables us to use your images in the Google Image Labeler, which harnesses the power of the community for adding metadata to your images.

Someone asked if we have a maximum number of images per site that we accept for the Image Labeler. We don't. You can opt in no matter how many, or how few, images your site has.

Update: More information on using images can be found in our Help Center. 

Senin, 30 September 2013

At the recent Search Engine Strategies conference in freezing Chicago, many of us Googlers were asked questions about duplicate content. We recognize that there are many nuances and a bit of confusion on the topic, so we'd like to help set the record straight.

What is duplicate content?
Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Most of the time when we see this, it's unintentional or at least not malicious in origin: forums that generate both regular and stripped-down mobile-targeted pages, store items shown (and -- worse yet -- linked) via multiple distinct URLs, and so on. In some cases, content is duplicated across domains in an attempt to manipulate search engine rankings or garner more traffic via popular or long-tail queries.

What isn't duplicate content?
Though we do offer a handy translation utility, our algorithms won't view the same article written in English and Spanish as duplicate content. Similarly, you shouldn't worry about occasional snippets (quotes and otherwise) being flagged as duplicate content.

Why does Google care about duplicate content?
Our users typically want to see a diverse cross-section of unique content when they do searches. In contrast, they're understandably annoyed when they see substantially the same content within a set of search results. Also, webmasters become sad when we show a complex URL (example.com/contentredir?value=shorty-george〈=en) instead of the pretty URL they prefer (example.com/en/shorty-george.htm).

What does Google do about it?During our crawling and when serving search results, we try hard to index and show pages with distinct information. This filtering means, for instance, that if your site has articles in "regular" and "printer" versions and neither set is blocked in robots.txt or via a noindex meta tag, we'll choose one version to list. In the rare cases in which we perceive that duplicate content may be shown with intent to manipulate our rankings and deceive our users, we'll also make appropriate adjustments in the indexing and ranking of the sites involved. However, we prefer to focus on filtering rather than ranking adjustments ... so in the vast majority of cases, the worst thing that'll befall webmasters is to see the "less desired" version of a page shown in our index.

How can Webmasters proactively address duplicate content issues?
  • Block appropriately: Rather than letting our algorithms determine the "best" version of a document, you may wish to help guide us to your preferred version. For instance, if you don't want us to index the printer versions of your site's articles, disallow those directories or make use of regular expressions in your robots.txt file.
  • Use 301s: If you have restructured your site, use 301 redirects ("RedirectPermanent") in your .htaccess file to smartly redirect users, the Googlebot, and other spiders.
  • Be consistent: Endeavor to keep your internal linking consistent; don't link to /page/ and /page and /page/index.htm.
  • Use TLDs: To help us serve the most appropriate version of a document, use top level domains whenever possible to handle country-specific content. We're more likely to know that .de indicates Germany-focused content, for instance, than /de or de.example.com.
  • Syndicate carefully: If you syndicate your content on other sites, make sure they include a link back to the original article on each syndicated article. Even with that, note that we'll always show the (unblocked) version we think is most appropriate for users in each given search, which may or may not be the version you'd prefer.
  • Use the preferred domain feature of webmaster tools: If other sites link to yours using both the www and non-www version of your URLs, you can let us know which way you prefer your site to be indexed.
  • Minimize boilerplate repetition: For instance, instead of including lengthy copyright text on the bottom of every page, include a very brief summary and then link to a page with more details.
  • Avoid publishing stubs: Users don't like seeing "empty" pages, so avoid placeholders where possible. This means not publishing (or at least blocking) pages with zero reviews, no real estate listings, etc., so users (and bots) aren't subjected to a zillion instances of "Below you'll find a superb list of all the great rental opportunities in [insert cityname]..." with no actual listings.
  • Understand your CMS: Make sure you're familiar with how content is displayed on your Web site, particularly if it includes a blog, a forum, or related system that often shows the same content in multiple formats.
  • Don't worry be happy: Don't fret too much about sites that scrape (misappropriate and republish) your content. Though annoying, it's highly unlikely that such sites can negatively impact your site's presence in Google. If you do spot a case that's particularly frustrating, you are welcome to file a DMCA request to claim ownership of the content and have us deal with the rogue site.

In short, a general awareness of duplicate content issues and a few minutes of thoughtful preventative maintenance should help you to help us provide users with unique and relevant content.

Minggu, 29 September 2013

Better understanding of your site

SES Chicago was wonderful. Meeting so many of you made the trip absolutely perfect. It was as special as if (Chicago local) Oprah had joined us!

While hanging out at the Google booth, I was often asked about how to take advantage of our webmaster tools. For example, here's one tip on Common Words.

Common Words: Our prioritized listing of your site's content
The common words feature lists in order of priority (from highest to lowest) the prevalent words we've found in your site, and in links to your site. (This information isn't available for subdirectories or subdomains.) Here are the steps to leveraging common words:

1. Determine your website's key concepts. If it offers getaways to a cattle ranch in Wyoming, the key concepts may be "cattle ranch," "horseback riding," and "Wyoming."

2. Verify that Google detected the same phrases you believe are of high importance. Login to webmaster tools, select your verified site, and choose Page analysis from the Statistics tab. Here, under "Common words in your site's content," we list the phrases detected from your site's content in order of prevalence. Do the common words lack any concepts you believe are important? Are they listing phrases that have little direct relevance to your site?

2a. If you're missing important phrases, you should first review your content. Do you have solid, textual information that explains and relates to the key concepts of your site? If in the cattle-ranch example, "horseback riding" was absent from common words, you may then want to review the "activities" page of the site. Does it include mostly images, or only list a schedule of riding lessons, rather than conceptually relevant information?

It may sound obvious, but if you want to rank for a certain set of keywords, but we don't even see those keyword phrases on your website, then ranking for those phrases will be difficult.

2b. When you see general, non-illustrative common words that don't relate helpfully to your site's content (e.g. a top listing of "driving directions" or "contact us"), then it may be beneficial to increase the ratio of relevant content on your site. (Although don't be too worried if you see a few of these common words, as long as you also see words that are relevant to your main topics.) In the cattle ranch example, you would give visitors "driving directions" and "contact us" information. However, if these general, non-illustrative terms surface as the highest-rated common words, or the entire list of common words is only these types of terms, then Google (and likely other search engines) could not find enough "meaty" content.

2c. If you find that many of the common words still don't relate to your site, check out our blog post on unexpected common words.

3. Here are a few of our favorite posts on improving your site's content:
Target visitors or search engines?

Improving your site's indexing and ranking

NEW! SES Chicago - Using Images

4. Should you decide to update your content, please keep in mind that we will need to recrawl your site in order to recognize changes, and that this may take time. Of course, you can notify us of modifications by submitting a Sitemap.

Happy holidays from all of us on the Webmaster Central team!

SES Chicago: Googlers Trevor Foucher, Adam Lasnik and Jonathan Simon