Companies with global aspirations need to address China and its biggest search engine. Baidu's increasingly varied and sophisticated offerings give us a glimpse into the priorities and thought processes around search in China. Through an in-depth look at Baidu's Webmaster Tools, we'll learn what we can about SEO for Baidu at a strategic level, whether that means an emphasis on user trust, content freshness or any of a myriad of other factors.
Baidu Webmaster Tools
Baidu relies heavily on webmaster participation to keep its index up to date. Many of the tools will be familiar, and a few are surprising inclusions.
There are five categories for Baidu's Webmaster Tools (BWT) (see below). I will discuss the tools that most impact any decision to enter China, namely sections two through four:
1. Mobile Section
2. Crawling
3. Search Appearance
4. Optimization & Maintenance
5. Website Components
Crawling
Baidu offers tools for URL submission, robots testing and crawl diagnosis. The robots tool is beginner-friendly: you enter rules in a text field and test them against a specified URL. Crawl diagnosis works as expected, with reports for crawl frequency and crawl errors, plus a “fetch as Baidu” tool, which fetches the URL you enter and evaluates the page for errors caused by redirects, server problems, robots rules and the usual 404s. The weakness of Baidu’s tool is that it evaluates only the first 200 KB of a page, and only for the few kinds of errors above.
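If you want to sanity-check your rules before pasting them into Baidu's robots tool, the same kind of test can be approximated locally. Below is a minimal sketch using Python's standard `urllib.robotparser`. The rules and URLs are made up for illustration, and Python's matching logic is not guaranteed to agree with Baidu's in every edge case, so treat it as a first pass rather than a substitute for the tool itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, as you might type them into the tool's text field.
RULES = """\
User-agent: Baiduspider
Disallow: /private/
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Return True if `rules` permit `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/products/",
        "https://example.com/private/report",
    ):
        print(url, is_allowed(RULES, "Baiduspider", url))
```

Python's parser applies the first matching rule within a user-agent group, which is why the `Disallow` line is listed before the catch-all `Allow`.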
Your strategic interest is most likely in the URL submission tools. You may submit new pages on your site with the URL submission tool (singly or in bulk), and you can even submit external inbound links.
You can submit new content to Baidu programmatically with cURL or a plain HTTP POST request, as well as via PHP and Ruby. With respect to indexing new content, this puts a degree of control in the webmaster’s hands that doesn’t exist with Google.
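To give a feel for what programmatic submission involves, here is a minimal Python sketch that builds (but does not send) a POST request carrying a batch of new URLs. The endpoint, the `site` and `token` query parameters and the one-URL-per-line plain-text body are all assumptions made for illustration; copy the real endpoint and credentials from your own BWT account before relying on any of it.

```python
from typing import Sequence
from urllib.parse import urlencode
from urllib.request import Request


def build_push_request(
    endpoint: str, site: str, token: str, urls: Sequence[str]
) -> Request:
    """Build a POST request that submits `urls`, one per line.

    `endpoint`, `site` and `token` are hypothetical placeholders; the
    real values come from your webmaster account.
    """
    query = urlencode({"site": site, "token": token})
    body = "\n".join(urls).encode("utf-8")
    return Request(
        url=f"{endpoint}?{query}",
        data=body,
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_push_request(
        "https://push.example.com/urls",  # placeholder, not a real endpoint
        "www.example.com",
        "YOUR_TOKEN",
        ["https://www.example.com/new-page-1", "https://www.example.com/new-page-2"],
    )
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request, timeout=10)
```

Building the request separately from sending it makes it easy to inspect or log exactly what would be submitted before anything leaves your server.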
Sitemaps are supported, but only as a premium feature. I couldn’t submit a sitemap, and the tooltip seems to suggest that only websites with high authority have the privilege of submitting new content to Baidu via sitemap. If you can't submit new content programmatically, there is a section for manual submission of new pages. It might be time to start mending fences with the CTO whom you've been pestering for more structured markup on-site.
There is also a tool for submission of pages for deindexing (死链提交). I'll call this the page-dumping tool. This is clearly a very important one, as they permit submissions of sitemap URLs (ending in txt or xml), though generally not from new users.
They even let you define the crawl frequency for your magnificent page-dumping sitemap! Removal of dead links is clearly a massive priority for Baidu. If you don’t have a document detailing trashed pages, you can submit subfolders. Their toolkit for clearing obsolete content is so extensive that they'll even let you define error pages for 3 types of errors, namely HTTP 404 errors, soft 404 errors, and HTTP redirects due to errors. There is also a meta tag which tells Baidu that this page is obsolete: <meta name="baidu-deadlink" value="1">
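If you don't yet have a document detailing trashed pages, one is easy to generate from crawl data. The sketch below turns a list of (URL, status code) pairs into a plain-text file body with one dead URL per line. The one-URL-per-line layout and the decision to count HTTP 410 alongside 404 are my assumptions, not something Baidu's tool confirms, so check the format the tool expects before submitting.

```python
from typing import Iterable, Tuple

# Status codes treated as "dead". Including 410 is an assumption.
DEAD_STATUSES = {404, 410}


def build_deadlink_file(crawl_results: Iterable[Tuple[str, int]]) -> str:
    """Return newline-separated, de-duplicated URLs whose status is dead."""
    dead = sorted({url for url, status in crawl_results if status in DEAD_STATUSES})
    return "\n".join(dead) + ("\n" if dead else "")


if __name__ == "__main__":
    # Hypothetical crawl output; in practice this comes from your own crawler or logs.
    results = [
        ("https://example.com/", 200),
        ("https://example.com/old-campaign", 404),
        ("https://example.com/retired-product", 410),
    ]
    with open("deadlinks.txt", "w", encoding="utf-8") as handle:
        handle.write(build_deadlink_file(results))
```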
Search Appearance
This section is where we can set some important information for Baidu users who are searching for our website or company, and start experimenting with Baidu's HTML markup features.
In website settings (站点属性), you can set your site’s Chinese name, upload a logo (if your site has reached Baidu's threshold of importance), set the appropriate toggle for your website’s mobile-responsiveness, select your industry and input a customer service phone number. These aren't very remarkable features technically but they are extremely important for building trust with Chinese consumers and putting your company in front of them.
Baidu supports sitelink extensions. However, this feature is in beta and participation requires an application. There is also a beta program for data markup in the categories of video, software and games. Baidu is clearly playing catch-up here, but those of us who have benefited from the sitelinks and Rich Snippets that Google supports will be ready to jump on these beta programs immediately.
The only concession to a western audience that can be seen in the Webmaster Tools is Baidu's Structured Data Plugin for WordPress.
According to Baidu, this plugin helps them speed up indexation of your discussion threads and content, and automatically generates a sitemap that gets submitted to Baidu, which is one way to get around the lack of sitemap submission in the default Webmaster Tools feature set. It is extremely clear that a company that wants to enter China would do well to adopt WordPress and install the Baidu plugin.
Optimization and Maintenance
In this section, Baidu gives us our favourite tools. We have the equivalent of Google's Search Analytics as well as link metrics.
Baidu's search analytics tool provides keyword data. My searches haven't turned up reports of Baidu suffering from the problem of 'not provided' keywords. However, I can't verify this yet, since my test websites haven't received organic traffic from Baidu. In Baidu Webmaster Tools, data is available from as recently as five hours ago and as far back as 30 days. Top landing pages can be seen, and the same reports are available for mobile too. It's adequate, but clearly nothing to write home about.
There’s a tool for page optimization suggestions. Though I've faced the occasional request timeout, it works. You can even request suggestions for websites that aren’t your own.
The suggestions themselves are pretty mundane. There are free plugins in the Google Chrome store that yield reports faster and with more detail. However, considering that Chinese users can't use Google easily, this is a nice default option.
For link metrics, Baidu lists the external links and dead links it has found under its link analysis tool. Baidu groups internal dead links, outbound dead links and inbound dead links in one report, and gives you a separate tab for inbound links. In Google Webmaster Tools (GWT), only inbound dead links are reported as crawl errors. This really drives home how paranoid Baidu is about dead links. They will even tell you when your outbound links are dead. I wouldn't be surprised if this were a ranking factor too, which would be tragically messy for Baidu webmasters, who would then have to clean up after the trigger-happy blogger on the content team slipping links indiscriminately into the blog.
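Given how much attention Baidu pays to dead links, it may be worth auditing them yourself before Baidu does. Here is a minimal sketch, using only Python's standard library, that pulls the links out of a page and groups the dead ones into internal and outbound. The `get_status` callable is a hypothetical hook you supply (for example, a function that makes a HEAD request and returns the status code), and treating any status of 400 or above as dead is my own simplification, not Baidu's definition.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, List
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the <a href="..."> tags on a page."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page's own URL.
            self.links.append(urljoin(self.base_url, href))


def find_dead_links(
    page_url: str, html: str, get_status: Callable[[str], int]
) -> Dict[str, List[str]]:
    """Group dead links on a page into 'internal' and 'outbound'."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    site = urlparse(page_url).netloc
    report: Dict[str, List[str]] = {"internal": [], "outbound": []}
    for link in collector.links:
        parts = urlparse(link)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar
        if get_status(link) >= 400:
            kind = "internal" if parts.netloc == site else "outbound"
            report[kind].append(link)
    return report
```

Passing the status check in as a function keeps the grouping logic testable offline and lets you add caching or rate limiting without touching the parser.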
The vulnerability tool checks whether your site is vulnerable to common attacks like SQL injection and cross-site scripting (XSS). There's also a tool for site migrations and URL structure changes. These are nice to have, but I don't imagine them carrying any strategic importance.
One notable tool unique to Baidu is their site closure tool. When you submit your website to the tool, Baidu will stop crawling your website and archive your indexed pages. Baidu's own documentation states that it preserves Baidu's index of your pages, but not your rankings. This could be useful if you would like to launch a new website and you need to time it to coincide with PR and marketing.
The selection of tools offered by Baidu is pretty revealing of the priorities that they have as a company. There are a bunch of unexpected security tools, and a surprisingly intense focus on keeping their index clear of obsolete content and 404 pages. There are also fields where you can submit more company information, especially the customer service phone number. Clearly the company is focused on ensuring that users receive an experience deemed trustworthy and authoritative, even more than on ensuring the completeness of their index.
If your company is open to investing resources into the priorities and values mentioned above, you might find China to be a very welcoming market. Let me know below if you're considering the move, and any thoughts you might have.
TL;DR
1. You can submit new content through cURL, POST, PHP and Ruby
2. Never let obsolete content / dead links fester on your site
3. Build trust with Chinese consumers by submitting basic company info such as service hotlines
4. Use WordPress with the Baidu plugin