Get Rid of Those Soft 404s in BANS

If you have been a Build a Niche Store user for any length of time, you know that the application does not deliver true 404 errors when a page does not exist. Go ahead, visit one of your BANS store pages and type some gibberish into your address bar. Chances are, it will return the home page of the site.

That’s BAD – and Google Doesn’t like it!

For the past several days, the folks at the Google Webmaster Blog have been writing about 404s and what they think of them. The post that caught my eye the most was the one on the dreaded soft 404 pages. The main reason they dislike soft 404s:

We discourage the use of so-called “soft 404s” because they can be a confusing experience for users and search engines. Instead of returning a 404 response code for a non-existent URL, websites that serve “soft 404s” return a 200 response code. The content of the 200 response is often the homepage of the site, or an error page.

So, you may ask yourself… why is this bad? According to Google:

 As exemplified above, soft 404s are confusing for users, and furthermore search engines may spend much of their time crawling and indexing non-existent, often duplicative URLs on your site. This can negatively impact your site’s crawl coverage—because of the time Googlebot spends on non-existent pages, your unique URLs may not be discovered as quickly or visited as frequently.

How does this Affect You?

When several of my sites were dropped by Google a month or two ago, one of the first things Google pointed out to me was that the sites were returning a status code of 200 when, in reality, the pages did not exist. If a page does not exist, the server must return a status of 301 (moved permanently), 302 (moved temporarily), or 404 (not found) to the visitor.
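
To make that rule concrete, here is a minimal sketch in Python (not part of BANS or of anything Google provides) that labels the status code a server sends back for a page you know does not exist. The function name and the labels are made up for illustration.

```python
def classify_missing_page(status_code):
    """Label the status a server returned for a URL that does not exist.

    The labels are illustrative names only; neither BANS nor Google defines them.
    """
    if status_code == 404:
        return "ok: real 404"
    if status_code in (301, 302):
        return "ok: redirect"
    if status_code == 200:
        # The page does not exist, yet the server claims success.
        return "soft 404"
    return "unexpected"


if __name__ == "__main__":
    for code in (200, 301, 302, 404, 500):
        print(code, "->", classify_missing_page(code))
```

A 200 for a bogus URL is the case to worry about; the other statuses at least tell the visitor and the crawler that the page is not where they asked for it.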

Google spoke… I listened!

How to fix the BANS Soft 404 Error

Edit: 08/16/08 – The method below does not work 100% of the time. Still working on a final resolution. 

Justin over at SEOZombie wrote a post about fixing the 404 issue on his blog. I took what he provided and modified it to suit my own needs. In his method, he chose to use a 301 permanent redirect to the homepage. That method returns a 301 in the headers rather than a 404, so I decided to create a custom 404 page in WordPress and redirect visitors to that page, which includes an index of the site.

Step 1 – Download this 404.txt file; you will add its contents to your index.php in BANS. The file is a bit different from Justin’s because I handle the error differently.

Step 2 – Open the index.php file in the root of your BANS store. On line 361, you will find:

include "themes/$temp/header.php";

Paste the contents of the 404.txt file JUST ABOVE that line and save the file.

Step 3 – Open the .htaccess file in the root of your domain. On the VERY TOP LINE, above everything else, paste the following code and save the file:

 ErrorDocument 404 /404

What I am telling the server is that when a 404 occurs, the error document is located at /404. You may see this written as a full URL at times. The problem with listing the full URL is that the server then sends the visitor a redirect to that address instead of the 404 status, so the error page itself comes back with a 200. With the local path, I am telling the server to send the 404 in the headers and then display the page at www.mydomain.com/404.

Step 4 – Create the 404 page. I used WordPress to “Write a Page” and edited the permalink to be “/404”. You can use any method you want, though, and I am sure you could even use the custom error messages within your HostGator account. You can customize this page any way you like. I chose to pull in the site archives as well as my store menu pages.

Step 5 – If you used WordPress for the page, you now need to hide the new page from your menus. Otherwise, everyone will see that you have a page titled “404”. For this, I used the exclude=xx option in the WordPress code. While inside the WordPress Manage section, hover over your new page and look at the URL in the status bar. It will contain a page=xxx section at the end of the link, where xxx is the page ID number we need to exclude. In my case, it was 241.

Now you need to open the theme header file. If you are using one of my BANS WordPress templates, you will also have to open inc-bans-header.php or whichever BANS file contains your WordPress menu code.

If you look in your WordPress and BANS header files, you will see code similar to this:

<?php wp_list_pages('&title_li='); ?>

Essentially, you are looking for the wp_list_pages call to the database so we can add the exclude filter and NOT show the new 404 page in the menu. Since my page number was 241, I added the following code:

<?php wp_list_pages('exclude=241&title_li='); ?>

Save your work, test it all out and now you have a Google Friendly 404 page!

If you want to test it out, head on over to the www.GetaRacecar.com site and start looking through the store pages. Type something abstract into the address bar and see the custom 404 page. 
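
If you would rather check the status from a script than from the address bar, the following is a minimal Python sketch that uses only the standard library. It asks for a URL and reports the first status code the server sends, without following redirects, so a 301 or 302 shows up as itself instead of as the page it points to. The names `NoRedirect` and `fetch_status` and the example URL in the comment are hypothetical, not anything BANS or WordPress ships.

```python
import sys
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Keep urllib from following 301/302 so the first status stays visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # returning None makes urllib raise HTTPError instead


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code the server sends for url."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Redirects that were not followed, 4xx and 5xx all arrive here.
        return err.code


if __name__ == "__main__":
    # Usage (hypothetical URL): python check_status.py http://www.example.com/shop/zzqqxx
    for url in sys.argv[1:]:
        print(fetch_status(url), url)
```

Run it against a few made-up URLs on your own store. A 404 means the fix is working for that URL pattern; a 200 means that pattern is still a soft 404.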

Mark

21 Comments »

  • Dan said:

    Thanks for the instructions for Wordpress, but what if I have a BANS site?

  • Mark (author) said:

    @ Dan –

    Not tested, but instead of redirecting to a 404 page like I have done, you could just put “/sitemap” as the redirect url.

    This assumes you have BANS on the root of the domain.

    Mark

  • Mark (author) said:

    @ Dan –

    Just to clarify that a bit more…

    Do EVERYTHING outlined in the steps above, through step 3. There is no need to go further for BANS only.

  • Bill said:

    It doesn’t work…on my site or yours. It works on the WP side but on the shop side….goes to blank screen.

  • Mark (author) said:

    @ Bill –

    Yup… only works half-way.

    Pages like this: http://www.getaracecar.com/shop/Race-Cars-For-Sale/Circle-Track-Race-Cars-For-Sale/fff

    works fine… pages with bogus data before the last forward slash though, like this: http://www.getaracecar.com/shop/Race-Cars-For-Sale/Circle-Track-Race-Cars-For-Saleffffff

    Blank screen…

    Halfway there – time to get back on the drawing board!

    Mark

  • Bill said:

    1/2 way beats no way…..I just shudder at the thought of going back and retrofitting this to all my sites.

    I hope K&A incorporate a lot of these annoying little fixes like this….like the Sitemap error etc into whatever the next download is.

  • James said:

    Mark -

    How can you tell if you’ve got ‘soft 404s’? (sounds like a calcium deficiency :-)

    Does it tell you in Google Webmaster Tools?

  • Steve said:

    With non WP BANS sites, I was unable to redirect to the sitemap using this method, so I have just left it as ErrorDocument 404 /404 for now. Will that satisfy Google?

  • Mark (author) said:

    @ Dan -

    You check the headers at a place like this: http://www.seoconsultants.com/tools/headers.asp

    Just plug in a bogus page and you will see what I mean.

    @ Steve -

    I have not tested, but you can probably use the custom error page with your host also. just create a custom 404.

    I will test the sitemap issue more.

    Mark

  • John Pash said:

    Eeek! What did you do to tick off the search engines? getaracecar.com isn’t indexed in Google OR Yahoo. Does it have something to do with the “10,000 other racing related websites linking back” to you? lol.

    Did you accidentally 404 your home page or something?

  • Mark (author) said:

    @ John -

    No… nothing to do with the links… this was one of my sites that Google dropped after a manual review 2 months ago.

    After MANY conversations with Google respondents, it was dropped for being a “Weak Affiliate Site”… go figure!

    I was able to get a different site reindexed back into the engine and the change over to a blog on the getaracecar site is part of my efforts to get it reindexed as well.

    FYI – Even with the Google and Yahoo deindex – it still has +500 daily uniques and is still a very good site in my portfolio.

    Mark

  • John Pash said:

    Mark,

    Man, that sux. But that traffic just shows just how much we really AREN’T dependent on the big G. It must be MSN and all the others.

    I’m always waiting for my BANS sites to go missing from the index, which is why I’m trying to Blogify them. I think it’s really unfair how Google tries to police the internet. Who are they to say that affiliate marketing isn’t a legitimate business?

    In a related issue, I would love to read what you know about manual reviews. Is there a way to detect and/or defend?

  • Mark (author) said:

    @ Josh -

    In my case… I wrote a blog post about 2 specific things.

    Using existing sites to promote new sites, and using a content spinner to increase unique content.

    In one of the posts, I used the GetaRacecar site as an example, and in another post, I used 4 other sites as examples. ALL 5 of these and several others were axed at the same time.

    My first tip to a problem was when one of the commenters on a few of my posts did not provide a website or real email address in the form, and their IP resolved to Sunnyvale Ca. although that may have been a coincidence. The subject of his/her comment was that what I was doing was crooked and deserved a google penalty!

    ALL 12 (Yes, ALL) of the sites I had listed in one of my Google Webmasters Tool accounts were delisted. (I have more than one account due to using it for a long time with other websites)

    ALL of the sites had Adsense on them as well…

    Incidentally, during my discussions I was also informed that one of the sites was reported in the Spam report tool that Google employs…

    I wish I could pinpoint or share one specific thing that prompted the manual review… but I think it was a combination of several. I ruled out an IP based ban or a domain ownership based ban since I still have many others both on the same box, and under my domain ownership account.

    As mentioned above, I was able to get one of the sites back into the index thus far. It took several weeks and much correspondence on the Webmaster Forums and private emails.

    Here is what I was told on the Google WebMaster Forums:

    1 – Thin Affiliate site – create content. If you can remove all BANS provided content and you have very little left… it’s not good.
    2 – Domain resolves with both www AND no www in the address bar (Dup content, but not a deindex issue)
    3 – No real 404 page! Every site page delivers a 200 status to the headers.
    4 – Remove Paid links (I had a paid directory)
    5 – Remove links created to boost other sites PR (I linked to other sites so they could get indexed)

    Essentially… I don’t think it was an issue with BANS, just the way I blogged about using different methods, which skirted the rules.

    Kinda like speeding when there is no traffic on the highway… you’re a sitting duck! When you speed with a crowd, you have a much better chance they won’t pinpoint you alone.

  • Josh said:

    2 – Domain resolves with both www AND no www in the address bar (Dup content, but not a deindex issue)

    ——-

    Do’h! Just checked a couple of my sites and I have this issue. What is the best solution? htaccess change? Or is there a plugin that will do the trick?

  • Mark (author) said:

    I added the following to my htaccess… Not on all sites yet, but working toward it.

    RewriteCond %{HTTP_HOST} ^mywebsite\.com [NC]
    RewriteRule ^(.*)$ http://www.mywebsite.com/$1 [L,R=301]

    Mark

  • Sean said:

    I just tested your store category for the 404, it works.

    Did you know that you can create a custom 404 page. You will need an HTML Editor to do this.

    I have a custom Logo file that my header calls. I copy this html into the 404 page and then paste part of the footer to complete the page. This is so that if a page is missing or moved, the users can still click on a category that appropriately matches their search. The body is simply a few links and a message telling the user or customer that the page has moved – please accept our apology and then say you may find these links helpful or please visit the homepage to search for another item.

    There is already a 404 page in the hosting account. Take the header (logo) and the footer, put some text in the body – thanks for visiting but the page you requested has been moved or deleted. Please revisit the homepage to find what you’re looking for. Yadah yadah yadah. That way your visitor still has a chance to visit your website.

  • Adrienne said:

    I just tried one of my bans sites and when I typed in some gobbledygook, it took me to a page that said sorry we don’t have what you are looking for now, so that site should be ok?

  • Christine said:

    I didn’t use wp to create the 404 page, just went thru step 3, and it seems to be working for me. However, I wasn’t sure what extension to use for my custom 404 page, so I did 404.php and uploaded it, but that doesn’t work. For example, if I type in:
    http://www.MYBANSDOMAIN.com/jjjjjjjjjjjj
    it’s blank, but if I type in
    http://www.MYBANSDOMAIN.com/storepage/storecategory/jjjjjjjj
    I get 404 Not Found. still not the 404 page I created but I guess it will work?
    What about the blank page, is that ok? Or what extension should I have used?

  • Sean said:

    I’d like to use a custom 404 page as well. This 404 page fix will work without the .htaccess – I tried it on my pure bans site.

    Kinda slacking to just now trying it on my own site. But got around to upgrading a 1.0 site to 3.0 – total redo, so I got some urls that will need 404’s.

    The 404.txt file has some coding in it that should allow for it to point to the custom page. Just gotta figure it out.

  • Soft 404 in BANS Revisited | The Niche Store Builder - Succeed with Build a Niche Store said:

    [...] week, I wrote a post about the issues with the Soft 404 header status code in BANS sites and why Google hates a soft 404. After looking into it a bit further, I want to clarify that the [...]

  • Giving Away This Website is Not Going to be Easy! | Build a Niche Store Guide said:

    [...] some trash to the end of the URL, to force it to 404. You can read how to fix the soft 404 at the BANS Soft 404 issue page on the Niche Store Builder [...]
