404 error: where to look and how to fix

A 404 error page, or “page not found”, is a fairly common occurrence on the Internet. You can encounter a 404 error when clicking a link or entering a page URL in the browser’s address bar: the server cannot find the requested document and returns a 404 response code.

Figure 1. Amazon.com 404 page

People usually don’t attach much importance to this. When they see the “page not found” error, they are disappointed or even annoyed, and they click the “Back” button or close the tab. However, 404 errors can affect SEO. The extent of the impact depends on the cause of these errors and on how many there are.

In this article, we’ll take a closer look at what a 404 error means and how it affects site ranking. We will tell you how to check your site for 404 errors and give recommendations on how to fix them.

Connect the service «SEO | Search Engine Optimization» from #ADINDEX

What is a 404 error (Not Found)

A 404 error or Not Found is a standard server response code that indicates that the server cannot find the requested document.

When the page is working correctly, the server returns a 200 OK response code.

If you are reading this article now, it means that your browser has established a connection to the server, it found the requested page and returned a 200 response code (you merely don’t see this code).

Common causes of 404 errors:

  • the page has been removed;
  • broken links, when the page URL contained a mistake or the URL was changed but the link was not updated;
  • the user made a mistake while entering the page URL in the browser’s address bar.

The 404 errors caused by removing pages or by changing how page URLs are formed are the most common cases we encounter in ADINDEX projects.

For example, while conducting an audit for a client, we found a large number of pages returning a 404 error. The problem appeared after the content manager removed items that were out of stock. As a result, the site lost part of its backlink profile, because external links pointed to the removed pages.

Figure 2. Example of the dynamics of 404 errors

How 404 errors affect SEO

What does Google say about 404?

The search engine treats 404 errors as a natural occurrence. From Google Support:

Figure 3. Google Support – 404 Errors

But at the same time, 404 errors can affect the ranking of the site.

“The fact is that it is not so much the 404 pages themselves that hurt SEO as the links pointing to URLs that return 404 errors.” – Leading SEO specialist Ekaterina Kiyasova

Too many links to 404 errors signal quality issues on the website. They can worsen the indexing of the site, which in turn leads to a loss of positions. They also create a negative user experience: the bounce rate increases and the time users spend interacting with the site decreases.

It is the “broken” links that are the critical factor. In addition, they eat into the crawl budget. The search robot spends its resources following links to pages that return a 404 response code instead of scanning the necessary and valuable pages you want to promote.

Therefore, if there are no broken links, the search robot has no path that leads it to 404 pages.

No 404 links – no 404

Figure 4. Search Console Support

Soft 404 errors: what is the problem?

A soft 404 error, or so-called false 404 error, occurs when a page does not exist but returns a 200 OK response code. Notably, the concept was introduced by Google’s search engine. In other search engines, the term “false 404 error” does not exist.

The causes of Soft 404 errors are:

  • a blank page or a page with very little content. This may also happen because the page contains resources (images, scripts) that the search robot cannot process, either because access to them is denied in the robots.txt file or because there are too many resources and processing them would take too long;
  • a redirect from a non-existent page to an irrelevant page;
  • incorrect server operation, when non-existent pages return a response code other than 404 or 410.

HTTP status 200 OK, that is, successful, tells search engines that the page exists. Accordingly, a non-existent page will be scanned and, as a result, may get into the search results.

This situation can negatively affect the ranking of the entire site and waste valuable crawl budget.
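The causes listed above suggest a simple first-pass check for soft 404 candidates. The sketch below is only a heuristic, not Google’s detection logic: the phrase list and the length threshold are assumptions chosen for illustration, and the function name is hypothetical.

```python
# Assumed wording that typically appears on "not found" pages.
NOT_FOUND_PHRASES = ("page not found", "does not exist", "no longer available")
# Assumed threshold (characters of visible text) below which a page counts as nearly blank.
MIN_TEXT_LENGTH = 200


def looks_like_soft_404(status: int, body_text: str) -> bool:
    """Flag a response that says 200 OK but reads like an error or an empty page."""
    if status != 200:
        return False  # a real 404 or 410 is not a *soft* 404
    text = body_text.strip().lower()
    if len(text) < MIN_TEXT_LENGTH:
        return True  # blank or nearly blank page
    return any(phrase in text for phrase in NOT_FOUND_PHRASES)
```

Pages flagged this way still need a manual look, since a legitimate short page or an article that merely mentions “page not found” would also match.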

Three ways to check the response code of a page

1. In the browser

You can quickly check the response code of a page in Google Chrome without third-party tools: right-click anywhere on the page, click “Inspect”, open the “Network” tab, reload the page, and look at the “Status” column. The 404 response code is displayed like this:

Figure 5. Google Chrome Inspector

Besides 404, the Status column shows many other status codes, because it lists every resource the page loads: requested files, images, etc.

2. At https://httpstatus.io/

The service lets you check up to 100 URLs at a time:

Figure 6. httpstatus.io

3. Chrome plugin

The free Redirect Path extension for Google Chrome displays 301, 302, 404, and 500 HTTP response codes, as well as redirects performed by JavaScript.

An example of displaying a 404 error in the Redirect Path extension:

Figure 7. Redirect Path
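The same check can also be scripted. This is a minimal sketch using only the Python standard library; the function name `fetch_status` is hypothetical, and the use of a HEAD request is an assumption (some servers answer HEAD differently from GET).

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server sends for `url`."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises an exception for 4xx/5xx responses; the code is on the exception.
        return error.code
```

Note that `urlopen` follows redirects, so for a redirected URL this reports the status of the final page rather than the 301 or 302 itself.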

Four tools to check your site for 404 errors

There are tons of different tools and plugins for handling 404 errors. In this article, we present the main ones that we use on a daily basis:

1. Google Search Console

To check if the site has 404 errors, go to the Indexing Report – section “Coverage” – “Error”:

Figure 8. Google Search Console – Indexing Report – Coverage

Each error section contains a list of pages with the date of the last scan. To get more data about a URL, click “Check URL” next to it in the Search Console report:

Figure 9. Search Console – Coverage – Error – Check URL

According to the Search Console help, 404 errors are divided into two groups based on how the search robot found them:

  • Submitted URL not found (404) means that the URL is in the Sitemap.xml file and its indexing is not prohibited in the robots.txt file or by the robots meta tag. That is, indexing was allowed and a request was sent to process the page, but the page was not found. In this case, Google shows the source where the 404 error was detected.

Example:

Figure 10. The source of 404 error detection is shown, the report “Submitted URL not found”.

  • Not Found (404) means that Google found the URL without a crawl request or the help of the sitemap file. Googlebot may have discovered this URL in a link on another site. Accordingly, information about the source of the 404 error is hidden.

Example:

Figure 11. Source of 404 error detection is hidden, the report “Not found (404)”

2. Serpstat.com

Serpstat allows you to audit a project. We usually use this service when we need to quickly assess the state of a project and understand its strengths and weaknesses. After you add a project and the site is parsed, the service provides a list of errors with explanations, recommendations for fixing them, and a priority grade (high, medium, or low).

For example, an audit of a small Western project showed a fairly large number of 404 errors:

Figure 12. Serpstat.com 404 errors

The reason was that the site’s blog changed how article URLs are formed, but the links on one of its subdomains were not updated:

Figure 13. Source of the 404 error, Serpstat.com

3. Ahrefs (Broken Links)

The Broken Link Checker tool shows internal and external broken links. It displays up to 10 URLs for free.

Figure 14. Ahrefs Broken Link Checker

4. Screaming Frog and Netpeak Spider scanners

We use these programs mainly when we analyze the technical optimization of a site and prepare a technical specification for programmers. They let you parse the entire site and export reports for each type of error.

Figure 15. 404 errors, Screaming Frog

How to eliminate the 404 errors

To eliminate the 404 errors on the site, you must determine their source, i.e., find where the broken links are located. You can use any crawler to do this. When the crawl is complete, filter the list of all site URLs by the 404 response code. This gives you a list of all 404 errors and the internal links that lead to them. We recommend comparing this list with the one in Google Search Console.
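The filtering and comparison described above can be sketched in a few lines. The input format here is an assumption: each row is a hypothetical `(source page, linked URL, status code)` tuple, roughly what a crawler’s link export contains, and the URLs are made up.

```python
from collections import defaultdict


def broken_link_sources(crawl_rows):
    """Map each URL that returned 404 to the sorted list of pages linking to it."""
    sources = defaultdict(set)
    for source_url, target_url, status in crawl_rows:
        if status == 404:
            sources[target_url].add(source_url)
    return {target: sorted(pages) for target, pages in sources.items()}


# Hypothetical crawl export: (page containing the link, linked URL, response code).
rows = [
    ("https://example.com/", "https://example.com/old-item", 404),
    ("https://example.com/blog", "https://example.com/old-item", 404),
    ("https://example.com/", "https://example.com/about", 200),
]
report = broken_link_sources(rows)

# Hypothetical list of 404 URLs copied from Search Console.
search_console_404s = {"https://example.com/old-item", "https://example.com/promo"}
# URLs Google reports but the crawl did not reach, e.g. linked only from other sites.
only_in_search_console = search_console_404s - set(report)
```

URLs that appear only in the Search Console list have no internal source to fix, so they are candidates for a redirect or a deliberate 404/410 instead.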

In the case of soft (false) 404 errors, we recommend configuring the server to return the 404 response code and, likewise, deleting or updating the links that lead to these pages.

An example of displaying data in the Google Search Console after removing 404 errors:

Figure 16. Elimination of the 404 errors

How to avoid the emergence of the 404 errors

To prevent 404 errors from appearing, we recommend the following rules for the most common situations:

Case 1 – Changing the principle of URL formation

  1. Configure 301 redirects from the “old” URLs to the actual page addresses.
  2. Remove the page’s old URLs from the Sitemap.xml file.
  3. Add the actual page URLs to the Sitemap.xml file.
  4. Update addresses of all internal links on the site.

Case 2 – Working with PERMANENTLY missing products (brands or other similar entities)

There is no single right approach to managing missing items. The choice of optimization option depends on the priorities and the degree of acceptable risk.

Consider the commonly used approaches to managing missing items:

  • Setting the response code 404. When irrelevant goods are deleted and the HTTP code 404 or 410 is configured, the site loses its positions for queries about these products, which may lead to a shortfall in sales.

Ideally, people land on the “out of stock” product page and choose a different version of the product or a different product. Also, if external backlinks point to product pages, they will be lost when the pages are deleted. With this approach, we recommend delaying the 404 response code for as long as possible, that is, not “disconnecting” out-of-stock goods for some time. Even though the product is out of stock, such pages will still rank and can attract traffic. An example schedule:

  1. During the first year after the product becomes unavailable, the page remains accessible and returns a 200 response code.
  2. In the second year, links to the page of the missing product are removed from the product listing, filter panel, Sitemap, etc., but the page remains available at its direct URL and returns the code 200.
  3. In the third year, the product is removed from the product base, the response code is set to 404 or 410, and links to it are removed from the entire site.

The timings are approximate. The goal is to keep the page ranking for queries about the missing product for as long as possible.

  • Setting up a 301 redirect to a category page or the main page. This creates a bad experience because it is, in effect, a redirect to an irrelevant page. According to search engine requirements, a 301 redirect should lead to a page with similar content. Redirecting to an irrelevant page can produce soft 404 errors that mislead both users and search engines.
  • Setting up 301 redirects to pages with similar content. In terms of results and the “do no harm” rule, the best option is to redirect to a similar product page, for example, a newer version of the same product. This approach preserves positions in the search results and keeps the existing backlinks. At the same time, do not skip the basic requirements:
  1. Refresh links in Sitemap.
  2. Update internal links on the site.
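The three-stage schedule given for the first approach above can be expressed as a small rule. This is a sketch under stated assumptions: the one-year boundaries mirror the example timings and are not fixed values, the final stage uses 410 although the text allows 404 as well, and the names `PagePolicy` and `missing_product_policy` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PagePolicy:
    status: int           # HTTP response code to serve for the product page
    linked_on_site: bool  # keep links in listings, filters, and the sitemap?


def missing_product_policy(years_absent: float) -> PagePolicy:
    """Pick a page policy from how long the product has been unavailable."""
    if years_absent < 1:
        return PagePolicy(status=200, linked_on_site=True)
    if years_absent < 2:
        # Still reachable by direct URL, but no longer linked from the site.
        return PagePolicy(status=200, linked_on_site=False)
    # Assumption: 410 is chosen here; 404 would fit the schedule equally well.
    return PagePolicy(status=410, linked_on_site=False)
```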

Case 3 – Working with TEMPORARILY missing items

In the case when products periodically leave stock and return, setting up a 404 server response code or configuring a 301/302 redirect is extremely risky because it can take a very long time to renew the ranking of product pages.

In this case, it is better to optimize such pages, for example:

– implement structured data markup for availability (“In Stock”):

Figure 17. Structured data markup

– offer alternative products that are available;

– change the order in which goods appear in the listing so that missing products are displayed at the end;

– add a new filter so that the user can independently filter products by availability.
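For the structured data item in the list above, availability is usually expressed through the schema.org `Offer.availability` property. The sketch below generates such JSON-LD; the function name and the product values are hypothetical, and a real page would normally carry more properties than shown.

```python
import json


def product_jsonld(name: str, price: str, currency: str, in_stock: bool) -> str:
    """Build JSON-LD for a product whose availability changes over time."""
    availability = (
        "https://schema.org/InStock" if in_stock else "https://schema.org/OutOfStock"
    )
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "offers": {
            "@type": "Offer",
            "price": price,
            "priceCurrency": currency,
            "availability": availability,
        },
    }
    return json.dumps(data, indent=2)


# Hypothetical product that is temporarily out of stock.
markup = product_jsonld("Example kettle", "49.99", "USD", in_stock=False)
```

The resulting string is meant to be placed in a `<script type="application/ld+json">` element on the product page, so the page keeps returning 200 while stating that the item is currently unavailable.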

Case 4 – Removing pages permanently (service pages, pages generated by the CMS, unnecessary pages, pages that bring no traffic, etc.)

  1. Configure the 410 server response code for removed pages.
  2. Remove pages from Sitemap.xml.
  3. Delete all internal links on the site that lead to removed pages.
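Step 2 can be automated when the sitemap is a plain XML file. A minimal sketch using the standard library, assuming a regular `urlset` sitemap (not a sitemap index); the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def drop_urls_from_sitemap(sitemap_xml: str, removed: set[str]) -> str:
    """Return the sitemap XML without the <url> entries whose <loc> is in `removed`."""
    # Keep the sitemap namespace as the default one when serializing.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    for url_el in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url_el.find(f"{{{SITEMAP_NS}}}loc")
        if loc is not None and (loc.text or "").strip() in removed:
            root.remove(url_el)
    return ET.tostring(root, encoding="unicode")
```

The returned string does not include the XML declaration, so add it back when writing the file if your setup expects one.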

How to create a 404 page

Error pages create a poor user experience, but an efficient custom 404 page can mitigate user frustration and encourage further exploration of the site. 

Basic recommendations for an optimized 404 page:

  1. When a page that does not exist is requested, the server should return the 404 “page not found” error.
  2. The page should contain a clearly worded error message and an apology for the error.
  3. The page should fit the overall concept of the site and share its design.
  4. The page should contain simplified navigation with links to the Home page and to relevant pages of the site.
  5. The page should contain a site search box.
  6. The page should contain contact details (phone numbers, e-mail), if appropriate.
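The first recommendation is the one most often broken: the friendly page is served, but with a 200 code. Below is a minimal sketch with Python’s built-in `http.server` that serves a custom page while keeping the real 404 status. The page content, the `/search` endpoint, and the names `KNOWN_PAGES`, `render_404`, and `SiteHandler` are assumptions for illustration; a production site would configure this in its web server or framework.

```python
from html import escape
from http.server import BaseHTTPRequestHandler

KNOWN_PAGES = {"/": "<h1>Home</h1>"}  # hypothetical site content


def render_404(path: str) -> str:
    """Build the body of a custom error page for a missing path."""
    return (
        "<h1>Page not found</h1>"
        f"<p>Sorry, we could not find <code>{escape(path)}</code>.</p>"
        '<p><a href="/">Go to the home page</a></p>'
        '<form action="/search"><input name="q" placeholder="Search the site"></form>'
    )


class SiteHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in KNOWN_PAGES:
            status, body = 200, KNOWN_PAGES[self.path]
        else:
            # Serve the friendly page, but keep the real 404 status code.
            status, body = 404, render_404(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
```

To try it locally, pass `SiteHandler` to `http.server.HTTPServer` and request any unknown path; the browser shows the custom page while the Network tab reports status 404.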

Examples of creative and informative 404 pages

Olx.ua offers to play tic-tac-toe:

Figure 18. The 404 olx.ua page

Laconic design of the 404 page of Karabas.com:

Figure 19. The 404 Karabas.com page

Comfy.ua has kept all navigation in the header and lists the possible reasons for the 404 error:

Figure 20. The 404 comfy.ua page

A disgruntled cat, wiggling its ears and blinking its eyes, greets visitors to the 404 page at html6.com.ru:

Figure 21. The 404 html6.com.ru/404 page

Crello “dumps” all responsibility on pigeons =) and offers to go back or create a new design directly from the 404 page:

Figure 22. The 404 crello.com/en/404 page

Canva offers to solve the sunset puzzle:

Figure 23. The 404 page canva.com/404

Pixar’s 404 page tells visitors not to cry and shows a sobbing girl from one of its cartoons:

Figure 24. The 404 pixar.com/404 page

Marvel’s 404 page also shows one of their heroines (scared):

Figure 25. 404 page marvel.com/404

And in Figma, the 404 error caption can be stretched by pulling on the points marked in the image:

Figure 26. 404 page figma.com/404/

Airbnb’s 404 page is also very creative: a girl drops her ice cream on the floor, and the smile on her face gives way to sadness:

Figure 27. 404 page airbnb.com/404

CONCLUSIONS

The 404 pages themselves do not harm SEO, but links leading to 404 errors, especially if there are many, can negatively affect a site’s organic search performance. If you do nothing, you can lose visitors and sales. Make sure the site has no broken internal links.

Make your 404 page attractive to keep the user on your site.
