Right now, no-one will blame you for thinking:
Who goes for data centre servers these days? Why don’t I just get residential proxies?
Residential servers are indeed the current buzz in proxy server solutions, and they do have a great deal to offer. However, that doesn’t negate the positives that data centre servers provide. For many years data centre servers have been the backbone of proxy server solutions, and they’re still the most common servers out there.
So what are data centre servers? What’s the difference between data centre servers and residential servers? And when can data centre servers be preferable to residential servers?
What are data centre servers?
Data centre servers meet the most basic requirement of any proxy server: to act as the man in the middle, shielding your IP address and providing you with a new one. The data centre server may be in a completely different location from you, e.g. you can be located in the US while your proxy server is in Japan. This flexibility opens up the possibility of accessing blocked sites and data that would otherwise not be available to you.
What are the benefits of data centre servers?
This leads to the main benefit of data centre servers: the anonymity they offer, which makes them perfect for scraping publicly available data. By the way, we have a dedicated page if you need more information about web scraping basics.
So let’s get back to how data centre servers can be better than residential proxies.
First of all, they tend to be much less expensive than residential servers. You can purchase data centre servers for just a few dollars per month. However, that does depend on their location. For example, the US has the highest number of data centres and servers, making the cost there very low. If you’re looking for servers in sub-Saharan Africa or other locations without a reliable infrastructure, then you may well have to pay out much, much more. It’s all down to supply and demand. The more servers that are available, the lower the cost.
Data centre servers are often billed simply by the number of servers you have, rather than by bandwidth as you find with residential servers. That’s not to say there isn’t usually some form of bandwidth limitation (just to protect the provider in case your software runs amok). But if you’re only scraping small amounts of data per query, the chances that you’ll max out your bandwidth allocation are slim. This makes billing easier to manage, as you have much greater clarity over what your charges are going to be.
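To make the billing difference concrete, here is a minimal sketch comparing the two pricing models. All of the prices and the average response size are hypothetical placeholders, not real provider rates; the point is only that the per-server figure stays fixed while the per-bandwidth figure grows with traffic.

```python
def per_ip_cost(num_ips: int, price_per_ip: float) -> float:
    """Monthly cost when billed per server (IP), independent of traffic."""
    return num_ips * price_per_ip


def per_gb_cost(num_requests: int, avg_response_kb: float, price_per_gb: float) -> float:
    """Monthly cost when billed by bandwidth (decimal units: 1 GB = 1,000,000 KB)."""
    gigabytes = num_requests * avg_response_kb / 1_000_000
    return gigabytes * price_per_gb


# Hypothetical numbers for illustration only.
fixed = per_ip_cost(num_ips=10, price_per_ip=2.0)
metered = per_gb_cost(num_requests=1_000_000, avg_response_kb=50, price_per_gb=10.0)
print(f"per-server billing: ${fixed:.2f}, per-bandwidth billing: ${metered:.2f}")
```

With per-server billing you know the figure before the month starts; with per-bandwidth billing you only know it once you know how much data your scrapes pulled down.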
The vast majority of residential servers are also dynamic, so you won’t be able to hold on to the same IP address if you need it for a specific task. This may be fine if you’re just looking for data that you can access directly from a page in one hit. However, if you need to hold an IP address while you paginate through results, either to get the data you need or to complete a specific task, then you need static IP addresses. Data centre servers come into their own here, as you can quickly get a list of static IP addresses.
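One way to picture holding an IP address through pagination is to pin each multi-page job to a single static proxy. The sketch below is only an illustration: the job names and proxy addresses are made up (the addresses come from ranges reserved for documentation), and `assign_sticky_proxies` and `proxy_config` are hypothetical helpers, not part of any provider's API.

```python
from itertools import cycle


def assign_sticky_proxies(jobs, proxies):
    """Give each job one proxy (round robin) to keep for all of its pages."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    pool = cycle(proxies)
    return {job: next(pool) for job in jobs}


def proxy_config(proxy_url):
    """Build a proxies mapping in the shape the requests library accepts."""
    return {"http": proxy_url, "https": proxy_url}


# Hypothetical static proxies and paginated jobs.
static_proxies = ["http://203.0.113.10:8080", "http://203.0.113.11:8080"]
jobs = ["search-a", "search-b", "search-c"]

assignment = assign_sticky_proxies(jobs, static_proxies)
for job, proxy in assignment.items():
    print(job, "->", proxy)
```

If you use the requests library, you could then create one `requests.Session()` per job and call `session.proxies.update(proxy_config(assignment[job]))`, so that every page of that job leaves through the same IP address.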
How to use data centre servers effectively
You’ll need enough servers to manage your scrapes. I can’t emphasise this strongly enough. Do not abuse your servers or you risk them becoming blocked and useless to you. Your provider may not replace them for you.
Exactly how many you need depends on just three things:
- How sensitive your target site is to scraping.
- The number of requests you need to make.
- How long you have to complete your scrape.
Some sites, like Google and Amazon, will generally tolerate around one query per minute, per IP address. Other sites may be more sensitive. You may need either to spread out your page requests over a more extended period, which lengthens the time taken to complete your report, or to buy more proxies to handle the load.
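The trade-off between time and pool size can be worked out with simple arithmetic. The following is a rough sketch, assuming every IP address can sustain a steady rate (the one query per minute mentioned above is used as the default) and ignoring retries and failures; the function names are our own.

```python
import math


def proxies_needed(total_requests: int, hours_available: float,
                   requests_per_minute_per_ip: float = 1.0) -> int:
    """Smallest pool that finishes in time without exceeding the per-IP rate."""
    if total_requests <= 0:
        return 0
    if hours_available <= 0 or requests_per_minute_per_ip <= 0:
        raise ValueError("time window and rate must be positive")
    capacity_per_ip = requests_per_minute_per_ip * 60 * hours_available
    return math.ceil(total_requests / capacity_per_ip)


def hours_needed(total_requests: int, pool_size: int,
                 requests_per_minute_per_ip: float = 1.0) -> float:
    """Time to finish with a fixed pool at the same per-IP rate."""
    if pool_size <= 0 or requests_per_minute_per_ip <= 0:
        raise ValueError("pool size and rate must be positive")
    return total_requests / (pool_size * requests_per_minute_per_ip * 60)


# 10,000 requests in 24 hours at 1 query per minute per IP.
print(proxies_needed(10_000, 24))           # 7
print(round(hours_needed(10_000, 7), 1))    # 23.8
```

For a more sensitive site, lower `requests_per_minute_per_ip` and either the pool or the time window has to grow to compensate.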
Another critical point is the quality of the servers, not just in terms of hardware but also in terms of how they’ve been handled in the past. They may have been abused to the point where they’re either blocked or, even worse, blacklisted.
Don’t go any further without checking out our guide to responsible scraping. There we go into the details of why human emulation settings are essential to successful scraping.
Leading on from that, many data centre providers offer rotating proxies, either on demand or after a set amount of time. They may advertise these as ‘fresh’ servers. While this sounds good on the surface, you need to proceed with caution, as the ‘new’ servers you receive will only be new to you; there is no such thing as an unused, or virgin, IPv4 address. The servers may also not have been checked for blocks on the sites you specifically want to use them with.
Some sites may say that replacing your server pool often is an excellent way to avoid detection. And while it might be a good idea to replace your server list, it’s far more critical to have high-quality human emulation settings in place. If you’re scraping correctly, you should never need to replace your server list.
The downsides to data centre servers
Are there any negatives to using data centre servers? Unfortunately yes.
Residential IP addresses are easily identifiable as originating from an ISP. Requests from them are therefore harder to distinguish from those made by real humans. Data centre IPs are easily recognisable as originating from a data centre, and websites know that those are often used for web scraping and automated queries.
In and of itself, this difference means nothing. But if a site is especially sensitive to being scraped, then it may view requests from data centre IPs negatively and place more restrictions on them.
However, you can alleviate this by having the broadest possible range of C-class subnets (distinct /24 ranges), combined with proper human emulation settings. Having robust settings helps your requests avoid tripping the site’s detection algorithms, so you can stay under its radar.
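To see how widely a proxy list is spread, you can group the addresses by their /24 network. This is a small sketch using Python’s standard `ipaddress` module; the function name and the sample addresses (taken from ranges reserved for documentation) are only for illustration.

```python
import ipaddress
from collections import Counter


def subnet_spread(ip_addresses, prefix_length=24):
    """Count how many proxies fall into each subnet (a /24 by default)."""
    networks = Counter()
    for ip in ip_addresses:
        # strict=False lets us pass a host address and get its enclosing network
        network = ipaddress.ip_network(f"{ip}/{prefix_length}", strict=False)
        networks[str(network)] += 1
    return networks


# Hypothetical proxy list.
proxies = ["203.0.113.10", "203.0.113.77", "198.51.100.5"]
spread = subnet_spread(proxies)
print(f"{len(proxies)} proxies across {len(spread)} /24 subnets")
for network, count in spread.most_common():
    print(network, count)
```

A list where most addresses share a handful of subnets is easier for a site to block in one go than a list spread thinly across many.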
In summary, data centre servers are less expensive than residential servers and are preferable when all other considerations are equal. They are also easier to obtain if you need smaller numbers for low-volume tasks. You can also control individual IP addresses and hold on to them. This control makes them useful for a broader range of functions.
In general, data centre providers are also more open to replacing servers for you if you have problems, so long as the problem wasn’t caused by your own misuse of the servers.