Recently, I have been struggling to find cheap and reliable GPUs to train deep learning models. In this article, I will summarize the options you have to run deep learning computations on GPUs.
Not too long ago, you could rent a beefy GPU machine for 100€/month. Hetzner, a German server provider, was offering exactly that.
It was fast and reliable. The good times. However, they discontinued this offering. Nowadays, if you want to get a GPU for deep learning, you have several options:
Use a cloud provider (GCP, AWS, Azure)
Use a cloud provider with preemptible machines
Rent a bare metal machine
Sub-rent a GPU from a marketplace (e.g. vast.ai)
Build your own
Foreword
Hetzner offered cheap and reliable servers and had a good reputation. Why did they stop? There is no official explanation, but a change in NVIDIA's licensing is the likely cause: NVIDIA updated their license to ban the use of consumer GPUs (e.g. the 1080 and 2080 models) in data centers. As a result, most large server providers stopped offering cheap GPU servers.
Using a Cloud provider
Google Cloud, AWS and Azure all offer GPU machines. This is the most expensive option on this list. In theory, you can scale your cluster up and down on demand. They offer GPUs suited to training (V100) and to inference (T4).
My experience: some providers run unscheduled maintenance on your machines, meaning they kill your instance to migrate it to another host (the content of the disk is preserved). You get a one-hour termination notice on GCP, more on the others. It is very inconvenient to start a large training run before the weekend, only to discover on Monday that the machine was killed on Friday evening. On top of that, some regions occasionally run out of GPUs: when that happens, attempts to create a machine simply fail. It does not happen often, but when it does it is very annoying.
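If you would rather react to those maintenance events than discover them on Monday, GCP exposes the upcoming event through the instance metadata server. Below is a minimal sketch of a watcher, assuming it runs on a GCE VM; save_checkpoint() is a placeholder for whatever state-saving your training code does.

```python
# Minimal sketch: watch the GCE metadata server for a maintenance event
# and checkpoint before the instance gets migrated or terminated.
# Assumes it runs on a GCP VM; save_checkpoint() is a placeholder you provide.
import time
import requests

METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/maintenance-event"
)
HEADERS = {"Metadata-Flavor": "Google"}


def save_checkpoint():
    # Placeholder: dump model/optimizer state to durable storage (e.g. GCS).
    pass


def watch_maintenance(poll_seconds=30):
    while True:
        event = requests.get(METADATA_URL, headers=HEADERS).text
        # The value is "NONE" when nothing is scheduled, and changes to e.g.
        # "MIGRATE_ON_HOST_MAINTENANCE" when an event is coming.
        if event != "NONE":
            save_checkpoint()
            break
        time.sleep(poll_seconds)


if __name__ == "__main__":
    watch_maintenance()
```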
Pros:
Scales on demand (in theory)
GPUs for both training (V100) and inference (T4)
Cons:
The most expensive option
Unscheduled maintenance can kill long training runs
Some regions occasionally run out of GPUs
Using preemptible instances
Most cloud providers offer preemptible machines at a significant discount (at least 50%, often more). In exchange, you accept that your machine can be killed at any moment. That is not very convenient for training: you have to save a checkpoint every epoch and resume from the latest one whenever the machine is reclaimed, and working around preemptions cleanly takes a fair amount of engineering (a minimal sketch follows).
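Here is roughly what that engineering looks like, as a sketch in PyTorch; the model, the training step and the checkpoint path are placeholders to adapt to your setup.

```python
# Minimal sketch of preemption-tolerant training in PyTorch: save a checkpoint
# every epoch and resume from the latest one when the (new) instance starts.
# The model, the "training step" and CHECKPOINT_PATH are placeholders.
import os
import torch
import torch.nn as nn

CHECKPOINT_PATH = "/mnt/persistent_disk/checkpoint.pt"  # must survive preemption

model = nn.Linear(10, 1)  # stand-in for a real model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
start_epoch = 0

# Resume if a previous run was preempted.
if os.path.exists(CHECKPOINT_PATH):
    state = torch.load(CHECKPOINT_PATH)
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    start_epoch = state["epoch"] + 1

for epoch in range(start_epoch, 100):
    # ... one pass over the real data would go here ...
    dummy_loss = model(torch.randn(32, 10)).mean()
    optimizer.zero_grad()
    dummy_loss.backward()
    optimizer.step()

    # Persist enough state to restart from this exact point.
    torch.save(
        {
            "epoch": epoch,
            "model": model.state_dict(),
            "optimizer": optimizer.state_dict(),
        },
        CHECKPOINT_PATH,
    )
```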
My experience: my instances are sometimes killed in less than an hour, which makes them unusable for training. Try it out and see if it works for you (it might depend on the region).
Pros:
At least 50% cheaper than regular instances
Cons:
Machines can be killed at any moment
Requires checkpoint/resume engineering
How often you get preempted varies by region
Renting a bare metal machine
Some providers still offer consumer GPUs, officially not for deep learning. A Google search will yield plenty of them. You can also look here. Prices vary from provider to provider.
My experience: reliability is not great. I made the mistake of using one of those servers as a production server, and it went down on a Saturday at 1 am.
Your mileage may vary; you have to make your own trade-off between price and reliability.
Pros:
Consumer GPUs at lower prices than the major cloud providers
Cons:
Sometimes unreliable (YMMV)
Does not scale as quickly as a regular cloud provider (you need to order the machine, sometimes with a monthly commitment)
Sub-renting a server
I have never tried this, but vast.ai is a marketplace offering GPUs at very affordable prices. Anyone can list a GPU there, so I am not exactly sure how reliable it is.
Building your own GPU server
If you have the time and the rack space, building your own GPU machine might be the cheapest option. Depending on how cheap you need to go, keep an eye out for used GPUs on eBay. Keep in mind that you will have to pay for electricity, and that having a noisy machine heating your office in the middle of summer is the best way to turn your colleagues into enemies.
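To get an order of magnitude for that electricity bill, here is a quick back-of-envelope calculation; the power draw and the price per kWh below are assumptions to replace with your own numbers.

```python
# Rough electricity-cost estimate for a home-built GPU box.
# The numbers are illustrative assumptions, not measurements.
power_draw_watts = 600      # assumed full-system draw under load (GPU + CPU + rest)
hours_per_month = 24 * 30   # machine running around the clock
price_per_kwh_eur = 0.30    # assumed electricity price, varies a lot by country

energy_kwh = power_draw_watts / 1000 * hours_per_month
monthly_cost_eur = energy_kwh * price_per_kwh_eur
print(f"{energy_kwh:.0f} kWh/month, about {monthly_cost_eur:.0f}€/month")
```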
Pros:
Likely the cheapest option if you have the time and the rack space
Cons:
Takes time to build and maintain
You pay for the electricity
Noise and heat in your office
What we ended up doing at Photoroom
For training, we built our own machine (using 2080 Tis). For larger trainings, we use GCP with V100s and cross our fingers that there will not be any maintenance event. For inference, we use GCP's T4 GPUs in a managed instance group: if GCP needs to kill a machine for maintenance, a new one is automatically spun up.
Conclusion
Please keep in mind that I am not endorsing any of those options; pick one at your own risk. In the end, it is a trade-off between price, convenience, reliability and scalability. Also note that, depending on your workload, running inference on CPUs can be cheaper.
Any idea on how to improve this? Any comment? Reach out on Twitter.