How to train, predict and host your GPT-2 model for free

John (Juan) Tubert
3 min read · Jul 15, 2019

Before you start reading my article, I would like to call out that I am not a data scientist or a Python expert, but I do have many years of experience with different technologies and a passion for AI and machine learning.

While researching a personal project, I ran into GPT-2, which is described like this on their site: “Our model, called GPT-2 (a successor to GPT), was trained simply to predict the next word in 40GB of Internet text.” You can read more about it here: https://openai.com/blog/better-language-models/.

This article is not about GPT-2 itself; I used a GPT-2 model, but you can use any other model as well. This article is about how you can train a model (using Google Colaboratory), create a RESTful endpoint (using Flask), host it online (using Google Cloud Run), and finally predict a result on a Slack/Twitter bot or website.

At this point it’s just a prototype; while I want it live so other people can play with it, I don’t want to spend a lot of money on hosting fees and VMs. So my goal was to train the model and host it online for free.

For the training I ended up using Google Colaboratory (https://research.google.com/colaboratory/faq.html), which lets you run your code for free. Colaboratory is a machine learning tool: basically a Jupyter notebook that doesn’t require any setup. I spent a lot of time here creating and testing my model.
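The article doesn’t say which training code was used in the notebook; one common way to fine-tune GPT-2 in Colaboratory (an assumption on my part, not necessarily what was done here) is the gpt-2-simple package. A minimal sketch, with the dataset file name and step count as placeholders:

```python
# Hypothetical fine-tuning sketch using the gpt-2-simple package.
# "my_corpus.txt" and the step count are placeholders.
import gpt_2_simple as gpt2

gpt2.download_gpt2(model_name="124M")   # fetch the pretrained base model

sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              dataset="my_corpus.txt",  # your training text
              model_name="124M",
              steps=1000)               # adjust to taste

# generate a sample from the fine-tuned model
print(gpt2.generate(sess, return_as_list=True)[0])
```

Fine-tuning like this benefits from Colaboratory’s free GPU runtime, which is a big part of why the training step costs nothing.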

Once the model was generated, I saved it to Google Drive, so I can use it whenever needed and it will still be available after the Colaboratory runtime restarts. With Google Drive mounted in the notebook, you can easily save your model to Drive using shutil:

shutil.copyfile("model.tar", "/content/drive/My Drive/model.tar")

Now I can do my predictions in Google Colaboratory, but since my website or Slack bot can’t access this as a service, I needed to move it to a server.

To create a prediction API, I wrote a simple Flask application that returns the results on a GET request. Then, to easily test it locally, I created a Docker container. Once everything was working locally, I deployed my container to Google Cloud Run. Cloud Run is similar to a Lambda function in AWS: basically, it runs when invoked and can scale up and down as needed.
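The article doesn’t include the Flask code, but a minimal sketch of such a prediction endpoint could look like this — the route name is an assumption, and `predict()` is a placeholder standing in for the real GPT-2 model call:

```python
# Minimal sketch of a Flask prediction API.
# predict() is a placeholder for the real model call.
import os

from flask import Flask, jsonify, request

app = Flask(__name__)

def predict(prompt):
    # Placeholder: in the real app this would run the GPT-2 model
    return "generated text for: " + prompt

@app.route("/predict", methods=["GET"])
def predict_endpoint():
    prompt = request.args.get("prompt", "")
    return jsonify({"input": prompt, "output": predict(prompt)})

if __name__ == "__main__":
    # Cloud Run injects the port to listen on via the PORT env variable
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
```

Returning results on GET keeps the endpoint trivially testable with a browser or curl, which is handy at the prototype stage.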

Before deciding to go with Google Cloud Run, I tried Google App Engine, which didn’t work for me, and Google Compute Engine, which ended up costing about $70 per month in hosting fees (which I didn’t want to pay at this stage of the project).

If you decide to run it on Compute Engine, make sure you also install Nginx; otherwise it won’t run, or it will run too slowly.
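The usual setup is to put Nginx in front of the Flask app as a reverse proxy. A minimal sketch of such a config, assuming the app listens locally on port 8080 (the port and server name are placeholders):

```nginx
# Forward public traffic on port 80 to the Flask app on localhost:8080
server {
    listen 80;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

Flask’s built-in server isn’t meant to face the internet directly, which is why a proxy (ideally with a WSGI server such as gunicorn behind it) makes the Compute Engine setup behave.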

The next step was to set up continuous integration so I can easily deploy my code each time I commit a new change to git. To do this, I set up a Google Cloud Build trigger. You can set up your triggers using GitHub, Bitbucket, or a Google Cloud Source repository. To automatically start a build on commit, all you have to do is set up the trigger and have a simple YAML file in your repo (if you are deploying to Cloud Run). The YAML file creates a Docker image, pushes the image to the Google Container Registry, then deploys it to Google Cloud Run.

In order to keep my repo small, I also decided to host the model itself in a Google Cloud Storage bucket. Then, when the Docker image was built, it would first copy the model from the bucket; this was handled in my Dockerfile.
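A cloudbuild.yaml along the lines described above — build, push, deploy — might look like this; the service name, image name, and region are placeholders, not taken from the article:

```yaml
# Hypothetical cloudbuild.yaml: build the image, push it to the
# Container Registry, then deploy it to Cloud Run.
steps:
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/gpt2-api', '.']
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', 'gcr.io/$PROJECT_ID/gpt2-api']
  - name: 'gcr.io/cloud-builders/gcloud'
    args: ['run', 'deploy', 'gpt2-api',
           '--image', 'gcr.io/$PROJECT_ID/gpt2-api',
           '--region', 'us-central1',
           '--platform', 'managed',
           '--allow-unauthenticated']
images: ['gcr.io/$PROJECT_ID/gpt2-api']
```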
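And a Dockerfile that pulls the model from the bucket at build time could be sketched like this — the bucket name and file names are placeholders, and it assumes the model object is publicly readable (a private bucket would need gsutil and credentials instead):

```dockerfile
FROM python:3.7-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Download the trained model from a publicly readable Cloud Storage
# bucket at build time (bucket and object names are placeholders)
ADD https://storage.googleapis.com/my-model-bucket/model.tar /app/model.tar

COPY . .

# Cloud Run sends traffic to the port in the PORT env variable (8080 by default)
CMD ["python", "app.py"]
```

Baking the model into the image this way keeps the git repo small while still producing a self-contained container for Cloud Run.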

Now that everything is set up, I can create new models in Google Colaboratory, upload them to a bucket, then make changes to my code locally and push them to git, which will start the build and push the latest code to Cloud Run.

Hope you found this article helpful and feel free to contact me if you have any questions.

List of technologies mentioned in this article:

Google:

  • Cloud Platform
  • Cloud Run
  • App Engine
  • Compute Engine
  • Cloud Build
  • Cloud Source
  • Storage
  • Container Registry
  • Colaboratory

Other

  • Docker
  • shutil
  • GitHub
  • Bitbucket
  • GPT-2


John (Juan) Tubert

Chief Technology Officer @ Tombras, New York / Creative Technologist, passionate about Metaverse, Web3, chatbots and AI.